
RendBEV: Semantic Perspective View Rendering as Supervision for Bird’s Eye View Segmentation

Monteagudo, Henrique Pineiro; Taccari, Leonardo; Pjetri, Aurel; Sambo, Francesco; Salti, Samuele
2026

Abstract

Bird’s Eye View (BEV) semantic maps have recently garnered a lot of attention as a useful representation of the environment to tackle assisted and autonomous driving tasks. However, most of the existing work focuses on the fully supervised setting, training neural networks on large annotated datasets. In this work, we present RendBEV, a new method to train BEV semantic segmentation networks without direct BEV supervision. We leverage rendering with neural density fields or monocular depth estimation models to shift the supervision to semantic perspective views, where targets can be computed by a 2D semantic segmentation model. Through extensive experimental work on the KITTI-360 and nuScenes datasets, we show that RendBEV enables BEV semantic segmentation with no BEV supervision, and delivers competitive results in this challenging setting. When used as pretraining before fine-tuning on labeled BEV ground truth, our method boosts performance in low-annotation regimes, outperforming models trained from scratch and improving upon competing methods (on nuScenes) or being on par with them (on KITTI-360).
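The supervision scheme the abstract describes can be illustrated with a generic NeRF-style volume-rendering sketch. This is not the authors' implementation, and all function and variable names are hypothetical: it only shows how, given densities from a neural density field and per-sample class probabilities along a camera ray, alpha compositing yields a 2D semantic prediction that could be compared against the output of a 2D segmentation model on the perspective view.

```python
import numpy as np

def render_semantics_along_ray(densities, semantics, deltas):
    """Volume-render per-sample semantic probabilities along one camera ray.

    densities: (N,) non-negative density at each ray sample
    semantics: (N, C) class probabilities at each ray sample
    deltas:    (N,) distance between consecutive samples
    Returns a (C,) rendered class distribution for the pixel.
    """
    # Opacity contributed by each sample (standard alpha from density * step).
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))
    # Compositing weights, then weighted sum of per-sample semantics.
    weights = alphas * trans
    return (weights[:, None] * semantics).sum(axis=0)

# Toy example: 4 samples on a ray, 3 classes; density is concentrated
# on samples labeled as class 1, so that class should dominate the pixel.
dens = np.array([0.0, 0.5, 2.0, 0.1])
sem = np.array([[1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
dl = np.full(4, 0.5)
pixel = render_semantics_along_ray(dens, sem, dl)
```

In a training loop, a loss (e.g. cross-entropy) between such rendered semantic pixels and the pseudo-labels produced by an off-the-shelf 2D segmentation model would provide the perspective-view supervision signal, with no BEV annotations involved.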
Monteagudo, H.P., Taccari, L., Pjetri, A., Sambo, F., Salti, S. (2026). RendBEV: Semantic Perspective View Rendering as Supervision for Bird’s Eye View Segmentation. IEEE ACCESS, 14, 12255-12272 [10.1109/access.2026.3656618].
Files in this record:
RendBEV_Semantic_Perspective_View_Rendering_as_Supervision_for_Birds_Eye_View_Segmentation.pdf

Open access

Type: Version of Record (publisher PDF)
License: Open-access license. Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND)
Size: 3.63 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1040090