Connecting NeRFs, Images, and Text

Ballerini F.; Zama Ramirez P.; Mirabella R.; Salti S.; Di Stefano L.

2024

Abstract

Neural Radiance Fields (NeRFs) have emerged as a standard framework for representing 3D scenes and objects, introducing a novel data type for information exchange and storage. Concurrently, significant progress has been made in multimodal representation learning for text and image data. This paper explores a novel research direction that aims to connect the NeRF modality with other modalities, similar to established methodologies for images and text. To this end, we propose a simple framework that exploits pre-trained models for NeRF representations alongside multimodal models for text and image processing. Our framework learns a bidirectional mapping between NeRF embeddings and those obtained from corresponding images and text. This mapping unlocks several novel and useful applications, including NeRF zero-shot classification and NeRF retrieval from images or text.
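The two applications named in the abstract, zero-shot classification and retrieval, can be illustrated with a minimal sketch. It assumes that once NeRF embeddings are mapped into the same space as the text and image embeddings, both tasks reduce to a nearest-neighbour search by cosine similarity. The abstract does not state the form of the learned mapping or which pre-trained models are used, so the linear map `W`, the helper names, and all vectors below are hypothetical and chosen only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def apply_map(weights, x):
    """Hypothetical stand-in for the learned mapping: a matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def zero_shot_classify(nerf_embedding, weights, text_embeddings):
    """Map a NeRF embedding into the shared space and return the closest label."""
    mapped = apply_map(weights, nerf_embedding)
    return max(text_embeddings, key=lambda label: cosine(mapped, text_embeddings[label]))

def retrieve(query_embedding, gallery, k=1):
    """Rank gallery items (name -> embedding) by cosine similarity to the query."""
    ranked = sorted(
        gallery,
        key=lambda name: cosine(query_embedding, gallery[name]),
        reverse=True,
    )
    return ranked[:k]

# Toy data, invented purely for illustration (not taken from the paper).
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]  # maps a 3-D "NeRF" vector to a 2-D shared space
text_embeddings = {"chair": [1.0, 0.1], "table": [0.1, 1.0]}
nerf_embedding = [0.9, 0.2, 0.5]

# Zero-shot classification: compare the mapped NeRF embedding with label embeddings.
predicted = zero_shot_classify(nerf_embedding, W, text_embeddings)

# Retrieval from text: rank already-mapped NeRF embeddings against a text query.
mapped_gallery = {"nerf_a": [0.9, 0.2], "nerf_b": [0.2, 0.8]}
top_match = retrieve(text_embeddings["table"], mapped_gallery, k=1)
```

In this toy setup the same `retrieve` function serves both text and image queries, since either kind of query is just a vector in the shared space.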

Year: 2024
Volume title: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024
First page: 866
Last page: 876
Series: IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS
DOI: https://dx.doi.org/10.1109/CVPRW63382.2024.00092
Citation: Ballerini F., Zama Ramirez P., Mirabella R., Salti S., Di Stefano L. (2024). Connecting NeRFs, Images, and Text [10.1109/CVPRW63382.2024.00092].
All authors: Ballerini F.; Zama Ramirez P.; Mirabella R.; Salti S.; Di Stefano L.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/994974
