CRIS Current Research Information System

We introduce Stereo Anywhere, a novel stereo-matching framework that combines geometric constraints with robust priors from monocular depth Vision Foundation Models (VFMs). By elegantly coupling these complementary worlds through a dual-branch architecture, we seamlessly integrate stereo matching with learned contextual cues. Following this design, our framework introduces novel cost volume fusion mechanisms that effectively handle critical challenges such as textureless regions, occlusions, and non-Lambertian surfaces. Through our novel optical illusion dataset, MonoTrap, and extensive evaluation across multiple benchmarks, we demonstrate that our synthetic-only trained model achieves state-of-the-art results in zero-shot generalization, significantly outperforming existing solutions while showing remarkable robustness to challenging cases such as mirrors and transparencies.

Bartolomei, L., Tosi, F., Poggi, M., Mattoccia, S. (2025). Stereo Anywhere: Robust Zero-Shot Deep Stereo Matching Even Where Either Stereo or Mono Fail. 10662 LOS VAQUEROS CIRCLE, PO BOX 3014, LOS ALAMITOS, CA 90720-1264 USA : IEEE Computer Society [10.1109/cvpr52734.2025.00103].

Stereo Anywhere: Robust Zero-Shot Deep Stereo Matching Even Where Either Stereo or Mono Fail

Bartolomei, Luca;Tosi, Fabio;Poggi, Matteo;Mattoccia, Stefano

2025

Abstract

We introduce Stereo Anywhere, a novel stereo-matching framework that combines geometric constraints with robust priors from monocular depth Vision Foundation Models (VFMs). By elegantly coupling these complementary worlds through a dual-branch architecture, we seamlessly integrate stereo matching with learned contextual cues. Following this design, our framework introduces novel cost volume fusion mechanisms that effectively handle critical challenges such as textureless regions, occlusions, and non-Lambertian surfaces. Through our novel optical illusion dataset, MonoTrap, and extensive evaluation across multiple benchmarks, we demonstrate that our synthetic-only trained model achieves state-of-the-art results in zero-shot generalization, significantly outperforming existing solutions while showing remarkable robustness to challenging cases such as mirrors and transparencies.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2025
			
	Titolo del volume
	
				Proceedings of 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
			
	Pagina iniziale
	
				1013
			
	Pagina finale
	
				1027
			
	Collana/Serie
	
				PROCEEDINGS IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION
			
	Codice DOI
	
				https://dx.doi.org/10.1109/cvpr52734.2025.00103
			
	Citazione
	
				Bartolomei, L., Tosi, F., Poggi, M., Mattoccia, S. (2025). Stereo Anywhere: Robust Zero-Shot Deep Stereo Matching Even Where Either Stereo or Mono Fail. 10662 LOS VAQUEROS CIRCLE, PO BOX 3014, LOS ALAMITOS, CA 90720-1264 USA : IEEE Computer Society [10.1109/cvpr52734.2025.00103].
			
	Tutti gli autori
	
						Bartolomei, Luca; Tosi, Fabio; Poggi, Matteo; Mattoccia, Stefano
					
	Appare nelle tipologie:
	
				4.01 Contributo in Atti di convegno

File in questo prodotto:

File	Dimensione	Formato
2412.04472v2.pdf accesso aperto Tipo: Postprint / Author's Accepted Manuscript (AAM) - versione accettata per la pubblicazione dopo la peer-review Licenza: Licenza per accesso libero gratuito Dimensione 49.28 MB Formato Adobe PDF Visualizza/Apri	49.28 MB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/1043931

Citazioni

ND

ND

6

ND

social impact