We propose MaskingDepth, a semi-supervised learning framework for monocular depth estimation. MaskingDepth enforces consistency between the depths predicted from strongly-augmented images and the pseudo-depths derived from weakly-augmented images, which reduces the reliance on large quantities of ground-truth depth. Within this framework, we leverage uncertainty estimation to retain only high-confidence depth predictions from the weakly-augmented branch as pseudo-depths. We also present a novel data augmentation, dubbed K-way disjoint masking, that exploits a naïve token masking strategy as an augmentation while avoiding two of its problems: scale ambiguity between the depths from the weakly- and strongly-augmented branches, and the risk of missing small-scale objects. Experiments on the KITTI and NYU-Depth-v2 datasets demonstrate the effectiveness of each component, the framework's robustness to using fewer depth-annotated images, and superior performance compared to other state-of-the-art semi-supervised learning methods for monocular depth estimation.
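
The two mechanisms named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how tokens are partitioned, which loss is used, or how the confidence filter is applied. The function names `k_way_disjoint_masks` and `masked_consistency_loss`, the random partitioning, the absolute-difference loss, and the fixed `threshold` are all assumptions made for illustration only.

```python
import random


def k_way_disjoint_masks(num_tokens, k, seed=None):
    """Split token indices into k disjoint subsets that together cover all tokens.

    Assumption: a uniformly random partition. The abstract only states that
    the masking is K-way and disjoint.
    """
    if not 1 <= k <= num_tokens:
        raise ValueError("k must be between 1 and num_tokens")
    rng = random.Random(seed)
    indices = list(range(num_tokens))
    rng.shuffle(indices)
    # Subset i takes every k-th shuffled index starting at offset i,
    # so subsets never overlap and no token is left out.
    return [sorted(indices[i::k]) for i in range(k)]


def masked_consistency_loss(strong_depth, pseudo_depth, uncertainty, threshold):
    """Mean absolute difference over pixels with low weak-branch uncertainty.

    Assumption: pixels whose uncertainty is at or above `threshold` are
    dropped, and the remaining ones are compared with an L1 penalty.
    """
    kept = [
        abs(s - p)
        for s, p, u in zip(strong_depth, pseudo_depth, uncertainty)
        if u < threshold
    ]
    # With no confident pixel there is nothing to supervise.
    return sum(kept) / len(kept) if kept else 0.0


if __name__ == "__main__":
    groups = k_way_disjoint_masks(num_tokens=10, k=3, seed=0)
    print(groups)
    print(masked_consistency_loss([1.0, 2.0, 5.0], [1.5, 2.0, 9.0],
                                  [0.1, 0.2, 0.9], threshold=0.5))
```

Because the subsets are disjoint and jointly exhaustive, every token is seen by exactly one subset. That is one plausible reading of how the augmentation avoids missing small-scale objects, a risk the abstract attributes to naïve token masking.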

Baek, J., Kim, G., Park, S., An, H., Poggi, M., Kim, S. (2024). MaskingDepth: Masked Consistency Regularization for Semi-Supervised Monocular Depth Estimation. Institute of Electrical and Electronics Engineers Inc. [10.1109/iros58592.2024.10801719].

MaskingDepth: Masked Consistency Regularization for Semi-Supervised Monocular Depth Estimation

Poggi, Matteo
2024

IEEE International Conference on Intelligent Robots and Systems, 2024, pp. 2755-2762
Baek, Jongbeom; Kim, Gyeongnyeon; Park, Seonghoon; An, Honggyu; Poggi, Matteo; Kim, Seungryong

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1010502