Mazzocchetti, S., Cercenelli, L., Marcelli, E. (2025). Surgical Instrument Segmentation and Self-Supervised Monocular Depth Estimation in Minimally Invasive Surgery: A Multi-task Learning Approach. Gewerbestrasse 11, CH-6330 Cham, Switzerland: Springer Science and Business Media Deutschland GmbH [10.1007/978-3-031-95838-0_28].

Surgical Instrument Segmentation and Self-Supervised Monocular Depth Estimation in Minimally Invasive Surgery: A Multi-task Learning Approach

Mazzocchetti S.; Cercenelli L.; Marcelli E.
2025

Abstract

Surgical scene understanding in Minimally Invasive Surgery (MIS) is crucial for advancing Computer-Assisted Intervention (CAI) applications, enhancing surgical safety, and improving navigation. This work introduces a novel multi-task learning framework that jointly performs binary surgical tool segmentation and monocular depth estimation in laparoscopic surgical scenes. The framework employs a staged learning strategy: first, leveraging widely available tool segmentation datasets to pre-train the network, followed by multi-task training using pseudo-masks and self-supervised monocular depth estimation. Extensive experiments demonstrate the effectiveness of the proposed framework, achieving competitive performance on depth estimation compared to state-of-the-art methods. Validation on two publicly available datasets highlights its robustness and adaptability across diverse surgical scenarios. These results emphasize the potential of multi-task learning to advance laparoscopic surgical perception. The implementation is available on GitHub.
Year: 2025
Published in: Lecture Notes in Computer Science
Pages: 283-292

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1026832