Recent work has shown the potential of leveraging both point clouds and images to localize anomalies. Nevertheless, the applicability of these multimodal methods in industrial manufacturing is often constrained by significant drawbacks, such as the use of memory banks, which substantially increases memory footprint and inference time. We propose a novel, light and fast framework that learns to map features from one modality to the other on nominal samples and detects anomalies by pinpointing inconsistencies between observed and mapped features. Extensive experiments show that our approach achieves state-of-the-art detection and segmentation performance, in both the standard and few-shot settings, on the MVTec 3D-AD dataset, while achieving faster inference and occupying less memory than previous multimodal anomaly detection (AD) methods. Furthermore, we propose a layer pruning technique that improves memory and time efficiency at a marginal cost in performance.
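
The core idea in the abstract (learn a mapping between the two modalities on nominal samples, then flag locations where observed and mapped features disagree) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes synthetic per-location feature vectors in place of real image and point-cloud features, uses a linear least-squares map as a stand-in for the learned mapping, and multiplies the two directional errors as one illustrative way to combine them. All function and variable names are hypothetical.

```python
import numpy as np


def fit_linear_map(src, dst):
    """Fit a least-squares linear map (with bias) from src to dst features.

    Assumption: a linear map stands in for whatever learned mapping is used.
    """
    src_h = np.hstack([src, np.ones((src.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(src_h, dst, rcond=None)
    return weights


def apply_map(weights, src):
    """Predict features of the other modality from src features."""
    src_h = np.hstack([src, np.ones((src.shape[0], 1))])
    return src_h @ weights


def anomaly_scores(w_2d_to_3d, w_3d_to_2d, feat_2d, feat_3d):
    """Score each location by the disagreement between observed and mapped features.

    Assumption: the product of the two directional errors is an illustrative
    way to combine them, chosen for this sketch.
    """
    err_3d = np.linalg.norm(apply_map(w_2d_to_3d, feat_2d) - feat_3d, axis=1)
    err_2d = np.linalg.norm(apply_map(w_3d_to_2d, feat_3d) - feat_2d, axis=1)
    return err_3d * err_2d


rng = np.random.default_rng(0)

# Synthetic stand-ins: both modalities observe the same latent surface state.
latent = rng.normal(size=(600, 4))
proj_2d = rng.normal(size=(4, 8))
proj_3d = rng.normal(size=(4, 6))
feat_2d = latent @ proj_2d + 0.01 * rng.normal(size=(600, 8))
feat_3d = latent @ proj_3d + 0.01 * rng.normal(size=(600, 6))

# Fit both mappings on nominal (defect-free) samples only.
train_2d, train_3d = feat_2d[:500], feat_3d[:500]
w_2d_to_3d = fit_linear_map(train_2d, train_3d)
w_3d_to_2d = fit_linear_map(train_3d, train_2d)

# Held-out nominal samples, and the same samples with a simulated defect
# that alters the 3D features while the 2D features stay unchanged.
test_2d, test_3d = feat_2d[500:], feat_3d[500:]
defect_3d = test_3d + 3.0

scores_nominal = anomaly_scores(w_2d_to_3d, w_3d_to_2d, test_2d, test_3d)
scores_anomalous = anomaly_scores(w_2d_to_3d, w_3d_to_2d, test_2d, defect_3d)

print("mean nominal score:  ", scores_nominal.mean())
print("mean anomalous score:", scores_anomalous.mean())
```

Because the mappings are fitted only on nominal data, nothing resembling a memory bank of stored features is needed at inference time in this sketch: scoring a sample requires only the two fitted maps.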

Multimodal Industrial Anomaly Detection by Crossmodal Feature Mapping / Alex Costanzino; Pierluigi Zama Ramirez; Giuseppe Lisanti; Luigi Di Stefano. - Electronic. - (In press), pp. 1-10. (Paper presented at the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024), held in Seattle, WA, USA, 17-21 June 2024).

Multimodal Industrial Anomaly Detection by Crossmodal Feature Mapping

Alex Costanzino;Pierluigi Zama Ramirez;Giuseppe Lisanti;Luigi Di Stefano
In press


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/968592