A Systematic Comparison of Depth Map Representations for Face Recognition

Guido Borghi; Davide Maltoni
2021

Abstract

Active depth sensors are now widely available. However, their differing underlying technologies, together with the small scale of existing single-sensor datasets, limit the generalization and performance of deep learning approaches based on depth data, especially for recognition. In this paper, we present a systematic comparison of the use of depth data for deep face recognition, focusing on data representations, pre-processing steps, and normalization techniques. Depth images, normal images, voxels, and point clouds are computed from depth maps and tested with several well-known deep architectures. Extensive intra- and cross-dataset experiments on four public databases suggest that representations and methods based on normal images and point clouds perform and generalize better than other 2D and 3D alternatives. Moreover, we propose an extremely challenging dataset, MultiSFace, designed specifically to analyze the influence of depth map quality and acquisition distance on face recognition accuracy.
SENSORS
Stefano Pini; Guido Borghi; Roberto Vezzani; Davide Maltoni; Rita Cucchiara
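
The abstract states that normal images, voxels, and point clouds are all computed from depth maps. The following is a minimal NumPy sketch of how such conversions are commonly done. It is not the authors' pipeline: the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the function names, the gradient-based normal estimate, and the `grid_size` parameter are all assumptions made for illustration.

```python
import numpy as np


def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (N, 3) point cloud.

    Assumes a pinhole camera model and treats zero depth as invalid.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel column / row indices
    z = depth.astype(np.float64)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1)
    return points[z > 0]  # keep only pixels with a valid depth value


def depth_to_normal_image(depth):
    """Estimate unit surface normals from depth gradients (pixel-space approximation)."""
    z = depth.astype(np.float64)
    dz_dv, dz_du = np.gradient(z)  # derivatives along rows, then columns
    normals = np.dstack([-dz_du, -dz_dv, np.ones_like(z)])
    normals /= np.linalg.norm(normals, axis=2, keepdims=True)
    return normals  # shape (H, W, 3), each vector has unit length


def point_cloud_to_voxels(points, grid_size=32):
    """Quantize an (N, 3) point cloud into a binary occupancy grid."""
    mins = points.min(axis=0)
    extent = (points.max(axis=0) - mins).max()
    extent = extent if extent > 0 else 1.0  # avoid division by zero
    idx = ((points - mins) / extent * (grid_size - 1)).astype(int)
    grid = np.zeros((grid_size,) * 3, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid
```

A single scale factor (`extent`) is used for all three axes so that the voxel grid preserves the aspect ratio of the face; per-axis scaling would be an equally plausible choice under different assumptions.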

Use this identifier to cite or link to this document: http://hdl.handle.net/11585/791440