Voice Activity Detection (VAD) refers to the task of identifying human voice activity in noisy settings, playing a crucial role in fields like speech recognition and audio surveillance. However, most VAD research focuses on English, leaving other languages, such as Italian, under-explored. This study aims to evaluate and enhance VAD systems for Italian speech, with the goal of finding a solution for the speech segmentation component of the Digital Linguistic Biomarkers (DLBs) extraction pipeline for early mental disorder diagnosis. We experimented with various VAD systems and proposed an ensemble VAD system. Our ensemble system shows improvements in speech event detection. This advancement lays a robust foundation for more accurate early detection of mental health issues using DLBs in Italian.

Zhang, S., Gagliardi, G., Tamburini, F. (2024). Voice Activity Detection on Italian Language. Aachen : CEUR Workshop Proceedings (CEUR-WS.org).

Voice Activity Detection on Italian Language

Zhang S.;Gagliardi G.;Tamburini F.
2024

Abstract

Voice Activity Detection (VAD) refers to the task of identifying human voice activity in noisy settings, playing a crucial role in fields like speech recognition and audio surveillance. However, most VAD research focuses on English, leaving other languages, such as Italian, under-explored. This study aims to evaluate and enhance VAD systems for Italian speech, with the goal of finding a solution for the speech segmentation component of the Digital Linguistic Biomarkers (DLBs) extraction pipeline for early mental disorder diagnosis. We experimented with various VAD systems and proposed an ensemble VAD system. Our ensemble system shows improvements in speech event detection. This advancement lays a robust foundation for more accurate early detection of mental health issues using DLBs in Italian.
2024
Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024)
1
6
Zhang, S., Gagliardi, G., Tamburini, F. (2024). Voice Activity Detection on Italian Language. Aachen : CEUR Workshop Proceedings (CEUR-WS.org).
Zhang, S.; Gagliardi, G.; Tamburini, F.
File in questo prodotto:
File Dimensione Formato  
111_main_long.pdf

accesso aperto

Descrizione: Contributo in Atti di Convegno
Tipo: Versione (PDF) editoriale / Version Of Record
Licenza: Licenza per Accesso Aperto. Creative Commons Attribuzione (CCBY)
Dimensione 1.04 MB
Formato Adobe PDF
1.04 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/1000570
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
social impact