Gagliardi Gloria, G.L. (2020). An NLP pipeline as assisted transcription tool for speech therapists. Paris : ELRA - European Language Resources Association.
An NLP pipeline as assisted transcription tool for speech therapists
Gagliardi Gloria; Ravelli Andrea Amelio
2020
Abstract
This work presents the design of a computer-assisted transcription system for speech-language therapists and an evaluation of its core module, the NLP pipeline. The pipeline combines a tokenizer, a lemmatizer, a part-of-speech tagger and a spellchecker to perform semi-automatic annotation of speech transcriptions. The implemented module was evaluated on a corpus of spoken interactions between children with Developmental Language Disorder (DLD) and their caregiver. Results are promising for automatic error detection (F-measure of 0.547 against a Ground Truth of 0.616) but low for automatic error correction, and they confirm the pipeline's effectiveness within an assisted transcription tool.

File | Size | Format
---|---|---
Gagliardi-Gregori-Ravelli2020-RaPID3.pdf (open access). Description: Chapter. Type: Postprint. License: Open Access License, Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND) | 1.61 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.