In recent years, non-native speech has been a topic of continuous research interest in theoretical literature, applied linguistics, and speech technology. A significant number of studies in second-language acquisition have covered the production and perception of non-native speech, often targeting English (Flege et al., 1995; Piske et al., 2001; MacKay et al., 2006; Moyer, 2013; Sereno et al., 2016, etc.). Nonetheless, in recent decades, various studies have also addressed non-native Italian (Marotta & Boula de Mareuil, 2010; Gili Fivela, 2012; Pellegrino, 2012; etc.). This paper, which draws on an ongoing Ph.D. project on foreign accent, explores the complexity of collecting and analysing spoken non-native Italian. At the same time, it investigates whether the outcomes of this type of research may have applications in speech processing and understanding, or in applied linguistics, chiefly in teaching Italian as a foreign language. The material considered here derives from the Corpus Audio di Italiano L2 (CorAIt), which totals up to eight hours of read and spontaneous speech; apart from a control dataset of native Italian speech, the samples are uttered by speakers whose first language is Russian, English, German, French, Romanian, or Spanish (Combei, 2017). Recording a speech database of over 120 speakers has proved demanding, mostly because the variables that had to be taken into account for the data collection (i.e. gender, age of Italian onset, length of time spent in Italy, Italian learning method, and proficiency level in Italian) added to the challenge of attaining balance and representativeness. In addition, participants had to be recruited as volunteers and were not offered any reward for their involvement, which might have influenced their motivation to take part in the experiment and their performance.
Despite some imbalance issues, this database may be employed for investigating how native and non-native pronunciations differ, training and testing accent classification systems, adapting pronunciation dictionaries for automatic speech recognition, or developing applications for educational purposes (i.e. computer- or mobile-assisted language learning) and forensic sciences (i.e. linguistic profiling tasks). To assess the effect of speaker-dependent sociocultural and sociolinguistic variables on the non-native productions, some segmental features (i.e. vowel duration, and the formant dynamics of F1, F2, F3, and F4) and suprasegmental features (i.e. speech rate and articulation rate) have been extracted. The ongoing analyses aim to determine whether any group-specific pattern emerges from the data.
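
The two suprasegmental measures named above differ only in how pauses are treated. The sketch below is an illustration and not the procedure used for CorAIt: it assumes the common convention that speech rate is syllables per second over the whole sample (pauses included), while articulation rate is syllables per second over phonation time only (pauses excluded). The `Interval` type, its fields, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """Hypothetical annotation unit: one stretch of speech or one silent pause."""
    start: float     # seconds
    end: float       # seconds
    syllables: int   # 0 marks a silent pause (assumed convention)


def speech_rate(intervals):
    """Syllables per second over the total duration, pauses included."""
    total_time = sum(iv.end - iv.start for iv in intervals)
    total_syllables = sum(iv.syllables for iv in intervals)
    return total_syllables / total_time if total_time > 0 else 0.0


def articulation_rate(intervals):
    """Syllables per second over phonation time only, pauses excluded."""
    spoken = [iv for iv in intervals if iv.syllables > 0]
    phonation_time = sum(iv.end - iv.start for iv in spoken)
    total_syllables = sum(iv.syllables for iv in spoken)
    return total_syllables / phonation_time if phonation_time > 0 else 0.0


# Invented sample: two spoken stretches separated by a one-second pause.
sample = [
    Interval(0.0, 2.0, 10),
    Interval(2.0, 3.0, 0),
    Interval(3.0, 5.0, 8),
]

print(f"speech rate: {speech_rate(sample):.2f} syll/s")
print(f"articulation rate: {articulation_rate(sample):.2f} syll/s")
```

Under these assumptions, articulation rate is never lower than speech rate for the same sample, and the gap between the two reflects how much of the recording is pause time.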

Claudia Roberta Combei (2018). Challenges and benefits of collecting and analysing spoken non-native Italian. Zara : Sveučilište u Zadru.

Challenges and benefits of collecting and analysing spoken non-native Italian

Claudia Roberta Combei
2018

2018
Zadarski Lingvisticki Forum
11
12
Use this identifier to cite or link to this document: https://hdl.handle.net/11585/791277