ParlaTO: corpus del parlato di Torino

Cerruti, Massimo; Ballarè, Silvia

This paper introduces ParlaTO, a newly built corpus of spontaneous speech. The corpus is based on a collection of semi-structured interviews conducted in Turin with speakers of various origins and socioeconomic backgrounds, grouped by age. Italian is by far the most represented language, while Italo-Romance dialects and immigrant languages occur more rarely and are mainly drawn on in bilingual discourse practices. Section 1 gives an overview of the main features of the corpus. One of its major advantages is access to a rich set of speaker metadata covering the informants’ socioeconomic status and geographic origin, which is meant to support sociolinguistic analyses. Section 2 describes the structure of the corpus and discusses the methodology adopted to build the resource, with special attention to data collection and transcription. Section 3 comments on short excerpts from the interviews to exemplify a number of features that characterize the varieties of Italian and the bilingual interactions found in the corpus. Finally, Section 4 outlines some prospects for future development.

Massimo Cerruti, Silvia Ballarè (2020). ParlaTO: corpus del parlato di Torino. Bollettino dell'Atlante Linguistico Italiano, 44, 171-196.