Nikola Ljubešić, Maja Miličević Petrović, Tanja Samardžić (2018). Borders and boundaries in Bosnian, Croatian, Montenegrin and Serbian: Twitter data to the rescue. Journal of Linguistic Geography, 6(2), 100-124 [10.1017/jlg.2018.9].
Borders and boundaries in Bosnian, Croatian, Montenegrin and Serbian: Twitter data to the rescue
Maja Miličević Petrović
2018
Abstract
In this paper we deal with the spatial distribution of 16 linguistic features known to vary between Bosnian, Croatian, Montenegrin, and Serbian. We perform our analyses on a dataset of geo-encoded Twitter status messages collected in the period from mid-2013 to the end of 2016. We perform two types of analyses. The first one finds boundaries in the spatial distribution of the linguistic variable levels through the kernel density estimation smoothing technique. These boundaries are then plotted over the state borders for a visual comparison. The second analysis deals with linguistic distance between the states. The groupings of linguistic variables and countries are calculated given the state borders and the Jensen-Shannon divergence between distributions of the 16 variables within each state. This analysis is completed with a measure of variable consistency for each country. These analyses are intended to show the extent to which current state borders correspond to linguistic boundaries. They suggest that Croatia and Serbia still represent the two extremes, reflecting a history of normative divergences, while Bosnia-Herzegovina and Montenegro, depending on the variable, lean to one or the other side.
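As a rough illustration of the distance measure named in the abstract, the sketch below computes the Jensen-Shannon divergence between the distributions of one two-level variable in two states. It is not the authors' implementation: the function name `js_divergence`, the base-2 logarithm, and the counts are all assumptions made for the example, and the counts are invented rather than taken from the paper.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    p and q are sequences of non-negative counts or probabilities over the
    same ordered set of variable levels; both are normalized before use.
    """
    if len(p) != len(q):
        raise ValueError("distributions must cover the same levels")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("each distribution needs a positive total")

    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    # Mixture distribution halfway between p and q.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a zero numerator contribute 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Invented counts of the two levels of a single hypothetical variable,
# as observed in tweets from two hypothetical states.
counts_state_a = [90, 10]
counts_state_b = [15, 85]
print(js_divergence(counts_state_a, counts_state_b))
```

With base-2 logarithms the value lies between 0 (identical distributions) and 1 (no shared levels), so per-variable values of this kind can be compared across pairs of states.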
File | Access | Type | License | Size | Format
---|---|---|---|---|---
borders_and_boundaries_in_bosnian_croatian_montenegrin_and_serbian_twitter_data_to_the_rescue.pdf | Restricted access | Publisher's version (PDF) | Restricted-access license | 5.49 MB | Adobe PDF
Borders_and_boundaries_in_BCMS_final_text.pdf | Open access | Postprint | Open-access license: Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND) | 8.85 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.