Borders and boundaries in Bosnian, Croatian, Montenegrin and Serbian: Twitter data to the rescue

Maja Miličević Petrović
2018

Abstract

In this paper we examine the spatial distribution of 16 linguistic features known to vary between Bosnian, Croatian, Montenegrin, and Serbian. We analyze a dataset of geo-encoded Twitter status messages collected from mid-2013 to the end of 2016, in two ways. The first analysis finds boundaries in the spatial distribution of the levels of each linguistic variable, using kernel density estimation as a smoothing technique. These boundaries are then plotted over the state borders for visual comparison. The second analysis addresses the linguistic distance between the states. Groupings of linguistic variables and countries are calculated from the Jensen-Shannon divergence between the distributions of the 16 variables within each state, taking the state borders as given. This analysis is complemented by a measure of variable consistency for each country. Together, the analyses are intended to show the extent to which current state borders correspond to linguistic boundaries. They suggest that Croatia and Serbia still represent the two extremes, reflecting a history of normative divergence, while Bosnia-Herzegovina and Montenegro lean to one side or the other, depending on the variable.
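As an illustration of the distance measure named in the abstract, the following is a minimal Python sketch of the Jensen-Shannon divergence between two discrete distributions. It is not the authors' implementation: the function names, the base-2 logarithm, and the example frequencies are all assumptions made for this sketch and are not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    With base-2 logarithms (an assumption of this sketch) the value
    lies between 0 (identical) and 1 (no overlap).
    """
    # Mixture distribution: the pointwise average of p and q.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical relative frequencies of the two levels of one linguistic
# variable in two states (invented numbers, not data from the paper).
state_a = [0.9, 0.1]
state_b = [0.2, 0.8]

print(jensen_shannon_divergence(state_a, state_b))
```

Because the measure is symmetric and bounded, values for different variables and different pairs of states can be compared directly, which is what makes it convenient as input to a grouping step.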
Nikola Ljubešić, Maja Miličević Petrović, Tanja Samardžić (2018). Borders and boundaries in Bosnian, Croatian, Montenegrin and Serbian: Twitter data to the rescue. Journal of Linguistic Geography, 6(2), 100-124. doi:10.1017/jlg.2018.9
Nikola Ljubešić; Maja Miličević Petrović; Tanja Samardžić
Files in this item:

borders_and_boundaries_in_bosnian_croatian_montenegrin_and_serbian_twitter_data_to_the_rescue.pdf
Access: restricted
Type: Publisher's version (PDF)
License: Restricted-access license
Size: 5.49 MB
Format: Adobe PDF

Borders_and_boundaries_in_BCMS_final_text.pdf
Access: open
Type: Postprint
License: Open-access license. Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND)
Size: 8.85 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/775292