Maja Miličević (2016). Korpus srpskog kao stranog jezika (KSKS): Opis građe i plan izrade. Belgrade: Faculty of Philology, University of Belgrade.

Korpus srpskog kao stranog jezika (KSKS): Opis građe i plan izrade

Maja Miličević
2016

Abstract

This paper outlines the procedure envisaged for the creation of KSKS (Korpus srpskog kao stranog jezika), the first corpus of Serbian as a foreign language. The available texts are described first: KSKS will comprise different types of learner production, mostly short essays written during exams, as part of classroom activities, or as homework. In the first stage of corpus construction, learner texts are currently being digitised: each text is scanned and manually transcribed, and information about the learners is coded (e.g. their age, gender, mother tongue, and proficiency level in Serbian). The second stage will consist of annotating the texts with linguistic information (part-of-speech tags and lemmas), implementing an error annotation scheme that is currently being developed, and adding normalised forms. The normalised forms will allow queries formulated in the standard language to return both results that comply with the standard and results that depart from it. In the final stage of its development, the corpus will be made available online via a graphical user interface. The purpose of KSKS is to assemble a substantial amount of learner production data and to make it available to researchers and teachers interested in studying the acquisition of Serbian as a foreign language or in using authentic learner material in their teaching activities.
Published in: Srpski kao strani jezik u teoriji i praksi III (2016), pp. 279-289
Use this identifier to cite or link to this document: https://hdl.handle.net/11585/775841