This paper outlines the procedure envisaged for the creation of KSKS (Korpus srpskog kao stranog jezika), the first corpus of Serbian as a foreign language. Available texts are described first: KSKS will comprise different types of learner production, for the most part short essays written during exams, as part of classroom activities or for homework. Within the first stage of corpus construction, learner texts are currently being digitised – each text undergoes scanning and manual transcription; information about the learners is also coded (e.g. their age and gender, mother tongue, and proficiency level in Serbian). The second stage will consist in annotating texts with linguistic information (part-of-speech tags and lemmas), as well as in implementing an error annotation scheme that is currently being developed and in adding normalised forms to enable standard language queries that will simultaneously provide results complying with the standard and those departing from it. In the final stage of its development, the corpus will be made available online via a graphical user interface. The purpose of KSKS is to assemble a substantial amount of learner production data and to put that data at the disposal of researchers and teachers interested in studying the acquisition of Serbian as a foreign language, or in using authentic learner material in their teaching activities.
Maja Miličević (2016). Korpus srpskog kao stranog jezika (KSKS): Opis građe i plan izrade. Belgrade: Faculty of Philology, University of Belgrade.
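Korpus srpskog kao stranog jezika (KSKS): Opis građe i plan izrade [Corpus of Serbian as a Foreign Language (KSKS): Description of the Material and Construction Plan]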
Maja Miličević
2016
| File | Size | Format |
|---|---|---|
| milicevic_ksks_2016.pdf (open access; postprint / author's accepted manuscript (AAM), the version accepted for publication after peer review; free open-access licence) | 2.26 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


