An extended, revised form of Tim Buckwalter's Arabic lexical and morphological resource AraMorph, eXtended Revised AraMorph (henceforth XRAM), is presented which addresses a number of weaknesses and inconsistencies of the original model by allowing a wider coverage of real-world Classical and contemporary (both formal and informal) Arabic texts. Building upon previous research, XRAM enhancements include (i) flag-selectable usage markers, (ii) probabilistic mildly context-sensitive POS tagging, filtering, disambiguation and ranking of alternative morphological analyses, (iii) semi-automatic increment of lexical coverage through extraction of lexical and morphological information from existing lexical resources. Testing of XRAM through a front-end Python module showed a remarkable success level.

Giuliano Lancioni, Valeria Pettinari, Laura Garofalo, Marta Campanelli, Ivana Pepe, Simona Olivieri, et al. (2018). Semi-Automatic Data Annotation, POS Tagging and Mildly Context-Sensitive Disambiguation: the eXtended Revised AraMorph (XRAM). Singapore : World Scientific Publishing Co.

Semi-Automatic Data Annotation, POS Tagging and Mildly Context-Sensitive Disambiguation: the eXtended Revised AraMorph (XRAM)

Ilaria Cicola
2018

Abstract

An extended, revised form of Tim Buckwalter's Arabic lexical and morphological resource AraMorph, eXtended Revised AraMorph (henceforth XRAM), is presented which addresses a number of weaknesses and inconsistencies of the original model by allowing a wider coverage of real-world Classical and contemporary (both formal and informal) Arabic texts. Building upon previous research, XRAM enhancements include (i) flag-selectable usage markers, (ii) probabilistic mildly context-sensitive POS tagging, filtering, disambiguation and ranking of alternative morphological analyses, (iii) semi-automatic increment of lexical coverage through extraction of lexical and morphological information from existing lexical resources. Testing of XRAM through a front-end Python module showed a remarkable success level.
2018
Computational Linguistics, Speech and Image Processing for Arabic Language
155
168
Giuliano Lancioni, Valeria Pettinari, Laura Garofalo, Marta Campanelli, Ivana Pepe, Simona Olivieri, et al. (2018). Semi-Automatic Data Annotation, POS Tagging and Mildly Context-Sensitive Disambiguation: the eXtended Revised AraMorph (XRAM). Singapore : World Scientific Publishing Co.
Giuliano Lancioni; Valeria Pettinari; Laura Garofalo; Marta Campanelli; Ivana Pepe; Simona Olivieri; Ilaria Cicola
File in questo prodotto:
Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/956581
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact