Di Iorio A., Peroni S., Vitali F. (2011). A Semantic Web Approach To Everyday Overlapping Markup. Journal of the American Society for Information Science and Technology, 62, 1696-1716 [10.1002/asi.21591].

A Semantic Web Approach To Everyday Overlapping Markup

Di Iorio, Angelo; Peroni, Silvio; Vitali, Fabio
2011

Abstract

Overlapping structures in XML are neither symptoms of a misunderstanding of the intrinsic characteristics of a text document nor evidence of extreme scholarly requirements far beyond those of the most common XML-based applications. On the contrary, overlaps have started to appear in a large number of very popular applications, hidden behind syntactical tricks applied to the basic hierarchy of the XML data format. Unfortunately, such tricks have a drawback: the affected structures require complicated workarounds to support even the simplest query or usage. In this article, we present Extremely Annotational Resource Description Framework (RDF) Markup (EARMARK), an approach to overlapping markup that simplifies and streamlines the management of multiple hierarchies over the same content and supports sophisticated queries and usages over such structures without the need for ad hoc applications, simply by using Semantic Web tools and languages. We show how relevant tasks (e.g., identifying the contribution of an author in a word processor document) are substantially complex when using the original data format and become more or less trivial when using EARMARK. Finally, we evaluate the memory and disk requirements of EARMARK documents in comparison with the Open Office and Microsoft Word XML-based formats, with positive results.
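
To make the notion of overlap concrete, the following is a minimal Python sketch of two hierarchies annotating the same text through character ranges. It is only an illustration of the general stand-off idea, not EARMARK's RDF model or API: the `Range` class, the hierarchy names, the sample text, and the author labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A labelled span over a shared text, as [start, end) character offsets."""
    hierarchy: str  # hypothetical hierarchy name, e.g. "structure" or "authorship"
    label: str
    start: int
    end: int


def overlaps(a: Range, b: Range) -> bool:
    """True when two spans share characters but neither contains the other."""
    intersect = a.start < b.end and b.start < a.end
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return intersect and not (a_in_b or b_in_a)


def contribution(text: str, ranges: list[Range], author: str) -> str:
    """Concatenate, in document order, the text attributed to one author."""
    own = [r for r in ranges if r.hierarchy == "authorship" and r.label == author]
    return "".join(text[r.start:r.end] for r in sorted(own, key=lambda r: r.start))


# Invented sample data: one emphasis span crossing the boundary between two authors.
TEXT = "The quick brown fox jumps"
RANGES = [
    Range("structure", "emphasis", 4, 15),   # "quick brown"
    Range("authorship", "alice", 0, 10),     # "The quick "
    Range("authorship", "bob", 10, 25),      # "brown fox jumps"
]

if __name__ == "__main__":
    emphasis, alice, bob = RANGES
    print(overlaps(emphasis, alice), overlaps(emphasis, bob))
    print(repr(contribution(TEXT, RANGES, "alice")))
```

Because the emphasis span crosses the boundary between the two authors, it cannot be nested inside either authorship span in a single XML tree. Keeping the spans outside the text sidesteps this, and the per-author query becomes a simple filter over ranges.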

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/112578