This article focuses on the complexity of finding and analyzing the totality of educational information shared by the University of Bologna on its website during the last twenty years. It specifically emphasizes some issues related to the use of the Wayback Machine, the most important international web archive, and the need for a different research tool that would guarantee more solid analyses of the corpus. This tool could initially be characterized by the use of standard Natural Language Processing techniques (such as tokenization, stop-word removal, parsing, etc.), but we also have to take into consideration more complex solutions, such as text mining analyses, WordNet integration and an ontological representation of knowledge. Thanks to approaches like the one presented here, future historians will be able to efficiently study the evolution of a website.
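As a rough illustration of the standard preprocessing steps mentioned above (tokenization and stop-word removal), the following is a minimal sketch using only the Python standard library. It is not the tool proposed in the article: the function names, the tiny stop-word list and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# a real analysis would use a fuller list suited to the language of the pages.
STOP_WORDS = {"a", "an", "and", "by", "in", "is", "of", "on", "the", "to"}


def tokenize(text):
    """Lowercase the text and split it into runs of letters (Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop every token that appears in the stop-word list."""
    return [token for token in tokens if token not in stop_words]


def term_frequencies(text):
    """Count how often each remaining term occurs in the text."""
    return Counter(remove_stop_words(tokenize(text)))


if __name__ == "__main__":
    sample = "The courses of the University of Bologna"
    print(remove_stop_words(tokenize(sample)))
```

Comparing term frequencies computed this way across archived snapshots of the same page would be one simple way to start observing how its content changes over time; parsing, WordNet integration and ontological modelling are outside the scope of this sketch.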
Federico Nanni (2014). Managing Educational Information on University Websites: a proposal for Unibo.it. CLEUP - Coop. Libraria Editrice Univ. di Padova.
Managing Educational Information on University Websites: a proposal for Unibo.it
NANNI, FEDERICO
2014
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.