The digital transformation of the scientific publishing industry has led to dramatic improvements in content discoverability and information analytics. Unfortunately, these improvements have not been uniform across research areas. The scientific literature in the arts, humanities and social sciences (AHSS) still lags behind, in part due to the scale of analog backlogs, the persisting importance of national languages, and a publisher ecosystem made up of many small or medium-sized enterprises. We propose a bottom-up approach to support publishers in creating and maintaining their own publication knowledge graphs in the open domain. We do so by releasing a pipeline that extracts structured information from the bibliographies and indexes of AHSS publications, then disambiguates, normalizes, and exports it as linked data. We test the proposed pipeline on Brill's Classics collection and release an open-source implementation for further use and improvement.
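
As a rough illustration of the stages named in the abstract (extract, disambiguate, normalize, export as linked data), the following minimal Python sketch may help. It is not the released pipeline: the reference pattern, the `https://example.org/work/` namespace, the function names, and the key-based merging of duplicates are all assumptions made for this example. Only the Dublin Core element names (`creator`, `title`, `date`) and the N-Triples line format are standard.

```python
import re
import unicodedata
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical namespace for minted work identifiers (an assumption).
BASE = "https://example.org/work/"
# Dublin Core Elements 1.1 namespace (standard vocabulary).
DC = "http://purl.org/dc/elements/1.1/"


@dataclass(frozen=True)
class Reference:
    author: str
    title: str
    year: str


def parse_reference(raw: str) -> Reference:
    """Toy extraction: handles only 'Author (Year). Title.' strings."""
    pattern = (r"\s*(?P<author>[^()]+?)\s*\((?P<year>\d{4})\)\.?\s*"
               r"(?P<title>.+?)\.?\s*$")
    m = re.match(pattern, raw)
    if m is None:
        raise ValueError(f"unrecognized reference format: {raw!r}")
    return Reference(m["author"], m["title"], m["year"])


def normalize_key(ref: Reference) -> str:
    """Normalization: strip accents, lowercase, collapse punctuation."""
    text = f"{ref.author} {ref.year} {ref.title}"
    text = unicodedata.normalize("NFKD", text)
    text = "".join(c for c in text if not unicodedata.combining(c))
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def escape_literal(value: str) -> str:
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(refs: Iterable[Reference]) -> List[str]:
    """Naive disambiguation: references sharing a key are one work."""
    unique = {}
    for ref in refs:
        unique.setdefault(normalize_key(ref), ref)
    lines = []
    for key, ref in unique.items():
        subject = f"<{BASE}{key}>"
        lines.append(f'{subject} <{DC}creator> "{escape_literal(ref.author)}" .')
        lines.append(f'{subject} <{DC}title> "{escape_literal(ref.title)}" .')
        lines.append(f'{subject} <{DC}date> "{escape_literal(ref.year)}" .')
    return lines


if __name__ == "__main__":
    raw_refs = [
        "Müller, K. (1999). Die Antike.",
        "Muller, K. (1999).  Die antike",
    ]
    for line in to_ntriples(parse_reference(r) for r in raw_refs):
        print(line)
```

In this sketch the two spelling variants collapse to a single work because their normalized keys coincide. A production system would presumably match references against authority files or external catalogues rather than rely on string keys alone.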

Natallia Kokash, Matteo Romanello, Ernest Suyver, Giovanni Colavizza (2023). From Books to Knowledge Graphs. JOURNAL OF DATA MINING AND DIGITAL HUMANITIES, 2023, 1-23 [10.46298/jdmdh.9380].

From Books to Knowledge Graphs

Giovanni Colavizza
2023

Natallia Kokash; Matteo Romanello; Ernest Suyver; Giovanni Colavizza
Files in this item:
File: 2204.10766.pdf
Access: open access
Description: Article
Type: Publisher's version (PDF)
License: Open Access license. Creative Commons Attribution (CC BY)
Size: 797.86 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/948805