Early modern printed gazettes relied on a system of news exchange and text reuse largely based on handwritten sources. The reconstruction of this information exchange system is possible by detecting reused texts. We present a method to individuate text borrowings within noisy OCRed texts from printed gazettes based on string kernels and local text alignment. We apply our methods on a corpus of Italian gazettes for the year 1648. Beside unveiling substantial overlaps in news sources, we are able to assess the editorial policy of different gazettes and account for a multi-faceted system of text reuse.
Colavizza G., Infelise M., Kaplan F. (2015). Mapping the early modern news flow: An enquiry by robust text reuse detection. Springer Verlag [10.1007/978-3-319-15168-7_31].
Mapping the early modern news flow: An enquiry by robust text reuse detection
Colavizza G.;
2015
Abstract
Early modern printed gazettes relied on a system of news exchange and text reuse largely based on handwritten sources. The reconstruction of this information exchange system is possible by detecting reused texts. We present a method to individuate text borrowings within noisy OCRed texts from printed gazettes based on string kernels and local text alignment. We apply our methods on a corpus of Italian gazettes for the year 1648. Beside unveiling substantial overlaps in news sources, we are able to assess the editorial policy of different gazettes and account for a multi-faceted system of text reuse.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.


