Integration of Patent and Company Databases

MAGNANI, MATTEO;MONTESI, DANILO
2007

Abstract

This paper describes an information integration activity carried out on databases of patent data and company indicators. In particular, we present a detailed case study on company name matching. We show how to choose and tune existing methods for this domain, and describe an efficient implementation that processes large volumes of data. The integration relies on approximate string matching techniques. We then report experimental results obtained on real data sets, highlighting the pros and cons of approximate string matching in this specific domain, and analyze the impact of domain knowledge on the matching results.
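To make the idea of approximate company name matching concrete, the following is a minimal sketch and not the method used in the paper. It assumes a normalized edit-distance similarity and a small, hypothetical list of legal-form suffixes (`LEGAL_FORMS`) standing in for domain knowledge; the function names `normalize`, `edit_distance`, and `similarity` are illustrative only.

```python
import re

# Hypothetical domain knowledge: legal-form tokens that carry little
# identifying information. A real list would be curated per country.
LEGAL_FORMS = {"inc", "corp", "corporation", "ltd", "limited",
               "spa", "srl", "gmbh", "ag", "co", "company"}

def normalize(name: str) -> str:
    """Lowercase, drop dots, replace other punctuation with spaces,
    and remove legal-form tokens."""
    text = name.lower().replace(".", "")
    tokens = re.sub(r"[^a-z0-9 ]", " ", text).split()
    return " ".join(t for t in tokens if t not in LEGAL_FORMS)

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row (two rows in memory)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two names after normalization."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest
```

Under these assumptions, `similarity("FIAT AUTO SPA", "Fiat Auto S.p.A.")` treats the two spellings as identical once the legal form is stripped, while a misspelling such as "Fiat Avto" still scores close to 1. In practice, a threshold on the score would have to be tuned on real data.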
Proceedings of the 11th International Database Engineering and Applications Symposium (IDEAS), pp. 163-171
M. Magnani; D. Montesi
Use this identifier to cite or link to this document: http://hdl.handle.net/11585/55533