CRIS Current Research Information System

Searching in a large data set those strings that are more similar, according to the edit distance, to a given one is a time-consuming process. In this paper we investigate the performance of metric trees, namely the M-tree, when they are extended using a cheap approximate distance function as a filter to quickly discard irrelevant strings. Using the bag distance as an approximation of the edit distance, we show an improvement in performance up to 90% with respect to the basic case. This, along with the fact that our solution is independent on both the distance used in the pre-test and on the underlying metric index, demonstrates that metric indices are a powerful solution, not only for many modern application areas, as multimedia, data mining and pattern recognition, but also for the string matching problem.

Bartolini I., Ciaccia P., Patella M. (2002). String matching with metric trees using an approximate distance. Springer Verlag [10.1007/3-540-45735-6_24].

String matching with metric trees using an approximate distance

Bartolini I.;Ciaccia P.;Patella M.

2002

Abstract

Searching in a large data set those strings that are more similar, according to the edit distance, to a given one is a time-consuming process. In this paper we investigate the performance of metric trees, namely the M-tree, when they are extended using a cheap approximate distance function as a filter to quickly discard irrelevant strings. Using the bag distance as an approximation of the edit distance, we show an improvement in performance up to 90% with respect to the basic case. This, along with the fact that our solution is independent on both the distance used in the pre-test and on the underlying metric index, demonstrates that metric indices are a powerful solution, not only for many modern application areas, as multimedia, data mining and pattern recognition, but also for the string matching problem.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2002
			
	Titolo del volume
	
				String Processing and Information Retrieval
			
	Pagina iniziale
	
				271
			
	Pagina finale
	
				283
			
	Collana/Serie
	
				LECTURE NOTES IN COMPUTER SCIENCE
			
	Codice DOI
	
				https://dx.doi.org/10.1007/3-540-45735-6_24
			
	Citazione
	
				Bartolini I.,  Ciaccia P.,  Patella M. (2002). String matching with metric trees using an approximate distance. Springer Verlag [10.1007/3-540-45735-6_24].
			
	Tutti gli autori
	
						Bartolini I.; Ciaccia P.; Patella M.

File in questo prodotto:

Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/918344

Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni

ND

39

ND

ND

social impact