CRIS Current Research Information System

A recent trend in algorithm design consists of augmenting classic data structures with machine learning models, which are better suited to reveal and exploit patterns and trends in the input data so to achieve outstanding practical improvements in space occupancy and time efficiency. This is especially known in the context of indexing data structures where, despite few attempts in evaluating their asymptotic efficiency, theoretical results are yet missing in showing that learned indexes are provably better than classic indexes, such as B+ -trees and their variants. In this paper, we present the first mathematically-grounded answer to this open problem. We obtain this result by discovering and exploiting a link between the original problem and a mean exit time problem over a proper stochastic process which, we show, is related to the space and time occupancy of those learned indexes. Our general result is then specialised to five well-known distributions: Uniform, Lognormal, Pareto, Exponential, and Gamma; and it is corroborated in precision and robustness by a large set of experiments

Ferragina, P., Lillo, F., Vinciguerra, G. (2020). Why Are Learned Indexes So Effective?.

Why Are Learned Indexes So Effective?

Paolo Ferragina;Fabrizio Lillo;Giorgio Vinciguerra

2020

Abstract

A recent trend in algorithm design consists of augmenting classic data structures with machine learning models, which are better suited to reveal and exploit patterns and trends in the input data so to achieve outstanding practical improvements in space occupancy and time efficiency. This is especially known in the context of indexing data structures where, despite few attempts in evaluating their asymptotic efficiency, theoretical results are yet missing in showing that learned indexes are provably better than classic indexes, such as B+ -trees and their variants. In this paper, we present the first mathematically-grounded answer to this open problem. We obtain this result by discovering and exploiting a link between the original problem and a mean exit time problem over a proper stochastic process which, we show, is related to the space and time occupancy of those learned indexes. Our general result is then specialised to five well-known distributions: Uniform, Lognormal, Pareto, Exponential, and Gamma; and it is corroborated in precision and robustness by a large set of experiments

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2020
			
	Titolo del volume
	
				37th International Conference on Machine Learning (ICML 2020)
			
	Pagina iniziale
	
				3123
			
	Pagina finale
	
				3132
			
	Collana/Serie
	
				PROCEEDINGS OF MACHINE LEARNING RESEARCH
			
	Citazione
	
				Ferragina, P., Lillo, F., Vinciguerra, G. (2020). Why Are Learned Indexes So Effective?.
			
	Tutti gli autori
	
						Ferragina, Paolo; Lillo, Fabrizio; Vinciguerra, Giorgio
					
	Appare nelle tipologie:
	
				4.01 Contributo in Atti di convegno

File in questo prodotto:

File	Dimensione	Formato
ferragina20a (2).pdf accesso aperto Tipo: Versione (PDF) editoriale / Version Of Record Licenza: Licenza per accesso libero gratuito Dimensione 1.02 MB Formato Adobe PDF Visualizza/Apri	1.02 MB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/797365

Citazioni

ND

23

7

social impact