To Each His Own: Accommodating Data Variety by a Multimodel Star Schema

Sandro Bimonte; Yassine Hifdi; Mohammed Maliari; Patrick Marcel; Stefano Rizzi

2020

Abstract

Recent approaches adopt multimodel databases (MMDBs) to natively handle the variety issues arising from the increasing amounts of heterogeneous data (structured, semi-structured, graph-based, etc.) made available. However, when it comes to analyzing these data, traditional data warehouses (DWs) and OLAP systems fall short because they rely on relational Database Management Systems (DBMSs) for storage and querying, thus forcing data variety into the rigidity of a structured schema. This paper provides a preliminary investigation of the performance of an MMDB when used to store multidimensional data for OLAP analysis. A multimodel DW would store each of its elements according to its native model; the benefits we envision for this solution include bridging the architectural gap between data lakes and DWs, reducing the cost of ETL data transformations, and ensuring better flexibility, extensibility, and evolvability thanks to the use of schemaless models. To support our investigation, we present an implementation, based on the UniBench benchmark dataset, that extends a star schema with JSON, XML, spatial, and key-value data; we also define a sample OLAP workload and use it to test the performance of our solution and compare it with that of a classical star schema. As expected, the full-relational implementation performs better, but we believe that this gap could be offset by the benefits of multimodel in dealing with variety. Finally, we give our perspective on research on this topic.

Year: 2020
Volume title: Proceedings of the 22nd International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data co-located with EDBT/ICDT 2020 Joint Conference, DOLAP@EDBT/ICDT 2020
First page: 66
Last page: 73
Series: CEUR Workshop Proceedings
Citation: To Each His Own: Accommodating Data Variety by a Multimodel Star Schema / Sandro Bimonte, Yassine Hifdi, Mohammed Maliari, Patrick Marcel, Stefano Rizzi. - Electronic. - 2572:(2020), pp. 66-73. (Paper presented at the 22nd International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data (DOLAP 2020), held in Copenhagen, Denmark, on March 30, 2020).
All authors: Sandro Bimonte, Yassine Hifdi, Mohammed Maliari, Patrick Marcel, Stefano Rizzi
Appears in types: 4.01 Contribution in conference proceedings

Files in this item:

File	Access	Type	License	Size	Format
dolap20-M3D.pdf	Open access	Publisher's version (PDF)	Open access license, Creative Commons Attribution (CC BY)	1.15 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/752105
