Is bigger always better? A controversial journey to the center of machine learning design, with uses and misuses of big data for predicting water meter failures

Roccetti M.; Delnevo G.; Casini L.; Cappiello G.

2019

Abstract

In this paper, we describe the design of a machine learning-based classifier tailored to predict whether a water meter will fail or need replacement. Our initial attempt to train a recurrent deep neural network (RNN) on 15 million readings gathered from 1 million mechanical water meters spread throughout Northern Italy led to unsatisfactory results. We learned that this was due to a lack of attention to the quality of the analyzed data. We therefore developed a novel methodology, based on a new semantics that we enforced on the training data. This allowed us to extract only those samples that are representative of the complex phenomenon of defective water meters. With this methodology, the accuracy of our RNN exceeded the 80% threshold. At the same time, we realized that the new training dataset differed significantly, in statistical terms, from the initial dataset, leading to an apparent paradox. With our contribution, we demonstrate how to reconcile this paradox, showing that our classifier can help detect defective meters while simplifying replacement procedures.

Year	2019
Journal	JOURNAL OF BIG DATA
DOI	https://dx.doi.org/10.1186/s40537-019-0235-y
Citation	Is bigger always better? A controversial journey to the center of machine learning design, with uses and misuses of big data for predicting water meter failures / Roccetti, M., Delnevo, G., Casini, L., Cappiello, G. - In: JOURNAL OF BIG DATA. - ISSN 2196-1115. - PRINT. - 6:1(2019), pp. 70.1-70.23. [10.1186/s40537-019-0235-y]
All authors	Roccetti, M., Delnevo, G., Casini, L., Cappiello, G.
Appears in types	1.01 Journal article

Files in this item:

File	Access	Type	License	Size	Format
Roccetti2019_Article_IsBiggerAlwaysBetterAControver.pdf	Open access	Publisher's version (PDF)	Open Access License. Creative Commons Attribution (CC BY)	1.53 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/697121
