A paradox in ML design: Less data for a smarter water metering cognification experience

Roccetti, M.; Delnevo, G.; Casini, L.; Zagni, N.; Cappiello, G.

doi:10.1145/3342428.3342685

Many data scientists now point out that how much Machine Learning (ML) research crosses into practice will depend not only on the ability of the specialized algorithms used to scrutinize positive/negative examples, but also on the quality of the data used to train those algorithms. Our experience of training a neural network on a huge dataset comprising over fifteen million water meter readings confirms this conjecture. In this paper, we report on the actions we took to extract from that database just those data that could correctly represent the complex statistical phenomenon in play. With an adequate reorganization of those data, we obtained an interesting, yet controversial, result. On the one hand, we improved the accuracy of predicting when a water meter fails or needs disassembly, based on a history of water consumption measurements, thus making the meter maintenance process smarter. On the other hand, all this came with the paradox of a (statistical) transformation of the initial dataset: while we alleviate a problem with a restructured and more interpretable data model, we simultaneously change the replicated form of those data.

Roccetti, M. (2019). A paradox in ML design: Less data for a smarter water metering cognification experience. New York : ACM [10.1145/3342428.3342685].



Year: 2019
Volume title: ACM International Conference Proceeding Series
First page: 201
Last page: 206
DOI: https://dx.doi.org/10.1145/3342428.3342685
Citation: Roccetti, M. (2019). A paradox in ML design: Less data for a smarter water metering cognification experience. New York : ACM [10.1145/3342428.3342685].
All authors: Roccetti, M., Delnevo, G., Casini, L., Zagni, N., Cappiello, G.
Publication type: 4.01 Contribution in conference proceedings


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/701775
