Ciabattini, L., Sciullo, L., Esposito, A., Zyrianoff, I., Di Felice, M. (2024). RATTLE: Train Identification Through Audio Fingerprinting. Los Alamitos, CA, USA: Institute of Electrical and Electronics Engineers Inc. [10.1109/smartcomp61445.2024.00028].
RATTLE: Train Identification Through Audio Fingerprinting
Ciabattini, Leonardo; Sciullo, Luca; Esposito, Alfonso; Zyrianoff, Ivan; Di Felice, Marco
2024
Abstract
Train model identification can enhance the structural monitoring of railway infrastructure by providing contextual information about train passages. Timetable-based approaches are impractical because of delays, while camera-based solutions raise deployment-cost and privacy concerns. In this paper, we propose RATTLE, a self-contained framework for train tracking and identification based on audio signal fingerprinting. We developed a prototype IoT system tailored for train tracking and ground-truth assessment, which enabled the acquisition of a real-world dataset spanning four months of measurements. We then conducted a comparative analysis of several traditional Machine Learning (ML) and Deep Learning (DL) algorithms for audio feature classification, mel spectrogram classification, and image classification (the latter serving as a baseline). Our findings show that CNNs trained on mel spectrograms achieve high accuracy (97%), comparable to the best video-based DL solution, while substantially reducing model size. Furthermore, we explored the potential for migrating the classification task to the edge through quantisation techniques.