We propose a highly structured neural network architecture for semantic segmentation with an extremely small model size, suitable for low-power embedded and mobile platforms. Specifically, our architecture combines i) a Haar wavelet-based tree-like convolutional neural network (CNN), ii) a random layer realizing a radial basis function kernel approximation, and iii) a linear classifier. Stages i) and ii) are completely prespecified; only the linear classifier is learned from data. We apply the proposed architecture to outdoor scene and aerial image semantic segmentation and show that its accuracy is competitive with conventional pixel classification CNNs. Furthermore, we demonstrate that the proposed architecture is data-efficient in the sense of matching the accuracy of pixel classification CNNs when trained on a much smaller data set.
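The three stages named in the abstract can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: stage i) is reduced to a single level of 2x2 Haar filtering with a modulus nonlinearity, stage ii) uses random Fourier features (one well-known way to approximate a radial basis function kernel; the abstract does not say which construction is used), and stage iii) is a ridge-regression classifier on one-hot targets. All function names, the parameter values `gamma` and `lam`, and the toy image are hypothetical.

```python
import numpy as np


def haar_features(img):
    """Stage i) stand-in: one level of 2x2 Haar filtering plus a modulus.

    Assumed simplification; a tree-like network would cascade such filters.
    Returns an array of shape (H - 1, W - 1, 4).
    """
    a, b = img[:-1, :-1], img[:-1, 1:]
    c, d = img[1:, :-1], img[1:, 1:]
    low = (a + b + c + d) / 2.0      # local average (low-pass)
    horiz = (a - b + c - d) / 2.0    # left-right difference
    vert = (a + b - c - d) / 2.0     # top-bottom difference
    diag = (a - b - c + d) / 2.0     # diagonal difference
    return np.stack([low, np.abs(horiz), np.abs(vert), np.abs(diag)], axis=-1)


def draw_rff_params(dim, n_features, gamma, rng):
    """Stage ii) parameters: drawn once at random and never trained."""
    # For k(x, y) = exp(-gamma * ||x - y||^2) the spectral density is
    # Gaussian with standard deviation sqrt(2 * gamma) per dimension.
    w = rng.normal(scale=np.sqrt(2.0 * gamma), size=(dim, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return w, b


def rff_map(x, w, b):
    """Map rows of x so that inner products approximate the RBF kernel."""
    return np.sqrt(2.0 / w.shape[1]) * np.cos(x @ w + b)


def fit_linear(z, labels, n_classes, lam=1e-3):
    """Stage iii) stand-in: ridge regression onto one-hot class targets."""
    y = np.eye(n_classes)[labels]
    a = z.T @ z + lam * np.eye(z.shape[1])
    return np.linalg.solve(a, z.T @ y)


def predict(z, weights):
    """Assign each feature vector to the class with the largest score."""
    return np.argmax(z @ weights, axis=1)


# Toy usage: a 16x16 image whose right half is brighter than its left half.
rng = np.random.default_rng(0)
img = rng.random((16, 16))
img[:, 8:] += 2.0

feats = haar_features(img).reshape(-1, 4)            # one row per position
labels = (np.arange(15)[None, :] >= 8).repeat(15, axis=0).reshape(-1).astype(int)

w, b = draw_rff_params(dim=4, n_features=64, gamma=0.5, rng=rng)
z = rff_map(feats, w, b)
weights = fit_linear(z, labels, n_classes=2)         # the only learned part
pred = predict(z, weights)
```

Only `weights` is estimated from data in this sketch; the Haar filters are fixed and the random layer is sampled once, which mirrors the split between prespecified and learned stages described above.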
Deep structured features for semantic segmentation / Tschannen, Michael; Cavigelli, Lukas; Mentzer, Fabian; Wiatowski, Thomas; Benini, Luca. - ELECTRONIC. - 2017-:(2017), pp. 8081169.61-8081169.65. (Paper presented at the 25th European Signal Processing Conference, EUSIPCO 2017, held at Kos International Convention Center, Greece, in 2017) [10.23919/EUSIPCO.2017.8081169].
Deep structured features for semantic segmentation
Benini, Luca
2017