Franco, A., Maltoni, D., Papi, S. (2017). Grocery product detection and recognition. Expert Systems with Applications, 81, 163-176 [10.1016/j.eswa.2017.02.050].
Grocery product detection and recognition
Franco, Annalisa; Maltoni, Davide; Papi, Serena
2017
Abstract
Object detection and recognition are challenging computer vision tasks that receive great attention because of their many applications. This work focuses on the detection and recognition of products on supermarket shelves. This framework has several practical applications, such as providing additional product or price information to the user, or guiding visually impaired customers while they shop. The automatic creation of planograms (i.e., the actual layout of products on shelves) is also useful for the commercial analysis and management of large stores. In many object detection and recognition contexts, training images can be assumed to be representative of the real operating conditions. In our scenario this assumption is not realistic, because the only training images available are acquired under well-controlled conditions. This gap between training and test data makes detection and recognition far more complex and calls for very robust techniques. In this paper we show that good results can be obtained by exploiting color and texture information in a multi-stage process: pre-selection, fine-selection and post-processing. For fine-selection we compared a classical Bag of Words technique with a more recent deep neural network approach and found interesting outcomes. Extensive experiments on datasets of varying complexity are discussed to highlight the main issues that characterize this problem and to guide the practical development of a real application.