Grocery product detection and recognition

Franco, Annalisa; Maltoni, Davide; Papi, Serena
2017

Abstract

Object detection and recognition are challenging computer vision tasks that receive considerable attention because of their many applications. This work focuses on detecting and recognizing products on supermarket shelves, a setting with several practical applications, such as providing additional product and price information to the user or guiding visually impaired customers while they shop. The automatic creation of planograms (i.e., the actual layout of products on shelves) is also useful for commercial analysis and for the management of large stores. In many object detection and recognition contexts, training images can be assumed to be representative of real operating conditions; in our scenario this assumption is not realistic, because the only training images available are acquired under well-controlled conditions. This gap between training and test data makes detection and recognition far more complex and calls for very robust techniques. In this paper we show that good results can be obtained by exploiting color and texture information in a multi-stage process: pre-selection, fine-selection, and post-processing. For fine-selection, we compare a classical Bag of Words technique with a more recent Deep Neural Network approach and report how they compare. Extensive experiments on datasets of varying complexity are discussed to highlight the main issues characterizing this problem and to guide the practical development of a real application.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/588325