
Self-supervised Bernoulli Autoencoders for Semi-supervised Hashing

Stefano Lodi (Member of the Collaboration Group); Claudio Sartori (Supervision)
2021

Abstract

Semantic hashing is a technique to represent high-dimensional data using similarity-preserving binary codes for efficient indexing and search. Recently, variational autoencoders with Bernoulli latent representations achieved remarkable success in learning such codes in supervised and unsupervised scenarios, outperforming traditional methods thanks to their ability to handle the binary constraints architecturally. In this paper, we propose a novel supervision method (self-supervision) for variational autoencoders, where the model uses its own predictions of the label distribution to implement the pairwise objective function. We also investigate the robustness of hashing methods based on variational autoencoders to the lack of supervision, focusing on two semi-supervised approaches currently in use. Our experiments on text and image retrieval tasks show that, as expected, both methods can significantly increase the quality of the hash codes as the number of labelled observations increases, but deteriorate when the number of labelled samples decreases. In this scenario, the proposed self-supervised approach outperforms the classical approaches and yields similar performance in fully-supervised settings.
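To make the retrieval step concrete, the following is a minimal sketch of how binary hash codes can be searched by Hamming distance. It is not the paper's implementation: the 8-bit codes, the document names, and the helpers `hamming` and `search` are hypothetical and chosen only for illustration. In the setting described above, the codes would come from the encoder of a Bernoulli variational autoencoder.

```python
from typing import Dict, List, Tuple


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def search(query: int, index: Dict[str, int], k: int = 2) -> List[Tuple[str, int]]:
    """Return the k indexed items closest to the query in Hamming distance."""
    ranked = sorted(
        ((name, hamming(query, code)) for name, code in index.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:k]


# Hypothetical 8-bit codes (assumption: a trained encoder would produce these).
index = {
    "doc_a": 0b10110010,
    "doc_b": 0b10110110,
    "doc_c": 0b01001101,
}
query = 0b10110011

results = search(query, index, k=2)
print(results)  # [('doc_a', 1), ('doc_b', 2)]
```

Because similar items are meant to receive nearby codes, ranking by Hamming distance needs only an XOR and a bit count per comparison, which is what makes indexing and search over such codes efficient.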
2021
Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - 25th Iberoamerican Congress, CIARP 2021, Porto, Portugal, May 10-13, 2021, Revised Selected Papers
pp. 258-268
Ricardo Ñanculef, Francisco Alejandro Mena, Antonio Macaluso, Stefano Lodi, Claudio Sartori

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/876189