Ñanculef, R., Alejandro Mena, F., Macaluso, A., Lodi, S., Sartori, C. (2021). Self-supervised Bernoulli Autoencoders for Semi-supervised Hashing. Springer [10.1007/978-3-030-93420-0_25].
Self-supervised Bernoulli Autoencoders for Semi-supervised Hashing
Stefano Lodi (Member of the Collaboration Group); Claudio Sartori (Supervision)
2021
Abstract
Semantic hashing is a technique to represent high-dimensional data using similarity-preserving binary codes for efficient indexing and search. Recently, variational autoencoders with Bernoulli latent representations achieved remarkable success in learning such codes in supervised and unsupervised scenarios, outperforming traditional methods thanks to their ability to handle the binary constraints architecturally. In this paper, we propose a novel method for supervision (self-supervised) of variational autoencoders where the model uses its own predictions of the label distribution to implement the pairwise objective function. Also, we investigate the robustness of hashing methods based on variational autoencoders to the lack of supervision, focusing on two semi-supervised approaches currently in use. Our experiments on text and image retrieval tasks show that, as expected, both methods can significantly increase the quality of the hash codes as the number of labelled observations increases, but their performance deteriorates when the number of labelled samples decreases. In this scenario, the proposed self-supervised approach outperforms the classical approaches and yields similar performance in fully-supervised settings.
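As a rough illustration of why binary codes make search efficient, the sketch below ranks stored codes by Hamming distance to a query code. It is not the method of the paper: the function names `hamming_distance` and `hamming_search` and the 8-bit codes are hypothetical, chosen only for illustration, and in practice the codes would come from a trained encoder.

```python
from typing import List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    # XOR leaves a 1 exactly where the codes disagree.
    return bin(a ^ b).count("1")


def hamming_search(query: int, database: List[int], k: int) -> List[Tuple[int, int]]:
    """Return (index, distance) pairs for the k codes closest to the query."""
    scored = [(i, hamming_distance(query, code)) for i, code in enumerate(database)]
    # sorted() is stable, so ties keep their database order.
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Hypothetical 8-bit hash codes (assumed, not taken from the paper).
database = [0b10110010, 0b10110011, 0b01001101, 0b11110000]
query = 0b10110110
top2 = hamming_search(query, database, k=2)
```

Storing each code as an integer keeps the comparison to one XOR and one bit count per item, which is what makes binary codes attractive for large-scale retrieval.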