Goyo, M., Frisoni, G., Moro, G., Sartori, C. (2024). Refining Triplet Sampling for Improved Self-Supervised Representation Learning.

Refining Triplet Sampling for Improved Self-Supervised Representation Learning

Giacomo Frisoni (co-first author); Gianluca Moro (co-first author); Claudio Sartori (co-first author)
2024

Abstract

Self-supervised representation learning extracts meaningful features from data without explicit supervision, building an embedding space with desired properties. Contrastive learning has emerged as the predominant approach: it clusters similar data points and separates dissimilar ones within the embedding space. Creating different views of the same data (e.g., cropping, rotation) emphasizes similarities without labels, but current methods struggle to define negative examples. Several algorithms consider only positive examples, or integrate dissimilarity measures into their loss functions by computing average distances within the same batch. These approaches do not capture nuanced differences effectively and risk collapsing data points into a single location. In this paper, we propose a novel technique, termed "Refined Triplet Sampling" (ReTSam), to generate synthetic negative vectors for contrastive learning. Mechanically, for each element in the batch, we identify its k-nearest neighbors and designate their centroid as a hard negative for a triplet loss. We test ReTSam on two widely used image datasets, CIFAR-10 and SVHN, considering content-based image retrieval and classification tasks. Our findings demonstrate that, despite its simplicity, ReTSam not only promotes the learning of similarity but also significantly improves the learning of dissimilarity (with a 5% increase in Mean Average Precision on CIFAR-10), resulting in superior performance in practical scenarios.
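The sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes squared Euclidean distance for the neighbor search, neighbors drawn from the same batch of embeddings, a standard margin-based triplet loss, and a positive that is the embedding of an augmented view of the anchor. The names `sq_dist`, `knn_centroid_negatives`, and `triplet_loss` are hypothetical.

```python
import math


def sq_dist(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_centroid_negatives(batch: list[list[float]], k: int) -> list[list[float]]:
    """For each embedding, return the centroid of its k nearest batch neighbors.

    The element itself is excluded from its own neighbor set, so the
    centroid is a synthetic vector that lies close to the anchor.
    """
    n = len(batch)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < batch size")
    negatives = []
    for i, anchor in enumerate(batch):
        others = [j for j in range(n) if j != i]
        # Indices of the k closest other elements in the batch.
        nearest = sorted(others, key=lambda j: sq_dist(anchor, batch[j]))[:k]
        centroid = [
            sum(batch[j][d] for j in nearest) / k for d in range(len(anchor))
        ]
        negatives.append(centroid)
    return negatives


def triplet_loss(
    anchor: list[float],
    positive: list[float],
    negative: list[float],
    margin: float = 1.0,  # assumed hyperparameter, not taken from the paper
) -> float:
    """Margin-based triplet loss: max(d(a, p) - d(a, n) + margin, 0)."""
    d_pos = math.sqrt(sq_dist(anchor, positive))
    d_neg = math.sqrt(sq_dist(anchor, negative))
    return max(d_pos - d_neg + margin, 0.0)


# Toy batch of 2-D embeddings; positives stand in for augmented views.
anchors = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]]
positives = [[a[0] + 0.1, a[1] - 0.1] for a in anchors]
negatives = knn_centroid_negatives(anchors, k=2)
losses = [triplet_loss(a, p, n) for a, p, n in zip(anchors, positives, negatives)]
mean_loss = sum(losses) / len(losses)
```

Because the centroid is built from the anchor's closest neighbors, it tends to sit near the anchor, which is what makes it a hard negative in this sketch. A real training loop would compute these quantities on encoder outputs with an automatic differentiation framework rather than plain lists.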
Proceedings of the 32nd Symposium of Advanced Database Systems, Villasimius, Italy, June 23rd to 26th, 2024, pp. 227-246
Goyo, Manuel; Frisoni, Giacomo; Moro, Gianluca; Sartori, Claudio
Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1009865