Inception Models for Fashion Image Captioning: An Extensive Study on Multiple Datasets

Del Moro, Mirko; Tudosie, Serban Cristian; Vannoni, Francesco; Galassi, Andrea; Ruggeri, Federico

doi:10.1007/978-3-031-42448-9_1

Fashion e-commerce platforms are becoming increasingly popular. However, scanning, rendering, and captioning fashion items are still done mostly manually. In this work, we address the task of generating a textual description of a fashion item from an image portraying it. We carry out an extensive study with several neural architectures based on InceptionV3. We consider two existing fashion image captioning datasets, FACAD and InFashAI. We also curate a novel dataset, Fashion-Cap, which contains more than 290,000 images and 40,000 corresponding captions. In our analysis, we observe significant differences between the three datasets’ captions, with Fashion-Cap having higher-quality captions. To the best of our knowledge, this is the most extensive experimental study in fashion image captioning to date. Our experimental results show that Fashion-Cap is less challenging than FACAD but more challenging than InFashAI, which confirms our insights and suggests that it could be a valuable benchmark for this domain.

Del Moro, M., Tudosie, S.C., Vannoni, F., Galassi, A., Ruggeri, F. (2023). Inception Models for Fashion Image Captioning: An Extensive Study on Multiple Datasets. Cham : Springer [10.1007/978-3-031-42448-9_1].

Co-first authors: Del Moro, Mirko; Tudosie, Serban Cristian; Vannoni, Francesco

Year: 2023
Volume title: Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2023.
Pages: 3-14
Series: Lecture Notes in Computer Science
DOI: https://dx.doi.org/10.1007/978-3-031-42448-9_1
Publication type: 4.01 Contribution in conference proceedings

Files in this record:

File: _23_CLEF__Image_Captioning_for_Fashion.pdf
Access: Open Access from 12 September 2024
Type: Postprint / Author's Accepted Manuscript (AAM), the version accepted for publication after peer review
License: Free open-access license
Size: 1.22 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/941313
