Lacic, I. (2025). Deriving semantic classes of Italian adjectives via word embeddings: a large-scale investigation. Weesp: Global Wordnet Association [10.18653/v1/2025.gwc-1.34].

Deriving semantic classes of Italian adjectives via word embeddings: a large-scale investigation

Ivan Lacic
First author
2025

Abstract

This paper investigates the application of word embeddings to derive semantic classes for Italian adjectives. Adjectives were clustered using UMAP for dimensionality reduction and K-means for clustering. Semantic categories such as “Relational”, “Descriptive”, “Evaluative”, “Membership”, and “Physical/HealthRelated” were tested by employing predefined prototypical adjectives for each class. The precision and recall of the classification were analyzed, revealing high accuracy for some classes (e.g., “Evaluative”), but challenges in distinguishing more nuanced categories such as “Descriptive”. Furthermore, cluster overlaps were visualized using KDE and quantified using KNN, highlighting semantic intermingling between groups, especially between the “Descriptive” and “Evaluative” categories. Finally, a comparison with Wordnet’s adjective categories was provided.
Year: 2025
Published in: Proceedings of the 13th Global Wordnet Conference
Pages: 275-284
Files in this item:
File: 2025.gwc-1.34.pdf
Access: open access
Description: Contribution in conference proceedings
Type: Publisher's version (PDF) / Version of Record
License: Open access license. Creative Commons Attribution (CC BY)
Size: 12.67 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1029532