Social networks are becoming an essential part of our lives and the fastest way to share ideas and influence people. Interaction within social networks tends to take place within communities, that is, sets of social accounts that share friendships, ideas, interests and passions; detecting digital communities is of increasing relevance from both a social and an economic point of view. In this paper, we analyze the problem of community detection from a content analysis perspective: we argue that the content produced in social interaction is a highly distinctive feature of a community, and hence can be used effectively for community detection. We approach the problem from a textual perspective, using only syntactic and semantic features, including high-level latent features that we call topics. We show that, by inspecting the content of tweets, we can obtain very efficient classifiers and predictors of account membership within a given community. We describe the features that best constitute a vocabulary, provide their comparative evaluation, and select the best features for the task; finally, we illustrate applications of our approach to concrete community detection scenarios, such as Italian politics and targeted advertising.

Ramponi Giorgia, Brambilla Marco, Ceri Stefano, Daniel Florian, Di Giovanni Marco (2020). Content-based characterization of online social communities. INFORMATION PROCESSING & MANAGEMENT, 57(6), 1-11 [10.1016/j.ipm.2019.102133].

Content-based characterization of online social communities

Di Giovanni Marco
2020

Ramponi Giorgia; Brambilla Marco; Ceri Stefano; Daniel Florian; Di Giovanni Marco
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/854392