Machine learning (ML) is making its way into source code analysis. Most of the time, this happens with the help of Natural Language Processing (NLP) techniques, which typically represent their input as a sequence of tokens. This representation is reasonable for text, because words related to the same object usually follow each other. For source code, however, it can be inadequate because of the way code executes: related statements need not be adjacent in the text. Graphs can be a much better fit for representing source code because they can capture the dependency structure of a program. Following recent advances in machine learning on graphs, researchers have started to explore graph-based representations of software for machine learning applications. There is no single way to represent a program as a graph, so researchers have explored different alternatives, such as function call graphs (FCG), data flow graphs (DFG), control flow graphs (CFG), and their mixtures. In this survey, we give an overview of approaches for representing software as graphs and of how these representations help to solve machine learning tasks.

Romanov V, Ivanov V, Succi G (2020). Approaches for Representing Software as Graphs for Machine Learning Applications. IEEE [10.1109/ICS51289.2020.00109].

Approaches for Representing Software as Graphs for Machine Learning Applications

Succi G
2020

2020 International Computer Symposium (ICS), pp. 529-534
Romanov V; Ivanov V; Succi G

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/892525