
Representing Programs with Dependency and Function Call Graphs for Learning Hierarchical Embeddings / Romanov V; Ivanov V; Succi G. - ELECTRONIC. - 2:(2020), pp. 360-366. (Paper presented at the 22nd International Conference on Enterprise Information Systems (ICEIS 2020), held online via streaming, May 5-7, 2020) [10.5220/0009511803600366].

Representing Programs with Dependency and Function Call Graphs for Learning Hierarchical Embeddings

Romanov V; Ivanov V; Succi G

2020

Abstract

Any source code can be represented as a graph. This kind of representation captures the interactions between the elements of a program, such as functions and variables. Modeling these interactions can enable us to infer the purpose of a code snippet, a function, or even an entire program. Lately, a growing body of work represents source code in the form of a graph. One of the difficulties in evaluating the usefulness of such a representation is the lack of a proper dataset and an evaluation metric. Our contribution is a dataset that represents programs written in Python and Java in the form of dependency and function call graphs. In this dataset, multiple projects are analyzed and united into a single graph. The nodes of the graph represent functions, variables, classes, methods, interfaces, etc. Nodes for functions carry information about how these functions are constructed internally and where they are called from. Such graphs enable training hierarchical vector representations for source code. Moreover, some functions come with textual descriptions (docstrings), which enables learning useful tasks such as API search and documentation generation.
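
As a rough illustration of the kind of graph the abstract describes, the sketch below extracts a tiny function call graph from Python source with the standard `ast` module. It is not the authors' extraction pipeline: the sample source, the `build_call_graph` helper, and the choice to record only calls to plain names are all assumptions made for this example.

```python
import ast

# Hypothetical two-function program used only for this illustration.
SOURCE = '''
def load(path):
    """Read a file and return its text."""
    return open(path).read()

def count_words(path):
    """Count words in a file."""
    return len(load(path).split())
'''


def build_call_graph(source):
    """Return (nodes, edges) for the functions defined in `source`.

    nodes maps each function name to its docstring (or None).
    edges is a set of (caller, callee) pairs, recorded only for calls
    to plain names such as `load(...)`; attribute calls such as
    `text.split()` are skipped to keep the sketch small.
    Calls inside nested functions are also attributed to the enclosing one.
    """
    tree = ast.parse(source)
    nodes = {}
    edges = set()
    for func in ast.walk(tree):
        if isinstance(func, ast.FunctionDef):
            nodes[func.name] = ast.get_docstring(func)
            for node in ast.walk(func):
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    edges.add((func.name, node.func.id))
    return nodes, edges


nodes, edges = build_call_graph(SOURCE)
for caller, callee in sorted(edges):
    print(f"{caller} -> {callee}")
```

Here each function node carries its docstring, which mirrors the pairing of graph nodes with textual descriptions mentioned above; a dataset-scale version would also need to resolve names across modules and projects, which this sketch does not attempt.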

	Year
	
			2020
		
	Volume title
	
			Proceedings of the 22nd International Conference on Enterprise Information Systems (ICEIS)
		
	First page
	
			360
		
	Last page
	
			366
		
	DOI
	
			https://dx.doi.org/10.5220/0009511803600366
		
	All authors
	
			Romanov V; Ivanov V; Succi G
		
	Appears in types:
	
			4.01 Contribution in conference proceedings

Files in this item:

File	Size	Format
Succi.C304.RepresentingProgramswithDependencyandFunctionCallGraphsforLearningHierarchicalEmbeddings.pdf (open access; type: publisher's version (PDF); license: Open Access License, Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND))	400.58 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/892511
