Laneve, C., Spano, A., Ressi, D., Rossi, S., Bugliesi, M. (2025). Assessing Code Understanding in LLMs. Gewerbestrasse: Springer International Publishing AG [10.1007/978-3-031-95497-9_13].
Assessing Code Understanding in LLMs
Laneve C.;
2025
Abstract
We present an empirical evaluation of Large Language Models (LLMs) in understanding semantic-preserving code transformations such as copy propagation and constant folding. Our results show that LLMs fail to recognize semantic equivalence in approximately 41% of cases without additional context, and in 29% of cases even when provided with a simple, generic context. To improve performance, we propose integrating LLMs with code optimization tools, both to enhance training and to support deeper program comprehension.

| File | Size | Format |
|---|---|---|
| Paper_Journal_CSV_STTT_2025_Accepted.pdf (embargoed until 10/07/2026). Type: Postprint / Author's Accepted Manuscript (AAM), the version accepted for publication after peer review. License: free open-access license. | 604.87 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


