In this paper we study the problem-solving ability of the Large Language Model known as GPT-3 (codename DaVinci), by considering its performance in solving tasks proposed in the “Bebras International Challenge on Informatics and Computational Thinking”. In our experiment, GPT-3 was able to answer with a majority of correct answers about one third of the Bebras tasks we submitted to it. The linguistic fluency of GPT-3 is impressive and, at a first reading, its explanations sound coherent, on-topic and authoritative; however the answers it produced are in fact erratic and the explanations often questionable or plainly wrong. The tasks in which the system performs better are those that describe a procedure, asking to execute it on a specific instance of the problem. Tasks solvable with simple, one-step deductive reasoning are more likely to obtain better answers and explanations. Synthesis tasks, or tasks that require a more complex logical consistency get the most incorrect answers.

Davinci Goes to Bebras: A Study on the Problem Solving Ability of GPT-3

Lodi Michael;
2023

Abstract

In this paper we study the problem-solving ability of the Large Language Model known as GPT-3 (codename DaVinci), by considering its performance in solving tasks proposed in the “Bebras International Challenge on Informatics and Computational Thinking”. In our experiment, GPT-3 was able to answer with a majority of correct answers about one third of the Bebras tasks we submitted to it. The linguistic fluency of GPT-3 is impressive and, at a first reading, its explanations sound coherent, on-topic and authoritative; however the answers it produced are in fact erratic and the explanations often questionable or plainly wrong. The tasks in which the system performs better are those that describe a procedure, asking to execute it on a specific instance of the problem. Tasks solvable with simple, one-step deductive reasoning are more likely to obtain better answers and explanations. Synthesis tasks, or tasks that require a more complex logical consistency get the most incorrect answers.
2023
Proceedings of the 15th International Conference on Computer Supported Education (CSEDU 2023)
59
69
Bellettini Carlo; Lodi Michael; Lonati Violetta; Monga Mattia; Morpurgo Anna
File in questo prodotto:
File Dimensione Formato  
120075.pdf

accesso aperto

Tipo: Versione (PDF) editoriale
Licenza: Licenza per Accesso Aperto. Creative Commons Attribuzione - Non commerciale - Non opere derivate (CCBYNCND)
Dimensione 352.26 kB
Formato Adobe PDF
352.26 kB Adobe PDF Visualizza/Apri
csedu-postprint.pdf

accesso aperto

Tipo: Postprint
Licenza: Licenza per Accesso Aperto. Creative Commons Attribuzione - Non commerciale - Non opere derivate (CCBYNCND)
Dimensione 157.25 kB
Formato Adobe PDF
157.25 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/942139
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
social impact