Casadei, R., Delnevo, G., Mirri, S. (2025). Human-Under-Test and Continual Bidirectional Assessment for Co-development of Human-AI Systems. CEUR-WS.
Human-Under-Test and Continual Bidirectional Assessment for Co-development of Human-AI Systems
Casadei, R.; Delnevo, G.; Mirri, S.
2025
Abstract
Recent developments in artificial intelligence (AI) and large language models (LLMs) promote collaboration between humans and AI-based agents. However, the use of AI carries risks, e.g., over-reliance and unintended consequences stemming from structural issues and mistakes on both sides. Given that AI is a tool with intrinsic strengths and weaknesses, there are also responsibilities on the human side regarding how the tool is used. For the human-AI system to be effective, each actor should understand the limitations and risks of both parties and adopt strategies to mitigate them. Therefore, in this position paper, we propose a model and process for continual bidirectional assessment and co-development of human-AI systems. Whereas research has mostly focused on the evaluation of AI agents, we focus especially on the human. Through an analogy with software testing, we propose a “human-under-test” schema, in which the AI agent proactively inspects the human user to identify potential issues (e.g., in knowledge, expectations, or process consistency) that might negatively affect the collaboration.
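To make the testing analogy concrete, the following minimal Python sketch illustrates one way a “human-under-test” probe could be framed. It is purely our illustration, not an API or design from the paper: the names `HumanProbe`, `Finding`, and `check` are hypothetical, and the “test” here is a single calibration question whose mismatch with the agent's expectation is recorded like a failing unit test.

```python
# Illustrative sketch only; all names are hypothetical, not from the paper.
from dataclasses import dataclass, field

@dataclass
class Finding:
    """A potential issue detected on the human side of the collaboration."""
    dimension: str  # e.g., "knowledge", "expectations", "process consistency"
    detail: str

@dataclass
class HumanProbe:
    """Minimal 'human-under-test' loop: the agent proactively checks the user."""
    findings: list[Finding] = field(default_factory=list)

    def check(self, dimension: str, question: str, expected: str, answer: str) -> None:
        # Compare the user's answer with what the agent expects; a mismatch
        # is recorded as a finding, analogous to a failing test case.
        if answer.strip().lower() != expected.strip().lower():
            self.findings.append(
                Finding(dimension, f"{question!r}: got {answer!r}, expected {expected!r}")
            )

    def report(self) -> list[Finding]:
        return self.findings

# Example: probing whether the user over-estimates the tool's capabilities.
probe = HumanProbe()
probe.check(
    dimension="knowledge",
    question="Can the model verify facts against live sources?",
    expected="no",
    answer="yes",  # an over-reliant user's answer
)
for f in probe.report():
    print(f"[{f.dimension}] {f.detail}")
```

In this framing, each finding would feed back into the continual bidirectional assessment loop, prompting either the human or the agent to adapt.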
| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| Human-Under-Test and Continual Bidirectional Assessment for Co-development of Human-AI Systems.pdf | Open access | Postprint / Author's Accepted Manuscript (AAM), version accepted for publication after peer review | Creative Commons Attribution (CC BY) | 404.86 kB | Adobe PDF |
| paper2.pdf | Open access | Publisher's version (PDF) / Version of Record | Creative Commons Attribution (CC BY) | 1.17 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.