Casadei, R., Delnevo, G., Mirri, S. (2025). Human-Under-Test and Continual Bidirectional Assessment for Co-development of Human-AI Systems. CEUR-WS.
Human-Under-Test and Continual Bidirectional Assessment for Co-development of Human-AI Systems
Casadei, R.; Delnevo, G.; Mirri, S.
2025
Abstract
Recent developments in artificial intelligence (AI) and large language models (LLMs) promote collaboration between humans and AI-based agents. However, the use of AI carries risks, e.g., over-reliance and unintended consequences stemming from structural issues and mistakes on both sides. Given that AI is a tool with intrinsic strengths and weaknesses, there are also responsibilities on the human side regarding how the tool is used. For the human-AI system to be effective, each actor should understand the limitations and risks of both parties and adopt strategies to mitigate them. Therefore, in this position paper, we propose a model and process for continual bidirectional assessment and co-development of human-AI systems. Whereas research has mostly focused on the evaluation of AI agents, we focus especially on the human. Through an analogy with software testing, we propose a “human-under-test” schema, in which the AI agent proactively inspects the human user to identify potential issues (e.g., in knowledge, expectations, or process consistency) that might negatively affect the collaboration.
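To make the testing analogy concrete, the following minimal Python sketch illustrates one way a “human-under-test” probe could be framed. It is purely our illustration, not an API or design from the paper: the names `HumanProbe`, `Finding`, and `check` are hypothetical, and the “test” here is a single calibration question whose mismatch with the agent's expectation is recorded like a failing unit test.

```python
# Illustrative sketch only; all names are hypothetical, not from the paper.
from dataclasses import dataclass, field

@dataclass
class Finding:
    """A potential issue detected on the human side of the collaboration."""
    dimension: str  # e.g., "knowledge", "expectations", "process consistency"
    detail: str

@dataclass
class HumanProbe:
    """Minimal 'human-under-test' loop: the agent proactively checks the user."""
    findings: list[Finding] = field(default_factory=list)

    def check(self, dimension: str, question: str, expected: str, answer: str) -> None:
        # Compare the user's answer with what the agent expects; a mismatch
        # is recorded as a finding, analogous to a failing test case.
        if answer.strip().lower() != expected.strip().lower():
            self.findings.append(
                Finding(dimension, f"{question!r}: got {answer!r}, expected {expected!r}")
            )

    def report(self) -> list[Finding]:
        return self.findings

# Example: probing whether the user over-estimates the tool's capabilities.
probe = HumanProbe()
probe.check(
    dimension="knowledge",
    question="Can the model verify facts against live sources?",
    expected="no",
    answer="yes",  # an over-reliant user's answer
)
for f in probe.report():
    print(f"[{f.dimension}] {f.detail}")
```

In this framing, each finding would feed back into the continual bidirectional assessment loop, prompting either the human or the agent to adapt.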
| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| Human-Under-Test and Continual Bidirectional Assessment for Co-development of Human-AI Systems.pdf | Open access | Postprint / Author's Accepted Manuscript (AAM), version accepted for publication after peer review | Creative Commons Attribution (CC BY) | 404.86 kB | Adobe PDF |
| paper2.pdf | Open access | Publisher's version (PDF) / Version of Record | Creative Commons Attribution (CC BY) | 1.17 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.