Sabbatini, F., Calegari, R. (2025). Symbolic Knowledge Quality Evaluation with WInd. CEUR-WS.
Symbolic Knowledge Quality Evaluation with WInd
Calegari R.
2025
Abstract
In multi-agent systems and intelligent environments, agents often rely on symbolic knowledge to reason, interact, and make decisions in a transparent and trustworthy manner. Ensuring the quality of such symbolic knowledge is crucial, especially when it is automatically extracted from opaque models through explainable AI techniques. However, the literature still lacks comprehensive and unbiased evaluation metrics that jointly account for predictive accuracy, human interpretability, and semantic completeness, the three pillars of effective knowledge for agents. In this work, we introduce WInd, a novel and flexible scoring metric designed to assess the overall quality of symbolic knowledge in agent-based systems. WInd combines performance, readability, and completeness into a unified score, and further enables task-oriented customisation through the integration of user feedback. Its formulation supports automated knowledge tuning and facilitates knowledge sharing and comparison among agents with diverse goals and perspectives. We present the formal definition of WInd and provide a thorough comparative analysis against existing, yet limited, metrics. Our findings show that WInd offers a principled and adaptable framework for evaluating symbolic knowledge quality, paving the way for more autonomous, collaborative, and cognitively grounded intelligent agents.


