The controversy surrounding COMPAS exposes a significant gap between computer science and social science in understanding bias, highlighting the need to align computational fairness metrics with humanistic interpretations. In response, we introduce the EVENS benchmark to align the Equality versus Equity Notion Spectrum in LLMs. Our contributions are fourfold: (1) constructing an equality-equity notion spectrum and generating a corresponding dataset of key fairness scenarios; (2) evaluating models' initial stances and testing how those stances adjust under external legal regulations and internal organizational regulations supplied through Retrieval-Augmented Generation (RAG); (3) introducing Chain-of-Thought (CoT) prompting to guide fairness reasoning; and (4) adding an uncertain choice to assess its impact. Our findings indicate that LLMs initially favor equality over equity. Incorporating legal and organizational regulations on equity through RAG can reduce proportional equality in most models and significantly enhance equity recognition in GPT-4o. CoT improves the equity reasoning of Chinese models but may also rationalize existing biases, and the uncertain option promotes more cautious responses. The code and datasets are available at https://github.com/CrexCheng/EVEN.
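The evaluation conditions named in the abstract (initial stance, regulation supplied through RAG, CoT prompting, and an added uncertain choice) can be pictured as variants of one multiple-choice prompt. The following is only a minimal sketch under stated assumptions: the `Scenario` class, the `build_prompt` function, the option wording, and the instruction strings are all hypothetical and are not taken from the EVENS code or dataset. The retrieval step itself is omitted, and the regulation text is assumed to have been retrieved already.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Scenario:
    """A hypothetical fairness scenario with answer options along the spectrum."""
    question: str
    options: List[str]


# Hypothetical wording; the actual benchmark prompts may differ.
UNCERTAIN = "Uncertain"
DIRECT_CUE = "Answer with the letter of one option."
COT_CUE = ("Think step by step about which notion of fairness applies, "
           "then give the letter of your answer.")


def build_prompt(scenario: Scenario,
                 regulation: Optional[str] = None,
                 use_cot: bool = False,
                 allow_uncertain: bool = False) -> str:
    """Assemble one prompt variant for a scenario.

    regulation      -- already-retrieved legal or organizational text (RAG condition)
    use_cot         -- append a step-by-step reasoning cue (CoT condition)
    allow_uncertain -- append an extra "Uncertain" option
    """
    options = list(scenario.options)
    if allow_uncertain:
        options.append(UNCERTAIN)

    parts = []
    if regulation is not None:
        parts.append("Relevant regulation:\n" + regulation)
    parts.append("Scenario: " + scenario.question)
    for i, option in enumerate(options):
        parts.append(f"{chr(ord('A') + i)}. {option}")
    parts.append(COT_CUE if use_cot else DIRECT_CUE)
    return "\n".join(parts)
```

Calling `build_prompt(scenario)` gives the baseline condition used to read off an initial stance. Passing `regulation=...`, `use_cot=True`, or `allow_uncertain=True` yields the other conditions, so that a model's answers can be compared across them.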

Chen, Q., Cheng, R., Liu, Y., Xie, Z., Zhao, K., Li, P., et al. (2026). EVENS: Equality versus Equity Notion Spectrum of LLMs. Association for Computing Machinery, Inc [10.1145/3769126.3769262].

EVENS: Equality versus Equity Notion Spectrum of LLMs

Chen Q.; Cheng R.; Xie Z.; Rotolo A.
2026

20th International Conference on Artificial Intelligence and Law, ICAIL 2025 - Proceedings of the Conference
Pages 374-378
Chen, Q.; Cheng, R.; Liu, Y.; Xie, Z.; Zhao, K.; Li, P.; Antonino, R.; Liu, Y.; Shen, W.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1043591