Lamberti, G., Panzuto, F., Massironi, S., Cives, M., La Salvia, A., Spada, F., et al. (2026). ARTEMIS: a pilot study comparing AI-based and expert therapeutic decisions in simulated clinical cases of neuroendocrine neoplasms. NPJ DIGITAL MEDICINE, 9(1), 1-10 [10.1038/s41746-025-02274-x].
ARTEMIS: a pilot study comparing AI-based and expert therapeutic decisions in simulated clinical cases of neuroendocrine neoplasms
Lamberti G. (first author); Andrini E.; Ricci C.; Campana D. (last author)
2026
Abstract
Neuroendocrine neoplasms (NENs) are rare and heterogeneous malignancies requiring multidisciplinary management. Large language models (LLMs) are emerging as decision-support tools, but their role in therapeutic decision-making is largely unexplored. ARTEMIS was a pilot cross-sectional study that compared three configurations, namely a baseline GPT, a customised GPT with static domain knowledge (GPTs), and a retrieval-augmented GPT (RAG), against a panel of nine Italian NEN experts using twenty simulated, non-surgical cases. The primary endpoint was non-inferiority for systemic therapy recommendations; secondary endpoints included completeness, explicit uncertainty, parsimony of additional tests, costs, and variability metrics. RAG and GPTs achieved 70.0% agreement versus the expert benchmark (63.8%), meeting the exploratory –10% non-inferiority margin but not the stricter –5% threshold. Baseline GPT reached 60.0% and was not non-inferior. All AI systems consistently produced complete recommendations and expressed uncertainty more often than experts; RAG tended to propose fewer additional tests and lower associated costs. Experts showed greater variability than AI systems, and Ki-67 correlated with disagreement, indicating biological aggressiveness as a source of uncertainty. This exploratory study suggests that LLMs can approximate expert therapeutic reasoning under controlled conditions, but concordance remains limited and external validation in real-world settings is needed before clinical use.

| File | Size | Format |
|---|---|---|
| s41746-025-02274-x.pdf (open access). Type: Publisher's version (PDF) / Version Of Record. License: Open Access License, Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND) | 672.28 kB | Adobe PDF |
| 41746_2025_2274_MOESM1_ESM (1).pdf (open access). Type: Supplementary File. License: Open Access License, Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND) | 84.18 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


