Magnini, M., Aguzzi, G., Montagna, S. (2025). Open-source small language models for personal medical assistant chatbots. INTELLIGENCE-BASED MEDICINE, 11, 1-9 [10.1016/j.ibmed.2024.100197].
Open-source small language models for personal medical assistant chatbots
Matteo Magnini; Gianluca Aguzzi; Sara Montagna
2025
Abstract
Medical chatbots are becoming essential components of telemedicine applications as tools to assist patients in the self-management of their conditions. This trend is particularly driven by advancements in natural language processing techniques with pre-trained language models (LMs). However, the integration of LMs into clinical environments faces challenges related to reliability and privacy. This study seeks to address these issues with a privacy-by-design architectural solution based on the fully local deployment of open-source LMs. Specifically, to mitigate any risk of information leakage, we focus on evaluating the performance of open-source small language models (SLMs) that can be deployed on personal devices, such as smartphones or laptops, without stringent hardware requirements. We assess the effectiveness of this solution using hypertension management as a case study. Models are evaluated across various tasks, including intent recognition and empathetic conversation, using Gemini Pro 1.5 as a benchmark. The results indicate that, for certain tasks such as intent recognition, Gemini outperforms the other models. However, by employing the “large language model (LLM) as a judge” approach for semantic evaluation of response correctness, we found that several models align closely with the ground truth. In conclusion, this study highlights the potential of locally deployed SLMs as components of medical chatbots, while addressing critical concerns related to privacy and reliability.

| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| 1-s2.0-S2666521224000644-main-6.pdf | Open access | Publisher's version (PDF) / Version of Record | Open Access License. Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND) | 885.61 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


