OpenBioNER: Lightweight Open-Domain Biomedical Named Entity Recognition Through Entity Type Description

Alessio Cocchieri (co-first author); Giacomo Frisoni (co-first author); Marcos Martinez Galindo (second author); Gianluca Moro (co-first author); Giuseppe Tagliavini (penultimate author); Francesco Candoli (last author)

2025

Abstract

Biomedical Named Entity Recognition (BioNER) faces significant challenges in real-world applications due to limited annotated data and the constant emergence of new entity types, making zero-shot learning capabilities crucial. While Large Language Models (LLMs) possess the extensive domain knowledge necessary for specialized fields like biomedicine, their computational costs often make them impractical. To address these challenges, we introduce OpenBioNER, a lightweight BERT-based cross-encoder architecture that can identify any biomedical entity using only its description, eliminating the need for retraining on new, unseen entity types. Through comprehensive evaluation on established biomedical benchmarks, we demonstrate that OpenBioNER surpasses state-of-the-art baselines, including specialized 7B NER LLMs and GPT-4o, achieving up to 10% higher F1 scores while using only 110M parameters. Moreover, OpenBioNER outperforms existing small-scale models that match textual spans with entity types rather than descriptions, in terms of both accuracy and computational efficiency.

Year: 2025
Volume title: Findings of the Association for Computational Linguistics: NAACL 2025
First page: 1
Last page: 20
Citation: Cocchieri, A., Frisoni, G., Martinez Galindo, M., Moro, G., Tagliavini, G., Candoli, F. (2025). OpenBioNER: Lightweight Open-Domain Biomedical Named Entity Recognition Through Entity Type Description.
All authors: Cocchieri, Alessio; Frisoni, Giacomo; Martinez Galindo, Marcos; Moro, Gianluca; Tagliavini, Giuseppe; Candoli, Francesco
Appears in types: 4.01 Contribution in conference proceedings

Files in this item:

File	Size	Format
2025.findings-naacl.47.pdf (open access; type: publisher's version (PDF) / Version of Record; license: open-access license, Creative Commons Attribution (CC BY))	680.11 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1009721
