Developing an ICD-10 Coding Assistant: Pilot Study Using RoBERTa and GPT-4 for Term Extraction and Description-Based Code Selection

Puts, Sander; Zegers, Catharina M. L.; Dekker, Andre; BERMEJO DELGADO, Inigo

Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/45888

Full metadata record

DC Field	Value	Language
dc.contributor.author	Puts, Sander	-
dc.contributor.author	Zegers, Catharina M. L.	-
dc.contributor.author	Dekker, Andre	-
dc.contributor.author	BERMEJO DELGADO, Inigo	-
dc.date.accessioned	2025-04-22T10:05:03Z	-
dc.date.available	2025-04-22T10:05:03Z	-
dc.date.issued	2025	-
dc.date.submitted	2025-04-18T09:57:38Z	-
dc.identifier.citation	JMIR Formative Research, 9 (Art N° e60095)	-
dc.identifier.uri	http://hdl.handle.net/1942/45888	-
dc.description.abstract	Background: The International Classification of Diseases (ICD), developed by the World Health Organization, standardizes health condition coding to support health care policy, research, and billing, but artificial intelligence automation, while promising, still underperforms compared with human accuracy and lacks the explainability needed for adoption in medical settings. Objective: The potential of large language models for assisting medical coders in the ICD-10 coding was explored through the development of a computer-assisted coding system. This study aimed to augment human coding by initially identifying lead terms and using retrieval-augmented generation (RAG)-based methods for computer-assisted coding enhancement. Methods: The explainability dataset from the CodiEsp challenge (CodiEsp-X) was used, featuring 1000 Spanish clinical cases annotated with ICD-10 codes. A new dataset, CodiEsp-X-lead, was generated using GPT-4 to replace full-textual evidence annotations with lead term annotations. A Robustly Optimized BERT (Bidirectional Encoder Representations from Transformers) Pretraining Approach transformer model was fine-tuned for named entity recognition to extract lead terms. GPT-4 was subsequently employed to generate code descriptions from the extracted textual evidence. Using a RAG approach, ICD codes were assigned to the lead terms by querying a vector database of ICD code descriptions with OpenAI's text-embedding-ada-002 model. Results: The fine-tuned Robustly Optimized BERT Pretraining Approach achieved an overall F1-score of 0.80 for ICD lead term extraction on the new CodiEsp-X-lead dataset. GPT-4-generated code descriptions reduced retrieval failures in the RAG approach by approximately 5% for both diagnoses and procedures. However, the overall explainability F1-score for the CodiEsp-X task was limited to 0.305, significantly lower than the state-of-the-art F1-score of 0.633. The diminished performance was partly due to the reliance on code descriptions, as some ICD codes lacked descriptions, and the approach did not fully align with the medical coder's workflow. Conclusions: While lead term extraction showed promising results, the subsequent RAG-based code assignment using GPT-4 and code descriptions was less effective. Future research should focus on refining the approach to more closely mimic the medical coder's workflow, potentially integrating the alphabetic index and official coding guidelines, rather than relying solely on code descriptions. This alignment may enhance system accuracy and better support medical coders in practice.	-
dc.description.sponsorship	The authors acknowledge the use of ChatGPT (GPT-4 and GPT-4o, OpenAI, 2023/2024) to assist in improving the language and readability of this manuscript. ChatGPT was specifically employed to rephrase sentences for clarity and enhance the overall articulation of the text. The authors carefully reviewed and refined the rephrased content to ensure that the final text accurately reflected their intended meaning and original ideas.	-
dc.language.iso	en	-
dc.publisher	JMIR PUBLICATIONS, INC	-
dc.rights	Sander Puts, Catharina M L Zegers, Andre Dekker, Iñigo Bermejo. Originally published in JMIR Formative Research (https://formative.jmir.org), 11.02.2025. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included.	-
dc.subject.other	International Classification of Diseases	-
dc.subject.other	ICD-10	-
dc.subject.other	computer-assisted-coding	-
dc.subject.other	GPT-4	-
dc.subject.other	coding	-
dc.subject.other	term extraction	-
dc.subject.other	code analysis	-
dc.subject.other	computer assisted coding	-
dc.subject.other	transformer model	-
dc.subject.other	artificial intelligence	-
dc.subject.other	AI automation	-
dc.subject.other	retrieval-augmented generation	-
dc.subject.other	RAG	-
dc.subject.other	large language model	-
dc.subject.other	LLM	-
dc.subject.other	Bidirectional Encoder Representations from	-
dc.title	Developing an ICD-10 Coding Assistant: Pilot Study Using RoBERTa and GPT-4 for Term Extraction and Description-Based Code Selection	-
dc.type	Journal Contribution	-
dc.identifier.volume	9	-
local.format.pages	12	-
local.bibliographicCitation.jcat	A1	-
dc.description.notes	Puts, S (corresponding author), Maastricht Univ, GROW Res Inst Oncol & Reprod, Med Ctr, Dept Radiat Oncol Maastro, POB 616, NL-6200 MD Maastricht, Netherlands.	-
dc.description.notes	putssander@gmail.com	-
local.publisher.place	130 QUEENS QUAY East, Unit 1100, TORONTO, ON M5A 0P6, CANADA	-
local.type.refereed	Refereed	-
local.type.specified	Article	-
local.bibliographicCitation.artnr	e60095	-
dc.identifier.doi	10.2196/60095	-
dc.identifier.pmid	39935026	-
dc.identifier.isi	001454741600019	-
dc.contributor.orcid	Dekker, Andre/0000-0002-0422-7996	-
local.provider.type	wosris	-
local.description.affiliation	[Puts, Sander; Zegers, Catharina M. L.; Dekker, Andre; Bermejo, Inigo] Maastricht Univ, GROW Res Inst Oncol & Reprod, Med Ctr, Dept Radiat Oncol Maastro, POB 616, NL-6200 MD Maastricht, Netherlands.	-
local.description.affiliation	[Bermejo, Inigo] Hasselt Univ, Data Sci Inst DSI, Hasselt, Belgium.	-
local.uhasselt.international	yes	-
item.fullcitation	Puts, Sander; Zegers, Catharina M. L.; Dekker, Andre & BERMEJO DELGADO, Inigo (2025) Developing an ICD-10 Coding Assistant: Pilot Study Using RoBERTa and GPT-4 for Term Extraction and Description-Based Code Selection. In: JMIR Formative Research, 9 (Art N° e60095).	-
item.contributor	Puts, Sander	-
item.contributor	Zegers, Catharina M. L.	-
item.contributor	Dekker, Andre	-
item.contributor	BERMEJO DELGADO, Inigo	-
item.fulltext	With Fulltext	-
item.accessRights	Open Access	-
crisitem.journal.eissn	2561-326X	-
Appears in Collections:	Research publications

Files in This Item:

File	Description	Size	Format
xx.pdf	Published version	782.16 kB	Adobe PDF	View/Open

SCOPUS citations: 2 (checked on Oct 26, 2025)

Web of Science citations: 1 (checked on Oct 25, 2025)