Publications

Designing NLP applications to support ICD coding: an impact analysis and guidelines to enhance baseline performance when processing patient discharge notes

Abstract

Financial costs are a major concern in the healthcare system, with medical billing and coding playing a key role in facilitating transactions and financing procedures. Billing involves filing claims with insurance companies and requires scrutiny of clinical summaries and electronic health records to correctly match diagnoses, prescriptions, and procedures to standardized codes. Accuracy in assigning International Classification of Diseases (ICD) codes is critical to proper reimbursement of care. Incorrect codes waste time and resources and cause administrative and financial problems for hospitals, insurance companies, and patients. Manual medical coding is labor-intensive and error-prone, which adds to that burden. To simplify the process, clinical records are often processed to automatically identify and extract clinical concepts and corresponding ICD codes. Deep learning and natural language processing techniques have shown promise in a variety of tasks, but applying them to medical coding has been challenging. Accurate coding requires a deep understanding of medical terminology, context, and guidelines that may be difficult to capture with traditional deep learning methods. Although deep learning shows promise in healthcare, its specific impact on ICD coding is not fully understood, and translating scalable deep learning methods into practical improvements in ICD coding remains a challenge. Evaluating deep learning models under real-world coding scenarios and comparing them to established practice is critical to determining their true effectiveness. In this work, we address the automation of ICD coding by highlighting pitfalls and contrasting different perspectives.
We investigated automatic ICD coding using baseline machine learning models, with a focus on identifying ICD-9 codes in discharge notes from the Medical Information Mart for Intensive Care (MIMIC) database. A thorough evaluation of different models and approaches is crucial to avoid over-reliance on any single method. Our findings show that simpler methods can achieve results comparable to those of deep learning models while requiring fewer computational resources.

URL

https://ojs.luminescience.cn/JDH/article/view/194

DOI

10.55976/jdh.22023119463-81

LaTeX

@article{JhaAlmagroTissot2023JDH, 
   title   = {Designing NLP applications to support ICD coding: an impact analysis and guidelines to enhance baseline performance when processing patient discharge notes}, 
   volume  = {2}, 
   url     = {https://ojs.luminescience.cn/JDH/article/view/194}, 
   DOI     = {10.55976/jdh.22023119463-81}, 
   number  = {1}, 
   journal = {Journal of Digital Health}, 
   author  = {Jha, Jessica and Almagro, Mario and Tissot, Hegler}, 
   year    = {2023}, 
   month   = {Oct.}, 
   pages   = {63--81}
}