Performance Analysis of Clinical Abbreviation Disambiguation Using Machine Learning Techniques

Muhammad Shahbaz, Sajida Parveen, Aziz Guergachi, and Karim Keshavjee

Keywords

Machine learning, data mining, health informatics, text mining

Abstract

Clinical notes are the primary means of communication in the healthcare domain and frequently contain ambiguous clinical abbreviations, which make information extraction and reasoning over text much more challenging. Clinical abbreviation disambiguation determines the appropriate expanded form of an ambiguous clinical abbreviation from its context. Such information is crucial for improving the precision of Clinical Natural Language Processing (CNLP) systems, which discover implicit, non-trivial clinical information from narrative clinical text in numerous settings. This paper explores the use of Naïve Bayes and Support Vector Machine (SVM) classifiers for clinical abbreviation disambiguation, assessing their generalizability and their stability under different document sizes, using data obtained from the University of Minnesota-affiliated (UMN) Fairview Health Services in the Twin Cities. Experimental results show that SVM achieves substantially better performance than Naïve Bayes and behaves robustly across different document sizes.
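
To make the task concrete, the following is a minimal sketch of context-based abbreviation disambiguation with a multinomial Naïve Bayes classifier, written in plain Python. It is an illustration only and not the authors' implementation: the whitespace tokenizer, the add-one (Laplace) smoothing, the function names, and the four toy sentences for the abbreviation "RA" are all assumptions made here, and none of the data comes from the UMN corpus. The SVM side of the comparison is omitted because it would require an external library.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization of the context.
    return text.lower().split()


def train_naive_bayes(examples):
    """Count senses and per-sense context words from (context, sense) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, sense in examples:
        sense_counts[sense] += 1
        for word in tokenize(text):
            word_counts[sense][word] += 1
            vocab.add(word)
    return sense_counts, word_counts, vocab


def predict(model, text):
    """Return the sense with the highest log-posterior for the given context."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, count in sense_counts.items():
        score = math.log(count / total)  # log prior of the sense
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                # Add-one smoothing keeps unseen (sense, word) pairs non-zero.
                score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical toy contexts for the ambiguous abbreviation "RA".
TOY_EXAMPLES = [
    ("patient with ra reports joint pain and morning stiffness", "rheumatoid arthritis"),
    ("ra treated with methotrexate for joint swelling", "rheumatoid arthritis"),
    ("echo shows dilated ra and elevated pressure", "right atrium"),
    ("catheter advanced into the ra during the procedure", "right atrium"),
]

model = train_naive_bayes(TOY_EXAMPLES)
print(predict(model, "ra with joint stiffness"))
print(predict(model, "catheter in the ra"))
```

In this sketch the surrounding words act as the only features, so "joint" and "stiffness" pull the first query toward one expansion while "catheter" pulls the second toward the other. A realistic system would train on many annotated notes per abbreviation and evaluate on held-out data.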
