Abbreviation

The Experts below are selected from a list of 1384152 Experts worldwide ranked by the ideXlab platform

Jun'ichi Tsujii - One of the best experts on this subject based on the ideXlab platform.

  • Building a high-quality sense inventory for improved Abbreviation disambiguation
    Bioinformatics, 2010
    Co-Authors: N Okazaki, Sophia Ananiadou, Jun'ichi Tsujii
    Abstract:

    Motivation: The ultimate goal of Abbreviation management is to disambiguate every occurrence of an Abbreviation into its expanded form (concept or sense). To collect expanded forms for Abbreviations, previous studies have recognized Abbreviations and their expanded forms in parenthetical expressions of bio-medical texts. However, expanded forms extracted by Abbreviation recognition are mixtures of concepts/senses and their term variations. Consequently, a list of expanded forms should be structured into a sense inventory, which provides possible concepts or senses for Abbreviation disambiguation. Results: A sense inventory is a key to robust management of Abbreviations. Therefore, we present a supervised approach for clustering expanded forms. The experimental result reports 0.915 F1 score in clustering expanded forms. We then investigate the possibility of conflicts of protein and gene names with Abbreviations. Finally, an experiment of Abbreviation disambiguation on the sense inventory yielded 0.984 accuracy and 0.986 F1 score using the dataset obtained from MEDLINE abstracts. Availability: The sense inventory and disambiguator of Abbreviations are accessible at http://www.nactem.ac.uk/software/acromine/ and http://www.nactem.ac.uk/software/acromine_disambiguation/ Contact: okazaki@chokkan.org

  • A discriminative alignment model for Abbreviation recognition
    International Conference on Computational Linguistics, 2008
    Co-Authors: Naoaki Okazaki, Sophia Ananiadou, Jun'ichi Tsujii
    Abstract:

    This paper presents a discriminative alignment model for extracting Abbreviations and their full forms appearing in actual text. The task of Abbreviation recognition is formalized as a sequential alignment problem, which finds the optimal alignment (origins of Abbreviation letters) between two strings (Abbreviation and full form). We design a large number of fine-grained features that directly express the events where letters produce or do not produce Abbreviations. We obtain the optimal combination of features on an aligned Abbreviation corpus by using the maximum entropy framework. The experimental results show the usefulness of the alignment model and corpus for improving Abbreviation recognition.

  • A discriminative approach to Japanese Abbreviation extraction
    International Joint Conference on Natural Language Processing, 2008
    Co-Authors: Naoaki Okazaki, Mitsuru Ishizuka, Jun'ichi Tsujii
    Abstract:

    This paper addresses the difficulties in recognizing Japanese Abbreviations through the use of previous approaches, examining actual usages of parenthetical expressions in newspaper articles. In order to bridge the gap between Japanese Abbreviations and their full forms, we present a discriminative approach to Abbreviation recognition. More specifically, we formalize the Abbreviation recognition task as a binary classification problem in which a classifier determines a positive (Abbreviation) or negative (non-Abbreviation) class, given a candidate Abbreviation definition. The proposed method achieved 95.7% accuracy, 90.0% precision, and 87.6% recall on the evaluation corpus containing 7,887 (1,430 Abbreviation and 6,457 non-Abbreviation) instances of parenthetical expressions.
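
The abstracts above describe Abbreviation recognition as finding definitions in parenthetical expressions and locating the origin of each Abbreviation letter in the full form. A minimal Python sketch of that idea follows. It is not the authors' discriminative model: it swaps in a simple greedy right-to-left letter match, and the function names, the regular expression, and the window of two words per letter are all assumptions made for illustration.

```python
import re

# Hypothetical pattern: a short alphanumeric token enclosed in parentheses.
PAREN = re.compile(r"\(([A-Za-z][A-Za-z0-9-]{1,9})\)")


def align_letters(abbr, full_form):
    """Greedy right-to-left alignment of abbreviation letters to the full form.

    Returns the index in full_form matched by each letter of abbr,
    or None when some letter cannot be matched in order.
    """
    positions = []
    j = len(full_form) - 1
    for ch in reversed(abbr.lower()):
        if not ch.isalnum():
            continue  # skip hyphens inside the abbreviation
        while j >= 0 and full_form[j].lower() != ch:
            j -= 1
        if j < 0:
            return None
        positions.append(j)
        j -= 1
    return positions[::-1]


def find_candidates(sentence, max_words_per_letter=2):
    """Return (abbreviation, full form, alignment) triples found in a sentence."""
    results = []
    for match in PAREN.finditer(sentence):
        abbr = match.group(1)
        words = sentence[:match.start()].split()
        window = words[-min(len(words), len(abbr) * max_words_per_letter):]
        # Try the shortest span of preceding words first, then grow leftwards.
        for start in range(len(window) - 1, -1, -1):
            candidate = " ".join(window[start:])
            positions = align_letters(abbr, candidate)
            # Require the first letter to start the candidate full form.
            if positions is not None and positions[0] == 0:
                results.append((abbr, candidate, positions))
                break
    return results


print(find_candidates("We measured heart rate (HR) in all patients."))
# → [('HR', 'heart rate', [0, 6])]
```

A learned model, as in the abstracts, would score many alternative alignments with features instead of accepting the first greedy match, which is why this sketch will miss or mis-align less regular definitions.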

Ergin Soysal - One of the best experts on this subject based on the ideXlab platform.

  • A long journey to short Abbreviations: developing an open-source framework for clinical Abbreviation recognition and disambiguation (CARD)
    Journal of the American Medical Informatics Association, 2017
    Co-Authors: Joshua C Denny, Randolph A Miller, Dario A Giuse, Trent S Rosenbloom, Lulu Wang, Carmelo Blanquicett, Ergin Soysal
    Abstract:

    Objective: The goal of this study was to develop a practical framework for recognizing and disambiguating clinical Abbreviations, thereby improving current clinical natural language processing (NLP) systems’ capability to handle Abbreviations in clinical narratives. Methods: We developed an open-source framework for clinical Abbreviation recognition and disambiguation (CARD) that leverages our previously developed methods, including: (1) machine learning based approaches to recognize Abbreviations from a clinical corpus, (2) clustering-based semiautomated methods to generate possible senses of Abbreviations, and (3) profile-based word sense disambiguation methods for clinical Abbreviations. We applied CARD to clinical corpora from Vanderbilt University Medical Center (VUMC) and generated 2 comprehensive sense inventories for Abbreviations in discharge summaries and clinic visit notes. Furthermore, we developed a wrapper that integrates CARD with MetaMap, a widely used general clinical NLP system. Results and Conclusion: CARD detected 27 317 and 107 303 distinct Abbreviations from discharge summaries and clinic visit notes, respectively. Two sense inventories were constructed for the 1000 most frequent Abbreviations in these 2 corpora. Using the sense inventories created from discharge summaries, CARD achieved an F1 score of 0.755 for identifying and disambiguating all Abbreviations in a corpus from the VUMC discharge summaries, which is superior to MetaMap and Apache’s clinical Text Analysis Knowledge Extraction System (cTAKES). Using additional external corpora, we also demonstrated that the MetaMap-CARD wrapper improved MetaMap’s performance in recognizing disorder entities in clinical notes. The CARD framework, 2 sense inventories, and the wrapper for MetaMap are publicly available at . We believe the CARD framework can be a valuable resource for improving Abbreviation identification in clinical NLP systems.

  • A long journey to short Abbreviations: developing an open-source framework for clinical Abbreviation recognition and disambiguation (CARD).
    Journal of the American Medical Informatics Association : JAMIA, 2016
    Co-Authors: Joshua C Denny, Randolph A Miller, Dario A Giuse, Lulu Wang, Carmelo Blanquicett, S. Trent Rosenbloom, Ergin Soysal
    Abstract:

    The goal of this study was to develop a practical framework for recognizing and disambiguating clinical Abbreviations, thereby improving current clinical natural language processing (NLP) systems' capability to handle Abbreviations in clinical narratives. We developed an open-source framework for clinical Abbreviation recognition and disambiguation (CARD) that leverages our previously developed methods, including: (1) machine learning based approaches to recognize Abbreviations from a clinical corpus, (2) clustering-based semiautomated methods to generate possible senses of Abbreviations, and (3) profile-based word sense disambiguation methods for clinical Abbreviations. We applied CARD to clinical corpora from Vanderbilt University Medical Center (VUMC) and generated 2 comprehensive sense inventories for Abbreviations in discharge summaries and clinic visit notes. Furthermore, we developed a wrapper that integrates CARD with MetaMap, a widely used general clinical NLP system. CARD detected 27 317 and 107 303 distinct Abbreviations from discharge summaries and clinic visit notes, respectively. Two sense inventories were constructed for the 1000 most frequent Abbreviations in these 2 corpora. Using the sense inventories created from discharge summaries, CARD achieved an F1 score of 0.755 for identifying and disambiguating all Abbreviations in a corpus from the VUMC discharge summaries, which is superior to MetaMap and Apache's clinical Text Analysis Knowledge Extraction System (cTAKES). Using additional external corpora, we also demonstrated that the MetaMap-CARD wrapper improved MetaMap's performance in recognizing disorder entities in clinical notes. The CARD framework, 2 sense inventories, and the wrapper for MetaMap are publicly available at https://sbmi.uth.edu/ccb/resources/Abbreviation.htm . We believe the CARD framework can be a valuable resource for improving Abbreviation identification in clinical NLP systems. © The Author 2016. 
Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com

Joshua C Denny - One of the best experts on this subject based on the ideXlab platform.

  • A long journey to short Abbreviations: developing an open-source framework for clinical Abbreviation recognition and disambiguation (CARD)
    Journal of the American Medical Informatics Association, 2017
    Co-Authors: Joshua C Denny, Randolph A Miller, Dario A Giuse, Trent S Rosenbloom, Lulu Wang, Carmelo Blanquicett, Ergin Soysal
    Abstract:

    Objective: The goal of this study was to develop a practical framework for recognizing and disambiguating clinical Abbreviations, thereby improving current clinical natural language processing (NLP) systems’ capability to handle Abbreviations in clinical narratives. Methods: We developed an open-source framework for clinical Abbreviation recognition and disambiguation (CARD) that leverages our previously developed methods, including: (1) machine learning based approaches to recognize Abbreviations from a clinical corpus, (2) clustering-based semiautomated methods to generate possible senses of Abbreviations, and (3) profile-based word sense disambiguation methods for clinical Abbreviations. We applied CARD to clinical corpora from Vanderbilt University Medical Center (VUMC) and generated 2 comprehensive sense inventories for Abbreviations in discharge summaries and clinic visit notes. Furthermore, we developed a wrapper that integrates CARD with MetaMap, a widely used general clinical NLP system. Results and Conclusion: CARD detected 27 317 and 107 303 distinct Abbreviations from discharge summaries and clinic visit notes, respectively. Two sense inventories were constructed for the 1000 most frequent Abbreviations in these 2 corpora. Using the sense inventories created from discharge summaries, CARD achieved an F1 score of 0.755 for identifying and disambiguating all Abbreviations in a corpus from the VUMC discharge summaries, which is superior to MetaMap and Apache’s clinical Text Analysis Knowledge Extraction System (cTAKES). Using additional external corpora, we also demonstrated that the MetaMap-CARD wrapper improved MetaMap’s performance in recognizing disorder entities in clinical notes. The CARD framework, 2 sense inventories, and the wrapper for MetaMap are publicly available at . We believe the CARD framework can be a valuable resource for improving Abbreviation identification in clinical NLP systems.

  • A long journey to short Abbreviations: developing an open-source framework for clinical Abbreviation recognition and disambiguation (CARD).
    Journal of the American Medical Informatics Association : JAMIA, 2016
    Co-Authors: Joshua C Denny, Randolph A Miller, Dario A Giuse, Lulu Wang, Carmelo Blanquicett, S. Trent Rosenbloom, Ergin Soysal
    Abstract:

    The goal of this study was to develop a practical framework for recognizing and disambiguating clinical Abbreviations, thereby improving current clinical natural language processing (NLP) systems' capability to handle Abbreviations in clinical narratives. We developed an open-source framework for clinical Abbreviation recognition and disambiguation (CARD) that leverages our previously developed methods, including: (1) machine learning based approaches to recognize Abbreviations from a clinical corpus, (2) clustering-based semiautomated methods to generate possible senses of Abbreviations, and (3) profile-based word sense disambiguation methods for clinical Abbreviations. We applied CARD to clinical corpora from Vanderbilt University Medical Center (VUMC) and generated 2 comprehensive sense inventories for Abbreviations in discharge summaries and clinic visit notes. Furthermore, we developed a wrapper that integrates CARD with MetaMap, a widely used general clinical NLP system. CARD detected 27 317 and 107 303 distinct Abbreviations from discharge summaries and clinic visit notes, respectively. Two sense inventories were constructed for the 1000 most frequent Abbreviations in these 2 corpora. Using the sense inventories created from discharge summaries, CARD achieved an F1 score of 0.755 for identifying and disambiguating all Abbreviations in a corpus from the VUMC discharge summaries, which is superior to MetaMap and Apache's clinical Text Analysis Knowledge Extraction System (cTAKES). Using additional external corpora, we also demonstrated that the MetaMap-CARD wrapper improved MetaMap's performance in recognizing disorder entities in clinical notes. The CARD framework, 2 sense inventories, and the wrapper for MetaMap are publicly available at https://sbmi.uth.edu/ccb/resources/Abbreviation.htm . We believe the CARD framework can be a valuable resource for improving Abbreviation identification in clinical NLP systems. © The Author 2016. 

  • A preliminary study of clinical Abbreviation disambiguation in real time
    Applied Clinical Informatics, 2015
    Co-Authors: Joshua C Denny, S T Rosenbloom, Randolph A Miller, Dario A Giuse, Min Song
    Abstract:

    Objective: To save time, healthcare providers frequently use Abbreviations while authoring clinical documents. Nevertheless, Abbreviations that authors deem unambiguous often confuse other readers, including clinicians, patients, and natural language processing (NLP) systems. Most current clinical NLP systems “post-process” notes long after clinicians enter them into electronic health record systems (EHRs). Such post-processing cannot guarantee 100% accuracy in Abbreviation identification and disambiguation, since multiple alternative interpretations exist. Methods: The authors describe a prototype system for real-time Clinical Abbreviation Recognition and Disambiguation (rCARD) – i.e., a system that interacts with authors during note generation to verify correct Abbreviation senses. The rCARD system design anticipates future integration with web-based clinical documentation systems to improve quality of healthcare records. When clinicians enter documents, rCARD will automatically recognize each Abbreviation. For Abbreviations with multiple possible senses, rCARD will show a ranked list of possible meanings with the best predicted sense at the top. The prototype application embodies three word sense disambiguation (WSD) methods to predict the correct senses of Abbreviations. We then conducted three experiments to evaluate rCARD, including 1) a performance evaluation of different WSD methods; 2) a time evaluation of real-time WSD methods; and 3) a user study of typing clinical sentences with Abbreviations using rCARD. Results: Using 4,721 sentences containing 25 commonly observed, highly ambiguous clinical Abbreviations, our evaluation showed that the best profile-based method implemented in rCARD achieved a reasonable WSD accuracy of 88.8% (comparable to SVM – 89.5%) and that the time costs of the different WSD methods were also acceptable (ranging from 0.630 to 1.649 milliseconds within the same network). The preliminary user study also showed that the extra time cost of rCARD was about 5% of total document entry time and users did not feel a significant delay when using rCARD for clinical document entry. Conclusion: The study indicates that it is feasible to integrate a real-time, NLP-enabled Abbreviation recognition and disambiguation module with clinical documentation systems.
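
The ranked list of senses described in the rCARD abstract can be pictured with a small profile-based word sense disambiguation sketch. This is not the rCARD implementation: it assumes that a sense profile is a bag-of-words count over example contexts and that senses are ranked by cosine similarity. The function names and the toy sentences are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_profiles(labeled_contexts):
    """Aggregate word counts per sense from (sense, context) pairs."""
    profiles = {}
    for sense, context in labeled_contexts:
        profiles.setdefault(sense, Counter()).update(tokenize(context))
    return profiles


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_senses(context, profiles):
    """Return (sense, score) pairs, best predicted sense first."""
    query = Counter(tokenize(context))
    scored = [(sense, cosine(query, profile)) for sense, profile in profiles.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Toy training contexts for the ambiguous abbreviation "pt" (invented data).
profiles = build_profiles([
    ("patient", "pt admitted with chest pain"),
    ("physical therapy", "pt consult for gait training and exercise"),
])
for sense, score in rank_senses("pt complains of chest pain", profiles):
    print(f"{sense}: {score:.3f}")
```

Because the profiles are built once and each lookup only compares a short context against a handful of sense vectors, this kind of method is cheap at query time, which is the property a real-time setting depends on.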

Randolph A Miller - One of the best experts on this subject based on the ideXlab platform.

  • A long journey to short Abbreviations: developing an open-source framework for clinical Abbreviation recognition and disambiguation (CARD)
    Journal of the American Medical Informatics Association, 2017
    Co-Authors: Joshua C Denny, Randolph A Miller, Dario A Giuse, Trent S Rosenbloom, Lulu Wang, Carmelo Blanquicett, Ergin Soysal
    Abstract:

    Objective: The goal of this study was to develop a practical framework for recognizing and disambiguating clinical Abbreviations, thereby improving current clinical natural language processing (NLP) systems’ capability to handle Abbreviations in clinical narratives. Methods: We developed an open-source framework for clinical Abbreviation recognition and disambiguation (CARD) that leverages our previously developed methods, including: (1) machine learning based approaches to recognize Abbreviations from a clinical corpus, (2) clustering-based semiautomated methods to generate possible senses of Abbreviations, and (3) profile-based word sense disambiguation methods for clinical Abbreviations. We applied CARD to clinical corpora from Vanderbilt University Medical Center (VUMC) and generated 2 comprehensive sense inventories for Abbreviations in discharge summaries and clinic visit notes. Furthermore, we developed a wrapper that integrates CARD with MetaMap, a widely used general clinical NLP system. Results and Conclusion: CARD detected 27 317 and 107 303 distinct Abbreviations from discharge summaries and clinic visit notes, respectively. Two sense inventories were constructed for the 1000 most frequent Abbreviations in these 2 corpora. Using the sense inventories created from discharge summaries, CARD achieved an F1 score of 0.755 for identifying and disambiguating all Abbreviations in a corpus from the VUMC discharge summaries, which is superior to MetaMap and Apache’s clinical Text Analysis Knowledge Extraction System (cTAKES). Using additional external corpora, we also demonstrated that the MetaMap-CARD wrapper improved MetaMap’s performance in recognizing disorder entities in clinical notes. The CARD framework, 2 sense inventories, and the wrapper for MetaMap are publicly available at . We believe the CARD framework can be a valuable resource for improving Abbreviation identification in clinical NLP systems.

  • A long journey to short Abbreviations: developing an open-source framework for clinical Abbreviation recognition and disambiguation (CARD).
    Journal of the American Medical Informatics Association : JAMIA, 2016
    Co-Authors: Joshua C Denny, Randolph A Miller, Dario A Giuse, Lulu Wang, Carmelo Blanquicett, S. Trent Rosenbloom, Ergin Soysal
    Abstract:

    The goal of this study was to develop a practical framework for recognizing and disambiguating clinical Abbreviations, thereby improving current clinical natural language processing (NLP) systems' capability to handle Abbreviations in clinical narratives. We developed an open-source framework for clinical Abbreviation recognition and disambiguation (CARD) that leverages our previously developed methods, including: (1) machine learning based approaches to recognize Abbreviations from a clinical corpus, (2) clustering-based semiautomated methods to generate possible senses of Abbreviations, and (3) profile-based word sense disambiguation methods for clinical Abbreviations. We applied CARD to clinical corpora from Vanderbilt University Medical Center (VUMC) and generated 2 comprehensive sense inventories for Abbreviations in discharge summaries and clinic visit notes. Furthermore, we developed a wrapper that integrates CARD with MetaMap, a widely used general clinical NLP system. CARD detected 27 317 and 107 303 distinct Abbreviations from discharge summaries and clinic visit notes, respectively. Two sense inventories were constructed for the 1000 most frequent Abbreviations in these 2 corpora. Using the sense inventories created from discharge summaries, CARD achieved an F1 score of 0.755 for identifying and disambiguating all Abbreviations in a corpus from the VUMC discharge summaries, which is superior to MetaMap and Apache's clinical Text Analysis Knowledge Extraction System (cTAKES). Using additional external corpora, we also demonstrated that the MetaMap-CARD wrapper improved MetaMap's performance in recognizing disorder entities in clinical notes. The CARD framework, 2 sense inventories, and the wrapper for MetaMap are publicly available at https://sbmi.uth.edu/ccb/resources/Abbreviation.htm . We believe the CARD framework can be a valuable resource for improving Abbreviation identification in clinical NLP systems. © The Author 2016. 

  • A preliminary study of clinical Abbreviation disambiguation in real time
    Applied Clinical Informatics, 2015
    Co-Authors: Joshua C Denny, S T Rosenbloom, Randolph A Miller, Dario A Giuse, Min Song
    Abstract:

    Objective: To save time, healthcare providers frequently use Abbreviations while authoring clinical documents. Nevertheless, Abbreviations that authors deem unambiguous often confuse other readers, including clinicians, patients, and natural language processing (NLP) systems. Most current clinical NLP systems “post-process” notes long after clinicians enter them into electronic health record systems (EHRs). Such post-processing cannot guarantee 100% accuracy in Abbreviation identification and disambiguation, since multiple alternative interpretations exist. Methods: The authors describe a prototype system for real-time Clinical Abbreviation Recognition and Disambiguation (rCARD) – i.e., a system that interacts with authors during note generation to verify correct Abbreviation senses. The rCARD system design anticipates future integration with web-based clinical documentation systems to improve quality of healthcare records. When clinicians enter documents, rCARD will automatically recognize each Abbreviation. For Abbreviations with multiple possible senses, rCARD will show a ranked list of possible meanings with the best predicted sense at the top. The prototype application embodies three word sense disambiguation (WSD) methods to predict the correct senses of Abbreviations. We then conducted three experiments to evaluate rCARD, including 1) a performance evaluation of different WSD methods; 2) a time evaluation of real-time WSD methods; and 3) a user study of typing clinical sentences with Abbreviations using rCARD. Results: Using 4,721 sentences containing 25 commonly observed, highly ambiguous clinical Abbreviations, our evaluation showed that the best profile-based method implemented in rCARD achieved a reasonable WSD accuracy of 88.8% (comparable to SVM – 89.5%) and that the time costs of the different WSD methods were also acceptable (ranging from 0.630 to 1.649 milliseconds within the same network). The preliminary user study also showed that the extra time cost of rCARD was about 5% of total document entry time and users did not feel a significant delay when using rCARD for clinical document entry. Conclusion: The study indicates that it is feasible to integrate a real-time, NLP-enabled Abbreviation recognition and disambiguation module with clinical documentation systems.

Dario A Giuse - One of the best experts on this subject based on the ideXlab platform.

  • A long journey to short Abbreviations: developing an open-source framework for clinical Abbreviation recognition and disambiguation (CARD)
    Journal of the American Medical Informatics Association, 2017
    Co-Authors: Joshua C Denny, Randolph A Miller, Dario A Giuse, Trent S Rosenbloom, Lulu Wang, Carmelo Blanquicett, Ergin Soysal
    Abstract:

    Objective: The goal of this study was to develop a practical framework for recognizing and disambiguating clinical Abbreviations, thereby improving current clinical natural language processing (NLP) systems’ capability to handle Abbreviations in clinical narratives. Methods: We developed an open-source framework for clinical Abbreviation recognition and disambiguation (CARD) that leverages our previously developed methods, including: (1) machine learning based approaches to recognize Abbreviations from a clinical corpus, (2) clustering-based semiautomated methods to generate possible senses of Abbreviations, and (3) profile-based word sense disambiguation methods for clinical Abbreviations. We applied CARD to clinical corpora from Vanderbilt University Medical Center (VUMC) and generated 2 comprehensive sense inventories for Abbreviations in discharge summaries and clinic visit notes. Furthermore, we developed a wrapper that integrates CARD with MetaMap, a widely used general clinical NLP system. Results and Conclusion: CARD detected 27 317 and 107 303 distinct Abbreviations from discharge summaries and clinic visit notes, respectively. Two sense inventories were constructed for the 1000 most frequent Abbreviations in these 2 corpora. Using the sense inventories created from discharge summaries, CARD achieved an F1 score of 0.755 for identifying and disambiguating all Abbreviations in a corpus from the VUMC discharge summaries, which is superior to MetaMap and Apache’s clinical Text Analysis Knowledge Extraction System (cTAKES). Using additional external corpora, we also demonstrated that the MetaMap-CARD wrapper improved MetaMap’s performance in recognizing disorder entities in clinical notes. The CARD framework, 2 sense inventories, and the wrapper for MetaMap are publicly available at . We believe the CARD framework can be a valuable resource for improving Abbreviation identification in clinical NLP systems.

  • A long journey to short Abbreviations: developing an open-source framework for clinical Abbreviation recognition and disambiguation (CARD).
    Journal of the American Medical Informatics Association : JAMIA, 2016
    Co-Authors: Joshua C Denny, Randolph A Miller, Dario A Giuse, Lulu Wang, Carmelo Blanquicett, S. Trent Rosenbloom, Ergin Soysal
    Abstract:

    The goal of this study was to develop a practical framework for recognizing and disambiguating clinical Abbreviations, thereby improving current clinical natural language processing (NLP) systems' capability to handle Abbreviations in clinical narratives. We developed an open-source framework for clinical Abbreviation recognition and disambiguation (CARD) that leverages our previously developed methods, including: (1) machine learning based approaches to recognize Abbreviations from a clinical corpus, (2) clustering-based semiautomated methods to generate possible senses of Abbreviations, and (3) profile-based word sense disambiguation methods for clinical Abbreviations. We applied CARD to clinical corpora from Vanderbilt University Medical Center (VUMC) and generated 2 comprehensive sense inventories for Abbreviations in discharge summaries and clinic visit notes. Furthermore, we developed a wrapper that integrates CARD with MetaMap, a widely used general clinical NLP system. CARD detected 27 317 and 107 303 distinct Abbreviations from discharge summaries and clinic visit notes, respectively. Two sense inventories were constructed for the 1000 most frequent Abbreviations in these 2 corpora. Using the sense inventories created from discharge summaries, CARD achieved an F1 score of 0.755 for identifying and disambiguating all Abbreviations in a corpus from the VUMC discharge summaries, which is superior to MetaMap and Apache's clinical Text Analysis Knowledge Extraction System (cTAKES). Using additional external corpora, we also demonstrated that the MetaMap-CARD wrapper improved MetaMap's performance in recognizing disorder entities in clinical notes. The CARD framework, 2 sense inventories, and the wrapper for MetaMap are publicly available at https://sbmi.uth.edu/ccb/resources/Abbreviation.htm . We believe the CARD framework can be a valuable resource for improving Abbreviation identification in clinical NLP systems. © The Author 2016. 

  • A preliminary study of clinical Abbreviation disambiguation in real time
    Applied Clinical Informatics, 2015
    Co-Authors: Joshua C Denny, S T Rosenbloom, Randolph A Miller, Dario A Giuse, Min Song
    Abstract:

    Objective: To save time, healthcare providers frequently use Abbreviations while authoring clinical documents. Nevertheless, Abbreviations that authors deem unambiguous often confuse other readers, including clinicians, patients, and natural language processing (NLP) systems. Most current clinical NLP systems “post-process” notes long after clinicians enter them into electronic health record systems (EHRs). Such post-processing cannot guarantee 100% accuracy in Abbreviation identification and disambiguation, since multiple alternative interpretations exist. Methods: The authors describe a prototype system for real-time Clinical Abbreviation Recognition and Disambiguation (rCARD) – i.e., a system that interacts with authors during note generation to verify correct Abbreviation senses. The rCARD system design anticipates future integration with web-based clinical documentation systems to improve quality of healthcare records. When clinicians enter documents, rCARD will automatically recognize each Abbreviation. For Abbreviations with multiple possible senses, rCARD will show a ranked list of possible meanings with the best predicted sense at the top. The prototype application embodies three word sense disambiguation (WSD) methods to predict the correct senses of Abbreviations. We then conducted three experiments to evaluate rCARD, including 1) a performance evaluation of different WSD methods; 2) a time evaluation of real-time WSD methods; and 3) a user study of typing clinical sentences with Abbreviations using rCARD. Results: Using 4,721 sentences containing 25 commonly observed, highly ambiguous clinical Abbreviations, our evaluation showed that the best profile-based method implemented in rCARD achieved a reasonable WSD accuracy of 88.8% (comparable to SVM – 89.5%) and that the time costs of the different WSD methods were also acceptable (ranging from 0.630 to 1.649 milliseconds within the same network). The preliminary user study also showed that the extra time cost of rCARD was about 5% of total document entry time and users did not feel a significant delay when using rCARD for clinical document entry. Conclusion: The study indicates that it is feasible to integrate a real-time, NLP-enabled Abbreviation recognition and disambiguation module with clinical documentation systems.