Affiliations 

  • 1 Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Selangor, Malaysia; Department of Computer Science, Sokoto State University, Sokoto, Nigeria. Electronic address: gs63125@student.upm.edu.my
  • 2 Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Selangor, Malaysia. Electronic address: k_azhar@upm.edu.my
  • 3 Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Selangor, Malaysia. Electronic address: nurfadhlina@upm.edu.my
  • 4 Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Selangor, Malaysia. Electronic address: m_yunus@upm.edu.my
J Biomed Inform, 2024 Mar;151:104603.
PMID: 38331081 DOI: 10.1016/j.jbi.2024.104603

Abstract

BACKGROUND: An adverse drug event (ADE) is any unfavorable effect that occurs due to the use of a drug. Extracting ADEs from unstructured clinical notes is essential to biomedical text extraction research because it helps with pharmacovigilance and patient medication studies.

OBJECTIVE: From the considerable amount of clinical narrative text, natural language processing (NLP) researchers have developed methods for extracting ADEs and their related attributes. This work presents a systematic review of current methods.

METHODOLOGY: Two biomedical databases have been searched from June 2022 until December 2023 for relevant publications regarding this review, namely the databases PubMed and Medline. Similarly, we searched the multi-disciplinary databases IEEE Xplore, Scopus, ScienceDirect, and the ACL Anthology. We adopted the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement guidelines and recommendations for reporting systematic reviews in conducting this review. Initially, we obtained 5,537 articles from the search results from the various databases between 2015 and 2023. Based on predefined inclusion and exclusion criteria for article selection, 100 publications have undergone full-text review, of which we consider 82 for our analysis.

RESULTS: We determined the general pattern for extracting ADEs from clinical notes, with named entity recognition (NER) and relation extraction (RE) being the dual tasks considered. Researchers that tackled both NER and RE simultaneously have approached ADE extraction as a "pipeline extraction" problem (n = 22), as a "joint task extraction" problem (n = 7), and as a "multi-task learning" problem (n = 6), while others have tackled only NER (n = 27) or RE (n = 20). We further grouped the reviews based on the approaches for data extraction, namely rule-based (n = 8), machine learning (n = 11), deep learning (n = 32), comparison of two or more approaches (n = 11), hybrid (n = 12) and large language models (n = 8). The most used datasets are MADE 1.0, TAC 2017 and n2c2 2018.

CONCLUSION: Extracting ADEs is crucial, especially for pharmacovigilance studies and patient medications. This survey showcases advances in ADE extraction research, approaches, datasets, and state-of-the-art performance in them. Challenges and future research directions are highlighted. We hope this review will guide researchers in gaining background knowledge and developing more innovative ways to address the challenges.

* Title and MeSH Headings from MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine.