OBJECTIVE: Natural language processing (NLP) researchers have developed methods for extracting adverse drug events (ADEs) and their related attributes from the large volume of clinical narrative text. This work presents a systematic review of current methods.
METHODOLOGY: Between June 2022 and December 2023, we searched two biomedical databases, PubMed and Medline, and the multi-disciplinary databases IEEE Xplore, Scopus, ScienceDirect, and the ACL Anthology for relevant publications. We conducted the review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement guidelines and recommendations for reporting systematic reviews. The searches initially returned 5,537 articles published between 2015 and 2023. After applying predefined inclusion and exclusion criteria, 100 publications underwent full-text review, of which we included 82 in our analysis.
RESULTS: We determined the general pattern for extracting ADEs from clinical notes, with named entity recognition (NER) and relation extraction (RE) being the two tasks considered. Researchers who tackled both NER and RE have approached ADE extraction as a "pipeline extraction" problem (n = 22), as a "joint task extraction" problem (n = 7), or as a "multi-task learning" problem (n = 6), while others have tackled only NER (n = 27) or only RE (n = 20). We further grouped the reviewed studies by extraction approach, namely rule-based (n = 8), machine learning (n = 11), deep learning (n = 32), comparison of two or more approaches (n = 11), hybrid (n = 12), and large language models (n = 8). The most commonly used datasets are MADE 1.0, TAC 2017, and n2c2 2018.
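To illustrate what "pipeline extraction" means in practice, the following is a minimal Python sketch in which NER runs first and RE then operates on its output. It is a toy under stated assumptions, not a method from any reviewed study: the lexicons `DRUGS` and `ADES`, the function names, and the sentence-level co-occurrence rule with a fixed "causes" label are all hypothetical stand-ins for the trained models that real systems use.

```python
import re

# Hypothetical toy lexicons (assumption); real systems learn entities from annotated corpora.
DRUGS = {"warfarin", "lisinopril"}
ADES = {"bleeding", "cough"}


def recognize_entities(sentence):
    """Step 1 (NER): label each word found in one of the lexicons."""
    entities = []
    for match in re.finditer(r"[A-Za-z]+", sentence):
        token = match.group().lower()
        if token in DRUGS:
            entities.append((token, "DRUG"))
        elif token in ADES:
            entities.append((token, "ADE"))
    return entities


def extract_relations(entities):
    """Step 2 (RE): naively link every drug to every ADE in the same sentence."""
    drugs = [text for text, label in entities if label == "DRUG"]
    ades = [text for text, label in entities if label == "ADE"]
    return [(drug, "causes", ade) for drug in drugs for ade in ades]


def pipeline(note):
    """Run NER, then RE, sentence by sentence; RE only sees what NER found."""
    relations = []
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        relations.extend(extract_relations(recognize_entities(sentence)))
    return relations


note = "Patient developed bleeding after starting warfarin. Lisinopril was continued."
print(pipeline(note))  # [('warfarin', 'causes', 'bleeding')]
```

Because the second step depends entirely on the first, any entity missed by NER cannot be recovered by RE. Joint and multi-task formulations, by contrast, handle the two tasks together rather than in sequence.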
CONCLUSION: Extracting ADEs is crucial, especially for pharmacovigilance studies and patient medications. This survey showcases advances in ADE extraction research, including approaches, datasets, and the state-of-the-art performance achieved on them. It also highlights challenges and future research directions. We hope this review will guide researchers in gaining background knowledge and developing more innovative ways to address the challenges.
METHODS: In this systematic review and individual participant data meta-analysis, we updated the search from a previous systematic review, which covered PubMed (MEDLINE), Embase, the Cochrane Library, and conference abstracts for publications from Jan 1, 2011, to March 12, 2018, to include the period up to Aug 2, 2019. We screened the reference lists of identified publications and contacted experts in the field. We included prospective cross-sectional, observational studies and randomised trials among adult and adolescent (age ≥10 years) ambulatory people living with HIV, irrespective of signs and symptoms of tuberculosis. We extracted study-level data using a standardised data extraction form, and we requested individual participant data from study authors. We aimed to compare the WHO-recommended four-symptom screen (W4SS) with alternative screening tests and strategies, and the WHO-recommended algorithm (ie, W4SS followed by Xpert) with Xpert for all, in terms of diagnostic accuracy (sensitivity and specificity), overall and in key subgroups (eg, by antiretroviral therapy [ART] status). The reference standard was culture. This study is registered with PROSPERO, CRD42020155895.
FINDINGS: We identified 25 studies, and obtained data from 22 studies (including 15 666 participants; 4347 [27·7%] of 15 663 participants with data were on ART). W4SS sensitivity was 82% (95% CI 72-89) and specificity was 42% (29-57). C-reactive protein (≥10 mg/L) had similar sensitivity to (77% [61-88]), but higher specificity (74% [61-83]; n=3571) than, W4SS. Cough (lasting ≥2 weeks), haemoglobin (<10 g/dL), body-mass index (<18·5 kg/m²), and lymphadenopathy had high specificities (80-90%) but low sensitivities (29-43%). The WHO-recommended algorithm had a sensitivity of 58% (50-66) and a specificity of 99% (98-100); Xpert for all had a sensitivity of 68% (57-76) and a specificity of 99% (98-99). In the one study that assessed both, the sensitivity of sputum Xpert Ultra was higher than sputum Xpert (73% [62-81] vs 57% [47-67]) and specificities were similar (98% [96-98] vs 99% [98-100]). Among outpatients on ART (4309 [99·1%] of 4347 people on ART), W4SS sensitivity was 53% (35-71) and specificity was 71% (51-85). In this population, a parallel strategy (two tests done at the same time) of W4SS with any chest x-ray abnormality had higher sensitivity (89% [70-97]) and lower specificity (33% [17-54]; n=2670) than W4SS alone; at a tuberculosis prevalence of 5%, this strategy would require 379 more rapid diagnostic tests per 1000 people living with HIV than W4SS but detect 18 more tuberculosis cases. Among outpatients not on ART (11 160 [71·8%] of 15 541 outpatients), W4SS sensitivity was 85% (76-91) and specificity was 37% (25-51).
Among these outpatients not on ART, C-reactive protein (≥10 mg/L) alone had a similar sensitivity to (83% [79-86]), but higher specificity (67% [60-73]; n=3187) than, W4SS, and a sequential strategy (both test positive) of W4SS then C-reactive protein (≥5 mg/L) had a similar sensitivity to (84% [75-90]), but higher specificity than (64% [57-71]; n=3187), W4SS alone; at 10% tuberculosis prevalence, these strategies would require 272 and 244 fewer rapid diagnostic tests per 1000 people living with HIV than W4SS but miss two and one more tuberculosis cases, respectively.
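The per-1000 figures above follow from the reported point estimates of sensitivity and specificity at a given prevalence. The short Python sketch below reproduces that arithmetic. It assumes that every screen-positive person receives one rapid diagnostic test and that only point estimates are used (confidence intervals are ignored). The function names are illustrative and do not come from the study.

```python
def screen_positives(sensitivity, specificity, prevalence, cohort=1000):
    """Expected (total screen positives, true positives) in a screened cohort."""
    with_tb = cohort * prevalence
    without_tb = cohort - with_tb
    true_pos = sensitivity * with_tb            # tuberculosis cases flagged by the screen
    false_pos = (1 - specificity) * without_tb  # people without tuberculosis flagged anyway
    return true_pos + false_pos, true_pos


def compare(strategy, reference, prevalence):
    """Difference (strategy minus reference) in tests needed and cases detected per 1000."""
    s_tests, s_cases = screen_positives(*strategy, prevalence)
    r_tests, r_cases = screen_positives(*reference, prevalence)
    return round(s_tests - r_tests), round(s_cases - r_cases)


# Outpatients on ART, 5% prevalence: W4SS plus chest x-ray (parallel) vs W4SS alone.
print(compare((0.89, 0.33), (0.53, 0.71), 0.05))  # (379, 18)

# Outpatients not on ART, 10% prevalence, each vs W4SS alone.
print(compare((0.83, 0.67), (0.85, 0.37), 0.10))  # C-reactive protein alone: (-272, -2)
print(compare((0.84, 0.64), (0.85, 0.37), 0.10))  # W4SS then C-reactive protein: (-244, -1)
```

A positive first value means more rapid diagnostic tests are needed than with W4SS alone. A negative second value means fewer tuberculosis cases are detected.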
INTERPRETATION: C-reactive protein reduces the need for further rapid diagnostic tests without compromising sensitivity and has been included in the updated WHO tuberculosis screening guidelines. However, C-reactive protein data were scarce for outpatients on ART, necessitating future research regarding the utility of C-reactive protein in this group. Chest x-ray can be useful in outpatients on ART when combined with W4SS. The WHO-recommended algorithm has suboptimal sensitivity; Xpert for all offers slight sensitivity gains and would have major resource implications.
FUNDING: World Health Organization.