METHODS: We conducted an individual patient data meta-analysis. We fit bivariate random-effects models to assess diagnostic accuracy.
RESULTS: 16 742 participants (2097 major depression cases) from 54 studies were included. The correlation between PHQ-8 and PHQ-9 scores was 0.996 (95% confidence interval 0.996 to 0.996). The standard cutoff score of 10 for the PHQ-9 maximized sensitivity + specificity for the PHQ-8 among studies that used a semi-structured diagnostic interview reference standard (N = 27). At cutoff 10, the PHQ-8 was less sensitive by 0.02 (-0.06 to 0.00) and more specific by 0.01 (0.00 to 0.01) among those studies (N = 27), with similar results for studies that used other types of interviews (N = 27). For all 54 primary studies combined, across all cutoffs, the PHQ-8 was less sensitive than the PHQ-9 by 0.00 to 0.05 (0.03 at cutoff 10), and specificity was within 0.01 for all cutoffs (0.00 to 0.01).
CONCLUSIONS: PHQ-8 and PHQ-9 total scores were similar. Sensitivity may be minimally reduced with the PHQ-8, but specificity is similar.
METHODS: Data accrued for an IPDMA on HADS-D diagnostic accuracy were analysed. We fit binomial generalized linear mixed models to compare odds of major depression classification for the Structured Clinical Interview for DSM (SCID), Composite International Diagnostic Interview (CIDI), and Mini International Neuropsychiatric Interview (MINI), controlling for HADS-D scores and participant characteristics with and without an interaction term between interview and HADS-D scores.
RESULTS: There were 15,856 participants (1942 [12%] with major depression) from 73 studies, including 15,335 (97%) non-psychiatric medical patients, 164 (1%) partners of medical patients, and 357 (2%) healthy adults. The MINI (27 studies, 7345 participants, 1066 major depression cases) classified participants as having major depression more often than the CIDI (10 studies, 3023 participants, 269 cases) (adjusted odds ratio [aOR] = 1.70 (0.84, 3.43)) and the semi-structured SCID (36 studies, 5488 participants, 607 cases) (aOR = 1.52 (1.01, 2.30)). The odds ratio for major depression classification with the CIDI was less likely to increase as HADS-D scores increased than for the SCID (interaction aOR = 0.92 (0.88, 0.96)).
CONCLUSION: Compared to the SCID, the MINI may diagnose more participants as having major depression, and the CIDI may be less responsive to symptom severity.
OBJECTIVE: To use an individual participant data meta-analysis to evaluate the accuracy of two PHQ-9 diagnostic algorithms for detecting major depression and compare accuracy between the algorithms and the standard PHQ-9 cutoff score of ≥10.
METHODS: Medline, Medline In-Process and Other Non-Indexed Citations, PsycINFO, Web of Science (January 1, 2000, to February 7, 2015). Eligible studies that classified current major depression status using a validated diagnostic interview.
RESULTS: Data were included for 54 of 72 identified eligible studies (n participants = 16,688, n cases = 2,091). Among studies that used a semi-structured interview, pooled sensitivity and specificity (95% confidence interval) were 0.57 (0.49, 0.64) and 0.95 (0.94, 0.97) for the original algorithm and 0.61 (0.54, 0.68) and 0.95 (0.93, 0.96) for a modified algorithm. Algorithm sensitivity was 0.22-0.24 lower compared to fully structured interviews and 0.06-0.07 lower compared to the Mini International Neuropsychiatric Interview. Specificity was similar across reference standards. For PHQ-9 cutoff of ≥10 compared to semi-structured interviews, sensitivity and specificity (95% confidence interval) were 0.88 (0.82-0.92) and 0.86 (0.82-0.88).
CONCLUSIONS: The cutoff score approach appears to be a better option than a PHQ-9 algorithm for detecting major depression.