CC BY-NC Open access
Research

Accuracy of the Edinburgh Postnatal Depression Scale (EPDS) for screening to detect major depression among pregnant and postpartum women: systematic review and meta-analysis of individual participant data

BMJ 2020; 371 doi: https://doi.org/10.1136/bmj.m4022 (Published 11 November 2020) Cite this as: BMJ 2020;371:m4022
  1. Brooke Levis, postdoctoral research fellow1 2 3,
  2. Zelalem Negeri, postdoctoral research fellow1 2,
  3. Ying Sun, research coordinator1,
  4. Andrea Benedetti, associate professor2 4 5,
  5. Brett D Thombs, professor1 2 5 6 7 8 9
  6. on behalf of the DEPRESsion Screening Data (DEPRESSD) EPDS Group
    1. Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada
    2. Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Quebec, Canada
    3. Centre for Prognosis Research, School of Medicine, Keele University, Staffordshire, UK
    4. Respiratory Epidemiology and Clinical Research Unit, McGill University Health Centre, Montreal, Quebec, Canada
    5. Department of Medicine, McGill University, Montreal, Quebec, Canada
    6. Department of Psychiatry, McGill University, Montreal, Quebec, Canada
    7. Department of Psychology, McGill University, Montreal, Quebec, Canada
    8. Department of Educational and Counselling Psychology, McGill University, Montreal, Quebec, Canada
    9. Biomedical Ethics Unit, McGill University, Montreal, Quebec, Canada
    1. Correspondence to: B D Thombs, Jewish General Hospital, 4333 Cote Ste Catherine Road, Montreal, Quebec, Canada, H3T 1E4 brett.thombs{at}mcgill.ca
    • Accepted 10 September 2020

    Abstract

    Objective To evaluate the Edinburgh Postnatal Depression Scale (EPDS) for screening to detect major depression in pregnant and postpartum women.

    Design Individual participant data meta-analysis.

    Data sources Medline, Medline In-Process and Other Non-Indexed Citations, PsycINFO, and Web of Science (from inception to 3 October 2018).

    Eligibility criteria for selecting studies Eligible datasets included EPDS scores and major depression classification based on validated diagnostic interviews. Bivariate random effects meta-analysis was used to estimate EPDS sensitivity and specificity compared with semi-structured, fully structured (Mini International Neuropsychiatric Interview (MINI) excluded), and MINI diagnostic interviews separately using individual participant data. One stage meta-regression was used to examine accuracy by reference standard categories and participant characteristics.

    Results Individual participant data were obtained from 58 of 83 eligible studies (70%; 15 557 of 22 788 eligible participants (68%), 2069 with major depression). Combined sensitivity and specificity was maximised at a cut-off value of 11 or higher across reference standards. Among studies with a semi-structured interview (36 studies, 9066 participants, 1330 with major depression), sensitivity and specificity were 0.85 (95% confidence interval 0.79 to 0.90) and 0.84 (0.79 to 0.88) for a cut-off value of 10 or higher, 0.81 (0.75 to 0.87) and 0.88 (0.85 to 0.91) for a cut-off value of 11 or higher, and 0.66 (0.58 to 0.74) and 0.95 (0.92 to 0.96) for a cut-off value of 13 or higher, respectively. Accuracy was similar across reference standards and subgroups, including for pregnant and postpartum women.

    Conclusions An EPDS cut-off value of 11 or higher maximised combined sensitivity and specificity; a cut-off value of 13 or higher was less sensitive but more specific. To identify pregnant and postpartum women with higher symptom levels, a cut-off of 13 or higher could be used. Lower cut-off values could be used if the intention is to avoid false negatives and identify most patients who meet diagnostic criteria.

    Registration PROSPERO (CRD42015024785).

    Introduction

    Depression is common in pregnant and postpartum women and is associated with adverse outcomes for the mother, developing child, mother-infant relationship, and intimate partner relationship.12 Depression screening could potentially improve detection and management of perinatal depression. Depression screening involves the use of self-report depression symptom questionnaires to identify women above a preidentified cut-off value for further evaluation to determine whether depression is present.34 In the United Kingdom, the National Institute for Health and Care Excellence guidelines5 suggest that healthcare providers consider asking pregnant or postpartum women the two Whooley questions,6 and administering the Edinburgh Postnatal Depression Scale (EPDS) or Patient Health Questionnaire-9 screening questionnaires as part of a full assessment if depression is suspected. The guidelines do not recommend administering a screening tool to all women. The UK National Screening Committee7 and Canadian Task Force on Preventive Health Care8 recommend against screening owing to concerns about false positives, possible harms, and the lack of evidence from well conducted trials that screening improves mental health outcomes. However, the United States Preventive Services Task Force (USPSTF)9 and Australian national guidelines10 recommend depression screening in pregnant and postpartum women, although the USPSTF notes that “screening should be implemented with adequate systems in place to ensure accurate diagnosis, effective treatment, and appropriate follow-up.” Depression screening is sometimes promoted in low and middle income countries, but it is not known whether it would improve mental health in those settings.2

    The 10 item EPDS is the most commonly used depression screening tool in perinatal care; cut-off values of 10 or higher and 13 or higher are most often used to identify women who might have depression.1112131415 The USPSTF recommends screening pregnant and postpartum women with the EPDS, but does not specify a cut-off value.9 The systematic review conducted to support the USPSTF guideline reported the range of accuracy estimates for EPDS cut-off values of 10 or higher (14 studies) and 13 or higher (17 studies) in 23 primary studies, but did not include a meta-analysis.1415 An existing meta-analysis that has examined EPDS screening accuracy searched databases through February 2007 and found that combined sensitivity and specificity to detect major depression in postpartum women was greater for a cut-off value of 12 or higher (sensitivity 0.86, specificity 0.87, 15 studies) than for a cut-off value of 10 or higher (sensitivity 0.92, specificity 0.77, 14 studies) or 13 or higher (sensitivity 0.79, specificity 0.89, 18 studies) among a total of 18 studies.13 The results were not pooled for pregnant women because there were too few studies, and no subgroup analyses were conducted among postpartum women because primary studies did not report the necessary data. Estimates were not done separately for different types of reference standards, although important differences exist in design and structure, and in the likelihood of major depression classification between different diagnostic interviews.161718 Therefore, the optimal cut-off value for screening remains unknown, and whether different cut-off values are needed for women with different characteristics needs to be determined.

    Whereas conventional meta-analyses synthesise aggregate results from study reports, individual participant data meta-analysis (IPDMA) involves the synthesis of participant level data from primary studies.19 The advantages of an IPDMA of the EPDS are the ability to include data from studies that collected EPDS and reference standard outcomes but did not publish accuracy results; the results for all cut-off values from all included studies can be taken into account rather than just published cut-off results; subgroup analyses can be conducted, which were not done in primary studies; and accuracy results can be reported separately for different reference standards. Our objectives were to use IPDMA to evaluate EPDS screening accuracy among studies that used different types of reference standards separately, with semi-structured interviews prioritised; and to investigate whether EPDS screening accuracy differs based on pregnant versus postpartum status, age, and country human development index.

    Methods

    This IPDMA was registered in PROSPERO (CRD42015024785), a protocol was published,20 and the results were reported following PRISMA-DTA (preferred reporting items for a systematic review and meta-analysis of diagnostic test accuracy studies)21 and PRISMA-IPD (preferred reporting items for systematic review and meta-analyses of individual participant data)22 guidelines. We followed similar methods to those used in our previously published Patient Health Questionnaire-9 diagnostic accuracy IPDMA.23 Individual prediction models described in the protocol will be developed in future database versions. Deviations from the protocol include searching from database inception rather than from year 2000, including only one assessment time point for each woman given the small number of studies with multiple time points, and reporting results for cut-off values of 7-15 rather than 9-15.

    Study eligibility

    Datasets from studies that met the following criteria were deemed eligible: they administered the EPDS; diagnostic classification for current major depressive disorder or major depressive episode used Diagnostic and Statistical Manual of Mental Disorders (DSM)242526 or international classification of diseases (ICD)27 criteria based on a validated semi-structured or fully structured interview; the EPDS and diagnostic interview were conducted no more than two weeks apart; participants were women aged at least 18 years who completed assessments during pregnancy or within 12 months of giving birth; and participants were not recruited because they were receiving psychiatric assessment or care, or because they had been identified as having possible depression. The last criterion was applied because screening seeks to identify women with otherwise unrecognised major depression.28 Studies in which some participants did not meet eligibility criteria were included in the IPDMA if primary data allowed for the selection of eligible participants.

    Database searches and study selection

    A medical librarian designed a peer reviewed29 search strategy (eMethods1 in supplementary material) and searched Medline, Medline In-Process and Other Non-Indexed Citations, and PsycINFO through OvidSP, and Web of Science through ISI Web of Knowledge from inception to 3 October 2018. Additionally, investigators examined citations from relevant reviews and requested information about unpublished studies from authors who contributed studies. Citations identified by the search were uploaded into RefWorks (RefWorks-COS, Bethesda, MD, USA). Duplicates were removed and unique citations were uploaded into DistillerSR (Evidence Partners, Ottawa, Canada).

    Two reviewers independently reviewed titles and abstracts. For publications deemed potentially eligible by either reviewer, a full text review was done by two reviewers, also independently. Any conflicts were resolved by consensus and a third reviewer was consulted if necessary.

    Data contribution, extraction, and synthesis

    We invited investigators with eligible datasets to contribute deidentified versions of their datasets. We attempted to contact corresponding authors of eligible primary studies by email up to three times, as necessary. When authors did not respond to our emails, we tried to contact them by phone and emailed their coauthors.

    Two investigators independently extracted information on the diagnostic interview administered and the country of study from the published reports. We used the United Nations’ human development index, based on year of study publication, which reflects life expectancy, education, and income,30 to categorise countries as very high, high, or low-medium development. Participant level data in the synthesised dataset included age, pregnant or postpartum status, EPDS scores, and major depression classification status. We used major depressive disorder or major depressive episode based on the DSM or ICD criteria to classify major depression; if both were reported, we used major depressive episode because screening attempts to detect episodes of depression. Additional assessment would be needed to determine if episodes are related to major depressive disorder or another psychiatric disorder (bipolar disorder, persistent depressive disorder). We also prioritised DSM over ICD. We used statistical weights to reflect sampling procedures if provided in the datasets; for instance, when primary studies administered a diagnostic interview to all participants with positive screening results but only a random sample of those with negative results. Some studies used sampling procedures that merited weights but did not use weights. For those studies, we used inverse selection probabilities to generate appropriate weights.
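
    As an illustration of inverse selection probability weighting, the sketch below assumes a hypothetical study in which selection for the diagnostic interview depended only on screening status. The function name and inputs are illustrative assumptions and are not taken from the study datasets or analysis code.

```python
from collections import Counter


def inverse_selection_weights(screen_positive, interviewed):
    """Return one weight per participant.

    Assumption: the probability of receiving the diagnostic interview
    depends only on the screening stratum (positive or negative).
    Interviewed participants get 1 / P(interviewed | stratum);
    participants who were not interviewed get a weight of 0.0.
    """
    # Number of participants in each screening stratum.
    totals = Counter(screen_positive)
    # Number of interviewed participants in each screening stratum.
    sampled = Counter(s for s, i in zip(screen_positive, interviewed) if i)

    weights = []
    for stratum, was_interviewed in zip(screen_positive, interviewed):
        if was_interviewed:
            # Inverse of the observed sampling fraction in this stratum.
            weights.append(totals[stratum] / sampled[stratum])
        else:
            weights.append(0.0)
    return weights
```

    In this sketch, if every participant with a positive screen but only half of those with a negative screen were interviewed, interviewed participants with negative screens would each receive a weight of 2, so that the weighted sample reflects the full screened population.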

    We verified that participant characteristics and accuracy results from individual datasets matched those that had been published. If any discrepancies were found, we worked with the primary study investigators to understand and resolve differences. All study level and individual level participant data were transformed into a standardised format and combined in a single synthesised dataset. For nine studies that collected data at multiple time points (four with two time points, four with three time points, and one with four time points), we selected the time point with the most participants. If the number of participants was maximised at multiple time points, we selected the one with the most women who had major depression.
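
    The time point selection rule described above can be sketched as follows. The dictionary keys and the function name are hypothetical and are used only for illustration.

```python
def select_time_point(time_points):
    """Pick one assessment time point for a study.

    Each element of time_points is a dict with the (hypothetical) keys
    'label', 'n_participants', and 'n_major_depression'.
    The time point with the most participants is chosen; ties are broken
    by the number of participants with major depression.
    """
    return max(
        time_points,
        key=lambda tp: (tp["n_participants"], tp["n_major_depression"]),
    )
```

    Comparing tuples applies the two criteria in order, so the number of women with major depression matters only when two time points have the same number of participants.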

    Risk of bias assessment

    We used the Quality Assessment of Diagnostic Accuracy Studies-2 tool (QUADAS-2; eMethods2)31 to assess risk of bias of included studies. Two investigators independently performed this assessment and any differences were resolved through consensus or by involving a third investigator if necessary. Values used in the risk of bias assessment were coded at both study and participant levels because some values might have differed among participants from the same study (eg, time interval between index test and reference standard).

    Statistical analyses

    We estimated sensitivity and specificity for three reference standard categories separately across cut-off values of 7-15 for all women. Reference standard categories included semi-structured interviews (Structured Clinical Interview for DSM Disorders (SCID),32 Clinical Interview Schedule,33 Diagnostic Interview for Genetic Studies34), fully structured interviews other than the Mini International Neuropsychiatric Interview (MINI)3536 (Composite International Diagnostic Interview (CIDI),37 Clinical Interview Schedule-Revised38), and the MINI. We analysed studies that used different types of reference standards separately because we previously found that, controlling for depressive symptom levels, the MINI might classify more participants as having major depression than other diagnostic interviews, and the CIDI might classify more participants with low level symptoms as having depression but fewer with high level symptoms.161718 These findings are consistent with the design of the different types of diagnostic interviews. Semi-structured interviews are designed to be administered by an experienced diagnostician who can incorporate probes and queries, and use clinical judgment. Fully structured interviews are entirely scripted so that they can be administered by a trained lay interviewer and reduce required resources. By design, fully structured interviews are intended to increase standardisation, but this could be at the cost of reduced validity.39404142 The MINI is a brief version of a fully structured interview that was designed for rapid administration and tends to be overinclusive.36

    We fit bivariate random effects models using Gauss-Hermite quadrature for each reference standard category, for cut-off values of 7-15 separately.43 This is a two stage meta-analytic approach that models sensitivity and specificity simultaneously and accounts for the correlation between them, and for precision of estimates within studies (that is, the clustering). This model provided estimates of pooled sensitivity and specificity for each analysis. We found four studies for the fully structured subgroup, one of which included only one participant with major depression. For this subgroup, we modified the bivariate model by setting the correlation between random effects to zero, and excluded the major depression case from the study that had only one major depression case. Therefore, we used three studies to evaluate sensitivity and four studies to measure specificity.
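
    For orientation, one common way of writing a bivariate random effects model for a single reference standard category and cut-off value is sketched below. The notation is illustrative and is not taken from the study protocol; the parameterisation actually fitted might differ. Here y_{1i} is the number of women with major depression in study i who score at or above the cut-off value (out of n_{1i}), and y_{0i} is the number without major depression who score below it (out of n_{0i}).

```latex
\begin{aligned}
y_{1i} &\sim \mathrm{Binomial}(n_{1i},\ \mathrm{Se}_i), &
\operatorname{logit}(\mathrm{Se}_i) &= \mu_{\mathrm{Se}} + u_i, \\
y_{0i} &\sim \mathrm{Binomial}(n_{0i},\ \mathrm{Sp}_i), &
\operatorname{logit}(\mathrm{Sp}_i) &= \mu_{\mathrm{Sp}} + v_i, \\
\begin{pmatrix} u_i \\ v_i \end{pmatrix} &\sim
N\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix}
\tau_{\mathrm{Se}}^{2} & \rho\,\tau_{\mathrm{Se}}\tau_{\mathrm{Sp}} \\
\rho\,\tau_{\mathrm{Se}}\tau_{\mathrm{Sp}} & \tau_{\mathrm{Sp}}^{2}
\end{pmatrix}
\right)
\end{aligned}
```

    In this notation, pooled sensitivity and specificity are obtained by back-transforming the fixed effects, for example Se = exp(μ_Se) / (1 + exp(μ_Se)), and the modification used for the fully structured subgroup corresponds to setting ρ = 0.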

    We constructed empirical receiver operating characteristic curves based on pooled sensitivity and specificity estimates, and calculated area under the curves for each reference standard category. Additionally, we conducted one stage meta-regressions with interactions between reference standard category (reference category: semi-structured) and accuracy coefficients (logit(sensitivity) and logit(1−specificity)). We generated nomograms to present positive and negative predictive values for the optimal cut-off value (maximising Youden’s J=sensitivity+specificity−1) and the commonly used cut-off values of 10 or higher and 13 or higher for assumed major depression prevalence of 5-25%.

    We evaluated heterogeneity for each reference standard category by generating sensitivity and specificity forest plots for each study for the optimal cut-off value and for cut-off values of 10 or higher and 13 or higher. We quantified heterogeneity by reporting estimated variances of the random effects for sensitivity and specificity (τ2) and by estimating R, which is the ratio of the estimated standard deviation of the pooled sensitivity (or specificity) from the random effects model to that from the corresponding fixed effects model.44

    Within semi-structured and MINI reference standard categories separately, we fit one stage meta-regressions where we interacted all participant characteristics (age (measured continuously), pregnant v postpartum status (reference category=pregnant), and country human development index (reference category=very high)) with logit(sensitivity) and logit(1−specificity). Too few studies existed that used other fully structured interviews to enable us to perform meta-regressions. We conducted post hoc analyses in which we fit additional one stage meta-regressions for year of study publication. No participants had missing data for any covariates in the meta-regressions. We assessed characteristics one at a time because models attempting to fit all participant characteristics simultaneously did not converge.

    When characteristics were significantly associated with sensitivity or specificity for all or most cut-off values in the meta-regressions, we fit bivariate random effects models for cut-off values of 7-15 for each subgroup. Age was fit continuously in the meta-regression but was dichotomised (<25 v ≥25 years45) to estimate accuracy by subgroups. For analyses in the age less than 25 subgroup, we excluded four semi-structured studies and four MINI studies that did not have any participants with major depression because the bivariate random effects model could not be applied. Therefore, 21 participants (1%) younger than 25 were excluded from semi-structured studies, and 77 (9%) from MINI studies.

    In sensitivity analyses, we conducted additional meta-regressions based on QUADAS-2 scores in semi-structured and MINI reference standard categories separately. We interacted QUADAS-2 domain scores with logit(sensitivity) and logit(1−specificity) for all domain scores with at least 100 participants with major depression and 100 without major depression among those categorised as having low risk of bias and among those with high or unclear risk of bias. We again assessed items one at a time. We performed additional sensitivity analyses for EPDS cut-off values of 10-13 by combining IPDMA accuracy results with results from studies that did not contribute individual participant data but published eligible accuracy results.

    All analyses were run in R (version 3.4.1; RStudio version 1.0.143) by using the glmer function within the lme4 package.

    Patient and public involvement

    No patients were involved in setting the research question or the outcome measures, nor were they involved in developing plans for design or implementation of the study. No patients were asked to advise on interpretation or writing up of results. There are no plans to disseminate the results of the research to study participants or the relevant patient community. However, an online knowledge translation tool, intended for clinicians (the end users of the EPDS screening tool), is available at depressionscreening100.com/epds. The tool allows clinicians to estimate the expected number of positive screens and true and false screening outcomes based on study results.

    Results

    Search results and dataset inclusion

    We identified 4434 unique titles and abstracts from the database search. Of these, 4056 were excluded after title and abstract review and 257 after full text review (eTable1), resulting in 121 eligible articles from 81 unique participant samples. Of these, 56 (69%) contributed datasets (fig 1). Authors of included studies contributed data from two other studies that the search did not retrieve for a total of 58 datasets (15 557 participants, 2069 with major depression). eTable2 shows characteristics of primary studies that contributed data and eligible studies that did not provide datasets. Of 22 788 participants in 83 eligible published studies, 15 557 (68%) were included. Eligible studies that did and did not contribute data were generally similar in terms of sample size, proportion of participants with major depression (excluding non-contributing studies where number with major depression was not reported), and country human development index. The proportion of studies with pregnant women only was also similar for contributing and non-contributing studies. Among both contributing and non-contributing studies, most studies used semi-structured interviews as the reference standard, followed by the MINI, and other fully structured interviews.

    Fig 1

    Flow diagram of study selection process. EPDS=Edinburgh Postnatal Depression Scale

    Of 58 included studies, 25 included pregnant women, 30 postpartum women, and three both pregnant and postpartum women. Thirty six studies used semi-structured reference standards, including 34 that used the SCID; four used fully structured reference standards (MINI excluded), including three that used the CIDI; and 18 used the MINI (table 1).

    Table 1

    Participant data by diagnostic interview

    EPDS sensitivity and specificity by reference standard category

    Table 2 shows sensitivity and specificity estimates for cut-off values of 7-15 by reference standard category. Combined sensitivity and specificity was maximised at a cut-off value of 11 or higher for semi-structured interviews (Youden’s J=0.70), fully structured interviews (Youden’s J=0.73), and the MINI (Youden’s J=0.66). For semi-structured interviews, sensitivity and specificity were 0.85 (95% confidence interval 0.79 to 0.90) and 0.84 (0.79 to 0.88) for a cut-off value of 10 or higher, 0.81 (0.75 to 0.87) and 0.88 (0.85 to 0.91) for a cut-off value of 11 or higher, and 0.66 (0.58 to 0.74) and 0.95 (0.92 to 0.96) for a cut-off value of 13 or higher. eFigure1 shows receiver operating characteristic curves and area under the curve values. No significant differences in accuracy by reference standard category were found that held across all cut-off values (eTable3). Results did not change substantively in sensitivity analyses that included published results from eight of the 26 studies that did not contribute individual participant data but published eligible accuracy results (eTable4). The other 18 eligible datasets that did not contribute individual participant data did not publish eligible diagnostic accuracy results (eTable2b).

    Table 2

    Comparison of sensitivity and specificity estimates for each reference standard category

    Nomograms of positive and negative predictive values by reference standard category are shown in figure 2 (cut-off values of ≥11 and ≥13) and eFigure2 (cut-off value of ≥10). For major depression prevalence values of 5-25% and a cut-off value of 11 or higher compared with semi-structured interviews, positive predictive values ranged from 26% to 69% and negative predictive values ranged from 93% to 99%. Ranges were similar for other reference standard types.
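
    The predictive value ranges quoted above follow from the pooled sensitivity and specificity and an assumed prevalence through Bayes' rule. The short sketch below, with an illustrative function name, reproduces the endpoints for a cut-off value of 11 or higher with semi-structured interviews as the reference standard (sensitivity 0.81, specificity 0.88).

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (positive predictive value, negative predictive value)."""
    # Expected proportions of the screened population in each cell
    # of the 2x2 table of screening result by true status.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


for prevalence in (0.05, 0.25):
    ppv, npv = predictive_values(0.81, 0.88, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.0%}, NPV {npv:.0%}")
# prevalence 5%: PPV 26%, NPV 99%
# prevalence 25%: PPV 69%, NPV 93%
```

    Because the inputs are the rounded pooled estimates, small differences from the nomograms are possible at other prevalence values.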

    Fig 2

    Nomograms of positive and negative predictive values by reference standard category (semi-structured diagnostic interviews, fully structured diagnostic interviews, and the MINI) for major depression prevalence values of 5-25%. Upper left panel: EPDS cut-off value of 11 or higher positive predictive value; upper right panel: EPDS cut-off value of 11 or higher negative predictive value; lower left panel: EPDS cut-off value of 13 or higher positive predictive value; lower right panel: EPDS cut-off value of 13 or higher negative predictive value. EPDS=Edinburgh Postnatal Depression Scale; MINI=Mini International Neuropsychiatric Interview

    Heterogeneity analyses suggested moderate heterogeneity across studies. eFigure3 shows forest plots of sensitivity and specificity, and eTable5 shows τ2 and R values.

    EPDS accuracy among subgroups

    Older age (measured continuously) was associated with higher specificity for both the semi-structured and MINI reference standard categories for cut-off values of 9-15 (eTable3). However, based on bivariate random effects models among participants younger than 25 and among those aged 25 or older, specificity estimates were similar across age groups (median difference across cut-off values of 7-15: 0.02 for semi-structured studies, 0.03 for MINI studies; eTable6). No other study or participant characteristics were consistently associated with differences in sensitivity or specificity estimates across both reference standard categories.

    Risk of bias sensitivity analyses

    eTable7 shows QUADAS-2 ratings for included studies. No QUADAS-2 domain items were consistently associated with differences in sensitivity or specificity estimates across semi-structured and MINI reference standard categories (eTable3).

    Discussion

    Principal findings

    Our main finding was that combined sensitivity and specificity was maximised at a cut-off value of 11 or higher across reference standards. For semi-structured interviews, which are designed to closely replicate clinical diagnoses by mental health professionals, sensitivity and specificity were 81% and 88% for a cut-off value of 11 or higher. At cut-off values of 10 or higher and 13 or higher, which are commonly used for depression screening,13 sensitivity and specificity were 85% and 84%, and 66% and 95%, respectively. Accuracy was similar across reference standards, similar among pregnant and postpartum women, and similar based on other study and participant characteristics.

    Comparison with other studies

    The cut-off value of 11 or higher that maximised combined sensitivity and specificity in the present study is lower than both the most commonly used cut-off value of 13 or higher13 and the cut-off value of 12 or higher that maximised combined sensitivity and specificity in a previous EPDS accuracy meta-analysis.13 Based on studies with a semi-structured reference standard, across cut-off values of 10-13, sensitivity estimates in the present IPDMA were 6-13% lower than those in the previous meta-analysis, whereas specificity estimates were 4-7% higher. Differences in results between the current IPDMA and the previous meta-analysis might have occurred because the current IPDMA consisted of 58 primary studies, including 36 with a semi-structured reference standard, versus 21 primary studies with various types of reference standards in the previous meta-analysis. Additionally, the current IPDMA incorporated data from all cut-off values for all included studies, whereas the previous meta-analysis was limited to published results and used different sets of studies to evaluate accuracy at different cut-off values.

    Implications

    Depression screening recommendations differ among prominent international guideline making bodies, and well conducted trials are needed to determine if screening would improve mental health or other important outcomes, such as child development and family outcomes. The present study found that an EPDS cut-off value of 11 or higher maximised combined sensitivity and specificity. Other cut-off values could be used in practice or clinical trials if either sensitivity or specificity is to be prioritised. For instance, if the intention is to only capture participants with high depressive symptom levels, a higher cut-off value might be desired. Conversely, if the intention is to avoid false negatives and capture all participants who might meet diagnostic criteria based on further evaluation, a lower cut-off value might be preferred. Clinicians considering screening for depression with the EPDS can refer to our online knowledge translation tool (depressionscreening100.com/epds), which estimates expected numbers of positive screens and true and false screening outcomes based on results from our IPDMA.
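
    The trade-off between cut-off values can be made concrete by calculating expected screening outcomes for a hypothetical group of women. The sketch below is not the implementation of the online tool; the function name, the group size of 1000, and the 10% prevalence are assumptions chosen only for illustration, and the sensitivity and specificity values are the pooled estimates for semi-structured interviews.

```python
def expected_screening_outcomes(n_screened, prevalence, sensitivity, specificity):
    """Expected counts of screening outcomes for a screened group."""
    cases = n_screened * prevalence          # women with major depression
    non_cases = n_screened - cases           # women without major depression

    true_pos = sensitivity * cases           # cases with a positive screen
    false_neg = cases - true_pos             # cases missed by the screen
    true_neg = specificity * non_cases       # non-cases with a negative screen
    false_pos = non_cases - true_neg         # non-cases with a positive screen

    return {
        "positive_screens": true_pos + false_pos,
        "true_positives": true_pos,
        "false_positives": false_pos,
        "true_negatives": true_neg,
        "false_negatives": false_neg,
    }


# Pooled estimates for semi-structured interviews at two cut-off values.
for label, sens, spec in (("EPDS >= 11", 0.81, 0.88), ("EPDS >= 13", 0.66, 0.95)):
    result = expected_screening_outcomes(1000, 0.10, sens, spec)
    summary = ", ".join(f"{name} {count:.0f}" for name, count in result.items())
    print(f"{label}: {summary}")
```

    Under these assumptions, a cut-off value of 11 or higher would yield about 189 positive screens (81 true positives and 108 false positives) and 19 missed cases per 1000 women screened, whereas a cut-off value of 13 or higher would yield about 111 positive screens (66 true positives and 45 false positives) and 34 missed cases.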

    Strengths and limitations

    This study used IPDMA to assess EPDS screening accuracy. Strengths include analysis of data from more than twice the number of studies compared with the previous conventional meta-analysis,13 and including results for all cut-off values from all studies. Additionally, our analysis examined the possible influence of study and participant characteristics on accuracy, and assessed accuracy separately across reference standards. A previous meta-analysis of the EPDS, which was a conventional meta-analysis and only included published results, was not able to incorporate results for all key cut-off values from all included studies because they were not consistently published. Furthermore, the previous meta-analysis was not able to conduct subgroup analyses by participant characteristics or reference standards.13 Among other important findings, the present study showed that the same cut-off value can be used in pregnant and postpartum women.

    Limitations also need to be considered. Firstly, we did not obtain data from 25 of 83 published eligible datasets, although the results did not change when we incorporated published results from studies that did not contribute data but published eligible accuracy results. Secondly, moderate heterogeneity was found across studies. Thirdly, we could not conduct subgroup analyses based on cultural aspects, such as country or language, or in specific pregnancy trimesters or postpartum periods because the data were insufficient. We found no significant or substantive differences based on country human development index, but few studies from low and middle income countries were included. Fourthly, while we categorised studies based on the interview administered, interviews might not always be used as intended; one third of studies were coded as unclear for interviewer qualification in our risk of bias assessment.

    Conclusions

    In summary, we found that combined sensitivity and specificity for the EPDS is maximised at a cut-off value of 11 or higher. Additionally, accuracy did not differ significantly based on reference standards or participant characteristics, including whether the EPDS is administered during pregnancy or in the postpartum period. Clinicians considering screening for depression with the EPDS can refer to our online tool (depressionscreening100.com/epds) to identify alternative cut-off values that maximise other parameters. Well conducted trials are needed to determine if screening with the EPDS could improve mental health outcomes and minimise harms and resource use.

    What is already known on this topic

    • The Edinburgh Postnatal Depression Scale (EPDS) is the most commonly used depression screening tool in perinatal care, with cut-off values of 10 or higher and 13 or higher typically used to identify women who might be depressed

    • A previous meta-analysis of the screening accuracy of the EPDS, which was conducted more than 13 years ago, found that a cut-off value of 12 or higher maximised combined sensitivity and specificity in postpartum women (21 studies)

    • The previous meta-analysis did not pool results for pregnant women because too few studies were found, and no subgroup analyses were conducted among postpartum women because primary studies did not report the necessary data

    What this study adds

    • An EPDS cut-off value of 11 or higher maximised combined sensitivity and specificity (81% and 88%, respectively)

    • For commonly used cut-off values of 10 or higher and 13 or higher, sensitivity and specificity were 85% and 84%, and 66% and 95%, respectively; results did not differ across subgroups, including pregnant versus postpartum status

    • An online knowledge translation tool is available to estimate the expected number of positive screens and true and false screening outcomes based on study results (depressionscreening100.com/epds)
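
    The arithmetic behind such estimates can be sketched in a few lines. The example below is a minimal illustration, not the implementation of the online tool: it takes the sensitivity and specificity values listed above for cut-off values of 10, 11, and 13 or higher and converts them into expected numbers of positive screens and true and false screening outcomes. The function name, the cohort size of 1000 women, and the 10% prevalence of major depression are assumptions chosen for illustration only and are not taken from the study.

```python
def expected_screening_outcomes(n, prevalence, sensitivity, specificity):
    """Expected counts of screening outcomes among n women screened.

    All inputs are proportions between 0 and 1, except n (number screened).
    """
    cases = n * prevalence              # women with major depression
    non_cases = n - cases               # women without major depression
    true_pos = cases * sensitivity      # cases correctly screened positive
    false_neg = cases - true_pos        # cases missed by the screen
    true_neg = non_cases * specificity  # non-cases correctly screened negative
    false_pos = non_cases - true_neg    # non-cases screened positive
    return {
        "positive_screens": true_pos + false_pos,
        "true_positives": true_pos,
        "false_positives": false_pos,
        "false_negatives": false_neg,
        "true_negatives": true_neg,
    }


# Sensitivity and specificity for each cut-off value, as listed above.
CUTOFFS = {10: (0.85, 0.84), 11: (0.81, 0.88), 13: (0.66, 0.95)}

# Hypothetical values for illustration only; not estimates from the study.
ASSUMED_PREVALENCE = 0.10
ASSUMED_COHORT_SIZE = 1000

if __name__ == "__main__":
    for cutoff, (sens, spec) in CUTOFFS.items():
        out = expected_screening_outcomes(
            ASSUMED_COHORT_SIZE, ASSUMED_PREVALENCE, sens, spec
        )
        print(
            f"EPDS >= {cutoff}: "
            f"{out['positive_screens']:.0f} positive screens, "
            f"{out['true_positives']:.0f} true positives, "
            f"{out['false_positives']:.0f} false positives, "
            f"{out['false_negatives']:.0f} false negatives"
        )
```

    Under these assumed inputs, raising the cut-off value lowers the number of false positive screens but increases the number of missed cases. The balance shifts with the prevalence in the population being screened, so the prevalence should be replaced with a locally appropriate estimate.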

    Acknowledgments

    Members of the DEPRESSD EPDS Group: Chen He, Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada; Ankur Krishnan, Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada; Yin Wu, Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada; Parash Mani Bhandari, Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada; Dipika Neupane, Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada; Mahrukh Imran, Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada; Danielle B Rice, Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada; Marleine Azar, Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada; Tatiana A Sanchez, Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada; Matthew J Chiovitti, Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada; Nazanin Saadat, Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada; Kira E Riehm, Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada; Jill T Boruff, Schulich Library of Physical Sciences, Life Sciences, and Engineering, McGill University, Montreal, Québec, Canada; Lorie A Kloda, Library, Concordia University, Montréal, Québec, Canada; Pim Cuijpers, Department of Clinical, Neuro and Developmental Psychology, EMGO Institute, Vrije Universiteit Amsterdam, Netherlands; Simon Gilbody, Hull York Medical School and the Department of Health Sciences, University of York, Heslington, York, UK; John P A Ioannidis, Department of Medicine, Department of Epidemiology and Population Health, Department of Biomedical Data Science, Department of Statistics, Stanford University, 
Stanford, CA, USA; Dean McMillan, Hull York Medical School and the Department of Health Sciences, University of York, Heslington, York, UK; Scott B Patten, Departments of Community Health Sciences and Psychiatry, University of Calgary, Calgary, Canada; Ian Shrier, Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada; Roy C Ziegelstein, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Liane Comeau, International Union for Health Promotion and Health Education, École de santé publique de l’Université de Montréal, Montréal, Québec, Canada; Nicholas D Mitchell, Department of Psychiatry, University of Alberta, Edmonton, Alberta, Canada; Marcello Tonelli, Department of Medicine, University of Calgary, Calgary, Alberta, Canada; Simone N Vigod, Women’s College Hospital and Research Institute, University of Toronto, Toronto, Ontario, Canada; Franca Aceti, Department of Neurology and Psychiatry, Sapienza University of Rome, Rome, Italy; Rubén Alvarado, School of Public Health, Faculty of Medicine, Universidad de Chile, Santiago, Chile; Cosme Alvarado-Esquivel, Laboratorio de Investigación Biomédica, Facultad de Medicina y Nutrición, Avenida Universidad, Dgo, Mexico; Muideen O Bakare, Child and Adolescent Unit, Federal Neuropsychiatric Hospital, Enugu, Nigeria; Jacqueline Barnes, Department of Psychological Sciences, Birkbeck, University of London, UK; Amar D Bavle, Department of Psychiatry, Rajarajeswari Medical College and Hospital, Bengaluru, Karnataka, India; Cheryl Tatano Beck, University of Connecticut School of Nursing, Mansfield, CT, USA; Carola Bindt, Department of Child and Adolescent Psychiatry, University Medical Center Hamburg-Eppendorf, Germany; Philip M Boyce, Discipline of Psychiatry, Westmead Clinical School, Sydney Medical School, University of Sydney, Sydney, Australia; Adomas Bunevicius, Neuroscience Institute, Lithuanian University of Health Sciences, Kaunas, Lithuania; 
Humberto Correa, Medicine Faculty – Universidade Federal de Minas Gerais. Belo Horizonte, MG, Brazil; Tiago Castro e Couto, Federal University of Uberlândia, Brazil; Linda H Chaudron, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA; Genesis Chorwe-Sungani, Department of Mental Health, Kamuzu College of Nursing, University of Malawi, Blantyre, Malawi; Felipe Pinheiro de Figueiredo, Department of Neurosciences and Behaviour, Ribeirão Preto Medical School, Brazil; Valsamma Eapen, University of New South Wales and Ingham Institute South West Sydney LHD, Australia; Nicolas Favez, Faculty of Psychology and Educational Sciences, University of Geneva, Geneva, Switzerland; Ethel Felice, Department of Psychiatry, Mount Carmel Hospital, Attard, Malta; Gracia Fellmeth, Shoklo Malaria Research Unit, Mahidol-Oxford Tropical Medicine Research Unit, Mae Sot, Thailand; Michelle Fernandes, Faculty of Medicine, Department of Paediatrics, University of Southampton, Southampton and Nuffield Department of Women’s and Reproductive Health, University of Oxford, UK; Sally Field, Perinatal Mental Health Project, Alan J. Flisher Centre for Public Mental Health, Department of Psychiatry and Mental Health, University of Cape Town, Cape Town, South Africa; Barbara Figueiredo, School of Psychology, University of Minho, Portugal; Jane R W Fisher, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia; Lluïsa Garcia-Esteve, Perinatal Mental Health Unit CLINIC-BCN, Institut Clínic de Neurociències, Hospital Clínic, Barcelona, Spain; Lisa Giardinelli, Psychiatry Unit, Department of Health Sciences, University of Florence, Firenze, Italy; Eric P Green, Duke Global Health Institute, Durham, NC, USA; Nadine Helle, Department of Child and Adolescent Psychiatry, University Medical Center Hamburg-Eppendorf, Germany; Simone Honikman, Perinatal Mental Health Project, Alan J. 
Flisher Centre for Public Mental Health, Department of Psychiatry and Mental Health, University of Cape Town, Cape Town, South Africa; Louise M Howard, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK; Pirjo A Kettunen, Department of General Hospital Psychiatry, North Karelia Central Hospital, Joensuu, Finland; Dina Sami Khalifa, Ahfad University for Women, Omdurman, Sudan; Jane Kohlhoff, University of New South Wales, Kensington, Australia; Laima Kusminskas, Private Practice, Hamburg, Germany; Zoltán Kozinszky, Department of Obstetrics and Gynaecology, Danderyd Hospital, Stockholm, Sweden; Lorenzo Lelli, Psychiatry Unit, Department of Health Sciences, University of Florence, Firenze, Italy; Angeliki A Leonardou, First Department of Psychiatry, Women’s Mental Health Clinic, Athens University Medical School, Athens, Greece; Michael Maes, Department of Psychiatry, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand; Carina Y Marsay, Department of Psychiatry, University of the Witwatersrand, Johannesburg, South Africa; Pablo Martínez, Escuela de Psicología, Facultad de Humanidades, Universidad de Santiago de Chile, Santiago, Chile; Valentina Meuti, Department of Neurology and Psychiatry, Sapienza University of Rome, Rome, Italy; Sandra Nakić Radoš, Department of Psychology, Catholic University of Croatia, Zagreb, Croatia; Purificación Navarro García, Perinatal Mental Health Unit CLINIC-BCN, Institut Clínic de Neurociències, Hospital Clínic, Barcelona, Spain; Daisuke Nishi, Department of Mental Health, Graduate School of Medicine, University of Tokyo, Japan; Daniel Okitundu Luwa E-Andjafono, Unité de Neuropsychologie, Département de Neurologie, Centre Neuro-psycho-pathologique, Faculté de Médecine, Université de Kinshasa, République Démocratique du Congo; Susan J Pawlby, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK; Emma Robertson-Blackmore, Halifax Health, Graduate Medical 
Education, Daytona Beach, FL, USA; Tamsen J Rochat, MRC/Developmental Pathways to Health Research Unit, Faculty of Health Sciences, University of Witwatersrand, South Africa; Heather J Rowe, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia; Deborah J Sharp, Centre for Academic Primary Care, Bristol Medical School, University of Bristol, UK; Bonnie W M Siu, Department of Psychiatry, Castle Peak Hospital, Hong Kong SAR, China; Alkistis Skalkidou, Department of Women’s and Children’s Health, Uppsala University, Uppsala, Sweden; Johanne Smith-Nielsen, Center for Early Intervention and Family Studies, Department of Psychology, University of Copenhagen, Denmark; Alan Stein, Department of Psychiatry, University of Oxford, Oxford, UK; Robert C Stewart, Department of Mental Health, College of Medicine, University of Malawi, Malawi; Kuan-Pin Su, An-Nan Hospital, China Medical University and Mind-Body Interface Laboratory, China Medical University Hospital, Taiwan; Inger Sundström-Poromaa, Department of Women’s and Children’s Health, Uppsala University, Uppsala, Sweden; Meri Tadinac, Department of Psychology, Faculty of Humanities and Social Sciences, University of Zagreb, Croatia; S Darius Tandon, Northwestern University Feinberg School of Medicine, Chicago, IL, USA; Iva Tendais, School of Psychology, University of Minho, Portugal; Pavaani Thiagayson, The Institute of Mental Health, Singapore; Annamária Töreki, Department of Emergency, University of Szeged, Hungary; Anna Torres-Giménez, Perinatal Mental Health Unit CLINIC-BCN, Institut Clínic de Neurociències, Hospital Clínic, Barcelona, Spain; Thach D Tran, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia; Kylee Trevillion, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK; Friday P Tungchama, Department of Psychiatry, College of Health Sciences, University of Jos, Nigeria; Katherine Turner, Epilepsy 
Center-Child Neuropsychiatry Unit, ASST Santi Paolo Carlo, San Paolo Hospital, Milan, Italy; Mette S Væver, Centre for Early Intervention and Family Studies, Department of Psychology, University of Copenhagen, Copenhagen, Denmark; Thandi van Heyningen, Perinatal Mental Health Project, Alan J. Flisher Centre for Public Mental Health, Department of Psychiatry and Mental Health, University of Cape Town, Cape Town, South Africa; Johann M Vega-Dienstmaier, Facultad de Medicina Alberto Hurtado, Universidad Peruana Cayetano Heredia, Lima, Perú; Karen Wynter, School of Nursing and Midwifery, Deakin University, Melbourne, Australia; and Kimberly A Yonkers, Department of Psychiatry, Yale School of Medicine, New Haven, CT, USA.

    Footnotes

    • Contributors: BL, ABe, BDT were responsible for the study conception and design; BL, ZN, YS, BDT contributed to data extraction, coding, evaluation of included studies, and data synthesis; BL, ZN, ABe, BDT contributed to data analysis and interpretation; BL, ABe, BDT drafted the manuscript. Members of the DEPRESSD EPDS Group contributed to data extraction, coding, and synthesis: CH, AK, YW, PMBh, DNe, MI, DBR, MA, TAS, MJC, NS, KER; via the design and conduct of database searches: JTB, LAK; as members of the DEPRESSD Steering Committee, including conception and oversight of collaboration: PC, SG, JPAI, DM, SBP, IS, RCZ; as knowledge user consultants: LC, NDM, MTo, SNV; by contributing included datasets: FA, RA, CAE, MOB, JB, ADB, CTB, CB, PMBo, ABu, HC, TCeC, LHC, GCS, FPdF, VE, NF, EF, GF, MF, SF, BF, JRWF, LGE, LG, EPG, NH, SH, LMH, PAK, DSK, JK, LK, ZK, LL, AAL, MM, CYM, PM, VM, SNR, PNG, DNi, DOLEA, SJP, ERB, TJR, HJR, DJS, BWMS, ASk, JSN, ASt, RCS, KPS, ISP, MTa, SDT, IT, PT, AT, ATG, TDT, KTr, FPT, KTu, MSV, TvH, JMVD, KW, KAY. All authors, including group authors, provided a critical review and approved the final manuscript. ABe and BDT contributed equally as co-senior authors and are the guarantors; they had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analyses. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

    • Funding: This study was funded by the Canadian Institutes of Health Research (CIHR, KRS-140994). BL and YW were supported by Fonds de recherche du Québec – Santé (FRQS) Postdoctoral Training Fellowships. ABe and BDT were supported by FRQS researcher salary awards. PMBh was supported by a studentship from the Research Institute of the McGill University Health Centre. DNe was supported by a G R Caverhill Fellowship from the Faculty of Medicine, McGill University. DBR was supported by a Vanier Canada Graduate Scholarship. MA was supported by an FRQS Masters Training Award. The primary study by Alvarado et al was supported by the Ministry of Health of Chile. The primary study by Barnes et al was supported by a grant from the Health Foundation (1665/608). The primary study by Beck et al was supported by the Patrick and Catherine Weldon Donaghue Medical Research Foundation and the University of Connecticut Research Foundation. The primary study by Helle et al was supported by the Werner Otto Foundation, the Kroschke Foundation, and the Feindt Foundation. Robertas Bunevicius (1958-2016) was principal investigator of the primary study by Bunevicius et al, but passed away and was unable to participate in this project. The primary study by Figueira et al was supported by the Brazilian Ministry of Health and by the National Council of Technological and Scientific Development (CNPq; grant No 403433/2004-5). The primary study by Couto et al was supported by CNPq (grant No 444254/2014-5) and the Minas Gerais State Research Foundation (FAPEMIG; grant No APQ-01954-14). The primary study by Chaudron et al was supported by a grant from the National Institute of Mental Health (grant K23 MH64476). The primary study by Chorwe-Sungani et al was supported by the University of Malawi through grant QZA-0484 NORHED 2013. The primary study by de Figueiredo et al was supported by Fundação de Amparo à Pesquisa do Estado de São Paulo. 
The primary study by Tissot et al was supported by the Swiss National Science Foundation (grant 32003B 125493). The primary study by Fernandes et al was supported by grants from the Child: Care Health and Development Trust and the Department of Psychiatry, University of Oxford, UK, and by the Ashok Ranganathan Bursary from Exeter College, University of Oxford. Dr Fernandes is supported by a University of Southampton National Institute for Health Research (NIHR) academic clinical fellowship in Paediatrics. The primary study by van Heyningen et al was supported by the Medical Research Council of South Africa (fund No 415865), Cordaid Netherlands (project 103/10002 G Sub 7) and the Truworths Community Foundation Trust, South Africa. Dr van Heyningen was supported by the National Research Foundation of South Africa and the Harry Crossley Foundation (VHYTHE001/1232209). The primary study by Tendais et al was supported under the project POCI/SAU-ESP/56397/2004 by the Operational Program Science and Innovation 2010 (POCI 2010) of the Community Support Board III and by the European Community Fund FEDER. The primary study by Fisher et al was supported by a grant under the Invest to Grow Scheme from the Australian Government Department of Families, Housing, Community Services and Indigenous Affairs. The primary study by Garcia-Esteve et al was supported by grant 7/98 from the Ministerio de Trabajo y Asuntos Sociales, Women’s Institute, Spain. The primary study by Green et al was supported by a grant from the Duke Global Health Institute (453-0751). The primary study by Howard et al was supported by the National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme (grant reference No RP-PG-1210-12002 and RP-DG-1108-10012) and by the South London Clinical Research Network. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. 
The primary study by Kettunen et al was supported with an Annual EVO Financing (special government subsidies from the Ministry of Health and Welfare, Finland) by North Karelia Central Hospital and Päijät-Häme Central Hospital. The primary study by Phillips et al was supported by a scholarship from the National Health and Medical Research Council (NHMRC). The primary study by Roomruangwong et al was supported by the Ratchadaphiseksomphot Endowment Fund 2013 of Chulalongkorn University (CU-56-457-HR). The primary study by Marsay et al was supported by the South African Medical Research Council in terms of the Clinician Researcher PhD Programme. The primary study by Martínez et al was supported by Iniciativa Científica Milenio, Chile, process No IS130005 and by Fondo Nacional de Desarrollo Científico y Tecnológico, Chile, process No 1130230. The primary study by Nakić Radoš et al was supported by the Croatian Ministry of Science, Education, and Sports (134-0000000-2421). The primary study by Navarro et al was supported by grant 13/00 from the Ministry of Work and Social Affairs, Institute of Women, Spain. The primary study by Usuda et al was supported by Grant-in-Aid for Young Scientists (A) from the Japan Society for the Promotion of Science (primary investigator Daisuke Nishi), and by an intramural research grant for neurological and psychiatric disorders from the National Center of Neurology and Psychiatry, Japan. The primary study by Pawlby et al was supported by a Medical Research Council UK Project Grant (number G89292999N). Dr Robertson-Blackmore was supported by a Young Investigator Award from the Brain and Behaviour Research Foundation and NIMH grant K23MH080290. The primary study by Rochat et al was supported by grants from the University of Oxford (HQ5035), the Tuixen Foundation (9940), the Wellcome Trust (082384/Z/07/Z and 071571), and the American Psychological Association. 
Dr Rochat receives salary support from a Wellcome Trust Intermediate Fellowship (211374/Z/18/Z). The primary study by Rowe et al was supported by the diamond Consortium, beyondblue Victorian Centre of Excellence in Depression and Related Disorders. The primary study by Comasco et al was supported by funds from the Swedish Research Council (VR: 521-2013-2339, VR: 523-2014-2342), the Swedish Council for Working Life and Social Research (FAS: 2011-0627), the Marta Lundqvist Foundation (2013, 2014), and the Swedish Society of Medicine (SLS-331991). The primary study by Smith-Nielsen et al was supported by a grant from the charitable foundation Tryg Foundation (grant No 107616). The primary study by Prenoveau et al was supported by The Wellcome Trust (grant No 071571). The primary study by Stewart et al was supported by Professor Francis Creed’s Journal of Psychosomatic Research Editorship fund (BA00457) administered through University of Manchester. The primary study by Su et al was supported by grants from the Department of Health (DOH94F044 and DOH95F022) and the China Medical University and Hospital (CMU94-105, DMR-92-92, and DMR94-46). The primary study by Tandon et al was funded by the Thomas Wilson Sanitarium. The primary study by Tran et al was supported by the Myer Foundation who funded the study under its Beyond Australia scheme. Dr Tran was supported by an early career fellowship from the Australian National Health and Medical Research Council. The primary study by Vega-Dienstmaier et al was supported by Tejada Family Foundation, and Peruvian-American Endowment. The primary study by Yonkers et al was supported by a National Institute of Child Health and Human Development grant (5 R01HD045735). References for all included studies can be found at the end of the supplementary material.

    • Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: support from Canadian Institutes of Health Research for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years with the following exceptions: MTo declares that he has received a grant from Merck Canada, outside the submitted work; SNV declares that she receives royalties from UpToDate, outside the submitted work; CTB declares that she receives royalties for her Postpartum Depression Screening Scale published by Western Psychological Services; PMBo declares that he receives grants and personal fees from Servier, grants from Lundbeck, and personal fees from AstraZeneca, all outside the submitted work; LMH declares that she has received personal fees from NICE Scientific Advice, outside the submitted work; ISP declares that she has served on advisory boards and acted as invited speaker at scientific meetings for MSD, Novo Nordisk, Bayer Health Care, and Lundbeck A/S; KAY declares that she receives royalties from UpToDate, outside the submitted work. All authors declare no other relationships or activities that could appear to have influenced the submitted work. No funder had any role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

    • Ethical approval: As this study involved secondary analysis of deidentified previously collected data, the Research Ethics Committee of the Jewish General Hospital declared that this project did not require research ethics approval. However, for each included dataset, the authors confirmed that the original study received ethics approval and that all patients provided informed consent.

    • Data sharing: Requests to access data should be made to the corresponding author.

    • The manuscript’s guarantor affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.

    • Dissemination to participants and related patient and public communities: There are no plans to disseminate the results of the research to study participants or the relevant patient community. However, an online knowledge translation tool, intended for clinicians (the end users of the EPDS screening tool), is available at depressionscreening100.com/epds. The tool allows clinicians to estimate the expected number of positive screens and true and false screening outcomes based on study results.

    • Provenance and peer review: Not commissioned; externally peer reviewed.

    This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

    References