
Use of routinely collected electronic healthcare data for postlicensure vaccine safety signal detection: a systematic review
  1. Yonatan Moges Mesfin,
  2. Allen Cheng,
  3. Jock Lawrie,
  4. Jim Buttery
  1. School of Population Health and Preventive Medicine, Monash University, Melbourne, Clayton, Victoria, Australia
  1. Correspondence to Yonatan Moges Mesfin; Yonatan.Mesfin{at}


Background Concerns regarding adverse events following immunisation (AEFIs) are a key challenge for public confidence in vaccination. Robust postlicensure vaccine safety monitoring remains critical to detect adverse events, including those not identified in prelicensure studies, and to ensure public safety and public confidence in vaccination. We summarise the literature that examined AEFI signal detection using electronic healthcare data, with regard to the data sources, methodological approaches and statistical analysis techniques used.

Methods We performed a systematic review using the Preferred Reporting Items for Systematic Reviews and Meta-analyses guidelines. Five databases (PubMed/Medline, EMBASE, CINAHL, the Cochrane Library and Web of Science) were searched for studies on AEFIs monitoring published up to 25 September 2017. Studies were appraised for methodological quality, and results were synthesised narratively.

Results We included 47 articles describing AEFI signal detection using electronic healthcare data. All studies involved linked diagnostic healthcare data from emergency department, inpatient and outpatient settings and immunisation records. Statistical analysis methodologies included non-sequential analysis in 33 studies, group sequential analysis in 2 studies and continuous sequential analysis in 12 studies. Partially elapsed risk windows and data accrual lags were the most cited barriers to monitoring AEFIs in near real-time.

Conclusion Routinely collected electronic healthcare data are increasingly used to detect AEFI signals in near real-time. Further research is required to check the utility of non-coded complaints and encounters, such as telephone medical helpline calls, to enhance AEFI signal detection.

Trial registration number CRD42017072741

  • electronic healthcare records
  • post-licensure safety surveillance
  • adverse event following immunization
  • systematic review

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


Key questions

What is already known?

  • Adverse event(s) following immunisation (AEFI) signal detection has primarily relied on passive surveillance reporting.

What are the new findings?

  • AEFI signal monitoring using population-based electronic health records (EHRs) is increasing, but has been primarily limited to diagnostic data from hospital settings.

  • The continuous sequential (rapid cycle) analysis method allows AEFI signal monitoring in near real-time.

  • Data delays (data accrual lags) are the key challenge to performing near real-time AEFI monitoring using EHRs.

What do the new findings imply?

  • A complementary and efficient AEFI signal monitoring system is feasible using EHRs.

  • Further research is required to evaluate the utility of syndromic data/proxy measures to enhance the timeliness of monitoring AEFIs.


Vaccination is one of the most effective public health interventions. Current immunisation programmes provide protection against up to 26 diseases and prevent an estimated 2–3 million deaths every year.1 2 It is estimated that 1.5 million more deaths could be prevented by further increasing coverage of existing vaccines.3 However, this remarkable success has been challenged by vaccine safety concerns and increasing vaccine hesitancy, largely due to fear of adverse events following immunisation (AEFIs). Notably, public attention to AEFIs has increased following the sharp reduction in the incidence of vaccine-preventable diseases. This can result in loss of confidence in vaccination, a resultant drop in vaccine coverage and, eventually, the re-emergence of previously controlled diseases (figure 1).4 Hence, timely detection of potentially causally related adverse events (AEs), and more rapid refutation of spurious claims regarding AEs, using real-world data is critical to maintain community and provider confidence in vaccine programmes. Nevertheless, a recent analysis of global AEFI reporting found that more than 36% of WHO member countries do not have a functional postlicensure safety monitoring system for vaccines.5

Figure 1

Potential stages in the evolution of an immunisation programme and vaccine safety. Diagram adapted from Chen et al. The Vaccine Adverse Event Reporting System (VAERS). Vaccine 1994:12(6):542–50.

Postlicensure AEFI monitoring is often classified into three stages: signal detection, signal refinement and signal confirmation. A vaccine safety signal is defined as ‘reported information on a possible causal relationship between an adverse event and a vaccine, the relationship being unknown or incompletely documented previously’.6 Generally, AEFI signal detection has been undertaken using passive or active surveillance systems. Passive surveillance systems, the prevailing AEFI monitoring systems, monitor reports of AEs that are spontaneously submitted by healthcare providers, vaccinated individuals/their caregivers or others. Their wide population coverage allows for detection of new and unanticipated AEs, but they are limited by under-reporting and by imprecise risk estimates due to the lack of appropriate denominator data.7 According to the 2015 Global Vaccine Safety Initiative meeting report, low passive AEFI reporting rates are a significant barrier to the timely detection of vaccine safety signals.8 In contrast, active surveillance of AEFIs involves proactively seeking information from healthcare providers, vaccinated individuals/their caregivers, or related datasets using well-designed study protocols. These surveillance systems provide more detailed and less biased information, together with appropriate denominators. However, active surveillance systems are resource intensive and take substantial time to achieve the sample size required to study rare AEs. Hence, their use in many settings is largely limited to investigating signals detected from passive surveillance systems, literature reviews or possible prelicensure trial safety questions.7 9 10

Encouragingly, in recent years, new studies have shown that routinely collected electronic health records (EHRs) can be used as an alternative data source to monitor for AEFI signals in near real-time.11 12 For example, in the USA, newly marketed vaccines are monitored weekly for potential AEFIs using the Vaccine Safety Datalink (VSD), a collaboration between the US Centers for Disease Control and Prevention and eight healthcare organisations. In the VSD, patient encounters and diagnoses made in emergency departments, outpatient clinics and hospitals are linked with previous vaccinations via patient-specific study identification numbers. Though the VSD is regularly used to investigate known AEFI signals identified from passive surveillance, published studies also show that the VSD and other EHR-based detection systems are suitable for rapid detection of AEFI signals.13–15

Considering the increasing availability of EHRs and the necessity of further improving the capacity of vaccine safety monitoring, particularly in low-income and middle-income countries, EHRs can offer an alternative data source to establish complementary active AEFI surveillance systems. By systematically summarising this literature, we intend to provide valuable information for countries considering establishing an AEFI signal detection system based on EHRs. Therefore, we aimed to: (1) describe the features of postlicensure vaccine safety studies employing EHRs primarily for safety signal detection and (2) catalogue the nature of the data sources, methodological approaches and analysis techniques applied.


Search strategy

A systematic review was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines,16 as provided in online supplementary file 1. The protocol was registered at the international prospective register of systematic reviews (registration number CRD42017072741). We searched OVID Medline (1946 to September week 3 2017), OVID Embase (1974 to 2017 September 10), the Cochrane Library, Scopus and Web of Science. Comprehensive search terms for all databases were developed in consultation with a medical librarian to identify all potentially relevant studies. A combination of keywords and Medical Subject Headings (MeSH) were used in each database with appropriate adjustment. Final searches were performed on 25 September 2017. An example of the search strategy used in Ovid MEDLINE is shown in online supplementary file 1. In addition, bibliographies of relevant studies, conference papers/proceedings and grey literature databases, such as and, were searched to identify further important and unpublished studies.


Studies selection criteria and screening

We included studies primarily focussing on AEFI signal detection using EHRs. Studies were included regardless of vaccine type, population group studied, study setting and methodology used. However, studies based on passive pharmacovigilance data or administrative (claim) data; studies conducted solely to test or verify the previously identified signals and feasibility studies or studies conducted to evaluate methodologies were excluded from the review. We also excluded non-English records and conference abstracts.

Search results were downloaded and managed in EndNote X8. Articles were screened in three stages (titles alone, abstracts and then full-text review) based on the PRISMA flow of information (figure 2). At the initial stage, titles and abstracts were screened to remove duplicate records and studies clearly outside the scope of the review. Then, two reviewers conducted a full-text review to assess eligibility based on the inclusion criteria. Study screening stages and the reasons for article exclusion during full-text review are described in figure 2.

Figure 2

Flow diagram showing the stages of study selection and screening. Articles may have been excluded for more than one reason.

Quality assessment and data extraction

We used a checklist adapted from the US Food and Drug Administration guidance (Best Practices for Conducting and Reporting Pharmaco-epidemiologic Safety Studies Using EHR).17 Many of the critical appraisal tools widely used to appraise observational studies, such as the Newcastle-Ottawa scale and Strengthening the Reporting of Observational Studies in Epidemiology (STROBE), are not suitable for evaluating pharmaco-epidemiological studies and public health surveillance, as these differ considerably from standard epidemiological studies. The lead author (YMM) assessed the risk of bias of all the included studies, and a second independent reviewer (TK) evaluated a random 25% of the studies for verification. As no substantial risk of bias was identified, we considered all appraised studies for the final review. The methodological quality and risk of bias assessment criteria were:

  • Well defined research questions.

  • Sample representativeness.

  • Clear inclusion and exclusion criteria.

  • Appropriateness of study design and comparison groups.

  • Follow-up (risk interval) long enough for the events to occur.

  • Appropriateness of data integration method, when relevant.

  • Adjustment of confounders.

  • Appropriateness of statistical analysis methods.

  • Used objective criteria to measure outcomes.

The lead author consistently extracted the required data using a pretested data abstraction template. The following information was extracted across the included studies:

  • Study author.

  • Publication year.

  • Study setting and period.

  • Data source(s) and nature of the data (diagnostic vs prediagnostic).

  • Study design(s) employed.

  • Studied population.

  • Vaccine(s) and AE(s) studied.

  • Statistical analysis approaches and signal detection method used.

  • Frequency of assessment.

  • Method(s) of controlling confounders reported and challenges reported.

  • Main findings (signal(s) identified or not).

Data analysis

Key features of the studies are described quantitatively. Results from the selected studies are synthesised in a narrative analysis. The structure of the detailed review includes: vaccines monitored; AEs studied; study design(s) used; data analysis approach and signal detection method employed.

Patient and public involvement statement

No patient data were considered in this study.


Studies identified and characteristics

After removal of duplicate articles, we screened the titles and abstracts of 606 articles and excluded articles clearly out of the scope of this review. Then, we screened the remaining 235 full-text articles according to the exclusion criteria (figure 2). Studies could be excluded for more than one reason. Forty-seven articles, conducted between 2002 and 2017, were included in the final synthesis.18–64 No studies were excluded based on quality or bias.

Almost all studies included in this review were conducted in the USA (n=45).18–25 27–33 35–65 Two additional studies were conducted in the UK26 and Taiwan.34 A considerable number of studies (n=13, 28%) assessed the safety of vaccines administered to high-risk groups (pregnant women or elderly subjects). Fourteen (30%) studies assessed the AEFIs in near real-time (table 1).

Table 1

Summary characteristics of selected studies

Vaccines studied

Multiple types of vaccines, including live, inactivated, monovalent and combined vaccines, were monitored after licensure for potential AEFIs. Seasonal influenza vaccines (trivalent inactivated influenza vaccines (TIIV), live attenuated influenza vaccines, monovalent influenza vaccines and live attenuated monovalent influenza vaccines) were most frequently studied (n=17), followed by combined diphtheria-tetanus toxoid-acellular pertussis (Tdap) vaccines (n=5) (figure 3).

Figure 3

Type of vaccines studied by the selected studies.

AEFIs studied and data source

Most of the reviewed studies (n=35) studied preidentified AEs using a fixed postvaccination risk interval. AEs were selected based on safety concerns from passive surveillance reports and prelicensure clinical trials. Frequently studied AEs were Guillain-Barré syndrome, febrile convulsions, seizures, anaphylaxis, meningitis/encephalitis and local reactions. Potential maternal and infant outcomes (AEFIs), such as pre-eclampsia/eclampsia, maternal death, small for gestational age, preterm birth, stillbirth and neonatal death, were also evaluated. Studied AEFIs were mainly identified using International Classification of Diseases (ICD) Clinical Modification codes as well as relevant ICD-9 or ICD-10 codes from electronic records (outpatient, inpatient and emergency department settings). In some studies, patients’ charts/medical records were manually reviewed to verify the AEs.

In this review, 14 statistically elevated vaccine-AE pairs (signals) were detected, and 6 were confirmed. The confirmed signals were measles, mumps, rubella and varicella vaccine and seizure/febrile convulsion,38 43 2010–2011 TIIV and febrile seizure,57 monovalent rotavirus vaccine and intussusception,61 2014–2015 TIIV and febrile seizures48 and Tdap vaccine and chorioamnionitis.41

Study designs employed

Self-controlled design was the most frequently used study design (n=22),18–21 25 27 28 30–34 36 38 39 44 46–48 53 57–59 62 63 followed by cohort design with historical comparison (also called observed vs expected analysis) (n=20).18 22–26 29 34 38 39 43 45 47–49 57 60 61 63 64 Self-controlled design can be self-controlled risk interval (SCRI) or self-controlled case series (SCCS). Cohort design with concurrent/parallel comparison group,19 20 29 40–42 50–52 mostly to examine vaccines administered to pregnant women, and case-crossover study designs were also employed.28 32 Of note, 18 studies (38.3%) employed more than one study design; of these, SCRI and current versus historical designs were often used together.25 34 38 39 47 48 57 63

Statistical analysis and signal detection method

Two broad data analytic approaches, non-sequential analysis and sequential analysis, were employed to identify elevated risk of AEs associated with a given vaccine. In studies that employ a non-sequential analysis approach (n=33), statistical tests are performed after all the data are collected/accumulated. Detailed description of these studies and their analytic approaches are provided in online supplementary file 2. The sequential analysis approach allows repeated examination of data to check for AEFI increased occurrence. This was implemented in two different ways in the included studies: (i) as group sequential analysis (n=2), which involved a periodic statistical test and limited number of statistical tests over time and (ii) as continuous sequential analyses (n=12), also called ‘rapid cycle analysis’, which involved a weekly statistical test until the end of the study period (table 2).


Table 2

Included studies that implemented near real-time vaccine safety monitoring methods (sequential analysis)

The choice of specific statistical tests was guided by the data analysis approach used. Standard analytic tests, such as logistic and Cox regression, were used to examine the data at the end of the study period (end-of-study analysis). A sequential hypothesis test statistic, the sequential probability ratio test (SPRT), was used to examine data for an elevated risk of AEFI continually over time. In particular, maximised sequential probability ratio test (MaxSPRT) was the most frequently applied sequential hypothesis test statistic.22 24 29 34 39 43 47 48 57 61 62 64 It has different versions: Poisson MaxSPRT, Binomial MaxSPRT and Conditional MaxSPRT (table 2). Further, supplementary analyses were performed to verify the detected signals and instances of elevated risks. These included temporal scan statistics, to evaluate clustering of events after vaccination, and case-centred regression and logistic regression.29 39 43 47–49 60 61 64
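As an illustration of how a weekly continuous sequential test of this kind might be run, the following minimal sketch computes the Poisson MaxSPRT log-likelihood ratio from cumulative observed and expected counts and reports the first week in which it reaches a critical value. The function names and the `critical_value` argument are hypothetical; in practice the critical value is derived for a chosen significance level and surveillance length, and that derivation is not attempted here.

```python
import math
from typing import Optional, Sequence


def poisson_maxsprt_llr(observed: int, expected: float) -> float:
    """Poisson MaxSPRT log-likelihood ratio for H0: RR = 1 vs HA: RR > 1.

    The statistic is zero whenever the observed count does not exceed
    the count expected under the null hypothesis.
    """
    if expected <= 0:
        raise ValueError("expected count must be positive")
    if observed <= expected:
        return 0.0
    return (expected - observed) + observed * math.log(observed / expected)


def first_signal_week(
    weekly_observed: Sequence[int],
    weekly_expected: Sequence[float],
    critical_value: float,
) -> Optional[int]:
    """Return the first week (1-based) in which the cumulative LLR reaches
    the critical value, or None if no signal occurs during surveillance.

    critical_value is an assumption supplied by the caller.
    """
    cumulative_observed = 0
    cumulative_expected = 0.0
    for week, (obs, exp) in enumerate(zip(weekly_observed, weekly_expected), start=1):
        cumulative_observed += obs
        cumulative_expected += exp
        if poisson_maxsprt_llr(cumulative_observed, cumulative_expected) >= critical_value:
            return week
    return None
```

Because the statistic is evaluated on cumulative counts, a signal can be declared as soon as the accumulated evidence is strong enough, rather than only at the end of the study period.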

Confounder adjustment and potential challenges

Many different potential confounders were measured, including age, gender, chronic conditions, site, seasonality, trend, concomitant vaccines and delay in the arrival of patient data. Generally, studies adjusted for confounding variables in three ways: data restriction, matching and stratification (alone or in combination). The strategies chosen were often design-based and included the following: (i) using a matched control design to adjust for baseline confounders and seasonal trends; (ii) using a self-controlled design, which automatically addresses time-invariant confounders and (iii) adjusting the expected rate calculated from historical data. Notably, at the analysis stage, MaxSPRT inherently adjusts for repeated testing. In this review, the most cited challenges, particularly in the case of continuous sequential analysis, were uncertainty in estimating background rates, outcome misclassification, partially elapsed risk windows and late-arriving data (data accrual lags).


Routinely collected EHRs are increasingly used for AEFI signal detection, in addition to testing hypotheses based on known signals. Evidence from this review suggests that electronic healthcare data have significant potential for establishing near real-time AEFI surveillance systems. All the included studies used coded diagnostic medical data to obtain information about the studied AEs. Further, non-pharmacovigilance studies have suggested that alternative non-coded medical information, such as telephone triage data and ambulance data, has potential for near real-time syndromic surveillance and rapid detection of outbreak signals.66 67

A near real-time surveillance system involves continuous checking (rapid cycle analysis (RCA)) of the EHRs for an elevated occurrence of AEs as new data are added over the study period. It was first used to evaluate the safety of meningococcal conjugate vaccine using electronic healthcare data from the VSD in the USA,14 though Davis et al established its feasibility by replicating the previously recognised rotavirus-intussusception signal.68 Since then, we identified 12 studies that examined AEFI signals using the RCA method.14 22 24 29 39 43 47 48 57 61 62 64 The RCA method has also been used with data sources other than EHRs. For example, in the UK, H1N1 vaccine was monitored using passive surveillance data,69 and in Australia seasonal influenza vaccines have been monitored since 2015 based on data collected directly from consumers using SMS messaging and email (AusVaxSafety).70

Near real-time AEFI surveillance systems use a sequential analysis approach, primarily MaxSPRT, to continuously evaluate data for signals while adjusting for bias due to multiple testing. MaxSPRT is a refinement of the classical SPRT, which uses a two-sided alternative hypothesis and a predefined relative risk (RR) value, usually other than 1. MaxSPRT uses a one-sided composite alternative hypothesis, usually defining the RR as >1, to declare a statistically significant risk.71 The key advantage of MaxSPRT over the classical SPRT is that it helps to minimise the risk of late detection of AEs due to an incorrect choice of RR, which makes it suitable for more frequent data monitoring.14 Indications, advantages and weaknesses of both the classical SPRT and MaxSPRT, including the three variants of MaxSPRT, are provided in table 3.24 47
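For the Poisson variant, the composite alternative can be written out explicitly. The notation below is introduced here for illustration only: c_t is the cumulative number of observed AEs at analysis t, mu_t is the cumulative number expected under the null hypothesis, and CV is a critical value chosen in advance. Maximising the Poisson likelihood ratio over RR gives the estimate RR = c_t / mu_t, which yields the closed form shown.

```latex
\begin{align*}
H_0 &: RR = 1, \qquad H_A : RR > 1, \\
LLR_t &= \max_{RR \ge 1} \ln \frac{P(c_t \mid RR)}{P(c_t \mid RR = 1)}
       = \max_{RR \ge 1} \bigl[ (1 - RR)\,\mu_t + c_t \ln RR \bigr] \\
      &= \begin{cases}
           (\mu_t - c_t) + c_t \ln \dfrac{c_t}{\mu_t}, & c_t > \mu_t, \\[6pt]
           0, & c_t \le \mu_t,
         \end{cases} \\
\text{signal at } t &\iff LLR_t \ge CV .
\end{align*}
```

Because the maximisation is over all RR at or above 1, no single alternative RR has to be specified in advance, which is the property that distinguishes MaxSPRT from the classical SPRT.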

Table 3

Sequential statistical approaches for postlicensure vaccine safety surveillance (description, indication and challenges)

As vaccines are often recommended for all persons in a given age group, traditional epidemiological cohort and case-control designs are usually not suitable for studying vaccine AEs after licensure. The main reasons include an inadequate number of unvaccinated individuals to form a comparison group, concern regarding the comparability of the vaccinated and unvaccinated groups (selection bias), insufficient power and poor timeliness.72 Rather, the self-controlled design (SCRI and SCCS) and the cohort design with a historical comparison are the preferred design choices in postlicensure vaccine safety studies (table 4). In a self-controlled design, comparisons are made within individuals across two different periods: the incidence of AEFI in a prespecified postvaccination risk period is compared with that in a control (unexposed) period.73 Studies showed that including a prevaccination control period is essential to facilitate timely data analysis for vaccines administered over a short period, most notably seasonal influenza vaccine. However, if there are clinical confounders that are a contraindication for vaccination (eg, allergic reaction) or indications for vaccination (eg, seizure disorder), a prevaccination control period is not recommended.39 47 48 57 74 75
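The within-person comparison can be illustrated with a minimal sketch. It assumes, for simplicity, that every case contributes a risk window and a control window of the same fixed lengths, so that under the null hypothesis each case falls in the risk window with probability risk_days / (risk_days + control_days). The function names are hypothetical, and the exact binomial test shown is one simple choice rather than the method of any particular included study.

```python
import math


def scri_relative_incidence(
    risk_cases: int, control_cases: int, risk_days: int, control_days: int
) -> float:
    """Ratio of the event rate in the risk window to the rate in the control window."""
    if risk_days <= 0 or control_days <= 0:
        raise ValueError("window lengths must be positive")
    if control_cases == 0:
        raise ValueError("relative incidence is undefined with no control-window cases")
    return (risk_cases / risk_days) / (control_cases / control_days)


def scri_one_sided_p_value(
    risk_cases: int, control_cases: int, risk_days: int, control_days: int
) -> float:
    """Exact binomial probability of at least risk_cases events in the risk
    window, given the total number of cases, if vaccination had no effect."""
    total = risk_cases + control_cases
    p_null = risk_days / (risk_days + control_days)
    return sum(
        math.comb(total, k) * p_null**k * (1 - p_null) ** (total - k)
        for k in range(risk_cases, total + 1)
    )
```

Conditioning on each person's own cases is what removes time-invariant confounders from the comparison, since characteristics that do not change between the two windows cancel out.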

Table 4

Commonly used study designs in postlicensure vaccine safety monitoring (study population, comparison group, indication, strength and weakness)

A cohort study design with a historical comparison is frequently used for detecting AEFI signals. This design compares the observed incidence of AEFI in the risk period after administration of the studied vaccine(s) against the expected incidence of AEFI projected from historical data.22 It helps to improve the timeliness of AEFI signal detection because only data for the risk window are collected, rather than waiting for data for the comparison window.48 However, studies showed that accurate baseline risk estimation is very challenging, and bias may be introduced if the historical population is considerably different from the studied population. Nevertheless, this problem can be minimised through simultaneous use of the self-controlled design, as the two designs have complementary strengths (table 4).14 48
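The observed versus expected comparison reduces to simple arithmetic, sketched below under the assumption that a single crude historical rate applies to every vaccinee and that each dose contributes a full risk window. Real analyses typically stratify the background rate by factors such as age, site and season; the function and parameter names here are hypothetical.

```python
def expected_cases(
    historical_cases: int,
    historical_person_days: float,
    doses: int,
    risk_window_days: int,
) -> float:
    """Expected AE count: historical daily rate times person-days at risk."""
    if historical_person_days <= 0:
        raise ValueError("historical person-time must be positive")
    daily_rate = historical_cases / historical_person_days
    return daily_rate * doses * risk_window_days


def observed_to_expected_ratio(observed: int, expected: float) -> float:
    """Crude relative risk estimate from observed and expected counts."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected
```

For instance, 50 historical cases over 1,000,000 person-days give a daily rate of 0.00005, so 20,000 doses followed for a 42-day risk window correspond to 42 expected cases, and 63 observed cases would give a ratio of 1.5.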

The essential requirement for conducting near real-time AEFI surveillance based on EHRs is the availability of timely data. Both data accrual lag and partially elapsed risk windows (the risk window might not have fully elapsed for some AEs at the time of each analysis) can hinder RCA.74 76 Data accrual lag in EHRs can occur for several reasons, and the level of delay may vary depending on the outcomes studied. A study from the UK showed that 30 days or more can be required for AEFI diagnoses to be completely recorded at general practice level.77 Two studies included in this review39 48 and methodological evaluation studies suggested that various design-based measures can be taken to adjust for partially elapsed risk windows and data accrual lags. These include: (i) calculating the expected counts of AEFIs in proportion to the elapsed length of the risk window; (ii) restricting comparison periods in proportion to the elapsed risk period or (iii) ignoring AEFIs occurring in later weeks of the risk window if the matching weeks in the control period have not elapsed.48 71 78–80
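Measure (i) can be sketched as follows: at each analysis, every vaccinee contributes to the expected count only the part of the risk window that has already elapsed. This is a minimal illustration and not the procedure of any included study. The default risk window of days 1 to 42, the `lag_days` allowance for data accrual and all names are assumptions made for the example.

```python
from datetime import date, timedelta
from typing import Iterable


def elapsed_risk_days(
    vaccination_date: date,
    analysis_date: date,
    window_start: int = 1,
    window_end: int = 42,
) -> int:
    """Number of days of the risk window (window_start to window_end days
    after vaccination, inclusive) that have elapsed by the analysis date."""
    days_since_vaccination = (analysis_date - vaccination_date).days
    last_observed_day = min(days_since_vaccination, window_end)
    return max(0, last_observed_day - window_start + 1)


def adjusted_expected_count(
    vaccination_dates: Iterable[date],
    analysis_date: date,
    daily_rate: float,
    window_start: int = 1,
    window_end: int = 42,
    lag_days: int = 0,
) -> float:
    """Expected AE count scaled to the elapsed part of each risk window.

    lag_days moves the effective analysis date back so that only the
    period for which data are assumed complete is counted.
    """
    effective_date = analysis_date - timedelta(days=lag_days)
    total_days = sum(
        elapsed_risk_days(v, effective_date, window_start, window_end)
        for v in vaccination_dates
    )
    return daily_rate * total_days
```

Scaling the expected count in this way keeps the comparison fair early in surveillance, when many recently vaccinated individuals have not yet completed their risk windows.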


The utility of routinely collected EHRs for AEFI monitoring has been demonstrated, with most published experience drawn from the US literature. In addition, advances in statistical analysis techniques and RCA provide significant potential to detect AEFI signals in near real-time.

To date, EHR-based AEFI monitoring has been limited to diagnostic medical information. Potential incorporation of other electronic health information, including non-coded complaints and encounters, offers further opportunities to improve real-time AEFI surveillance systems, helping to maintain safe immunisation programmes and maximise confidence in those programmes.


The Australian Government, through the International Postgraduate Research Scholarship, funds YMM's PhD tuition fees. The authors are grateful to Ms Lorena Romero, a Senior Medical Librarian at the Ian Potter Library, Alfred Hospital, Melbourne, for providing feedback on the pilot search strategy for the review. We thank Mr Kelmu T Kibret for his help in evaluating the methodological quality of the studies considered in this review.




  • Handling editor Seye Abimbola

  • Contributors YMM conceived the original research idea with guidance from JB. All authors contributed to the design of the study. YMM searched and screened the studies, extracted the data and wrote the initial drafts of the paper. JB, AC and JL revised the article critically. All authors contributed to and approved the final manuscript.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement The authors are happy to provide further data upon request.