Variational data assimilation with epidemic models

https://doi.org/10.1016/j.jtbi.2009.02.017

Abstract

Mathematical modelling is playing an increasing role in developing an understanding of the dynamics of communicable disease and assisting the construction and implementation of intervention strategies. The threat of novel emergent pathogens in human and animal hosts implies the requirement for methods that can robustly estimate epidemiological parameters and provide forecasts. Here, a technique called variational data assimilation is introduced as a means of optimally melding dynamic epidemic models with epidemiological observations and data to provide forecasts and parameter estimates. Using data from a simulated epidemic process the method is used to estimate the start time of an epidemic, to provide a forecast of future epidemic behaviour and estimate the basic reproductive ratio. A feature of the method is that it uses a basic continuous-time SIR model, which is often the first point of departure for epidemiological modelling during the early stages of an outbreak. The method is illustrated by application to data gathered during an outbreak of influenza in a school environment.

Introduction

Mathematical models of infectious disease epidemics are important tools for assessing the impact of communicable disease in both human and animal (wild and domesticated) populations (Anderson and May, 1992). At the most basic level they are a concise means of quantitatively representing the essential epidemiological and biological factors that relate to a particular disease pathogen in a given population. More sophisticated implementations can be used to assist in the design or evaluation of actual or prospective intervention programmes, as has been done for childhood disease immunisation (Grenfell et al., 2001; Jansen et al., 2003), for analysing the spread and containment of BSE and foot-and-mouth disease in the United Kingdom (Anderson et al., 1996; Keeling et al., 2003), for investigating the impact of HIV (Anderson and May, 1992), and for assessing the role of badgers in the on-going bovine TB epidemic in south-west England (Donnelly et al., 2006).

In common with mathematical modelling in other disciplines there is a research imperative to develop ever more realistic epidemic models and simulations that include higher-fidelity representations of underlying biological or population processes. However, more realistic models tend to be more complex and they are frequently populated by a proliferation of parameters that need to be robustly estimated (Riley et al., 2003; Ferguson et al., 2005; Longini et al., 2005). In practice, the degree of model sophistication that is used in a given situation often reflects a judicious balance between the questions the model is required to address and the ability to reliably estimate parameters.

When an unknown or poorly understood pathogen suddenly emerges in a population the challenges of epidemic modelling are exacerbated because much of the quantitative work has to be performed on a fast time-scale, usually with limited or poor quality data. There is a risk that the epidemiological dynamics might outpace the modelling response if too sophisticated an approach is adopted. One of the immediate tasks faced by policy advisors and modellers is determining the epidemic curve; specifically, the initial requirement is to estimate model parameters, produce a forecast for the duration of the epidemic and provide an estimate of the expected proportion of the population to be infected by the disease. In these circumstances the simplest epidemic models are generally used because it is pointless to compound epidemiological uncertainties with those generated by inappropriate use of over-complex and poorly parameterised models. Fitting epidemic models to data is a topic of on-going research. Anderson and May (1992) and Daley and Gani (1999) give accounts of commonly used methods. Wearing et al. (2005), Ferrari et al. (2005) and Fraser (2007) provide a discussion of many of the challenges faced when attempting to estimate the basic reproductive ratio from epidemiological data.

Motivated by the requirement for robust methods for parameter estimation and the need to derive full predictive benefit from the most basic of epidemic models in the early stages of an outbreak, we present a new method for the determination of epidemiological parameters and for the subsequent production of epidemic forecasts. This method, known as variational data assimilation (VDA), has hitherto been used in weather and climate modelling (Bouttier and Courtier, 1999; Huang and Yang, 1996). A version has also been developed for predator-prey systems (Lawson et al., 1995). VDA was developed to optimally combine a dynamical model with observations of the system to produce accurate forecasts, and it can be readily adapted to epidemiological applications.
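For orientation, variational assimilation is usually stated as the minimisation of a cost function over the initial model state. The form below is the standard strong-constraint formulation from the meteorological literature, not a transcription of this paper's equations. All symbols are notation assumed here: x_0 is the initial state, x_b a prior (background) estimate with error covariance B, y_k the observation at time t_k with error covariance R_k (unrelated to the basic reproductive ratio R0), H_k the map from model state to observed quantity, and M the forward model.

```latex
J(\mathbf{x}_0) =
  \tfrac{1}{2}\,(\mathbf{x}_0 - \mathbf{x}_b)^{\mathsf{T}}\,
  \mathbf{B}^{-1}\,(\mathbf{x}_0 - \mathbf{x}_b)
  + \tfrac{1}{2}\sum_{k=0}^{N}
  \bigl(H_k(\mathbf{x}_k) - \mathbf{y}_k\bigr)^{\mathsf{T}}\,
  \mathbf{R}_k^{-1}\,
  \bigl(H_k(\mathbf{x}_k) - \mathbf{y}_k\bigr),
\qquad
\mathbf{x}_k = M_{0 \to k}(\mathbf{x}_0).
```

The first term penalises departure from the prior estimate and the second penalises mismatch between the model trajectory and the observations. The minimising x_0 is then propagated forward by the model to give the forecast.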

In this paper we introduce the concept of data assimilation and show how it can be adapted and applied to epidemic modelling and infectious disease management. In Section 2 we review the VDA procedure, and in Section 3 it is applied to a continuous-time SIR epidemic model. In Section 4 we introduce some simulated data (with errors) for a basic underlying epidemic process and show how the assimilation can be carried out using a model and data. Section 5 illustrates the use of assimilation to estimate the basic reproductive ratio R0 of the epidemic that gave rise to the data. In Section 6 the method is applied to an outbreak of influenza in a school setting.

Section snippets

Variational data assimilation

Exploiting dynamical models that are constrained by observations or measurements to produce a prediction of future system behaviour is an issue that is of on-going interest to weather and climate forecasters. The challenges faced when attempting to predict the future behaviour of an epidemic and when forecasting the weather have much in common. Observations and data (with their associated uncertainties) need to be combined with (often non-linear) dynamical models in order to produce an estimate

A basic epidemic model

The foundation of the majority of epidemic models is the susceptible–infectious–recovered (SIR) compartmental model (Bailey, 1957; Anderson and May, 1992). Despite its simplicity this basic formulation, cast in either deterministic or stochastic form, has provided a wealth of insight into the dynamics of many different transmissible diseases in a variety of population types. The structural simplicity and ease of parameterisation make the continuous-time SIR model the first point of departure

Assimilation using model data

One of the principal motivations for modelling epidemic outbreaks is to use simple models to provide insight into the epidemiological characteristics of the outbreak, and this can be particularly challenging if the outbreak is due to a novel or poorly understood infectious pathogen. In this section we will show how the VDA method can be used to meld observations with a dynamic model to provide estimates of initial conditions and then to provide predictions of future epidemic behaviour. To

Parameter estimation

In the development and application of VDA to an epidemiological context we have heretofore assumed that all the parameters of the forward model are known. In the SIR forward model this amounts to knowing the basic reproductive ratio, R0, of the pathogen and the average period of infectiousness, σ. Often these quantities are known or can be estimated. This is particularly true when the outbreak is triggered by a known pathogen, such as measles or chickenpox in humans (see, for example, Fraser, 2007).
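To make the roles of R0 and σ concrete, the sketch below integrates a continuous-time SIR model written in population proportions and recovers R0 from noisy synthetic prevalence data by minimising a sum-of-squares misfit. It is a simplified illustration and not the paper's VDA procedure: there is no background term and no adjoint model, and σ and the initial infectious proportion are treated as known. The parameterisation (transmission rate R0/σ, recovery rate 1/σ), all function names, and the values R0 = 3, σ = 2 days and i0 = 0.001 are assumptions chosen for illustration.

```python
import random


def sir_step(s, i, r0, sigma, dt):
    """Advance the susceptible and infectious proportions by one RK4 step."""
    beta = r0 / sigma    # transmission rate implied by R0 and sigma (assumed form)
    gamma = 1.0 / sigma  # recovery rate, sigma being the mean infectious period

    def rhs(s, i):
        new_infections = beta * s * i
        return -new_infections, new_infections - gamma * i

    k1 = rhs(s, i)
    k2 = rhs(s + 0.5 * dt * k1[0], i + 0.5 * dt * k1[1])
    k3 = rhs(s + 0.5 * dt * k2[0], i + 0.5 * dt * k2[1])
    k4 = rhs(s + dt * k3[0], i + dt * k3[1])
    s += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    i += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return s, i


def prevalence(r0, sigma, i0, days, steps_per_day=20):
    """Infectious proportion at the end of each day 1..days."""
    s, i = 1.0 - i0, i0
    dt = 1.0 / steps_per_day
    series = []
    for _ in range(days):
        for _ in range(steps_per_day):
            s, i = sir_step(s, i, r0, sigma, dt)
        series.append(i)
    return series


def misfit(r0, obs, sigma, i0):
    """Half the sum of squared differences between model and observations."""
    model = prevalence(r0, sigma, i0, len(obs))
    return 0.5 * sum((m - y) ** 2 for m, y in zip(model, obs))


def golden_min(f, a, b, tol=1e-5):
    """Golden-section search; assumes f has a single minimum on [a, b]."""
    g = (5 ** 0.5 - 1) / 2
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def estimate_r0(obs, sigma, i0, lo=1.0, hi=10.0, step=0.1):
    """Coarse grid scan over R0, then golden-section refinement near the best point."""
    def cost(r0):
        return misfit(r0, obs, sigma, i0)

    n = int(round((hi - lo) / step))
    best = min((lo + step * k for k in range(n + 1)), key=cost)
    return golden_min(cost, max(lo, best - step), min(hi, best + step))


# Synthetic experiment: simulate a "true" epidemic, add noise, re-estimate R0.
rng = random.Random(0)
TRUE_R0, SIGMA, I0, DAYS = 3.0, 2.0, 0.001, 14  # illustrative values only
clean = prevalence(TRUE_R0, SIGMA, I0, DAYS)
noisy = [max(0.0, y + rng.gauss(0.0, 0.005)) for y in clean]
r0_hat = estimate_r0(noisy, SIGMA, I0)
print(f"estimated R0 = {r0_hat:.2f} (value used to generate the data: {TRUE_R0})")
```

The grid scan is used before the local refinement because the misfit is not guaranteed to have a single minimum over the whole range of R0. A fuller treatment would also estimate the initial conditions and weight the misfit by the observation error variance.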

Application to an outbreak of influenza

In order to illustrate how assimilation might be used in a real application we apply the method to data recorded during an outbreak of influenza in a boarding school (Anonymous, 1978). Following initiation of the epidemic, disease prevalence (as reflected by the number of cases confined to hospital) was recorded over a two-week period in a closed population of 763 individuals. Here we will use data assimilation with an SIR model to estimate the basic reproductive ratio. This analysis is not

Conclusions

Some of the most challenging epidemiological modelling applications take place in real-time with a need to meld data and simple epidemic model structures to provide forecasts and parameter estimations. When intervention measures are applied during an outbreak the underlying dynamics of the epidemic inevitably change. Behavioural changes that are a response to an epidemic in the population can also generate significant dynamical changes. As a consequence, only the limited amount of relatively

Acknowledgements

CJR is supported by the Research Councils of the United Kingdom and Imperial College. TDH was supported by the European Community (SARSTRANS, Contract SP22-CT-2004-511066). The authors thank two referees for useful comments that significantly improved the presentation of the manuscript.

References (23)

  • M.J. Ferrari et al. Estimation and inference of R0 of an infectious pathogen by a removal method. Math. Biosci. (2005)
  • L.M. Lawson et al. A data assimilation technique applied to a predator–prey model. Bull. Math. Biol. (1995)
  • R.M. Anderson et al. Infectious Diseases of Humans: Dynamics and Control. (1992)
  • R.M. Anderson et al. Transmission dynamics and epidemiology of BSE in British cattle. Nature (1996)
  • R.M. Anderson et al. Epidemiology, transmission dynamics and control of SARS: the 2002–2003 epidemic. Philos. Trans. R. Soc. B (2004)
  • Anonymous, 1978. Influenza in a boarding school. Brit. Med. J., 4th...
  • N.J.T. Bailey. The Mathematical Theory of Epidemics. (1957)
  • Bouttier, F., Courtier, F., 1999. Data assimilation concepts and methods. European Centre for Medium Range Weather...
  • R. Daley. Atmospheric Data Analysis. (1991)
  • D.J. Daley et al. Epidemic Modelling: An Introduction. (1999)
  • C.A. Donnelly et al. Positive and negative effects of widespread badger culling on tuberculosis in cattle. Nature (2006)