Variational data assimilation with epidemic models
Introduction
Mathematical models of infectious disease epidemics are important tools for assessing the impact of communicable disease in both human and animal (wild and domesticated) populations (Anderson and May, 1992). At the most basic level they are a concise means of quantitatively representing the essential epidemiological and biological factors that relate to a particular disease pathogen in a given population. More sophisticated implementations can be used to assist in the design or evaluation of actual or prospective intervention programmes, as has been done in planning childhood disease immunisation (Grenfell et al., 2001; Jansen et al., 2003), in analysing the spread and containment of BSE and foot-and-mouth disease in the United Kingdom (Anderson et al., 1996; Keeling et al., 2003), in investigating the impact of HIV (Anderson and May, 1992), and in assessing the role of badgers in the on-going bovine TB epidemic in south-west England (Donnelly et al., 2006).
In common with mathematical modelling in other disciplines there is a research imperative to develop ever more realistic epidemic models and simulations that include higher-fidelity representations of underlying biological or population processes. However, more realistic models tend to be more complex and they are frequently populated by a proliferation of parameters that need to be robustly estimated (Riley et al., 2003; Ferguson et al., 2005; Longini et al., 2005). In practice, the degree of model sophistication that is used in a given situation often reflects a judicious balance between the questions the model is required to address and the ability to reliably estimate parameters.
When an unknown or poorly understood pathogen suddenly emerges in a population the challenges of epidemic modelling are exacerbated because much of the quantitative work has to be performed on a fast time-scale, usually with limited or poor quality data. There is a risk that the epidemiological dynamics might outpace the modelling response if too sophisticated an approach is adopted. One of the immediate tasks faced by policy advisors and modellers is determining the epidemic curve; specifically, the initial requirement is to estimate model parameters, produce a forecast for the duration of the epidemic and provide an estimate of the expected proportion of the population to be infected by the disease. In these circumstances the simplest epidemic models are generally used because it is pointless to compound epidemiological uncertainties with those generated by inappropriate use of over-complex and poorly parameterised models. Fitting epidemic models to data is a topic of on-going research. Anderson and May (1992) and Daley and Gani (1999) give accounts of commonly used methods. Wearing et al. (2005), Ferrari et al. (2005) and Fraser (2007) provide a discussion of many of the challenges faced when attempting to estimate the basic reproductive ratio from epidemiological data.
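As a small illustration of the last forecasting quantity mentioned above, the expected proportion of the population infected can be computed for the simplest deterministic SIR model from the standard final-size relation z = 1 - exp(-R0 z). This assumes a fully susceptible, homogeneously mixing population and a negligible initial number of infectives. The sketch below is not taken from the paper; the function name `final_size` is hypothetical.

```python
import math


def final_size(r0, tol=1e-12):
    """Fraction ultimately infected in a simple SIR epidemic (assumed model).

    Solves z = 1 - exp(-r0 * z) for the non-zero root by bisection.
    Returns 0.0 when r0 <= 1, where no major epidemic occurs.
    """
    if r0 <= 1.0:
        return 0.0

    def g(z):
        return z - 1.0 + math.exp(-r0 * z)

    # g is convex with g(0) = 0 and its minimum at z = ln(r0) / r0,
    # so the non-zero root is bracketed by [ln(r0) / r0, 1].
    lo, hi = math.log(r0) / r0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for r0 in (1.5, 2.0, 3.0):
        print(r0, round(final_size(r0), 4))
```

Because the relation depends only on R0, any uncertainty in the estimate of R0 carries directly into the forecast of the total proportion infected.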
Motivated by the requirement for robust methods for parameter estimation and the need to derive full predictive benefit from the most basic of epidemic models in the early stages of an outbreak, we present a new method for the determination of epidemiological parameters and for the subsequent production of epidemic forecasts. The method, known as variational data assimilation (VDA), has hitherto been used in weather and climate modelling (Bouttier and Courtier, 1999; Huang and Yang, 1996). A version has also been developed for predator-prey systems (Lawson et al., 1995). VDA was developed to optimally combine a dynamical model with observations of the system to produce accurate forecasts, and it can be readily adapted to epidemiological applications.
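For orientation, the cost function that variational schemes of this kind typically minimise in the meteorological literature can be written as below. This is the generic strong-constraint form, given here as an assumption about the general setting rather than as the paper's own formulation. The symbols are standard in that literature and are not taken from this text: x_0 is the initial state, x_b a background (prior) estimate with error covariance B, y_k the observations at time k with error covariance R_k (unrelated to the basic reproductive ratio R0), H_k the observation operator, and M_{0→k} the dynamical model that carries the state from time 0 to time k.

```latex
J(\mathbf{x}_0) =
  \tfrac{1}{2}\,(\mathbf{x}_0 - \mathbf{x}_b)^{\mathsf T}\,
  \mathbf{B}^{-1}\,(\mathbf{x}_0 - \mathbf{x}_b)
  + \tfrac{1}{2}\sum_{k=0}^{N}
  \bigl(\mathbf{y}_k - H_k\bigl(M_{0\to k}(\mathbf{x}_0)\bigr)\bigr)^{\mathsf T}\,
  \mathbf{R}_k^{-1}\,
  \bigl(\mathbf{y}_k - H_k\bigl(M_{0\to k}(\mathbf{x}_0)\bigr)\bigr)
```

The first term penalises departure from the prior estimate and the second penalises mismatch between the model trajectory and the data. The minimiser is the initial state that best balances the two, and the forecast is obtained by running the model forward from it.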
In this paper we introduce the concept of data assimilation and show how it can be adapted and applied to epidemic modelling and infectious disease management. In Section 2 we review the VDA procedure, and in Section 3 it is applied to a continuous-time SIR epidemic model. In Section 4 we introduce some simulated data (with errors) for a basic underlying epidemic process and show how the assimilation can be carried out using a model and data. Section 5 illustrates the use of assimilation to estimate the basic reproductive ratio R0 of the epidemic that gave rise to the data. In Section 6 the method is applied to an outbreak of influenza in a school setting.
Section snippets
Variational data assimilation
Exploiting dynamical models that are constrained by observations or measurements to produce a prediction of future system behaviour is an issue that is of on-going interest to weather and climate forecasters. The challenges faced when attempting to predict the future behaviour of an epidemic and when forecasting the weather have much in common. Observations and data (with their associated uncertainties) need to be combined with (often non-linear) dynamical models in order to produce an estimate
A basic epidemic model
The foundation of the majority of epidemic models is the susceptible–infectious–recovered (SIR) compartmental model (Bailey, 1957; Anderson and May, 1992). Despite its simplicity this basic formulation, cast in either deterministic or stochastic form, has provided a wealth of insight into the dynamics of many different transmissible diseases in a variety of population types. The structural simplicity and ease of parameterisation make the continuous-time SIR model the first point of departure
Assimilation using model data
One of the principal motivations for modelling epidemic outbreaks is to use simple models to provide insight into the epidemiological characteristics of the outbreak, and this can be particularly challenging if the outbreak is due to a novel or poorly understood infectious pathogen. In this section we will show how the VDA method can be used to meld observations with a dynamic model to provide estimates of initial conditions and then to provide predictions of future epidemic behaviour. To
Parameter estimation
In the development and application of VDA to an epidemiological context we have heretofore assumed that all the parameters of the forward model are known. In the SIR forward model this amounts to knowing the basic reproductive ratio, R0, of the pathogen and the average period of infectiousness, σ. Often these quantities are known or can be estimated. This is particularly true when the outbreak is triggered by a known pathogen, such as measles or chickenpox in humans (see, for example, Fraser, 2007).
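When R0 is not known in advance, one simple way to see how it could be recovered from prevalence data is to treat it as the control variable of a misfit cost and minimise that cost. The sketch below is a simplified stand-in and not the authors' algorithm. It assumes a deterministic SIR model in population fractions with transmission rate R0/σ and recovery rate 1/σ, treats σ and the initial infectious fraction as known, omits any background term, and uses synthetic observations generated inside the script. All function names are hypothetical.

```python
import random


def simulate_sir(r0, sigma, i0, days, steps_per_day=20):
    """Daily infectious fraction i(0..days) of a deterministic SIR model (RK4)."""
    beta = r0 / sigma      # transmission rate implied by R0 and sigma
    gamma = 1.0 / sigma    # recovery rate; sigma is the mean infectious period

    def rhs(s, i):
        new_infections = beta * s * i
        return -new_infections, new_infections - gamma * i

    s, i = 1.0 - i0, i0
    h = 1.0 / steps_per_day
    series = [i]
    for _ in range(days):
        for _ in range(steps_per_day):
            k1s, k1i = rhs(s, i)
            k2s, k2i = rhs(s + 0.5 * h * k1s, i + 0.5 * h * k1i)
            k3s, k3i = rhs(s + 0.5 * h * k2s, i + 0.5 * h * k2i)
            k4s, k4i = rhs(s + h * k3s, i + h * k3i)
            s += h * (k1s + 2 * k2s + 2 * k3s + k4s) / 6.0
            i += h * (k1i + 2 * k2i + 2 * k3i + k4i) / 6.0
        series.append(i)
    return series


def cost(r0, obs, sigma, i0):
    """Half the sum of squared model-minus-observation misfits."""
    model = simulate_sir(r0, sigma, i0, len(obs) - 1)
    return 0.5 * sum((m - y) ** 2 for m, y in zip(model, obs))


def estimate_r0(obs, sigma, i0, lo=1.0, hi=10.0, step=0.1, tol=1e-5):
    """Coarse grid scan over R0, then golden-section refinement near the best point."""
    def f(r0):
        return cost(r0, obs, sigma, i0)

    n = int(round((hi - lo) / step))
    best = min((lo + k * step for k in range(n + 1)), key=f)
    a, b = max(lo, best - step), min(hi, best + step)
    invphi = (5 ** 0.5 - 1) / 2
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


if __name__ == "__main__":
    # Synthetic "truth" (assumed values) with additive Gaussian observation noise.
    rng = random.Random(0)
    truth = simulate_sir(3.0, 2.0, 0.001, 30)
    noisy = [max(0.0, y + rng.gauss(0.0, 0.01)) for y in truth]
    print("estimated R0:", estimate_r0(noisy, sigma=2.0, i0=0.001))
```

The grid scan guards against the line search settling in a poor local minimum. A gradient-based minimisation, as is usual in variational assimilation, would replace both steps when more than one parameter or the initial conditions are estimated together.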
Application to an outbreak of influenza
To illustrate how assimilation might be used in a real application we apply the method to data recorded during an outbreak of influenza in a boarding school (Anonymous, 1978). Following initiation of the epidemic, disease prevalence (as reflected by the number of cases confined to hospital) was recorded over a two-week period in a closed population of 763 individuals. Here we will use data assimilation with an SIR model to estimate the basic reproductive ratio. This analysis is not
Conclusions
Some of the most challenging epidemiological modelling applications take place in real time, with a need to meld data and simple epidemic model structures to provide forecasts and parameter estimates. When intervention measures are applied during an outbreak the underlying dynamics of the epidemic inevitably change. Behavioural changes in the population in response to an epidemic can also generate significant dynamical changes. As a consequence, only the limited amount of relatively
Acknowledgements
CJR is supported by the Research Councils of the United Kingdom and Imperial College. TDH was supported by the European Community (SARSTRANS, Contract SP22-CT-2004-511066). The authors thank two referees for useful comments that significantly improved the presentation of the manuscript.
References (23)
- et al. Estimation and inference of R0 of an infectious pathogen by a removal method. Math. Biosci. (2005)
- et al. A data assimilation technique applied to a predator–prey model. Bull. Math. Biol. (1995)
- et al. Infectious Diseases of Humans: Dynamics and Control (1992)
- et al. Transmission dynamics and epidemiology of BSE in British cattle. Nature (1996)
- et al. Epidemiology, transmission dynamics and control of SARS: the 2002–2003 epidemic. Philos. Trans. R. Soc. B (2004)
- Anonymous, 1978. Influenza in a boarding school. Brit. Med. J., 4th...
- The Mathematical Theory of Epidemics (1957)
- Bouttier, F., Courtier, F., 1999. Data assimilation concepts and methods. European Centre for Medium Range Weather...
- Atmospheric Data Analysis (1991)
- et al. Epidemic Modelling: An Introduction (1999)