The global and regional burden of genital ulcer disease due to herpes simplex virus: a natural history modelling study

Introduction
Herpes simplex virus (HSV) infection can cause painful, recurrent genital ulcer disease (GUD), which can have a substantial impact on sexual and reproductive health. HSV-related GUD is most often due to HSV type 2 (HSV-2), but may also be due to genital HSV type 1 (HSV-1), which has less frequent recurrent episodes than HSV-2. The global burden of GUD has never been quantified. Here we present the first global and regional estimates of GUD due to HSV-1 and HSV-2 among women and men aged 15–49 years old.

Methods
We developed a natural history model reflecting the clinical course of GUD following HSV-2 and genital HSV-1 infection, informed by a literature search for data on model parameters. We considered both diagnosed and undiagnosed symptomatic infection. This model was then applied to existing infection estimates and population sizes for 2016. A sensitivity analysis was carried out varying the assumptions made.

Results
We estimated that 187 million people aged 15–49 years had at least one episode of HSV-related GUD globally in 2016: 5.0% of the world’s population. Of these, 178 million (95% of those with HSV-related GUD) had HSV-2 compared with 9 million (5%) with HSV-1. GUD burden was highest in Africa, and approximately double in women compared with men. Altogether there were an estimated 8 billion person-days spent with HSV-related GUD globally in 2016, with 99% of days due to HSV-2. Taking into account parameter uncertainty, the percentage with at least one episode of HSV-related GUD ranged from 3.2% to 7.9% (120–296 million). However, the estimates were sensitive to the model assumptions.

Conclusion
Our study represents a first attempt to quantify the global burden of HSV-related GUD, which is large. New interventions such as HSV vaccines, antivirals or microbicides have the potential to improve the quality of life of millions of people worldwide.

is measured by prospective studies (often clinical trials) that evaluate GUD symptoms typically during one year of follow-up among those who seroconverted to HSV-2 during the study (Table A1; Figure 1).
is informed by clinic-based studies that recruit individuals with a (diagnosed) first episode due to HSV-2 and measure the percentage of individuals with at least one recurrence during follow-up (Table A1; Figure 1). It is also informed by studies of unrecognised infection which recruit individuals who are HSV-2 seropositive but without a history of (recognised) genital herpes and observe how many experience documented GUD during follow-up. Deriving estimates from these data is difficult because: (1) the percentage of infected people who have diagnosed versus unrecognised infection is difficult to define and measure, and may depend on factors such as access to healthcare; (2) these two different types of study population are not necessarily exhaustive of all those with HSV-2 infection: clinic-based studies may disproportionately include infected individuals with more severe disease, meaning that the percentage who recur and the recurrence rate measured in these studies may be overestimates; (3) GUD in those with unrecognised/undiagnosed infection makes an important contribution to disease burden.
Thus, for our base case estimates we calculated the second part of the equation by applying estimates of the percentage of individuals with established infection (τ>1 year since infection) who have received a diagnosis, = 1, who have one or more GUD recurrences in a year after the first year of infection as measured by clinic studies, denoted by is informed by surveys of all those HSV-2 infected which ask participants if they have received a diagnosis (Table A1). We assumed that the percentage of individuals with established infection who experience recurrences is independent of time since infection ( = ), given the lack of data to fully inform how this might change over time. This does not necessarily mean that the recurrence rate does not vary with time since infection.
The mean number of days with GUD due to HSV-2 in a given year depends on the contribution to GUD days from individuals with recently-acquired infection experiencing their first episode (in the first year since infection), from individuals with recently-acquired infection with recurrences (in the first year since infection), and from individuals with established infection with recurrences (after the first year since infection), i.e.:
1. The number of GUD days per person with recently-acquired infection (τ≤1) experiencing a first episode, , multiplied by the percentage of the population with recently-acquired HSV-2 infection, ( ), and the percentage of individuals with recently-acquired infection who experience a first episode, ;
2. The number of GUD days per person with recently-acquired infection (τ≤1) due to all recurrences in the first year, (averaged over all those with recently-acquired infection, including those without a first episode or recurrences), among diagnosed individuals, = 1, and undiagnosed individuals, = 0, multiplied by the percentage of the population with recently-acquired HSV-2 infection, ( );
3. The number of GUD days per person 1<τ≤10 years following infection due to all recurrences in a year, (averaged over all those with established infection, including those without a first episode or recurrences), among diagnosed individuals, = 1, and undiagnosed individuals, = 0, multiplied by the percentage of the population with established HSV-2 infection, ( );
4. The number of GUD days per person τ>10 years following infection due to all recurrences in a year, (averaged over all those with established infection, including those without a first episode or recurrences), among diagnosed individuals, = 1, and undiagnosed individuals, = 0, multiplied by the percentage of the population with established HSV-2 infection, ( ).
The equation is therefore as follows:
Mean number of days with GUD due to HSV-2 in a given year at age a =
Studies may measure the percentage of days on which individuals experience GUD recurrences, which when multiplied by the total number of days in a year gives . Studies may also measure the average duration of a recurrence, , and the number of recurrences in a year, , which when multiplied together also give .
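The two routes to the annual number of recurrence days described above can be illustrated with a minimal sketch. The function names, the 365-day year and any input values are assumptions made for illustration; they are not taken from the study.

```python
DAYS_PER_YEAR = 365  # assumed length of a year for this illustration


def days_from_percentage(pct_days_with_gud):
    """Route 1: percentage of days with GUD recurrences -> GUD days per year."""
    return pct_days_with_gud / 100 * DAYS_PER_YEAR


def days_from_episodes(mean_duration_days, recurrences_per_year):
    """Route 2: mean recurrence duration x number of recurrences per year."""
    return mean_duration_days * recurrences_per_year
```

Both functions return the same quantity, GUD days per person per year, so a study reporting either kind of measurement can inform the same parameter.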
is measured by studies of documented first-episode symptoms due to HSV-2 (Table A1; Figure  1). Similar to estimates for the percentage of people with any GUD due to HSV-2 in a given year, we calculated the latter parts of the equation by applying estimates of among those who have received a diagnosis, , , as measured by clinic studies, to estimates of , and applying estimates of among those who have not received a diagnosis,

GUD due to genital HSV-1 infection
For genital HSV-1, recurrences were limited to the first five years since infection, because there is a low recurrence rate up to five years since infection and no data available past 5 years. , , and ( Table A1; Figure 1) are informed by similar studies to those for HSV-2, with the difference that there are no studies of recurrences in those with unrecognised infection, and the percentage who are diagnosed is unknown: only can be estimated.
The relevant equations for the percentage with any GUD due to genital HSV-1 in a given year at age a (expressed as a percentage of the total population) and the mean number of days with GUD due to genital HSV-1 in a given year at age a are as follows: The number of people with any GUD over the total population was calculated by multiplying the percentage with GUD from equations ii and iv by total population size, ( ). Person-days with GUD per total population was calculated by multiplying mean number of days with GUD from equations iii and v by total population size, ( ). Using all the information we had, we also calculated the contribution to GUD burden of established versus recent infection, the total number of first episodes and the total number of recurrences, and the number of GUD days per person with infection and per person experiencing GUD.
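The scaling from per-person quantities to population totals described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the percentage is assumed to be expressed on a 0 to 100 scale relative to the total population.

```python
def people_with_gud(pct_with_gud, population_size):
    """Number of people with any GUD: percentage of the total population x population size."""
    return pct_with_gud / 100 * population_size


def person_days_with_gud(mean_days_with_gud, population_size):
    """Person-days with GUD: mean days per person in the total population x population size."""
    return mean_days_with_gud * population_size
```

Because both inputs are expressed per person in the total population (not per infected person), multiplying by total population size gives the burden directly.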

Data sources for model parameters
( ) and ( ) were taken from our existing WHO global and regional estimates of HSV-2 and genital HSV-1 infection annual incidence, ( ), and prevalence, ( ), in 2016 among those aged 15–49 years, expressed as percentages of the population 2. These estimates were made for 2016 and were informed by comprehensive literature reviews conducted up to August 2018. Full details are reported in the corresponding paper 2. Of note, the estimates for genital HSV-1 infection are particularly uncertain. This is due to a lack of HSV-1 prevalence data among children for all regions, a general lack of HSV-1 prevalence data for the WHO Africa and South-East Asia regions in particular, and the use of a pooled estimate for the percentage of incident HSV-1 infection from age 15 years that is genital, , to determine ( ) and ( ) for genital HSV-1 infection, informed by only four available longitudinal studies (all from the US). Population size, ( ), was obtained from the United Nations Population Division 28. Natural history parameter data were limited to GUD and did not include any other symptoms such as dysuria or fever. We extracted data separately for HIV-negative populations, and for people living with HIV (PLHIV) or populations stated to include those who were HIV-positive. For the latter, we additionally extracted information on HIV prevalence and antiretroviral therapy (ART) use. However, only data from HIV-negative populations were subsequently used in the estimates. Where it was possible to identify that multiple publications had reported on the same natural history parameter from the same study population, we extracted all data but only used the value associated with the largest sample size to inform the parameter in question.
Studies had to report GUD natural history for either HSV-2 or genital HSV-1, except for the duration of a first episode of GUD, , which is clinically indistinguishable between HSV-2 and genital HSV-1 6. estimates for HSV-2 pooled together data on first episodes due to HSV-2 or either HSV type (data not separable by type), while estimates for genital HSV-1 pooled together data on first episodes due to HSV-1 or either HSV type (data not separable by type), but for the latter, excluding data from those with existing HSV-1 infection (non-primary HSV-2 infection) where possible. Apart from this, we did not consider the effect of sex or pre-existing HSV-1 infection (for the HSV-2 GUD estimates) on the assumed parameter values. For both HSV-2 and genital HSV-1, we took the mean value where results were only reported separately for each sex, and for HSV-2, we took the mean value where results were only reported separately for primary and non-primary infection. Recurrences in those HSV-2 seropositive were assumed to be due to HSV-2. To minimize overestimating GUD, we excluded parameters from any study which selected participants based on having 4 recurrences or more per year, except for recurrence duration. Only data from the placebo/control group were extracted from clinical trials. Based on the available data, it was decided to separately pool and , deriving from using our pooled estimate for where data were available for and not . It was decided a priori to allow to vary by time since infection but

Pooling of studies reporting binary outcomes
Studies reporting binary outcomes ( , and ) were pooled on the log odds scale, where the standard error (SE) of the log odds was calculated according to SE=sqrt((1/cases)+(1/noncases)). This was computed using the metan command in Stata 13.1.
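As an illustration of pooling on the log odds scale, the sketch below computes the log odds and its SE for each study using the formula given above, then combines studies with fixed-effect inverse-variance weights. The choice of fixed-effect weighting is an assumption made here for illustration; the text states only that the metan command was used, not which pooling model was selected. All function names are hypothetical.

```python
import math


def log_odds_and_se(cases, noncases):
    """Log odds of the outcome and its SE, SE = sqrt(1/cases + 1/noncases)."""
    return math.log(cases / noncases), math.sqrt(1 / cases + 1 / noncases)


def pool_inverse_variance(estimates):
    """Fixed-effect inverse-variance pooling of (estimate, se) pairs (assumed model)."""
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se


def inverse_logit(log_odds):
    """Back-transform a pooled log odds to a proportion."""
    return 1 / (1 + math.exp(-log_odds))
```

A pooled proportion is then obtained by applying `inverse_logit` to the pooled log odds.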

Pooling of studies reporting duration of symptoms
For studies reporting duration of symptoms ( and ), we pooled mean duration on the log scale. For studies reporting median duration we estimated the mean by assuming an underlying Exponential distribution for duration of symptoms, using the formula mean=median/ln(2). SE of the mean duration was calculated as SE=mean/sqrt(N), also based on an underlying Exponential distribution for duration of symptoms. Mean duration was pooled on the log scale, which was achieved by transforming the mean and SE from a natural scale to a log scale, using the formulae from a log-Normal distribution: the log mean duration calculated from ln(mean/sqrt(1+(SE/mean)^2)) and SE of the log mean duration calculated from sqrt(ln(1+(SE/mean)^2)). These were then pooled using the metan command in Stata 13.1, noting that the resulting pooled estimates were on the log scale.
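The per-study transformations described above can be sketched as follows. The function names are hypothetical, and the sketch covers only the conversion of each study's summary to the log scale, not the pooling step itself.

```python
import math


def mean_from_median(median_duration):
    """Mean of an Exponential distribution given its median: mean = median / ln(2)."""
    return median_duration / math.log(2)


def se_of_mean(mean_duration, n):
    """SE of the mean under an Exponential distribution: SE = mean / sqrt(N)."""
    return mean_duration / math.sqrt(n)


def to_log_scale(mean_duration, se):
    """Log-Normal formulae converting (mean, SE) on the natural scale to the log scale."""
    ratio_sq = (se / mean_duration) ** 2
    log_mean = math.log(mean_duration / math.sqrt(1 + ratio_sq))
    log_se = math.sqrt(math.log(1 + ratio_sq))
    return log_mean, log_se
```

Since SE=mean/sqrt(N), the ratio SE/mean equals 1/sqrt(N), so under these assumptions the SE on the log scale depends only on the study's sample size.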

Pooling of studies reporting event counts
For studies reporting event counts (i.e., frequency of recurrences, ), we pooled mean event counts on the log scale. For studies reporting median event counts, we estimated the mean using the formula mean=((median-1/3)+sqrt(((1/3)-median)^2+0.08))/2, which is based on an underlying Poisson distribution for event frequency. SE of the mean frequency was calculated as SE=sqrt(mean/N), again based on an assumed Poisson distribution for event frequency. Mean event rates were pooled on the log scale, as for duration, noting that the resulting pooled estimates were on the log scale.

1. Define the indicator(s), populations (including age, sex, and geographic entities), and time period(s) for which estimates were made. Reported in: Introduction para 3
2. List the funding sources for the work. Reported in: Funding section

Data Inputs
For all data inputs from multiple sources that are synthesized as part of the study:
3. Describe how the data were identified and how the data were accessed. Reported in: Literature search and pooling section, and Data sources for model parameters section in the Appendix
4. Specify the inclusion and exclusion criteria. Identify all ad-hoc exclusions.
5. Provide information on all included data sources and their main characteristics. For each data source used, report reference information or contact name/institution, population represented, data collection method, year(s) of data collection, sex and age range, diagnostic criteria or measurement method, and sample size, as relevant. Reported in: Table A1
6. Identify and describe any categories of input data that have potentially important biases (e.g., based on characteristics listed in item 5). Reported in: Table A1 and Discussion
For data inputs that contribute to the analysis but were not synthesized as part of the study:
7. Describe and give sources for any other data inputs. Reported in: Methods, and Data sources for model parameters section in the Appendix
For all data inputs:
8. Provide all data inputs in a file format from which data can be efficiently extracted (e.g., a spreadsheet rather than a PDF), including all relevant meta-data listed in item 5. For any data inputs that cannot be shared because of ethical or legal reasons, such as third-party ownership, provide a contact name or the name of the institution that retains the right to the data. Reported in: Table A1

Data analysis
9. Provide a conceptual overview of the data analysis method. A diagram may be helpful. Reported in: Methods, Figure 1, and Natural history model section in the Appendix
10. Provide a detailed description of all steps of the analysis, including mathematical formulae. This description should cover, as relevant, data cleaning, data pre-processing, data adjustments and weighting of data sources, and mathematical or statistical model(s
Provide published estimates in a file format from which data can be efficiently extracted. Reported in: Tables 1-3 and Appendix
16. Report a quantitative measure of the uncertainty of the estimates (e.g. uncertainty intervals). Reported in: Table 3
17. Interpret results in light of existing evidence. If updating a previous set of estimates, describe the reasons for changes in estimates. Reported in: Results and Discussion
18. Discuss limitations of the estimates. Include a discussion of any modelling assumptions or data limitations that affect interpretation of the estimates. Reported in: Discussion