Article Text

Estimation of country-specific tuberculosis resistance antibiograms using pathogen genomics and machine learning
  1. Avika Dixit1,2,
  2. Luca Freschi2,
  3. Roger Vargas2,3,
  4. Matthias I Gröschel2,
  5. Maria Nakhoul4,
  6. Sabira Tahseen5,
  7. S M Masud Alam6,
  8. S M Mostofa Kamal7,
  9. Alena Skrahina8,
  10. Ramon P Basilio9,
  11. Dodge R Lim9,
  12. Nazir Ismail10,
  13. Maha R Farhat2,11
  1. 1Division of Infectious Diseases, Department of Pediatrics, Boston Children’s Hospital, Boston, Massachusetts, USA
  2. 2Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
  3. 3Center for Computational Biomedicine, Harvard Medical School, Boston, Massachusetts, USA
  4. 4Informatics and Analytics Department, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
  5. 5National Tuberculosis Control Programme, Islamabad, Pakistan
  6. 6Ministry of Health and Family Welfare, Kolkata, West Bengal, India
  7. 7National Institute of Diseases of the Chest and Hospital, Dhaka, Bangladesh
  8. 8Republican Scientific and Practical Center for Pulmonology and Tuberculosis, Minsk, Belarus
  9. 9Research Institute for Tropical Medicine, Muntinlupa City, Philippines
  10. 10Clinical Microbiology and Infectious Diseases, University of the Witwatersrand Johannesburg Faculty of Health Sciences, Johannesburg, South Africa
  11. 11Pulmonary and Critical Care Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
  1. Correspondence to Dr Maha R Farhat; Maha_Farhat{at}


Background Global tuberculosis (TB) drug resistance (DR) surveillance focuses on rifampicin. We examined the potential of public and surveillance Mycobacterium tuberculosis (Mtb) whole-genome sequencing (WGS) data, to generate expanded country-level resistance prevalence estimates (antibiograms) using in silico resistance prediction.

Methods We curated and quality-controlled Mtb WGS data. We used a validated random forest model to predict phenotypic resistance to 12 drugs and bias-corrected for model performance, outbreak sampling and rifampicin resistance oversampling. Validation leveraged a national DR survey conducted in South Africa.

Results Mtb isolates from 29 countries (n=19 149) met sequence quality criteria. Global marginal genotypic resistance among mono-resistant TB estimates overlapped with the South African DR survey, except for isoniazid, ethionamide and second-line injectables, which were underestimated (n=3134). Among multidrug resistant (MDR) TB (n=268), estimates overlapped for the fluoroquinolones but overestimated other drugs. Globally pooled mono-resistance to isoniazid was 10.9% (95% CI: 10.2-11.7%, n=14 012). Mono-levofloxacin resistance rates were highest in South Asia (Pakistan 3.4% (0.1–11%), n=111 and India 2.8% (0.08–9.4%), n=114). Given the recent interest in drugs enhancing ethionamide activity and their expected activity against isolates with resistance discordance between isoniazid and ethionamide, we measured this rate and found it to be high at 74.4% (IQR: 64.5–79.7%) of isoniazid-resistant isolates predicted to be ethionamide susceptible. The global susceptibility rate to pyrazinamide and levofloxacin among MDR was 15.1% (95% CI: 10.2-19.9%, n=3964).

Conclusions This is the first attempt at global Mtb antibiogram estimation. DR prevalence in Mtb can be reliably estimated using public WGS and phenotypic resistance prediction for key antibiotics, but public WGS data demonstrates oversampling of isolates with higher resistance levels than MDR. Nevertheless, our results raise concerns about the empiric use of short-course fluoroquinolone regimens for drug-susceptible TB in South Asia and indicate underutilisation of ethionamide in MDR treatment.

  • Tuberculosis

Data availability statement

All data relevant to the study are included in the article or uploaded as supplementary information.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Statistics from

Request Permissions

If you wish to reuse any or all of this article please use the link below which will take you to the Copyright Clearance Center’s RightsLink service. You will be able to get a quick price and instant permission to reuse the content in many different ways.


  • We searched PubMed and the WHO online documents to identify country-level prevalence estimates for resistance to first-line and second-line antibiotics used clinically for the treatment of tuberculosis (TB). In PubMed, we used the terms ‘Antibiogram’, ‘tuberculosis’, ‘resistance’ and ‘surveillance’ published between January 2018 and January 2023. The WHO documents reported annual estimates of rifampicin resistance/multidrug resistance rates at the country level, but no annual reports of resistance prevalence to other drugs. The PubMed search yielded 442 studies that included reports on TB drug resistance rates in hospitals, clinics, communities, cities and countries with differing sampling strategies varying from systematic to convenience samples. We identified no meta-analyses or original reports of TB antibiograms to both first-line and second-line drugs that aimed to survey multiple countries.


  • This study presents a method for the use of public pathogen whole-genome sequencing (WGS) data to survey first-line and second-line TB drug resistance rates by country. The method corrects for oversampling of MDR and outbreak bias and is benchmarked using simulations and a national drug resistance survey. The application of this method to 19 149 TB WGS from 29 countries provides evidence that mono-resistance to fluoroquinolones has a higher rate in South Asia than in other surveyed countries.


  • We expect the approach, and shared code and antibiograms accessible via the web, to assist in future surveillance of drug-resistant TB more comprehensively. With the expansion of WGS for use in TB surveillance programmes, the data available to generate these estimates is expected to grow and can be leveraged to allow for the monitoring of trends in resistance over time.


Tuberculosis (TB) and its causative agent Mycobacterium tuberculosis (Mtb) are a persistent global health threat resulting in more than 10 million incident cases and 1.5 million deaths annually.1 Although overtaken as the top infectious disease killer by the coronavirus disease 2019 (COVID-19) in the years 2020 and 2021, TB is now again the top infectious disease killer globally. One of the major challenges in TB control is the emergence of antibiotic-resistant TB that is difficult and expensive to cure with favourable treatment outcomes achieved in only 56% of cases.1–3 Improved strategies to tackle resistant TB first require improved local estimates of resistance burden. National estimates currently reported by the WHO focus on rifampicin resistance. These rely on either modelling, periodic surveys or pooling rifampicin testing data, often derived from nucleic acid amplification tests, including cartridge-based tests or line probe assays, for countries that routinely test for rifampicin resistance.1 Resistance estimates for the remaining agents are even more limited and still, largely rely on expensive and labor-intensive culture-based antibiotic susceptibility testing. Surveillance of resistance to other first-line and second-line agents, for example, pyrazinamide, fluoroquinolones, is needed to guide disease control and to project patient eligibility to standardised regimens, including newer fluoroquinolone-based regimens for antibiotic susceptible TB and short-course regimens for multidrug resistance (MDR).4 5

Between 2000 and 2010, 395 Mtb genomes were published in the National Center for Biotechnology Information’s Sequence Read Archive. This number rose to 79 716 between 2011 and 2020 and is increasingly representative of high-prevalence settings.6 Major motivators for sequencing include the study of TB transmission/outbreaks as well as genotypic surveillance of MDR or rifampicin resistance in TB in both high and low burden countries.7–9 While these efforts involve oversampling of MDR or rifampicin-resistant isolates, they are less likely to oversample higher-level resistance including pre-XDR and XDR TB, as this information is difficult to obtain due to the laboratory cost and biohazard, especially in high-burden settings.

Enabled by an improved understanding of genetic resistance mechanisms, prediction from whole-genome sequencing (WGS) is now a reliable approach to resistance diagnosis for the majority of Mtb antibiotics.10 11 Several methods have been developed to predict resistance in silico from WGS data, these span simpler methods, like direct association, to machine learning.12–15 Among the best-performing methods that have been validated across 12 different antibiotics is a random forest classifier (see Methods for performance details).16 We leveraged this model to comprehensively survey TB antibiotic resistance at the country level using public and surveillance WGS data. We outline an approach to correct DR and outbreak oversampling bias, and the model’s imperfect sensitivity and specificity. We validate our WGS-based estimates of drug resistance burden against national drug resistance survey data for South Africa. The results are accessible to the user using a point-and-click open web platform that can be refined over time as new WGS data and models become available.


Data curation

We curated Mtb genomes from NCBI, PATRIC and, published literature (online supplemental table 1).8 10 17–19 We also included genotype and resistance phenotype data from the WGS-based resistance survey from seven countries (Azerbaijan, Bangladesh, Belarus, Pakistan, Philippines, South Africa and Ukraine) led by the WHO.8 Additional details are shown in online supplemental methods.

Supplemental material

Phenotypic DST

Phenotypic drug susceptibility testing (DST) data were curated from PATRIC and the published literature. Details of the phenotypic testing methodology used by each study are shown in online supplemental table 1. For studies that reported minimal inhibitory concentrations, we converted the results into a binary variable (indicating sensitive or resistant) using critical concentrations endorsed by the WHO.20

Genomic analysis/variant calling

We used a previously validated genomic analysis pipeline for Mtb with modifications described in online supplemental methods.21 22

Implementation of random forest predictor

We used a previously described random forest (RF) model for in-silico resistance prediction described in online supplemental methods.12

Estimation of resistance burden/antibiograms by country

We developed a procedure to correct for the possible oversampling of outbreaks in our data set as described in online supplemental methods and figure 1. We calculated the proportion of isolates resistant to each antibiotic, by country, using the number of isolates predicted as resistant divided by total isolates available for prediction for each country. We focus on four basic resistance estimates: (1) the marginal proportion of resistance among all TB isolates available, and the conditional proportion of resistance to a specific agent among one of the following three categories: (2) rifampicin-susceptible isolates that we label as mono-resistant, (3) rifampin-resistant isolates and (4) MDR isolates. In each case, we estimated the bias-corrected prevalence of antibiotic resistance and corrected for resistance oversampling (online supplemental methods). We determined residual bias in genotype-based estimates of drug resistance after correction for outbreak sampling for South Africa only using the results of the South African national TB antibiotic resistance survey from 2012 to 2014 (online supplemental methods).8 23 Resistance prevalence to certain drug combinations was assessed based on existing treatment regimens and guidelines, that is, fluoroquinolone-containing regimen for drug-susceptible TB and short course (‘Bangladesh regimen’) for MDR-TB.

Other analysis, data, and code availability

Antibiograms were plotted for each country using ggplot2.24 We validated the resistance estimation procedure using a simulated data set with a range of resistance oversampling and outbreak bias (5–25% or 50%) as described in online supplemental methods. All custom scripts used for this analysis were written using R V.3.6.1 and Python V.3.7.4 and are available on GitHub at



We identified 22 599 isolates with data on country of origin that had high-quality WGS available (online supplemental methods). There were 82 countries represented by at least 1 isolate and 32 countries by at least 100 isolates that met sequence quality criteria.

Phenotypic resistance

Of the 22 599 isolates, 12 023 had phenotypic DST results (figure 1). The 12 023 isolates originated from 43 countries; 11 343 were tested for both isoniazid and rifampicin, and 2856 (25% (95% CI: 24-26%)) were resistant to both, that is, were MDR. Among the MDR isolates, 456 (16% (95% CI: 15-17%)) were resistant to at least one fluoroquinolone (ofloxacin, ciprofloxacin, levofloxacin or moxifloxacin). Among the 17 countries with at least 100 isolates with phenotypic DST, we computed the raw frequency of MDR in the sample. In comparison to the WHO-reported MDR rates, 16 of the 17 countries had a higher MDR rate confirming the concern of the over-representation of MDR in public genomic data.

Figure 1

Isolates filtered and available for analysis. See online supplemental figure 1 for details on the computation of pairwise SNP distance. QC, quality control; SNP, single nucleotide polymorphism.

Due to over-representation of MDR/rifampicin resistance, we assessed phenotypic resistance patterns strictly among rifampicin susceptible isolates (n=8581 from 43 countries, median 30 isolates per country (IQR=3–225)). Mono-isoniazid (Hr-TB) by phenotypic assay was seen in 9% (780/8581) of pooled isolates; Peru had the highest proportion at 33% (95% CI: 28-37%, n=423) and none were isoniazid mono-resistant from China (95% CI: 0.0-8.2%, n=43). Among isolates susceptible to rifampicin with DST to levofloxacin and/or moxifloxacin, 1.6% (95% CI: 1.1-2.2%, n=1906 from eight countries) were resistant. Among the four countries with at least 100 rifampicin susceptible isolates with DST to at least one fluoroquinolone, the highest proportion of fluoroquinolone mono-resistance was found in Bangladesh (3.9%, 95% CI: 2.3-6.2%, n=431), online supplemental table 2. Genotype-based antibiograms, as detailed below, showed trends consistent with these phenotypic patterns even after bias correction. Genotype-based prediction allowed for estimation in a larger number of countries. Concordance between phenotypic DST and genotypic prediction was high for first-line drugs and lower for second-line drugs and consistent with published validation data (online supplemental results).16

Genotypic antibiogram estimation

To generate 12-drug antibiograms from WGS data, we followed a three-step procedure. We first filtered sequenced isolates that may represent Mtb outbreaks. We next applied a Bayesian correction for the imperfect sensitivity and specificity of the in silico resistance model. Lastly, to generate marginal antibiograms, we marginalised over rifampicin resistance categories using the WHO-reported rate of rifampicin resistance for that country (Methods). In addition to marginal antibiograms, we generated country-level bias-corrected estimates of Hr-TB and mono-levofloxacin resistant TB, that is, only among rifampicin susceptible isolates, for the 26 countries with at least 50 rifampicin susceptible isolates available for analysis (figure 2A,B). We also estimated resistance to 11 drugs including pyrazinamide, and levofloxacin among MDR-TB for the 15 countries with at least 100 sequenced MDR-TB isolates (online supplemental table 3).

Figure 2

Bias-corrected estimates of (A) isoniazid mono-resistance and (B) levofloxacin mono-resistance in rifampicin susceptible isolates. Only countries with at least 100 total isolates of which at least 50 were rifampicin susceptible are shown.

The outbreak correction was performed because outbreak investigation is one application of Mtb WGS potentially resulting in the oversampling of specific resistance genotypes (online supplemental methods).22 25 This correction led to the exclusion of 2354 isolates from the 22 599 meeting sequencing quality criteria, leaving 20 245 isolates from 78 countries, of which 29 countries had at least 100 isolates available. The median percentage of isolates excluded from each country was 14.7% (IQR: 0.0–15.6%). We compared estimates with and without outbreak correction for the 29 countries represented by at least 100 sequenced isolates (median 342 isolates per country (IQR: 198–829), n=19 149), online supplemental figures 2 and 3, table 1 and online supplemental table 4. Drug resistance estimates before and after outbreak correction were comparable for the majority of the countries (online supplemental figures 4 and 5).

Table 1

Summary table of results for predicted prevalence estimates

Comparison with national antibiotic resistance survey data

We used the South African national TB antibiotic resistance survey from 2012 to 2014 to study residual bias in genotype-based estimates of drug resistance after the corrections described above (figure 3).8 23 The confidence interval (CI of marginal resistance estimates contained the national survey rate for rifampicin, ethambutol, pyrazinamide, levofloxacin, ofloxacin and streptomycin. For isoniazid, ethionamide and second-line injectables (n=3134), the survey rate was higher than the public WGS-derived CI. For para-amino salicylic acid, the survey rate was lower than the estimated CI. In each case, the discrepancy between the point estimates of resistance prevalence was ≤2%, except for isoniazid where the WGS-derived point estimate was 3.9% lower than the survey-reported rate. Among MDR isolates, the CI estimated using public WGS data (n=268) contained the national DR survey rate for levofloxacin and para-amino salicylic acid but ranged higher than the survey estimate for other drugs including pyrazinamide, ethambutol and second-line injectables (figure 3).

Figure 3

Validation of estimates for South Africa. Comparison of drug resistance estimates for South Africa from the national drug resistance (DR) survey 2012–2014 to those calculated using public whole-genome sequences (WGS) in this study from (A) all isolates (n=3134) and, (B) among multidrug-resistant (MDR) isolates (n=268), that is, those resistant to both rifampicin and isoniazid (INH). National DR survey reported second-line injectable resistance estimates were used to compare with amikacin (AMK), capreomycin (CAP) and kanamycin (KAN) estimates generated using public WGS. National DR survey reported ofloxacin (OFLX) resistance estimates were used to compare with OFLX and levofloxacin (LEVO) estimates generated using public WGS. EMB, ethambutol; ETH, ethionamide; INH_MONO, INH resistance in rifampicin susceptible isolates; PAS, para-amino salicylic acid; PZA, pyrazinamide; STR, streptomycin.

Validation using a simulated data set

Given the overestimation of resistance among MDR relative to the South African DR survey, we tested if this is due to residual oversampling of resistance or if this is due to a bias in the genotypic prediction itself. We used a simulated data set of Mtb phenotypes and genotypes with a prespecified marginal rate of phenotypic resistance for rifampicin (~6.5%), isoniazid (~10%), pyrazinamide (2.5%) and levofloxacin (1.5%) (the ground truth). We simulated incremental bias related to outbreak oversampling (5–50%) and MDR oversampling (5–25%) of two starting sample sizes (n=500, and n=1000) (online supplemental methods). We then passed the corresponding genotypes into the RF model with corrections as outlined in the methods section for real data and compared the final estimates with the ground truth of the input. The results demonstrated that the genotypic method with corrections introduced <±1.5% final mean bias in the marginal rate of resistance for the four drugs (online supplemental table 5). The introduction of outbreak bias did not affect the final mean bias substantially (final mean bias <±1.8%) but widened the CI around the final estimate. Lastly, the genotypic method was robust to the introduction of MDR oversampling except for the drug pyrazinamide for which the final mean bias increased to +3% from +1% (online supplemental results).

In summary, validation using the South African DR survey and in simulation demonstrated that the genotypic estimates of marginal resistance are accurate within ±2% and are robust to the outbreak and MDR sampling bias. Levofloxacin resistance rates among MDR are also accurate and robust, but resistance rates to other drugs among MDR are less reliable.

Country-level estimates of antibiotic resistance

Marginal rates of resistance

The pooled rate of isoniazid marginal resistance was 11.6% (95% CI: 8.5-14.6%, n=19 149) across 29 countries. For 10 of the 29 countries, >30% of isolates were estimated resistant to isoniazid; 5 of these were former Soviet Union countries (online supplemental figure 2, table 1, online supplemental table 4). The highest isoniazid resistance rate was estimated for Russia (66%, 95% CI: 45-87%, n=829), followed by Ukraine (58%, 95% CI: 39-76%, n=957). Isolates from the five former Soviet Union countries also had the highest pyrazinamide resistance rates. For countries outside of the former Soviet Union, the highest isoniazid resistance rate was measured in the Philippines (43%, 95% CI: 40-46%, n=181), Portugal (39%, 95% CI: 38-40%, n=100) and Peru (39%, 95% CI: 31-42%, n=1521). The lowest isoniazid resistance rate was seen in South Africa (4.7%, 95% CI: 3.3-6.2%, n=3134) and Japan (8%, 95% CI: 6-10%, n=368).

Ethionamide and isoniazid have a similar mechanism of action and share the InhA enzyme as the drug target but vary in other genetic resistance determinants including ethA and katG respectively. Given recent interest in drugs that boost ethionamide activity and their expected activity against some isolates with isoniazid resistance we focus here on the marginal rates of ethionamide resistance and resistance discordance between isoniazid and ethionamide.26 Marginal rates of ethionamide resistance demonstrate a wide range from country to country. On the high extreme was the Republic of Moldova with a rate of 32% (95% CI: 22-42%, n=278) while the lowest rate was seen in the UK at 0.15% (95% CI: 0.05-0.30%, n=2831). All countries (n=29) had higher rates of resistance to isoniazid compared with ethionamide with a median difference of 16.3% (IQR: 9.5–28.1%). Among isoniazid-resistant isolates (n=6090 from 29 countries, median 145 (IQR: 96–246) isolates per country), a median of 74.4% (IQR: 64.5–79.7%) were predicted to be ethionamide susceptible. Of the 4827 total isolates that were predicted to be isoniazid-resistant but ethionamide susceptible, 4477 (92.7%) harboured antibiotic resistance-conferring mutations in katG but not in inhA. Phenotypic DST was available for 565 of the 4477 isolates and of these 80.2% tested resistant to isoniazid but susceptible to ethionamide.

Mono-resistance to isoniazid and fluoroquinolones

26 of the 29 countries were represented by at least 50 rifampicin susceptible isolates (median 237 (IQR: 116–500) isolates per country). The pooled rate of Hr-TB was estimated at 10.9% (95% CI: 10.2-11.7%, n=14 012) across the 26 countries and explained most of the marginal rate of isoniazid resistance (11.6%, 95% CI: 8.5-14.6%, n=19 149) as noted above. By country (figure 2A), we searched for evidence of over-estimation and under-estimation at the two extremes of mono-resistance to isoniazid: highest in the Philippines (40%, 95% CI: 30-51%, n=119), and lowest in South Africa (1.2%, 95% CI: 0.5-2%, n=2746). We verified high genotype and phenotype concordance for isoniazid in the Philippines (online supplemental table 6) and found evidence for oversampling of Hr-TB in public WGS data compared with the DR survey (online supplemental results). For South Africa, Hr-TB was underestimated due to the lack of resistance mutations in 60% of isolates; the remaining 40% of isoniazid mono-resistance was missed due to rare mutations in katG and the ahpC promoter, but not in the fabG1-inhA promoter (online supplemental tables 7 and 8).

The pooled rate of mono-resistance to levofloxacin was estimated at 0.1% (95% CI: 0.003-0.3%, n=14 012). There was considerable geographical variation: Pakistan and India had the highest rate of levofloxacin resistance, at 3.4% (0.1–11.1%) and 2.8% (0.1–9.4%) of 111 and 114 rifampicin susceptible isolates, respectively. In Bangladesh, which had a high prevalence of fluroquinolone mono-resistance based on phenotypic data, the bias-corrected genotypic estimate was 0.7% (0.02–2.3%, n=454). Results by country are shown in figure 2B. For India, we compared the estimates to the national TB antibiotic resistance survey performed in 2014–2016 and in which the prevalence of levofloxacin resistance among new TB patients was 2.7% (95% CI: 2.2-3.4%) (online supplemental figure 6).27

Fluoroquinolone resistance among MDR isolates

Fifteen countries had at least 100 MDR isolates after filtering of closely related isolates (n=3964, median 179 isolates/country (IQR: 147.5–299.5)). We estimated MDR antibiograms for the 15 countries and across the whole sample, with and without outbreak correction (online supplemental figures 7 and 8, table 3). Specifically, we focused on moxifloxacin/levofloxacin given the validation results detailed above.28 29 The pooled rate of levofloxacin resistance among MDR-TB was 8.6% (95% CI: 0.5-19.8%, n=3964); by country, this ranged from 48% in Japan (95% CI: 27-69%, n=135) to 0.98% in the DRC (95% CI: 0.02-3.59%, n=144).

Susceptibility to antibiotics used in the MDR-TB short-course regimen

We estimated an upper bound on the proportion of MDR isolate by country with combined susceptibility to levofloxacin and pyrazinamide, two of the key drugs used in the short-course regimen (also known as ‘Bangladesh regimen’) that were best predicted individually by our genotypic approach as benchmarked against the South African national DR survey (online supplemental table 9). We assessed the sensitivity and specificity of the RF model for predicting combined susceptibility to pyrazinamide and levofloxacin by comparing against phenotypic resistance for the pooled MDR isolates with pyrazinamide and levofloxacin/moxifloxacin resistance phenotype available (n=2513, online supplemental table 10). The RF determines combined susceptibility with high sensitivity=91.1%, but low specificity 59.5% compared with phenotypic testing (online supplemental table 10). Therefore, using the RF to measure eligibility provides an optimistic or upper-bound estimate. This approach further overcalls eligibility because it ignores resistance to the other components of the short course regimen, ethambutol and kanamycin, which can be common among MDR-TB patients globally (online supplemental results).

The availability of WGS data on MDR isolates allowed us to estimate eligibility across 15 countries (each n>100 MDR). We used a similar procedure to prior prevalence estimates, where we corrected for outbreak sampling followed by bias correction of the raw RF estimates using the above-described sensitivity and specificity for predicting combined susceptibility to pyrazinamide and levofloxacin among MDR-TB. The bias-corrected global estimate of eligibility to short course MDR therapy was 15.1% of MDR TB (95% CI: 10.2-19.9%, n=3964). This combined rate was high for the Democratic Republic of Congo (60.7% (95% CI: 45.6-75.1%, n=144)) and the Netherlands (60.5% (95% CI: 46.4-73.8%, n=178)). The former Soviet Union countries had the lowest combined rates (eg, Azerbaijan: 3.5% (95% CI: 0.1-11.3%, n=179) and Moldova: 3.8% (95% CI: 0.1-12.1%, n=163)) (figure 4). For Bangladesh, where the regimen was originally developed, the rate was 42.5% (95% CI: 19.8-64.6%, n=69).

Figure 4

Bias-corrected estimates of susceptibility to antibiotics used for short course regimen for treatment of multidrug-resistant (MDR) tuberculosis. Numbers shown for each country are bias-corrected percentages of MDR isolates that were sensitive to two of the antibiotics (pyrazinamide and levofloxacin) used in the short-course regimen. Only countries with at least 100 MDR isolates are shown.

Access to antibiogram estimates

Genotypic estimates are available through a point-and-click web interface at to allow for quick reference by clinicians and public health practitioners.


Using a large Mtb genomic data set, we estimate antibiotic resistance rates to 12 first-line and second-line antituberculosis agents across 29 countries with correction applied for oversampling of MDR-TB and outbreaks, as well as for genotypic model performance. We demonstrate the strengths and weaknesses of this sequencing-based approach for resistance surveillance using simulations and by comparing the model estimates to systematic national drug resistance survey data. The genotypic approach circumvents a major current constraint in DR surveillance that of reliance on culture-based phenotypic DST. Our results reinforce that public WGS data is increasingly representative of high TB prevalence settings, especially countries with a high burden of MDR. Simulations show that the proposed procedure accurately estimates marginal resistance rates for isoniazid, levofloxacin and pyrazinamide and is largely robust to MDR and outbreak oversampling in public data. Overall, the antibiograms generated here provide key insights into resistance prevalence and co-resistance patterns globally and have implications for TB management including empiric short regimen use for both drug-susceptible and MDR-TB.

In recent clinical trials, a 4-month fluoroquinolone and rifapentine-based treatment regimen for antibiotic susceptible TB was shown to be non-inferior to the current standard of care.4 As this regimen may be soon rolled out for TB treatment, we studied the proportion of rifampicin susceptible isolates (as a proxy for rifapentine susceptibility) that harboured resistance to moxifloxacin or levofloxacin. This resistance would not be detectable by the widely used GeneXpert MTB/RIF that only detects rifampicin resistance mutations.1 Globally, the estimated prevalence of late-generation fluoroquinolone resistance was low at <1%, yet in countries we studied in South Asia the rate was 20–30 times higher. Our estimate of the prevalence was consistent with the prevalence among new TB cases reported by the National Drug Survey in India and with estimates among rifampicin susceptible isolates from Pakistan.27 29 We speculate that the higher rate of fluoroquinolone resistance, may be related to overperscribed or over-the-counter use of fluoroquinolones in those countries.30 31 But other factors including bacterial fitness of fluoroquinolone resistance and transmissibility of such isolates are yet to be explored.22 Overall, our results highlight the need for comprehensive diagnostics that identify fluoroquinolone resistance with a quick turnaround time. These will aid in the rapid identification of patients eligible for the newer fluoroquinolone-containing regimen for antibiotic susceptible TB or short course regimen for MDR-TB.

We found a high proportion of ethionamide susceptibility among isoniazid-resistant isolates across the countries studied. The side-effect profile of ethionamide is less favourable compared with isoniazid (eg, hepatotoxicity seen in up to 5% of patients as compared with up to 3% with isoniazid) and it is hence typically reserved for second-line use.32–34 Our results however support the wider consideration of this agent in the treatment of Hr-TB or MDR-TB.

This study is a proof of concept and has several limitations. This included sampling bias of antibiotic resistance in public TB WGS data. We recognised this limitation and designed a multistep bias correction procedure that performed well for marginal resistance rate estimation on a simulated data set at different levels of the outbreak and MDR oversampling bias. Compared with the South African National DR survey, we observed residual bias in resistance prevalence estimates among MDR isolates with the exception of levofloxacin. These results that there is residual oversampling of resistance at higher levels than MDR, that is, of XDR, despite the relative rarity of XDR globally. The estimates also suggested oversampling of Hr-TB for countries at the upper extreme of resistance prevalence estimates including the Philippines. Another limitation is the imperfect sensitivity of genotypic models for predicting resistance. We note that the RF models used in this study had higher sensitivity and specificity than direct association as reported for the recently released WHO catalogue of resistance mutations.35 36 However, there are still notable gaps in sensitivity for mono-resistance to isoniazid, and second-line injectables compared with phenotypic testing.

Our results support that Hr-TB at least in some parts of the world can be caused by rarer mutations that do not occur in MDR isolates, and hence training separate models for isoniazid mono-resistance will be necessary for accurate genotypic prediction in the future. This however requires large sequencing data sets of isoniazid mono-resistant isolates that are not currently available. Isoniazid remains an important agent in TB treatment and is challenged by the lack of high sensitivity point-of-care diagnostic and delays and complexity of phenotypic assays. Despite limitations in estimating rates of Hr-TB, our average global rate 10.4% is comparable to the recent WHO global estimate of Hr-TB prevalence (8.4%) that used phenotypic data aggregated at the country level.37 Our procedure addresses raised concerns about lower phenotypic sensitivity for Hr-TB and low sample sizes raised in the same report.37

The novel anti-tuberculosis agents including bedaquiline have recently become cornerstone agents in MDR-TB therapy. We were unable to generate estimates for bedaquiline, linezolid and delamanid in this study due to the lack of reliable genotypic prediction methods for these drugs. Recent reports do suggest very low rates of resistance to these agents, in part due to their recent introduction to clinical practice.38 Another limitation is the lack of clinical metadata that did not allow us to estimate antibiograms separately for new and previously treated TB cases. Lastly, antibiogram estimation was necessarily limited to countries well represented in the isolate data set and does not yet cover all high TB burden countries. We anticipate the wider adoption of Mtb sequencing for routine resistance diagnosis in high TB burden countries, championed by agencies such as UNITAID and the Gates Foundation to address several of the aforementioned limitations in future WGS-based surveillance efforts.39 40

In conclusion, we present an effort at global and comprehensive resistance rate estimation by repurposing public pathogen genomic sequence data and leveraging state-of-the-art resistance prediction models. We have made these data publicly accessible for use by clinicians and public health practitioners globally. Acknowledging their limitations, these estimates can assist geography-specific strategies for the control of TB and drug resistance. With the expansion of WGS for use in TB surveillance programmes, the data available to generate these estimates is expected to grow and can be leveraged to allow for the monitoring of trends in resistance over time.

Data availability statement

All data relevant to the study are included in the article or uploaded as supplementary information.

Ethics statements

Patient consent for publication

Ethics approval

Not applicable.


We would like to thank Anna Dean of the Global Tuberculosis Programme, WHO, Geneva, Switzerland, for her support in data collection and coordination of this project.


Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.


  • Handling editor Alberto L Garcia-Basteiro

  • Contributors AD and MRF conceived and designed the study. MRF is the guarantor of this study. AD, LF, RV, MIG, MN and MRF wrote scripts for the data analysis. AD and MN conducted the data analysis. ST, SSMA, SSMK, AS, RPB, DRL and NI provided data for analysis, and contributed with guidance and advice throughout the project. AD and MRF wrote the first version of the manuscript and the final manuscript contained contributions from all authors. The final manuscript was read and approved by all authors.

  • Funding This work was funded by the Harvard Global Health Institute’s Burke Fellowship (MRF) Grant No. N/A. AD was supported by the Boston Children's Hospital Office of Faculty Development (the Basic and Translational Executive Committee, the Clinical and Translational Research Executive Committee Faculty Career Development Fellowship) Grant No. N/A and the Bushrod H Campbell and Adah F Hall Charity Fund (Charles A King Trust Postdoctoral Fellowship) Grant No. N/A. RV was supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE1745303. MIG was supported by the German Research Foundation (GR5643/1-1). The funders had no role in study design, data collection and interpretation or the decision to submit the work for publication.

  • Map disclaimer The depiction of boundaries on this map does not imply the expression of any opinion whatsoever on the part of BMJ (or any member of its group) concerning the legal status of any country, territory, jurisdiction or area or of its authorities. This map is provided without any warranty of any kind, either express or implied.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.