Global Dietary Database 2017: data availability and gaps on 54 major foods, beverages and nutrients among 5.6 million children and adults from 1220 surveys worldwide

Background We aimed to systematically identify, standardise and disseminate individual-level dietary intake surveys from up to 207 countries for 54 foods, beverages and nutrients, including subnational intakes by age, sex, education and urban/rural residence, from 1980 to 2015.
Methods Between 2008–2011 and 2014–2020, the Global Dietary Database (GDD) project systematically searched for surveys assessing individual-level intake worldwide. We prioritised nationally or subnationally representative surveys using 24-hour recalls, Food-Frequency Questionnaires or short standardised questionnaires. Data were retrieved from websites or corresponding members as individual-level food group microdata or aggregate stratum-level data. Standardisation included quality assessment; data cleaning; categorising of foods and nutrients and their units; aggregation by demographic strata; and energy adjustment.
Results We standardised and incorporated 1220 surveys into the final GDD 2017 database, together representing 188 countries and 99.0% of the world’s population in 2015. 72.1% were nationally, 17.0% subnationally, and 10.9% community-level representative. 41.2% used Food-Frequency Questionnaires; 23.4%, 24-hour recalls; 15.8%, Demographic Health Survey questionnaires; 13.1%, biomarkers and 6.4%, household surveys. 73.9% of surveys included data on children; 52.2%, by urban and rural residence; and 30.2%, by education. Most surveys were in high-income countries, followed by sub-Saharan Africa and Asia. Most commonly ascertained foods were fruits (N=803 surveys), non-starchy vegetables (N=787) and sugar-sweetened beverages (N=440); and nutrients, sodium (N=343), energy (N=256), calcium (N=224) and fibre (N=200). Least available data were on iodine, vitamin A, plant protein, selenium, added sugar and animal protein.
Conclusions This systematic search, retrieval and standardisation effort provides the most comprehensive empirical evidence on dietary intakes across and within countries worldwide.


INTRODUCTION
Diet is critical to both human 1 2 and planetary health. 3 4 Comprehensive and reliable evidence on individual dietary intakes in all nations of the world is essential for evaluating

Key questions
What is already known?
► Comparable and standardised global data on intakes of foods, beverages and nutrients relevant to maternal-child health and chronic diseases have not traditionally been available across nations or key subnational subgroups.
What are the new findings?
► Through systematic searches and collaboration with investigators worldwide, we retrieved and standardised 1220 surveys of nationally or subnationally representative data on individual-level dietary intakes from 188 countries/territories around the world.
► Most nationally or subnationally representative surveys were identified in high-income countries, followed by sub-Saharan Africa, Asia, Former Soviet Union and Latin America/Caribbean. Middle East/North Africa and South Asia were more data sparse.
► Among foods, data on fruits, vegetables and sugar-sweetened beverages were most available; among nutrients, on sodium, energy, calcium and fibre. Data on iodine, vitamin A, plant protein, selenium, added sugar and animal protein were most sparse.
► Less than one-third of surveys had dietary intake data on infants (age 0 to <2 years), young children (2 to <6 years), school-age children (6 to 10 years), older adults (70+ years) and pregnant/lactating women or by education level.
What do the new findings imply?
► These identified, collected and standardised data in the Global Dietary Database 2017 provide the most comprehensive empirical evidence on dietary intakes across and within countries worldwide.

BMJ Global Health
diet-related burdens for maternal-child health (MCH) and non-communicable diseases (NCDs), as well as for understanding population-level disparities, food costs and affordability, environmental sustainability, and progress toward key aims. For example, the recognition that global diets are relevant to 12 of 17 of the United Nations (UN) Sustainable Development Goals has led to ongoing planning for the first-ever UN Food Systems Summit, scheduled for 2021. 5 Given large potential variation within nations, such data are also crucial to provide empirical evidence of dietary habits in key subgroups, such as by age, sex, socioeconomic status and urban or rural residence. Data on habitual dietary intakes in diverse world regions and subpopulations are also essential to inform the potential impact of acute shocks such as the COVID-19 pandemic. Unfortunately, little data have been systematically identified, collated and standardised on dietary habits worldwide. Available instruments have assessed food commodities or household expenditures, which are not reflective of individual dietary intakes; for example, the UN Food and Agriculture Organization (FAO) Food Balance Sheets (FBS) 6 of estimated national food availability, or the World Bank's Living Standard Measurement Survey 7 and other household expenditure instruments which assess household-level food purchasing (but not foods produced by the household, purchased outside the home or actual individual intakes). These available data sources also do not assess heterogeneity within nations, such as by demographic subgroups that may vary in both dietary intakes and disease risk. Numerous national or subnational health and nutrition surveys at the individual level have been conducted around the world, but these are seldom standardised or comparable across countries, time, dietary factors or demographic groups, 8 and many are not publicly available.
To address these gaps, the Global Dietary Database (GDD) project was created to comprehensively identify, compile and standardise individual-level data on dietary factors relevant to health. 9 The first iteration, GDD 2010, developed systematic methods to compile data from 527 dietary surveys (including urine biomarker surveys) from 116 countries/territories, representing 88.7% of the global adult population in 2010. 10 The GDD 2010 represented a major advancement over previously available data, facilitating novel assessments of global dietary habits, trends and patterns, [11][12][13][14][15] burdens of diet-related illness [16][17][18][19] and diet-related sustainability concerns. 20 21 In addition, the GDD 2010, together with its associated systematic characterisation of diet-disease aetiological effects and optimal intake levels, 22 formed the foundation for the 2010 and 2013 Global Burden of Diseases Study risk estimates of diet-related health burdens. 16 23 These findings confirmed for the first time, for example, that poor diet has overtaken tobacco smoking as the leading cause of preventable death in the world. 23 Yet, key limitations remained. GDD 2010 focused on major dietary risks for NCDs, but excluded other dietary components, such as micronutrients, relevant for MCH. GDD 2010 also focused on adults (age ≥20 years), with little data on dietary intakes in children or youth. While GDD 2010 provided the first global data stratified by age and sex within nations, it did not include other subnational stratifiers likely to influence diets such as socioeconomic status or urban versus rural residence. GDD 2010 also identified data sparsity in certain regions of the world. 
To address these gaps, update available data, and advance the characterisation of dietary intakes worldwide, the GDD 2017 systematically identified, collected and standardised additional dietary surveys, including new information on many more foods, beverages and nutrients; an expanded age focus to include infants, children and adolescents; and further joint stratification by age, sex, education level, urban/rural residence and pregnancy/lactation status within nations. This analysis reports on data availability, and corresponding gaps, of global dietary data.

Prioritisation of dietary surveys
Our methods for GDD 2010 have been reported. 10 14 15 17 24 Briefly, we conducted systematic searches of multiple electronic databases and extensive personal communications with experts and authorities worldwide to identify and obtain individual-level dietary surveys globally. We focused on quantitative data on dietary consumption of 21 foods, beverages and nutrients in 16 age-specific and sex-specific subgroups among adults (age ≥20 years) in up to 116 nations across 21 geographical regions between 1980 and 2010. For GDD 2017, we performed additional systematic electronic searches together with extensive communications with 644 data owners worldwide to identify further public and nonpublic data sources on individual-level dietary intakes. We prioritised nationally or subnationally representative surveys, with a special focus on previously identified data sparse low-income and middle-income countries.
We searched for surveys that collected quantitative dietary intake information on one or more of 54 foods, beverages, nutrients or dietary indices (table 1). These were selected and defined based on evidence for relationships with MCH or NCDs as well as clinical and policy interests in their intakes. We also searched for surveys on four additional dietary factors (animal protein excluding dairy protein, dairy protein, glycaemic index and glycaemic load), but identified too few available surveys to include these in the GDD 2017. We prioritised surveys with individual-level assessments using standardised 24-hour recalls, food-frequency questionnaires (FFQ) or short standardised questionnaires (eg, Demographic Health Survey (DHS) questionnaires). Household-level surveys were considered if individual-level surveys were not available in a country. For assessment of dietary sodium and iron, we also searched for and included

This definition includes soybeans but excludes soy milk and soy protein.
Includes nuts/seeds, soy protein, soy products, peanuts and peas.
Refined grains g/day Total intake of refined grains, defined as grains which have been milled to remove the bran and germ. Examples include white or polished rice, and products made with refined (white) flour, including white bread, pasta/noodles, cereals, crackers, and bakery products/desserts containing refined grains. This definition excludes corn products including corn flour and corn meal.
Includes corn products, soybeans, sweetened cakes and breads with grain as the main ingredient. May include whole grains.
Whole grains g/day Total intake of whole grains, defined as a food with ≥1.0 g of fibre per 10 g of carbohydrate, in which all components of the kernel (ie, bran, germ, and endosperm) are present in the same relative proportions as the intact grain. Examples include whole grain bread, brown rice, whole grain pasta, whole grain breakfast cereals, oats, rye, barley, millet, sorghum, and bulgur. This definition excludes corn products including corn flour, corn meal and popcorn.
Includes wholegrain breads, cereals, rice/pasta, bread and other products such as biscuits.
Total processed meats g/day Total intake of processed meat, defined as any meat (including poultry) that has been cured, smoked, dried or chemically preserved. Examples include bacon, salami, sausages, hot dogs and processed deli or luncheon meats. This definition excludes fish and eggs.
Includes sausages and unprocessed meats.
Unprocessed red meats g/day Total intake of unprocessed red meat, defined as beef, pork, lamb, mutton or game that has not been cured, smoked, dried or chemically preserved. This definition excludes poultry, fish and eggs.
Includes processed red meats, poultry, fish and organ meats.
Total seafoods g/day Total intake of fish and shellfish. Examples include salmon, tuna, trout, tilapia, shrimp, crab, oysters and cephalopods.
Includes salted fish, processed fish and other animal products.

Dietary factor Unit Preferred definition Alternative definition
Eggs g/day Total intake of eggs produced by poultry/birds, including chicken, goose, or duck eggs. This definition excludes fish eggs.
Cheese g/day Total intake of cheese derived from the milk of livestock (eg, cows, buffalo, yak), including hard cheese (eg, cheddar, mozzarella, Swiss), soft cheese (eg, ricotta, cottage cheese, paneer) and processed cheese.
Includes yoghurt, milk products and cheese.
Yoghurt g/day Total intake of yoghurt and fermented milk, including reduced-fat and full-fat yoghurt.
Includes dairy curd, buttermilk, paneer, cheese and milk.
Sugar-sweetened beverages g/day Total sugar-sweetened beverage intake, defined as any beverage with added sugar having ≥50 kcal per eight ounces (236.5 g) serving, including commercial or homemade beverages, soft drinks, energy drinks, fruit drinks, punch, lemonade, and frescas. This definition excludes 100% fruit and vegetable juices and non-caloric artificially sweetened drinks.
Includes fruit and vegetable juices. May also include coffee, tea and milk.
Fruit juices g/day Total intake of 100% fruit juice, excluding sugar-sweetened fruit juice and vegetable juice.
Includes fruit juices, vegetable juices and sweetened juices.

Coffee Cups/day (one cup=8 ounces) Total coffee intake including caffeinated, decaffeinated, sweetened or unsweetened coffee.
Includes tea.
Tea Cups/day (one cup=8 ounces) Total green or black tea intake, including caffeinated, decaffeinated, sweetened or unsweetened tea. This definition excludes herbal tea.
Includes coffee.
Reduced fat milk g/day Total reduced-fat dairy milk intake, including non-fat, low-fat milk and skim milk. This definition excludes yoghurt, fermented milk and soy or plant-derived milk (eg, coconut milk, almond milk).
Includes sweetened reduced fat flavoured milk.
Whole fat milk g/day Total whole-fat dairy milk intake. This definition excludes yoghurt, fermented milk, and soy or other plant-derived milk (eg, coconut milk, almond milk).
Includes sweetened whole fat flavoured milk.
Total milk g/day Total intake of dairy milk including non-fat, low-fat, skim, and whole-fat milk. This definition excludes yoghurt, fermented milk and soy or other plant derived milk (eg, coconut milk, almond milk).
Includes yoghurt, dairy drinks, cheese and dairy products.
Added sugar Per cent energy/day Total intake of sugar added during the preparation or processing of foods and beverages. Examples include the sugar added in sugar-sweetened beverages, desserts, candy, breakfast cereals and sweetened milk. This definition excludes non-caloric sweeteners and sugars that naturally occur in foods, such as those in fruits, milk or milk products.
Includes all dietary sugar.
Calcium mg/day Total intake of calcium from all sources, excluding dietary supplements.
Includes intake from supplements in a population with relatively low supplement use.
Dietary sodium mg/day Total intake of sodium from all sources.
Includes urinary sodium.
Iodine µg/day Total intake of iodine from all sources, excluding dietary supplements.
Includes intake from supplements in a population with relatively low supplement use.
Iron mg/day Total intake of heme and non-heme iron from all sources, excluding dietary supplements.
Includes intake from supplements in a population with relatively low supplement use.
Magnesium mg/day Total intake of magnesium from all sources, excluding dietary supplements.
Includes intake from supplements in a population with relatively low supplement use.
Potassium mg/day Total intake of potassium from all sources, excluding dietary supplements.
Selenium µg/day Total intake of selenium from all sources, excluding dietary supplements.
Includes intake from supplements in a population with relatively low supplement use.
Vitamin A with supplements µg RAE/day (RAE=retinol activity equivalent) Total intake of vitamin A (including retinol, retinal, retinoic acid, and retinyl esters) and provitamin A carotenoids from all sources, including dietary supplements.
Includes intake from supplements in a population with relatively low supplement use.
Vitamin E mg/day Total intake of vitamin E tocopherols and tocotrienols from all sources, excluding dietary supplements.
Includes intake from supplements in a population with relatively low supplement use and alpha tocopherol.
Zinc mg/day Total intake of zinc from all sources, excluding dietary supplements.
Includes intake from supplements in a population with relatively low supplement use.
The foods capture nearly the entire diet, with the exceptions of poultry, dairy-based desserts, candy and sweeteners, and cakes, cookies and other baked goods, which may be collected in future iterations of the GDD. AOAC, Association of Official Analytical Chemists; DFE, dietary folate equivalent; GDD, Global Dietary Database; RAE, retinol activity equivalent.

Because few identified publications reported dietary intake data according to comparable definitions or stratified by all relevant subgroups, we used published articles to identify data owner contacts. As done for GDD 2010, 10 24 data owners were invited to join the GDD as a corresponding member (CM), which involved contributing their expertise and survey data through standardised electronic forms. GDD 2010 CMs were also invited to update previously submitted data with the new age group, dietary factor and other subgroup strata, as well as share any newly collected data. A detailed contact and communication algorithm was used to maximise responsiveness and participation (online supplemental text S2). Each CM registered their survey and its characteristics, completed a data sharing agreement, and uploaded the data as individual-level data (preferred) or stratum-level data based on standardised dietary factor and strata definitions. Relevant publicly available surveys were identified using systematic database searches for health and nutrition surveys, as well as communication with our global CM network. For each potentially relevant public survey, data codebooks were screened for inclusion, and eligible surveys downloaded with prioritisation of data-sparse regions and nations as well as large (populous) nations.

Survey screening and inclusion
Identified published articles were screened by title and abstract and, for all potentially relevant articles, screened as full text by a single reviewer. A random subset of articles from each database and geographic region was screened by a second reviewer to ensure consistency and accuracy. When published reports did not contain the necessary data format (most often), data owners were invited to become CMs and share their data. Datasets and survey documentation submitted by CMs and those from public surveys were reviewed again by a third reviewer to ensure survey inclusion criteria were met. We prioritised nationally or subnationally representative surveys whenever available. When no such surveys were identified for a nation, we allowed community-level surveys, and then household-level surveys, if these were judged to be representative of the community; that is, such surveys were excluded if focused on special populations (eg, people with specific disease conditions) or cohorts (eg, people of a certain profession or dietary pattern).

Data retrieval and assessment
Standardised protocols were used to identify, extract and analyse data in a systematic and comparable manner. For CM-provided surveys, survey characteristics were retrieved using a standardised electronic form, including data on survey name, country, years performed, sampling methodology, response rate, national representativeness, level of data collection (individual or household level), dietary assessment method and validation, sample size, population demographics (age, sex, education, urban/rural residence, pregnancy/lactation status), and definitions and measurement units of dietary factors. Individual-level microdata were retrieved as SAS, STATA, SPSS v.25 or Microsoft Excel files. Aggregate stratum-level dietary intake data were collected using standardised electronic spreadsheets, including data on stratum sample size and means, SD, and 10th, 25th, 50th, 75th and 90th percentiles of intake for each dietary factor, jointly stratified by age, sex, education and urban/rural strata, as available. Data from publicly available surveys were retrieved using a similar standardised electronic spreadsheet as for CM surveys. Random double checks of data retrievals were performed to ensure correct extraction of publicly available surveys. When the same study collected information across multiple countries, data for each country were separated and counted as a separate survey for reporting purposes.
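As an illustration of the aggregate stratum-level format described above (sample size, mean, SD and the 10th, 25th, 50th, 75th and 90th percentiles), the following is a minimal sketch in Python. The function names, the linear-interpolation percentile rule and the unweighted calculation are assumptions made for illustration; the text does not specify how data owners computed percentiles or whether survey weights were applied.

```python
from math import sqrt


def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in 0..100) of a pre-sorted list.

    The interpolation rule is an assumption; other conventions exist.
    """
    if not sorted_vals:
        raise ValueError("no values supplied")
    k = (len(sorted_vals) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def summarise_stratum(intakes):
    """Return unweighted n, mean, sample SD and selected percentiles
    for one stratum's individual intakes (hypothetical helper)."""
    vals = sorted(intakes)
    n = len(vals)
    if n == 0:
        raise ValueError("stratum has no observations")
    mean = sum(vals) / n
    # Sample SD (n - 1 denominator); defined as 0.0 for a single observation.
    sd = sqrt(sum((v - mean) ** 2 for v in vals) / (n - 1)) if n > 1 else 0.0
    summary = {"n": n, "mean": mean, "sd": sd}
    for p in (10, 25, 50, 75, 90):
        summary[f"p{p}"] = percentile(vals, p)
    return summary


if __name__ == "__main__":
    # Hypothetical fruit intakes (g/day) for five individuals in one stratum.
    print(summarise_stratum([100, 200, 300, 400, 500]))
```

A survey-weighted version would replace the simple mean and percentile calculations with weighted equivalents.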

Data standardisation
Standardisation included data quality assessment, standardised categorisation of foods, beverages and nutrients and their units, aggregation by subgroup strata, energy adjustment and compilation into a relational database. Each dietary variable was characterised according to a standard definition and units (table 1). Surveys with varying definitions were classified using defined secondary definitions. When multiple days of dietary intakes were collected (eg, diet recalls or records), these were averaged for each individual. Semi-quantitative instruments (eg, FFQs based on a single specified portion size) and short standardised questionnaires (eg, DHS surveys) were converted to standard serving sizes for each frequency category. Household-level data were converted to individual-level intakes within each household using Adult Male Equivalents, 25 which account for the household composition and differing energy intakes by age and sex of household members. Based on national estimated average requirements [26][27][28] and observed population intakes, 29 all intakes were adjusted to 700 kcal/day for ages 0 to <1 years, 1000 kcal/day for ages 1 to <2 years, 1300 kcal/day for ages 2-5 years, 1700 kcal/day for ages 6-10 years, 2000 kcal/day for ages 11-74 years and 1700 kcal/day for ages 75+ years (online supplemental text S3). Individual-level microdata were aggregated into subgroups jointly stratified by age (0-5, 6-11 and 12-23 months; and 2-4, 5-10, 11-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-79, 80-84, 85-89, 90-94 and 95+ years), sex, education (≤6 years of education, 6.01-12 years or ≥12.01 years; and for children, head of household's educational attainment), urban/rural residence, and pregnancy/lactation status, as available.
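The age-specific energy targets listed above can be expressed as a small lookup. The sketch below is illustrative only: it assumes simple proportional scaling of each intake to the target energy level, which is one common approach but is not confirmed by the text as the method used (the details are in online supplemental text S3). The function names and the treatment of band edges (for example, reading "2-5 years" as ages 2 to <6) are likewise assumptions.

```python
# Age-specific energy targets (kcal/day) taken from the text.
# Each entry is (lower age bound in years, inclusive; target kcal/day).
# Band edges are interpreted as [0, 1), [1, 2), [2, 6), [6, 11), [11, 75), 75+;
# this interpretation is an assumption.
ENERGY_TARGETS = [
    (0, 700),
    (1, 1000),
    (2, 1300),
    (6, 1700),
    (11, 2000),
    (75, 1700),
]


def target_energy(age_years):
    """Return the energy target (kcal/day) for a given age in years."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    target = ENERGY_TARGETS[0][1]
    for lower, kcal in ENERGY_TARGETS:
        if age_years >= lower:
            target = kcal
    return target


def energy_adjust(intake, reported_kcal, age_years):
    """Scale an intake to the age-specific energy target.

    Proportional (density-style) scaling is an assumed simplification,
    not necessarily the GDD procedure.
    """
    if reported_kcal <= 0:
        raise ValueError("reported energy intake must be positive")
    return intake * target_energy(age_years) / reported_kcal


if __name__ == "__main__":
    # Hypothetical adult aged 35 reporting 300 g/day of fruit at 2500 kcal/day.
    print(energy_adjust(300, 2500, 35))
```

Under this assumed scaling, an adult reporting 300 g/day at 2500 kcal/day would be assigned 240 g/day at the 2000 kcal/day target.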
Urban versus rural residence was defined according to each survey's established definition, owing to the absence of any single global definition of these factors as well as logistical challenges in aiming to revise each survey's existing definitions. Education was selected as the most widely available and standardised metric of socioeconomic status, as compared with income or wealth indices which are not always reported similarly or accurately across countries.
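The joint stratification by age, sex, education and residence described in this section can be sketched as a function that maps an individual to a stratum key. This is a minimal illustration rather than the GDD code: the function names, the labels "low", "medium" and "high" for the three education categories, and the use of age in months as input are all assumptions. Pregnancy/lactation status is omitted for brevity.

```python
from bisect import bisect_right

# Infant age bands in months, as listed in the text: (lower, upper exclusive, label).
INFANT_BANDS = [(0, 6, "0-5 months"), (6, 12, "6-11 months"), (12, 24, "12-23 months")]

# Lower bounds (years) of the age bands from 2 years upward:
# 2-4, 5-10, 11-14, 15-19, then 5-year bands from 20-24 to 90-94, then 95+.
AGE_LOWER_YEARS = [2, 5, 11, 15] + list(range(20, 100, 5))


def age_band(age_months):
    """Map an age in months to the age-band label used for stratification."""
    if age_months < 0:
        raise ValueError("age must be non-negative")
    for lo, hi, label in INFANT_BANDS:
        if lo <= age_months < hi:
            return label
    years = age_months // 12
    idx = bisect_right(AGE_LOWER_YEARS, years) - 1
    lower = AGE_LOWER_YEARS[idx]
    if idx == len(AGE_LOWER_YEARS) - 1:
        return f"{lower}+ years"
    return f"{lower}-{AGE_LOWER_YEARS[idx + 1] - 1} years"


def education_level(years_of_education):
    """Classify education into the three categories in the text.

    The category labels are hypothetical. For children, the head of
    household's educational attainment would be passed in.
    """
    if years_of_education <= 6:
        return "low"      # <=6 years
    if years_of_education <= 12:
        return "medium"   # 6.01-12 years
    return "high"         # >=12.01 years


def stratum_key(age_months, sex, years_of_education, residence):
    """Build a joint stratum key; residence follows each survey's own definition."""
    return (age_band(age_months), sex, education_level(years_of_education), residence)


if __name__ == "__main__":
    print(stratum_key(30 * 12, "female", 9, "urban"))
```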

Quality control and data management
Data integrity and quality were assessed at each step during survey collection, processing, standardisation and analyses. Duplicate reviews were performed of recorded survey characteristics, demographic variables, dietary definition classifications and unit conversions. To assess for outliers and validity (errors) in reported intakes, plausibility thresholds were defined for each dietary factor, both at the individual level and stratum (eg, group mean) level, based on dietary reference intakes, tolerable upper limits, toxicity ranges and existing regional data on mean intakes in populations (online supplemental tables S9-S14). Any value identified as potentially implausible was reviewed for extraction errors, followed by direct correspondence with the CM or public survey data owners, to detect and correct potential errors. Data remaining implausible after such steps were excluded from final datasets. Results for each dietary factor were further graphed and visually inspected by country, age, sex, dietary assessment method, representativeness and time period, reviewing survey result plausibility and consistency within and across countries.
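The plausibility screening described above can be illustrated with a simple range check that flags values for review. The sketch below is hypothetical: the threshold values, field names and function name are invented for illustration and are not the GDD thresholds, which are given in online supplemental tables S9-S14. In the workflow described in the text, flagged values are first reviewed and queried with data owners; only values that remain implausible are excluded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlausibleRange:
    """Inclusive lower and upper plausibility limits for one dietary factor."""
    low: float
    high: float


# Illustrative individual-level limits only; NOT the published GDD thresholds.
EXAMPLE_THRESHOLDS = {
    "fruits_g_day": PlausibleRange(0, 2000),
    "sodium_mg_day": PlausibleRange(100, 20000),
}


def flag_implausible(records, thresholds):
    """Return the records whose value lies outside its factor's range.

    Records for factors without a defined range are left unflagged.
    Flagged records are candidates for review, not automatic exclusions.
    """
    flagged = []
    for rec in records:
        rng = thresholds.get(rec["factor"])
        if rng is None:
            continue
        if not (rng.low <= rec["value"] <= rng.high):
            flagged.append(rec)
    return flagged


if __name__ == "__main__":
    sample = [
        {"id": 1, "factor": "fruits_g_day", "value": 250},
        {"id": 2, "factor": "fruits_g_day", "value": 5400},   # likely a unit error
        {"id": 3, "factor": "sodium_mg_day", "value": 35},
    ]
    for rec in flag_implausible(sample, EXAMPLE_THRESHOLDS):
        print(rec)
```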
Data analyses were performed using SAS V.9.4 (SAS Institute), Stata V.14.0 (StataCorp) and RStudio V.1.1.453 (RStudio, Massachusetts, USA). Data files were organised using Microsoft Access 2010 relational database (Microsoft, Redmond, Washington, USA), linking survey characteristics, data owner institutional information, and survey processing and standardisation details. SQL queries were developed to deduplicate data, summarise survey characteristics and calculate survey quality scores.
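To illustrate the kind of SQL deduplication query mentioned above, the following sketch uses Python's built-in sqlite3 module as a stand-in for the Microsoft Access database. The table name, column names and the rule that a duplicate is a repeated combination of country, survey name and survey year are assumptions for illustration; the actual GDD schema and queries are not described in the text.

```python
import sqlite3


def deduplicate_surveys(rows):
    """Load (country, survey_name, survey_year) rows into an in-memory
    table, delete exact duplicates and return the remaining rows.

    Schema and duplicate rule are hypothetical.
    """
    con = sqlite3.connect(":memory:")
    try:
        con.execute(
            "CREATE TABLE survey ("
            "id INTEGER PRIMARY KEY, country TEXT, "
            "survey_name TEXT, survey_year INTEGER)"
        )
        con.executemany(
            "INSERT INTO survey (country, survey_name, survey_year) "
            "VALUES (?, ?, ?)",
            rows,
        )
        # Keep the first-entered row of each duplicate group (lowest id).
        con.execute(
            "DELETE FROM survey WHERE id NOT IN ("
            "SELECT MIN(id) FROM survey "
            "GROUP BY country, survey_name, survey_year)"
        )
        return con.execute(
            "SELECT country, survey_name, survey_year FROM survey ORDER BY id"
        ).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    example = [
        ("Country A", "Survey X", 2010),
        ("Country A", "Survey X", 2010),  # duplicate entry
        ("Country B", "Survey Y", 2005),
    ]
    print(deduplicate_surveys(example))
```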

Modelling and imputation
In addition to identifying, collecting, standardising and disseminating survey-level data, the GDD 2017 uses advanced imputation modelling to account for differences in survey design, representativeness, dietary assessment methods and dietary factor definitions, as well as uncertainty in dietary estimates and missingness, to estimate stratum-specific mean dietary intakes jointly stratified by age, sex, education and urban/rural residence (N=240 strata per country year), for each of 188 countries/territories per year between 1990 and 2018. Our modelling methods for GDD 2010 have been reported, 11 15 30 31 and the updated modelling methods and findings for GDD 2017 will be the focus of a forthcoming paper.

Patient and public involvement statement
There was no public involvement in the study; we used publicly available or privately held data for the analysis.

(table 2). Most surveys included data on children and adolescents (age 0-19 years; 73.9%); about two-thirds of surveys (64.5%) included data on adults (age 20+ years). More than half (52.2%) included data that specified urban or rural residence of individuals, including 4.7% urban only and 1.4% rural only. Data on education level or pregnancy/lactation status of participants were available in 30.2% and 11.2% of surveys, respectively.

Compared with CM-provided surveys, public surveys were more often nationally representative (76.9% vs 60.7%), conducted before 2000 (42.0% vs 21.6%) and available as individual-level microdata (65.6% vs 41.6%), and more likely to be household level (7.8% vs 0.8%) (table 2).

†Each survey count represents a country-specific survey year. When data collection for a single survey was performed over multiple years, the median survey year was used (or first year if 2 years).

‡Because data on children/adolescents (0-19 years), urban/rural residence, education, pregnancy/lactation and response rate were not collected in GDD 2010 (41.7% of total surveys), these percentages may underestimate available data in these surveys. Values are shown for surveys including data on that subgroup and may sum to greater than 100% because a survey can include multiple subgroups.
§Based on the food groups collected in GDD 2010 (up to 21, 41.7% of surveys) and GDD 2017 (up to 54, 58.3% of surveys), not including biomarker surveys.
¶Individual-level microdata represent individual-level data in possession of the GDD. Aggregated stratum-level distributions are based on individual-level data aggregated by data owners in standardised subgroups jointly stratified by age, sex, education and urban/rural residence, and pregnancy/lactation status, as available; and provided to the GDD including stratum-specific means, SD and percentiles of intake. Nearly all (94.2%) surveys collected in 2014-2020 (GDD 2017 round of data collection) were individual-level microdata.
DHS, Demographic Health Survey; GDD, Global Dietary Database.

Unfortunately, countries in these regions have also been more likely to suffer prolonged conflict and economic shocks impacting food security and nutrition, 32 rendering even greater the challenges as well as importance of collecting robust nutrition surveillance data. Less than one-third of surveys each provided data on infants or children younger than 10 years, older adults (age 70+ years), or by pregnancy/lactation status in women. Publicly available (especially DHS) surveys were more likely to report data on young children and pregnant/lactating women, while non-public surveys were more likely to report data on older adults. Because nutritional requirements are especially sensitive in these subgroups, our findings demonstrate the global need for more nationally representative high-quality surveys, using 24-hour recalls or FFQs, in these special populations. In addition, while South Asia comprises about a quarter of the world's population and has the highest global prevalence of stunting and wasting, 33 our results show that this region had the fewest dietary surveys with data on children (age 0-19 years). These novel findings highlight specific key gaps in dietary surveillance in this region.
Among different dietary factors, the availability of surveys varied substantially. Generally, more surveys collected data on intakes of foods, especially fruits and vegetables; and fewer surveys estimated nutrient intakes. Importantly, relatively few public surveys reported on vitamins and minerals relevant to MCH such as iron, vitamin A, zinc, iodine, folate and vitamin B12, among others. We found many public surveys used dietary instruments which estimate only a portion of the diet (eg, DHS questionnaires) and thus cannot estimate nutrient intakes; or included broader food assessments but without reliable and updated national food composition databases or food composition tables to estimate nutrients. This was identified to be particularly problematic in sub-Saharan Africa and Latin America/Caribbean, emphasising the need for increased collection of individual-level, nationally representative dietary data using 24-hour recalls or FFQs, together with creation or updating of food composition data, in these regions. While 24-hour recalls are considered the best standard for assessing national and stratum-specific mean dietary intakes, their collection is more time and resource intensive than for FFQs, which also may better assess habitual intakes among each individual participant. 34 35 Consistent with this, more global surveys used FFQs than 24-hour recalls. Both of these instruments are more valid than short dietary questionnaires, which collect far less detailed data on a handful of foods and beverages, creating more measurement error and far less coverage of the whole diet. 36 However, short dietary questionnaires are easier and less expensive to administer, consistent with our findings that they are most common in low-resource nations. The GDD 2017 results highlight the unfortunate irony that micronutrient information is least available and valid where it matters most.
For specific factors, we identified and collected biomarker data, but valid biomarkers are not available for most dietary factors. 37 We did not use household-level data except where individual-level data were not available, given their significant limitations in assessing dietary intakes. 8 Many prior efforts to estimate dietary intakes globally have used FAO FBS as primary data inputs. 12 38 While FAO data represent powerful and useful annual estimates of national per capita availability of food commodities, they are not intended to capture and are poor representations of dietary intake. 39 For example, they do not adequately capture unreported food waste, local food production and, especially, subnational heterogeneity in dietary habits among different population subgroups. 6 40 The Global Burden of Diseases Study is another study that estimates global diet. While the 2010 and 2013 cycles of the Global Burden of Diseases Study collaborated with the GDD 2010 for their global dietary estimates, the Global Burden of Diseases Study subsequently internalised their processes for estimating diets. Based on available publications, that study now uses FAO FBS estimates, national product sales data and household budget surveys as primary data inputs, adjusting using a single global regression against individual-level diet surveys from 67 countries. 2 41 Because the relationship between national food availability and individual-level dietary intakes is known to vary significantly and jointly by age, sex, world region and other factors, 39 such methods will not sufficiently capture the heterogeneity in national and subnational intakes across diverse countries. Compared with GDD 2017 which includes data on 54 dietary factors, the Global Burden of Diseases Study currently reports on 8 foods, 1 beverage and 6 nutrients. We and others have collaborated with Gallup, Inc., in their planning for potential standardised polling on dietary intakes.
42 Such data, mostly likely based on short dietary questionnaires, will not capture the full diet nor estimate nutrients but will complement DHS data and provide

BMJ Global Health
useful new inputs to the GDD global dietary modelling efforts.
Overall, while gaps and heterogeneity in data sources are evident, the GDD 2017 represents, to our knowledge, the most comprehensive and updated data on global dietary intakes. To maximise its benefits as a public resource, the GDD 2017 is now available for free public download at http://www.globaldietarydatabase.org. Survey-level information and original data download weblinks are provided for all public surveys; and survey-level microdata or stratum-level aggregate data, as available, are provided for direct download for all non-public (CM) surveys granted consent for public sharing by the data owners (currently 81.9%). Importantly, the full modelled GDD 2017 data, which will leverage all surveys as primary data inputs together with survey- and country-level covariates to estimate the mean intakes of all 54 foods and nutrients within each of 240 subnational subgroups in 188 nations by year between 1990 and 2018, will also be available for free public download when finalised (estimated Spring of 2021). 9 The GDD is also collaborating with the FAO/WHO Global Individual Food consumption data Tool (FAO/WHO GIFT) project 43 and European Food Safety Authority to jointly facilitate harmonisation of dietary datasets on a global scale, public dissemination of methods and dietary datasets, and global collaboration and capacity development with dietary data owners worldwide. We hope the individual survey microdata, standardised dietary datasets, and global modelled data will each serve as critical public resources for researchers, health agencies and governments to evaluate national and subnational dietary intakes and trends, diet-related health burdens and disparities, dietary costs and affordability, strains and options for sustainability, corresponding policy and intervention priorities, and strengths and gaps in dietary surveillance.
For example, ongoing efforts to biofortify staple foods in low-income nations 44 45 will require data on national and subnational (eg, by age, sex, education, rural/urban residence) intakes of those staple foods, as well as on existing national and subnational intakes of the targeted nutrients from other foods, to effectively plan and implement biofortification.
Our investigation has several strengths. We performed systematic global searches for dietary surveys and employed standardised methods for survey and dietary factor identification, retrieval, processing, checking, standardisation and analysis. We searched multiple online databases of published literature and publicly available data, including region-specific databases, with extensive additional contacts of data owners to identify dietary surveys. We focused searches on data-sparse world regions to improve the characterisation of diet in these populations and identify remaining dietary surveillance needs. We collected individual-level microdata or aggregate stratified data by key demographic subgroups, providing critical information on dietary heterogeneity within nations. We collected data on 54 foods, beverages, and nutrients, providing the most complete available information on overall diets. To maximise consistency and comparability, we performed standardised data extraction and analysis including data quality assessments, standardised food definitions and units, energy adjustment of intakes to age-appropriate levels, and assessment of outliers and plausibility.
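As an illustration of the energy adjustment and stratum-level aggregation steps named above, the following is a minimal sketch only. It assumes a simple proportional scaling of each individual's intake to a reference energy level for their age group, which may differ from the procedure actually used in the GDD. The reference energy values, the field names (`fruit_g`, `energy_kcal`, `residence`) and the functions `energy_adjust` and `stratum_means` are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical reference energy levels (kcal/day) by age group.
# Illustrative values only; not the levels used by the GDD.
REFERENCE_KCAL = {"child": 1300.0, "adult": 2000.0}


def energy_adjust(intake, energy_kcal, age_group):
    """Scale an intake proportionally to the age group's reference energy."""
    if energy_kcal <= 0:
        raise ValueError("energy intake must be positive")
    return intake * REFERENCE_KCAL[age_group] / energy_kcal


def stratum_means(records):
    """Mean energy-adjusted fruit intake per (age_group, sex, residence)."""
    by_stratum = defaultdict(list)
    for r in records:
        key = (r["age_group"], r["sex"], r["residence"])
        adjusted = energy_adjust(r["fruit_g"], r["energy_kcal"], r["age_group"])
        by_stratum[key].append(adjusted)
    return {key: mean(values) for key, values in by_stratum.items()}


# Made-up individual-level records (g/day of fruit, kcal/day of energy).
records = [
    {"age_group": "adult", "sex": "F", "residence": "urban",
     "fruit_g": 150.0, "energy_kcal": 2500.0},
    {"age_group": "adult", "sex": "F", "residence": "urban",
     "fruit_g": 90.0, "energy_kcal": 1800.0},
    {"age_group": "child", "sex": "M", "residence": "rural",
     "fruit_g": 60.0, "energy_kcal": 1300.0},
]
means = stratum_means(records)
```

In this toy example, the two urban adult women are each rescaled to 2000 kcal/day before averaging, so differences in total energy intake do not drive the stratum mean. Other adjustment approaches (for example, regression residuals) would give different values.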
Limitations should be considered. Identified surveys used varying designs and instruments. GDD 2017 therefore followed a rigorous documentation process, detailing each survey's methods and processing steps, to better standardise the data. Due to the breadth and scope of data collected and standardised, we focused on food categories (eg, fruits) rather than individual foods (eg, apples). However, food categories have been most often assessed in relation to MCH and NCD outcomes, and individual-level microdata are available in GDD for future assessments of more granular dietary categories. Four identified food categories (poultry; dairy-based desserts; candy and sweeteners; and cakes, cookies and other baked goods) were excluded from our original assessment design; we hope to capture these categories in future iterations of the GDD. Not all potentially relevant dietary surveys could be retrieved, due to difficulties in accessing certain publicly available surveys and logistical challenges in contacting and engaging data owners.
In summary, the GDD 2017 identified, collated and standardised 1220 dietary surveys across 188 countries/territories globally, providing a public resource of data on 54 dietary factors in children and adults over time, nationally and subnationally by age, sex, urban/rural residence, education and pregnancy/lactation status; as well as identifying specific gaps for accelerated surveillance.

BMJ Global Health
Map disclaimer The depiction of boundaries on this map does not imply the expression of any opinion whatsoever on the part of BMJ (or any member of its group) concerning the legal status of any country, territory, jurisdiction or area or of its authorities. This map is provided without any warranty of any kind, either express or implied.
Competing interests VM reports a grant from the American Heart Association, outside the submitted work. PW reports research grants and contracts from the US Agency for International Development and UNICEF and personal fees from the Global Panel on Agriculture and Food Systems for Nutrition, outside the submitted work. RM reports grants from the National Institutes of Health, Nestle and Danone, and personal fees from Bunge and Development Initiatives, outside the submitted work. DM reports research funding from the National Institutes of Health and the Bill & Melinda Gates Foundation; personal fees from GOED, Bunge, Indigo Agriculture, Motif FoodWorks, Amarin, Acasti Pharma, Cleveland Clinic Foundation, America's Test Kitchen and Danone; scientific advisory board member for Brightseed, Day Two, Elysium Health, Filtricine, HumanCo and Tiny Organics; and chapter royalties from UpToDate, all outside the submitted work.
Patient consent for publication Not required.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement Data are available in a public, open access repository. To maximise its benefits as a public resource, the GDD 2017 is now available for free public download at http://www.globaldietarydatabase.org. Survey-level information and original data download weblinks are provided for all public surveys; and survey-level microdata or stratum-level aggregate data, as available, are provided for direct download for all non-public (CM) surveys granted consent for public sharing by the data owners.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
Open access This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.