Original article
Validity of US norms for the Bayley Scales of Infant Development-III in Malawian children

https://doi.org/10.1016/j.ejpn.2013.11.011

Abstract

Objective

Most psychometric tests originate from Europe and North America and have not been validated in other populations. We assessed the validity of United States (US)-based norms for the Bayley Scales of Infant and Toddler Development-III (BSID-III), a neurodevelopmental tool developed for and commonly used in the US, in Malawian children.

Methods

We constructed BSID-III norms for the cognitive, fine motor (FM), gross motor (GM), expressive communication (EC) and receptive communication (RC) subtests using 5173 test scores from 167 healthy Malawian children. Norms were generated using Generalized Additive Models for Location, Scale and Shape, with age modeled continuously. Standard z-scores were used to classify neurodevelopmental delay. Weighted kappa statistics were used to compare the classification of neurological development using US-based and Malawian norms.

Results

For all subtests, the mean raw scores in Malawian children were higher than the US normative scores at younger ages (approximately <6 months), after which the mean curves crossed and the US normative mean exceeded that of the Malawian sample. The age at which the curves crossed differed by subtest. Weighted kappa statistics for agreement between US and Malawian norms were 0.45 for cognitive, 0.48 for FM, 0.57 for GM, 0.50 for EC, and 0.44 for RC.

Conclusion

We demonstrate that population reference curves for the BSID-III differ depending on the origin of the population. Reliance on US norm-based standardized scores resulted in misclassification of the neurological development of Malawian children, with the greatest potential for bias in the measurement of cognitive and language skills.

Introduction

It has been estimated that more than 200 million children under 5 years of age are not reaching their full potential for growth, cognition, or socio-emotional development due to risk factors for neurological delay.1 In the African setting, poverty, human immunodeficiency virus (HIV) infection, malaria and malnutrition have been shown to adversely affect neurological development.1, 2, 3, 4, 5 While it is critical to assess whether children develop appropriately, few national statistics from developing countries exist and neurodevelopmental delay in these settings remains understudied. In addition to the paucity of data on pediatric neurological development in Africa, most extant data were collected using assessments developed in Europe or North America. This is, at least in part, due to the scarcity of psychometric tools to measure neurological development specifically designed for settings outside of Europe and North America.

Diagnosis of a child's neurological development requires comparison to a normative reference population by transforming raw scores to percentile ranks or standardized (“scaled”) scores.6 Normative samples are usually cross-sectional, drawn from healthy children who represent the target population. For example, the Bayley Scales of Infant and Toddler Development, Third Edition (BSID-III), is widely used in international child development research. The normative sample for the BSID-III included 1700 children stratified by age, sex, parental education, race and geographic region in the United States (US).7 Data from the normative sample were then used to construct norms that represent the distribution of test performance in the US population. Test norms are necessary to identify delay; without the ability to compare a child's performance to what is considered “normal”, raw scores have limited value for research or clinical practice.
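As a minimal sketch of the standardization step described above, the following Python snippet converts a raw score to a z-score and a percentile rank against a normative mean and standard deviation for the child's age. The function names and all numeric values are hypothetical; they are not taken from the BSID-III manual or from this study. The percentile calculation also assumes that scores are approximately normally distributed at a given age.

```python
import math


def z_score(raw, norm_mean, norm_sd):
    """Standardize a raw score against the normative mean and SD for the child's age."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (raw - norm_mean) / norm_sd


def percentile_rank(z):
    """Percentile rank of a z-score under a standard normal distribution."""
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical example: one raw score evaluated against two different
# (made-up) sets of norms for the same age.
raw = 40
z_norm_a = z_score(raw, norm_mean=46.0, norm_sd=4.0)
z_norm_b = z_score(raw, norm_mean=42.0, norm_sd=4.0)

print(f"norm A: z = {z_norm_a:.2f}, percentile = {percentile_rank(z_norm_a):.1f}")
print(f"norm B: z = {z_norm_b:.2f}, percentile = {percentile_rank(z_norm_b):.1f}")
```

The same raw score yields a different standardized score under each reference population, which is why the choice of norms matters when classifying delay in an individual child.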

Several authors have raised concerns about the use of a psychometric tool in a population other than the one in which the tool was developed.8, 9, 10 To address possible cross-cultural bias when using developmental assessments in populations other than those for whom norms were developed, three approaches have been commonly used. First, an entirely new test can be developed and normed for a specific population.11 Successful generation of a new test involves an inter-disciplinary research team, an adequate representative sample for testing items and test cohesion, and the concurrent development of norms or standards that represent typical development.12 For example, the Kilifi Developmental Inventory is a new tool designed to monitor and describe the development of at-risk children in resource-limited settings in Kenya.11 A new test ensures a culturally appropriate psychometric tool but its development is resource-intensive and prohibits use beyond its target population, thus limiting comparability of findings with other tests and across populations.

An alternative to the development of a new test is the adaptation of existing tools for use in new populations. Published guidelines have been developed for this process, which typically involves translation of test materials and modification of test items inappropriate for the local context to preserve consistency with the constructs measured, followed by a process of iterative adaptation and testing of the assessment tool.12, 13, 14 For example, Nampijja et al. successfully adapted several Western measures to assess cognition in 5-year-old semi-urban Ugandan children.15 Gladstone et al. employed a hybrid approach in which existing test items were modified for local context and combined with new test items designed specifically for the population under study.16 While an adapted test improves cultural appropriateness, it is also resource-intensive and does not permit comparability between populations. Furthermore, although adaptation of existing assessment tools may reduce bias in test items measuring specific constructs, there remains a risk of bias unless these adaptations are accompanied by the creation of local norms, because children in one setting may perform better, on average, than children in another due to cultural differences in child rearing and access to early education.8

Finally, tests developed in Europe and North America have been used in other populations in epidemiologic studies of specific exposures without adaptation by employing a healthy control group for comparison. For example, controls were enrolled in a study comparing the neurodevelopment of HIV-infected versus HIV-exposed uninfected children in the Democratic Republic of Congo.17 Healthy children who serve as a control group can be used to reduce norm-related bias when assessing group-level differences in developmental delay. Existing tests without adaptation can also be used to identify factors associated with neurodevelopmental scores within one population, as was done in a study of the effect of stunting and wasting on the neurodevelopment of Tanzanian children.18 However, while use of controls allows comparisons of scores between groups, this approach does not allow the unbiased diagnosis of delay for individual children.

We present an alternative approach to the challenge of neurodevelopmental assessment of children in resource-poor countries by developing new population norms. We applied this method to the BSID-III using data collected in healthy Malawian children aged 10 weeks to 30 months and compared the classification of neurological delay using the Malawian and US norms.
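The comparison of classifications under two sets of norms can be illustrated with a weighted kappa statistic. The sketch below is an assumption-laden illustration, not the authors' analysis code: it uses linear disagreement weights and three ordered categories (0 = no delay, 1 = moderate delay, 2 = severe delay), neither of which is specified in this excerpt, and the example ratings are invented.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories):
    """Linear-weighted Cohen's kappa for two ordinal classifications.

    Ratings are integer category codes in range(n_categories).
    """
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("ratings must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("need at least two categories")
    k = n_categories

    # Observed joint proportions of (classification A, classification B).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n

    # Marginal proportions for each classification.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Linear disagreement weight: 0 on the diagonal, 1 for the most distant pair.
    def weight(i, j):
        return abs(i - j) / (k - 1)

    observed_disagreement = sum(
        weight(i, j) * observed[i][j] for i in range(k) for j in range(k)
    )
    expected_disagreement = sum(
        weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k)
    )
    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - observed_disagreement / expected_disagreement


# Invented classifications of six children under two hypothetical sets of norms.
class_norm_a = [0, 0, 1, 1, 2, 2]
class_norm_b = [0, 0, 0, 1, 1, 2]
kappa = weighted_kappa(class_norm_a, class_norm_b, n_categories=3)
print(f"weighted kappa = {kappa:.3f}")
```

A value of 1 indicates perfect agreement between the two classifications, and a value near 0 indicates agreement no better than expected by chance given the marginal distributions.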

Section snippets

Study site and selection of participants

From March 2008 to December 2009, 167 healthy children were enrolled as controls for a cohort study of HIV infection and neurological development. HIV-negative mothers age ≥15 years without a history of alcohol or substance abuse were randomly selected among women attending prenatal care visits at two primary care facilities in Blantyre, Malawi. Their infants were enrolled at age 10 and 14 weeks if born without congenital abnormalities and free of severe disease at enrollment. Children were

Study population characteristics

A total of 167 children contributed 5173 unique BSID-III sub-tests. The majority (70%) completed 12 months follow-up, most (59%) completed 24 months follow-up, and 50% completed assessments at 30 months of age. Children who were lost to follow-up contributed to the analysis through their last study visit. Reasons for incomplete follow-up were relocation or withdrawal (n = 73), death (n = 6), and administrative censoring at end of study (n = 7). Except for a higher proportion of mothers being

Discussion

Despite the interest in examining effects of biological and environmental factors on cognitive functioning of children in low- and middle-income countries, many psychometric tests originate from Europe and North America and have not been validated in other populations. We present the creation of locally normed population reference curves for a neurodevelopmental tool (BSID-III) developed for and commonly used in the US. Our data show that the BSID-III norms developed from test scores of Malawian

Financial disclosure

All authors have no financial relationships relevant to this article.

Conflict of interest

All authors have no conflict of interest to disclose.

Acknowledgments

Funding was provided by the United States National Institutes of Health/National Institute of Child Health and Human Development, via grant R01 HD053216. MLW is funded through a core program grant from the Wellcome Trust, UK. We acknowledge Jill Lebov, Caroline Hexdall and Daniel Lowe for their contributions to the study. We also thank the participants and their families for their participation.

References (33)

  • S. Grantham-McGregor et al.

    Developmental potential in the first 5 years for children in developing countries

    Lancet

    (2007)
  • C.M. McDonald et al.

    Stunting and wasting are associated with poorer psychomotor and mental development in HIV-exposed Tanzanian infants

    J Nutr

    (2013)
  • S. Grantham-McGregor et al.

    Review of the evidence linking protein and energy to mental development

    Public Health Nutr

    (2005)
  • K. Le Doare et al.

    Neurodevelopment in children born to HIV-infected mothers by infection and treatment status

    Pediatrics

    (2012)
  • R. Idro et al.

    Cerebral malaria: mechanisms of brain injury and strategies for improved neuro-cognitive outcome

    Pediatr Res

    (2010)
  • S.M. Grantham-McGregor et al.

    Undernutrition and mental development

    (2001)
  • D. Cicchetti

    Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology

    Psychol Assess

    (1994)
  • N. Bayley

    Bayley Scales of Infant and Toddler Development

    (2006)
  • K.F. Geisinger

    Cross-cultural normative assessment: translation and adaptation issues influencing the normative interpretation of assessment instruments

    Psychol Assess

    (1994)
  • F.J.R. Van de Vijver et al.

    Translating tests: some practical guidelines

    Eur Psychol

    (1996)
  • Y.H. Poortinga

    Cultural bias in assessment: historical and thematic issues

    Eur J Psychol Assess

    (1995)
  • A. Abubakar et al.

    Assessing developmental outcomes in children from Kilifi, Kenya, following prophylaxis for seizures in cerebral malaria

    J Health Psychol

    (2007)
  • L.C.H. Fernald et al.

    Examining early child development in low-income countries: a toolkit for the assessment of children in the first five years of life

    (2009)
  • R.K. Hambleton

    Guidelines for adapting educational and psychological tests: a progress report

    Eur J Psychol Assess

    (1994)
  • R.K. Hambleton

    Issues, designs, and technical guidelines for adapting tests into multiple languages and cultures

  • M. Nampijja et al.

    Adaptation of western measures of cognition for assessing 5-year-old semi-urban Ugandan children

    Br J Educ Psychol

    (2010)