Discussion
Improving the identification and care of small, high-risk babies is essential to reduce the global burden of neonatal morbidity and mortality. Given that GA and birth weight information is commonly missing in half of births in sub-Saharan Africa and South Asia,1 6 foot length measurement has emerged as a promising method to identify vulnerable infants born in community settings. In this systematic review and meta-analysis, we found that foot length thresholds of <7.7 cm in Asia and <7.9 cm in Africa classified LBW (<2500 g) infants with high sensitivity and lower specificity, and that foot length <7.3 cm had relatively high sensitivity and specificity (>80%) to classify infants <2000 g. Data assessing the accuracy of foot length for identifying preterm infants were limited by both the quality and the heterogeneity of the reference standard GA dating methods.
Different methods of foot length measurement have been described in the literature. Some investigators have used specialised or higher cost equipment, such as customised measuring boards15 22 42 47 or callipers.11 19 21 48 The most common method used across studies was measurement of the heel-to-hallux (or heel-to-longest digit) distance with a firm ruler. This method is low cost, easy to teach and feasible at the community level. Two studies compared the diagnostic accuracy of different measuring techniques (firm plastic ruler, measuring tape, footprint), and both found that the firm ruler had the highest predictive score for identifying both preterm and LBW newborns.29 32 Feasibility, training, standardisation and cost of equipment are key considerations for scalability in LMIC. In particular, standardising the landmarks used in foot length measurement is critical. The majority of studies that reported normative values for foot growth used the maximum heel-hallux distance,20–22 25 26 though several used the heel-to-longest toe distance.23 24 27 Standardised landmarks are essential both for consistency of foot length measurements and for comparisons between populations.
In the studies included in this review, data on the accuracy of foot length to identify preterm births were heterogeneous and generally of low quality. Only two studies used an early ultrasound-based reference standard GA,28 33 while most relied on clinical examination to determine GA, which estimates GA within ±4 weeks of ultrasound dating.49 The Eregie examination was commonly used in Africa. This simplified examination includes newborn anthropometrics (head circumference and mid-upper arm circumference), and thus, dating is strongly influenced by newborn size.50 In a systematic review, of three studies assessing the diagnostic accuracy of the Eregie examination, only one used an ultrasound reference; it found that the Eregie examination dated pregnancies within ±3.5 weeks of ultrasound dating.28 49 In this review, diagnostic accuracy among the Asian studies varied widely (sensitivity: 64%–98%, specificity: 35%–94%), which may be due to the variation in reference standard GA dating methods, or potentially to the challenge of discriminating SGA from preterm infants in settings with a high prevalence of fetal growth restriction. In South Asia, this prevalence is as high as 30%. In addition, neonatal clinical examinations (used to determine reference GA in five of the eight Asian studies) have been shown to systematically underestimate GA among growth-restricted infants.49
Foot length was a reasonable proxy for infant size to identify LBW infants. The foot length thresholds to classify LBW were lower in Asia, where babies are smaller and SGA is more prevalent.1 51 For studies in Asia, a foot length cut-off of <7.7 cm identified <2500 g infants with pooled sensitivity of 87.6% and specificity of 70.9%; for identifying <2000 g infants in Asia, a foot length cut-off of <7.3 cm had 82.1% sensitivity and 82.1% specificity. In Africa, a foot length cut-off of <7.9 cm had a pooled sensitivity of 92.0% and specificity of 71.9% to identify <2500 g infants. The balance of sensitivity and specificity is a critical consideration for health systems, which must weigh the increased demand generated by identifying and referring more high-risk babies against the supply of available services and the risk of overburdening those services. Based on our data, if a community-based foot length screening programme were implemented to refer LBW (<2500 g) infants in South Asia, where the prevalence of LBW is 30%,1 in a population of 100 000 newborns, there would be 26 400 LBW infants correctly identified, 3600 LBW babies missed and 20 300 non-LBW babies over-referred (false positives). Approximately 57% of babies referred to health facilities would be truly LBW, and 93% of babies with foot length >7.8 cm would not be LBW. In sub-Saharan Africa, where the prevalence of LBW (<2500 g) is 16.4%,1 in a population of 100 000 newborns, 15 088 LBW babies would be correctly identified, 1312 LBW babies would be missed and 23 408 non-LBW babies would be over-referred. Approximately 40% of referred babies would be LBW, and 98% of babies with foot length >8.0 cm would not be LBW. The local health system and public health implications should be considered before the implementation of any such screening programme.
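The referral projections above follow from standard diagnostic accuracy arithmetic and can be reproduced with a short calculation. The sketch below is illustrative only: the function name `screening_outcomes` and its output keys are hypothetical, and the inputs assume that sensitivity and specificity were rounded to whole percentages (88% and 71% for South Asia, 92% and 72% for sub-Saharan Africa), which is the rounding that reproduces the counts quoted in the text.

```python
def screening_outcomes(population, prevalence, sensitivity, specificity):
    """Expected counts and predictive values for a binary screening test."""
    cases = population * prevalence        # infants who are truly LBW
    non_cases = population - cases         # infants who are not LBW
    true_pos = cases * sensitivity         # LBW infants correctly referred
    false_neg = cases - true_pos           # LBW infants missed
    true_neg = non_cases * specificity     # non-LBW infants correctly not referred
    false_pos = non_cases - true_neg       # non-LBW infants over-referred
    return {
        "true_pos": round(true_pos),
        "false_neg": round(false_neg),
        "false_pos": round(false_pos),
        "true_neg": round(true_neg),
        # Share of referred infants who are truly LBW
        "ppv": true_pos / (true_pos + false_pos),
        # Share of non-referred infants who are truly not LBW
        "npv": true_neg / (true_neg + false_neg),
    }


# Assumed inputs: prevalence from the text, accuracy rounded to whole percentages.
south_asia = screening_outcomes(100_000, 0.30, 0.88, 0.71)
africa = screening_outcomes(100_000, 0.164, 0.92, 0.72)

for region, result in (("South Asia", south_asia), ("sub-Saharan Africa", africa)):
    print(region, result)
```

Using the unrounded pooled estimates instead (for example, 87.6% sensitivity and 70.9% specificity in Asia) shifts the counts slightly, to roughly 26 280 correctly identified and 20 370 over-referred infants per 100 000, so the projections are best read as approximate.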
Training and standardisation are important considerations for programmatic implementation in LMIC. Intra-rater and inter-rater agreement was generally high for neonatal foot length measurement. Foot length measurement is advantageous because it can be performed easily with minimal medical training. Two studies assessed inter-rater agreement between a healthcare provider or researcher and a lay community health worker or caretaker, a comparison of particular programmatic relevance.44 46 In Tanzania, Marchant et al reported that community volunteers systematically undermeasured foot length compared with research staff and therefore overestimated the number of infants needing special care in the community.46 Reliability across a variety of users and continued quality assurance of measurements are important considerations for the potential scale-up of this tool in programmatic and research settings.
There are several important limitations to this review. The overall quality of studies included in the review was low, with limitations in the quality of reference standard GA data and in reporting, as well as selection bias. There is a need for more studies with high-quality ultrasound dating or best obstetric estimate as the reference standard. In addition, we did not place a date restriction on our search, as many of the original foot length articles were published in the 1970s. However, all studies that reported diagnostic accuracy data and were included in the meta-analyses were published after 2000, with the majority published after 2010. We conducted pooled analysis by major WHO world region, though countries within these regions are heterogeneous and optimal foot length cut-offs may vary by country. Finally, we limited the scope of this review to diagnostic accuracy only; it would be valuable to assess the effect of foot length measurement as a screening tool on referrals, care-seeking behaviours and infant health outcomes. We are aware of an upcoming study in Nepal52 that will assess these outcomes.