Using machine learning to target neonatal and infant mortality

  • Blog Post Date 17 May, 2022

India accounts for one-fourth of the world’s neonatal mortalities, and this has likely been exacerbated by the Covid-19 pandemic, as lockdowns restricted access to critical antenatal and postnatal care. Analysing data from the 2011-12 India Human Development Survey-II (IHDS-II), this article uses machine learning to build predictive models of neonatal and infant mortality, identify the early warning signs, and thereby identify those at high risk of neonatal and infant mortality.

Under-five mortality, infant mortality, and neonatal mortality are key indicators of economic development[1]. While India has made considerable progress in reducing under-five and infant mortality over the last few decades, this progress did not result in a proportional reduction in neonatal mortality, which fell at a much slower pace. Data from four rounds (1992-93, 1998-99, 2005-06, 2015-16) of the National Family Health Survey (NFHS) show that neonatal mortality as a proportion of infant mortality and of under-five mortality has in fact risen steadily. Neonatal mortality alone now accounts for three-quarters of infant mortality (Figure 1).

Figure 1. Neonatal, infant, and under-five mortality rate across four rounds of the NFHS

Note: U5MR stands for under-five mortality, IMR stands for infant mortality, and NMR stands for neonatal mortality.

This indicates that special focus should be given to reducing neonatal mortality, over and above infant or child mortality. India accounts for one-fifth of the total births around the world, and one-fourth of the world’s neonatal mortalities (Sankar et al. 2016). This high neonatal mortality has been exacerbated by the Covid-19 pandemic and its lockdowns, which restricted access to critical antenatal and postnatal care. Decades of progress have unravelled in months – the true extent of which will become more apparent in the near future (Gates Foundation 2020).

However, as Dandona and Kumar (2019) point out, widely used datasets such as the NFHS do not collect data on several known risk factors for neonatal mortality – such as complications during pregnancy and delivery – or on the availability of and access to commonly recommended interventions, making them insufficient for policymaking. Although the explicit target under the Sustainable Development Goals is to reduce neonatal mortality to 12 deaths per 1,000 live births by 2030, India has adopted the more ambitious goal of a single-digit neonatal mortality rate through the “India Newborn Action Plan” (INAP), modelled on the global “Every Newborn Action Plan”. The INAP recognises delays in identifying danger signs to newborn health as a crucial bottleneck in healthcare delivery, and recommends training community health workers to identify early danger signs for neonatal and infant mortality. However, as Di Giorgio et al. (2020) show, poor clinical knowledge among healthcare workers about common conditions that lead to child mortality is a crucial constraint in developing economies – even greater than constraints on the availability of drugs or healthcare workers.

Using machine learning to target neonatal and infant mortality

In a recent study (Brahma and Mukherjee 2022), we use the 2011-12 India Human Development Survey-II (IHDS-II) to develop early warning systems for neonatal and infant mortality. We use multiple parametric and non-parametric machine learning (ML) algorithms to build predictive models[2] of the incidence of neonatal and infant mortality. Based on the consensus of the results from these algorithms, we identify several early warning signs. These signs can be used to screen mothers and infants, and to identify those at high risk of neonatal mortality. Screening for them does not require advanced medical knowledge and can easily be carried out by community healthcare workers.
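The article does not spell out how the consensus across algorithms is formed. As a minimal sketch, assuming one simple rule (keep only the predictors that rank in the top k under every algorithm), it could look like the following. The function name, the rule itself, and the example rankings are hypothetical and are not results from the study.

```python
def consensus_predictors(rankings, top_k):
    """Return predictors that appear in the top_k of every ranking.

    rankings: dict mapping an algorithm name to a list of predictor
    names ordered from most to least important.
    """
    top_sets = [set(order[:top_k]) for order in rankings.values()]
    shared = set.intersection(*top_sets)

    # Order the shared predictors by their average position across algorithms.
    def mean_rank(name):
        return sum(order.index(name) for order in rankings.values()) / len(rankings)

    return sorted(shared, key=mean_rank)


# Hypothetical rankings, for illustration only (not results from the study).
example_rankings = {
    "lasso": ["prior_deaths", "first_born", "income", "birth_size"],
    "random_forest": ["prior_deaths", "income", "first_born", "mother_age"],
    "boosted_trees": ["first_born", "prior_deaths", "mother_age", "income"],
}

print(consensus_predictors(example_rankings, top_k=3))
# ['prior_deaths', 'first_born']
```

A stricter or looser rule (for example, a majority of algorithms rather than all of them) would change which predictors survive, so the choice of rule is itself an assumption worth stating.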

However, an econometric challenge in modelling neonatal and infant mortality is the issue of ‘rare events’: in household survey data, neonatal and infant deaths are fewer in number than the newborns who survive. Traditional econometric and ML techniques tend to underpredict these events; that is, they fail to identify most of these deaths. We combine boosted classification trees with two resampling techniques[3] to achieve prediction accuracy comparable to that of the easier-to-interpret traditional ML techniques.
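To make the rare-events problem concrete, here is a minimal sketch in plain Python. It assumes a hypothetical sample with a 3% event rate and two made-up numeric features. It first shows why a model can look accurate while missing every death, and then shows only the resampling step: random undersampling of the majority class and a simplified SMOTE-style interpolation. SMOTEBoost and RUSBoost apply resampling of this kind inside each round of boosting, which is not shown here, and real SMOTE interpolates towards nearest neighbours rather than a random partner.

```python
import random


def accuracy(y_true, y_pred):
    """Share of all observations that are classified correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def recall(y_true, y_pred):
    """Share of actual deaths (label 1) that the model flags."""
    flagged = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return flagged / sum(y_true)


def random_undersample(rows, labels, rng):
    """Keep all minority rows and an equally sized random subset of majority rows."""
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    kept = minority + rng.sample(majority, len(minority))
    return [rows[i] for i in kept], [labels[i] for i in kept]


def smote_like_oversample(rows, labels, n_new, rng):
    """Add n_new synthetic minority rows on the line between two minority rows.

    Simplification: the partner row is chosen at random, whereas SMOTE
    chooses it among the k nearest minority neighbours.
    """
    minority = [r for r, y in zip(rows, labels) if y == 1]
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)
        t = rng.random()
        synthetic.append([x + t * (z - x) for x, z in zip(a, b)])
    return rows + synthetic, labels + [1] * n_new


rng = random.Random(0)
labels = [1] * 30 + [0] * 970  # hypothetical 3% event rate
rows = [[rng.gauss(y, 1.0), rng.gauss(0.0, 1.0)] for y in labels]

# A "model" that predicts survival for everyone looks accurate but finds no deaths.
always_survive = [0] * len(labels)
print(accuracy(labels, always_survive))  # 0.97
print(recall(labels, always_survive))    # 0.0

rus_rows, rus_labels = random_undersample(rows, labels, rng)
smote_rows, smote_labels = smote_like_oversample(rows, labels, 940, rng)
print(sum(rus_labels), len(rus_labels))      # 30 60
print(sum(smote_labels), len(smote_labels))  # 970 1940
```

Both resampled datasets are balanced, so a classifier trained on them can no longer score well simply by predicting survival for everyone.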

We use LASSO (Least Absolute Shrinkage and Selection Operator)[4] to select the most important predictors of neonatal and infant mortality from the full set of candidate variables. We use non-parametric ML algorithms such as random forests and boosted classification trees to rank the predictors based on commonly used measures of variable importance. The consensus of the results from these algorithms allows us to identify characteristics that comprise early warning signs for high-mortality risk groups (Table 1).
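The way LASSO selects variables can be illustrated with a small sketch. It assumes a squared-error loss with no intercept and uses cyclic coordinate descent with soft-thresholding on a tiny made-up design; a binary outcome such as mortality would more naturally use a logistic loss, and the article does not state which variant or software the study used. The function names, the data, and the penalty value `lam` are all hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; return 0.0 if it is within the threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the residual that excludes feature j
            z = 0.0    # sum of squares of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta


# Toy design with three mutually orthogonal predictors (illustration only).
X = [
    [1.0, 1.0, 1.0],
    [1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
# The outcome depends strongly on predictor 0, weakly on predictor 1, not at all on predictor 2.
y = [2.0 * row[0] + 0.1 * row[1] for row in X]

beta = lasso_coordinate_descent(X, y, lam=0.5)
print([round(b, 6) for b in beta])  # [1.5, 0.0, 0.0]
```

The weak and the irrelevant predictors receive coefficients of exactly zero, which is the sense in which LASSO "selects" variables, while the strong predictor is kept but shrunk from 2.0 to 1.5.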

We find that the risk of neonatal and infant mortality is higher for newborns whose mothers have previously lost a child; who are first-born; who are born in poorer households; who are born with lower-than-average weight; whose mothers experienced complications during delivery (delayed placenta, excessive bleeding, being stuck during delivery, etc.); and whose mothers did not receive cash under the Janani Suraksha Yojana (JSY). Among infants, girls and those who did not receive essential vaccinations are at a higher risk of mortality. Infants who receive early healthcare are at a lower risk of mortality[5]. Additionally, we employ Group LASSO to assess the relative importance of groups of predictors[6]. We find that the mother’s demographic and biological characteristics are most important, followed by medical care, household characteristics, and finally the biological characteristics of the newborn.
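The article does not give the Group LASSO objective. For reference, a standard textbook formulation is shown here, where the predictors are partitioned into G groups (four in this setting), group g contains p_g coefficients collected in the vector β_g, and the loss ℓ is assumed to be something like the negative log-likelihood of a logistic model for the mortality outcome. The choice of loss and of the group weights is an assumption, not a detail taken from the study.

```latex
\hat{\beta} \;=\; \arg\min_{\beta} \;\; \ell(\beta) \;+\; \lambda \sum_{g=1}^{G} \sqrt{p_g}\, \lVert \beta_g \rVert_2
```

Because the penalty uses the unsquared Euclidean norm of each group, a whole group of coefficients is set to zero or kept together, which is what makes it possible to compare groups of predictors rather than individual variables.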

Table 1. Early warning signs of neonatal and infant mortality

Neonatal mortality                   | Infant mortality                     | Infant mortality (continued)
-------------------------------------|--------------------------------------|--------------------------------------------
Prior deaths                         | Prior deaths                         | Polio vaccine
Total births                         | Total births                         | BCG (bacille Calmette-Guerin) vaccine
First-born                           | First-born                           | DPT (diphtheria-pertussis-tetanus) vaccine
Income                               | Income                               | Measles vaccine
JSY cash transfers                   | JSY cash transfers                   | Newborn care
Mother’s age at time of giving birth | Mother’s age at time of giving birth | Female
Complications during delivery        | Complications during delivery        | Folic acid supplement tablets
Perceived birth size[7]              | Perceived birth size                 | Folic acid supplement tablets

Conclusion

Identifying the early warning signs would help lay out a clear roadmap for implementing the action points discussed in the India Newborn Action Plan. Achieving the ambitious goal of single-digit neonatal mortality within a decade will need aggressive and targeted policies. The early warning indicators we identify do not require advanced clinical knowledge and can easily serve as an important first triage in identifying high-risk individuals.

Notes:

  1. Under-five mortality, also known as child mortality, refers to the death of children under age five; infant mortality is death within the first year of birth; and neonatal mortality is death within the first 28 days of birth.
  2. The models incorporate a range of socioeconomic characteristics of the households; biological and demographic characteristics of the mothers and newborns; incidences of complications during pregnancy, delivery, and post-partum; the nature of the healthcare received by expectant mothers and newborns; essential immunisations received (for infants); and cash received under Janani Suraksha Yojana (JSY), a conditional cash transfer maternity benefits scheme.
  3. We use the SMOTEBoost and RUSBoost resampling techniques. SMOTEBoost is an upsampling technique that creates synthetic observations of mortality, giving the ML algorithm more instances to learn from. RUSBoost is a downsampling technique that drops a random subset of the observations of surviving newborns, removing some of the distractions for the algorithm.
  4. LASSO is a regression analysis method that performs variable selection and regularisation to improve the prediction accuracy and interpretability of the resulting statistical model.
  5. Our results also hold up to several robustness checks.
  6. These groups are household characteristics, demographic and biological characteristics, medical care, and newborn characteristics.
  7. Mothers were asked how they perceived the size of their newborns at birth, as a way of assessing health status. Unlike actual measurements of weight and height at birth, this is an ‘eyeball estimate’, that is, the mother’s perception of her infant’s health.
