Self-enrolment antenatal health promotion data as an adjunct to maternal clinical information systems in the Western Cape Province of South Africa

Information systems designed to support health promotion in pregnancy, such as the MomConnect programme, are potential sources of clinical information which can be used to identify pregnancies prospectively and early on. In this paper we demonstrate the feasibility and value of linking records collected through the MomConnect programme, to an emergent province-wide health information exchange in the Western Cape Province of South Africa, which already enumerates pregnancies from a range of other clinical data sources. MomConnect registrations were linked to pregnant women known to the public health services using the limited identifiers collected by MomConnect. Three-quarters of MomConnect registrations could be linked to existing pregnant women, decreasing over time as recording of the national identifier decreased. The MomConnect records were usually the first evidence of pregnancy in pregnancies which were subsequently confirmed by other sources. Those at lower risk of adverse pregnancy outcomes were more likely to register. In some cases, MomConnect was the only evidence of pregnancy for a patient. In addition, the MomConnect records provided gestational age information and new and more recently updated contact numbers to the existing contact registry. The pilot integration of the data in the Western Cape Province of South Africa demonstrates how a client-facing system can augment clinical information systems, especially in contexts where electronic medical records are not widely available.


INTRODUCTION
Prospective antenatal identification of the fact and clinical characteristics of pregnancies is an important health information system goal. It enables monitoring of antenatal risk screening and of the uptake of appropriate interventions, with the opportunity to potentially intervene in time to impact outcomes. Full enumeration of pregnancies and associated birth outcomes at person level, even if not resulting in interventions, is further an important part of health system intelligence, enabling much more detailed exploration of the maternal and neonatal services than is possible from aggregate data as traditionally reported through district health information systems. 1 A mobile health messaging service and helpdesk for South African mothers (MomConnect) was launched as a national initiative in 2014 with the dual intent of providing a

Key questions
What is already known?
► Prospective identification of pregnancies enables monitoring of antenatal risk screening and the uptake of interventions in time to impact outcomes.
► Enumerating pregnancies and outcomes at person-level enables a more detailed exploration of maternal and neonatal health services than what is possible from traditionally reported aggregate data.
► The MomConnect programme is an information system designed to support health promotion and is a potential source of clinical information that can be integrated with data traditionally collected by health facilities to create a comprehensive maternal and neonatal care cascade.
What are the new findings?
► The pilot integration of MomConnect data with existing clinical data in the Western Cape Province of South Africa demonstrates how a client-facing system can augment clinical information systems.
► Linkage was successful in three-quarters of registrations in spite of the limited identifying data available on which to link.
► Those at lower risk of adverse pregnancy outcomes were more likely to register for MomConnect.

BMJ Global Health
platform for health promotion through supportive text messaging to mobile phones of pregnant women and of establishing a registry of pregnancies. 2 3 Information systems designed to support health promotion through self or facility-based enrolment, or a combination, such as the MomConnect programme in South Africa, are potential sources of clinical information which can be integrated with data from traditional facility-based information systems as part of a comprehensive maternal-neonatal care cascade. Such a cascade can be used for direct service delivery support and for health system intelligence. Data derived from these client-facing information systems, often based on mobile device interfaces, can be of particular value in settings where clinical data are not routinely digitised at health facilities, as is the case in many resource-limited settings where records are paper-based and retained by patients themselves (common for maternity case records) or at facilities.
The aim of this cross-sectional analysis is to demonstrate the feasibility and value of linking records collected through the MomConnect programme for maternal cellphone-based health promotion messaging to an emergent province-wide health information exchange in the Western Cape Province of South Africa. We describe the characteristics of provincial public sector patients enrolling in MomConnect relative to all pregnant women, determine the linkage success and associations given the limited data available on which to link, estimate the incremental contribution to consolidated clinical and administrative data on pregnancy and explore outcomes for linked patients.
THE WESTERN CAPE SETTING: A PROVINCE-WIDE HEALTH INFORMATION EXCHANGE
Within the Western Cape Province of South Africa, the Western Cape Government Health (WCGH) Department employs a variety of electronic platforms for routine delivery of healthcare. These platforms include hospital and primary care administrative systems; facility pharmacy and prepackaged chronic drug dispensing systems; and laboratory records. Routinely collected clinical information regarding maternal and child health is restricted to key indicators, such as antenatal visits and immunisations, and is largely reported at aggregate level through the district health information system for the purpose of monitoring and evaluation as well as resource planning. 1 Although these data are useful for analysing aggregate outcomes, the prospects for patient-level interventions and detailed analyses are limited. The WCGH has established a unique patient identifier which is also used as the folder number within each facility. This has been gradually implemented through a uniform hospital information system in all 52 hospitals over the past two decades and was extended to primary care clinics beginning in 2007. This unique patient identifier enhances the integration of health data in a patient-level health information exchange, the Provincial Health Data Centre (PHDC). The data are uploaded daily from their source systems and linked to individuals in the patient master index (PMI) based on this and other identifiers. In practice, some individuals have multiple folder numbers, for which de-duplication algorithms are used to identify duplicate folder numbers that most likely represent one individual.
Data are further enriched on uptake by identifying a variety of common health episodes such as HIV and pregnancy. Multiple types of evidence are collated to ascertain pregnancy, including laboratory, pharmacy and facility visit data. While some data unequivocally indicate pregnancy, such as birth records, inpatient International Statistical Classification of Diseases and Related Health Problems 10th Revision (ICD10) admission and procedure codes that specify pregnancy, antenatal laboratory screening tests (such as for rhesus antibodies) and dispensed drugs specific to termination of pregnancy, others are only suggestive or provide supporting evidence of ongoing care for an already established episode. Examples of supporting evidence include patient encounters at maternity wards or non-specific pregnancy-related dispensed drugs such as folate and iron supplements. The nuances of evidence strength and collation of multiple types of evidence are used to build confidence in the episodes.
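The grading of evidence described above can be illustrated with a minimal sketch. The evidence categories follow the examples given in the text, but the string labels, the function name `episode_confidence` and the thresholds for "moderate" and "low" confidence are assumptions made for illustration only; they are not the PHDC's actual rules.

```python
from typing import Iterable

# Evidence types named in the text; the labels themselves are hypothetical.
DEFINITIVE = frozenset({
    "birth_record",
    "icd10_pregnancy_code",
    "antenatal_lab_screen",   # e.g. rhesus antibody testing
    "termination_drug",
})
SUPPORTING = frozenset({
    "maternity_ward_encounter",
    "folate_iron_dispensing",
})


def episode_confidence(evidence: Iterable[str]) -> str:
    """Grade a candidate pregnancy episode from its collated evidence.

    Assumed rule: any definitive item gives "high" confidence; otherwise
    confidence grows with the number of distinct supporting items.
    """
    kinds = set(evidence)
    if kinds & DEFINITIVE:
        return "high"
    supporting = len(kinds & SUPPORTING)
    if supporting >= 2:
        return "moderate"
    if supporting == 1:
        return "low"
    return "none"
```

Under these assumed thresholds, a single folate prescription on its own would be graded "low", while the same prescription alongside an antenatal laboratory screen would be graded "high".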
In the South African setting, HIV-infected pregnant women not yet on antiretroviral therapy (ART) and women who have not been tested for HIV are important risk groups to identify, in addition to those with traditional obstetric risks such as maternal age, existing medical conditions and comorbidities, and conditions of pregnancy such as high parity, gestational diabetes and eclampsia. 4 HIV status is determined from many different electronic sources, including laboratory tests for HIV, CD4 count and viral load, pharmacy records of dispensing of HIV-specific medication and inclusion in the HIV-specific TIER.Net database. 5

Key questions
What do the new findings imply?
► Encouraging pregnant women to enrol in health promotion programmes like MomConnect early in their pregnancy may improve their adherence to antenatal care and in turn increase the likelihood of positive birth outcomes. However, there was clear selection bias in those enrolled, cautioning against causal interpretations when looking at programme outcomes.
► The MomConnect programme provided gestational age for a quarter of pregnancies, which is not reliably available at person level from any of the existing clinical information systems in the Western Cape Province of South Africa.
► The lower participation in teenage mothers, who are at higher risk of adverse pregnancy outcomes, indicates that healthcare workers need to focus on providing health promotion initiatives to high-risk groups.

INTEGRATING SELF-ENROLMENT DATA WITH CLINICAL INFORMATION SYSTEMS
The WCGH receives the provincial complement of the MomConnect data in order to integrate records from the public sector facilities into the PHDC. Although the MomConnect registration process is supposed to take place at fixed facilities, facilitated by a healthcare provider, the data are treated as coming from a source which is not linked to the PMI and are managed similarly to community-based data sources (discussed below). This is because the limited identifiers available do not include a folder number, and because of the notional possibility that the system may in future include registrations that have been completed entirely by patients. In addition, MomConnect data are treated as supporting evidence of pregnancy, and as such pregnancy episodes are not created based on MomConnect registrations alone. For each pregnancy episode, all sources of evidence supporting the identification of the pregnancy, as described above, are stored, so pregnancies with and without MomConnect registrations can easily be identified.
LINKING PATIENTS FROM COMMUNITY-BASED DATA SOURCES TO HEALTH FACILITY DATA
Data from services outside health facilities, for example, from community health workers, may come from individuals not yet encountered at public health facilities. Linking data from these individuals to the PMI must therefore include ongoing retrospective scans of unlinked individuals from community-based services in the event that the individual has subsequently visited a health facility and has been assigned a folder number. Similarly, ongoing linkage of enrolment data from MomConnect requires both prospective and retrospective linking. The unlinked records must be retained because they represent individuals who may subsequently require healthcare. Once they access public sector facilities, information about their retrospective access to community-based or self-enrolment services is material to their longitudinal history of access to services.
Available identifiers for linking MomConnect records to the PHDC PMI were South African national identification number (SA ID), date of birth, mobile phone number and sex. To improve linkage, we used two additional pieces of identifiable information: (1) fact of pregnancy, which was used to search for matches to pregnancy episodes where the registration date was within the pregnancy episode period; and (2) the date and facility of registration, which enabled comparison of details for patients who visited the same facility on the same day. Each identifier was ranked based on how well it would uniquely identify a person in the absence of other information; and identifiers were weighted according to their rank, whereby the lowest ranked identifier has a weight of one and each consecutive identifier is assigned a weight double that of the previous identifier. 6 Identifiers were ranked from highest to lowest (showing weight of ranking in parentheses): SA ID (64), mobile phone number (32), exact date of birth (16), similar date of birth (8), facility and date of registration (4), fact of pregnancy (2) and sex (1). For the linkage to be valid, an exact match had to be made with one of SA ID, mobile phone number or exact date of birth with matching registration date and facility. Once links were found, further restrictions were applied, namely (1) only SA ID was sufficient alone to infer linkage; other identifiers were not strong enough on their own to define linkage; (2) no linkage was inferred if SA ID records were mismatched; (3) the MomConnect record could only link to one individual in the PHDC PMI and (4) probable and possible links were inferred using the combinations of matched identifiers.
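The weighting scheme and validity rules can be sketched as follows. The weights are those reported above. The identifier labels, the function names and the handling of ties are assumptions made for illustration, and the reading of the validity rule (SA ID, or mobile phone number, or exact date of birth together with a matching facility and date of registration) is one interpretation of the text rather than the authors' implementation.

```python
from typing import Dict, Optional, Set

# Weights as reported in the text: each rank doubles the one below it.
WEIGHTS: Dict[str, int] = {
    "sa_id": 64,
    "mobile": 32,
    "dob_exact": 16,
    "dob_similar": 8,
    "facility_date": 4,
    "pregnancy": 2,
    "sex": 1,
}


def match_score(matched: Set[str]) -> int:
    """Sum the weights of the identifiers that agree between two records."""
    return sum(WEIGHTS[name] for name in matched)


def is_valid_link(matched: Set[str], sa_id_conflict: bool) -> bool:
    """Apply the validity rules under the interpretation stated above."""
    if sa_id_conflict:          # both records carry an SA ID and they differ
        return False
    if "sa_id" in matched:      # SA ID alone is sufficient
        return True
    anchored = "mobile" in matched or (
        "dob_exact" in matched and "facility_date" in matched
    )
    # Other identifiers are not sufficient on their own.
    return anchored and len(matched) >= 2


def choose_link(candidates: Dict[str, Set[str]],
                conflicts: Set[str]) -> Optional[str]:
    """Return the single best PMI candidate, or None if none or ambiguous.

    Assumption: a link is rejected when the top score is shared.
    """
    scores = {
        pid: match_score(m)
        for pid, m in candidates.items()
        if is_valid_link(m, pid in conflicts)
    }
    if not scores:
        return None
    best = max(scores.values())
    winners = [pid for pid, s in scores.items() if s == best]
    return winners[0] if len(winners) == 1 else None
```

Because each weight is double the next, a match on a higher-ranked identifier always outscores any combination of lower-ranked ones: an SA ID match scores 64, whereas all six remaining identifiers together score 32 + 16 + 8 + 4 + 2 + 1 = 63.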
In total, 95.2% of registrations took place at a facility using an electronic platform linked to the unique folder number (table 1). Of the linked records, 9.9% were identified as duplicate records, so that the data represent 65 073 individual enrolees. In total, 73.2% of the MomConnect records could be linked to the PMI and 70.8% were linked with high confidence (table 2). There was a high reliance on the civil identifier: of the linked records, the vast majority (84.1%) were linked using the SA ID combined with further identifying data where available. However, when the completeness of this field declined in later years, so too did the proportion of registrations which could be linked to clinical records. The percentage that were successfully linked decreased from 73.2% in 2014 to 69.8% in 2017, aligned with a decrease in the percentage of registrations with valid SA IDs from 80.7% in 2014 to 63.5% in 2017. All South Africans are issued an SA ID number at birth; however, it is only possible to receive an ID document at 16 years of age. Non-South Africans only receive an ID number once they have attained permanent residence. In the health system, the SA ID is not required to access health services, and as such in the PHDC PMI less than half of individuals have a valid SA ID recorded. While the SA ID number is not essential for the health promotion service, in the absence of other identifiers such as the folder number, the importance of this field needs to be emphasised in the registration workflow. Similarly, the accurate collection of the SA ID in public health facilities should be encouraged as it will improve linkage to other health services that do not have access to the PMI.
It was encouraging that a high proportion of the MomConnect records could be linked to known pregnancies from clinical data sources. For those without the SA ID linkage, further linkage could be established using combinations of date of birth, fact of pregnancy, mobile phone number and the registration visit to a facility on a specific date matching a visit record in the PHDC. The combination of encounter, pregnancy and date of birth as identifiers was the second highest means of linkage at 5.6% overall. Additional identifiers, including the folder number and names, would further assist linkage to clinical records. There were nevertheless some useful learnings on linkage inference in the context of sparse identifying data, including the value of limiting match sets by location, date and health condition (that is, only trying to link to pregnant women who visited the same facility on the same day) and rejecting links where there is more than one possible match.
In addition, although the PHDC had existing contact numbers for the vast majority of the individuals registered

ASSOCIATIONS WITH REGISTRATION
The multivariable analysis of associations with MomConnect registration (table 3), where the registrations could be linked as described above, demonstrated that teenagers and older women, patients with the first evidence of pregnancy at a location other than a primary care clinic and patients in the metropolitan area were all less likely to register. The temporal trend towards increased registration was also evident when pregnancies ascertained in 2016 were compared with 2015, the two years with complete data available. It was not surprising that patients presenting outside of routine primary care would be less likely to register, given that those presenting for the first time at hospitals would likely have pregnancy-associated risk factors and be in larger clerical environments oriented to referral rather than first booking services. Data on parity were not available to determine if the decline in registration in older women was related to less subjective need for pregnancy advice. The lower participation in teenage mothers aligns with lower participation and adherence across a range of health conditions and services in this age group. 7-9
HEALTH SERVICE CONTRIBUTION OF MOMCONNECT DATA
The WCGH has elected so far not to delineate pregnancies based just on MomConnect data, in case there are false registrations. However, MomConnect data are used to strengthen inference around pregnancy where there are multiple data points which are deemed to provide moderate confidence of a pregnancy episode. The current analysis has demonstrated that a meaningful proportion of pregnancies might be identified by MomConnect and no other systems, even where a high proportion of pregnancies are appearing in or can be inferred from other electronic clinical systems. Ascertained public sector pregnancies (table 4) were approximately 118 000 in 2015 and 2016 (year of pregnancy reflects the year in which the first evidence of pregnancy falls). Of these, the proportion that had MomConnect registrations as evidence of pregnancy was 18.1% and 25.9%, respectively. Of the MomConnect registrations that linked to existing pregnancies, 64.5%-70.5% have outcome data between 2014 and 2016, broadly similar to the proportions for pregnancies without MomConnect registrations. Outcome data are still very low for pregnancies first detected in 2017 as many of these pregnancies have not yet reached term. There are a small number of pregnancies that are potentially ascertained only through MomConnect (2986 or an additional 2.5% in 2016), not being evidenced through other clinical data. Of note, in 2017 this figure is the highest, suggesting that at least some of these individuals may yet connect to public sector healthcare as their pregnancy progresses.
In order to have data about pregnancies available antenatally, there is currently a high reliance on clerical and laboratory data (eg, Rhesus antibody testing), which do not include clinical parameters such as gestational age. The MomConnect programme records the estimated delivery date, from which gestational age at registration
*Pregnancies with only MomConnect registration as evidence are not currently incorporated into the pregnancy episodes and therefore do not form part of the total count. These totals however reflect the number of additional pregnancies which could potentially be ascertained through the addition of MomConnect data.

at present, we have assumed that 'unknown' status is most likely HIV negative status based on the knowledge that in the Western Cape the vast majority of pregnant women are screened for HIV. 10 While it is tempting to infer intervention effects from the differences in pregnancy outcomes between MomConnect registrations and other pregnancies, these differences are almost certainly the result of selection bias. For example, the fewer terminations of pregnancy among women registered in MomConnect are most likely because women intending to terminate their pregnancy would be less likely to register. Those at lower risk of adverse pregnancy outcomes were more likely to register, reflecting this substantial selection bias as to who registers with the programme. A further caution is that in spite of the reporting in this analysis of a high proportion of registrations being linked, over a quarter did not link, potentially introducing further selection bias with respect to which MomConnect registrations are being compared with other pregnancies. Nevertheless, the ability to link to clinical outcomes is an important prerequisite for long-term evaluation of health promotion activities, given appropriate study designs.
DATA GOVERNANCE AND MANAGEMENT
MomConnect data represent a hybrid data source crossing self-enrolment for health promotion and clinician-mediated facility-based enrolment. In order to use these data for clinical purposes, it is important that consenting procedures are clear on the dual intent of the registration and are verified directly with the participant after registration if the registration has been completed on their behalf. In this analysis, only participants who registered at public health facilities were included. However, with appropriate consent the analysis could be extended to include private registrations, which will be a valuable piece of information for the individuals who do not seek public antenatal care but choose to deliver in public facilities. An option to withdraw at any point is also required and is currently provisioned for by the system. When using self-enrolment data which are not facility-mediated or could come from outside the jurisdiction, and consent is given for use, these data need to be retained in community rather than patient databases to accommodate people who are not patients of the health system, but who might subsequently become health system users. For many community-based health services, this is a model which enables linkage to care to be tracked for people referred from the community to health services. Similarly for self-enrolment services, such as those for health promotion, a link to formal health services can be retrospectively confirmed when the patients register at the formal health services if the historic unlinked data are retained separately in a community database.
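The retention of unlinked records for later confirmation can be sketched as a periodic rescan of a community store. This is a minimal illustration only: the `CommunityRecord` fields, the `rescan_unlinked` function and the lookup callback are hypothetical names, and the matching step is left to whatever linkage logic the health information exchange applies.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CommunityRecord:
    """A self-enrolment or community record held outside the patient index."""
    record_id: str
    sa_id: Optional[str]
    folder_number: Optional[str] = None  # filled in once a link is confirmed


def rescan_unlinked(
    records: List[CommunityRecord],
    find_folder: Callable[[CommunityRecord], Optional[str]],
) -> List[str]:
    """Retry linkage for records that are still unlinked.

    Records that cannot be linked are kept unchanged so that they can be
    retried on a later scan. Returns the ids of newly linked records.
    """
    newly_linked = []
    for record in records:
        if record.folder_number is not None:
            continue  # already linked on an earlier scan
        folder = find_folder(record)
        if folder is not None:
            record.folder_number = folder
            newly_linked.append(record.record_id)
    return newly_linked
```

Running the scan again after a person first registers at a facility would attach their earlier enrolment to the new folder number, while records for people who never present remain in the community store.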

CONCLUSIONS
This analysis has demonstrated that a substantial proportion of pregnant women known to the public health services in the Western Cape did register with the MomConnect service, increasing over time since the launch of the initiative, although to date there are more who have not registered than who have. For those who did register, in spite of very limited identifying information available from the registration process, nearly three-quarters could be linked to pregnant women known to the public health services through a combination of clinical data sources. The MomConnect records were usually the first evidence of pregnancy in pregnancies which were subsequently confirmed by other sources and contributed data on gestational age and additional contact details. If the data were treated as reliable evidence of pregnancy without corroboration, there are a number of pregnancies which could have been ascertained only through MomConnect and for which there was no other evidence in the clinical information systems.
The MomConnect initiative has a clear contribution to make as part of an integrated information system in support of clinical services. The pilot integration of the data in the Western Cape Province of South Africa has demonstrated feasibility and value, and is a model for how hybrid information systems which are both client-facing and intended for registering clinical events can be incorporated into routine clinical information systems.