Introduction
Individuals commit a growing proportion of their personal and private data to digital devices during routine use. At the same time, technological advances in saving, storing, duplicating and transferring digital data have made it much easier to replicate and share datasets. Health systems can leverage these data to bring evidence-based depth to intervention design, programme management and performance assessment. This has led to a rapid expansion of technology use in the health sector to both generate and share large, granular and informative data. These advances in digital approaches to data open avenues for rethinking how we handle data across health system levels to advance the health of vulnerable populations in low-income and middle-income countries (LMICs).
Near-ubiquitous access to mobile phones has raised their profile as tools for improving patient-provider communication, access to health services and information, and data collection (reviewed in Ref. 1). Rapid implementation of mobile and digital tools in the health sector, however, has triggered concerns. The digital health ecosystem is particularly vulnerable to data misuse because it combines extremely sensitive health data with digital platforms that are well suited to replication and dissemination of datasets. With an online personal computer or mobile device it is possible to copy and disseminate huge datasets almost instantaneously, which increases the risk of inappropriate data sharing and makes it harder to contain or reverse data breaches: once a digital dataset has been shared, it is almost impossible to track down or delete every copy. This is exacerbated by the complexity of the data flow, which involves multiple channels with a range of stakeholders and points of exposure: from individuals to data consumers, via data collectors; through mobile devices, interoperability layers and intermediate databases; and on to the databases where the data are permanently held (Ref. 2). Finally, the potential for unconsented commodification of collected data, whereby individuals cannot control or see how their data are being shared, reused or commercialised, poses a significant risk. Collectively, these risks are heightened in low-resource settings marked by deep intersectional inequalities, where governments are lagging behind in implementing data protection policies and regulatory oversight to protect personal information.
Given emerging opportunities and risks in digital data use, here we propose a data governance framework to reduce the risk of data misuse while promoting increased access for potential benefit, in the context of the expanding scope, depth and coverage of data-driven digital health interventions. We present four key domains in which data governance structures can be articulated and implemented to ensure appropriate data stewardship: (1) ethical oversight and informed consent processes, (2) data protection through data access controls, (3) sustainability of ethical data use and (4) application of relevant legislation.
Our primary aim is to provide an overview that will help stakeholders understand the key elements required for good data governance within digital health systems (summarised in box 1), so as to ensure maximal benefit while meeting universal ethical standards. This framework is derived from our own experience working with digital health data in South Africa and India, and is intended as a practical guide to assist others developing their own data governance structures. We also highlight key elements from legislation on data protection that are relevant to health programmes. Although our illustrative examples are drawn from LMICs, and from programmes in South Africa and India in particular, we believe that the principles are universal.
Box 1. A checklist for implementing digital data governance principles
Ethics and informed consent
Vulnerable populations are identified and appropriate resources assigned for protection of their data.
Tiered consent process is clearly delineated and each level of consent is stored.
Patient information describes in detail intended data use, storage and future destruction.
Option to withdraw from study with data deletion is clearly outlined for participants.
Data access
Procedural oversight
Put in place clear procedures for processing data access requests which include oversight by key stakeholders.
Define protocols to guard against data commodification.
Articulate important metrics for assessing access requests, which may include:
Geographic locations of data requestor and requested data.
Fair representation of all stakeholders with sensitivity to postcolonial inequities and appropriation.
Providing minimum data to service requests without unnecessary exposure of sensitive data.
Maximising permissible benefit from appropriate data use.
Avoid person-centric gatekeeping around data and establish committees, standard procedures and guidelines for data use together with government stakeholders.
Structural controls
Install appropriate remote-delete software on devices in case of loss or theft.
Restrict app installations and personal use on devices used to collect participant data.
Separately store and transport identifying and sensitive/clinical data.
Store data in secure, firewall-controlled and access-controlled locations.
Where possible work within secure digital environments used by local health departments.
Sustainability
Build an interoperable data structure so that data can be easily shared where appropriate.
Provide up-to-date documentation, consent information and codebooks for all datasets.
Establish a data backup plan for frequent backups to secure locations.
Implement a long-term data storage and management plan that is not dependent on particular individuals or organisations.
Legal framework
Be familiar with the relevant sections of all local/regional legislation pertaining to healthcare, protection of privacy and access to personal information.
Identify the entity responsible for the data and key stakeholders, in collaboration with government structures.
Facilitate review by local regulators where necessary.
Comply with restrictions on moving data across borders, including identifying related issues with Cloud storage.
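As an illustration of two of the structural and procedural controls listed above (storing identifying and clinical data separately, and releasing only the minimum data needed for an approved request), the following minimal Python sketch shows one possible approach. It is not drawn from any specific programme: the field names, the function names and the use of a keyed hash (HMAC-SHA256) as a linking pseudonym are all assumptions made for illustration, and a real deployment would also need key management, access logging and legal review.

```python
import hashlib
import hmac

# Hypothetical set of identifying fields; a real project would define this
# in its data management plan.
IDENTIFYING_FIELDS = {"name", "phone", "address"}


def pseudonym(record_id: str, key: bytes) -> str:
    """Derive a keyed pseudonym so that only holders of the key can
    re-link the identifying and clinical stores."""
    return hmac.new(key, record_id.encode("utf-8"), hashlib.sha256).hexdigest()


def split_record(record: dict, key: bytes) -> tuple:
    """Split one collected record into an identifying part and a clinical
    part that share only the pseudonym."""
    pid = pseudonym(record["record_id"], key)
    # The original record identifier stays with the identifying data only.
    identifying = {"pid": pid, "record_id": record["record_id"]}
    clinical = {"pid": pid}
    for field, value in record.items():
        if field == "record_id":
            continue
        target = identifying if field in IDENTIFYING_FIELDS else clinical
        target[field] = value
    return identifying, clinical


def minimum_view(clinical: dict, approved_fields: set) -> dict:
    """Return only the fields approved for a given access request."""
    return {k: v for k, v in clinical.items() if k in approved_fields}
```

In this sketch the two dictionaries returned by split_record would be written to separate, access-controlled stores, and minimum_view would be applied after an access request has passed committee review. The secret key could be generated with the standard-library call secrets.token_bytes(32) and held apart from both stores.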