While it is important to be able to evaluate and measure a country’s performance in health research (HR), HR systems are complex and multifaceted in nature. As such, attempts at measurement can suffer from several limitations, which risk leading to inadequate indices or representations. In this study, we critically review common indicators of HR capacity and performance and explore their strengths and limitations. The paper is informed by a review of data sources and documents, combined with interviews and peer-to-peer learning activities conducted with officials working in health and education ministries in a set of nine African countries. We find that many metrics that can assess HR performance have gaps in their conceptualisation or fail to address local contextual realities, which makes it a challenge to interpret them in relation to other theoretical constructs. Our study identified several concepts that are excluded from current definitions of indicators and systems of metrics for HR performance. These omissions may be particularly important for interpreting HR performance within the context and processes of HR in African countries, and thus challenge the relevance, utility, appropriateness and acceptability of universal measures of HR in the region. We discuss the challenges that scholars may find in conceptualising such a complex phenomenon, including the different and competing viewpoints of stakeholders, in setting objectives of HR measurement work, and in navigating the realities of empirical measurement, where missing or partial data may necessitate the choice of proxies or alternative indicators. These findings are important to ensure that the global health community does not rely on over-simplistic evaluations of HR when analysing and planning for improvements in low-income and middle-income countries.
- health policy
- health services research
Data availability statement
Data are available on request.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Because health research (HR) systems are complex and multifaceted, current indicators pose limitations for researchers and evaluators attempting to measure their performance.
There are important challenges for the interpretation of HR performance indicators, which in turn affect their relevance, how they are eventually used and to what extent managers adopt them for decision-making.
Local officials express concern over how indicators (and their use) reflect global power imbalances that potentially undermine the importance that poorer countries attach to local knowledge and contextual understanding.
It is important that the global health community avoid over-simplistic performance assessments when planning for or comparing HR systems in low-income and middle-income countries.
Three decades ago, the 1990 Commission on Health Research and Development stated that strengthening research capacity in low-income and middle-income countries (LMICs) is ‘one of the most powerful, cost-effective and sustainable means of advancing health and development’1 (p 71). The WHO more recently reiterated the importance of health research (HR) in the 2013 World Health Report, arguing that ‘all nations should be producers and users of research as well as consumers’2 (p xi). Yet it is clear that there are still large gaps in reaching this goal. Indeed, Africa remains a region of the world particularly under-represented in HR (taken here to encompass research touching on all aspects of health, disease and healthcare, from basic laboratory science to epidemiological studies, health services research and health policy and systems research; health sciences research includes all fields that aim to develop knowledge, interventions and technology to improve health outcomes in a population3 and supports the overall goal of improving human health through scientific research) and in research more generally, with recent estimates of less than 1% of global research output produced in Africa.4 HR can serve as a source of information to guide policy and programmatic action, can generate novel products and technologies and can serve as a source of highly skilled employment in a country.1 5 The current COVID-19 pandemic has highlighted another key function of national HR systems: serving as critical resources to produce and use evidence in response to epidemic outbreaks.6
Progress towards achieving goals to build capacity for HR in Africa, however, will require indicators and metrics that can be used to evaluate current levels, guide strategic directions and evaluate future achievement. Yet HR is a complex phenomenon that involves multiple inputs, processes and outputs. While some data may exist at a global level that could be used for comparisons, there is a risk that such indicators may miss critical local contexts and processes as well. This raises important questions about how best to monitor and evaluate HR in African settings given the calls for investment and development of HR. In this paper we reflect critically7 8 on the range of quantifiable indicators available at a global level, providing insights from country case studies and stakeholder workshops to reflect on challenges in their application to this critical question of how to assess progress towards building HR capacity in Africa.
The article combines conceptual and empirical work arising out of a larger project studying HR capacity in Africa that included three phases of work. The first phase searched for available data on indicators that could evaluate the current status of HR across the African continent.9 10 A second phase of in-depth qualitative work consisted of interviews undertaken in nine case study countries (Botswana, Côte d’Ivoire, Ethiopia, Kenya, Liberia, Madagascar, Tunisia, Uganda and Zambia) to study the factors shaping HR capacity in Africa. In this second phase, countries were selected to represent key regions of the continent and levels of research activity. Interviews with 189 key informants were conducted and transcribed (in French or English) and analysed using Dedoose qualitative analysis software. In a third and final phase, we facilitated a peer-to-peer learning process with officials from each of our nine case study countries. This consisted of two workshops held in 2019 (in Nairobi and Addis Ababa) which brought together two individuals from each country: one from the Ministry of Health (or a key public health agency) and the other from the Ministry of Higher Education (or a similar agency). These workshops involved a range of activities designed to facilitate peer reflection and learning about how to strengthen HR capacity locally, and further permitted critical discussion about the possible indices and data used to measure HR capacity and performance (such as the data gathered in Phase 1).9 10
This paper presents a critical reflection of the indicators and metrics gathered in the first phase of the research, which represent the data available globally. Data included a range of standard indicators for measuring national capacity for HR. This included indicators proposed by the WHO Global Observatory on Health Research and Development database, for example, gross expenditure on research and development as a proportion of gross domestic product and health researchers per million inhabitants, supplemented with others identified from discussions with authors and members of a project oversight committee, for example, data on clinical trial infrastructure and regulatory environment.
These data could in theory be used to measure or rank countries in terms of their HR performance. We begin with some conceptual reflections on the issue of measurement for a complex phenomenon like national HR capacity. We then provide empirical insights derived from the second and third phases of work to consider the strengths and limitations of particular indicators to capture national HR performance.
Key issues in the measurement of HR
A starting point to critically reflect on indicators applied to any set of countries is to recognise the inherently constructed nature of measurement in the first place (table 1). While quantitative indicators can give an appearance of neutrality in many ways, we know that the construction and choice of metrics in global health are not as straightforward. For example, it is a well-established fact that these may be decidedly political. This is because choices will always have to be made as to what to count in the first place, where to count it and how to use or aggregate data.11 In the global health arena, there have been particularly critical voices raising concerns over how construction and choice of indicators reflect global power imbalances12—potentially undermining poorer countries’ ability to act ‘on their own terms’,13 or sidelining the importance of local knowledge and contextual understanding to inform policy and action.14
Other related concerns have been raised in literature reflecting on the challenges in measurements of complex phenomena more broadly. HR can in many ways be seen to represent a complex system, defined by Bar-Yam as one made up of several linked elements, where the relationships between these give rise to its collective distinct behaviour and to how it interacts and forms relationships with its environment.15 While much complex system thinking has arisen in the natural sciences, Munck and Verkuilen critically reflect on the challenges in developing metrics to measure complex social phenomena as well (in their case, looking at the concept of democracy16). Specifically, they highlight three key challenges: conceptualisation, which involves identifying and organising the relevant attributes of the phenomenon; measurement, which involves selecting indicators and measurement levels and disaggregating data; and aggregation, which concerns the level and rules used to combine pieces of data.
Measuring complex phenomena also requires that we take into consideration the fact that some direct measures of a given entity may be unobservable. Audibert, in discussing the accuracy of indicators used in HR and the potential bias of their measurement, highlights the importance of explicitly considering the objectives of a given measurement, that is, why the measure is needed, since this should determine the choice of the indicators used.17 Yet for many scholarly works that attempt to measure HR, it is not clear how the metrics or indicators used to define the concept were selected; for example, see the bibliometric analysis of global research production in respiratory medicine by Michalopoulos and Falagas.18 A final challenge to the development of measures of a complex phenomenon such as HR comes from the limitations of data availability.
One of the most well-known tools used to evaluate HR across the African continent has been the national HR system ‘barometer’, developed in response to a recommendation of the African Advisory Committee for Health Research and Development of the WHO Regional Office for Africa.19 The barometer is primarily based on data coming from surveys sent to WHO country representatives with questions related to four ‘functions’ of national systems conceptualised in the 2013 World Health Report: governance; sustaining resources; producing and using research; and financing. Most data consist of binary (Yes/No) answers that come from the surveyed individual, although one piece of data, publications per capita, is taken from other data sources.20
Given its inputs, the barometer is a useful tool to provide insights into strategic areas for policy attention, but with only one comparative quantifiable indicator, it is not particularly suited to providing comparative assessments of HR performance in terms of capacity or output (either within countries over time, or across countries). In the first phase of research undertaken by our research team, however, we searched for all available quantitative metrics; table 2 below presents a list of these, which we searched for from global data sources. In addition to publications, it was possible to also gather data on clinical trials, patents, research institutions, research personnel, financing to HR and policies and regulations.9
While some indicators have fairly comprehensive coverage, others have missing data. This raises several questions about the validity and usefulness of a single metric of performance: either multiple indicators with missing data would need to be included, or potentially important indicators would be excluded, biasing attention towards those indicators with more comprehensive data (either because they are easier to obtain, or because global funding bodies have prioritised them for one reason or another).
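The trade-off described above can be made concrete with a small sketch. The country labels, indicator names and values below are invented purely for illustration and are not drawn from the study's data; the two aggregation rules are simple assumptions, not the method of any existing index. The sketch shows that averaging over whatever data are available, versus keeping only indicators with full coverage, can produce different rankings from the same inputs.

```python
from statistics import mean

# Hypothetical indicator values, already scaled to 0-1 (invented numbers).
# None marks a missing observation.
indicators = {
    "Country A": {"publications": 0.8, "trials": 0.6, "patents": None, "funding": 0.2},
    "Country B": {"publications": 0.5, "trials": 0.5, "patents": 0.5, "funding": 0.5},
    "Country C": {"publications": 0.6, "trials": None, "patents": 0.1, "funding": None},
}


def score_available(values):
    """Average over whichever indicators are observed for one country."""
    return mean([v for v in values.values() if v is not None])


def complete_indicators(data):
    """Names of indicators that are observed for every country."""
    names = list(next(iter(data.values())).keys())
    return [n for n in names if all(c[n] is not None for c in data.values())]


def score_complete_only(values, keep):
    """Average over only those indicators that have full coverage."""
    return mean([values[n] for n in keep])


def rank(scores):
    """Country names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


keep = complete_indicators(indicators)
by_available = {c: score_available(v) for c, v in indicators.items()}
by_complete = {c: score_complete_only(v, keep) for c, v in indicators.items()}

print("Indicators with full coverage:", keep)
print("Ranking, average of available data:", rank(by_available))
print("Ranking, full-coverage indicators only:", rank(by_complete))
```

In this toy example only one indicator survives the full-coverage rule, and the second and third ranked countries swap places depending on which rule is applied, which is the kind of bias towards well-covered indicators that the paragraph above describes.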
In the next section, we provide insights into each of these indicators, informed by our broader research. We highlight the challenges or contextual factors that may need to be considered when using these, or other, metrics to evaluate HR performance, draw attention to the risks of simple aggregation, and illustrate what other concerns arise at local levels concerning HR performance.
In table 1 we present some of the key issues we identified with potential HR performance indicators or metrics. Our insights reflect on the use of these different common metrics to measure HR and on their weaknesses as single metrics, as we experienced in this work and as reflected in interviews and discussions with local stakeholders. We structure the sections that follow according to indicators of HR which appear in a range of other assessments and which we mapped in the first phase of this research.9 21 For each of these, we look at the information that a measure of the indicator would provide, as was done in that phase, and critique it by showing the gaps it would leave. We then present information from our analysis and our reflection on each indicator.
Publications are perhaps the indicator with the clearest and most comprehensive data. Issues arise, however, on how to combine data meaningfully. There is a range of ways to count publications. Counting when an author/coauthor has an LMIC country affiliation provides the broadest possible scope, yet discussions with stakeholders raised concerns that it does not capture if researchers from those countries have senior roles in the research efforts. Authorship position may be more important to do this, especially if one wants to include some elements of leadership in publication metrics.22 23 While the first authorship usually involves a lead role, in health disciplines, the last authorship is often reserved for a principal investigator. Furthermore, the use of first (including co-first) and last authors only tells part of the story. Where any of the authors are not local, this is counted as of less value to the community in comparison to if they were. With this interpretation, one may miss a whole host of local coauthors and people the foreign researcher worked with, which represent capacity built through mentorship. How to value authorship position in aggregating publications becomes an unresolved question.
Finally, much of the HR in Africa is produced through international collaborations. This structural aspect of the HR pipeline has several important implications for HR capacity and performance measurement. A particular concern is how to capture true collaborative efforts which either have a strong role for local researchers or are working to build strength and capacity for such roles. Over-emphasis on first/last authorship may under-represent capacity-building efforts, while counting any publication that includes an author from the country could potentially give weight to extractive research with tokenistic inclusion of local authors.
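To illustrate how much the choice of counting rule matters, the following minimal sketch applies the rules discussed above (any local author, local first author, local last author, local first or last author) to the same small set of papers. The records, the `Paper` class and the labels "LOCAL" and "FOREIGN" are hypothetical and invented for illustration; they do not represent the bibliometric data used in this study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Paper:
    # Affiliation label for each author, in byline order (hypothetical labels).
    author_countries: List[str]


# Invented records for illustration only.
papers = [
    Paper(["FOREIGN", "LOCAL", "LOCAL", "FOREIGN"]),  # local authors only mid-byline
    Paper(["LOCAL", "FOREIGN", "FOREIGN"]),           # local first author
    Paper(["FOREIGN", "FOREIGN", "LOCAL"]),           # local last author
    Paper(["LOCAL", "LOCAL"]),                        # fully local team
    Paper(["FOREIGN", "FOREIGN"]),                    # no local author
]


def any_author(paper, country):
    return country in paper.author_countries


def first_author(paper, country):
    return paper.author_countries[0] == country


def last_author(paper, country):
    return paper.author_countries[-1] == country


def first_or_last(paper, country):
    return first_author(paper, country) or last_author(paper, country)


def count_papers(records, country, rule):
    """Number of papers credited to `country` under a given counting rule."""
    return sum(1 for p in records if rule(p, country))


rules = {
    "any_author": any_author,
    "first_author": first_author,
    "last_author": last_author,
    "first_or_last": first_or_last,
}
counts = {name: count_papers(papers, "LOCAL", rule) for name, rule in rules.items()}
print(counts)
```

Under these assumptions the same five papers yield anywhere from two to four 'local' publications. The paper with local authors only in the middle of the byline is credited under the broadest rule alone, so neither extreme tells us whether it reflects mentorship or tokenistic inclusion.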
While almost all clinical trials are registered in major databases (eg, WHO’s International Clinical Trials Registry Platform), trials only capture clinical and epidemiological-oriented research. Other types of research including health policy and systems research, health services research or basic science work are not globally standardised and categorised in the same way as trials. Thus, while the comprehensive coverage of trials is useful, its use risks skewing attention to countries that have had a preference for, or global investment in trials, against those emphasising other critically important forms of HR.
Decision-makers interviewed also critiqued trials as a metric due to their inability to distinguish different phases of trial research. It was argued that many of the trials in African countries are phase 1 trials aimed at proving safety. Larger phase 2 and 3 trials, which test effectiveness and require more capacity and resources, were said to be much rarer on the continent. In addition, clinical trials have drawn criticism not only as a measure of research but also when it comes to ownership of, and benefit from, their products. As one Liberian researcher explained:
… for over 30 years the New York Blood Centre worked with us here in [Liberia Institute for Biomedical Research] for the development of Hepatitis B vaccine. But all the baseline work that was done for the development of that vaccine was done right here in Liberia …………, for us, it was a slap in the face, after all those of [sic] years of working to come out with this vaccine.…… So today, 90% of the work that was done on that vaccine was done here in Liberia, but we cannot even afford a dose. KII L08
As illustrated, there was concern that the efforts, contributions and labour (including intellectual labour) provided by African individuals and institutions to develop new pharmaceuticals are negated when they are not able to have some ownership in, or even afford, the products of HR. Existing metrics risk failing to capture whether trials are run in-country and arise from in-country research, as opposed to being run in an extractive way by international partners.
Patents can help measure innovation performance.24 They thus are attractive as a metric in measuring HR if they can capture research activity linked to new and potentially profitable technologies. Therefore, patent data may be used for performance evaluation in research and development organisations. Yet patent data has several limitations for inclusion as an HR metric of performance as well. Knowing and measuring the quality of innovation is important, with radical or breakthrough innovations seen to be particularly important for organisations.25 But not all patented inventions turn out to be innovations, and many innovations are never patented.26 Shepherd has pointed out in the past that ‘Patents are a notoriously weak measure of innovation—Most of the 80 000 patents issued each year are worthless and are never used. Still others have negative social value’.27
While discussing patents, innovation and commercialisation of HR, respondents in this study highlighted one of the challenges of using patents as an indicator of HR capacity. They mentioned that patents do not account for the institutional capacity in government to support researchers and universities and to provide tools, guidelines and bridging assistance to link universities to industry in this process. They also highlighted the paucity of information on patenting processes, and of leadership that can guide young researchers through those processes where they exist. As one Ugandan respondent explained:
If someone had an innovative idea, can you get it patented? How does that happen? What does the law say? How do you get protected? … I think that information isn't as widely spread as it needs to be.…. Second, like you say, the mentorship for that helps you along. And whether you have leaders that are able to guide you as you undertake your research.…. KII U13
There is no standard metric of institutions conducting research. Some sources attempt to count or rank universities according to various criteria, but other types of research institutions may be excluded from such efforts, for example, private research institutions and non-government organisations. One example highly active in several African countries is the UK’s Medical Research Council, which produces HR activity similar to, or exceeding, that in academic institutions. In our case study countries, we found that a significant proportion of universities could only do basic teaching and were not yet fully equipped for research. As an Ethiopian respondent explained:
Some are first-generation universities…, 22 assistant professors that’s it. And I was asking them what kind of research, “we don't have research”. KII E12
For non-university laboratories or research centres, there is again a need to consider capacity and quality. One Ugandan researcher, for instance, reflected that the proliferation of smaller laboratories may not be as important for key research needs, explaining the importance of ‘higher level laboratories, rather than similar capabilities just spread all over’. KII U3
Another key concept identified in the literature and from interviews in countries was the importance of so-called Centres of Excellence (CoEs). CoEs may be taken to be organisational environments that strive for and succeed in developing high standards of conduct in a field of research, innovation or learning.28 There are complexities in understanding how CoEs reflect HR performance or can build capacity. Some scholars have argued for funding strategies that target diversity rather than ‘excellence’, with funders of scientific research unsure whether it is more effective to give large grants to a few ‘elite’ researchers or smaller grants to many researchers.29 This raises difficult questions about whether CoEs should be seen as an outcome of capacity building or as a strategy, and in turn how to interpret metrics based on them.
While research personnel represents another key input to HR systems, this is another indicator with no standard measurement. Thus, how it is conceived and measured may have important implications for how well it actually serves as an indicator of HR. Existing data can include individuals with PhDs, or counts of researchers with different designations (eg, laboratory technicians). Phase 1 of this research found UNESCO to be the only source of fairly comprehensive data on researchers per capita, but it was not comprehensive for all countries and did not distinguish health from other research staff.30
Existing counts also do not capture other important human elements which were seen in empirical work to be particularly important for resilient and efficient research systems, particularly research-support persons including mentors, research leaders, grant managers, clinical research managers and so on. Also, counts of staff alone do not necessarily capture the quality of researchers. A good proportion of researchers in a few LMICs only begin to interact with research at postgraduate level, some after their master’s degrees and others only during doctoral training. So, for example, when counting doctoral-level personnel, one may need to remember that their capacities will differ. A few may be unable to work independently or confidently even when labelled senior researchers.23 Furthermore, beyond the numbers and cadres of researchers, we also need to consider how long researchers remain at given posts, for example, their journey from assistant researcher to professor of research. Many universities in Africa do not have a research track and cannot measure their researchers’ capacity or progress, let alone offer incentives to ensure their retention. Respondents in Ethiopia and Uganda explained:
I don't think it [the human resource gap] is due to the in-availability of researchers in the country, but it is directly related to the motivation and retention mechanisms that we are using. KII E07
We are beginning to export our senior researchers to other places, [which] is a good thing, but … also a risk because I think the way we pay scientists remains unacceptable. So, scientists fend for themselves, have to scavenge for research and keep themselves going. But there is no structured way to keep them. KII U3
Resources for HR
HR fundamentally relies on funding, and as such attempting to quantify expenditure on HR can be an important indicator of HR performance. However, there are several challenges to using it as a performance indicator. First, there is no standardised source of data or database of research expenditure for health in many countries. Individual nations may have budget lines for HR, but many are not clearly indicated. Global data sets on research and development funding typically do not specify health-related research and may suffer from gaps in coverage across the African continent.
Data issues aside, there are also conceptual challenges in terms of what is actually captured by gross levels of expenditure. A great deal of research in Africa is funded from external sources, and while this no doubt supports some important research, it has been argued that donor-financing biases research agendas to certain preferred countries, activities or diseases.31 32 Funding streams for HR in Africa have been described as fragmented, characterised by several small and short-term grants that may not contribute to the long-term development of a holistic system.33 34 Stakeholders in our empirical work greatly emphasised the importance of ownership and local direction of HR as key.
We also posit that expenditure as a metric may risk missing strategic investments that facilitate HR, such as the building of roads, utility provision and maintenance (including electricity and water), provision of broadband internet and so on. These may be important but neglected if not captured in indicators of progress. Finally, empirical work identified non-financial resources that simple evaluations of expenditure could not capture. These included a culture of scientific research (capturing elements such as the perceived value of research or a strong research community) and a competitive research environment.
Policies and regulation
Unlike other metrics discussed above, policies and the state of regulation for HR are not as standard or common an indicator of performance or capacity. Scholars are increasingly highlighting the importance of investing in, and measuring, governance indicators of HR systems.35 In theory, the existence of national guidelines and legislation in regard to research, or the existence of bodies such as institutional review boards (IRBs, used for ethical review), can capture key elements of the enabling environment for research activity. Yet there is no clear indication of how to assess or measure such guidelines and legislation. Whether they are well developed, whether they help drive research activity and whether they are implemented at all would be critical to know, beyond their simple presence or absence. The presence of IRBs may also reflect the development of clinical or epidemiological work, as opposed to other forms of HR. While it is generally understood that IRB oversight effectively protects research subjects and environments from ethical breaches and other risks, many IRBs do not go through any kind of systematic appraisal and there is a paucity of data on the quality and performance of these review boards.36
In this paper, we provide insights arising from literature and from conceptual and empirical explorations into HR in Africa to reflect on the usefulness and limitations of the variety of internationally available quantifiable indicators that one might use to evaluate performance in developing health sciences. Our findings highlight a range of challenges that currently face any ultimate goal of developing a metric of HR capacity. These challenges arise from the conceptualisation, measurement or aggregation of metrics. For example, respondents in this study highlighted how there are difficulties in conceptualising and measuring quality in individual research capacity, leaving questions lingering about what constitutes research staff and how to measure the progress of certain cadres. Furthermore, the aggregation of the different attributes of research capacity and performance into a single metric that can be compared across jurisdictions remains a major challenge. Some of these challenges have to do with the complexity of HR systems themselves, whereby the ultimate goal of research capacity and productivity is a collective product of multiple inputs, supporting factors and processes working in ways that suit local contexts. Yet many quantifiable measures say little about support factors, local contextual appropriateness and interactions between elements. The table below distils some of the key insights from our empirical and conceptual discussion above, using the categories of conceptualisation, measurement and aggregation to pull out some of these key issues.
In addition to the aggregation of individual indicators, there is the overarching issue of aggregation into a single metric or score that can be used to compare countries or judge progress. From our empirical insights, there was a recurrent concern that the production of such a score risks obscuring local contexts and specifics in the search for a single applicable metric of ‘success’ in relation to national HR evaluation. Stakeholders working within governments were particularly cognizant of the fact that the selection of indicators to define an entity is very much a function of who is doing the measuring, and could carry implied agendas or values that may not align with local views.
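The point that a single score depends on who is doing the measuring can be sketched with a weighted composite. Everything in the example below is an assumption made for illustration: the indicator names (including `local_leadership`), the values and the two weighting schemes are invented, and neither scheme represents the stated preferences of any actual funder or ministry.

```python
# Hypothetical indicator values scaled to 0-1 (invented numbers).
indicators = {
    "Country A": {"publications": 0.9, "trials": 0.7, "local_leadership": 0.2},
    "Country B": {"publications": 0.5, "trials": 0.4, "local_leadership": 0.9},
}

# Two hypothetical value judgements about what matters; each sums to 1.
weights_output_focused = {"publications": 0.5, "trials": 0.4, "local_leadership": 0.1}
weights_ownership_focused = {"publications": 0.2, "trials": 0.1, "local_leadership": 0.7}


def composite(values, weights):
    """Weighted sum of indicator values for one country."""
    return sum(values[name] * w for name, w in weights.items())


def rank(data, weights):
    """Country names ordered from highest to lowest composite score."""
    scores = {c: composite(v, weights) for c, v in data.items()}
    return sorted(scores, key=scores.get, reverse=True)


print("Output-focused weights:", rank(indicators, weights_output_focused))
print("Ownership-focused weights:", rank(indicators, weights_ownership_focused))
```

With identical underlying data, the ranking reverses when the weights shift from volume of output towards local ownership, which is why the choice of weights is a value judgement rather than a neutral technical step.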
On reflection, improving the use of metrics may require changes in process, and not simply different indicators. That is to say, institutions applying metrics may need to carry out a more comprehensive contextualisation around the entity to be measured and the indicators to be used. There may also need to be more explicit reflection on why one is measuring a phenomenon, and what information would be most appropriate, including how to measure it and how and why to aggregate if necessary. This may require drawing on the input of topical or content experts to analyse the contextual meaning placed on the entities. Formalising processes and expectations such as these around the use of indicators may be a useful way to avoid oversimplistic application of data.
The sociologist William Bruce Cameron famously observed that not everything that is counted counts, and not everything that counts gets counted,37 highlighting the critical need to reflect on purposes and values, and to embed measurements in context, whether institutional, setting or otherwise. This paper recognises the importance of monitoring, guiding and evaluating progress in relation to improving HR within and across countries. Our reflection highlights gaps and opportunities for stronger measurement efforts. Furthermore, our reflection challenges stakeholders, especially recipients of the results of measuring complex phenomena like HR capacity, to demand and provide for appropriate measures and indices. While our work presents important reflective ideas, we acknowledge that because of the methodology we used, it may not answer pertinent questions about the validity, reliability and accuracy of given metrics in any one context. However, we hope that it begins a conversation that will lead to further work on this. Furthermore, while this paper ultimately identifies a number of critical challenges in the use of existing indicators, it does so to help improve efforts to strengthen HR systems, in African nations and beyond.
Handling editor Seye Abimbola
Twitter @rhona_ona, @justinparkhurst, @clarewenham
Contributors RM conceived of the presented idea. All authors participated in developing the theoretical concept further. CW and JP were in charge of overall direction and planning of the overarching research. RM, CJ, PAJ and JLS-T collected the primary data to feed into the study. RM, JP, CJ, PAJ, JLS-T and CW participated in the peer-to-peer learning sessions with the policymakers on this research. RM, JP, CJ, PAJ, JLS-T and CW were involved in data analysis and the writing of the manuscript. RM, JP, CJ, PAJ, JLS-T and CW were involved in reviewing different drafts of the manuscript and contributed to the final version of the manuscript.
Funding This study was funded by Wellcome.
Competing interests None declared.
Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Provenance and peer review Not commissioned; externally peer reviewed.