Abstract Improving the evidence in public health is an
important goal for the health promotion community. With
better evidence, health professionals can make better
decisions to achieve effectiveness in their interventions.
The relative failure of such evidence in public health is
well-known, and it is due to several factors. Briefly, from
an epistemological point of view, it is not easy to develop
evidence-based public health because public health inter-
ventions are highly complex and indeterminate. This paper
proposes an analytical explanation of the complexity and
indeterminacy of public health interventions in terms of 12
points. Public health interventions are considered as a
causal chain constituted by three elements (intervention,
risk factor, and disease) and two levels of evaluation (risk
factor and disease). Public health interventions thus differ
from clinical interventions, which comprise two causal
elements and one level of evaluation. From the two levels
of evaluation, we suggest a classification of evidence into
four typologies: evidence of both relations; evidence of the
second (disease) but not of the first (risk factor) relation;
evidence of the first but not of the second relation; and no
evidence of either relation. In addition, a grading of inde-
terminacy of public health interventions is introduced. This
theoretical point of view could be useful for public health
professionals to better define and classify the public health
interventions before acting.
Keywords Causality · Complexity · Epidemiology · Epistemology · Public health
Introduction
Public health interventions, complexity, and indeterminism
are closely interrelated concepts. Briefly, public health
interventions are considered health and epidemiological
activities of high complexity. The main epistemological
feature of these complex systems is their indeterminacy;
the indeterminism of complex systems represents a strong
limitation to the claim of evidence-based public health.
Evidence-based public health (EBPH)
Improving the evidence in public health is an important
goal for the health promotion community. With better
evidence, health professionals can make better decisions
to achieve effectiveness in their interventions. EBPH is
“the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of communities and populations in the domain of health protection, disease prevention, health maintenance and improvement” (Jenicek 1997). In more recent years, the
perspectives of community members have been included,
which has helped foster a more population-centered
approach (Brownson et al. 2009); thus, EBPH has also
become “a process of integrating science-based interventions with community preferences to improve the health of populations” (Kohatsu et al. 2004). Some critical points related to EBPH are as follows: What kinds of evidence can observational and experimental studies produce? How should this evidence be graded (Tang et al. 2008)? How should this
evidence be translated into public health practice and
have a bearing on the decision-making process (Rychetnik
et al. 2012)? And how should this whole process of
EBPH be constructed (Briss et al. 2000)?
F. Attena, Second University of Naples, Naples, Italy
e-mail: [email protected]
Med Health Care and Philos (2014) 17:459–465
DOI 10.1007/s11019-014-9554-0
Complexity
The concept of complexity is elusive and with uncertain
boundaries. It can be defined as “a scientific theory which asserts that some systems display behavioral phenomena that are completely inexplicable by any conventional analysis of the systems' constituent parts” (Hawe et al. 2004).
More particularly, complex systems “are highly composite ones, built up from very large numbers of mutually interacting subunits (that are often composites themselves) whose repeated interactions result in rich, collective behavior that feeds back into the behavior of the individual parts” (Rickles et al. 2007).
For the purpose of this paper, we consider the following
characteristics of complex systems (Plsek and Greenhalgh
2001; Pearce and Merletti 2006; Galea et al. 2010;
Tremblay and Richard 2011): the large number of inter-
acting components; self-organization; circular causality or
feedback; and emergent properties. The indeterminism of
complex systems and their unpredictability are the episte-
mological consequences of these properties.
Complex systems differ from complicated systems. The
latter also have many interacting components, but they
interact in a mechanistic manner (mechanical causality), in
which the whole is equal to the sum of their parts and
where the rules are linear causality and predictability.
Computers and airplanes are very complicated—but not
complex—systems because they must have strictly pre-
dictable behaviors. Therefore, whereas public health
interventions are closer to complex systems, clinical
interventions are less complex, having characteristics more
similar to those of complicated systems (Attena 1999).
Clinical interventions might also be more complicated than public health interventions, but they must be less complex (i.e., more constrained, deterministic and predictable) because
they deal with ill people who need, with a high degree of
certainty, to heal or improve their disease status. Public
health interventions are less bound to this requirement, and
they can afford a greater degree of error.
Indeterminism and unpredictability
As previously shown, if indeterminism is an epistemolog-
ical consequence of complexity, and public health inter-
ventions are highly complex systems, then public health
interventions involve a high level of indeterminacy. The
indeterminism of public health interventions may be
interpreted in two ways. According to epistemic indeter-
minism, seeming indeterminism is only a consequence of
the lack of knowledge about an underlying determinism
(chance due to ignorance). Ontological indeterminism, in
contrast, describes indeterminism as a real, ontological
characteristic of nature (chance lies in the nature of things).
After a long period of dominance by the deterministic point
of view beginning with Laplace’s demon, ontological
indeterminism became dominant among the physics com-
munity in relation to quantum theory. In the field of life
sciences, in contrast, epistemic indeterminism remained
implicitly dominant. Epistemic indeterminism may be
included in the wider conception of general determinism,
which states that “everything is determined in accordance with laws by something else” (Bunge 1979). Owing to
complexity, the indeterminism of complex systems—and
consequently of public health interventions—is certainly
irreducible in principle. However, though it is very difficult
to determine whether the indeterminism is epistemic or
ontological, here the paradigm of epistemic indeterminism
is sustained.
Public health interventions involve a higher degree of
indeterminism than do clinical interventions. Furthermore,
for a given degree of indeterminism, the outcomes of
public health interventions present a given degree of
unpredictability, with scientific prediction defined as the
“deduction of propositions concerning as yet unknown or unexperienced facts, on the basis of general laws and of items of specific information” (Bunge 1979). Moreover, if
the indeterminism is epistemic, even predictability has
epistemic significance, so it may be gradually reduced by
repeating experiments and improving methodologies.
To this point, these considerations about the controversy
between determinism and indeterminism have very much
concerned risk factor epidemiology (Susser 1973; Kar-
hausen 2000; Parascandola and Weed 2001; Olsen 2003;
Parascandola 2011).
This paper presents an analytical explanation of the
complexity, and consequently the indeterminism and the
unpredictability, that underlie public health interventions.
Then, starting from the characteristics of public health
interventions, four typologies of evidence are discussed.
Complexity of public health interventions
Public health interventions are intended to promote health
or prevent ill health in communities or populations (Ry-
chetnik et al. 2002). They are distinguished from clinical
interventions (mainly clinical trials), which are intended to
treat groups of ill people. Public health interventions
include a wide range of activities: policies, laws, and reg-
ulations; organizational or community developments;
education of individuals and communities; engineering and
technical developments; service development and delivery;
and communication, including social marketing (Rychetnik
et al. 2004). More synthetically, public health interventions
can be divided into three broad categories: clinical;
behavioral; and environmental.
Clinical prevention interventions are those conveyed by
health-care providers, often within a clinical setting (e.g.,
vaccines), and interventions using some kind of drug for
preventive purposes (e.g., statins in primary prevention).
They differ from clinical interventions because they
involve healthy people. Behavioral strategies include
health-promotion interventions, in which people are moti-
vated to modify unhealthy behavior (e.g., stopping smok-
ing). Environmental interventions are those that society can
impose by acting on the environment (e.g., water purifi-
cation) (Haddix et al. 1996). When a public health inter-
vention provides an evaluation of outcomes, we enter the
field of observational or experimental epidemiology. Within the context of both observational and experimental epidemiology, the characteristics of public health interventions with respect to clinical interventions, and particularly with respect to randomized controlled trials (RCTs), are reported below and synthesized in Table 1.
Here, we largely consider health-promotion interventions, though sometimes also environmental ones. Clinical prevention interventions, which are similar to clinical interventions, have been excluded.
Length of the causal chain
Theoretically, and with the exclusion of confounding fac-
tors, in clinical trials the causal chain is constituted by two
elements: the treatment or clinical intervention (cause) and
the outcome (effect). Sometimes, so-called intermediate
variables are considered and evaluated, such that the chain
can appear to comprise three elements. By contrast, in
public health interventions, the causal chain is fundamen-
tally constituted by three elements: the preventive inter-
vention, the risk factor, and the outcome. The intervention
(cause) acts on the risk factor (effect), and the risk factor
(cause) acts on the outcome (effect). To evaluate the
effectiveness of the intervention, it is possible to examine
the first and/or the second relation (Fig. 1).
Dual outcomes
The consequence is that public health interventions have two levels of evaluation: reduction of the risk factor, and reduction of the corresponding disease. For example, in a health-promotion intervention to reduce childhood obesity by modifying the diet, the evaluation can involve the intervention's ability to modify the diet (first level) and/or the reduction in obesity (second level). The choice to do
neither, one, or both evaluations depends on several fac-
tors, such as the degree of context dependence, the reli-
ability of causal relations, available resources, and whether
any experimental setting has been designed.
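The three-element chain and its two evaluation levels can be sketched in a few lines of code (a minimal illustration; the intervention, risk factor, and disease names below are hypothetical examples, not data from the paper):

```python
# Sketch of the three-element causal chain of a public health intervention
# (intervention -> risk factor -> disease) and its two evaluation levels.
# All names are hypothetical illustrations, not data from the paper.
from dataclasses import dataclass


@dataclass
class CausalChain:
    intervention: str
    risk_factor: str
    disease: str

    def evaluation_levels(self):
        """Return the two relations that can be evaluated."""
        return [
            (self.intervention, self.risk_factor),  # first relation
            (self.risk_factor, self.disease),       # second relation
        ]


chain = CausalChain("dietary health promotion", "unhealthy childhood diet", "obesity")
first, second = chain.evaluation_levels()
print(first)   # first level: does the intervention modify the diet?
print(second)  # second level: does diet modification reduce obesity?
```

A clinical intervention, by contrast, would collapse to a two-element chain with a single evaluable relation.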
Weakness of the causal chain
Often, despite extensive research efforts, both causal relations remain uncertain: the first one simply because of the
characteristics of indeterminacy and unpredictability of
public health interventions outlined above; the second one
for the well-known limitations of risk factor epidemiology.
Starting from such issues, four typologies of evidence in
public health are presented below.
Role of other confounding/risk factors
When a public health intervention operates on a single risk
factor of disease, such as air pollution for lung cancer and
diet for obesity, we know that other risk factors also act on
the disease. Therefore, an intervention to reduce the inci-
dence of a disease can work on one or more risk factors.
For example, to reduce the incidence of cardiovascular diseases, it is possible to act on several risk factors simultaneously, such as through health education about smoking, diet, and sedentary habits.
Role of other diseases
Likewise, when it is the intention of a public health
intervention to reduce one or more risk factors of a disease,
it can also act on other diseases that have the same risk
factor or factors. An intervention directed at smoking, diet,
and sedentary habits can also influence other diseases.
Table 1 Characteristics of public health interventions
1. Length of the causal chain
2. Dual outcomes
3. Weakness of the causal chain
4. Role of other confounding/risk factors
5. Role of other diseases
6. Plurality of interventions
7. Context dependence
8. Words as causal factors
9. Low incidence of the outcome and length of the observation period
10. Difficulty in statistical analysis
11. Difficulty in obtaining compliance from the study population
12. Difficulty in applying the RCT design
Fig. 1 Causal chains in clinical and public health interventions
Plurality of interventions
Finally, to reduce a risk factor with greater efficacy, it is possible to act through several interventions simultaneously. The fight against
smoking includes health-promotion interventions, banning
smoking in public places, and increasing the price of ciga-
rettes. Figure 2 presents a simplified model and an example of
the complex interaction of causes and effects. Other available
causal models, such as the traditional web of causation
(MacMahon and Pugh 1996) and causal diagrams (Joffe et al.
2012), do not address the whole web or chain that starts with
one or more public health interventions. Thus, whereas the web of causation is a finite, definite model because it considers only one disease, the present model can be extended indefinitely, since all three of its causal elements are extensible.
Context dependence
This characteristic has been extensively investigated (Pickett
and Pearl 2001; Dobrow et al. 2004; Jackson et al. 2005;
Kemm 2006) because public health interventions work in an
environment in which social, cultural, economic, and political
factors—as well as the competence of the operators—interact.
Accordingly, it is very difficult to apply the principles of
predictability and repeatability. In other words, and unlike the case with clinical trials, the requirement to yield the same result wherever and whenever an intervention is carried out cannot be met. Of
course, within public health interventions, health-promotion
interventions are the most context dependent—mostly
because words, not drugs or equipment, are employed.
Words as causal factors
Unlike clinical interventions and other public health
interventions, health-promotion interventions, i.e., health
education, use words as causal factors to reduce risk fac-
tors. Thus, the first relation of the causal chain is funda-
mentally a bidirectional relationship that involves a deep
interaction between delivering and receiving the interven-
tion. In a broader sense, we have entered the realm of soft
science, such as psychology and sociology, where the issue
of determinism/indeterminism is still more difficult to deal
with and where the indeterminism (ontological or episte-
mological) of human actions and interactions is deeper than
in hard science.
When we need to assess effectiveness on the disease outcome rather than on risk factor reduction, three additional well-known sources of complexity come into play.
Low incidence of the outcome and length
of the observation period
Whereas in clinical trials a high incidence of the outcome is expected over a short period, the incidence of the diseases to be prevented in public health is much lower and unfolds over the long term. The major consequence, in longitudinal studies, is the need for a very large number of participants
and a very long period of observation.
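A back-of-the-envelope calculation illustrates the consequence. Using the standard normal-approximation sample-size formula for comparing two proportions (5% two-sided alpha, 80% power), a common outcome with a large absolute effect needs on the order of a hundred participants per arm, while a rare outcome with a small absolute reduction needs tens of thousands. The incidence figures below are hypothetical illustrations:

```python
# Back-of-the-envelope sample size per group for comparing two proportions
# (normal approximation, 5% two-sided alpha, 80% power). The incidence and
# effect figures are hypothetical illustrations.
import math


def n_per_group(p1, p2, z_alpha=1.96, z_beta=0.84):
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Clinical trial: common outcome, large absolute effect
print(n_per_group(0.30, 0.15))    # on the order of a hundred per arm
# Preventive intervention: rare outcome over follow-up, small absolute effect
print(n_per_group(0.010, 0.008))  # tens of thousands per arm
```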
Difficulty in statistical analysis
In general, following the definition used in this paper for
complexity, statistical models are (very) complicated—not
complex—systems, as they are deterministic and predict-
able. Indeed, if we repeat the same (very) complicated sta-
tistical calculation starting from the same initial conditions,
we always obtain the same result (predictability). Moreover,
the difficulty of applying and interpreting statistical models
concerns both public health interventions and clinical
interventions. However, for two reasons there is somewhat
more difficulty when public health interventions are con-
cerned. First, the more substantial role of confounders in risk
factor epidemiology makes it more difficult to establish
causal association. Second, the outcomes of public health
interventions yield, generally, only small differences
between the treated group and the control group, making it
difficult to obtain statistical significance (Buring 2002).
Difficulty in obtaining compliance from the study
population
It is well known that the length of the observational period
and the magnitude of the sample size are related to
increased difficulty in obtaining compliance and to a high
risk of sample attrition throughout the follow-up period.
The main consequence is the introduction of selection bias,
which contributes to uncertainty and unpredictability in the
outcomes.
Fig. 2 Interaction model and an example with public health interventions, risk factors and diseases
From the above issues, the last well-known difficulty is derived.
Difficulty in applying the RCT design
The wide discussion about this issue (Campbell et al. 2000;
Rychetnik et al. 2002; Dobrow et al. 2004; Hawe et al. 2004;
Victora et al. 2004; Petticrew et al. 2012) addresses the following points. Because the RCT design was originally developed for interventions that are independent of context, public health interventions need to incorporate contextual information into the design of the RCT (Pickett and Pearl 2001). Similarly, it has been suggested that in addition to
the outcome evaluation, a process evaluation should be
included. A process evaluation can help distinguish between
interventions that are inherently faulty (because of concept or
theory) and those that are badly delivered (implementation
failure) (Oakley et al. 2006); another way is to achieve a high
level of standardization of the whole intervention (Craig et al.
2008). When it is impossible to carry out RCTs, e.g., when the unit of intervention is a community or when randomization or double-blind conditions are very difficult to implement, it is important to consider and evaluate other forms of evidence.
Returning to the characteristics of complex systems, we
have seen that public health interventions, particularly
health-promotion interventions, are composed of a large number of interacting components and an intricate web of causation. These components can produce self-organization, retroaction, and emergent properties owing to the human interaction between delivery and receipt of the intervention; finally, they are poorly predictable because of the above characteristics, most of all context dependence.
Four typologies of evidence
As we have seen, starting from the causal chain when a
public health intervention is performed, it is possible to
carry out two levels of evaluation: the effect of the inter-
vention on risk factor reduction (first relation), and the
effect of risk factor reduction on disease reduction (second
relation). Each of these relations may be evidence-based or
not according to previous knowledge or the condition of
self-evidence. When a relation is evidence-based or self-evident, it is unnecessary to evaluate it. On this basis,
the opportunity to evaluate neither, one, or both relations
arises (Table 2).
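The decision logic behind the four typologies can be sketched as a small function (an illustrative reading of the classification, not a formal algorithm from the paper):

```python
# Illustrative reading of the four typologies of evidence: the typology is
# determined by whether each causal relation is already evidence-based
# (or self-evident) from previous knowledge.
def evidence_typology(first_relation_evident: bool, second_relation_evident: bool) -> str:
    if first_relation_evident and second_relation_evident:
        return "I: evidence of both relations (no evaluation needed)"
    if second_relation_evident:
        return "II: evidence of the second relation only (evaluate risk factor reduction)"
    if first_relation_evident:
        return "III: evidence of the first relation only (evaluate disease reduction)"
    return "IV: no evidence of either relation (evaluate both relations)"


# Health promotion against smoking: the first relation is context dependent,
# the second is ensured by risk factor epidemiology.
print(evidence_typology(False, True))  # typology II
```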
Evidence of both relations
Much environmental epidemiology is included in this category. For example: water disinfection → microorganism elimination → infectious disease reduction; application of filters → reduced dioxin intake in the atmosphere → cancer reduction. In both these examples, it is clearly unnecessary to carry out any evaluation because the causal chains are evidence-based through the contribution of current knowledge; in the second example, the first relation (filters → reduced intake) can even be defined as self-evident. This typology is the only one in which it is possible to speak of evidence-based public health.
Evidence of the second but not of the first relation
Evidence only of the second relation occurs typically in health-promotion interventions. For example: health education interventions to reduce smoking habits → smoking habit reduction → decrease in lung cancer incidence. The weakness of the first relation depends on the high context dependence of health education interventions. In fact, we may never be sure that an intervention successful in a given situation will be successful whenever and wherever it is repeated. Instead, the second relation is ensured by risk factor epidemiology. In such cases, it is sufficient to check the reduction of risk factors as a good result of the preventive intervention.
Table 2 The four typologies of evidence in public health interventions

                                                  | I                       | II                  | III                     | IV
First relation                                    | Evidence-based          | Grading of evidence | Evidence-based          | Grading of evidence
Second relation                                   | Evidence-based          | Evidence-based      | Grading of evidence     | Grading of evidence
Evaluation of risk factor reduction               | No                      | Yes                 | No                      | Yes
Evaluation of disease reduction                   | No                      | No                  | Yes                     | Yes
Evaluation by local public health departments     | No                      | Yes                 | No                      | No
Evaluation by experimental research centers       | No                      | Yes                 | Yes                     | Yes
Prevalent typology of public health interventions | Environmental, clinical | Health promotion    | Environmental, clinical | Behavioral
Evidence of the first but not of the second relation
This category occurs when the risk factor involved is still putative. For example: relocation of power lines away from towns and populations → reduction in extremely low-frequency electromagnetic field exposure → childhood leukemia reduction. The first relation is self-evident, whereas the second relation is still uncertain. In these cases, it is necessary to conduct an evaluation of the actual reduction in disease incidence after the preventive intervention.
No evidence of either relation
The lack of evidence for either causal relation would require an evaluation of both. For example: health education for the owners of food stores and restaurants → improvement in sanitary conditions → foodborne disease reduction. It is known that health education does not always change behavior and that improving sanitary conditions in food stores and restaurants is insufficient to produce a reduction in foodborne disease in the general population; however, this second relation is a little more consistent than the first. This typology also occurs when there is a lack of previous studies and knowledge about the two relations.
This model can be integrated with other models that
evaluate and grade evidence in public health (Briss et al.
2000; Tang et al. 2008). Such grading can be applied
separately to both causal relations.
The four typologies are useful in distinguishing preventive interventions conducted in local public health departments from those conducted in experimental research centers. With less expertise and fewer resources, the former can apply the first and second typologies, whereas the latter are also able to apply the third and fourth typologies, which involve more complex evaluations of the second relation.
Outcome Measurement in Economic Evaluations of Public Health Interventions: a Role for the Capability Approach?
Paula K. Lorgelly 1,*, Kenny D. Lawson 2, Elisabeth A.L. Fenwick 2 and Andrew H. Briggs 2
1 Centre for Health Economics, Building 75, Monash University, Clayton 3800, Victoria, Australia
2 Section of Public Health and Health Policy, 1 Lilybank Gardens, University of Glasgow, Glasgow, G12 8RZ, UK; E-Mails: [email protected] (K.D.L.); [email protected] (E.A.L.F.); [email protected] (A.H.B.)
* Author to whom correspondence should be addressed; E-Mail: [email protected]; Tel.: +61-3-9905-8411; Fax: +61-3-9905-8344.
Received: 5 March 2010; in revised form: 30 April 2010 / Accepted: 4 May 2010 / Published: 6 May 2010
Abstract: Public health interventions have received increased attention from policy makers, and there has been a corresponding increase in the number of economic evaluations within the domain of public health. However, methods to evaluate public health interventions are less well established than those for medical interventions. Focusing on health as an outcome measure is likely to underestimate the impact of many public health interventions. This paper provides a review of outcome measures in public health; and describes the benefits of using the capability approach as a means to developing an all encompassing outcome measure.
Keywords: economic evaluation; outcome measures; public health; capability approach
1. Introduction
Public health interventions are intended to promote health or prevent ill health in communities or populations, and can be distinguished from clinical or medical interventions which intend to prevent or treat ill health in individuals [1]. The nature of public health interventions and programmes has evolved considerably over time. This evolution has been summarized by Eriksson as four generations
Int. J. Environ. Res. Public Health 2010, 7 2275
or paradigms [2]: single factor interventions have given way to multifactorial interventions which then became community based, and now we have nearly come full circle with a returned focus on policy and environmental actions; from the ‘old public health’ to the ‘new public health’ [2,3]. As a consequence interventions are becoming more complex [4], where the complexity lies in the intervention, the outcomes and the evaluation itself.
Limited budgets and competing demands have resulted in a growing need for economic evidence to guide decision making regarding the funding of both health (medical) technologies and public health programmes. A review of the literature from 1966 to 2005 has documented the growth in economic evaluations of public health interventions; only 12% of nearly 1,700 papers pre-dated 1999 [5]. The review found that the majority of papers were concerned with evaluating the prevention of communicable diseases (60%), or the evaluation of screening or diagnostic tools for cancer (35%); and 78% of all papers undertook cost effectiveness analyses (CEA) or cost consequence analyses (CCA) (see Simoens [6] in this special issue for a health economics primer which reviews the different types of economic evaluation approaches). The importance of evaluating public health programmes has been noted at a policy level in the UK. The Wanless Reports [7,8] suggested that an efficient approach to improving the health of the population and reducing health inequalities includes the generation of evidence on the cost-effectiveness of public health strategies. As a result of these recommendations the National Institute for Clinical Excellence (NICE) subsumed the role of the Health Development Agency (HDA) and now the National Institute for Health and Clinical Excellence (NICE) is tasked with providing guidance on the effectiveness and cost effectiveness of public health interventions in the UK.
While Wanless argued that the “[e]conomic evaluation of public health interventions is not inherently different from the evaluation of other health interventions. Standard principles are the same” [8, p. 146], there are in fact a wealth of complexities in the application of the methodology that need to be addressed before economic evaluations of public health interventions are able to produce quality evidence to inform rationing decisions; especially when the decision makers are comparing different types of public health programmes, and comparing public health interventions with other health care interventions. This paper initially reviews the methodological challenges that have been previously identified [9,10], and then specifically focuses on the issue of the measurement of outcomes. A methodological discussion of outcome measurement is presented, which includes a selective review of a number of recently published economic evaluations of public health interventions. The paper then focuses on operationalising Sen’s [11,12] capability approach as a means of measuring benefit, before concluding with a discussion of a number of the outstanding issues, and avenues for future research.
2. Methodological Issues in the Economic Evaluation of Public Health Interventions
The problems of applying economic evaluation to public health interventions were first outlined in an HDA briefing paper [13]. This highlighted the need for a common framework for consistent and transparent decision making, which was flexible enough to capture the multi-dimensional, complex and layered outcomes of public health policies and interventions. These issues and others were further explored in a paper co-authored by members of NICE, the so-called decision-makers [9]. They detailed
the challenges of producing public health guidance under NICE’s expanded remit. Seven issues were identified which were labeled as research priorities. These include:
- Measuring benefit: the use of quality adjusted life years (QALYs) (and EQ-5D) and the possible need for evaluations to have more than one outcome measure;
- Public versus individual: the role of individual choice in population based interventions, and how to account for any resulting externalities;
- Equity versus efficiency: public health programmes frequently target health inequalities, such that the issue of weighting outcomes may need to be addressed together with other distributional concerns;
- Perspective: in the NICE Reference Case [14] the perspective for public health evaluations has been broadened to include the public sector; this may lead to inconsistencies when making comparisons with clinical interventions;
- Extrapolation: what is the appropriate time horizon, and how meaningful will such extrapolations be in the absence of robust evidence;
- Quality of evidence: the evidence base is weaker in public health, and controlled trials are often impossible;
- Cost effectiveness threshold: should the same threshold be applied to both clinical and public health interventions.
A more rigorous assessment of the issues of applying standard economic analysis techniques to public health evaluations has also been undertaken [10]. Weatherly and colleagues considered existing reviews of the literature, which included the Wanless Reports [7,8], and identified what they regard to be the main methodological challenges facing health economists in this area. These include:
- Attribution of outcomes: how best to obtain true estimates of effect, what can the existing literature offer by way of evidence, how can primary research generate quality evidence, and what is the appropriate time frame within which to measure success;
- Measuring and valuing outcomes: what can be measured versus what should be measured, the need for a more generic measure of wellbeing, and sector-specific generic measures of outcome, as well as greater consideration for alternative evaluation approaches;
- Intersectoral costs and consequences: quantifying the intersectoral impacts of public health interventions, and assessment of a general equilibrium approach to the evaluation of public health interventions;
- Equity considerations: a need for health inequality impact assessment and research on equity weighting.
Given these four areas, they then undertook a review of the empirical literature in eleven public health domains (accidents; alcohol; antenatal and postnatal visiting; drug use; HIV/AIDS; low birth weight; obesity and physical activity; sexually transmitted infections; smoking; teenage pregnancy; and youth suicide prevention) and concluded that the published literature offers few insights and little in the way of best practice, suggesting that more methodological research is required.
Int. J. Environ. Res. Public Health 2010, 7 2277
In the rest of this paper we focus on a single issue identified by both Chalkidou et al. [9] and Weatherly et al. [10]: the measurement (and valuation) of benefits (outcomes). A number of the other issues have already been addressed in the literature, including work on systematic review methods [15], the implications of alternative perspectives [16,17], and the incorporation of equity considerations [18], as well as more general guidance on how to evaluate complex interventions [19,20].
3. Outcome Measurement in Economic Evaluation
Economists are often seen merely as experts on costing, but outcome measurement is a key issue in economic evaluation [21]. Outcome measures used in economic evaluations can generally be categorized as one of the following:
- Condition specific: for example, episode-free days, which would have different meanings for, say, asthma [22] and gastroesophageal reflux disease [23];
- Morbidity: clinical measures of, say, prevalence or events [24], generally expressed in natural units;
- Generic health or quality of life: such as the SF-36 [25] or Sickness Impact Profile [26];
- Mortality: used to estimate life years gained;
- Preference based: either generic, like the EQ-5D [27] or SF-6D [28], or condition specific [29,30], which allow for the estimation of QALYs;
- Monetary: as measured in a contingent valuation exercise to elicit an individual’s willingness-to-pay for an intervention [31].
The choice of outcome measure depends heavily on the research question being addressed (which includes the perspective employed) and the type of economic evaluation being undertaken. A CEA, where outcomes are expressed in natural units, remains a relatively common approach within health technology assessment, and this is also true for public health interventions. McDaid and Needle [5] found in their review that 57% of all published evaluations were CEAs, while Weatherly et al. [10] found in their more selective review that 36% of studies were CEAs. Some examples of published CEAs of public health interventions include evaluations of: targeted screening for cardiovascular disease which estimate the cost per case [32]; vaccination programmes which estimate the cost per hospitalization avoided [33]; surgical interventions for obesity which estimate the cost per pound lost [34]; and behavioral interventions for smoking cessation which estimate the cost per quitter [35]. An example of a more complex public health intervention is the recently published evaluation of an intervention for vulnerable families [36]. Here the authors estimated the cost per improvement in maternal sensitivity and cost per improvement in infant cooperativeness (components of the CARE Index).
While CEA are commonplace they are limited in that they can only inform decisions within individual disease or intervention areas. In order to facilitate comparisons across a range of topics, diseases and interventions, including both life saving and life enhancing interventions, a common generic outcome measure which incorporates the effects of both quality and quantity of life was
developed. A QALY combines both mortality and morbidity measures of health by weighting a year of life by the quality of life (that is, utility) experienced [37]. This quality adjustment explicitly involves an expression of preference, which can be elicited by employing a range of preference elicitation techniques (like time trade-off or standard gamble [38]), but generally off-the-shelf instruments are used, like the EQ-5D (a five-dimension questionnaire) [27], or, more recently, utility values extracted from the SF-36 using the SF-6D [39]. Once estimated, QALYs are compared to costs in the form of an incremental cost effectiveness ratio (ICER), and comparisons across interventions and disease areas can be made using the cost per QALY gained, thereby informing decisions as to whether an intervention can be considered value-for-money.
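The QALY and ICER arithmetic described above can be sketched in a few lines of code. All utility weights, durations and costs below are hypothetical illustrations, not values drawn from any cited study:

```python
# Sketch of QALY and ICER arithmetic with hypothetical values.
# A QALY weights each year of life by the utility (quality weight)
# experienced in that year, where 1 = full health and 0 = dead.

def qalys(years_and_utilities):
    """Sum of (duration in years x utility weight) over each health state."""
    return sum(years * utility for years, utility in years_and_utilities)

# Hypothetical intervention vs. comparator health-state profiles:
# (duration in years, utility weight) for each state experienced.
qalys_intervention = qalys([(2.0, 0.9), (3.0, 0.7)])  # 1.8 + 2.1 = 3.9
qalys_comparator   = qalys([(2.0, 0.8), (3.0, 0.5)])  # 1.6 + 1.5 = 3.1

cost_intervention = 12_000.0  # hypothetical programme costs
cost_comparator   = 8_000.0

# ICER: extra cost per extra QALY gained, compared to the alternative.
icer = (cost_intervention - cost_comparator) / (qalys_intervention - qalys_comparator)
print(round(icer))  # 5000, i.e. cost per QALY gained
```

A decision maker would then compare this ratio against a cost effectiveness threshold to judge value-for-money.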
Some examples of published cost utility analyses (CUAs) of public health interventions include those that estimate the cost per QALY gained for diabetes screening [40], vaccination programmes [41], surgical interventions for obesity [42], and smoking cessation [35]. There are also CUAs that employ disability adjusted life years (DALYs) as an outcome measure, thereby estimating the cost per DALY saved [43], averted [44] or recovered [45].
QALYs (and suggested alternatives, such as the healthy year equivalent (HYE) [46] and the saved young life equivalent (SAVE) [47]) are, however, not without their critics. As discussed above, one of their limitations is that they focus on health outcomes [48,49], yet there is now a need to evaluate interventions that seek to improve an individual’s quality of life beyond health. Many public health interventions seek to affect broader aspects of quality of life: not just health, but also non-health outcomes such as empowerment, participation and crime. Therefore, QALYs and their associated quality of life measures, like the EQ-5D or SF-6D, are likely to underestimate the relative benefits of public health interventions when compared to health care interventions.
An alternative approach to valuing outcomes, which can potentially overcome this bias and capture all benefits (both health and non-health) of interest, is the contingent valuation method. Contingent valuation (CV) is a means by which outcomes are valued in monetary terms. The most common approach to eliciting monetary valuations is the willingness-to-pay (WTP) approach [31]. In its simplest form, individuals are asked how much they would be willing to pay to obtain the benefit of an intervention. If this monetary valuation is greater than the cost of providing the intervention, then a cost-benefit analysis (CBA) would suggest that the intervention is worthwhile. There are a number of practical and methodological problems with the CV approach [50]; in particular, there is a strong relationship between income and WTP, whereby those on low income provide low valuations. In the context of evaluating public health interventions this could be problematic, as many interventions are targeted at deprived individuals, such that the use of WTP could undervalue the true benefit. While Kelly et al. [13] conclude that, at a societal level, CBA is the ideal method, as it permits trade-offs across different sectors of the economy, they admit that there are problems with this approach, and go on to suggest that within a pragmatic framework cost-consequence analysis (CCA) may be able to capture the layered outcomes of public health interventions. Note that few real world examples of CBA exist: of the four studies initially identified as CBAs in the review by Weatherly et al. [10], three were subsequently deemed to be CCAs and one was a CUA. They did, however, identify one study that elicited WTP values for a water fluoridation programme, although it did not estimate the costs of the programme [51].
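The CBA decision rule described above reduces to a simple comparison of aggregate willingness-to-pay with the cost of provision. The figures in this sketch are purely hypothetical:

```python
# Cost-benefit analysis decision rule: an intervention is deemed
# worthwhile if aggregate willingness-to-pay (WTP) exceeds the cost
# of providing it. All figures here are hypothetical.

def net_benefit(total_wtp, total_cost):
    """Monetary net benefit; a positive value supports the intervention."""
    return total_wtp - total_cost

wtp_per_person = 25.0    # hypothetical mean WTP from a contingent valuation survey
population = 1_000       # hypothetical number of beneficiaries
programme_cost = 18_000.0

nb = net_benefit(wtp_per_person * population, programme_cost)
print(nb > 0)  # True: aggregate WTP (25,000) exceeds cost (18,000)
```

Note that, as the text cautions, if the target population is deprived, the elicited mean WTP may be depressed by low income, and the rule above could then wrongly reject a genuinely beneficial programme.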
A CCA [52], unlike the approaches described above, does not explicitly compare the costs of an intervention with its outcomes (and thus is not an economic evaluation in the strict sense). Multiple outcomes are presented, often in tabulated form alongside costs, and while a CCA cannot be used to rank interventions, it has been heralded as a better way to present (often confusing) economic information to decision makers [53]. CCA has been used previously to evaluate complex interventions where outcomes cannot easily be summarized in a single measure (see Byford and Sefton [54] for some examples).
Sen’s capability approach [11,12] could provide a solution to the limitations discussed above, in that it expands the evaluation space to consider whether a programme enhances an individual’s capability. While there is much (theoretical) discussion of the ‘capability approach’ within the health economics (including economic evaluation) literature, there are few applications. We first review the approach as put forward by Sen and his supporters, before discussing the theoretical literature within the health economics/economic evaluation domain. We then provide a short discussion of applied approaches to measuring so-called capability sets.
4. The Capability Approach
The capability approach, as put forward by Sen [11,12], suggests that wellbeing should be measured not according to what individuals actually do (functionings) but what they can do (capabilities).
“Functionings represent parts of the state of a person—in particular the various things that he or she manages to do or be in leading a life. The capability of a person reflects the alternative combinations of functionings the person can achieve, and from which he or she can choose one collection. The approach is based on a view of living as a combination of various ‘doings and beings’, with quality of life to be assessed in terms of the capability to achieve valuable functionings.” [12, p. 31]
Comim neatly described the approach as “a framework for evaluating and assessing social arrangements, standards of living, inequality, poverty, justice, quality of life or wellbeing” [55, p. 162]. Of importance is the evaluation space: it diverges from the narrow utility space, which is concerned with the pleasure obtained from the consumption of goods and services, and instead encapsulates an informational space in which evaluative judgments occur according to an individual’s freedom. Sen’s approach is therefore based on value judgments, which ultimately relate to an individual’s capability set, and in this sense it can be described as ‘extra-welfarist’ [56-58].
The capability framework for evaluation is based on two distinctions: that between a person’s agency goals and their own wellbeing (where agency goals refer to the notion that individuals may have objectives relating to the wellbeing of others, or to commitments entirely outside themselves [59]); and that between achievement (functionings) and the freedom to achieve (capabilities). Arguably one of the limitations of the approach is that “Sen has not specified how the various value judgments that inhere in his approach and are required in order for its practical use (whether at the micro or macro level) are to be made” [60, p. 3], as he believes that value selection and discrimination are an intrinsic part of the approach. Nussbaum [61], however, has identified what she regards as central human capabilities, and provides a list of ten: life; bodily health; bodily integrity; senses, imagination and thought; emotions; practical reason; affiliation; other species; play;
and control over one’s environment. Other prescriptive lists also exist, with varying degrees of abstraction and generalization [62]. The existence of such lists is crucial to the evaluation of capability sets (that is, the identification of freedoms) and the subsequent operationalisation of the approach (that is, evaluating whether such freedoms are achievable).
4.1. The Application of the Approach to Health Economics—Theoretical Literature
The first insight into the significance that the capability approach might have within the health economics domain came when Culyer [56] used Sen’s theory to develop his own extra-welfarist perspective on economic evaluation (which provided some justification for using QALYs). This perspective, as discussed above, is limited in that it focuses on health, while Sen’s capability approach is much broader. Furthermore, Culyer’s approach is largely concerned with functionings (the achievement of health states), in contrast with Sen’s ideas on the ability to function [63].
Anand has advanced the approach, first discussing its application to health care rationing and resource allocation [64,65] (including editing a special issue of Social Science &amp; Medicine [66]) and more recently attempting to operationalise it [67,68]. However, it was Cookson [69] who first explored the possibility of applying the approach to outcome measurement within economic evaluation. He suggests there are three ways it could be used:
(a) direct estimation and valuation of capability sets;
(b) ‘merging’ preference-based measurements, such as willingness-to-pay, with capabilities;
(c) re-interpreting the QALY approach.
Cookson dismisses the first approach as unfeasible at present, arguing that there is no agreed list of functionings, and that any movement from functionings to capabilities is problematic because preferences differ. The second approach is also dismissed, due to “the adaptive and constructed nature of individual preferences over time and under uncertainty” [69, p. 818]. Cookson therefore proposes re-interpreting QALY data generated from a standardised instrument so that the re-interpreted data (the ‘capability QALY’; note that others refer to this all-encompassing concept as the ‘super QALY’) represent the value of an individual’s capability set. He argues that responses to questions in generic health state valuation instruments can be taken to reflect the value of an unspecified capability set, because health affects an individual’s freedom to choose non-health activities.
Anand [70] disputes Cookson’s conjecture that capability measurement is not yet feasible. In particular, he claims that while early attempts to measure capability concluded that it was immeasurable, it is now much more feasible (indeed, the UN’s Human Development Index has its foundations within the capability approach). Anand identifies Nussbaum’s list of ten domains as a good starting point, and shows that many of these are well represented by questions in the British Household Panel Survey (BHPS), a large longitudinal survey used extensively by economists and social scientists alike [67,68].
Recently Coast et al. [63,71] have sought to reignite the debate surrounding the application of the capability approach within health economics. Whilst tracing the origins and impacts of extra-welfarism on health care policy, they discuss a number of the issues surrounding further integration and application of the approach for use in economic evaluation. They highlight the issue identified above, that the capability approach has a wider evaluative space, but also focus on the fact that extra-welfarist approaches seek to maximize health, whereas the capability approach is more concerned with issues of equity, distribution, and the equality of basic capabilities [72]. Thus, while we are presenting the capability approach as a means to overcoming the problems associated with measuring outcomes for public health, the approach could also provide a solution to addressing a number of the equity issues that have been raised [9,10].
4.2. The Application of the Approach to (Health) Economics—Empirical Literature
The literature on capabilities, whilst extensive, remains largely conceptual. Robeyns, in a 2000 review of the literature, noted that “despite the fact that Sen published Commodities and Capabilities in 1985, the number of empirical applications is still quite limited” [73, p. 26] (see Kuklys and Robeyns [74] and Comim et al. [75] for more up-to-date reviews). There have nevertheless been some empirical applications, the majority relating to poverty, development, social justice or gender inequality (see [74,76]), although there are a (growing) number in the health economics field.
As discussed above, Anand has sought to operationalise the approach by assessing capabilities using secondary data. He (and colleagues) exploited data from the BHPS and estimated the relationship between wellbeing and capability [67], concluding that secondary data sources can provide some information on capability. The incompleteness of such data led them to consider other sources, and they subsequently developed further indicators aligned with Nussbaum’s list of ten capabilities [61]. These indicators were included in an internet survey, along with measures of wellbeing, and were found to perform well as strong predictors of wellbeing [68]. The drawback of their approach for outcome measurement in economic evaluation, however, is that it comprises over 60 indicators of capability, limiting its usability.
Further research sought to reduce and refine Anand’s survey, so as to provide a summary measure of capability that could be used when evaluating complex public health interventions [77]. The reduction and refinement of the questionnaire took place across a number of stages, using both qualitative (focus group discussions and in-depth interviews) and quantitative (secondary data analysis and primary data collection using postal surveys) approaches [78]. The final stage tested the validity of the questionnaire, which was reduced from its original 65 questions to 18 specific capability items that remain aligned with Nussbaum’s list of central human capabilities. The finalised questionnaire, and a weighted index of capability (whereby each item was given the same weight), were found to be responsive to different groups of individuals (as categorised by age, gender and deprivation) and to measure something additional to health and wellbeing (as measured by the EQ-5D and a global QoL scale, respectively), although the index remained highly correlated with these measures. This research shows the potential to operationalise the capability approach, despite Cookson’s reservations [69]. However, one drawback was that a preference based index was not developed (that
is one that represents trade-offs and choices across the capability set), as this was deemed beyond the scope of the project; it is, however, wholly possible, and other researchers have successfully developed a preference based index.
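The equal-weight index described above can be sketched as follows. The 18-item structure follows the text, but the item scoring range and the example responses are assumptions made purely for illustration:

```python
# Sketch of an equal-weight capability index: each item contributes the
# same weight, and the index is normalised to lie between 0 and 1.
# The assumption that items are scored from 1 (lowest capability) to
# 4 (highest), and the example responses, are hypothetical.

def capability_index(item_scores, low=1, high=4):
    """Mean item score, rescaled so that `low` on every item maps to 0
    and `high` on every item maps to 1."""
    mean_score = sum(item_scores) / len(item_scores)
    return (mean_score - low) / (high - low)

# Hypothetical responses to 18 capability items from one respondent.
responses = [4, 3, 3, 4, 2, 3, 4, 4, 3, 2, 3, 4, 3, 3, 4, 2, 3, 3]
index = capability_index(responses)
print(round(index, 3))  # 0.722
```

Such an index summarises a capability set as a single number, but, as the text notes, equal weighting sidesteps the question of the relative value of the items; a preference based index would replace the equal weights with elicited values.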
Coast and colleagues have developed an index of capability specifically for use with the elderly [79-81]. While eliciting attributes for a generic quality of life measure for older people (by interpreting in-depth qualitative interviews), they noticed a similarity between the resulting attributes (attachment, role, enjoyment, security and control) and Sen’s capability approach. The attributes were valued using best-worst scaling within a discrete choice framework [82], and were combined to form an index in which 0 represents a state of no capability and 1 a state of full capability. Their approach has many merits, especially their choice of valuation technique, but it is limited in its generalisability beyond the elderly. The research team has since been funded to undertake a similar exercise with a broader scope, and is now seeking to measure capabilities in the general adult population [83].
Other health economists have used the approach to assess the quality of life of sufferers of chronic pain [84]. This project used a multi-attribute value method [85] to scale the levels within functionings and quantify trade-offs between capabilities [86]. Within the broader area of health (but not specifically health economics), there have been a number of papers which have also attempted to estimate capability. In particular, disability appears to readily lend itself to the capability approach [87], and there have been attempts to estimate the additional income needed by a disabled person to reach the wellbeing of a non-disabled person [88,89].
4.3. Outstanding Issues With Regard to Operationalising the Approach for Use in Economic Evaluations
A fundamental issue with operationalising the capability approach for use in economic evaluations is the need to develop a preference based measure, such that it reflects the relative value placed on the various dimensions and components of capability. However, the method by which values should be elicited remains unclear [80]. Cookson [69] dismissed the valuation of capability sets as unfeasible, citing Sen [11,90], who rejects the use of either choices or desires to value capabilities and instead suggests that views on value judgements be elicited. Coast and colleagues [80] argue that their best-worst scaling approach elicits ‘values’ (as Cookson suggests) rather than ‘choices’, because respondents are asked only to specify the attribute levels they think best and worst; the elicitation exercise does not ask individuals to risk or to sacrifice, as would be the case in a standard gamble or time trade-off exercise, respectively.
There has been some research comparing cardinal valuation methods, which elicit the degree of preference (as with the standard gamble, time trade-off and visual analogue scale methods), with ordinal methods, which elicit information on the ordering of preferences using conventional Discrete Choice Experiment (DCE) models [91] and best-worst scaling approaches [92]. DCEs were found to have a number of practical advantages, including that less abstract reasoning is required of respondents (compared with time trade-off and standard gamble exercises); but, due to their ordinal nature, the elicited values require rescaling, that is, they need to be anchored. To produce a measure which can be used to weight length of life (as is the case with QALYs) [93] and allow for
interpersonal comparisons [94], the scale must be anchored such that zero represents dead and one represents full health. While Ratcliffe and colleagues [91] show that such rescaling is possible using rank and DCE data for a condition-specific health measure, within the context of the capability approach there is a further unresolved issue. Although it is generally accepted that the absence of health is the same as the absence of life (that is, it is appropriate that dead be given a value of zero), there has been little debate about whether the absence of capability is the same as the absence of life; if a capability index is to be used in a similar manner to a QALY, such a discussion is required. Notably, Coast et al. [80] take a philosophical approach in which the absence of capability is given a value of zero, thereby avoiding the need to value death.
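The anchoring (rescaling) step discussed above is, in essence, a linear transformation of latent values onto the 0-1 scale. The latent model estimates in this sketch are hypothetical, standing in for values that might be produced by a DCE model:

```python
# Sketch of anchoring ordinal (latent) valuations onto the QALY-style
# scale, where the anchor state "dead" maps to 0 and "full health"
# (or, for a capability index, "full capability") maps to 1.
# The latent values below are hypothetical model estimates.

def anchor(latent_value, latent_dead, latent_full):
    """Linearly rescale so latent_dead -> 0 and latent_full -> 1."""
    return (latent_value - latent_dead) / (latent_full - latent_dead)

latent_dead = -2.0   # hypothetical latent value estimated for 'dead'
latent_full = 3.0    # hypothetical latent value for 'full health'

# A state with latent value 1.5 maps to 0.7 on the anchored scale.
print(anchor(1.5, latent_dead, latent_full))
```

The unresolved issue the text raises is precisely what the lower anchor should be for a capability index: if the absence of capability, rather than death, is fixed at zero (as Coast et al. do), the same transformation applies but the resulting index can no longer be interpreted as a QALY weight without further argument.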
Adaptation, whereby individuals may not recognise their own lack of wellbeing because they have adapted to their situation, is also an issue when undertaking a preference based valuation. This is despite the fact that Sen used adaptation as the basis for rejecting utilitarian approaches which seek to value wellbeing, and for replacing utility with an informational space based on functionings [90]. Adaptation is an issue when using the public to value hypothetical health states, as the valuation often reflects the initial shock response rather than the long-term (patient) experience; this means that the general public often give lower values than patients, who may have adapted [95]. The measurement of preferences for different capability states could suffer from similar adaptation issues (with the additional issue of functioning in one domain potentially overcompensating for a lack of capability in another). Burchardt [59] has shown that agency goals are adaptive, and that an assessment of inequality based on agency goals may be biased because of lower aspirations (when setting goals), and therefore greater success in achieving them. One solution to adaptation that Sen has advocated is an expert-centred approach, such that, in the public health context, public health professionals or policy makers would provide values for different capability states [71]. This is entirely plausible but conflicts with the current movement towards patient and public involvement in decision making.
5. Conclusion
The need to undertake economic evaluations across a wider range of interventions, which encompass both health and non-health outcomes, requires an alternative to the conventional cost per QALY gained approach. Sen’s capability approach, although theoretically challenging, could provide a possible solution.
The benefits of using the capability approach are numerous. It offers a much richer set of dimensions for evaluation, which, given the nature of public health and social interventions with their many and complex outcomes, makes the approach well suited to capturing all these outcomes rather than focusing solely on health status. The equitable underpinnings of the approach are also appropriate for public health interventions, which often have the reduction of inequalities across groups (for example, alleviating deprivation) as an overriding aim.
To operationalise the approach for use in economic evaluations, it is necessary to generate an index whereby an individual’s capability (or capability set) is described by a single composite number. This involves a number of challenges. Key among these is the need to identify a legitimate capability space and then to accurately measure relative preferences for each capability. Indices, and preference
measurement more generally, raise the issues of which valuation technique to use, whether and how to anchor the index, and how to control for adaptation.
Future research should resolve many of these conceptual challenges; in the long run, however, a potential institutional barrier to the adoption of a capability approach is that the QALY-based extra-welfarist approach is now the norm in health economics. For instance, within the UK, NICE clearly recommends that QALYs be used in the reference case, and research on methods for cost effectiveness analysis (as opposed to outcomes research) continues to grow. Although there are a number of alternative approaches (experienced utility [96] and happiness/life satisfaction/wellbeing [97,98]) which could provide competition were support for the extra-welfarist approach to waver, the capability approach would appear to have particular strength as a means of measuring the effectiveness (and thus cost effectiveness) of public health interventions.