4. Final stage
4.1 Findings and conclusions
4.1.1 Findings (triangulation of knowledge)
The majority of the currently available PROMs were originally designed for use in research to ensure that the patient’s perspective was integrated into assessments of the effectiveness and cost-effectiveness of care and treatment.2 Their use in this context stemmed from the argument that clinical or biomedical measures of treatment impact did not capture outcomes that matter to patients; treatments may be deemed successful on the basis of biomedical criteria but may have little or even a detrimental impact on patient functioning.245
A classic example of this tension is provided by the findings of the Diabetes Control and Complications Trial,246 which compared intensive therapy (administered either with an external insulin pump or by three or more daily insulin injections, together with frequent blood glucose monitoring) with conventional therapy (one or two daily insulin injections) for people with insulin-dependent diabetes. The trial showed that although intensive therapy improved metabolic control and reduced the incidence of long-term complications, it also increased the incidence of hypoglycaemia in the short term. This demonstrates the trade-off between long-term and short-term outcomes in assessing the effectiveness of treatments for diabetes. It was assumed that there was a strong link between clinical end points and a patient’s ‘quality of life’ but numerous studies have shown only a weak link between the two.245 As many treatments for chronic disease focus on improving not just the length of patients’ lives but also the quality of their lives, a strong argument was made that clinical trials should also assess the impact of treatments on a patient’s own perceptions of his or her health.247
These concerns led to a rise in prominence of the concept of ‘health-related quality of life’ (HRQoL) and the proliferation of instruments designed to measure it. Precise definitions of the concept of HRQoL were contested; some defined HRQoL as ‘those parts of quality of life that directly relate to an individual’s health’ (p. 25),248 while others argued that it was not possible to separate HRQoL from quality of life and criticised the concept for a lack of consensus regarding its definition.249 At the heart of these debates lay the challenge of attribution; some aspects of a patient’s quality of life were not amenable to change through interventions aimed at improving patients’ health status and, as such, it was questioned whether broader measures of quality of life were useful markers of treatment success. Attempts to address this conundrum included the development of models to show the pathway through which changes to clinical variables impacted on symptoms, which in turn impacted on a patient’s functional status, general health perceptions and, ultimately, overall quality of life.250
Instrument developers, while not ignoring these debates, largely left these conceptual problems unresolved and instead focused on the task of developing instruments. Consequently, the last 30 years have seen an exponential rise in the number and type of instruments designed to measure HRQoL.251 Over time, these instruments sought to measure a whole range of constructs including, for example, HRQoL, symptoms, functioning and activities of daily living. As a result, a broader categorisation of instruments emerged: patient-reported outcomes (PROs) in the USA and PROMs in the UK. These are defined as questionnaires that measure patients’ perceptions of the impact of a condition and its treatment on their health.1
Research efforts centred on testing the psychometric properties of specific instruments in different patient populations: for example, generic measures such as the Short Form Questionnaire-36 items (SF-36),252 [...] conditions, and the European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire Core 30 (EORTC QLQ-C30)255 and the Functional Assessment of Cancer Therapy–General (FACT-G)256 for cancer. Key psychometric properties included validity (the extent to which an instrument measures what it intends to measure); reliability (the extent to which an instrument is free from random error and produces consistent results either within observers, known as test–retest reliability, or between observers, known as inter-rater reliability); and responsiveness (the ability of an instrument to detect change over time). This last criterion was particularly important for instruments used in randomised controlled trials (RCTs).165,257
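Responsiveness, as defined above, is often summarised with a single effect-size statistic. The following is a minimal sketch of one common choice, the standardised response mean (mean change divided by the standard deviation of change). The text does not say which statistic the cited studies used, so the choice of statistic, the function name and the scores below are all illustrative assumptions.

```python
from statistics import mean, stdev


def standardised_response_mean(baseline, follow_up):
    """Return mean change divided by the standard deviation of change.

    `baseline` and `follow_up` are paired PROM scores for the same patients.
    Larger absolute values suggest the instrument detects change more readily.
    """
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must be paired")
    if len(baseline) < 2:
        raise ValueError("at least two patients are needed")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    spread = stdev(changes)
    if spread == 0:
        raise ValueError("change scores have zero variance")
    return mean(changes) / spread


# Hypothetical scores for four patients before and after treatment.
baseline_scores = [10, 12, 14, 16]
follow_up_scores = [12, 15, 15, 20]
srm = standardised_response_mean(baseline_scores, follow_up_scores)
print(f"Standardised response mean: {srm:.2f}")
```

Other responsiveness statistics exist; this one is shown only because it needs nothing beyond paired scores.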
This burgeoning of research effort also saw the emergence of bodies such as the International Society for Quality of Life Research (ISOQOL) in 1994 and its associated journal Quality of Life Research, which focus on supporting the development of such instruments and holding conferences to support advancements in the science of measurement. Alongside the development of measures came work to establish appropriate methods, criteria and minimum standards for the psychometric properties for use in research settings.257–260 One example of these criteria, developed by ISOQOL, is reproduced in Table 3.
TABLE 3 The ISOQOL’s minimum standards for the use of PROs in patient-centred and comparative effectiveness research
Criteria: Minimum standard
Conceptual and measurement model: A PRO measure should have documentation defining and describing the concept(s) included and the intended population(s) for use. In addition, there should be documentation of how the concept(s) are organized into a measurement model, including evidence for the dimensionality of the measure, how items relate to each measurement concept and the relationship among concepts included in the PRO measure
Reliability: The reliability of the PRO measure should preferably be at or above 0.70 for group-level comparisons, but may be lower if appropriately justified. Reliability can be estimated using a variety of methods including internal consistency reliability, test–retest reliability or item response theory. Each method should be justified
Content validity: A PRO measure should have evidence supporting its content validity, including evidence that patients and experts consider the content of the PRO measure relevant and comprehensive for the concept, population, and aim of the measurement application. This includes documentation of (1) qualitative and/or quantitative methods used to solicit and confirm attributes (i.e. concepts measured by the items) of the PRO relevant to the measurement application; (2) the characteristics of the participants included in the evaluation (e.g. race/ethnicity, culture, age, gender, socio-economic status, literacy level) with an emphasis on similarities or differences with respect to the target population; and (3) justification for the recall period for the application
Construct validity: A PRO should have evidence supporting its construct validity, including documentation of empirical findings that support predefined hypotheses of the expected associations among measures similar or dissimilar to the measured PRO
Responsiveness: A PRO measure for use in a longitudinal research study should have evidence of responsiveness, including empirical evidence of changes in scores consistent with predefined hypotheses regarding changes in the measured PRO in the target population for the research application
Interpretability of scores: A PRO measure should have documentation to support interpretation of scores, including what low and high scores represent for the measured concept
Translation of the PRO measure: A PRO measure translated to one or more languages should have documentation of the methods used to translate and evaluate the PRO measure in each language. Studies should at least include evidence from qualitative methods (e.g. cognitive testing) to evaluate translation
Patient and investigator burden: A PRO measure must not be overly burdensome for patients or investigators. The length of the PRO measure should be considered in the context of other PRO measures included in the assessment, the frequency of PRO data collection, and the characteristics of the study population. The literacy demand of the items in the PRO measure should usually be at a 6th grade education level or lower (i.e. 12 years old or lower); however, it should be appropriately justified for the context of the proposed application
Reproduced with kind permission from Springer Science+Business Media: Quality of Life Research, ISOQOL Recommends Minimum Standards for Patient-Reported Outcome Measures Used in Patient-Centered Outcomes and Comparative Effectiveness Research, vol. 22, 2013, pp. 1889–90, Reeve B, Wyrwich KW, Wu AW, Velikova G, Terwee CB, Snyder CF, et al., figure 5, © Springer International Publishing AG, Part of Springer Science+Business Media.260
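The reliability criterion in Table 3 names internal consistency as one way of estimating reliability and suggests a value at or above 0.70 for group-level comparisons. The following is a minimal sketch of how internal consistency might be estimated with Cronbach's alpha, a widely used coefficient that Table 3 does not itself name. The function names and the response data are hypothetical and serve only to illustrate the calculation.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Estimate internal consistency for a multi-item scale.

    `responses` is a list of respondents, each a list of item scores of
    equal length. Uses sample variances of each item and of the total score.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


def meets_group_level_standard(alpha, threshold=0.70):
    """Compare an estimate with the 0.70 group-level value quoted in Table 3."""
    return alpha >= threshold


# Hypothetical answers from five patients to a four-item questionnaire.
answers = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [4, 4, 3, 4],
]
alpha = cronbach_alpha(answers)
print(f"alpha = {alpha:.2f}, meets standard: {meets_group_level_standard(alpha)}")
```

As Table 3 notes, a lower value may still be acceptable if appropriately justified, so the threshold check is a prompt for judgement rather than a pass or fail rule.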
An ongoing criticism of PROMs was the ordinal nature of many of the instruments, meaning that the gap between scores of 5 and 6 on a particular PROM was not necessarily the same as the gap between scores of 6 and 7 and, therefore, strictly speaking, they should not be analysed using parametric statistics. In addition, the requirement for instruments with robust psychometric properties had led to the production of instruments with many items that were onerous for patients to complete. Furthermore, PROMs varied in their appropriateness for patient groups with different levels of severity, and often suffered from floor and ceiling effects, which limited their responsiveness to change. In response to these problems, a new generation of instruments was developed based on item response theory or Rasch analysis. These approaches differed from traditional psychometric methods by testing how far items within a measure fitted the requirements for interval-level measurement along a single dimension. Each item could then be plotted on a ‘ruler’, allowing a more precise ordering of items according to their level of ‘difficulty’ or severity.
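The ‘ruler’ idea can be illustrated with the dichotomous Rasch model, in which the probability of endorsing an item depends only on the difference between a person's location and the item's location on the same scale. The following is a minimal sketch under that assumption; the item names and difficulty values are invented for illustration and do not come from any instrument discussed here.

```python
import math


def rasch_probability(person_location, item_difficulty):
    """Probability of endorsing an item under the dichotomous Rasch model.

    Both arguments are in logits on the same scale; the probability is 0.5
    when the person's location equals the item's difficulty.
    """
    return 1.0 / (1.0 + math.exp(-(person_location - item_difficulty)))


def order_items_on_ruler(item_difficulties):
    """Return item names sorted from least to most 'difficult' (severe)."""
    return sorted(item_difficulties, key=item_difficulties.get)


# Hypothetical item difficulties in logits for a physical functioning scale.
difficulties = {
    "cannot run a short distance": -1.5,
    "cannot climb one flight of stairs": 0.0,
    "cannot walk across a room": 2.0,
}

for item in order_items_on_ruler(difficulties):
    probability = rasch_probability(0.5, difficulties[item])
    print(f"{item}: difficulty {difficulties[item]:+.1f}, P(endorse) = {probability:.2f}")
```

Because persons and items share one scale, items can be chosen to match the severity range of a particular patient group, which is one way such instruments aim to reduce floor and ceiling effects.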
Criticisms were also raised about the degree to which standardised HRQoL instruments adequately captured the patient’s perspective.261,262 Many of the early measures were not developed in collaboration with patients and items were developed based on clinical perspectives of what was important to patients.263 Furthermore, the standardised nature of many existing PROMs assumed that all items were equally relevant to patients and there was little scope for patients to indicate how important each dimension was to them. This gave rise to the development of a number of individualised measures, such as the Schedule for the Evaluation of Individual Quality of Life (SEIQoL),264 the Measure Yourself Medical Outcomes Profile,265 the Disease Repercussion Profile266 and the Patient Generated Index.267 These instruments all allow some flexibility for patients to select problems or domains that are particularly important to them and/or rate how important a domain is to them individually. Furthermore, a consensus guideline for the development of standardised PROMs specified that patients should be directly involved in the item generation process.268
In summary, there has been a sharp increase in the number of PROMs available to measure almost every aspect of a patient’s health status, symptoms, functioning and HRQoL. Most of these have been developed for use in research rather than in routine clinical practice. There have also been significant developments in the methodologies underpinning their development and testing, and research has largely focused on the development and psychometric testing of these instruments. As a result of the endeavours described above, a large number of different types of instruments now exist, summarised in Table 4. We now turn to consider their role in the care of individual patients in routine clinical practice.