I will focus on models that have predicted the development of CVD (i.e. fatal/nonfatal CHD, +/- stroke, +/- other atherosclerotic CVD, +/- other non-atherosclerotic CVD), among participants free from CVD at baseline.

1.4.1.1. In Western Europe

In my opinion, the first CVD model was probably derived by Kannel et al. in the mid-1970s from the Framingham study (hereinafter the “Fra-Kannel” model).(181) This paper is remarkable for fulfilling many of the modern criteria for model development (incl. arguing against dichotomizing continuous variables; fitting nonlinear effects such as age-squared in an appropriately theory-driven manner; suggesting a suitable ratio of predictors to events; and accurately identifying 6 core predictors that were later replicated in practically all subsequent CVD models).

In 1982 a refined version of the model was summarized on a simple clinical reference card.(182) This innovation allowed continuous risk factors (such as age, blood pressure and cholesterol) to be categorized into around 10 categories each, so that clinicians could use them without a calculator. The card was also remarkable in operationalizing the age-cholesterol interaction term (figure 10), a feature that then disappeared from subsequent CVD models until its reintroduction in 2014. Curiously, this innovative publication received little clinical or academic attention (for example, attracting just 19 citations over 35 years).

Figure 10. The first semi-A4 sized risk guide, to calculate CVD risk in clinical settings, from (182).


In 1991, Anderson et al. again slightly refined the Framingham models, and persuaded the American Heart Association to endorse them. This led to their adoption by a wider pool of clinicians.(183, 184)

In 2003 the European Society of Cardiology published and endorsed the SCORE model.(185) This was the first attempt to offer one model for use across multiple countries. The European cohorts that were used to derive SCORE were separated into “low risk” and “high risk” regions, with each region deriving its own model. It is unclear by what criterion the authors allocated countries to these two categories. For example, figure 11 illustrates that there is a negligible difference in the incidence of the primary outcome between Sweden (a high-risk country) and Belgium (a low-risk country). Perhaps the authors were aiming for a 50:50 split in the sample size, as this turned out to be 56:44 for the low- and high-risk models, respectively. (This intention could be further criticized, since the high-risk model, having more events, was better powered for model derivation than the low-risk model.) In essence, from its inception, the allocation of countries into “high” and “low” risk status appears relatively arbitrary, thereby limiting the model's real-life calibration. This is further aggravated by the passing of time. Most of the CVD events occurred in the late 1980s, by which point international differences between Russia and capitalist European countries were not half as large as they were in the 2010s.


Figure 11. Derivation of the male SCORE model. Countries that contributed to the “high risk equation” are highlighted with a red box, while countries that contributed to the “low risk equation” are not highlighted. Adapted from (185).

2007 saw the publication of the ASSIGN model in Scotland, which additionally incorporated family history and area-level deprivation.(186) Shortly thereafter, QRISK was published for England; during the period 2007-2017 it was the first and only CVD model derived from electronic healthcare records.(187) One of the strengths of the QRISK project is that models are derived in settings virtually identical to those where they will ultimately be used, which is likely to lead to much better calibration than deriving models from cohort studies with greater selection bias. However, the QRISK project is only able to explore predictors that GPs already consider important and record in their notes. As such, it is not able to test the potential benefits of asking GPs to consider new predictors. In contrast to models from the USA, where the hazard of age increases with age (using an age² term), the English model included the


hazard of old age may additionally be captured by comorbidity markers, like Diabetes and Rheumatoid Arthritis, in QRISK.

One year later, QRISK was updated into QRISK2.(188) As these researchers had access to a large database (with 96 709 events), they added details of further comorbidities, as well as negative interactions with age for most risk factors. The latter could be a marker of latent gene-environment interactions, whereby smokers who survive until age 70 without CVD are unlikely to contract CVD due to smoking alone after the age of 70. However, cholesterol was missing for most participants, which prompted considerable methodological critique. In particular, when cholesterol was imputed with the outcome omitted from the predictor matrix (as was originally done), the imputed cholesterol values differed substantially from those obtained with a predictor matrix that included the outcome (as was later recommended). Following this high-profile example, including the outcome is now recommended as standard practice for dealing with missing data in prediction settings. Omitting the outcome biases the imputed cholesterol values towards the mean, thereby inducing a type of regression dilution bias that artificially attenuates the beta coefficient of cholesterol towards the null.
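The attenuation described above can be illustrated with a small simulation. This is a minimal sketch under loud assumptions, not a reconstruction of the QRISK2 analysis: the data are synthetic, the outcome is continuous rather than time-to-event, a single stochastic regression imputation stands in for multiple imputation, and all names (`simulate`, `impute`, `chol_coefficient`) and effect sizes are hypothetical.

```python
import numpy as np

TRUE_BETA = 0.5  # assumed "true" effect of cholesterol on the outcome


def simulate(n=20000, missing_frac=0.7, seed=0):
    """Synthetic data: age, cholesterol, a continuous outcome, and a missingness mask."""
    rng = np.random.default_rng(seed)
    age = rng.normal(size=n)
    chol = 0.3 * age + rng.normal(size=n)
    y = 0.4 * age + TRUE_BETA * chol + rng.normal(size=n)
    missing = rng.random(n) < missing_frac  # cholesterol missing completely at random
    return rng, age, chol, y, missing


def ols(X, y):
    """Least-squares fit with intercept; returns coefficients and residual SD."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    return coef, resid.std(ddof=X1.shape[1])


def impute(rng, target, predictors, missing):
    """Stochastic regression imputation: prediction plus a random residual draw."""
    coef, sd = ols(predictors[~missing], target[~missing])
    X1 = np.column_stack([np.ones(missing.sum()), predictors[missing]])
    filled = target.copy()
    filled[missing] = X1 @ coef + rng.normal(scale=sd, size=missing.sum())
    return filled


def chol_coefficient(include_outcome):
    """Estimated cholesterol coefficient after imputing with or without the outcome."""
    rng, age, chol, y, missing = simulate()
    if include_outcome:
        predictors = np.column_stack([age, y])  # outcome is in the predictor matrix
    else:
        predictors = age.reshape(-1, 1)         # outcome omitted from the predictor matrix
    chol_imputed = impute(rng, chol, predictors, missing)
    coef, _ = ols(np.column_stack([age, chol_imputed]), y)
    return coef[2]  # order: intercept, age, cholesterol


beta_without = chol_coefficient(include_outcome=False)
beta_with = chol_coefficient(include_outcome=True)
print(f"outcome omitted:  {beta_without:.2f}")
print(f"outcome included: {beta_with:.2f}")
```

In this toy setting, the estimate obtained with the outcome in the imputation model should land close to the assumed true value, while the estimate obtained without it should be pulled markedly towards zero, because the imputed cholesterol values carry no information about the outcome beyond what age already provides.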

In 2014 the American Heart Association switched its endorsement away from the Framingham models, towards the Pooled Cohorts Equation (PCE).(189) This added three further cohorts to the Original Framingham and Framingham Offspring Studies: ARIC (Atherosclerosis Risk in Communities); the Cardiovascular Health Study; and CARDIA (Coronary Artery Risk Development in Young Adults). Most of the participants were recruited in the 1980s, with events occurring during the 1990s.

A summary of the predictors used by these better-established models is shown in table 1.


Table 1. List of predictors used in popular CVD risk prediction models from Europe and the USA

(Fra=Framingham; PCE=Pooled Cohorts Equation; ECG=Electrocardiography; AF=Atrial Fibrillation; BMI=Body Mass Index; HDL=High-Density Lipoprotein; CVD=Cardiovascular Disease)

Importantly, for each of the popular risk prediction models above, the publication which detailed the model derivation did not include any data on external validation. Best practice in model development suggests that external validation is probably one of the most rigorous steps one can take to demonstrate that the models are not overfitted and will perform similarly well when applied to real-life clinical situations (Annex 2). The importance of this cannot be overstated, since most of the caveats and considerations around deriving accurate risk prediction models are centred on trying to minimize overfitting (e.g. how to select variables, handle any departures from linearity, handle interactions, and ensure sufficient power). However, external validation is difficult to conduct, which is why it is rarely reported in papers that derive new models. One external validation study found that the Framingham model performed well in the UK among those of mid-to-high risk, but underestimated risk among those of lowest risk.(190) This is a relatively rare finding. In contrast, most external validation studies find that risk is overpredicted in the validation dataset, where events are less common than expected. This has happened in the USA after testing the PCE model.(191-193) Typically this is because the validation dataset is healthier, for example due to period effects (194), or alternatively due to overfitting.
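Overprediction of this kind is usually quantified by comparing observed with expected events in the validation data. The sketch below is illustrative only: it uses synthetic data in which the true risk is assumed to be 70% of the predicted risk, treats the outcome as a simple binary indicator (ignoring censoring and follow-up time), and uses hypothetical function names.

```python
import numpy as np


def observed_expected_ratio(predicted_risk, observed_event):
    """Overall O/E ratio: values below 1 mean the model overpredicts risk."""
    predicted_risk = np.asarray(predicted_risk, dtype=float)
    observed_event = np.asarray(observed_event, dtype=float)
    return float(observed_event.sum() / predicted_risk.sum())


def calibration_by_group(predicted_risk, observed_event, n_groups=10):
    """Mean predicted vs observed risk within groups ordered by predicted risk."""
    predicted_risk = np.asarray(predicted_risk, dtype=float)
    observed_event = np.asarray(observed_event, dtype=float)
    order = np.argsort(predicted_risk)
    return [
        (float(predicted_risk[idx].mean()), float(observed_event[idx].mean()))
        for idx in np.array_split(order, n_groups)
    ]


# Synthetic validation cohort (assumption): the population is healthier than
# the derivation cohort, so true risk is 0.7 times the predicted risk.
rng = np.random.default_rng(1)
predicted = rng.uniform(0.01, 0.30, size=50000)
events = rng.random(50000) < 0.7 * predicted

oe = observed_expected_ratio(predicted, events)
table = calibration_by_group(predicted, events)
print(f"O/E ratio: {oe:.2f}")
for expected, observed in table:
    print(f"expected {expected:.3f}  observed {observed:.3f}")
```

An O/E ratio well below 1, together with observed risk falling below expected risk across the groups, is the pattern described above for a validation dataset that is healthier than the derivation cohort.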

Of note, early models like Framingham omitted socioeconomic status. If such a model had been implemented, it would have widened socioeconomic inequalities in CVD.(195, 196) By the same token, it is plausible that widespread use of an international model such as SCORE, which does not sufficiently capture the underlying difference in baseline rates between Russia and Western Europe, may widen international health inequalities by encouraging underuse of preventative interventions in the countries at greatest risk.
