
Traditional methods were employed using Stata (version 12.1; StataCorp LP, College Station, TX, USA) to develop the following models:

• risk model 1 – the point of admission to hospital

• risk model 2 – predicting new AKI at 72 hours

• risk model 3 – predicting worsening AKI at 72 hours.

Traditional model development

Risk model 1: the point of admission to hospital

The data analysis considered emergency admissions for patients without pre-admission AKI. Patients with pre-admission AKI were omitted from the analysis. However, patients whose pre-admission AKI status was unknown were kept in the data analysis. Non-emergency admissions were also omitted, as were admissions associated with childbirth and pregnancy. Patients with no information on AKI at admission were also omitted from the analysis.

After omissions, the full data set was split into a ‘development’ data set for constructing the risk model and a ‘validation’ data set on which to evaluate the performance of the model. To allow more data on which to construct the model, a 3:1 ratio was employed, with the development data set being the larger of the two. Selection of admissions to one of the two data sets was done at random.
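The 3:1 random split can be illustrated with a minimal Python sketch. The original analysis was run in Stata, so this is not the authors' code. The function name `split_admissions`, the seed and the integer admission identifiers are assumptions made for illustration only.

```python
import random


def split_admissions(admission_ids, development_fraction=0.75, seed=2024):
    """Randomly assign admissions to development and validation sets (3:1)."""
    rng = random.Random(seed)  # hypothetical seed, fixed so the split is reproducible
    shuffled = list(admission_ids)
    rng.shuffle(shuffled)
    n_development = round(len(shuffled) * development_fraction)
    return shuffled[:n_development], shuffled[n_development:]


# Hypothetical data: 1000 admissions identified by consecutive integers.
development, validation = split_admissions(range(1000))
print(len(development), len(validation))
```

Note that the split here is by admission, as described in the text, so a patient with several admissions may contribute to both data sets.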

The outcome variable was AKI stage, which was considered as an ordinal measure, the categories being:

• no AKI

• AKI stage 1

• AKI stage 2

• AKI stage 3.

To allow for the ordinal nature of the outcome, all analysis was performed using ordinal logistic regression. There were multiple admissions for some patients; to allow for this in the data analysis, robust standard errors were used.
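To make the ordinal model concrete, the following Python sketch shows how a proportional-odds (ordinal logistic) model turns a linear predictor and three cut-points into probabilities for the four AKI categories. It assumes the common parameterisation P(Y ≤ k) = 1 / (1 + exp(−(cut-point_k − linear predictor))). The cut-points and linear predictor value are invented for illustration and are not estimates from the study.

```python
import math


def ordinal_probabilities(linear_predictor, cutpoints):
    """Category probabilities under a proportional-odds model.

    Assumes P(Y <= k) = logistic(cutpoint_k - linear_predictor),
    with cutpoints given in increasing order.
    """
    def logistic(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Cumulative probabilities for each category; the last one is always 1.
    cumulative = [logistic(c - linear_predictor) for c in cutpoints] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


categories = ["no AKI", "AKI stage 1", "AKI stage 2", "AKI stage 3"]
# Hypothetical values, not taken from the fitted models.
probs = ordinal_probabilities(linear_predictor=0.8, cutpoints=[1.5, 3.0, 4.5])
for name, p in zip(categories, probs):
    print(f"{name}: {p:.3f}")
```

A larger linear predictor shifts probability away from 'no AKI' and towards the higher stages, which is how the predictors act in this kind of model.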

Initially, the association between each factor and AKI stage was examined individually in a series of univariable analyses (see Chapter 3, Risk model 1: the point of admission to hospital).

There were no missing data for the key demographics (e.g. age, sex). In some cases the primary diagnosis was missing, and these patients would have been excluded from the final model. For the blood test variables missingness was deemed to be informative (i.e. data missing not at random) and hence missing values were given their own category.
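Giving missing blood test values their own category can be sketched as follows. The function name, the band labels and the cut-points are hypothetical; the source does not state how each blood test was categorised.

```python
def categorise_blood_test(value, cut_points):
    """Return a category label, keeping missing values as a separate category.

    cut_points are hypothetical upper bounds in increasing order;
    a value below cut_points[i] (and not below any earlier one) falls in band i.
    """
    if value is None:
        return "missing"  # missingness treated as informative, not imputed
    for index, upper in enumerate(cut_points):
        if value < upper:
            return f"band_{index}"
    return f"band_{len(cut_points)}"


# Hypothetical cut-points for an unnamed blood test.
example_cut_points = [3.5, 5.3]
labels = [categorise_blood_test(v, example_cut_points) for v in (3.0, 4.0, 6.0, None)]
print(labels)
```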

Subsequently, the joint association between the factors and AKI stage was examined in a multivariable analysis. Variance inflation factors were used to assess collinearity between the predictor variables. Where collinearity was found, action was taken to allow for this. This included either excluding variables from this stage of the analysis or combining variables together. A backwards selection procedure was used to retain only the statistically significant variables in the final model.
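As an illustration of the collinearity check, the sketch below computes a variance inflation factor (VIF) in the simplest case of two predictors, where VIF = 1 / (1 − r²) and r is their Pearson correlation. With more predictors, r² is replaced by the R² from regressing one predictor on all the others. The data are invented, and the threshold at which the authors acted on a VIF is not given in the source.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(xs, ys):
    """VIF for either predictor when the model has exactly two predictors."""
    r = pearson_r(xs, ys)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictor values.
predictor_a = [1, 2, 3, 4, 5]
predictor_b = [2, 1, 4, 3, 5]
vif = vif_two_predictors(predictor_a, predictor_b)
print(round(vif, 2))
```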

Risk model 2: predicting new acute kidney injury at 72 hours

A specific patient group was selected for risk model 2, to predict AKI at 72 hours based on data available up to the end of the first 24 hours after hospital admission. The following patients were excluded from the data set:

• non-emergency admissions

• patients with pre-admission AKI

• patients with AKI at admission

• childbirth/pregnancy admissions

• patients with no information on AKI at 72 hours.

After omissions, the full data set was split into a ‘development’ data set for constructing the risk model and a ‘validation’ data set on which to evaluate the performance of the model. To allow more data on which to construct the model, a 3:1 ratio was employed, with the development data set being the larger of the two. Selection of admissions to one of the two data sets was done at random using pseudo-random numbers.

The outcome variable was AKI stage at 72 hours, which was considered as an ordinal measure, the categories being:

• no AKI

• AKI stage 1

• AKI stage 2

To allow for the ordinal nature of the outcome, all analysis was performed using ordinal logistic regression. There were multiple admissions for some patients; to allow for this in the data analysis, robust standard errors were used.

Initially, the association between each factor and AKI stage was examined individually in a series of univariable analyses.

Subsequently, the joint association between the factors and AKI stage was examined in a multivariable analysis. Variance inflation factors were used to assess collinearity between the predictor variables. Where collinearity was found, action was taken to allow for this. This included either excluding variables from this stage of the analysis or combining variables together. A backwards selection procedure was used to retain only the statistically significant variables in the final model.

Risk model 3: predicting worsening acute kidney injury at 72 hours

The third risk model is designed to predict worsening AKI by 72 hours in patients with AKI stage 1 or AKI stage 2 at admission. Note that patients with AKI stage 3 were already at the highest stage and so were not included in the analysis.

A specific patient group was selected for risk model 3. The following patients were excluded from the data set:

• non-emergency admissions

• patients with pre-admission AKI

• patients with no AKI at admission

• patients with AKI stage 3 at admission

• childbirth/pregnancy admissions

• patients with no information on AKI at 72 hours.

After omissions, the full data set was split into a ‘development’ data set for constructing the risk model and a ‘validation’ data set on which to evaluate the performance of the model. To allow more data on which to construct the model, a 3:1 ratio was employed, with the development data set being the larger of the two. Selection of admissions to one of the two data sets was done at random using pseudo-random numbers.

The outcome was defined as a worsening in AKI, considered as a binary variable. A worsening in this case was regarded as a higher AKI stage. In other words, worsening was defined as a change to AKI stage 2 or AKI stage 3 for patients with AKI stage 1 on admission, and a change to AKI stage 3 for patients with AKI stage 2 on admission.
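The binary outcome for risk model 3 can be expressed as a short Python sketch. Stages are coded 0 (no AKI) to 3 here, which is an assumed coding, and `is_worsening` is a hypothetical helper name.

```python
def is_worsening(stage_at_admission, stage_at_72_hours):
    """Return True if AKI stage at 72 hours is higher than at admission.

    Stages are coded 0 (no AKI) to 3; only admissions with stage 1 or 2
    at admission are eligible for risk model 3.
    """
    if stage_at_admission not in (1, 2):
        raise ValueError("risk model 3 covers AKI stage 1 or 2 at admission only")
    return stage_at_72_hours > stage_at_admission


print(is_worsening(1, 3), is_worsening(2, 2))
```

Improvement and no change are both coded as 'not worsening', in line with the definition above.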

As there were repeated admissions for some patients, the analysis was performed using multilevel statistical methods. Two-level models were used, with admissions nested within patients. To allow for the binary nature of the outcome, all analysis was performed using multilevel logistic regression.

Initially, the association between each factor and AKI stage was examined individually in a series of univariable analyses. Subsequently, the joint association between the factors and AKI stage was examined in a multivariable analysis. Variance inflation factors were used to assess collinearity between the predictor variables.

Traditional model validation

Validation in this population

The risk models for AKI were developed using three-quarters of the original patient group meeting the eligibility criteria. Each model was, therefore, validated on the remaining one-quarter of the data. The validation concentrated on two aspects of the model: discrimination (the ability to distinguish between cases with a high and a low risk of AKI) and calibration (whether or not the risk of AKI from the fitted model matches that in the observed data).

The model was fitted considering AKI on a 4-point scale: no AKI, AKI stage 1, AKI stage 2 and AKI stage 3. Thus, the model can be used to obtain predicted probabilities of being in each of the four AKI categories. Although this is useful, it is harder to validate the model with a 4-point outcome scale. Thus, for the purposes of validation, two different cut-off points were used. First, the probabilities were combined to give the probability of AKI (AKI stage 1, 2 or 3), which was compared with the occurrence of AKI in the data. A second set of analyses split the data into no AKI and AKI stage 1 versus AKI stage 2 and AKI stage 3, and compared this with the occurrence of AKI stage 2 and AKI stage 3 in the data.
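The two cut-off points amount to summing predicted category probabilities, as in this minimal Python sketch. The function name and the example probabilities are illustrative and do not come from the fitted models.

```python
def collapse_probabilities(p_none, p_stage1, p_stage2, p_stage3):
    """Collapse four-category predictions into the two binary validation outcomes."""
    return {
        # First cut-off: any AKI (stage 1, 2 or 3) versus no AKI.
        "any_aki": p_stage1 + p_stage2 + p_stage3,
        # Second cut-off: AKI stage 2 or 3 versus no AKI or AKI stage 1.
        "stage2_or_3": p_stage2 + p_stage3,
    }


# Hypothetical predicted probabilities for one admission.
collapsed = collapse_probabilities(0.70, 0.20, 0.07, 0.03)
print(collapsed)
```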

The first approach used was to split the validation data set into risk groups based on the predicted probabilities. For each analysis, four different risk categories were considered. Within each risk category, the actual occurrence of AKI was assessed and compared with the predictions; this assesses both the discrimination and the calibration of the model.
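The risk-group comparison can be sketched in Python as follows. The three cut-offs that define the four risk categories are assumptions, because the source does not state the boundaries used. The predictions and outcomes are invented.

```python
from bisect import bisect_right


def risk_group_table(predicted, observed, cut_offs=(0.05, 0.10, 0.20)):
    """Compare mean predicted risk with the observed event rate per risk group.

    cut_offs are hypothetical boundaries giving len(cut_offs) + 1 groups;
    a prediction equal to a cut-off falls in the higher group.
    """
    groups = [{"n": 0, "predicted_sum": 0.0, "events": 0}
              for _ in range(len(cut_offs) + 1)]
    for p, y in zip(predicted, observed):
        group = groups[bisect_right(cut_offs, p)]
        group["n"] += 1
        group["predicted_sum"] += p
        group["events"] += y
    table = []
    for group in groups:
        n = group["n"]
        table.append({
            "n": n,
            "mean_predicted": group["predicted_sum"] / n if n else None,
            "observed_rate": group["events"] / n if n else None,
        })
    return table


# Hypothetical validation data: predicted probability of AKI and outcome (1 = AKI).
predicted = [0.02, 0.04, 0.07, 0.15, 0.30, 0.40]
observed = [0, 0, 1, 0, 1, 0]
table = risk_group_table(predicted, observed)
for row in table:
    print(row)
```

Observed rates that rise across the groups indicate discrimination, and observed rates close to the mean predicted risk within each group indicate calibration.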

Second, the discrimination between high- and low-risk cases was assessed by calculating the area under the receiver operating characteristic (ROC) curve (AUROC). One suggested interpretation of the area under the curve (AUC) values is:

• 0.5–0.6: no discrimination

• 0.6–0.7: poor discrimination

• 0.7–0.8: fair discrimination

• 0.8–0.9: good discrimination

• 0.9–1.0: excellent discrimination.
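One way to compute the AUROC, shown here as a minimal Python sketch rather than the Stata routine used in the study, is as the proportion of case/non-case pairs in which the case received the higher predicted risk, with ties counted as one-half. The example data are invented.

```python
def auroc(predicted, observed):
    """Area under the ROC curve from pairwise comparisons of predicted risks."""
    cases = [p for p, y in zip(predicted, observed) if y == 1]
    non_cases = [p for p, y in zip(predicted, observed) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_risk in cases:
        for non_case_risk in non_cases:
            if case_risk > non_case_risk:
                concordant += 1.0
            elif case_risk == non_case_risk:
                concordant += 0.5  # tied predictions count as half
    return concordant / (len(cases) * len(non_cases))


# Hypothetical predictions and outcomes (1 = AKI, 0 = no AKI).
area = auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(area)  # 0.75: three of the four case/non-case pairs are ranked correctly
```

The pairwise loop is quadratic in the number of admissions, so a rank-based calculation would be preferable for a large validation set.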

Note that the AUROC value is equivalent to the c-statistic, sometimes used for model assessment. A final set of analyses examined the difference between the observed outcome and that predicted by the model using the Hosmer–Lemeshow test. This divided the admissions into the same four categories described earlier and compared the observed number and the predicted number in each category. A non-significant result would imply little difference between observed and expected numbers, and thus a good fit of the model to the data.
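The Hosmer–Lemeshow statistic itself can be sketched as below, using the usual form in which each group contributes (observed − expected)² / (expected × (1 − expected / n)). The group counts are invented. The sketch stops at the statistic: the p-value comes from a chi-squared distribution, and the degrees of freedom used in the study are not stated in the source.

```python
def hosmer_lemeshow_statistic(groups):
    """Hosmer-Lemeshow chi-squared statistic.

    groups is a list of (n, observed_events, expected_events) tuples,
    one per risk category; expected_events is the sum of predicted
    probabilities in that category and must lie strictly between 0 and n.
    """
    statistic = 0.0
    for n, observed_events, expected_events in groups:
        variance = expected_events * (1.0 - expected_events / n)
        statistic += (observed_events - expected_events) ** 2 / variance
    return statistic


# Hypothetical counts for four risk categories.
example_groups = [(100, 2, 3.0), (100, 8, 7.0), (100, 15, 14.0), (100, 30, 32.0)]
hl = hosmer_lemeshow_statistic(example_groups)
print(round(hl, 3))
```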

Validation in a second population

The population of East Kent is older and has fewer members of ethnic minorities than the general population of England. It was, therefore, important that we validated our models in a second population in order to assess the generalisability of the models across the NHS. For our second population we chose Medway NHS Foundation Trust, which constitutes both a different demographic population and a different NHS trust from which to extract data.

The method of validation is equivalent to that used in the East Kent data set. The model assessed differs slightly from that detailed for the East Kent data set, as it excludes the number of drugs given, which was not measured in the Medway data set.
