Discussion - GLOBAL SUMMARY OF THE PUBLICATIONS PRESENTED

3.4.7.1 Rate of surgical site infection (SSI)

SSI rates were calculated by dividing the number of SSIs by the number of participants and are presented per 100 participants. The 95% confidence intervals around these estimates were then calculated using Minitab 18 (Minitab Inc, USA).
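The rate calculation can be illustrated with a minimal sketch. The text does not state which interval method was used in Minitab, so the Wilson score interval below is an assumption chosen only because it needs no external library; the function name and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ssi_rate_per_100(n_ssi, n_participants, confidence=0.95):
    """Return (rate, lower, upper), all expressed per 100 participants.

    Assumption: a Wilson score interval is used; the source does not
    say which interval method was applied.
    """
    if n_participants <= 0 or not 0 <= n_ssi <= n_participants:
        raise ValueError("counts must satisfy 0 <= n_ssi <= n_participants, n > 0")
    p = n_ssi / n_participants
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    denom = 1 + z ** 2 / n_participants
    centre = (p + z ** 2 / (2 * n_participants)) / denom
    half_width = (
        z * sqrt(p * (1 - p) / n_participants + z ** 2 / (4 * n_participants ** 2))
        / denom
    )
    return 100 * p, 100 * (centre - half_width), 100 * (centre + half_width)


# Illustrative counts only, not data from the study
rate, lower, upper = ssi_rate_per_100(10, 100)
print(f"{rate:.1f} per 100 (95% CI {lower:.1f} to {upper:.1f})")
```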

3.4.7.2 Length of stay

Because length of stay was a non-normally distributed continuous variable, the median and interquartile range were used to summarise it. Median values for those with and without the outcome of interest were compared using the nonparametric Mann-Whitney U test. A p-value less than 0.05 was considered statistically significant. Length of stay data were analysed in SPSS version 22.
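The summary and comparison described above might look like the following sketch. It assumes a two-sided test with a normal approximation and tie correction; the text does not say whether an exact or asymptotic p-value was taken from SPSS. The function names and the quartile method are illustrative choices.

```python
from math import sqrt
from statistics import NormalDist, median, quantiles


def median_iqr(values):
    """Return (median, (Q1, Q3)) for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), (q1, q3)


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, p-value). Assumption: no continuity
    correction is applied. Not meaningful if every pooled value is identical,
    because the variance is then zero.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j
        avg_rank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - n1 * n2 / 2) / sqrt(var_u)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, p_value


# Illustrative lengths of stay in days, not data from the study
with_ssi = [12, 15, 21, 30, 44]
without_ssi = [5, 6, 7, 7, 9, 11]
print(median_iqr(with_ssi), median_iqr(without_ssi))
print(mann_whitney_u(with_ssi, without_ssi))
```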

3.4.7.3 Development of the model exploring undernutrition and surgical site infection

The statistical modelling framework used to explore the effect of undernutrition on SSI was logistic regression (Hosmer et al, 2013) since the outcome of interest (occurrence of SSI) was binary. Some patients underwent more than one operation and so, in theory, the data structure could be multi-level with variability at the patient level and variability at the operation within patient level. In this case, account would need to be taken in the modelling of the correlation between multiple operations on the same patient and therefore a multilevel binary logistic regression model would have been appropriate.

It was decided a priori that if laboratory data were missing, imputation of missing values would not be undertaken because there was no sound basis for doing this. Consequently, when the number of missing values for a variable was small, cases with missing data for that variable were excluded from the statistical analyses, and it was acknowledged that this would reduce sample size by a small amount. If the extent of missing values was deemed large, then the variable was not included in any analyses. For example, 49% of values for preoperative ferritin were missing, therefore this potential predictor variable was excluded from any analyses.

Figure 3.1 shows a summary of the steps in model development. A univariate model is defined as a model with a single predictor and is often called a univariate analysis. It is the simplest method of describing a relationship between a predictor variable and the outcome. It acts as a screening process in order to identify potential predictor variables (or risk factors). However, it does not adjust for the combined effect of other predictor variables which may be impacting on outcome. An adjusted model, therefore, is one which includes several predictor variables, and is often called a multivariate analysis. Regression analysis can predict a dependent variable (outcome) from one or more independent predictor variables (Myles & Gin, 2001).
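The missing-data rule described above can be sketched as follows. The text does not give a numeric cut-off for when missingness was "deemed large", so `max_missing_fraction` and its default are hypothetical, as are the function name, the variable names and the example records.

```python
def handle_missing(records, predictors, max_missing_fraction=0.2):
    """Apply a complete-case rule without imputation.

    records: list of dicts, with None marking a missing value.
    max_missing_fraction: hypothetical cut-off; the source gives no number.
    Returns (kept_predictors, dropped_predictors, complete_records).
    """
    n = len(records)
    kept, dropped = [], []
    for var in predictors:
        missing = sum(1 for r in records if r.get(var) is None)
        # Heavily missing variables are dropped from all analyses
        (dropped if missing / n > max_missing_fraction else kept).append(var)
    # Remaining variables: exclude cases with any missing value
    complete = [r for r in records if all(r.get(v) is not None for v in kept)]
    return kept, dropped, complete


# Illustrative records only, not data from the study
records = [
    {"albumin": None if i == 0 else 38.0, "ferritin": None if i < 5 else 50.0}
    for i in range(10)
]
kept, dropped, complete = handle_missing(records, ["albumin", "ferritin"])
print(kept, dropped, len(complete))
```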

3.4.7.2 Variable selection strategy

Potential predictor variables were excluded from univariate analysis when rates of incidence were less than 3%, to prevent small numbers of cases becoming highly influential. This excluded two potential patient predictor variables (prematurity and other chromosomal abnormalities apart from Trisomy 21) from univariate analysis. The chosen operative predictor variable ‘location of operation’ was collapsed and entered into the modelling as a binary variable (main theatre or temporary theatre), as maintenance work within the operating theatre department necessitated a change of cardiac theatre location for part of the study period. Operations occurring in the Cardiac Catheter Laboratory were excluded due to small numbers (n = 2).
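As a rough sketch of the 3% incidence screen, assuming each candidate is a binary flag stored on a per-case record (the function name, field names and counts are hypothetical):

```python
def filter_low_incidence(records, binary_predictors, min_rate=0.03):
    """Split binary predictors into (kept, excluded) by incidence.

    A predictor is excluded when the proportion of cases in which it is
    present is below min_rate (3% in the text).
    """
    n = len(records)
    kept, excluded = [], []
    for var in binary_predictors:
        rate = sum(1 for r in records if r[var]) / n
        (excluded if rate < min_rate else kept).append(var)
    return kept, excluded


# Illustrative records only: 2 of 100 premature, 10 of 100 with Trisomy 21
records = [{"prematurity": i < 2, "trisomy_21": i < 10} for i in range(100)]
print(filter_low_incidence(records, ["prematurity", "trisomy_21"]))
```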

A summary of the variable selection process is provided in Figure 3.1. Univariate logistic regression was used to assess the potential association between each predictor variable and the presence of surgical site infection (step 2). For the next step, potential predictor variables (risk factors) were selected if the p-value for Wald’s test was less than 0.15 in the univariate analysis (step 3) or excluded if the p-value > 0.15 (step 4). This p-value criterion of 0.15 was chosen because this was an exploratory study into potential risk factors for SSI. The p-value of 0.15 for forward selection of predictors is virtually equivalent to using the Akaike Information Criterion (Akaike, 1974) for forward selection of a single variable, and this threshold p-value is in common usage. Selected variables then entered a multivariate logistic regression analysis to identify independent risk factors for SSI using the same p-value criterion of 0.15 (step 5). This model was termed the ‘maximal model’. If a predictor variable had more than two categories, it was entered in the model if any category showed a p-value less than 0.15.
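The closeness of the 0.15 threshold to the Akaike Information Criterion can be checked with a short calculation. Adding one parameter lowers the AIC only when the likelihood-ratio chi-square statistic exceeds 2, and for one degree of freedom the corresponding tail probability is about 0.157. The sketch below is stated for the likelihood-ratio test, whereas the analysis described here used Wald's test; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def aic_equivalent_p_value():
    """P(chi-square with 1 df > 2), the p-value implied by AIC for one variable.

    A chi-square variable with 1 df is the square of a standard normal, so
    P(X > 2) = 2 * (1 - Phi(sqrt(2))).
    """
    return 2 * (1 - NormalDist().cdf(sqrt(2.0)))


print(round(aic_equivalent_p_value(), 4))
```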

The test p-values in the maximal model were scrutinised and potential predictors were removed if the p-value exceeded 0.15 for all categories. All variables achieving a p-value less than 0.15 in the maximal model were retained to form the ‘minimal model’ (step 6). In the final stage, all predictor variables that were excluded from the maximal model at step 4 were added individually to the minimal model (step 7). If any of these showed a p-value less than 0.15, it was selected for inclusion in the final model (step 8). The final model (step 9) thus contained all the variables in the minimal model plus those variables additionally selected in the final stage.
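The selection steps described above can be sketched as a short procedure. This is an outline of the control flow only, not the SPSS analysis itself. The callable `fit_p_values` is a hypothetical stand-in for fitting a logistic regression on the given variables and returning, for each one, its smallest Wald p-value across categories. The stub below returns made-up p-values purely to exercise the logic.

```python
def select_variables(candidates, fit_p_values, threshold=0.15):
    """Outline of steps 2 to 9 of the variable selection strategy.

    fit_p_values(variables) -> {variable: p-value} is a hypothetical callable
    wrapping whatever software fits the logistic regression.
    """
    # Steps 2-4: univariate screening, one predictor at a time
    screened_in, screened_out = [], []
    for var in candidates:
        p = fit_p_values([var])[var]
        (screened_in if p < threshold else screened_out).append(var)

    # Step 5: maximal model containing every screened-in predictor
    maximal_p = fit_p_values(screened_in)

    # Step 6: minimal model keeps predictors still below the threshold
    minimal = [v for v in screened_in if maximal_p[v] < threshold]

    # Steps 7-8: add each previously excluded predictor back individually
    added = []
    for var in screened_out:
        if fit_p_values(minimal + [var])[var] < threshold:
            added.append(var)

    # Step 9: final model
    return {"maximal": screened_in, "minimal": minimal, "final": minimal + added}


def stub_fit(variables):
    """Made-up p-values for illustration; not results from the study."""
    alone = len(variables) == 1
    table = {"a": 0.01, "b": 0.10 if alone else 0.40,
             "c": 0.30 if alone else 0.05, "d": 0.90}
    return {v: table[v] for v in variables}


print(select_variables(["a", "b", "c", "d"], stub_fit))
```

In this toy run, `b` passes screening but drops out of the maximal model, while `c` fails screening and is recovered when added back to the minimal model.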

It is important to determine a variable selection strategy in advance of data analysis to avoid subjective bias in the selection of variables, although some variables (such as age and gender) might be chosen a priori.

Figure 3.1: Flow diagram for variable selection

3.4.8.3 Goodness of fit

Model evaluation and goodness of fit were assessed using the receiver operating characteristic curve and the Hosmer-Lemeshow chi-square test statistic (Hosmer et al, 2013).
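For orientation, the Hosmer-Lemeshow statistic compares observed and expected event counts within groups of cases ordered by predicted risk. The sketch below assumes equal-sized groups formed from the sorted predictions, which may differ from the grouping used by SPSS. The function name and the example values are illustrative.

```python
def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Return (chi_square, degrees_of_freedom) for the Hosmer-Lemeshow test.

    Assumption: cases are sorted by predicted probability and cut into
    `groups` roughly equal-sized groups. A group whose mean predicted
    probability is exactly 0 or 1 would cause a division by zero.
    """
    pairs = sorted(zip(y_prob, y_true), key=lambda pair: pair[0])
    n = len(pairs)
    chi_square = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # observed events in the group
        expected = sum(p for p, _ in chunk)   # expected events in the group
        mean_p = expected / size
        chi_square += (observed - expected) ** 2 / (size * mean_p * (1 - mean_p))
    return chi_square, groups - 2


# Illustrative values only, not data from the study
outcomes = [0, 0, 1, 1, 0, 1, 1, 1]
predicted = [0.25] * 4 + [0.75] * 4
print(hosmer_lemeshow(outcomes, predicted, groups=2))
```

The statistic is referred to a chi-square distribution with the number of groups minus two degrees of freedom; a small statistic (large p-value) indicates no evidence of poor calibration.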

Classification tables were used to assess the sensitivity and specificity of the final model in predicting risk of SSI. A receiver operating characteristic (ROC) curve shows sensitivity plotted against 1-specificity for the entire range of possible risk cut points. This measure is now the standard for evaluating a prognostic model’s ability to assign, in general, higher probabilities of the outcome to the subgroup who develop the outcome than to those who do not (Pepe, 2004). Therefore, the area under the curve (AUC) was used to evaluate the final model’s ability to discriminate between those children who experienced SSI and those who did not.
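A minimal sketch of these two measures follows. The AUC is computed as the proportion of (SSI, no SSI) pairs in which the case with SSI received the higher predicted probability, with ties counted as one half, which equals the area under the ROC curve. The function names, the 0.5 cut point and the example values are illustrative assumptions.

```python
def sensitivity_specificity(y_true, y_prob, cutoff=0.5):
    """Return (sensitivity, specificity) when predicting an event if p >= cutoff."""
    pairs = list(zip(y_true, y_prob))
    tp = sum(1 for y, p in pairs if y == 1 and p >= cutoff)
    fn = sum(1 for y, p in pairs if y == 1 and p < cutoff)
    tn = sum(1 for y, p in pairs if y == 0 and p < cutoff)
    fp = sum(1 for y, p in pairs if y == 0 and p >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, y_prob):
    """Area under the ROC curve via pairwise comparison of predicted risks."""
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    negatives = [p for y, p in zip(y_true, y_prob) if y == 0]
    score = 0.0
    for p_pos in positives:
        for p_neg in negatives:
            if p_pos > p_neg:
                score += 1.0      # concordant pair
            elif p_pos == p_neg:
                score += 0.5      # tied pair
    return score / (len(positives) * len(negatives))


# Illustrative values only, not data from the study
outcomes = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(sensitivity_specificity(outcomes, predicted))
print(auc(outcomes, predicted))
```

An AUC of 0.5 corresponds to discrimination no better than chance, and 1.0 to perfect separation of the two groups.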

All regression analyses were performed using IBM SPSS Versions 21 and 22, with Excel used for plotting of the final ROC curve.

In document UNIVERSIDAD MIGUEL HERNÁNDEZ DE ELCHE (pages 135-139)