The generalisability of the findings of the meta-analysis could not be determined without knowing how consistent the results of the studies were and, therefore, how consistent the effect was likely to be in future studies of other samples (Higgins, 2003). Traditionally, high heterogeneity in meta-analyses was interpreted as an indication that the meta-analytic method was inappropriate for the data, but it is now widely recognised that meta-analyses can be used to examine the inconsistency in the data (Higgins, 2011, Section 9.5.3).
Cochran’s Q test is the standard test of heterogeneity, where a p value less than 0.05 indicates significant between-study variability. However, the test is recognised as having low power to detect true heterogeneity when a meta-analysis includes few studies, and excessive power to detect clinically unimportant heterogeneity when it includes many (Higgins, 2002). The results of the test are therefore less reliable in meta-analyses of a small number of studies (Hardy, 1998). In this review, the planned subgroup analyses used the information from as few as three studies (described in Section 3.9.4) and significant heterogeneity was expected due to the clinical diversity of the outcomes and study samples.
Accordingly, as well as reporting the results of the test of heterogeneity (referred to the chi-squared distribution), I also used a measure of the extent of the heterogeneity. The I² statistic indicates the proportion of the observed variance that reflects real differences in the effect size, and is not inherently dependent on the number of studies in the analysis. It expresses the between-study dispersion relative to the total observed dispersion (within and between studies), thus providing a measure of the consistency of the effect across the studies in the review (Borenstein, 2009, Part 4). As a standard, 25%, 50% and 75% are considered the thresholds for low, moderate and high heterogeneity, respectively (Higgins, 2003). I² is recommended as a summary measure of the impact of heterogeneity on the findings and possible recommendations (Higgins, 2002).
Heterogeneity attributable to clinical or methodological diversity was expected to decrease in subgroup analyses as the subgroups were homogenised on a common characteristic (e.g. the same outcome or same disability diagnosis). If high heterogeneity persisted in both the overall and subgroup analyses, it was treated as unexplained, which limited the generalisability of the findings.
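The relationship between Cochran's Q and I² can be illustrated with a minimal Python sketch. It is not the Stata routine used in this review: the function name and the example values are hypothetical, and it assumes inverse-variance weights and the standard definition I² = (Q - df) / Q, truncated at zero.

```python
def cochran_q_and_i2(effects, std_errors):
    """Return (Q, df, I2 as a percentage) for a set of study effect sizes.

    Assumes inverse-variance weights; names and inputs are illustrative only.
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching inputs")
    # Weight each study by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of Q in excess of its expected value (df), floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


if __name__ == "__main__":
    # Hypothetical standardised mean differences and standard errors.
    q, df, i2 = cochran_q_and_i2([0.2, 0.5, 0.8], [0.1, 0.1, 0.1])
    print(f"Q = {q:.2f} on {df} df, I2 = {i2:.1f}%")
```

With these made-up inputs, I² falls above the 75% threshold and would be read as high heterogeneity.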
3.9.5.5 Predictive intervals
To estimate the average true effect, random effects models assume that the true effect varies between studies, but the pooled estimate does not convey the width of the distribution of the effect across the individual studies (IntHout, 2016). The confidence interval for the pooled estimate can be misleading because it describes only the uncertainty in the average effect and does not account for the between-study variation in the true effect. Particularly where there is high heterogeneity, a statistically significant pooled estimate should be treated with extreme caution as the confidence interval does not give a realistic indication of the estimated true range of effect (IntHout, 2016). This can lead to the findings being overgeneralised.
Predictive (also known as credible) intervals can be calculated to present the expected range of true effects in subsequent similar studies. The predictive interval effectively converts the heterogeneity into the same metric as the effect size to give the range within which the effect would be situated in a new study with 95% certainty. Given the heterogeneity, this interval facilitated a more realistic interpretation of the effect and its clinical implications. Originally used to summarise the effects of clinical trials, the use of predictive intervals has grown in the field of epidemiology as a means of presenting more accurate results from data with high heterogeneity (Cole, 2003; Kane, 2011; Tham, 2014).
I calculated predictive intervals for the overall and subgroup pooled estimates (Riley, 2011). In Stata, the interval was generated using the rfdist option of the metan command. The interval incorporated uncertainty in the location and spread of the effect using the formula $t_{k-2} \times \sqrt{se^2 + \tau^2}$, where $t_{k-2}$ is taken from the t-distribution with k-2 degrees of freedom, k is the number of studies, $se^2$ is the squared standard error and $\tau^2$ is the heterogeneity statistic (StataCorp, 2017).
The predictive intervals were shown on the forest plots. Stata required a minimum of three standardised mean differences to estimate a predictive interval because, with fewer studies, the t-distribution has no positive degrees of freedom (k-2 is zero or less) and the interval is effectively infinite. Inestimable intervals were illustrated with dotted lines from the diamond (forest plot interpretation described in Section 3.9.2) (Sterne, 2009a).
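The predictive interval formula can be sketched in Python as follows. This is an illustration under stated assumptions, not the metan implementation: the function name and example values are hypothetical, the standard error is assumed to be that of the pooled estimate, and the two-sided 95% critical value of the t-distribution with k-2 degrees of freedom is supplied by the caller (for example from a statistical table) rather than computed.

```python
import math


def prediction_interval(pooled, se_pooled, tau2, k, t_crit):
    """Return (lower, upper) for the approximate 95% predictive interval.

    pooled    : pooled effect estimate (e.g. standardised mean difference)
    se_pooled : standard error of the pooled estimate (assumed meaning of se)
    tau2      : between-study variance
    k         : number of studies
    t_crit    : two-sided 95% critical value of t with k-2 degrees of freedom,
                supplied by the caller
    Returns None when k < 3, because k-2 degrees of freedom is then not positive.
    """
    if k < 3:
        return None
    half_width = t_crit * math.sqrt(se_pooled ** 2 + tau2)
    return pooled - half_width, pooled + half_width


if __name__ == "__main__":
    # Hypothetical values: five studies, so df = 3 and t_crit is about 3.182.
    print(prediction_interval(0.5, 0.1, 0.04, k=5, t_crit=3.182))
    # Two studies: the interval cannot be estimated.
    print(prediction_interval(0.5, 0.1, 0.04, k=2, t_crit=float("nan")))
```

In this made-up example the predictive interval spans zero even though a confidence interval of 0.5 ± 1.96 × 0.1 would not, which is the kind of contrast the comparison of the two intervals is intended to reveal.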
3.9.6 Outliers
Given the expected high heterogeneity of the data, no data points were excluded as outliers. Any suspected outliers may have been accurate data points illustrating diversity rather than, for example, measurement error (Higgins, 2011, Section 10.4.1).
3.9.7 Data management
I describe the method of data management and key decisions made for specific studies/analyses included in the review.
3.9.7.1 Transformation
Some outcome measures use high scores to indicate greater ill-health, whilst others use low scores. Most of the outcome scales in the studies in this review used higher mean values to indicate greater ill-health. In two studies, where lower scores indicated poorer health, the means were multiplied by -1 to change the direction of the effect (Oelofsen, 2006; Eker, 2004).
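The effect of this transformation can be shown with a short Python sketch. The standardised mean difference is computed here as Cohen's d with a pooled standard deviation, which is an assumption made for illustration; the function names and scores are hypothetical.

```python
import math


def reverse_scale(mean):
    """Multiply a mean by -1 so that higher values indicate greater ill-health."""
    return -1.0 * mean


def smd(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Standardised mean difference (Cohen's d with pooled SD; an assumption)."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


if __name__ == "__main__":
    # Hypothetical scale where LOWER scores indicate poorer health.
    raw = smd(40.0, 45.0, 10.0, 10.0, 50, 50)
    flipped = smd(reverse_scale(40.0), reverse_scale(45.0), 10.0, 10.0, 50, 50)
    # The sign changes but the magnitude and the standard deviations do not.
    print(raw, flipped)
```

Only the means are reflected; the standard deviations are unchanged, so the size of the effect is preserved and only its direction is aligned with the other studies.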
3.9.7.2 Imputation
Where standard deviations were missing for a study but were available from another study using the same version of the same outcome measure, the largest available standard deviation was imputed for the missing values for the study and comparison groups (Higgins, 2011, Section 9.2.3.2). Values were imputed for Glenn et al.’s (2009) study from another study using the same version of the Parenting Stress Index (Roach, 1999; Abidin, 1995a).
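The two imputation rules used in this section (borrowing the largest available standard deviation, or averaging the available standard deviations) can be sketched in Python. The function names and values are hypothetical and are not taken from the included studies.

```python
def impute_largest_sd(available_sds):
    """Return the largest reported standard deviation, ignoring missing values."""
    reported = [sd for sd in available_sds if sd is not None]
    if not reported:
        raise ValueError("no reported standard deviations to borrow from")
    return max(reported)


def impute_mean_sd(available_sds):
    """Return the average of the reported standard deviations."""
    reported = [sd for sd in available_sds if sd is not None]
    if not reported:
        raise ValueError("no reported standard deviations to average")
    return sum(reported) / len(reported)


if __name__ == "__main__":
    # Hypothetical standard deviations from a donor study; None marks a gap.
    donor_sds = [8.5, None, 11.2, 9.9]
    print(impute_largest_sd(donor_sds))
    print(impute_mean_sd([8.0, 10.0, None]))
```

Choosing the largest standard deviation is the more conservative rule, since it yields a smaller standardised mean difference and a lower weight for the study with imputed values.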
For Scott et al. (1997), the average of the standard deviations for the mean scores of the same symptom was imputed (Higgins, 2011, Section 9.2.3.2). As no other study had used the same version of the depression outcome measure and there were substantial differences between versions, the missing values could not be imputed from a single other study using the Beck Depression Inventory. Scott et al. also did not provide standard deviations of the mean scores for the outcome of psychological distress. As this outcome was not assessed in any other study, the standard deviations could not be imputed, so the outcome was dropped from the meta-analysis.
3.9.7.3 Over-representation
Because some of the included studies reported multiple analyses, data from those studies risked being over-represented in the meta-analyses.
For the longitudinal studies, the standardised mean difference was calculated for the latest data collection point only (Higgins, 2011, Section 9.3.4). In meta-analyses of studies with differing study designs, the inclusion of one standardised mean difference for one time point per study is recommended to prevent the overrepresentation of multiple results from longitudinal studies with multiple data collection points in the pooled estimates (Higgins, 2011, Section 17.1). Results could not be combined across time-points without introducing a unit of analysis error. In this review, three studies had multiple data collection points within the preschool period: four data points for three studies were excluded from the meta-analysis (Gowen, 1989; Laxman, 2015; Norlin, 2013).
Standardised mean differences were calculated for every disability group included in the study which met the study inclusion criteria. One study had multiple specific diagnosis groups (Eisenhower, 2005). The inclusion of all three groups increased the precision of the pooled estimate by increasing the amount of data, but also introduced bias as the study was overrepresented in the meta-analysis.
The overall and disability diagnosis subgroup pooled estimates were biased towards studies which had measured more than one outcome. Although the over-weighting of these studies in the estimates introduced bias, their inclusion was valuable because of the contribution of additional data with which to answer the research questions.
3.9.8 Test of significance
The test of significance (a z test, called the test of standardised mean difference in Stata) provided a p value: the probability of obtaining a pooled estimate at least as extreme as the one observed if the null hypothesis of no effect were true. If the p value was smaller than 0.05 (indicating statistical significance), the null hypothesis of no effect (on average) was rejected as there was evidence in the pooled data of a significant relationship between caregiving and ill-health. As the 0.05 threshold is largely arbitrary, the Cochrane Handbook recommends reporting the p value for the z test together with the confidence interval (Higgins, 2011, Section 12.4.2). I reported the p value for the z tests of the overall and subgroup pooled estimates alongside the corresponding confidence intervals. Stata did not perform the test for the predictive intervals; instead, my interpretation of the results focused on comparisons between the confidence and predictive intervals.
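The z test and its accompanying confidence interval can be sketched in Python. This is an illustration rather than Stata's routine: the function names and example values are hypothetical, and a normal approximation with the 1.96 multiplier is assumed for the 95% confidence interval.

```python
import math


def z_test(pooled, se_pooled):
    """Return (z, two-sided p value) for the null hypothesis of no effect."""
    z = pooled / se_pooled
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


def confidence_interval(pooled, se_pooled, multiplier=1.96):
    """Return the approximate 95% confidence interval (normal approximation)."""
    return pooled - multiplier * se_pooled, pooled + multiplier * se_pooled


if __name__ == "__main__":
    # Hypothetical pooled standardised mean difference and standard error.
    z, p = z_test(0.5, 0.25)
    low, high = confidence_interval(0.5, 0.25)
    print(f"z = {z:.2f}, p = {p:.4f}, 95% CI {low:.2f} to {high:.2f}")
```

Reporting both outputs together shows how close the interval lies to zero, which a bare comparison of the p value with 0.05 would hide.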
3.9.9 Publication bias
In a meta-analysis, it is standard practice to assess publication bias as a potential source of heterogeneity in the data. Publication bias is the well-documented greater probability of publication for (often small) studies with statistically significant results than for studies showing little or no effect of the exposure on the outcome of interest (Sterne, 2004; Sterne, 2009b). It was assessed by examining the extent to which studies providing evidence of an effect had smaller sample sizes than those with smaller or no effect, thus biasing the pooled estimate.
It was necessary to assess publication bias in the investigation of the effect of caregiving on ill-health because of the general acceptance of the assumption that caregiving is associated with ill-health (discussed in Section 1.4.1) and the proliferation of smaller studies in caregiver-health research (Plant, 2007; Miodrag, 2015). Conversely, evidence that high stress is not consistently found in parent-caregivers, and the publication of studies challenging the expectation of caregiver ill-health due to caregiver burden, may have reduced the effect of publication bias (Plant, 2007; Swain, 2010).
Publication bias is evaluated visually using a funnel plot: a scatterplot of the effect sizes estimated from individual studies against the standard error of the effect size, which is a measure of the precision of the effect estimate and depends on study size. If there was a low possibility of publication bias, the plot would be symmetrical, resembling an inverted funnel. An Egger test assesses the asymmetry of the funnel plot in meta-analyses of standardised mean differences. The test fits a line of best fit to the relationship between the effect estimates and their standard errors and assesses how far its intercept deviates from zero. The line of the null hypothesis of no bias would be vertical on the funnel plot. Publication bias was considered significant if the p value for the bias coefficient was less than 0.05 (Sterne, 2004). These procedures were executed using the ‘metafunnel’ and ‘metabias’ commands in Stata (Sterne, 2004; Harbord, 2009).
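The idea behind the Egger test can be sketched in Python. This is not the metabias implementation: it assumes the common formulation of Egger's regression, in which each study's standardised effect (effect divided by its standard error) is regressed on its precision (one divided by the standard error) and the intercept is the bias coefficient. The function name and data are hypothetical, and the p value (from a t-distribution with k-2 degrees of freedom) is left to a statistical table or package.

```python
import math


def egger_intercept(effects, std_errors):
    """Return (intercept, standard error of intercept) from Egger's regression.

    Regresses effect/se on 1/se by ordinary least squares (assumed formulation).
    """
    k = len(effects)
    if k != len(std_errors) or k < 3:
        raise ValueError("need at least three studies with matching inputs")
    x = [1.0 / se for se in std_errors]                   # precision
    y = [e / se for e, se in zip(effects, std_errors)]    # standardised effect
    x_bar, y_bar = sum(x) / k, sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance on k-2 degrees of freedom.
    resid_var = sum((yi - intercept - slope * xi) ** 2
                    for xi, yi in zip(x, y)) / (k - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / k + x_bar ** 2 / sxx))
    return intercept, se_intercept


if __name__ == "__main__":
    ses = [0.1, 0.2, 0.25, 0.5]
    # Hypothetical data with no small-study effect: intercept close to zero.
    print(egger_intercept([0.3, 0.3, 0.3, 0.3], ses))
    # Hypothetical data where less precise studies report larger effects.
    print(egger_intercept([0.5, 0.7, 0.8, 1.3], ses))
```

An intercept far from zero relative to its standard error suggests funnel plot asymmetry; dividing the intercept by its standard error gives the t statistic from which the p value for the bias coefficient would be obtained.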