a) Definition
The first level of research which this thesis undertakes consists of an assessment of the efficacy of the intervention. Efficacy refers to the effects of an intervention when delivered under ideal or experimental conditions. Efficacy studies are concerned with assessing whether an intervention can work, i.e. whether it can do more good than harm, under ideal conditions [40, 97]. Further, efficacy trials aim to investigate how and why an intervention works. For this reason, they are also referred to as explanatory trials [40, 97]. Assessment of efficacy is considered a necessary step in the development and diffusion of a new technology or intervention [40]. If an intervention is shown to have no effect or a negative effect under ideal conditions, then it is unlikely to be effective in routine practice [40, 50]. On the other hand, once an intervention is shown to be efficacious, it is useful to carry out an effectiveness trial to investigate whether it can work in real-life settings, and implementation research to assess whether all components can be delivered to acceptable standards in routine practice in a manner that is acceptable to the target audience [40].
b) Design of efficacy trials
Efficacy studies are designed to optimise the performance or effects of an intervention. Therefore, an efficacy trial is characterised by (a) a well-defined or standardised intervention that (b) is made available in a uniform fashion (c) within standardised and well-resourced settings (d) to a specific target audience which (e) completely accepts, participates in, complies with or adheres to the intervention delivered [40, 98]. In addition, efficacy trials usually apply strict exclusion criteria so as to recruit participants with similar characteristics. Further, efficacy trials closely monitor the frequency with which interventions are applied and carefully measure outcomes of participants at various points in time [40, 98]. A typical study design for assessing efficacy consists of a randomised controlled trial (RCT), which involves (a) random assignment of participants to comparison arms, (b) concealment of allocation of study participants and (c) blinding of study participants and study teams to the interventions provided, as well as blinding of outcome assessment. Typically the comparison intervention consists of a placebo. However, for most public health interventions, the comparison intervention often consists of the best known or the current intervention. The intervention under assessment is then compared with the intervention currently or previously in use. For practical reasons, tests of efficacy are sometimes carried out using non-experimental or quasi-experimental clinical trials, which may not involve randomisation or blinding and may use historical controls, although they result in weaker causal inferences [40, 98].
c) Strength of RCTs
RCTs are considered the best design for attributing outcomes to an intervention. If well conducted, randomisation ensures that, on average, all potential confounders are equally distributed between comparison arms. Thus any significant difference between the study arms in the outcome assessed can be attributed to the intervention and not to a
systematic difference between the two groups [50, 98-100].
d) Major limitations to RCTs
Threats to external validity
A typical RCT takes place under atypical (ideal) conditions characterised by standardised, well-resourced settings, motivated research staff and participants, a homogeneous population, and intense application and monitoring of the intervention [40, 98]. As such, evidence from RCTs may not be generalisable to the general population in routine practice [98]. However, threats to external validity would be an issue only if adoption decisions were based on the results of efficacy trials alone. According to the diffusion of innovation model described above, adoption decisions are meant to be informed by evidence from effectiveness and demonstration trials. Efficacy trials are necessary but not sufficient for adoption decision-making [40]. Knowledge of efficacy is needed to decide whether an effectiveness study should be carried out and to aid interpretation of evidence from effectiveness trials. Therefore, in the context of this study, this limitation is not relevant. Further, generalisable conclusions may be gained from systematic reviews and meta-analyses that identify similar effects in various populations [98].
Sample size, design effect and cost
The sample size required to detect a difference in effect between two groups is inversely proportional to the treatment effect squared [101]. Therefore, in order to detect small differences in effects, randomised trials require large sample sizes.
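As an illustration of this inverse-square relationship, the following minimal sketch uses the standard normal-approximation formula for comparing two means, n per arm = 2(z_{1-α/2} + z_{1-β})²σ²/δ². The formula, the function name `n_per_arm`, and the default significance level and power are assumptions chosen for illustration; they are not taken from [101].

```python
import math
from statistics import NormalDist


def n_per_arm(delta, sigma, alpha=0.05, power=0.80):
    """Approximate participants per arm for a two-arm comparison of means.

    delta: smallest difference in means worth detecting (assumed value)
    sigma: common standard deviation of the outcome (assumed value)
    alpha: two-sided significance level
    power: desired probability of detecting a true difference of size delta
    """
    if delta == 0:
        raise ValueError("delta must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-sided test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    # Sample size scales with 1 / delta**2: halving delta quadruples n.
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)


if __name__ == "__main__":
    for d in (0.5, 0.25):
        print(f"delta={d}: {n_per_arm(d, sigma=1.0)} per arm")
```

Under these assumptions, halving the detectable difference from 0.5 to 0.25 standard deviations raises the requirement from 63 to 252 participants per arm, a fourfold increase.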
Many trials of public health interventions use cluster randomised controlled trials (cRCTs) because individual randomisation is not feasible [98, 99]. If analysis is undertaken at the individual level, sample size calculations for cRCTs need to take into account the correlations among the individuals within the clusters (design effect) [98, 102, 103]. Randomisation can be effective in addressing baseline imbalance in a cRCT if the number of clusters randomised is large. With fewer clusters, as is the case in most cRCTs, randomisation is less likely to achieve balance and the trial has less statistical power. The general rule of thumb is that more clusters with fewer individuals per cluster help to minimise the design effect [98]. Although the use of stratified or pair-matched methods may minimise the design effect in cRCTs, these methods also require large numbers of clusters that are similar enough to be paired or grouped [98].
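The rule of thumb above can be illustrated with the commonly used design effect formula, DEFF = 1 + (m - 1) × ICC, where m is the number of individuals per cluster and ICC is the intracluster correlation coefficient. This is a minimal sketch that assumes equal cluster sizes; the formula, the function names and the ICC value used are illustrative assumptions and are not drawn from [98, 102, 103].

```python
import math


def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC (assumed formula)."""
    if cluster_size < 1 or not 0 <= icc <= 1:
        raise ValueError("cluster_size must be >= 1 and icc must lie in [0, 1]")
    return 1 + (cluster_size - 1) * icc


def inflate_sample_size(n_individual, cluster_size, icc):
    """Inflate a sample size calculated for individual randomisation."""
    return math.ceil(n_individual * design_effect(cluster_size, icc))


if __name__ == "__main__":
    # Fewer individuals per cluster give a smaller design effect.
    for m in (5, 20, 50):
        print(f"cluster size {m}: DEFF = {design_effect(m, icc=0.05):.2f}")
```

With an assumed ICC of 0.05, clusters of 20 give a design effect of 1.95, so a hypothetical requirement of 63 individuals per arm under individual randomisation becomes 123 per arm.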
Because RCTs require large samples, they are very costly to conduct, irrespective of whether they are individually or cluster randomised.
In assessing HW performance and guideline implementation, a common approach is to observe HWs at a few clusters (health facilities) [62, 90-92, 104-118]. Data on HW performance tend to be correlated because of the similarity in case management, as all patients are often seen by only one or two HWs [102]. Randomisation of patients does not address the correlation in data on HW performance. Therefore, sample size calculations in such surveys need to account for the potential design effect in the data on HW performance [102].
Contamination
Allocation concealment and blinding are usually not practical or necessary in public health interventions. Therefore, control conditions cannot be guaranteed: aspects of the intervention may be implemented in control settings. For example, in one cRCT designed to compare the effects of implementing an RDT-based guideline versus a conventional symptom-based guideline, some control facilities were later noted to have received and used RDTs provided through other supply chains [35]. Such contamination can reduce the statistical power of the trial to detect effects of the intervention [98].
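One way to see why contamination reduces power is a simple dilution model, sketched below. It assumes that a fraction c of the control arm effectively receives the intervention and experiences its full effect, so that the observed difference between arms shrinks to (1 - c) times the true effect. Because required sample size is inversely proportional to the squared effect, the sample size needed to keep the same power grows by a factor of 1 / (1 - c)². The model and the function names are illustrative assumptions, not results from [98].

```python
def diluted_effect(true_effect, contamination):
    """Expected observed difference when a fraction of controls is contaminated.

    contamination: assumed fraction of the control arm that effectively
    receives the intervention, in the range [0, 1).
    """
    if not 0 <= contamination < 1:
        raise ValueError("contamination must lie in [0, 1)")
    return true_effect * (1 - contamination)


def sample_size_inflation(contamination):
    """Factor by which sample size must grow to preserve power (assumed model)."""
    if not 0 <= contamination < 1:
        raise ValueError("contamination must lie in [0, 1)")
    return 1 / (1 - contamination) ** 2


if __name__ == "__main__":
    for c in (0.0, 0.1, 0.2, 0.3):
        print(f"contamination {c:.0%}: inflation x{sample_size_inflation(c):.2f}")
```

Under this simplified model, 20% contamination of the control arm would require roughly 1.56 times as many participants to retain the planned power.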
Ethics
Ethical challenges may arise from withholding a potentially beneficial intervention from control population units, particularly if the intervention is associated with large positive effects [98].
Complex interventions
Many public health interventions are complex, consisting of several components [73, 74]. When interventions are associated with significant effects, it is important to identify which components of the intervention contributed to the outcomes. RCTs are unsuitable for attributing effects to components of interventions: it is difficult to identify which components of the intervention are responsible for an observed effect [74]. If linked to outcome assessment, implementation evaluation (demonstration studies) can be more useful in explaining positive, modest and non-significant results [73, 74].
e) Choice of a systematic review of RCTs to assess efficacy
A systematic review is a method of systematically collating all empirical evidence that fits pre-specified eligibility criteria in order to answer a specific research question. It uses explicit inclusion criteria that are selected to minimise bias and increase internal validity of the findings [119].
A systematic review was chosen for two reasons. First, it was not feasible to undertake an RCT in the context of this thesis due to financial and time constraints. Second, a systematic review pools all available evidence from a broader context and therefore represents the best evidence for decision making at a broader level. Although RCTs are criticised for their limited external validity, systematic reviews can lead to generalisable conclusions by identifying similar effects in various populations [98].
Evidence from non-randomised trials was excluded from the review. Selection biases (confounding) are likely to be greater for non-randomised studies than for RCTs [119]. Inclusion of evidence from non-randomised trials in a meta-analysis can lead to a shift in the estimate of the effect of an intervention (systematic bias) and excessive
heterogeneity among studies [119].
Potential limitations
The research quality criteria used to select studies for inclusion in a systematic review of the effects of interventions have been criticised for being biased against studies that may show large effects of an intervention but whose designs are deemed sub-optimal according to the selection criteria [73]. On the other hand, the criteria may be biased towards interventions with marginal effects because the designs of the studies used in evaluating those effects meet the selection criteria.
Many studies used in evaluating public health interventions employ cluster randomised trials in which analysis is done at the level of the individual patient, leading to a unit of analysis error which may produce over-precise results. If included in a meta-analysis without correcting for clustering, estimates of effects from cluster randomised trials may carry more weight in the pooled result of the meta-analysis and lead to a biased