

The evaluation research design is a research strategy that allows findings to be deduced from the evaluation data. It encompasses the method and procedures employed to conduct scientific research.

The evaluation contractor must propose a logical approach to inferring answers to the evaluation questions from the data collected by the evaluation study. This logical approach plus the data collection method(s) and analytical method(s) constitutes the “research design.” This step discusses the different forms of research design available for impact evaluations.

The independent third-party evaluator must select an evaluation design that is compatible with the objectives of the evaluation study as well as the desired level of rigor and resource or time constraints. The three types of evaluation research design typically used in EERE evaluations are:

• Experimental designs
• Quasi-experimental designs
• Non-experimental designs, with and without counterfactuals.

Each evaluation design, its guidelines for use, and its relative defensibility are briefly discussed below.

4.3.1 Experimental Designs

The most important condition for establishing causality between a program activity and its effect is randomization of the individuals or objects who/that will be exposed to the program’s activities and outputs (the treatment group) and a group from the same population who will not be exposed to the program (the control group). Randomized controlled trials (RCTs) are designed purposely to meet these evaluation conditions. Randomization, implemented correctly, allows for the creation of two or more program groups that are similar to each other on all characteristics and experiences other than the fact of exposure to the program’s intervention (the treatment). This ensures that any outcome differences observed between the groups at the end of the evaluation are likely to be due to the treatment and not to differences between the groups or to other factors besides the treatment.

As such, RCTs constitute the strongest of the evaluation research designs and lead to the greatest confidence in the estimated program outcomes. RCTs with properly measured and evaluated metrics can produce an estimate of the size of a program’s effect that has desirable statistical properties (estimates of the probability that the true effect falls within a defined confidence interval).25
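The randomization and effect-size logic described above can be illustrated with a minimal Python sketch. It is not taken from the guide: the function names, the seed, and the outcome values are hypothetical, and the interval uses a normal approximation (z = 1.96), which is an assumption that suits large samples; a t-based interval would usually be more appropriate for small groups.

```python
import random
import statistics
from math import sqrt


def assign_groups(unit_ids, seed=0):
    """Randomly split units into treatment and control groups of near-equal size."""
    rng = random.Random(seed)      # fixed seed so the assignment is reproducible
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def effect_estimate(treated, control, z=1.96):
    """Difference in mean outcomes with an approximate 95% confidence interval.

    Assumes independent groups and a normal approximation (illustrative only).
    """
    diff = statistics.mean(treated) - statistics.mean(control)
    se = sqrt(statistics.variance(treated) / len(treated)
              + statistics.variance(control) / len(control))
    return diff, (diff - z * se, diff + z * se)


# Hypothetical usage: ten units, then made-up outcome measurements per group.
treatment_ids, control_ids = assign_groups(range(10))
estimate, interval = effect_estimate([10, 12, 14], [8, 10, 12])
```

Because assignment depends only on the random shuffle, any systematic difference in outcomes between the two groups can be attributed to the treatment rather than to how the groups were formed.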

RCTs are used whenever strong confidence in a program’s actual effect is highly important to national, departmental, or other similarly major interests. The Office of Management and Budget (OMB), which must address evaluations across all government programs, recommends experimental designs whenever feasible.26 RCTs are common in the education, health, and agriculture fields but are rarely seen in the energy field.

4.3.2 Quasi-Experimental Designs

When an RCT cannot be used, the independent third-party evaluator must develop an approach that approximates an experimental design.27 In such cases, group assignments for evaluation purposes are often made to approximate the scientific benefits of randomization. For example, program non-participants (called a “comparison” group for quasi-experimental designs) and program participants may be matched on characteristics believed to be correlated with the observed outcomes. There are a variety of research designs using comparison groups. They may be broadly categorized as:28

• Before-After (Pre-Post) Comparison Group Design: Compare program participants and non-participants on pre- and post-treatment measurements. Program participants and comparison group members are measured at the same time periods. The program effect is deduced by comparing the change in the participant group's performance from pre- to post-intervention with the corresponding change in the comparison group. Regression-discontinuity is the strongest of these variations.

• After-Only (Post-Only) Comparison Group Design: A less defensible variant of this design simply compares the two groups at the same point after the participants participated in the program, usually because it was not possible, for a variety of reasons, to get pre-test measures. The program effect is deduced by comparing the outcomes from the two groups, but only in the period after the intervention, not before.29

25 Shadish WR, Cook TD and Campbell DT. (2002). “Experimental and Quasi-Experimental Designs for Generalized Causal Inference.” Belmont, CA: Wadsworth Cengage Learning.

26 Office of Management and Budget, “What Constitutes Strong Evidence of a Program’s Effectiveness?”, p. 1. www.whitehouse.gov/omb/part/2004_program_eval_pdf.

27 When participants voluntarily participate in a program, a type of bias called “self-selection bias” enters into the results. This bias alludes to the probability that the participants have a predisposition to be interested in the program’s intervention (e.g., have a prior interest in energy efficiency). This creates two issues for the validity of the evaluation results: (1) it is expensive to find non-participants with similar predispositions for a comparison group, and (2) even if they were identified, the results could not be generalized to the broader population group that the program might be targeting because the part of the population without this predisposition would not be included in either group. When this issue is relevant, the independent third-party evaluator should acknowledge that it is a potential source of unknown bias.

28 The names for these designs have been adapted from D. T. Campbell and J. C. Stanley, Experimental and Quasi-
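The two comparison group designs above can be contrasted with a short Python sketch. One common way to operationalize the before-after design is a difference-in-differences calculation; the guide does not name that technique, so treating it as the estimator here is an assumption, and the function names and energy-use figures are hypothetical.

```python
from statistics import mean


def before_after_comparison(part_pre, part_post, comp_pre, comp_post):
    """Change among participants minus change in the comparison group."""
    participant_change = mean(part_post) - mean(part_pre)
    comparison_change = mean(comp_post) - mean(comp_pre)
    return participant_change - comparison_change


def after_only_comparison(part_post, comp_post):
    """Post-period difference only; ignores any pre-existing gap between groups."""
    return mean(part_post) - mean(comp_post)


# Hypothetical annual energy use (arbitrary units) for three units per group.
participants_pre, participants_post = [100, 110, 120], [80, 90, 100]
comparison_pre, comparison_post = [90, 100, 110], [85, 95, 105]

pre_post_effect = before_after_comparison(
    participants_pre, participants_post, comparison_pre, comparison_post)
post_only_effect = after_only_comparison(participants_post, comparison_post)
```

With these made-up numbers the before-after estimate is -15 while the after-only estimate is -5: the participants started from a higher baseline, and the after-only design has no way to account for that, which is one reason it is considered less defensible.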

4.3.3 Non-Experimental Designs

Research designs that do not use control or comparison groups are considered to be “non-experimental” designs.30 Non-experimental designs can be implemented with or without a counterfactual (some means of estimating what might have been in the absence of the intervention). In non-experimental evaluation designs with counterfactual, the evaluator seeks to obtain an estimate of what might have occurred in the absence of the intervention through the use of one or more approaches, such as using time series for participants only, interviewing the participants, interviewing independent experts, or constructing a statistical comparison group. A mixed method non-experimental approach applies more than one of these non-experimental designs in the same evaluation, with the aim of bolstering the overall findings through complementary lines of evidence, especially if each method points to the same estimates. The use of results from more than one non-experimental approach to develop evaluation findings adds subjective credibility to the findings. Such use of multiple methods in an evaluation is often called “triangulation.”31
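As an illustration of the "time series for participants only" approach mentioned above, the sketch below fits a straight-line trend to pre-intervention observations and projects it forward as the counterfactual. This is only one possible way to build such an estimate; the linear-trend assumption, the function names, and the data are all hypothetical and not drawn from the guide.

```python
def linear_trend(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def counterfactual_effect(pre_periods, pre_values, post_periods, post_values):
    """Observed post-period values minus the projected pre-period trend."""
    slope, intercept = linear_trend(pre_periods, pre_values)
    projected = [intercept + slope * t for t in post_periods]
    return [observed - expected for observed, expected in zip(post_values, projected)]


# Hypothetical participant-only series: four periods before, two after.
effects = counterfactual_effect([1, 2, 3, 4], [100, 102, 104, 106],
                                [5, 6], [100, 101])
```

Here the pre-period trend rises by 2 per period, so the projected values for periods 5 and 6 are 108 and 110, and the estimated effects are -8 and -9. The estimate is only as credible as the assumption that the earlier trend would have continued unchanged, which is why such designs are weaker than those using a control or comparison group.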

Although non-experimental designs do not establish causality, given the nature of public sector investments, there are instances where non-experimental designs are the only option for evaluating impacts. If they are properly executed and include a method of estimating a counterfactual outcome, they will provide reasonably valid findings on the contribution made by the program intervention on the outcome. At a minimum, non-experimental evaluation designs must include a counterfactual; EERE strongly discourages the use of non-experimental designs without counterfactual.

In sum, per OMB guidelines, experimental designs are the best type of research design for demonstrating actual program impact.32 RCTs, however, are not always feasible, and in some cases, they are actually illegal or immoral. When RCTs cannot be used, quasi-experimental designs represent the next best category of research methods, followed by non-experimental methods with counterfactual. Several factors – from the intended uses of the evaluation results to the findings from the evaluability assessment – come together to determine what is the most appropriate research design. The design chosen by the independent third-party evaluator will influence the data collection options described in the next section.

29 There are several types of experimental and quasi-experimental designs. Determining which is best for different evaluation findings is beyond the scope of this guide. If you have not had prior training in experimental research design, but believe you need to conduct an impact evaluation, it is recommended that you seek expert assistance in assessing the options, or leave the choice of approach to the evaluation expert(s) who propose(s) the evaluation. A good introduction is found in chapter 3 of GAO’s “Designing Evaluations,” (GAO-12-208G: Published: Jan 31, 2012). http://www.gao.gov/assets/590/588146.pdf. A more technical, but understandable and short overview is presented in: Shadish WR, Cook TD and Campbell DT. Experimental and Quasi-Experimental Designs for Generalized Causal Inference: 2nd Edition. Belmont: Cengage Learning, 2002.

30 Office of Management and Budget, “What Constitutes Evidence of a Program’s Effectiveness?” p. 3. www.whitehouse.gov/omb/part/2004_program_eval.pdf. The second definition of non-experimental design given in the OMB document, “indirect analysis” using an independent panel of experts, is more appropriate for R&D projects.

31 Greene, J., and C. McClintock. 1985. “Triangulation in Evaluation: Design and Analysis Issues.” Evaluation Review, v9, no. 5 (October): 523-45.
