iments before being exposed to the complex topic of randomization tests. Also, we want preservice teachers to gain experience in making informal inferences before making formal inferences either in the context of probability models or that of data analysis. Because of these learning trajectories, randomization tests are introduced at the end of our courses.
5.6.1 Summary of Some Findings
We have distinguished two domains of knowledge: the statistical knowledge needed to fully understand the randomization test procedure, and the software knowledge needed to carry out the randomization test simulation using TinkerPlots™. Many preservice teachers in our study were able to conduct the majority of the steps of a randomization test using TinkerPlots™. This is a pleasing outcome, since the participants had not previously been exposed to randomization tests and the courses left only a limited amount of time to teach these methods.
The results of this study also suggested that the participants had gaps in the statistical knowledge underlying the randomization test. For example, participants had difficulty generating an adequate null hypothesis, setting up the null model, and interpreting a p-value. Perhaps unsurprisingly, these struggles have been documented previously in the literature (e.g., Garfield & Ben-Zvi, 2008; Vallecillos, 1994). These findings have implications for the redesign of the learning trajectories for both courses, which we address at the end of this section. Before that, however, we summarize what we found concerning our research questions.
How well are the preservice teachers able to model a randomization test experiment with TinkerPlots™? What is the role of TinkerPlots™ in their thinking?
The data suggest that the participants in Course 2 (which emphasizes probability and only covers minimal data analysis) have more statistical knowledge. This provides some evidence that a course where simulations of chance experiments and hypothesis testing are taught explicitly before entering a learning trajectory to randomization tests might lead to a better understanding of randomization tests as compared to a course where randomization tests are immediately preceded by data analysis.
The technical features of TinkerPlots™ do not seem to be problematic for the preservice teachers. Problems occur only at the interface between the software and the statistical world (TP Steps 1 and 3). When TP Step 1 was performed incorrectly, it was because participants populated the sampler using an equal proportion of women and men instead of reproducing the sample proportions. When TP Step 3 was performed incorrectly, the participants’ mistake was sampling “with replacement.” The data from Conrad and Maria’s transcript suggest that TinkerPlots™ can help support students’ reasoning, at least in the sense of refining their null hypothesis (see also delMas et al., 2013). The crucial point seems to be the transition between the statistical and the software level (Figure 5.18), particularly the construction of the null model.
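To make the two error-prone steps concrete, the following is a minimal sketch in Python (not TinkerPlots™) of a randomization test for a difference in group means. It is an illustration under stated assumptions rather than the procedure used in the courses: the function name `randomization_test`, the one-sided test statistic, and the data values are all hypothetical. The sketch keeps the observed group sizes and reassigns every observed value exactly once per repetition, which corresponds to reproducing the sample and drawing without replacement.

```python
import random


def randomization_test(group_a, group_b, n_reps=10_000, seed=0):
    """One-sided randomization test for the difference in means (A minus B).

    Returns the observed difference and the estimated p-value, i.e. the
    share of re-randomizations whose difference is at least as large as
    the observed one.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    # Null model: group membership is unrelated to the measured values,
    # so all values are pooled and randomly re-split into the two groups.
    pooled = list(group_a) + list(group_b)

    at_least_as_extreme = 0
    for _ in range(n_reps):
        # Shuffling uses every case exactly once (drawing WITHOUT
        # replacement), and the split keeps the original group sizes.
        rng.shuffle(pooled)
        new_a, new_b = pooled[:n_a], pooled[n_a:]
        diff = sum(new_a) / n_a - sum(new_b) / n_b
        if diff >= observed:
            at_least_as_extreme += 1

    return observed, at_least_as_extreme / n_reps


# Hypothetical data, invented only for this illustration.
group_a = [12, 15, 14, 18, 20, 16]
group_b = [10, 11, 13, 9, 14, 12, 10]
observed_diff, p_value = randomization_test(group_a, group_b)
print(f"observed difference: {observed_diff:.2f}, estimated p-value: {p_value:.4f}")
```

Replacing the shuffle with draws made with replacement would turn the simulation into a bootstrap-style resampling in which individual cases can appear several times or not at all, so it would no longer model a random reassignment of the observed cases to the groups.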
In which way do the preservice teachers accomplish the steps of a randomization test? How do the preservice teachers interpret the results of the randomization test? Most of our preservice teachers were able to conduct the steps of a randomization test with TinkerPlots™ when supported by the randomization test scheme. Some of the participants struggled at typical crucial points, such as not being able to formulate an adequate hypothesis (similar to the results of the study by Liu and Thompson, 2009). We also observed common difficulties in interpreting p-values, as reported, for example, by Garfield and Ben-Zvi (2008).
Figure 5.18. Excerpt from “software cycle when conducting chance experiments”.
Despite these difficulties and gaps in their statistical knowledge, the preservice teachers were able to make inferences about group comparisons. However, doubts remain about how deep their conceptual understanding of hypothesis testing actually is. Since the participants of Course 2 performed better in several steps and in several aspects (cf. Figure 5.13), the approach with minimal data analysis and an emphasized probability component might be better suited for learners approaching the randomization test method. This conclusion is no more than a suggestion, as we did not conduct a randomized comparative experiment. Furthermore, since participants taking Course 2 were more experienced in simulating chance experiments and drawing conclusions from given p-values, this might suggest the need to simulate several chance experiments before introducing hypothesis testing, and randomization tests in particular. To gain a better understanding of the randomization process itself, it might be helpful to add a hands-on activity such as that proposed by Arnold, Budgett, and Pfannkuch (2013).
A further important finding is that courses need to put more emphasis on relating the statistical and the contextual world (see Figure 5.4). Since we identified typical mistakes, such as reproducing the sample incorrectly (drawing a number of cases unequal to the number of cases in the sample) and failing to draw without replacement (as a randomization test requires), we recommend that the null model be discussed in detail before simulating a chance experiment with software. This might include a discussion of suitable null models for different situations. One specific redesign of the learning trajectory might be to improve the connections between generating the null hypothesis and conducting a TinkerPlots™ simulation by explicitly discussing the construction of the null model. It might also