VIOLINISTS
A rather reassuring feature of this experiment was that, although the materials for performance were composed (or, more precisely, arranged) and edited by myself, the rating of performances (the coding of errors) was carried out entirely by the students, Maya Amin-Smith and Guy Edmund-Jones. Furthermore, they carried out their two ratings independently, and then ran an a posteriori Inter-Rater reliability test which, based on a random cross-section of the recordings, showed a very strong positive correlation between their two sets of marks [r(16) = .82, p < .001].
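As a rough illustration of the reliability check, the sketch below computes a Pearson correlation between two raters and the t statistic implied by the reported r(16) = .82. It assumes the usual convention that the number in parentheses is the degrees of freedom (n - 2); the rater scores are hypothetical and serve only to exercise the function, and the function names are illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_from_r(r, df):
    """t statistic for testing r against zero, with df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))

# HYPOTHETICAL mistake counts from two raters (not the study's data).
rater_a = [2, 5, 1, 7, 3, 4]
rater_b = [3, 5, 1, 6, 2, 4]
r_demo = pearson_r(rater_a, rater_b)

# t statistic implied by the values reported in the text.
t_reported = t_from_r(0.82, 16)
```

A t value of this size on 16 degrees of freedom is in line with the reported p < .001.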
Only one of the participants (part. 7) could have been considered an outlier, since he made clearly more mistakes than the rest of the participants. However, statistical tests run with and without his data yielded results that did not differ in the significance of effects, so he was kept in the analysis. Further, and in contrast to the percussionists (where 3 participants did not complete the test correctly), this participant did formally complete the test like all his colleagues, albeit with a noticeably higher number of mistakes.
As in the experiments with percussionists, all recordings were coded for mistakes in the Pitch Domain and for mistakes in the Rhythm Domain, which were subsequently added up to produce a Total Number of Mistakes. Results are discussed for each of these three sub-sections. Tests were implemented with the IBM SPSS Package Version 21.
Analysis of the numbers of mistakes was carried out using Generalized Linear Models with the numbers of mistakes as dependent variables, and assuming (see SECTIONS 4.3. and 5.3.) a Poisson Probability Distribution and a Log Link Function.
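For reference, a Poisson model with a Log Link Function can be written in the generic form below. The symbols are assumptions introduced here for illustration (Y_i is the number of mistakes in recording i, x_i the coded factor levels, beta the coefficients); they are not the notation of the SPSS output.

```latex
Y_i \sim \mathrm{Poisson}(\mu_i), \qquad \ln \mu_i = \mathbf{x}_i^{\top} \boldsymbol{\beta}
```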
6.3.1. PITCH DOMAIN RESULTS
An initial model was run with Version (Conventional or Modified), Reading (1st, 2nd or 3rd Reading), and Piece (Piece 1 - g minor; Piece 2 - b minor) as categorical factors. The model found a highly significant effect of Reading [χ²(2, N = 16) = 48.06, p < .001] and, unfortunately (in spite of our precautions), of Piece [χ²(1, N = 16) = 51.02, p < .001], with Piece 1 (in g minor) eliciting significantly fewer pitch mistakes than Piece 2 (b minor) [M = 2.54, SE ±.37, vs. M = 5.67, SE ±.81, respectively]; the effect of Version appeared in this context to be non-significant [χ²(1, N = 16) = .14, p = .706].
A second Model was then implemented, using Piece as an Offset Variable. Piece becomes thereby a ‘structural’ predictor, and its coefficient is not estimated by the model. This is an especially useful procedure in Poisson regression models, where each case may have different levels of exposure to the event of interest. In our experiment, each participant was exposed to Pieces of different levels of difficulty, which meant that the participant would be more or less prone to make mistakes. The effect of other factors can then be evaluated, taking into account the ‘intrinsic’ difficulty of each Piece. More specifically, the numbers of pitch mistakes per Piece were added up (summing the mistakes in the 1st, 2nd, and 3rd Readings), and the means thereof calculated: M = 7.64 (SD = 6.70) for Piece 1, and M = 17.00 (SD = 14.71) for Piece 2, confirming that indeed the second Piece was generally more difficult, even if there was a high variability in the number of mistakes as a function of the participants’ abilities. Since a Log Link function was assumed for the model, these quantities were used as logarithmically transformed values (Ln).
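The role of the offset can be sketched in a few lines. The two difficulty means are the ones reported above; everything else (the mistake counts, the function and variable names) is hypothetical and only illustrates the mechanism, not the SPSS analysis itself. With a single categorical factor, the Poisson maximum-likelihood estimate of each level's rate has a closed form: total mistakes divided by total exposure.

```python
import math

# Mean summed pitch mistakes per Piece, as reported in the text.
PIECE_DIFFICULTY = {1: 7.64, 2: 17.00}

# Offsets enter the linear predictor on the Ln scale with a fixed coefficient of 1.
OFFSET = {piece: math.log(mean) for piece, mean in PIECE_DIFFICULTY.items()}

def group_rate(observations):
    """Closed-form Poisson MLE of the rate for one factor level.

    observations: list of (mistake_count, piece) pairs.
    Solves sum(count - exposure * rate) = 0, i.e. rate = counts / exposure.
    """
    total_count = sum(count for count, _ in observations)
    total_exposure = sum(math.exp(OFFSET[piece]) for _, piece in observations)
    return total_count / total_exposure

# HYPOTHETICAL counts (mistakes, piece); not the study's data.
conventional = [(3, 1), (6, 2), (2, 1), (7, 2)]
modified = [(2, 1), (5, 2), (2, 1), (5, 2)]

rate_conventional = group_rate(conventional)
rate_modified = group_rate(modified)

# Coefficient of Version on the Ln scale (Modified relative to Conventional).
ln_rate_ratio = math.log(rate_modified / rate_conventional)
```

Because the offset absorbs the difficulty of each Piece, the resulting rates are mistakes per unit of difficulty, which is why estimated means on this scale can fall below 1.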
MAIN EFFECTS. The model, with Version, Reading, and Order of Presentation (Conventional first or Modified first) as categorical factors, and using Piece as Offset Variable, showed a marginal significance for Version [χ²(1, N = 16) = 2.97, p = .085], high significance for Reading [χ²(2, N = 16) = 52.69, p < .001], and high significance for Order of Presentation [χ²(1, N = 16) = 38.94, p < .001].
INTERACTIONS. The interaction of Version by Reading was only marginally significant [χ²(2, N = 16) = 4.64, p = .098], with the differences between Versions not being significant in the first two Readings, and reaching marginal significance in the third (see Figure 6.2.). Similarly, the interaction between Version, Reading, and Order of Presentation reached only marginal significance [χ²(1, N = 16) = 2.97, p = .085], with the patterns for both orders being somewhat similar (see details below, SECTION 6.3.1.c., and Figure 6.4., for Totals of Mistakes, which presented the same trends).
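The p-values quoted for these tests can be recomputed from the chi-square statistics. The sketch below assumes that each statistic is referred to a chi-square distribution with the stated degrees of freedom, and uses the closed-form tail probabilities that exist for 1 and 2 degrees of freedom; the function name is illustrative.

```python
import math

def chi2_p_value(statistic, df):
    """Upper-tail probability of a chi-square statistic for df = 1 or 2."""
    if df == 1:
        # A chi-square with 1 df is a squared standard normal.
        return math.erfc(math.sqrt(statistic / 2.0))
    if df == 2:
        # A chi-square with 2 df is exponential with mean 2.
        return math.exp(-statistic / 2.0)
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")

# Statistics reported above for the pitch domain.
p_version = chi2_p_value(2.97, 1)
p_version_by_reading = chi2_p_value(4.64, 2)
```

Both values agree, to rounding, with the reported p = .085 and p = .098.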
Figure 6.2.
Pairwise Comparisons of Estimated Means of PITCH Mistakes between performances using Conventional Versions (Conve.) or Modified Versions (Modif.). From left to right, comparisons of mistakes made in the 1st Readings, 2nd Readings and 3rd (rehearsed) Readings. The vertical axis represents logarithmically transformed values (Ln), having used the Ln of the difficulty of each Piece as an offset variable. Error bars represent Standard Error. Significance using the Sidak correction: n. s. = non-significant; + p < .100.
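The Sidak correction mentioned in the caption can be sketched as follows. The number of comparisons in the family (m) is an assumption here, since the text does not state it; three is used purely as an example, and the function names are illustrative.

```python
def sidak_alpha(alpha, m):
    """Per-comparison threshold that keeps the family-wise error rate at alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)

def sidak_adjust(p, m):
    """Sidak-adjusted p-value for one of m comparisons."""
    return 1.0 - (1.0 - p) ** m

# ASSUMPTION: a family of three pairwise comparisons.
threshold = sidak_alpha(0.05, 3)
adjusted = sidak_adjust(0.02, 3)
```

With three comparisons, an unadjusted p of .02 no longer passes the .05 level after adjustment, which shows how the correction makes the pairwise tests more conservative.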
6.3.2. RHYTHM DOMAIN RESULTS
As was the case with the pitch domain results, a model with Version (Conventional or Modified), Reading (1st, 2nd or 3rd Reading), and Piece (Piece 1 - g minor; Piece 2 - b minor) as categorical factors found a highly significant effect of Reading [χ²(2, N = 16) = 18.68, p < .001] and a significant effect of Piece [χ²(1, N = 16) = 4.53, p = .033], with Piece 1 eliciting in this case more rhythm mistakes than Piece 2 [M = 2.56, SE ±.28, vs. M = 1.92, SE ±.27, respectively]. The effect of Version was nonetheless highly significant [χ²(1, N = 16) = 15.83, p < .001].
A second Model was implemented, using Piece as an Offset Variable, and with Version, Reading, and Order of Presentation (Conventional first or Modified first) as categorical factors.
[Figure 6.2. plot. Vertical axis: Means of PITCH mistakes (Ln), with ticks at 0.050, 0.135, 0.368 and 1.000. Horizontal axis: Readings (1st, 2nd, 3rd), each comparing Conve. and Modif.]
MAIN EFFECTS. The model showed a high significance for Version [χ²(1, N = 16) = 16.17, p < .001], high significance for Reading [χ²(2, N = 16) = 17.52, p < .001], and high significance for Order of Presentation [χ²(1, N = 16) = 15.64, p < .001].
INTERACTIONS. The interaction effect of Version by Reading was non-significant [χ²(2, N = 16) = 3.67, p = .159], with the differences between Versions showing the same patterns in the three Readings, with fewer mistakes elicited by the Modified Versions (see Figure 6.3.). The interaction between Version, Reading, and Order of Presentation showed no significance [χ²(1, N = 16) = 1.42, p = .922], with the patterns for both orders being quite similar (see details below, SECTION 6.3.1.c., and Figure 6.4., for Totals of Mistakes, which showed the same trends).
Figure 6.3.
Pairwise Comparisons of Estimated Means of RHYTHM Mistakes between performances using Conventional Versions (Conve.) or Modified Versions (Modif.). From left to right, comparisons of mistakes made in the 1st Readings, 2nd Readings and 3rd (rehearsed) Readings. The vertical axis represents logarithmically transformed values (Ln), having used the Ln of the difficulty of each Piece as an offset variable. Error bars represent Standard Error. Significance using the Sidak correction: n. s. = non-significant; * p < .050.
[Figure 6.3. plot. Vertical axis: Means of RHYTHM mistakes (Ln), with ticks at 0.050, 0.135, 0.368 and 1.000. Horizontal axis: Readings (1st, 2nd, 3rd), each comparing Conve. and Modif.]
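One way to read the vertical axis of Figures 6.2. and 6.3.: the tick labels 0.050, 0.135, 0.368 and 1.000 coincide with e raised to -3, -2, -1 and 0, so the ticks appear to be equally spaced on the Ln scale and labelled with back-transformed values. The check below is only an arithmetic observation about those labels, not a statement about how the figures were produced.

```python
import math

# Back-transform equally spaced Ln values and format them like the tick labels.
ln_values = (-3, -2, -1, 0)
tick_labels = [f"{math.exp(k):.3f}" for k in ln_values]
```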