2 EPISTEMOLOGICAL FOUNDATIONS AND EVOLUTION OF URBAN PLANNING PRACTICE
2.4 The 1970s: Failure of the attempt to make planning more flexible.
2.4.6.1 Participant recruitment: A further 200 children were recruited, mostly from schools in the North West but also from other areas of the country. In the pilot study for the Assessment of Comprehension and Expression 6-11 (ACE) (Adams et al, 2009), the 20 typically developing children tested did not go on to participate in the main data collection. In the present study, by contrast, the 50 children in the researcher’s pilot study were included in the final study (using their initial data converted into the final HICIT format), making a total of 250 children tested.
The researcher carried out 122 final HICIT assessments in two primary schools in the North West, and eight final-year Manchester Metropolitan University (MMU) student volunteers carried out 78 final HICIT assessments, some in the North West and some in other areas of England. More details about the numbers, age groups and locations in which they tested are given in the following results chapter (chapter 3). Similarly, Cain et al (2009) used 25 undergraduate students from Lancaster University to collect data for their idiom study in the North West of England.
The sample size in this study is smaller than that of some other published language tests. For example, the ACE standardisation was carried out on 117-145 children in each age group (6:00-11:11), compared to 50 children per age group in the current study. However, given the time constraints of a part-time PhD study and the fact that the majority of the assessments were carried out by the researcher, 250 is a realistic sample size.
A sample of 250 is also large enough that any skewness and kurtosis effects would not significantly affect the analysis (Pallant, 2013).
2.4.6.2 Free School Meals (FSM) Index: The researcher used two primary schools in the North West, one with a Free School Meals (FSM) percentage of 9.6 and one with an FSM percentage of 53.6. Students from MMU carried out assessments in schools in their home towns. The FSM percentages in these schools ranged from 4.8 to 50.3.
2.4.6.3 Procedure: The researcher requested involvement from the schools in the North West using the letter in appendix xiv. The students were given the letter in appendix xv to send to their local school. The procedure was the same as for the pilot study. The researcher briefed the staff in her schools at twilight staff meetings. The student volunteers were sent procedural instructions to follow (see appendix xvi). They were asked to send the completed forms back to the researcher for her to score. However, they were also asked to mark one of the completed assessments themselves, using the marking criteria provided. This provided a measure of student-researcher inter-rater reliability.
The children were briefed in the same way as for the pilot study. In addition, 10% of these children were audio-recorded for moderation purposes. The parents of the children selected for audio-recording had returned a written consent form agreeing to this (appendix vi). The children were asked verbally for their assent prior to the recording. No child refused to be audio-recorded.
2.4.6.4 Equipment: The equipment used in the final study comprised the final assessment forms, a digital tape recorder, a pen, the marking criteria, and stickers to give out as rewards to those children who wanted one.
2.4.6.5 Challenges to data collection in the pilot and final study:
Space was at a premium in all schools, so it was often difficult to get a designated quiet room in which to do the assessments. Some of the researcher’s testing had to be carried out in noisy school corridors or large cloakrooms where there were frequent distractions and interruptions. Locating the children to assess was occasionally problematic, as the classes sometimes moved to a different location for certain lessons and activities (eg PE, IT, assembly). The length of the assessment session was challenging for some of the younger children, as their attention span was shorter than that of the older children. Consequently, the assessments for the 4-year-olds and some of the 5-year-olds were split into two separate sessions on the same day, with the child having a rest in between. When the assessments were scored, there were two instances of a missing (unrecorded) response, from two different children. The researcher was able to go back to the setting the following day to obtain these missing data.
2.4.6.6 Data Analysis Methods
Descriptive statistics: The following were calculated for the total final scores by age and gender, and for the section scores by age: the mean, the 5% trimmed mean, the median, the variance, the standard deviation, the range, the interquartile range, the skewness, the kurtosis, and the Kolmogorov-Smirnov significance value.
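As an illustration of how most of the statistics listed above can be computed, the following is a minimal sketch using only the Python standard library. It is not the software used in this study. The function names and the example scores are hypothetical, skewness and kurtosis are the simple moment-based estimators (statistical packages often apply a small-sample correction, so their values may differ slightly), and the Kolmogorov-Smirnov test is omitted because it is not available in the standard library.

```python
import statistics


def trimmed_mean(values, proportion=0.05):
    """Mean after dropping `proportion` of the cases from each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of cases cut from each end
    return statistics.fmean(ordered[k:len(ordered) - k])


def describe(scores):
    """Return the descriptive statistics for one list of total scores."""
    n = len(scores)
    mean = statistics.fmean(scores)
    q1, _, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
    # Central moments (divided by n) for the moment-based shape estimators.
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    return {
        "mean": mean,
        "trimmed_mean_5pct": trimmed_mean(scores, 0.05),
        "median": statistics.median(scores),
        "variance": statistics.variance(scores),  # sample variance (n - 1)
        "sd": statistics.stdev(scores),
        "range": max(scores) - min(scores),
        "iqr": q3 - q1,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3,  # excess kurtosis (normal = 0)
    }


# Hypothetical scores for one age group (illustration only, not study data).
example_scores = list(range(10, 50, 2))
summary = describe(example_scores)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In practice the same function would be applied separately to each age group and gender, and to each section score.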
Inferential statistics: Parametric tests are used where the data are drawn from a normally distributed population, there is homogeneity of variance (the data are drawn from populations with approximately equal variances), and the data are measured on an equal-interval scale. Non-parametric tests are distribution-free: they make no assumptions about the shape of the underlying distribution. Parametric tests are more powerful than non-parametric tests (Pring, 2005). As the variables in this assessment are measured on interval scales, parametric statistical tests are used (ie t-tests and Analysis of Variance (ANOVA)) (McCauley, 2001).
A one-way Analysis of Variance (ANOVA) was carried out to explore the effect of socio-economic status on the total test scores. A two-way ANOVA was used to explore the effects of gender and age group on the total test scores.
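To make the logic of the one-way ANOVA concrete, the sketch below computes the F ratio as the between-groups mean square divided by the within-groups mean square. It is an illustration only, not the analysis software used in this study. The function name, the three groups and their scores are hypothetical, and the p-value is not computed because the F distribution is not available in the Python standard library.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for group in groups for x in group]
    grand_mean = statistics.fmean(all_scores)

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.fmean(group) - grand_mean) ** 2
        for group in groups
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - statistics.fmean(group)) ** 2
        for group in groups
        for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical total scores for three socio-economic bands (not study data).
band_a = [1, 2, 3]
band_b = [2, 3, 4]
band_c = [3, 4, 5]

f_ratio, df_b, df_w = one_way_anova([band_a, band_b, band_c])
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")  # F(2, 6) = 3.00
```

A larger F ratio indicates that the differences between group means are large relative to the spread of scores within each group. The two-way ANOVA extends the same partitioning of variance to two factors and their interaction.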
Validity: Exploratory and confirmatory factor analyses were carried out.
Reliability: Internal reliability was assessed by applying Cronbach’s alpha and by carrying out a Rasch analysis of the data. Rasch analysis is a type of Item Response Theory (IRT) test analysis. Following Jabrayilov et al (2016), this is preferable to Classical Test Theory (CTT) because the HICIT contains over 20 items. External reliability was tested by measuring inter-rater reliability.
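Cronbach’s alpha can be written as alpha = (k / (k - 1)) × (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below applies this formula to a small matrix of correct/incorrect (1 or 0) responses. It is an illustration only: the function name and the response data are hypothetical, and the four-item example is far shorter than the HICIT.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per child)."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # one column per item
    item_variance_sum = sum(statistics.variance(col) for col in items)
    total_variance = statistics.variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical 1/0 responses from five children on four items.
example_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(example_responses)
print(round(alpha, 3))  # 0.8
```

Values closer to 1 indicate that the items vary together and so are more likely to be measuring the same underlying ability.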
2.4.6.7 Qualitative analysis of the HICIT responses
The main data analysis in this study is quantitative. However, although the responses are scored correct/incorrect (1 or 0), many questions allow a wide range of correct responses. Norbury and Bishop (2002) developed nine categories of incorrect responses for their qualitative typology of inference error types. These were: failure of literal comprehension, a wrong inference, a wrong immature inference, an immature reference, an odd inference, a ‘because he did’ answer, a ‘scope’ answer (along the right lines, but too specific or too vague to be correct), lack of expressive ability, and no response. The current researcher found these categories too vague or overlapping. Consequently, she developed very detailed marking criteria specifying a wide range of possible correct answers. However, there will still be occasional answers that require the tester to use their own judgement. Some qualitative data on the test responses are presented at the end of the results chapter.
2.5 Chapter summary
This chapter has covered the primary and secondary research aims and hypotheses, qualitative and quantitative methodologies, research epistemology, the principles of quantitative test design, and the research design of this study. The last of these (section 2.4) is divided into the following sub-sections: the development of the HICIT and its marking criteria; methodological procedures common to the pilot and final study; the pilot study procedures; the final study implementation; and the data analysis methods to be used in analysing the results.
The following results chapter will commence with a review of the study aims and hypotheses and will then give a summary of the results of the quantitative and qualitative analyses described above.
CHAPTER THREE: RESULTS