Selection criteria
The texts included in the experiments were not complete articles but selected passages. This was done in order to lessen the amount of time and effort required from the participants (especially those with ASD) to assess all 27 texts. The selected passages are self-contained and coherent, meaning that they do not refer to information given in the rest of the article and can be comprehended independently of it. The rest of the selection criteria are outlined below:
Prior or specialised knowledge: Texts requiring a high level of prior general or specialised knowledge were discarded. This variable had to be controlled to ensure that lack of comprehension would not be due to external factors such as insufficient general knowledge. Although it is hard to measure how much prior knowledge is needed to understand a concept (knowing the meaning of a word, term or named entity can also be regarded as prior knowledge), it was ensured that, as far as possible, all facts and events in the selected texts would be non-specialised or would be explained in the text. All events are self-contained.
Controversy of the topic: Texts containing events or opinions related to religion, sexuality, violence or to other sensitive topics were not included in the experiments.
Terminology: None of the selected texts contained highly specialised terms, unless those terms were explained in the text.
Culture: The selected materials referred to world news and events which did not require a particular cultural background in order to be successfully comprehended.
Sources: The sources of the texts were as miscellaneous as possible in order to avoid bias based on the source. They adhered to three main registers: educational, news and general-informational articles. In total, eight texts were obtained from leaflets targeted at people with cognitive disabilities, seven of which were easy-to-read leaflets produced by the National Health Service (UK) and one of which was a school leaflet. School materials comprised eight texts from the BBC-Bitesize website¹, which contains short educational articles levelled for children from the age of seven to the age of sixteen. Three texts were obtained from the VU Amsterdam Metaphor Corpus (Steen et al. 2010), three from online personal blogs, four from various UK newspapers and one from the novel “Sense and Sensibility” by Jane Austen.
Characteristics of the selected texts
A total of 27 text passages of varying complexity were obtained from the web. The genres were miscellaneous, covering educational (seven documents), news (ten documents) and general articles (three documents), as well as easy-to-read texts (seven documents). The mean number of words per text was m = 156 with standard deviation SD = 49.94. The mean number of sentences per text was m = 10.15, SD = 3.6. The texts covered a range of readability levels, with a mean of m = 65.07 (SD = 13.71) according to the Flesch Reading Ease (FRE) score (Flesch 1949), which is expressed on a scale from 0 to 100 (the higher the score, the easier the text). Details about the individual texts are presented in Table 3.1. The Flesch-Kincaid Grade Level (FKGL) in Table 3.1 rises with text difficulty, whereas the FRE score falls as difficulty increases.

¹ BBC-Bitesize. Available at: http://www.bbc.co.uk/education [online] [Last accessed:

Table 3.1: Characteristics of the ASD corpus

Text  Genre        Words  FKGL    Flesch
T1    Educational  163    4.93    79.548
T2    Educational  178    4.671   80.22
T3    Educational  206    7.577   65.437
T4    Educational  189    9.276   56.758
T5    Newspaper    226    11.983  40.658
T6    Newspaper    160    8.866   59.82
T7    Newspaper    163    8.765   66.657
T8    Newspaper    185    14.678  45.34
T9    Newspaper    188    9.823   58.298
T10   General      108    4.243   82.305
T11   General      141    4.561   79.108
T12   Newspaper    166    10.344  57.859
T13   Educational  209    6.087   70.124
T14   Educational  151    5.783   60.258
T15   Educational  158    6.102   57.2013
T16   Newspaper    198    13.204  46.481
T17   General      147    11.035  51.965
T18   Newspaper    227    10.171  49.093
T19   Newspaper    242    7.812   67.79
T20   Newspaper    150    9.523   64.953
T21   Easy-read    77     8.16    60.11
T22   Easy-read    96     6.73    67.33
T23   Easy-read    74     2.71    92.54
T24   Easy-read    178    5.52    75.33
T25   Easy-read    77     5.79    70.67
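To make the relationship between the two readability measures concrete, the following is a minimal Python sketch of the standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. It is an illustration only and not the tool used to produce the scores in Table 3.1: the function names, the regex-based sentence and word splitting, and the vowel-group syllable counter are all assumptions introduced here, so the output will generally differ somewhat from the reported values.

```python
import re
from typing import Tuple


def count_syllables(word: str) -> int:
    """Rough syllable estimate (hypothetical helper): count vowel groups."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Heuristic: a final "e" is usually silent, except in "-le" endings.
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> Tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for text."""
    # Naive segmentation (assumption): sentences end at ., ! or ?
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Standard published coefficients for both formulas.
    fre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fre, fkgl


if __name__ == "__main__":
    sample = "Secondhand smoke comes from burning cigarettes. That smoke has many chemicals in it."
    fre, fkgl = readability(sample)
    print(f"FRE = {fre:.2f}, FKGL = {fkgl:.2f}")
```

Both formulas use the same two inputs, average sentence length and average syllables per word, but with opposite signs, which is why a text with longer sentences and longer words receives a higher FKGL and a lower FRE.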
Below is an example of an educational text from the ASD corpus:
“Before the industrial revolution in Britain, most peppered moths were of the pale variety. This meant that they were camouflaged against the pale birch trees that they rest on. Moths with a mutant black colouring were easily spotted and eaten by birds. This gave the white variety an advantage, and they were more likely to survive to reproduce. Airborne pollution in industrial areas blackened the birch tree bark with soot. This meant that the mutant black moths were now camouflaged, while the white variety became more vulnerable to predators. This gave the black variety an advantage, and they were more likely to survive and reproduce. Over time, the black peppered moths became far more numerous in urban areas than the pale variety.”
Another example of a text from the ASD corpus, this time from a news article, follows:
“The season finale of The Great British Bake Off was the third most popular programme on television last year outflanked only by two World Cup football matches. The final episode of this sea- [...] show of 2015. Over the last five years, in fact, Bake Off has so thoroughly entangled itself with the consciousness of the nation that it has become easy to forget how very, very strange it is that 10 million Britons switch on their TV sets each Wednesday evening to watch a baking contest filmed in a tent in the countryside. No one predicted the scale of its success. Richard McKerrow and Anna Beattie, who founded Love Productions, which makes the show, tried to sell the idea for four years before BBC2 finally picked it up. Their original inspiration, they told me, was the rural baking competition at a village fete; they liked the idea that bakers were naturally generous making delicious things for others.”
Finally, an example of a general-informational text from the ASD corpus is presented below:
“Secondhand smoke (SHS) comes from burning cigarettes, pipes, or cigars. That smoke has many chemicals in it. Experts say that breathing SHS can harm a person’s body. It can also cause headaches and make some illnesses worse. People breathe secondhand smoke when that smoke is close by. Use this countdown to help you breathe cleaner air! 1. Open a window to get some fresh air. 2. Tell the smoker how smoking affects them and YOU! 3. SHS bothers the eyes by making them burn and feel dry. 4. SHS raises the chances of getting lung diseases.”