Crymble compares three different methods of identifying the Irish in the Old Bailey Proceedings (hereafter OBP) between 1801 and 1820: nominal record linkage, geographic keywords (for instance ‘Irish’ or ‘Dublin’), and surname analysis.10 Nominal record linkage involves locating individuals within a source and then corroborating details about them (their nationality for instance) by finding mention of the same individual in a different source (such as the census). Crymble dismisses the method for a number of practical reasons, but primarily because he is loath to be limited to the study of ‘periods and contexts in which corroborating records exist’.11 He finds the second method, geographic keyword searching, to be ill suited to his source, as the OBP rarely contained national adjectives. However, he finds the final method, surname analysis, to show promise as a means of locating first- and second-generation migrants.12
9 B. Walter, ‘Strangers on the Inside: Irish Women Servants in England, 1881’, Immigrants & Minorities, 27:2 (2009), pp. 279-299 (p. 287) <https://doi.org/10.1080/02619280903128160>.
10 Crymble, ‘A Comparative Approach’, p. 141.
‘Surname detection’ saw Crymble analyse the records of 279,949 adult males recorded by the 1841 census as living in the hundred of Ossulstone in Middlesex in order to give each of their names an ‘Irishness’ score.13 This resulted in the creation of a list of 283 root surnames that were deemed ‘Irish’ enough to reliably identify ‘probable Irish individuals’ located in London towards the end of the long eighteenth century (defined for the purposes of the study as 1777 to 1820). The surname list is included in the appendix of Crymble’s article for use by future researchers.14 To test its efficacy on nineteenth-century newspapers, I used the list to search the Pall Mall Gazette. Aside from its containing some of the best-quality OCR data of any of the titles in the newspaper sample, a London paper seemed fitting since the surname list was formulated using census data from the region.
The list of 283 surnames returned a promising 185,300 results in the Pall Mall Gazette.
However, it swiftly became apparent that, as in Beall’s findings, homonyms were a serious issue.15 Many of the results did not relate to people, Irish or otherwise. The surnames that returned the most results included ‘Early’ (which occurred 53,095 times), ‘Coffee’ (8,946 times), and ‘Divine’ (7,488 times), all of which, as their collocates confirm, have other meanings that are more prominent than their use as surnames. As discussed in section 2.4.4, the corpora in CQPweb have been ‘marked-up’ with ‘part-of-speech’ tags. I therefore searched for instances of ‘Early’, ‘Coffee’ and ‘Divine’ which had been tagged as singular proper nouns (the tag ‘NP1’), the tag under which surnames would fall. The results of this search seemed to indicate that there were no Irish surnames amongst the 69,529 instances of the three words in the Pall Mall Gazette (although some instances of the words being used as surnames could have been missed by the automatic part-of-speech tagger). ‘Early’ and ‘Divine’ were never tagged as singular proper nouns and, whilst ‘Coffee’ was tagged as such 59 times, all instances related to one individual, King ‘Coffee Kalcalli’ of Ashanti, an empire that existed in present-day Ghana.
11 Crymble, ‘A Comparative Approach’, p. 141.
12 Crymble, ‘A Comparative Approach’, p. 143.
13 Crymble, ‘A Comparative Approach’, p. 145.
14 Crymble, ‘A Comparative Approach’, p. 152.
15 J. Beall, ‘The Weakness’, pp. 438-444.
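The part-of-speech filter described above can be illustrated with a minimal sketch. It is not the CQPweb query itself: the sample tokens, the function name `surname_hits`, the case-insensitive matching, and every tag other than ‘NP1’ are assumptions made purely for illustration. The sketch counts how often each listed surname appears at all, and how often it appears with the proper-noun tag.

```python
from collections import Counter

# Hypothetical (word, tag) pairs standing in for a tagged newspaper corpus.
# Only the tag "NP1" comes from the text; the other tags are illustrative.
SAMPLE = [
    ("Early", "RR"), ("in", "II"), ("the", "AT"), ("morning", "NNT1"),
    ("Mr", "NNB"), ("Kelly", "NP1"), ("drank", "VVD"), ("coffee", "NN1"),
    ("King", "NNB"), ("Coffee", "NP1"), ("Kalcalli", "NP1"),
    ("divine", "JJ"), ("service", "NN1"), ("Burke", "NP1"),
]


def surname_hits(tagged_tokens, surnames, proper_noun_tag="NP1"):
    """Return (all matches, matches tagged as proper nouns) per surname.

    Matching is case-insensitive, which is an assumption of this sketch.
    """
    wanted = {name.lower() for name in surnames}
    all_hits = Counter()
    proper_hits = Counter()
    for word, tag in tagged_tokens:
        key = word.lower()
        if key not in wanted:
            continue
        all_hits[key] += 1
        if tag == proper_noun_tag:
            proper_hits[key] += 1
    return all_hits, proper_hits


all_hits, proper_hits = surname_hits(
    SAMPLE, ["Early", "Coffee", "Divine", "Kelly", "Burke"]
)

# Totals reported in the text for the three most frequent homonyms.
reported_total = 53_095 + 8_946 + 7_488
```

In this toy sample the tag filter discards ‘Early’ and ‘divine’ entirely but still lets ‘Coffee’ through once, because it is part of a personal name. As with King ‘Coffee Kalcalli’, a proper-noun tag narrows the results without showing that a name is Irish, so the concordance lines still need to be read.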
Once these homonyms were excluded, many of the results did appear to be surnames. Furthermore, the appearance of Irish place names in the concordance lines indicates that some of the individuals identified were, most likely, Irish, although it is worth noting that they were more likely to be Irish citizens resident in Ireland than Irish migrants to Britain. However, several other potential issues arose.
One was the weighting of results towards famous individuals. As might be expected, amongst the surnames that appeared most frequently in the Pall Mall Gazette were those of high-profile figures such as politicians. These included ‘Kelly’, which occurred 4,209 times, often in reference to Fitzroy Kelly, the Tory MP, and ‘Burke’, which occurred 3,957 times, usually in relation to the Irish statesman Edmund Burke or John Bernard Burke, the genealogist responsible for Burke’s Peerage. That Edmund and John Bernard were Irish or of Irish descent is a testament to the accuracy of Crymble’s list. However, although this provided an insight into the types of people likely to be named by the press, these high-profile figures’ strong association with other discourses made it unlikely that they would reveal much about the newspapers’ wider attitude towards migrants.16
16 It does, however, seem to reinforce Galtung and Ruge’s suggestion that celebrities are more likely to feature in newspapers than non-celebrities.
Ultimately, although surname analysis proved fruitful for searching the OBP because ‘every single defendant [in the source] has a known name’, it is not necessarily the most practical means of searching newspaper data, where individuals rarely feature unless they already have a level of renown or notoriety.17 This was confirmed by the discovery that most mentions of non-famous Irish individuals in the newspapers seemed to occur within reporting upon criminal trials, the context in which the list was originally intended to be used. Court or criminal reporting appears to have been one of the main contexts in which non-famous individuals gained a high enough profile to feature in the pages of the press in their own right, rather than as part of a group such as ‘aliens’.
It is also doubtful whether this method could be scaled both geographically, to non-London newspapers, and temporally, to the later nineteenth century. Although Crymble manages to successfully apply his method to earlier sources, the Middlesex Vagrancy Removal Records (1777 and 1786), he acknowledges the importance of using the ‘appropriate list of names’ and adds that the large numbers of Irish migrants who arrived in Britain during the Famine era (the 1840s and 1850s) ‘dramatically changed the demographic’.18 Therefore, if surname detection were to be used upon later nineteenth-century texts, a new list of names would probably have to be created that drew upon later census data.19 Since this thesis is not just interested in the Irish, but in the newspapers’ representation of migrants more generally, it would be necessary to create new lists of names that could be used to locate migrants of different nationalities. All these factors combined to mean that, on balance, searching by surname was not the most suitable method for use in this research.
17 Crymble, ‘A Comparative Approach’, p. 144.
18 Crymble, ‘A Comparative Approach’, p. 144.
19 This is also the focus of Smith and MacRaild’s earlier study which attempted to create a corpus of Irish surnames. M. Smith and D. MacRaild, ‘Paddy and Biddy No More: An Evolutionary Analysis of the Decline in Irish Catholic Forenames among Descendants of 19th Century Irish Migrants to Britain’, Annals of Human Biology, 36:5 (2009), pp. 595-608 <https://doi.org/10.1080/03014460903117459>.
Searching using Categories
The other two types of query draw upon categories, that is, the division of people into groups based upon shared characteristics or attributes. One draws upon the shared characteristic of nationality, and the other upon shared citizenship status or the shared act of migrating. Categories are a necessary means of simplifying and digesting the world around us, including the people within it. Indeed, Fowler notes that as a medium that purports to make sense of events, there is a particularly ‘dense presence in newspaper discourse of category labels’.20 This means that categories are so commonplace that they become difficult to consider reflectively. Foucault documented his encounter with ‘a certain Chinese encyclopaedia’ that made the following distinctions between animals: ‘(a) belonging to the emperor’, ‘(f) fabulous’, and ‘(n) that from a long way off look like flies.’21 It was only when confronted with this unfamiliar, and seemingly ridiculous, classification system that Foucault was able to see the flaws inherent in his own and begin to question it as an outsider would.22
As they group things together based on shared characteristics, categories are inherently reductive. This has its advantages; categories can give the historian a useful insight into the relationship between the enquirer and object of enquiry. In the case of newspaper reporting upon migrants, this relationship is often oppositional. For instance, the category ‘alien’ contains people who were born in a country different to that of the person doing the categorising. However, it can also be a disadvantage because, as discussed, using categories can skew the historian’s findings towards the negative. Although not inherently derogatory, categories are often used to emphasise difference and otherness. In this sense, categories ‘establish psychologically significant dividing lines between the perceiver and the target (i.e. out groups).’23
20 R. Fowler, Language in the News: Discourse and Ideology in the Press (Abingdon: Routledge, 1991), p. 93.
21 M. Foucault, The Order of Things: An Archaeology of the Human Sciences (London: Routledge, 2002), p. xvi.
22 Foucault, The Order of Things, p. xvi.