The BKM method of cluster analysis requires the analyst to specify the desired number of clusters before running the procedure. To compare the effect of requesting different numbers of clusters, a range of solutions from 2 to 30 clusters was produced and the results compared. Table 8-2 shows the ratio of between-cluster variance to total variance for each of these cluster solutions, and the incremental improvement from one solution to the next. The table suggests that a 14-cluster solution offered the best ratio of between-cluster variance to total variance. This coefficient rises steadily as the number of clusters increases (although a number of partitions result in a temporary reduction in the metric, for example partitions 8, 11 and 18). With the aim of selecting a parsimonious solution, a 14-cluster solution was initially explored. However, to consider the implications for analysis and interpretation, the sub-clusters produced by a 30-cluster solution were also inspected.
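The ratio reported in Table 8-2 can be illustrated with a minimal sketch. It assumes the ratio is computed from sums of squared deviations (between-cluster over total), which is one common reading of "between-cluster variance/total variance"; the function names and the toy data are hypothetical and are not taken from the analysis itself.

```python
def between_to_total_ratio(points, labels):
    """Between-cluster sum of squares divided by total sum of squares.

    points: list of equal-length numeric vectors; labels: one label per point.
    """
    n = len(points)
    dims = range(len(points[0]))
    grand_mean = [sum(p[d] for p in points) / n for d in dims]
    total_ss = sum((p[d] - grand_mean[d]) ** 2 for p in points for d in dims)

    # Group the points by their cluster label.
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)

    between_ss = 0.0
    for members in groups.values():
        centroid = [sum(p[d] for p in members) / len(members) for d in dims]
        # Each cluster contributes its size times the squared distance
        # of its centroid from the grand mean.
        between_ss += len(members) * sum(
            (centroid[d] - grand_mean[d]) ** 2 for d in dims
        )
    return between_ss / total_ss


def stepwise_improvement(ratios):
    """Change in the ratio from each solution to the next (may be negative)."""
    return [later - earlier for earlier, later in zip(ratios, ratios[1:])]


# Hypothetical toy data: two tight groups far apart on the first dimension.
toy_points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
one_cluster = between_to_total_ratio(toy_points, [0, 0, 0, 0])
two_clusters = between_to_total_ratio(toy_points, [0, 0, 1, 1])
four_clusters = between_to_total_ratio(toy_points, [0, 1, 2, 3])
```

A negative entry from `stepwise_improvement` would correspond to the temporary reductions noted for some partitions.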
Table 8-2 Comparing the between-cluster variance to total variance ratio of cluster solutions
No. Clusters Ratio
Figure 8-1 Dendrogram showing the hierarchical structure of clusters
Figure 8-1 presents a dendrogram of the clusters identified by the analysis. In this diagram, each numbered vertical line represents a partition at which an existing cluster is split into two. It is therefore possible to ‘roll’ partitions backwards and forwards to consider the structures that would be produced for different numbers of partitions. This is an advantage of hierarchical methods over pure k-means clustering. As noted above, the first iteration of the analysis stopped at 14 clusters. A second iteration drilled down further to produce a 30-cluster solution. The clusters identified in the second iteration have been labelled as sub-clusters of their parent cluster. Thus the sub-clusters produced by the further division of cluster 14 become 14a, 14b, 14c, 14d and 14e, and so on. Overall, 20,608 cases were successfully assigned to one of the 30 clusters, leaving 7% of cases unclassified35. Guided by the hierarchical structure that is characteristic of BKM cluster analysis, the 30 clusters have been presented as 14 ‘parent’ clusters with a variable number of sub-clusters. The clusters and sub-clusters that were identified are listed in Table 8-3 below. The current section will describe the clusters produced by the analysis; Chapter 9 will consider the integrity of the cluster solution as a whole.
35 Inspection of unclassified cases did not reveal any specific patterns within the group. Many of these MO descriptions were very brief or incomplete and the group included many MOs in which administrative and audit relevant information dominated the crime description.
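The idea of rolling partitions backwards and forwards can be sketched as follows. The sketch assumes only that each partition splits one existing cluster into two, as in the dendrogram; the split history and the labels used here are hypothetical and do not reproduce the partitions in Figure 8-1.

```python
def clusters_after(splits, n_clusters, root="all"):
    """Return the cluster labels present once n_clusters exist.

    splits is the ordered split history: each entry (parent, left, right)
    records one partition that divides an existing cluster into two.
    """
    if not 1 <= n_clusters <= len(splits) + 1:
        raise ValueError("n_clusters is outside the range of the split history")
    active = [root]
    # Applying the first k - 1 splits yields the k-cluster solution.
    for parent, left, right in splits[: n_clusters - 1]:
        position = active.index(parent)
        active[position:position + 1] = [left, right]
    return active


# Hypothetical history of three partitions.
history = [("all", "A", "B"), ("B", "B1", "B2"), ("A", "A1", "A2")]
three_cluster_solution = clusters_after(history, 3)
four_cluster_solution = clusters_after(history, 4)
```

Because the history is ordered, any coarser solution is recovered by applying fewer splits, which is what distinguishes this structure from a set of independent k-means runs.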
Table 8-3 Clusters identified from theft from the person and robbery of personal property MO descriptions
Cluster                                               No. Cases
1    Pickpocketing at Venues                                692
2    Bag snatches                                          2392
  2a   including resistance and injury                      110
  2b   from pushchairs and including verbal aggression      144
  2c   offender on cycle                                   1583
  2d   including following victim home                      555
3    Jewellery Snatches                                     682
4    Theft of Cycles                                        757
5    Theft from Bags (Shops)                               2185
  5a   theft from Bags (Shops)                             1388
  5b   theft from Bags (with distract)                      285
  5c   including banks, cards and withdrawals               424
  5d   thefts from shop staff                                88
6    Robberies within Dwellings                            1355
7    Demands and Threats                                   1472
  7a   not involving knives                                1010
  7b   involving knives                                     462
8    Involving taxis                                        313
9    Vehicle related crimes                                2318
  9a   Carjacking and thefts from cars                     1645
  9b   Car thefts from dwellings                            673
10   Robbery involving young people                        1262
  10a  in parks                                             378
  10b  on footpaths, alleyways and in schools               884
11   Theft from bags at Licensed Premises                   689
12   Robbery with Assault                                  2842
  12a  with assault and injury                              837
  12b  with assault no injury                               204
  12c  with assault domestic                               1801
13   Mobile phone theft                                    1355
  13a  with communication                                   989
  13b  without communication                                366
14   Thefts on buses and at bus stops                      2294
Not assigned to a cluster                                  1585
Grand total                                               22193
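The internal consistency of Table 8-3 can be checked with a short sketch. The counts are copied from the table; the variable names are illustrative only.

```python
# Case counts for the 14 parent clusters, copied from Table 8-3.
parent_counts = {
    1: 692, 2: 2392, 3: 682, 4: 757, 5: 2185, 6: 1355, 7: 1472,
    8: 313, 9: 2318, 10: 1262, 11: 689, 12: 2842, 13: 1355, 14: 2294,
}

# Sub-cluster counts for the parents that were divided further.
subcluster_counts = {
    2: [110, 144, 1583, 555],
    5: [1388, 285, 424, 88],
    7: [1010, 462],
    9: [1645, 673],
    10: [378, 884],
    12: [837, 204, 1801],
    13: [989, 366],
}

unassigned = 1585

assigned = sum(parent_counts.values())
grand_total = assigned + unassigned
unassigned_share = unassigned / grand_total

# Parents whose sub-cluster counts do not add up to the parent count.
mismatched_parents = [
    parent for parent, counts in subcluster_counts.items()
    if sum(counts) != parent_counts[parent]
]
```

On these figures the assigned cases sum to 20,608, the grand total is 22,193, every set of sub-clusters adds up to its parent, and the unassigned share rounds to 7%.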
Table 8-3 and the descriptions of each cluster below highlight that the analysis did not produce a one-to-one mapping of clusters to scripts or tracks. Some of the clusters can be interpreted as a script track in a straightforward manner. Other clusters contain a small number of tracks relating to the same script, or reveal different permutations through which the same track is performed. In some cases the clusters contain different tracks that are not related to each other, and have grouped together cases that could not, through logic or experience, be considered part of the same script.
Thus the identification of different script tracks stems partly from the clusters, but also from additional inspection and interpretation of the data. The relationship between the clusters, sub-clusters, scripts and tracks is depicted in Figure 8-31 at the end of this chapter.
The following sections will describe each of the 14 clusters and their sub-clusters. Script and CCO frameworks were used to organise the description of the distinctive features of each cluster. Script frameworks helped to organise the information available from the analysis relating to the distinct script tracks available to accomplish different theft from the person scripts, and the combinations of script procedures that are required to complete these tracks. The CCO framework helped to identify causal factors, predominantly those in the situation, that potentially shape offenders' actions and the procedures they elected to follow.
Each cluster description will be accompanied by a series of figures, graphs and tables. These include a graphic depiction, in the form of a wordcloud, of the tokens that are characteristic of each script. The wordclouds include only those tokens that are characteristic of a given cluster, i.e. those that occur significantly36 more often than would be expected if all clusters shared a similar proportion of all MO features. The greater the significance of a token for the cluster, the larger the font size in the diagram; the font size is weighted by the chi-square score and not by the raw frequency with which the token occurs in the cluster. Each cluster description also includes a table highlighting examples of the cases that best represent the cluster. These cases were selected on the basis of the cluster membership score (see section 4.4.1). The cluster membership scores for each case are proportional across the 14 parent clusters. Therefore a hypothetical case that had an equal probability of being assigned to any cluster would have scores of around 7% for each cluster. A cluster membership score of 100% indicates total confidence that a case is a member of a given cluster (and all other cluster membership scores would be zero). Finally, a tree diagram has been produced for each cluster showing the relevant script tracks and the CCO elements most indicative of that cluster (see Appendix 2).
36 Chi-square > 10.83 (p < .001, df = 1); see Methodology Section 4.4.1.
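The significance test behind the wordclouds can be sketched as a chi-square test on a 2x2 table (token present or absent, case inside or outside the cluster). This is an assumption about the form of the test: the procedure actually used is described in Section 4.4.1, which is not reproduced here. The function names and the counts in the example are hypothetical; only the 10.83 threshold and the 14 parent clusters come from the text.

```python
CHI_SQUARE_CRITICAL = 10.83  # threshold quoted in footnote 36 (p < .001, df = 1)


def chi_square_2x2(in_with, in_without, out_with, out_without):
    """Pearson chi-square for a 2x2 table, without continuity correction."""
    n = in_with + in_without + out_with + out_without
    denominator = (
        (in_with + in_without) * (out_with + out_without)
        * (in_with + out_with) * (in_without + out_without)
    )
    if denominator == 0:
        return 0.0  # a zero margin means the token cannot discriminate
    return n * (in_with * out_without - in_without * out_with) ** 2 / denominator


def token_is_characteristic(in_with, cluster_size, total_with, total_cases):
    """Flag a token that is significantly over-represented in a cluster.

    Assumes counts are numbers of cases containing the token (hypothetical).
    Returns (flag, chi-square score).
    """
    in_without = cluster_size - in_with
    out_with = total_with - in_with
    out_without = total_cases - cluster_size - out_with
    score = chi_square_2x2(in_with, in_without, out_with, out_without)
    expected = cluster_size * total_with / total_cases
    overused = in_with > expected
    return overused and score > CHI_SQUARE_CRITICAL, score


def equal_membership_share(n_clusters=14):
    """Membership score of a case equally likely to belong to any cluster."""
    return 1.0 / n_clusters


# Hypothetical counts: a token in 40 of 100 cluster cases and 100 of 1000 overall.
flagged, score = token_is_characteristic(40, 100, 100, 1000)
# A token that occurs at exactly the expected rate is not flagged.
not_flagged, null_score = token_is_characteristic(10, 100, 100, 1000)
```

Under this reading the chi-square score, rather than the raw count, would drive the font size, and `equal_membership_share()` gives the share of roughly 7% mentioned for a case with no preferred cluster.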
Throughout the discussion of clusters, example MO descriptions are faithfully reproduced from the police data, the only changes to raw text being the removal of any identifying information, the conversion of multi-word phrases to one token and the highlighting of significantly distinctive tokens in red.
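The conversion of multi-word phrases to single tokens can be illustrated with a minimal sketch. The phrase list below is hypothetical (chosen to resemble tokens such as mobile_phone and out_of that appear in the examples), and the actual tokenisation rules used in the analysis may differ.

```python
import re

# Hypothetical phrase list; the list used in the analysis is not given here.
PHRASES = ["mobile phone", "out of", "made off"]


def join_phrases(text, phrases=PHRASES):
    """Replace each listed multi-word phrase with one underscore-joined token."""
    # Longer phrases are handled first so they are not broken up by shorter ones.
    for phrase in sorted(phrases, key=len, reverse=True):
        pattern = r"\b" + r"\s+".join(re.escape(word) for word in phrase.split()) + r"\b"
        text = re.sub(pattern, phrase.replace(" ", "_"), text)
    return text.split()


tokens = join_phrases("took mobile phone from out of pocket")
```

In this example `tokens` holds five items, with mobile_phone and out_of each counted as a single token.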
Cluster 1: Pickpocketing at venues
Cluster 1 Mapped to one theft from the person track: Pickpocketing at Venues
Figure 8-2 Significant tokens in cluster 1 (font size relative to chi-square score).
Figure 8-3 Significant tokens in cluster 1 – top 5 tokens removed (font size relative to chi-square score).
Table 8-4 Examples of MO descriptions assigned to cluster 1: pickpocketing at venues. Includes MO descriptions that had (a) high (b) average and (c) low cluster membership scores.
Cluster Membership Score stolen from his jeans pocket by unknown offender.nightclub ips left jogging bottom and stole cash from_his_pocket estimated
ii bmt at offence_location ip was in main floor of a large concert when ip had mobile_telephone stolen from jeans pocket by uk [unknown] offender. picked ip taxi badge off ip's front sweater "road
CMS = 7%
ii ip walking out_of concert when persons unknown took
mobile_phone from out_of pocket of jacket ip was wearing.
entertainment offenders x 2 located in close proximity
CMS = 8%
Words highlighted in red are significantly characteristic of this cluster (Chi-Square p<0.001)
Cluster 1 related to instances of theft from the person occurring principally in entertainment/concert venues but also, more generally, in licensed premises. This was apparent in Figure 8-2 and Figure 8-3, which highlight these tokens as by far the most distinctive (i.e. those significantly overused) in the cluster. Figure 8-2 presents all of the tokens that significantly represent the cluster. However, the significance values for entertainment, pocket, clothing and crowd are so high that all other words are obscured. In Figure 8-3 the top five tokens have been removed to allow inspection of the remaining significant tokens. To further indicate the characteristics of this cluster, Table 8-4 provides a selection of cases, with column A listing the cases that were most characteristic of this cluster (highest cluster membership scores) and column B listing cases that represent an average case in this cluster (average cluster membership scores). Column C provides examples of cases that were included in the cluster but may have been a poor fit, i.e. they did not share many of the key attributes characteristic of the cluster.
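The selection of high, average and low examples can be sketched as below. This assumes that "average" means closest to the mean membership score within the cluster, which is an interpretation rather than a documented rule; the case identifiers and scores are hypothetical.

```python
def pick_examples(scores, n=1):
    """Pick example cases for the high, average and low columns.

    scores maps a case identifier to its membership score for the cluster
    it was assigned to.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    mean_score = sum(scores.values()) / len(scores)
    # "Average" cases are taken to be those nearest the mean score (assumption).
    nearest_to_mean = sorted(scores, key=lambda case: abs(scores[case] - mean_score))
    return {
        "high": ranked[:n],
        "average": nearest_to_mean[:n],
        "low": ranked[-n:],
    }


# Hypothetical membership scores for four cases in one cluster.
example_scores = {"case_a": 0.73, "case_b": 0.31, "case_c": 0.08, "case_d": 0.40}
columns = pick_examples(example_scores)
```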
This cluster was very clearly defined by the wider environment in which the offences take place. Drawing from the literature, it can be presumed that target search, selection and offence commission take place in the same location. Entertainment was very distinctive to this cluster, with the token used infrequently in the other clusters, with the exception of cluster 11 below. Thus, while a significant number of pickpocketing crimes occurred in entertainment venues, it was also the case that the majority of thefts occurring in entertainment venues were pickpocketing; there were very few robberies in this setting.
Other significant tokens highlighted the features of the wider environment that were conducive to committing pickpocketing offences. Crowd featured highly amongst the tokens characterising this cluster, suggesting that offenders may have taken advantage of crowded locations to gain proximity to the victim. This is supported by the fact that there were no significant tokens relating to the offender approaching the victim. The very dominant role of specific environments in this cluster pointed to crime pattern theory’s concepts of crime generator and crime attractor. However, from the MO descriptions alone, it is not known whether offenders made the journey to these venues specifically to offend, or travelled to the venues for other reasons and took advantage of opportunities once they were there.
Other than their presence in a suitable location, there was little to indicate why specific victims were targeted. Tokens such as dance_floor and stood_at_bar may suggest that victims were distracted by other things at the time of the offence, but this is not clear, even from inspection of the raw data. Despite the location of these offences in entertainment venues and licensed premises, there was little reference to alcohol or intoxication.
There were few significant tokens relating to engagement between offenders and victims. In fact, the reverse is true: significant tokens indicated that, in many cases, victims were aware of little other than someone brushing past them and, in a significant number of cases, the offender was not seen. There was some evidence of limited verbal engagement between victims and offenders in this script: there were 90 references to conversation, a statistically significant level for this script. There was little reference to physical contact, although 17 cases included ‘Hug’, which was a distinctive token for this cluster and not significantly overused in the other clusters. There was no significant mention of physical violence or threats in this cluster.
Other keywords in this cluster related to the target enclosure and indicated that, in the majority of these cases, the property stolen was located within the victim’s clothing (including specifically in their pocket). These features were also shared with other clusters, including 12 and 14. Of note is the fact that the token 'bag' did not occur to a significant level in this cluster; it appears that, in this group of cases, bags were not stolen and items were not removed from bags. Amongst the poorer fitting cases in Table 8-4, clothing and pockets appear to be among the few attributes shared with the cluster overall.
The targets or items stolen included mobile phones and wallets/purses37, again features shared with other clusters. With regard to the transfer of property, there was no reference to the snatching or grabbing of property. The transfer of property from victim to offender was conducted covertly and was dependent upon stealth. Thus, items appear to have been removed by pickpocketing.
Summary of cluster characteristics
In summary, the cases in this cluster all relate to one clear track for the commission of theft from the person; there was little variation in the cluster relating to how the track was performed. The offences in this cluster are those in which the offender has taken advantage of the proximity afforded by crowded locations, and of the distraction of would-be crime preventers, to remove commonly carried valuable items that are relatively accessible (in the context of a crowded situation) because they are carried in pockets. The need for a specific tactic of approach is removed by the crowded nature of the situations; similarly, engagement with the victim is not required and may in fact be avoided, as the transfer of property relies on stealth and secrecy. This cluster, as will others, demonstrates the intrinsic link between the situation in which the offence takes place and the methods chosen to complete it. As can be seen in the dendrogram (Figure 8-1), this cluster was not split when further partitions divided the data. Eighty-eight per cent of cases in this cluster were classified as theft from the person. This confirms that the cluster related to theft from the person scripts, but the remaining 12% suggests that a sizeable minority of cases were a poor fit for the cluster.
37 Purses were coded as wallets for the purpose of tokenisation.
Cluster 2: Bag Snatches
Cluster 2 Mapped to five bag snatch robbery tracks, four of which were robbery scripts and one theft from the person.
Figure 8-4 Significant tokens in cluster 2 (font size relative to chi-square score).
Figure 8-5 Significant tokens in cluster 2. Top 5 tokens removed (font size relative to chi-square score).
Cluster 2 is a relatively large cluster with 2392 cases, around 12% of the cases successfully assigned to clusters. However, a fairly varied range of offences and related script tracks is contained within the cluster. It is apparent from Figure 8-4 and Figure 8-5 that cases in this cluster are grouped together primarily because of a common target, i.e. the property stolen, which, as is overwhelmingly apparent from Figure 8-4, is bag. Other tokens significantly overused in the cluster indicated that this cluster is characterised by theft offences in which bags are stolen by grabbing or snatching the entire bag from the victim, rather than by dipping into the bag to retrieve other property.38
The wider environment in which offences take place (and, most likely, in which victims are identified) for this group of cases is characterised by roads, alleyways and footpaths. Underpass was also significantly over-represented here. In contrast to other clusters, tokens indicating an approach were significantly over-represented. Often cases stated only approach and nothing more, but cases in this cluster can still be distinguished from those where there was no clear approach. Where the vocabulary used provides more information, it appears to indicate approaches that are engineered to ensure surprise: run, run-past and behind.
Walk is a significantly over-represented token in this group; however, it is necessary to inspect the context in which the token is used to determine whether this relates to the victim, the offender or both. Inspection of cases revealed that, in the majority of cases, this token was used to describe the victim walking. This is one of the few clusters where some detail relating to victim characteristics was significantly overused, in this case the token lone_victim. While this was significant, it was not present in every case39.
The transfer of property in this cluster is predominantly achieved through grab, pull and grasp. However, both force and no force are significant within this cluster. This shows that there are different script tracks within the cluster40. The use of physical force is commonly directed towards the target property rather than the victim (as shown in example Ai in Table 8-5 below) and, again, points to the role of surprise and speed in expropriating property. Similarly, both no_injury and injury were significantly over-represented in this cluster. Injury was frequently related to falls to the ground caused by the force of the grab (as in examples Bi and Bii below).
38 Note the absence of other target words such as mobile phone or wallet which would be present in a dipping script.
39 In the frequent MO descriptions that do not refer to either a lone victim, or a victim with friends/others it is not possible to determine whether or not the victim was alone, or whether this information has simply not been recorded.
40 There may also be different interpretations of ‘force’ within the data, with some officers referring to physical force generally and others using the term to refer to forceful assault.
No tokens relating to verbal communication were significantly overused in this cluster. Although verbal communication occurs in some cases (Example Aiii in Table 8-5 below provides an example), the lack of speech shows that it is not necessary and is, at times, counter to a script track that hinges on the element of surprise.
Table 8-5 Examples of MO descriptions assigned to cluster 2: bag snatches. Includes MO descriptions that had (a) high (b) average and (c) low cluster membership scores.
Cluster Membership Score
A) High B) Average C) Low
i between_stated_times unknown offender has approached ip on [name of] road and by using force has snatched ip's handbag unknown offender has then made_off towards [name of] road with handbag. road
CMS = 73%
offender a single white male has approached the ip whilst she was stood at the front door. offender has grabbed the ip's handbag and after a brief struggle where the ip has fallen and injury her head the offender has pulled the bag from her grasp and has then made_off towards the [name of]
road. road broken and the bag fallen to the floor. one of the offenders grabbed the money from the bag and the bag and ran_off in an unknown direction. road CMS = 73%
as ip was walking along [name of] street with her sister as they have got to the junction of [name of] crescent a group of ic1 males aged 16 to 18 years have approached them one of the males has pushed the ip in her left_arm & grabbed hold of her shoulder_bag the ip has hung onto her bag until falling_on the ground the offender & other two males have run_off towards the direction of [place name]road
CMS = 31%
offenders unknown have bumped into ip spinning him round and at the same time taking his wallet and contents grabbed her bag. the ip has held onto her handbag. the offender has then said give_me the bag''" the ip has refused to let go and the her left shoulder causing the ip to fall onto her hands and knees and then onto her left shoulder. using unknown means took the ip's handbag from her left shoulder and made good escape running along [name of] road towards [name of] road. the ip suffered sorness and reddening to both palms and pain to her left shoulder. road