CAPÍTULO I. EL CONTEXTO ECONÓMICO Y LA PROBLEMÁTICA DE LA POBREZA
2.1 Enfoque utilitarista
7.1 Introduction
We define ‘hybridisation’ as the process of producing a hybrid dataset, with conviction records contributing from more than one source. We distinguish between an operational system, where the intention would be to include new
conviction records from the PNC onto the OI which would never have appeared on the OI – examples are Scottish convictions, convictions for non standard list offences and cautions - and this comparative work. The comparative work outlined below is concerned with how similar the two datasets are if similar criteria are used to restrict both datasets. We refer to this as the levelling of the datasets. Once the datasets have been levelled, the matching process can begin.
The procedure for preparing data for hybridisation is therefore as follows:
• to level the datasets so as to ensure a level playing field in any comparative work. For example, the OI contains mainly standard list offences; the PNC data contains all offences. The PNC will therefore contain more offences than the OI. Four levelling criteria were identified.
• to match the link file with the OI and PNC files at the level of the individual. This will produce a composite file with three sets of information – the link information, the PNC information and the OI information. This file provides information on whether the names, dates of birth, gender and other details are the same in the three files. • within each matched individual, to match the court disposal records on the OI to those on the PNC.
This section is concerned with the levelling of the datasets; Section 8 discusses in detail the matching process.
7.2 Producing a composite file at the level of the individual
There are three tasks in this process:
• Identify and delete duplicated individuals in the link file. Some link files, particularly those originating from Probation services, have the same individual entered more than once. Where possible, the latest entry was retained, with all other duplications being deleted.
• Delete duplicated records from the raw OI and PNC data files. This aids certain statistical packages in their aggregation processes and avoids any overcounting of offences.
• Match individuals by merging the link file with the PNC file aggregated on PNCID and with the OI file using the unique individual ID.
7.3 Levelling the court-level datasets
As the PNC holds many categories of records that are not the concern of the Offenders Index, any comparison of data- completeness requires a procedure to restrict both datasets to a common level of detail. This process, termed levelling, is represented in Figure 7.1. The four levelling criteria identified were:
Figure 7.1 The levelling of the PNC and OI datasets to a common level of detail
PNC Standard List OffencesCut-off date
Non court
disposals
Standard List Offences
Standard List Offences
Non England & Wales cases on PNC (eg
Scotland, Channel Islands, British Forces
and some Transport Police cases)
PNC
Time
Expected
Match
Cut-off date
Time
Non SLOs
Non SLOs
Offenders Index
(England & Wales
courts)
a) levelling criterion 1 – conviction
The PNC dataset contains information on cautions, warnings, and cases awaiting trial, as well as convictions. The OI only contains convictions. All court dates and associated offences not relating to a conviction were identified with an indicator and excluded from the comparison process.
b) levelling criterion 2 – standard list offence
As stated before, the OI contains mainly standard list offences; whereas the PNC data contains all offences. Both files were brought to a level playing field by excluding all non-standard list offences from both datasets. Note that the OI needs to have non-standard list offences excluded as non-standard list offences may be present where the principal offence is standard list. No ‘principal offence’ is a non-standard list offence on the OI. c) levelling criterion 3 – caution/conviction date not too recent or old for match
The datasets that we are analysing are generated by other research projects, and the requests for criminal conviction histories to PITO and to the OI were made at different times. We therefore need to take into account the date of request. Even if the date of request is the same for the PNC and OI, the OI is updated quarterly six months in arrears and is likely to contain less recent information. We account for these differences by finding the latest court record in the OI dataset, and the latest court record in the PNC, and taking the less recent of these two dates. All convictions more recent than this date would be excluded. For example, the OI sentencing sample has a most recent conviction date of 31 December 1997, the PNC date is more recent. For this dataset, all convictions from 1998 on will be excluded from the PNC file.
A separate issue relates to convictions before 1963. As the Offenders Index contains only convictions
recorded in 1963 or later, and 1963 dates on the OI are unreliable, all convictions with conviction dates prior to 1964 were excluded.
d) levelling criterion 4 – sentenced within same jurisdiction
The PNC data includes data from a wide number of different jurisdictions. These additional data records consist mainly of Scottish conviction data, but it is also possible to get conviction data from the Isle of Man, Jersey, Guernsey, Northern Ireland, the Ministry of Defence, the Army, the Navy and the British Transport Police. The OI, in contrast, include only convictions to civilian police in England and Wales. The datasets were levelled by making sure that only England and Wales convictions brought by civilian police forces were included in the two files.
Details of the number of records which meet each of the levelling criteria are given in Appendix 3 for the research datasets.