care histories:
A number of procedures for processing the data were designed, tested and implemented within SPSS in order to identify patients who had a coded diagnosis of oesophageal or gastric cancer at any diagnostic position and at any time within the available dataset. Once individual cancer patients were identified, all episodes of inpatient care for that patient were identified, extracted and ordered
chronologically in order to compile a complete record of all hospital care for each case.
The first stage of this process involved generating a list of all relevant diagnostic codes for OG cancer from the available International Classification of Diseases codes (ICD-10 version). The codes selected are given in Appendix 3. SPSS syntax (Syntax 3) was written to flag the occurrence of any of these codes, not only at the primary position (DIAG1), but also at any of the 13 other available diagnostic positions for each care episode (DIAG2 to DIAG14). This generated a new binary variable (OGCANCER, 1=Episode contains an OG cancer code). Episodes flagged with this variable (OGCANCER=1) were extracted and sorted by unique patient identifier (HESID) and by admission date in ascending order. Using the “identify duplicate cases” function in SPSS, the chronologically first episode of care containing a cancer code for each patient was flagged with a new variable (PRIMARYFIRST=1). These episodes were selected using the “select cases” function and saved as a separate file. This file (“OGC cases”) contains the first episode of care for each patient with an OG cancer code. Manual checks on a random sample of patients were completed to ensure that the syntax was identifying the correct codes in each diagnostic position.
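The flagging and first-episode selection described above can be sketched outside SPSS. The following is a minimal Python illustration, not the study's Syntax 3. It assumes that episodes are held as dictionaries, that the admission date field is called ADMIDATE, and that OG cancer codes can be matched on the prefixes C15 and C16 (the full code list is in Appendix 3). The sample records are invented.

```python
from datetime import date

# Assumption: OG cancer codes are matched by ICD-10 prefix (C15 oesophagus, C16 stomach).
OG_PREFIXES = ("C15", "C16")
DIAG_FIELDS = [f"DIAG{i}" for i in range(1, 15)]  # DIAG1 .. DIAG14


def has_og_cancer(episode):
    """Return 1 if any of the 14 diagnostic positions holds an OG cancer code, else 0."""
    for field in DIAG_FIELDS:
        code = episode.get(field) or ""
        if code.upper().startswith(OG_PREFIXES):
            return 1
    return 0


def first_cancer_episodes(episodes):
    """Set OGCANCER on every episode and return the earliest flagged episode per HESID."""
    first = {}
    for ep in episodes:
        ep["OGCANCER"] = has_og_cancer(ep)
        if ep["OGCANCER"] != 1:
            continue
        current = first.get(ep["HESID"])
        if current is None or ep["ADMIDATE"] < current["ADMIDATE"]:
            first[ep["HESID"]] = ep
    for ep in first.values():
        ep["PRIMARYFIRST"] = 1  # chronologically first cancer-coded episode
    return first


# Hypothetical toy records, not real HES data.
episodes = [
    {"HESID": "A", "ADMIDATE": date(2006, 6, 1), "DIAG1": "K222"},
    {"HESID": "A", "ADMIDATE": date(2006, 7, 10), "DIAG1": "R13", "DIAG3": "C159"},
    {"HESID": "A", "ADMIDATE": date(2006, 9, 2), "DIAG1": "C159"},
    {"HESID": "B", "ADMIDATE": date(2007, 1, 5), "DIAG1": "K259"},
]
ogc_cases = first_cancer_episodes(episodes)
```

In this toy example, patient A's second episode is selected because the cancer code appears at a secondary position (DIAG3), which is the situation the multi-position search is meant to catch.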
Using the unique identifier for each cancer case, we then extracted all of their care episodes from the main HES dataset. This involved identifying and extracting episodes where cancer was NOT a coded diagnosis for each cancer-related HESID, by returning to the original dataset containing all care episodes under medical and surgical specialties (“Admitted Patient Care HES EPISODES dataset”). A new variable (IDENTIFIER) was generated from the “OG cancer cases” file and the “merge” function in SPSS was used to flag all care episodes for the cancer patients, including episodes not coded with C15 or C16 codes. These episodes were extracted and saved as a “MASTER FILE”, one for the 2006-07 data year and another for 2007-08.
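The extraction of complete care histories amounts to a keyed lookup of every episode against the list of cancer patients. A minimal Python sketch of that idea follows. The function name, the ADMIDATE field name and the sample records are assumptions made for illustration, and the code is not the SPSS merge used in the study.

```python
from datetime import date


def build_master_file(all_episodes, cancer_hesids):
    """Keep every episode belonging to a cancer patient, whether cancer-coded or not."""
    ids = set(cancer_hesids)
    # Copy each matching row and mark it with IDENTIFIER=1.
    master = [dict(ep, IDENTIFIER=1) for ep in all_episodes if ep["HESID"] in ids]
    # Order each patient's history chronologically.
    master.sort(key=lambda ep: (ep["HESID"], ep["ADMIDATE"]))
    return master


# Hypothetical toy records, not real HES data.
all_episodes = [
    {"HESID": "A", "ADMIDATE": date(2006, 9, 2), "DIAG1": "C159"},
    {"HESID": "B", "ADMIDATE": date(2006, 5, 1), "DIAG1": "K259"},
    {"HESID": "A", "ADMIDATE": date(2006, 6, 1), "DIAG1": "K222"},
]
master_file = build_master_file(all_episodes, cancer_hesids=["A"])
```

Patient A's non-cancer episode (K222) is retained alongside the cancer-coded one, while patient B, who never receives a cancer code, is excluded.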
Most cases of OG cancer are diagnosed by gastroscopy. It is expected that some cancer patients’ gastroscopy-related episodes might be coded with other diagnosis codes (such as “oesophageal stricture” or “gastric ulcer”) without any cancer codes appearing in that episode. If a cancer code appears in a subsequent care episode when, for instance, the patient is readmitted for treatment of the cancer, then we know that the original endoscopy procedure was the point at which the cancer was diagnosed and the patient’s “journey” began. This highlights the need to identify sub-groups of cases for analysis according to whether the HES dataset contains the related care information (preferably starting with a diagnostic procedure), has missing elements suggesting missing information (e.g. coding problems), or represents a previously diagnosed “prevalent” case whose care had begun in an earlier year.
For these reasons, additional filter variables were created to flag whether or not each patient had a gastroscopy procedure closely related to the management of OG cancer within their episode history. The filter was developed using published procedure codes and definitions.[266, 267] Syntax including these codes was then written to mark these procedures (Appendix 4, Syntax 4). Again, the aim was to capture linkages with procedures coded not only at the primary position (OPERTN 1), but also at any other position (OPERTN 2 to OPERTN 14). Once the above syntax had been run (SPSS function: FILE > NEW > SYNTAX), a new variable, called the Gastroscopy filter, was produced. This method was applied to both the 2006-07 and 2007-08 OG cancer master datasets.
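A patient-level gastroscopy filter of this kind can be sketched as follows. This is a hedged Python illustration only. The prefix G45 is a placeholder standing in for the published procedure code list in Appendix 4, the field names OPERTN1 to OPERTN14 are an assumed spelling of the 14 procedure positions, and the sample records are invented.

```python
# Placeholder prefix only; the study's actual procedure code list is in Appendix 4.
GASTROSCOPY_PREFIXES = ("G45",)
OPERTN_FIELDS = [f"OPERTN{i}" for i in range(1, 15)]  # assumed names for the 14 positions


def episode_has_gastroscopy(episode, prefixes=GASTROSCOPY_PREFIXES):
    """True if any procedure position in the episode starts with a listed prefix."""
    prefixes = tuple(prefixes)  # str.startswith needs a tuple, not a list
    return any(
        (episode.get(field) or "").upper().startswith(prefixes)
        for field in OPERTN_FIELDS
    )


def gastroscopy_filter(master_file, prefixes=GASTROSCOPY_PREFIXES):
    """Map each HESID to 1 if any episode in its history has a listed procedure, else 0."""
    flags = {}
    for ep in master_file:
        hit = episode_has_gastroscopy(ep, prefixes)
        flags[ep["HESID"]] = 1 if hit or flags.get(ep["HESID"]) == 1 else 0
    return flags


# Hypothetical toy records, not real HES data.
master_file = [
    {"HESID": "A", "OPERTN1": "Y981", "OPERTN3": "G451"},
    {"HESID": "A", "OPERTN1": "G011"},
    {"HESID": "B", "OPERTN1": "H229"},
]
flags = gastroscopy_filter(master_file)
```

Passing the code list as a parameter keeps the placeholder separate from the logic, so the real list could be substituted without changing the functions.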
It also became essential to merge the 2006-07 and 2007-08 master data files for OG cancer patients. By doing so, we were able to track back those patients who appeared in 2007-08 with no gastroscopy filter, to check whether an upper GI endoscopy code appeared in any episode relating to that patient within the 2006-07 data. To merge the two years of OG cancer patients’ histories, it was important to ensure that all dataset variables were in the same order, with the same variable names and formats, and that each patient had the same identifying HESID in both years before the merge. Additionally, prior to the merge, a new field called YEAR was added to each master file to make it easier to identify which year each admission originated from. The 2006-07 and 2007-08 master datasets were then merged into a single dataset using the SPSS function (DATA > MERGE FILES > ADD CASES).
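The row-wise merge and the track-back check can be illustrated with a short Python sketch. It is not the SPSS ADD CASES operation itself. The function names and the episode-level GASTROSCOPY flag are hypothetical, and the sample rows are invented.

```python
def add_cases(file_0607, file_0708):
    """Stack two yearly master files, tagging each row with its data year."""
    merged = []
    for year, rows in (("2006-07", file_0607), ("2007-08", file_0708)):
        merged.extend({**row, "YEAR": year} for row in rows)
    # Both files must share exactly the same variable names before stacking.
    layouts = {frozenset(row) for row in merged}
    if len(layouts) > 1:
        raise ValueError("variable names differ between the yearly files")
    return merged


def found_in_earlier_year(merged, hesid):
    """True if the patient has a gastroscopy-flagged episode in the 2006-07 rows."""
    return any(
        row["HESID"] == hesid and row["YEAR"] == "2006-07" and row["GASTROSCOPY"] == 1
        for row in merged
    )


# Hypothetical toy rows; GASTROSCOPY is an assumed episode-level flag.
file_0607 = [{"HESID": "A", "GASTROSCOPY": 1}]
file_0708 = [{"HESID": "A", "GASTROSCOPY": 0}, {"HESID": "B", "GASTROSCOPY": 0}]
merged = add_cases(file_0607, file_0708)
```

Here patient A has no gastroscopy in 2007-08, but `found_in_earlier_year(merged, "A")` recovers the procedure from the 2006-07 rows, which is the purpose of merging the two years.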
The newly merged file (2006-08 merged OG cancer patients’ history file) contains all the episodes of care that are coded with oesophageal or gastric cancer (ICD-10 codes C15 or C16), in addition to other episodes which these patients had before and after the appearance of these codes between 1st April 2006 and 31st March 2008. As a result, it was possible to flag the key milestones in the patient’s journey, such as the dates of the first OG cancer code and the first gastroscopy procedure code.
HES data does not include a date of diagnosis, and the first episode of care, coded with a definitive cancer code, is not a reliable starting point for the patient’s journey.[268] Manual review of the coding sequence for individual cancer cases revealed that some of the original primary diagnoses recorded at the time of the first (index) gastroscopy were non-specific symptom codes (eg, dysphagia) or non-malignant diagnostic labels that would be compatible with cancer (eg, oesophageal stricture or gastric ulcer). The first appearance of a cancer code for such cases was typically within a few days or weeks of the index diagnostic gastroscopy, when the patient attended for another hospital episode (eg, therapeutic gastroscopy or surgery). By selecting cases whose first endoscopy episode occurred within 3 months of the first cancer-coding episode (either as a day-case or during hospital admission), we extracted a cohort of patients with a sequence of care episodes and procedures compatible with a new diagnosis of OG cancer (Syntax 5).
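The 3-month selection rule can be expressed as a simple date comparison. The sketch below is an illustration under stated assumptions rather than a reproduction of Syntax 5. It approximates 3 months as 91 days and accepts a gastroscopy on either side of the first cancer-coded episode, neither of which is specified in the text. The milestone dates are invented.

```python
from datetime import date

WINDOW_DAYS = 91  # assumption: "3 months" approximated as 91 days


def is_new_diagnosis(first_gastroscopy, first_cancer_code, window_days=WINDOW_DAYS):
    """True if the first gastroscopy lies within the window of the first cancer code."""
    if first_gastroscopy is None or first_cancer_code is None:
        return False
    # Assumption: the window is applied in both directions (absolute difference).
    return abs((first_cancer_code - first_gastroscopy).days) <= window_days


# Hypothetical milestones per patient: (first gastroscopy, first cancer-coded episode).
milestones = {
    "A": (date(2006, 6, 1), date(2006, 7, 10)),  # 39 days apart
    "B": (date(2006, 4, 20), date(2007, 2, 1)),  # well outside the window
    "C": (None, date(2007, 3, 3)),               # no gastroscopy found
}
cohort = sorted(h for h, (g, c) in milestones.items() if is_new_diagnosis(g, c))
```

Only patient A enters the cohort. Patient B resembles a case whose care sequence is not compatible with a new diagnosis, and patient C has no gastroscopy recorded in the available data.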
2.4.4 Linkage of HES data to the statutory register of deaths from the
Office for National Statistics (ONS)
Death in hospital is a recorded variable in HES, but the dataset does not capture deaths occurring after discharge from hospital. The present project benefited from access to further linkage of HES data to the statutory register of deaths held by the Office for National Statistics (ONS). Using the same SPSS function mentioned above, the death date for each patient was linked as another variable within the merged two-year file. Consequently, the DEATH DATE 0608
07/2007-08 ONS based on HESID. This new variable was used to give the number of cancer deaths and to enable crude mortality and survival analysis.
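Linking death dates by HESID and deriving crude mortality can be sketched as follows. This is a minimal Python illustration only. The field names DEATH_DATE and INDEX_DATE, the function names and the sample values are assumptions, and the study's own analysis may define the index date and the mortality denominator differently.

```python
from datetime import date


def link_deaths(patients, ons_deaths):
    """Attach an ONS death date to each patient row by HESID; None means no death record."""
    return [dict(p, DEATH_DATE=ons_deaths.get(p["HESID"])) for p in patients]


def crude_mortality(linked):
    """Proportion of the cohort with a linked death date."""
    if not linked:
        return 0.0
    return sum(1 for p in linked if p["DEATH_DATE"] is not None) / len(linked)


def survival_days(patient):
    """Days from the index episode to death, or None if no death is recorded."""
    if patient["DEATH_DATE"] is None:
        return None
    return (patient["DEATH_DATE"] - patient["INDEX_DATE"]).days


# Hypothetical toy data, not real HES or ONS records.
patients = [
    {"HESID": "A", "INDEX_DATE": date(2006, 7, 10)},
    {"HESID": "B", "INDEX_DATE": date(2007, 1, 5)},
]
ons_deaths = {"A": date(2007, 1, 10)}
linked = link_deaths(patients, ons_deaths)
```

Because the death register covers deaths after discharge, a patient with no in-hospital death in HES can still contribute a death date here, which is what makes crude mortality and survival estimates possible.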