Figure 6.2: Healthcare process perspective in PPM

6.4 Data acquisition from PPM

An event log is the core item of process mining research, and the first step in applying process mining is to access the system and extract the required event logs. The PPM system is not a process-aware system, which means that event logs do not exist in the system automatically; a process miner must extract them through several steps, as mentioned earlier in Chapter 3. The steps for acquiring event logs from PPM are explained below.

6.4.1 Creating an event log from the PPM

Creating an event log from the PPM can be done through a number of steps:

1. Firstly, getting access to the data. Access to the data was given after arrangements with the PPM data providers. This research forms part of the work funded under the SBRI-1 grant (project number 1203SBRSB2DANRSBRI Application. doc 20504-149147). The work was hosted by Leeds Teaching Hospitals NHS Trust (LTHT) and was sanctioned according to local LTHT research and development policy. Data extraction was carried out under strict information governance procedures, including anonymisation of patient-level data to protect patient confidentiality. The extract of the PPM database was made accessible to us as 13 individual comma-separated files. All files are stored on an encrypted, pinned secure hard drive that is accessible only to authorised researchers. A form containing the Standard Operating Procedure (SOP) for using this pinned secure drive was written and signed to ensure appropriate use of the data.

2. Secondly, creating the local database. In order to extract event logs from the PPM files, a local database holding the PPM extract was created on our secure drive, and the 13 provided files were imported into it.
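The import in step 2 could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses SQLite from the Python standard library as a stand-in for the local database, and the function name `import_csv_files`, the one-table-per-file layout and the use of the first row as column names are all hypothetical choices, not details taken from the PPM extract.

```python
import csv
import sqlite3
from pathlib import Path


def import_csv_files(csv_dir: Path, conn: sqlite3.Connection) -> list[str]:
    """Load every *.csv file in csv_dir into a table named after the file.

    Assumption: the first row of each file holds the column names.
    """
    loaded = []
    for csv_path in sorted(csv_dir.glob("*.csv")):
        table = csv_path.stem
        with csv_path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader)
            # Quote identifiers so unusual column names do not break the SQL.
            columns = ", ".join(f'"{name}"' for name in header)
            placeholders = ", ".join("?" for _ in header)
            conn.execute(f'CREATE TABLE "{table}" ({columns})')
            conn.executemany(
                f'INSERT INTO "{table}" VALUES ({placeholders})', reader
            )
        loaded.append(table)
    conn.commit()
    return loaded
```

All columns are loaded as untyped text here; casting the integer day fields would be a separate step once the schema is known.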

6.4.2 Extracting an event log for the required cohort of patients

A PPM data reference model was constructed in this research as an Entity Relationship Diagram (ERD) using the PostgreSQL database editor. This diagram helps to extract the event logs and to identify possible care events in the EHR. Figure 6.3 shows all the tables that have the temporal fields needed for process mining. A process can be tracked using a patient’s ID. The extraction criteria for each case study and the description of care events are discussed in this chapter for case study 1 and in Chapter 7 for case studies 2 and 3.

Figure 6.3: PPM data reference model generated in this research

It should be noted that the PPM extract we obtained might be affected by its provenance chain: it was originally extracted for a previous research project with different aims, which is discussed in [16]. Initial exploration of the tables showed that the ‘chemodrugs’ table, which contains drug labels, drug doses and other drug-related information, has an ambiguous date for when a drug is given. In other words, the event of taking a drug is recorded with a date that falls before the start of the chemotherapy cycle. A discussion with an expert who works on the PPM data suggested that this date is confusing, because the drug should be given on the same day as the chemotherapy, and that the recorded date might result from data quality issues. Therefore, events generated from the ‘chemodrugs’ table are excluded in this research to avoid misleading results.

Time issues in the PPM extract data:

Although the extract of the PPM data has the three components needed for mining patient processes, namely patient ID, event name and event date, the temporal fields were recorded as a number of days (integer format), a date manipulation step taken to protect patient confidentiality. This number corresponds to the age of the patient when the event happened. Therefore, a method for reconstructing a valid timestamp format is required.

(a) Reconstructing a ‘timestamped’ format from age-determined events

In order to construct a ‘timestamped’ format from the number of days at which an event occurred, we need to set a default day as a starting point. The date ‘2020-01-01 12:00:00’ was chosen as an artificial default day, and the number of days is then subtracted from this default date. This method converts the age in days into a valid timestamp format that can be parsed by process mining tools.
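A minimal sketch of this conversion is given below. The reference date is the one quoted above; the function name `reconstruct_timestamp` is hypothetical, and the sign convention (subtracting the day count) is taken directly from the description, so it should be read as an assumption about how the integers are stored rather than as the exact procedure used in this research.

```python
from datetime import datetime, timedelta

# Artificial reference date quoted in the text.
REFERENCE_DATE = datetime(2020, 1, 1, 12, 0, 0)


def reconstruct_timestamp(number_of_days: int) -> str:
    """Turn an integer day count into a 'YYYY-MM-DD HH:MM:SS' string.

    Assumption: the day count is subtracted from the reference date,
    as the text describes.
    """
    moment = REFERENCE_DATE - timedelta(days=number_of_days)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


print(reconstruct_timestamp(1))  # 2019-12-31 12:00:00
```

Because every event receives the same time of day (12:00:00), events recorded on the same day end up with identical timestamps.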

Despite the successful reconstruction of the time format, a further issue relating to event order arises, as discussed below in (b).

(b) Reconstructing event order

One of the major event log quality issues discussed by [95] is the level of temporal resolution. Some events are recorded at a resolution down to the second, others down to the day only. We found that all events in the PPM extract are recorded at ‘day only’ resolution. Consequently, the order in which events happened on the same day cannot be determined from the data.

Hence, the order of same-day events is assigned based on an expected sequence of those events, which has been validated by a domain expert.

Examples of same-day events are:

1- ‘Admission’ and ‘Ward stay’, as a patient should be admitted first and then stay in a ward.
2- ‘Regimen start’ and ‘Chemotherapy session’, as a treatment regimen should be discussed and approved before chemotherapy treatments are given.
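One way to apply such an expected sequence is to sort events by patient, then by day, then by a rank within each same-day pair. The sketch below assumes events are held as (patient ID, event name, timestamp) tuples; the `SAME_DAY_RANK` table and the `order_events` function are hypothetical, and only the two pairs listed above are encoded.

```python
# Rank within each expert-validated pair (lower rank comes first).
# Only the two example pairs from the text are encoded here.
SAME_DAY_RANK = {
    "Admission": 0,
    "Ward stay": 1,
    "Regimen start": 0,
    "Chemotherapy session": 1,
}
DEFAULT_RANK = 0  # events without a known pair keep their input order


def order_events(events):
    """Sort (patient_id, event_name, timestamp) tuples.

    Timestamps are 'YYYY-MM-DD HH:MM:SS' strings, which sort
    chronologically as plain text. Python's sort is stable, so events
    with equal keys stay in their original relative order.
    """
    return sorted(
        events,
        key=lambda e: (e[0], e[2], SAME_DAY_RANK.get(e[1], DEFAULT_RANK)),
    )


events = [
    ("p1", "Ward stay", "2019-05-01 12:00:00"),
    ("p1", "Admission", "2019-05-01 12:00:00"),
    ("p1", "Chemotherapy session", "2019-05-03 12:00:00"),
    ("p1", "Regimen start", "2019-05-03 12:00:00"),
]
for patient_id, name, timestamp in order_events(events):
    print(patient_id, timestamp, name)
```

Ranks are defined per pair rather than as one global sequence, since the text only states the order within each pair.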
