Chapter 3 Evolution of the Programme's management in relevant areas
3.3. Progress in the chain integration strategy and in the consolidation of the
[Figure: Acceptability and Face Validity. Likert scale (1-5) ratings by participant (1-19) for items A to G.]
Forty-four participants reviewed 88 patients (surgical, orthopaedics and urology) during this study. These 88 cases were included in this validation phase. On each ward round, performance was evaluated on two surgical patients using the SWAT. The clinicians' total scores were evaluated rather than the individual scores for each step in the SWAT. Participants were evaluated using both the task-specific and the global components of the scale.
5.3.5.1 Construct validity and reliability
SWAT demonstrated significant levels of construct validity, inter-rater reliability and inter-test reliability. These were evaluated separately for the generic and task-specific parts of the SWAT.
Task specific list: A significant difference between specialists and trainees (registrars) was demonstrated (p=0.001) (Figure 5.9). Specialists scored significantly higher than trainees, and there was no overlap between the senior and junior groups.
The task list also demonstrated high inter-rater reliability. Intra-class correlation was 0.899 (p=0.001) for single items and 0.947 (p=0.001) for the total scores. A high inter-test (between two cases) reliability was demonstrated when measured using Pearson correlation (r=0.893; p=0.0001) (Figure 5.10).
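The two reliability statistics reported here can be illustrated with a short sketch. The text does not state which form of intra-class correlation was used, so the code below assumes a two-way random-effects model with absolute agreement, returning a single-measure and an average-measure value. The function names and the ratings are hypothetical and are not data from this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def icc_two_way_random(ratings):
    """Return (single-measure, average-measure) ICC for absolute agreement.

    ratings: one row per assessed participant, one column per rater.
    Assumption: two-way random-effects model; the thesis does not say
    which ICC form was computed.
    """
    n = len(ratings)          # number of participants (rows)
    k = len(ratings[0])       # number of raters (columns)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    single = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    average = (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)
    return single, average


if __name__ == "__main__":
    # Hypothetical total scores: five participants, two raters.
    two_raters = [[18, 19], [22, 21], [15, 16], [25, 25], [20, 22]]
    print("ICC (single, average):", icc_two_way_random(two_raters))

    # Hypothetical scores for the same participants on two different cases.
    case_one = [18, 22, 15, 25, 20]
    case_two = [17, 23, 16, 24, 21]
    print("Inter-test Pearson r:", pearson_r(case_one, case_two))
```

In practice a statistics package would also supply the p-values and confidence intervals quoted in the text; the sketch only shows how the point estimates arise from the ratings.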
Figure 5.9 – Construct validity. Box plots demonstrating a significant difference in mean scores between consultants and registrars.
Global rating scale: A significant level of construct validity (p<0.001) was established amongst the participants. A significant difference (p=0.001) was also noted between the specialties (general surgery and urology). This may have been because the urology group contained relatively more specialists.
The global scale showed a high level of inter-rater reliability. Intra-class correlation was 0.970 (p=0.0001) for single items and 0.985 (p=0.0001) for the total scores. Inter-test reliability was also significant (r=0.574; p=0.0001) (Figure 5.7).
Limitations:
This research study has a few limitations. First, the participants were recruited from a single geographical area and were limited in number. Second, the effectiveness of FMEA depends on the team leader and members who examine the process and map the potential failures; it could be limited by their clinical experience of previous failures or failure modes. To reduce this bias, experienced team members were selected. Third, incomplete blinding could have influenced the results, but video-based assessment is not feasible or acceptable in the real setting. One of the assessors was blinded to the grade of the participants. However, when the results were evaluated from the simulation settings using the same tool, the outcomes were equally good. Fourth, some of the processes (identified by modified HFMEA) in the task list did not apply to every patient. For instance, some patients did not have drains, lines or wounds to be checked. This was taken into account when scoring the tasks, and average scores were analysed. Finally, the educational impact of SWAT was not measured. It could potentially be measured after giving candidates feedback on their performance and training them on how to improve. The SWAT reliably correlated with surgical ward-round competence and may be used as a valid assessment tool in both real and simulated environments.
5.4 Summary
This is the first study in which an established method of observation has been used to validate a non-technical skill assessment tool during ward rounds. SWAT not only allows assessment of task-specific components during the ward round but also provides an overall estimation of non-technical skills such as communication, decision making, team-working, situation awareness and leadership in various surgical specialties. One of the two raters was blinded to the participant identity. Taken together, these findings suggest that SWAT demonstrates feasibility, validity, reliability and acceptability. It has significant potential for use in the assessment of communication, decision making and team-working. The tool can be used at both trainee and specialist levels.
The ward round assessment tool has been validated in both simulated and real settings. It can be used to evaluate the performance of trainees and specialists. Simulated settings can be created using minimal resources. This chapter does not evaluate the educational impact of the SWAT. Further studies are in progress to evaluate its impact.
Chapters 4 and 5 identified and developed tools for the assessment of technical and non-technical skills. Based on the conclusions from the interviews with the specialists (Chapter 3), the following chapter will identify methods for evaluating technical and non-technical skills that are based on the evaluation of clinical outcomes.
CHAPTER 6
SAFETY OF TRAINING AND PERFORMANCE ASSESSMENT IN OPERATING THEATRES – A META-ANALYSIS
This chapter aims to establish systematic evidence for the safety of training in operating theatres by comparing the performance of trainees against that of specialists/trainers. It also explores the possibility of using early, intermediate and late outcomes of cardiac surgical procedures to determine the performance of clinicians.
6.1 Background
In addition to observation of clinical practice and its evaluation in measurable terms, quality of care and individual performance can also be measured by evaluating clinical outcomes. The quality of care within a healthcare system, at both individual and system level, can be assessed by evaluating the structures designed to provide healthcare, appraising the process of healthcare delivery, and measuring clinical outcomes (232). Procedural outcomes contribute to the clinical outcomes of a healthcare system (233). They can be used to assess clinician performance and healthcare effectiveness (12, 37, 234). The measurement of procedural outcomes can provide comprehensive feedback towards quality improvement for speciality trainees, specialists and healthcare organizations (Figure 6.1).
Although procedural training has traditionally focussed on the perceived gold standards of volume-based learning and observational assessment of skills, these are now limited by concerns regarding patient safety (235-237). It has been shown that the procedural experience gained through volume of exposure reduces morbidity and mortality in procedures associated with a higher incidence of adverse events (238). Within these high-risk disciplines, clinicians can be assessed using a combination of structure, process, and outcome measures (239). However, it is imperative to provide evidence that the workplace is safe for training and assessment. To establish this evidence, a meticulous method of data collection, analysis and dissemination is required.
With regard to procedural outcomes, cardiac surgery is one of the primary adopters of data collection, analysis and publication. Cardiac surgeons were the pioneers who took the initiative and developed a process to publish their outcome data. Their data are more widely available for comparison than those of any other craft discipline. Various surgical specialities such as general surgery, vascular surgery, orthopaedics and urology are in the process of developing a consensus to implement a similar system for reporting clinical outcomes.
Coronary artery bypass grafting (CABG) and valve surgery can be ideal examples for outcome measurement given both their frequency and their potential for serious morbidity and mortality (239). These procedures can therefore be used as a benchmark for the assessment of performance. However, only a limited number of studies have attempted to establish the safety of training by comparing the procedural outcomes of trainees with those of specialists or consultants (240-254). Once safety is established, these outcomes can also be used for the assessment of performance (58).
Figure 6.1: An overview of workplace-based assessment of performance
[Figure elements: Pre-operative risk stratification; Direct observation of procedural, communication and decision making skills (tools for assessment of skills); Outcome based assessment (early, intermediate, late outcomes); System; Technical skills; Team working, communication and decision making; Healthcare finances; Measurement and audit reporting system; Organisational]
6.2 Methods
This chapter identified studies that reported cardiac surgeons' outcomes at both trainee and trainer (specialist) levels. In performing this study, guidelines from the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) were followed (115).
6.2.1 Eligibility criteria
Studies comparing quality indicators of cardiac surgical procedures such as CABG and valve surgery (aortic and mitral valve surgery) between specialists and trainees were included in this study.
The outcomes of interest were mortality, morbidity and resource consumption. All non-comparative studies were excluded, as were those in which the outcomes of interest were not documented. When studies from the same authors and institution were published during the same or overlapping periods, the most recent or most informative article, or the one covering the widest chronological period, was included in order to avoid duplicate publication.
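The eligibility rules above can be expressed as a small screening routine. This is a minimal sketch under stated assumptions: the `Study` record, its field names and the `group` key (standing in for "same authors and institution") are hypothetical, a study counts as documenting the outcomes of interest if it reports at least one of them, and the duplicate rule is simplified to "prefer the widest chronological period, then the most recent publication". The "most informative" criterion needs reviewer judgement and is not modelled.

```python
from dataclasses import dataclass

OUTCOMES_OF_INTEREST = {"mortality", "morbidity", "resource consumption"}


@dataclass(frozen=True)
class Study:
    study_id: str
    group: str            # hypothetical key for "same authors and institution"
    comparative: bool     # compares trainees with specialists
    outcomes: frozenset   # outcomes documented in the article
    start_year: int       # first year of the study period
    end_year: int         # last year of the study period
    published: int        # publication year


def is_eligible(study):
    """Comparative study documenting at least one outcome of interest."""
    return study.comparative and bool(study.outcomes & OUTCOMES_OF_INTEREST)


def periods_overlap(a, b):
    """True when the two study periods share at least one year."""
    return a.start_year <= b.end_year and b.start_year <= a.end_year


def preference(study):
    """Simplified ranking: widest period first, then most recent article."""
    return (study.end_year - study.start_year, study.published)


def select_studies(studies):
    """Apply the eligibility filter, then drop overlapping duplicates."""
    kept = []
    eligible = [s for s in studies if is_eligible(s)]
    for study in sorted(eligible, key=preference, reverse=True):
        clash = any(
            k.group == study.group and periods_overlap(k, study) for k in kept
        )
        if not clash:
            kept.append(study)
    return sorted(kept, key=lambda s: s.study_id)
```

Processing candidates in order of preference means that, within each group of overlapping reports, the preferred article is kept and the others are discarded, while non-overlapping reports from the same group are retained.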
6.2.2 Information sources and search
Medline (1950-2012), EMBASE (1980-2012) and PsycINFO (1967-2012) databases were searched. A combination of the following MeSH terms and keywords was used: “education” (MeSH) or “teaching” (MeSH) or “staff development” (MeSH) or “training” (keyword) and “cardiac surgical procedures” (MeSH) or “coronary bypass graft” (MeSH). The Cochrane and DARE (Database of Abstracts of Reviews of Effectiveness) databases were also checked for any systematic reviews. No language restrictions were applied. References from the selected articles were also reviewed.
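The way the search terms combine can be made explicit with a short sketch. The grouping is an assumption: the training-related terms are read as one OR group and the procedure-related terms as another, joined by AND. The bracketed tags are a generic notation chosen for illustration and do not reproduce the exact syntax of Medline, EMBASE or PsycINFO; the function names are hypothetical.

```python
# Term lists taken from the search description; the grouping is assumed.
TRAINING_TERMS = [
    ("education", "MeSH"),
    ("teaching", "MeSH"),
    ("staff development", "MeSH"),
    ("training", "keyword"),
]
PROCEDURE_TERMS = [
    ("cardiac surgical procedures", "MeSH"),
    ("coronary bypass graft", "MeSH"),
]


def or_group(terms):
    """Join tagged terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{term}"[{kind}]' for term, kind in terms) + ")"


def build_query(*groups):
    """Join the OR groups with AND to form one Boolean search string."""
    return " AND ".join(or_group(group) for group in groups)


if __name__ == "__main__":
    print(build_query(TRAINING_TERMS, PROCEDURE_TERMS))
```

Keeping each concept in its own list makes it straightforward to add synonyms to one group without changing how the groups are combined.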