The Pittsburgh Sleep Quality Index (PSQI) was developed by Buysse (1988) with the stated aims of providing a reliable, valid and standardized measure of sleep quality that could discriminate between good and poor sleepers. In addition, the index should be easy to interpret and provide a brief, clinically useful assessment of a variety of sleep disturbances that might affect sleep quality. It has been used in 900 publications and assesses self-rated activity over the previous month in order to identify patterns of dysfunction. Nineteen items are grouped into 7 component scores, each weighted equally on a 0-3 scale. Components are a combination of qualitative and quantitative items, empirical and clinical in origin rather than statistical, and comprise: Subjective Sleep Quality, Latency, Duration, Habitual Sleep Efficiency, Sleep Disturbances, Use of Sleep Medication and Daytime Dysfunction. Component scores are summed to provide a global score ranging from 0 to 21. A high score indicates poor sleep quality.
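The global scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the dictionary keys, function names and the default cut-off parameter are assumptions made for the example, and the derivation of each component score from the 19 questionnaire items is not shown.

```python
from typing import Mapping

# Hypothetical keys for the seven PSQI components named in the text.
PSQI_COMPONENTS = (
    "subjective_sleep_quality",
    "sleep_latency",
    "sleep_duration",
    "habitual_sleep_efficiency",
    "sleep_disturbances",
    "use_of_sleep_medication",
    "daytime_dysfunction",
)


def psqi_global_score(components: Mapping[str, int]) -> int:
    """Sum seven component scores (each 0-3) into a global score (0-21)."""
    missing = [name for name in PSQI_COMPONENTS if name not in components]
    if missing:
        raise ValueError(f"missing components: {missing}")
    for name in PSQI_COMPONENTS:
        score = components[name]
        if score not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be 0, 1, 2 or 3, got {score!r}")
    return sum(components[name] for name in PSQI_COMPONENTS)


def is_poor_sleeper(global_score: int, cutoff: int = 5) -> bool:
    """Flag poor sleep quality when the global score exceeds the cut-off."""
    return global_score > cutoff
```

For example, a respondent scoring 1 on every component would have a global score of 7 and would be flagged as a poor sleeper under a cut-off of 5.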
5.2.4.1. Factor Analysis
Buysse concedes that the tool is lacking in factor structure but goes some way to addressing this in a study examining the PSQI scores of 52 'good' sleepers (mean age 59.9, males 40), 34 'poor' sleepers with major depressive disorder (mean age 50.9, males 25) and 62 'poor' sleepers with disorders of initiating and maintaining sleep (DIMS) or disorders of excessive somnolence (DOES) (mean age 42.2 years, males 24) (Buysse et al. 1988). The overall group mean score was 7.4 (5.1). The component scores revealed a Cronbach's alpha of 0.83, suggesting they were all measuring the same construct. The largest component-total coefficients were for 'Habitual Sleep Efficiency' and 'Subjective Sleep Quality' (0.76 for each). The lowest, at 0.35, was for 'Sleep Disturbances'; this may be due to the high number of items that make up this component, which may be susceptible to individual variation over time.
Cole et al. (2006) performed factor analysis on the scores of 67 elderly individuals (mean age 68.9, 55% female) who were currently depressed, 143 individuals in remission and 207 with no mental illness. The sample was split randomly, with exploratory factor analysis performed on one half and confirmatory factor analysis on the second half. The authors concluded that the best fit was found with the 2 factors identified by Buysse, together with a third factor of 'Daily Disturbances'. The relationship of each component score to its respective factor in this 3-factor model was significant and large, with standardized path coefficients ranging from 0.43 (Sleeping Medication Use to Perceived Sleep Quality) to 0.91 (Habitual Sleep Efficiency to Sleep Efficiency). Correlations between the factors ranged from 0.42 (medium-large effect) to 0.82 (very large effect). Of course, this may not hold true in younger subjects with different clinical presentations. Smyth (2008) reports on the usefulness of the tool in evaluating sleep quality in older adults, but provides no data to support this.
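To make the internal consistency statistic concrete, the following is a minimal sketch of how Cronbach's alpha is conventionally computed from a respondent-by-component score matrix, using the standard formula alpha = k/(k-1) x (1 - sum of component variances / variance of total scores). The function name and data layout are assumptions for illustration; this is not the procedure or software used in the cited study.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for rows of scores (one row per respondent,
    one column per component), using sample variances."""
    if len(rows) < 2:
        raise ValueError("at least two respondents are required")
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two components are required")
    if any(len(row) != k for row in rows):
        raise ValueError("every respondent needs a score for each component")

    columns = list(zip(*rows))  # transpose: one tuple per component
    sum_item_variances = sum(variance(column) for column in columns)
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)
```

Alpha approaches 1 when components rise and fall together across respondents, which is the sense in which a value of 0.83 is read as the components measuring a common construct.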
5.2.4.2. Validity, Sensitivity and Specificity
The PSQI appeared to distinguish well between the different groups in Buysse's study. Using ANCOVA with age and sex as covariates, the adjusted mean score of the control group was 2.7 (1.7), the depressed group 11.1 (4.3), the DIMS group 10.4 (4.6) and the DOES group 6.5 (3.0). A cut-off score of 5 gave a sensitivity of 89.6% and specificity of 86.5%, with a group-wide kappa of 0.75 (p < 0.001). Sensitivity, although high, differed between groups: 97% for the depressed group, 84.4% for the DIMS group and 88% for the DOES group. It was concluded that a PSQI score above 5 indicates serious problems in at least 2 component areas or moderate problems in more than 3. A later study by Backhaus et al. (2002) confirmed these sensitivity and specificity findings, with a score above 5 indicating sleep disturbances in 80 insomnia subjects and 45 healthy subjects with a sensitivity of 98.7% and specificity of 84.4%. A score above 5 is now generally accepted as indicating poor quality sleep.
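As an illustration of how sensitivity and specificity follow from a cut-off, the sketch below classifies each global score as positive when it exceeds the cut-off and compares that with a reference diagnosis. The function name, argument names and the example data in the usage note are hypothetical; they are not taken from the studies cited.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[int],
    has_disorder: Sequence[bool],
    cutoff: int = 5,
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for the rule 'score > cutoff'."""
    if len(scores) != len(has_disorder):
        raise ValueError("scores and has_disorder must have the same length")

    tp = fn = tn = fp = 0
    for score, ill in zip(scores, has_disorder):
        screen_positive = score > cutoff
        if ill and screen_positive:
            tp += 1  # true positive
        elif ill:
            fn += 1  # false negative
        elif screen_positive:
            fp += 1  # false positive
        else:
            tn += 1  # true negative

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    return tp / (tp + fn), tn / (tn + fp)
```

With made-up scores of 2, 6, 9, 4, 7 and 3, of which only the third, fourth and fifth respondents have a sleep disorder, the function returns a sensitivity and a specificity of two thirds each.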
5.2.4.3. Test-retest Reliability
Ninety-one of Buysse's sample of 148 completed the PSQI twice, with no significant difference in global and component scores (Buysse et al. 1988). Scores were similar for each diagnostic group. The Pearson Product Moment Correlation Coefficient for global scores was 0.85, with individual component scores ranging from 0.65 to 0.85. Correlations were generally lowest for the controls and highest for the DIMS group, with both groups scoring highest on the quantitative components measuring sleep duration and sleep latency, rather than the qualitative components, which require a subjective assessment of sleep-related difficulties.
5.2.4.4. Stability
The stability of both the PSQI and the ESS was confirmed by Knutson et al. (2006a) when large numbers of black and white early middle-aged individuals completed both tools in the CARDIA study. Six hundred and ten individuals completed the PSQI and 609 the ESS at baseline and one year later. No significant difference was found between sample means (PSQI 5.7 (3.1) vs 5.9 (3.1) and ESS 7.4 (4.3) vs 7.2 (4.2)). High within-subject reliability was noted for both tools, with intra-class correlation coefficients above 0.80. Backhaus et al. (2002) performed short-term test-retest reliability studies on 80 individuals with primary insomnia. The test-retest interval ranged between 2 days and a few weeks. The correlation coefficient for the overall PSQI global score was good at 0.87. Validity testing using sleep logs and polysomnography in the same sample showed high correlations with sleep logs, but lower correlations with polysomnography data.
The PSQI appears useful in measuring sleep quality in a variety of populations. Carpenter & Andrykowski (1998) tested its use in chronic illness in a group of 155 bone marrow transplant patients, 56 renal transplant patients, 102 women with breast cancer and 159 with benign breast problems. Cronbach's alphas were 0.80 across groups and correlations between global and component scores were moderately high. Algul et al. (2009) used the tool with 124 obese individuals (32 with a BMI of 30-34.9 kg/m2 (Class I obesity) and 92 with a BMI > 35 kg/m2 (Class II obesity)) and 106 healthy control subjects. The Class II obesity group had a significantly worse PSQI global score than the control group (7.3 (4.1) vs 4.9 (3.6), p < 0.001).
5.2.5. Screening for NES: The Night Eating Questionnaire (2008)
Stunkard's original NESQ was conceived as a 4-point Likert scale with 9 items, including visual analogue scales. This was developed into a 14-item NEQ (2004) with a 5-point Likert scale and no visual analogue scales. Validation studies published in 2008 have confirmed the 2004 version of the NEQ as an acceptable measure of severity of NES, with one important revision. Item 13, which explores awareness whilst night eating, is now excluded from scoring, giving a possible score range of 0-52. The rationale for excluding this item is based on the assumption that individuals without awareness are not suffering from NES, but from a sleep-related disorder. For the purposes of the prevalence study, in order to distinguish between the study screening tool and the unvalidated NEQ (2004) used in the identification study, the validated NEQ will henceforth be described as the NEQ (2008).
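The scoring revision can be illustrated with a short sketch. It assumes each of the 14 items is coded 0-4, which is consistent with a 5-point scale and a 0-52 range once one item is dropped, and that any reverse-scoring of individual items has already been applied. The function and constant names are hypothetical.

```python
from typing import Sequence

N_ITEMS = 14
EXCLUDED_ITEM = 13  # awareness item (1-based), recorded but not scored


def neq_total(item_scores: Sequence[int]) -> int:
    """Total NEQ score over items 1-12 and 14, each coded 0-4 (range 0-52)."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    for number, score in enumerate(item_scores, start=1):
        if score not in (0, 1, 2, 3, 4):
            raise ValueError(f"item {number} must be coded 0-4, got {score!r}")
    return sum(
        score
        for number, score in enumerate(item_scores, start=1)
        if number != EXCLUDED_ITEM
    )
```

Under this scheme the awareness response is still collected and can be examined separately, but it contributes nothing to the total.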
Evidence for use of the NEQ (2008) was evaluated together from three separate NES studies (Allison et al. 2008b; Backhaus et al. 2002). Study 1 examined factor structure and internal consistency and included 1980 persons with self-diagnosed NES who completed the NEQ (2004) on the Internet. The mean score was 33.1 (7.5). Principal components analysis was used to generate four factors and a total Cronbach's alpha of 0.70. Yet only one individual factor had an acceptable alpha above 0.7 (nocturnal ingestions, 0.94). Other alphas were 0.65 for evening hyperphagia and 0.57 for morning anorexia. Factors 4 and 5 (mood and sleep) were combined into a single construct, which even then had a low alpha of 0.30. The rationale for combining these two factors is unclear, and it is hardly surprising that 2 factors measuring different constructs generate a poor alpha when combined.
The second study, in 81 outpatients diagnosed with NES, found acceptable convergent validity of the NEQ (2004) with additional measures of night eating, disordered eating, sleep, mood and stress. The mean score was 32.4 (6.8) and was significantly higher in normal weight individuals than in obese individuals. The third study compared scores from obese bariatric surgery candidates with and without NES and found appropriate discriminant validity of the NEQ (2004). Of 184 individuals, 10.3% (19) were identified with NES. Mean scores were 26.2 (8.1) for NES vs 16.0 (6.3) for non-NES. The positive predictive value (PPV) of the NEQ (2004) at a score of 25 or higher was low (40.7%), increasing to 72.7% at a score of 30 or greater. The negative predictive value (NPV) was high for cut scores of both 25 and 30 (95.2% and 94.0% respectively).
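The predictive values quoted above are proportions of screen-positive and screen-negative individuals who are correctly classified. A minimal sketch of the calculation follows; the function name is hypothetical, and the counts in the usage note are invented for illustration because the studies' cross-tabulations are not reported here.

```python
from typing import Tuple


def predictive_values(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (PPV, NPV) from the four cells of a 2x2 screening table.

    tp: screen positive with the condition   fp: screen positive without it
    tn: screen negative without it           fn: screen negative with it
    """
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one screen positive and one screen negative")
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv
```

For example, hypothetical counts of 12 true positives, 18 false positives, 160 true negatives and 10 false negatives give a PPV of 0.40 and an NPV of about 0.94. When a condition is uncommon in the sample, a low PPV alongside a high NPV is the expected pattern, since most screen positives are drawn from the much larger group without the condition.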
Reviewing the evidence as a whole, the authors conclude that the NEQ (2008) appears to be an efficient, valid measure of severity for NES. As previously discussed, a clear relationship exists between mood and sleep, although combining both concepts into a single construct does not appear logical. Presenting data from different NES studies together is also questionable, given the diverse NES criteria on which the studies are based. The NES criteria on which the first 2 studies are based are not reported and the third study did not use the 2003 criteria. Despite these limitations, it is at present the only tool available to systematically identify potential NES sufferers, although it falls far short of being a diagnostic tool. Its use is becoming widespread, and other researchers have found similar cut points useful. Lundgren et al. (2006) examined scores in 399 psychiatric patients and found a PPV of 52% using a cut point of 25 and 68% with a cut point of 30. Re-examination of the data when item 13 was removed from the total score gave an increased PPV of 62% and 77%, respectively. Although Stunkard recommends excluding the awareness item when scoring the NEQ (2008), results from the identification and characterisation studies suggest considerable variability in the reporting of awareness of night eating by individuals with NEB. For this reason, the NEQ (2008) was delivered in this prevalence study with the item included, although the individual item score was excluded from the total score calculation in keeping with scoring guidelines.
5.3. Ethical Approval
Ethical approval to conduct the prevalence study was obtained from St Helen's and Knowsley Ethics Committee (ref. no. 04/Q1508/9) on 30th July 2009. Approval from the UCLan Research Ethics Committee (ref. no. CA 142) was obtained on 4th September 2009. Management approval (ref. no. 04DE006) was obtained from the Hospital R&D committee on 23rd September 2009.
5.4. Data Collection
Immediately prior to the period of data collection for the prevalence study, the WMC expanded from one clinic a week to 6 clinics per week, with a mixture of approximately 12-14 new and follow-up patients per clinic. An anonymised screening tool (screening tool v 1.3 24/05/09) was devised which included the NEQ (2008), ESS and PSQI as well as height, weight, age and gender in order to include demographic characteristics in the analysis (Appendix 10). A patient information sheet (PIS) (version 1.4 24/05/09) was posted by JC to all WMC clinic attendees a week before their clinic appointment to explain the purpose of the study. The screening tool and the PIS were then handed to consecutive attendees on arrival at clinic by the clinic nurse. Permission was obtained from the nurse manager of the Hospital Outpatient Department and Professor W, the consultant in charge of the WMC. JC and another obesity specialist nurse colleague placed copies in the clinic notes prior to the start of clinic to reduce disruption to staff. Participants were invited either to complete the tools whilst waiting for their appointment and return them to the clinic nurse, or to return them by post in a pre-paid envelope. The majority were completed in clinic, although a small number (approximately 5%) were returned by post.
As this was an anonymised process it was not possible to collect information on the number of patients who declined to participate, although anecdotally, clinic staff noted very few patients who declined. The initial collection period was planned for 1 month, but due to staff holidays and clinic cancellations, data were collected over a 5-week period and collection stopped once a total of 103 completed tools had been returned. A further 31 tools were returned with incomplete data: either height, weight and gender (data collected on a separate page) were inadvertently omitted, or non-specific sleep and wake times were given, rendering the PSQI data invalid.
5.5. Statistical Analysis
Analysis was based on 103 individuals for whom a complete data set exists (weight was not available for 2 individuals). SPSS (version 14) and Intercooled Stata (version 9) were used to perform data analysis. Data were generally presented as means (SD). Independent samples t tests were used to perform between-groups analyses of means on interval data and Mann-Whitney U tests on ordinal data. Differences in gender were assessed using the binomial test. Frequencies (%), with chi-square analysis, were calculated for the proportions of individuals above and below screening tool cut points (ESS > 10, PSQI > 5, NEQ > 25 (first cut point), NEQ > 30 (second cut point)). Fisher's exact test and bootstrapping were used on small sample sizes either approaching significance or just reaching significance (p-values between 0.01 and 0.1). Pearson Correlation Coefficients were calculated to examine potential linearity of relationships between individual screening tool scores.
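For readers unfamiliar with Fisher's exact test, the sketch below shows what a two-sided test on a 2x2 table computes: the total hypergeometric probability of all tables with the same margins that are no more likely than the observed one. This is an illustrative pure-Python version with a hypothetical function name, not the SPSS or Stata routine used in the study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    if min(a, b, c, d) < 0:
        raise ValueError("cell counts must be non-negative")
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    if n == 0:
        raise ValueError("table is empty")

    denominator = comb(n, col1)

    def probability(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = probability(a)
    # Small relative tolerance so equally likely tables are not lost to rounding.
    threshold = observed * (1 + 1e-9)
    p_value = sum(
        probability(x)
        for x in range(lowest, highest + 1)
        if probability(x) <= threshold
    )
    return min(1.0, p_value)
```

As a usage example, a table in which 3 of 4 individuals in one group and 1 of 4 in the other exceed a cut point, `fisher_exact_two_sided(3, 1, 1, 3)`, gives a p-value of 34/70, approximately 0.49.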
5.6. Results