To standardise processes across the three centres and maximise data quality, researchers were trained to use detailed standard operating procedures for each stage of data collection. A number of cross-checks were routinely performed as a means of ensuring that any data inconsistencies arising from either
baseline assessment or follow-up were identified and resolved at the earliest opportunity. Trial data were
entered into a Microsoft Access 2003 database (Microsoft Corporation, Redmond, WA, USA) at each centre, before being merged into one central database following the end of data collection. A range of data validation checks were carried out in both Microsoft Access and Stata 11.2 (StataCorp LP, College Station, TX, USA) to minimise erroneous or missing data.
Randomised; allocation: CBT (+UC) or UC
3-month telephone call: PHQ-9, treatment received, use of medication including adherence to antidepressants
6-month assessment (primary outcome): BDI-II, PHQ-9, GAD-7, use of medication including adherence to antidepressants, attitudes to treatment, SF-12, EQ-5D, DAS-SF2, MAQ, mental health literacy
9-month telephone call: PHQ-9, treatment received, use of medication including adherence to antidepressants
12-month assessment: BDI-II, PHQ-9, GAD-7, use of medication including adherence to antidepressants, attitudes to treatment, SF-12, EQ-5D, DAS-SF2, MAQ, mental health literacy
FIGURE 2 CoBalT trial follow-up stages and data collected. DAS-SF2, Dysfunctional Attitudes Scale-Short Form
Measures
Primary outcome
The primary outcome was the BDI-II score at 6 months post randomisation, specifically a binary variable representing response, defined as a reduction in depressive symptoms of at least 50% compared with baseline. A threshold of 50% improvement in symptoms is a widely used definition of improvement [67] and was used to compare treatment effects in the systematic review of interventions for TRD [14].
The BDI-II is a 21-item self-report instrument to measure the severity of depressive symptoms occurring over the previous 2 weeks and has been widely used in depression trials. The 21 items are rated on a
four-point severity scale (0–3) and are summed to give a total score (range 0–63). A higher score on the
BDI-II denotes more severe depression.
Secondary outcomes
The BDI-II was also completed at 12 months to assess the longer-term effect of the intervention. Secondary outcomes included the BDI-II as a continuous score, and a further binary version representing remission of symptoms (defined as a BDI-II score of < 10).
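The three BDI-II outcomes described in this and the preceding subsection (continuous total score, response and remission) can be illustrated with a minimal sketch. The function names are hypothetical and chosen for illustration only, and the treatment of a zero baseline score is an assumption; this is not the trial's own analysis code, which was run in Stata.

```python
from typing import Sequence

BDI_ITEMS = 21      # number of BDI-II items
BDI_ITEM_MAX = 3    # each item is rated on a 0-3 severity scale


def bdi_total(items: Sequence[int]) -> int:
    """Sum the 21 item ratings into a total score (range 0-63)."""
    if len(items) != BDI_ITEMS:
        raise ValueError(f"expected {BDI_ITEMS} items, got {len(items)}")
    if any(not 0 <= v <= BDI_ITEM_MAX for v in items):
        raise ValueError("each item must be rated 0-3")
    return sum(items)


def is_response(baseline: int, follow_up: int) -> bool:
    """Response: a reduction of at least 50% compared with baseline."""
    if baseline <= 0:
        # Assumption: a percentage reduction is undefined when there are
        # no symptoms at baseline, so such input is rejected.
        raise ValueError("baseline score must be positive")
    # Integer form of (baseline - follow_up) / baseline >= 0.5
    return 2 * follow_up <= baseline


def is_remission(follow_up: int) -> bool:
    """Remission: a BDI-II total score of less than 10."""
    return follow_up < 10
```

Under these definitions, a participant scoring 30 at baseline and 15 at follow-up would count as a responder but would not be in remission, because 15 is not below 10.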
Other outcome measures included at the 6- and 12-month follow-up assessments are listed below:
• SF-12 (version 2) A 12-item Short Form Health Survey measuring quality of life [56]. The SF-12 is an abbreviated form of the SF-36 (Short Form questionnaire-36 items), a 36-item instrument for measuring subjective health status. It consists of 12 self-report items, selected from the SF-36. The CoBalT study used a revised version of the SF-12, the SF-12v2, which was introduced in 2002. The algorithms used to score data are dependent on the recall period. The CoBalT study used the acute (1-week recall) survey. Norm-based scores for the physical and mental subscales were calculated. Higher scores indicate better health and functioning.
• PHQ-9 The Patient Health Questionnaire, a brief nine-item depression scale [53] developed for use in a
primary care setting. The questionnaire is designed to assess the patient’s mood over the previous
2 weeks and scores for each of the nine items range from 0 (not at all) to 3 (nearly every day). Items
are summed to give a total score (range 0–27), with a higher score denoting more severe depression.
• GAD-7 The Generalised Anxiety Disorder Assessment, a measure of generalised anxiety disorder (GAD) [54]. The GAD-7 is a brief self-report questionnaire designed to detect probable cases of GAD and to provide a measure of its severity as recalled over the previous 2 weeks. As with the PHQ-9, scores for each item range from 0 (not at all) to 3 (nearly every day). Items are summed to give a total score (range 0–21), with a higher score denoting more symptoms of anxiety. The PHQ-9 and GAD-7 form part of the core IAPT outcome data set (www.iapt.nhs.uk/silo/files/iapt-outcome-framework-and-data-collection.pdf) and will enable comparison with national IAPT data.
• Panic (Brief PHQ) The presence of panic disorder was measured using the panic module of the self-report version of the PRIME-MD questionnaire (Brief PHQ) [55]. The measure consists of five items. The first item asks individuals to report whether or not they have experienced an anxiety attack within the last 4 weeks; if they have, they are asked four further questions about their experience. Each question elicits a response of ‘Yes’ or ‘No’. The total number of panic items endorsed therefore ranges from 0 to 5.
• EQ-5D-3L A standardised measure of health status developed by the EuroQol Group to provide a simple, generic measure of health for clinical and economic appraisal [57]. The EQ-5D-3L was used in CoBalT as a measure of health outcome for the economic evaluation. Scores were calculated using standard algorithms, with higher scores indicating better health.
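The simple summed scales in the list above (PHQ-9, GAD-7 and the panic item count) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the norm-based SF-12 and EQ-5D-3L scores are omitted because they rely on published scoring algorithms that are not reproduced here.

```python
from typing import Sequence


def summed_score(items: Sequence[int], n_items: int, item_max: int = 3) -> int:
    """Sum item ratings after checking the item count and rating range."""
    if len(items) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(items)}")
    if any(not 0 <= v <= item_max for v in items):
        raise ValueError(f"each item must be rated 0-{item_max}")
    return sum(items)


def phq9_total(items: Sequence[int]) -> int:
    """PHQ-9: nine items rated 0-3, total score range 0-27."""
    return summed_score(items, n_items=9)


def gad7_total(items: Sequence[int]) -> int:
    """GAD-7: seven items rated 0-3, total score range 0-21."""
    return summed_score(items, n_items=7)


def panic_items_endorsed(attack: bool, follow_ups: Sequence[bool]) -> int:
    """Count 'Yes' answers on the five-item panic module (range 0-5).

    The four follow-up questions are only asked when the first item
    (anxiety attack in the last 4 weeks) is answered 'Yes'.
    """
    if not attack:
        return 0
    if len(follow_ups) != 4:
        raise ValueError("expected four follow-up answers")
    return 1 + sum(1 for answer in follow_ups if answer)
```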
Bespoke measures relating to patients’ treatment experience and mental health literacy were also recorded at 6 and 12 months.
Self-reported adherence to antidepressants was collected at each of the four time points [48]. With consent,
data on antidepressant medication received (additional prescriptions, changes in dose and/or changes in the antidepressant prescribed) and other medications prescribed during the course of the study were recorded from GP records, together with details of consultations in primary care. Data on antidepressants prescribed during the year prior to entry to the study were also recorded, when consent was given to access medical records. Data on health care utilisation in primary and secondary care, private treatments, and complementary and alternative treatments were collected as part of the 6- and 12-month follow-up questionnaires and were used to inform the economic evaluation.
Process measures of dysfunctional attitudes and metacognitive awareness were also collected at the 6- and 12-month assessments:
• Dysfunctional Attitudes Scale-Short Form (version 2) (DAS-SF2) The DAS-SF2 is a self-report questionnaire containing nine items that was developed from Weissman’s original Dysfunctional Attitude Scale [68], using item response analysis to provide an efficient and accurate assessment of dysfunctional attitudes among depressed individuals [58].
• Metacognitive Awareness Questionnaire (MAQ) The MAQ [59] assessed whether or not patients
with depression view their negative thoughts as reflecting reality. The scale consists of nine self-report
items and has the same seven-point response format as the Dysfunctional Attitudes Scale (DAS), with
higher MAQ scores reflecting greater metacognitive awareness.
The brief telephone follow-ups at 3 and 9 months comprised the PHQ-9, use of antidepressant medication, including adherence to antidepressants [48], and other treatments received.
Table 2 shows the measures and when they were collected.
Handling missing items
For outcomes on the BDI-II, PHQ-9, GAD-7, DAS-SF2 and MAQ, the trial dealt with any missing data at an
individual item level by adopting the following rule. If > 10% of the items were incomplete then the data collected on that measure for that participant were disregarded. However, if < 10% of items on a particular measure were missing, missing item(s) were imputed using the mean of the remaining items (rounded to an integer). Therefore, when an individual had completed 19 or 20 items for the primary outcome measure (BDI-II) then the remaining one or two items were imputed. For all other measures
(PHQ-9, GAD-7, DAS-SF2, and MAQ) the 10% rule meant that only a single item would be imputed.
Data were complete for the majority of the sample; the number of cases for which values were imputed is reported in Table 3.
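The item-level rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the trial's own code: the function and constant names are hypothetical, the allowance is encoded as the per-measure item counts given in the text (up to two items for the BDI-II, one for the other measures) rather than recomputed from the percentage, and halves are assumed to round upwards because the text does not specify the rounding convention.

```python
from typing import List, Optional, Sequence

# Maximum number of missing items that may be imputed per measure,
# taken from the text: two for the 21-item BDI-II, one otherwise.
MAX_IMPUTED = {"BDI-II": 2, "PHQ-9": 1, "GAD-7": 1, "DAS-SF2": 1, "MAQ": 1}


def impute_items(measure: str,
                 items: Sequence[Optional[int]]) -> Optional[List[int]]:
    """Return the completed item list, or None if the measure is disregarded.

    Missing items are passed in as None and replaced by the mean of the
    remaining items, rounded to an integer.
    """
    observed = [v for v in items if v is not None]
    n_missing = len(items) - len(observed)
    if n_missing > MAX_IMPUTED[measure]:
        return None  # too many items missing: disregard this measure
    if n_missing == 0:
        return list(observed)
    total, n = sum(observed), len(observed)
    # Assumption: round half up, computed in integer arithmetic to avoid
    # floating-point surprises (valid for non-negative item ratings).
    fill = (2 * total + n) // (2 * n)
    return [fill if v is None else v for v in items]
```

For example, a BDI-II questionnaire with 19 items rated 1 and two items missing would be completed with two imputed ratings of 1, whereas a PHQ-9 with two missing items would be disregarded.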
The scoring manuals for the SF-12 and EQ-5D-3L, which require the application of complex scoring algorithms, indicated that if any item was missing, the scale score should not be calculated. In the case of greater item non-response or missing follow-up data, sensitivity analyses were conducted using multiple imputation by chained equations (MICE) to examine the impact of missing data on the main findings (see Sensitivity analyses to examine the impact of missing data, below).