Before data are analysed it is very often necessary to transform them in various ways. For example, you might have a variable indicating which of four age groups people are in (e.g. 16–20, 21–24, 25–30 and 31–34) and wish to create a new variable indicating which of two more general age groups they fall in (e.g. 16–24 and 25–34). Or you might have obtained Likert scale scores for people on five questions relating to happiness with career choice, and want to create a new variable containing the average of each person's scores across these five questions.
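The logic of these two transformations can be sketched outside SPSS. The short Python fragment below is only an illustration and is not SPSS syntax; the numeric codes 1 to 4 for the original age groups, and the names `BROAD_GROUP`, `collapse_age_group` and `mean_score`, are assumptions made for the example.

```python
# Assumed codes for the four original age groups (hypothetical, not from the text):
# 1 = 16-20, 2 = 21-24, 3 = 25-30, 4 = 31-34.
# Each original code is mapped to one of two broader groups:
# 1 = 16-24, 2 = 25-34.
BROAD_GROUP = {1: 1, 2: 1, 3: 2, 4: 2}


def collapse_age_group(code):
    """Return the broad age-group code for an original four-level code."""
    return BROAD_GROUP[code]


def mean_score(scores):
    """Return the arithmetic mean of one respondent's item scores."""
    return sum(scores) / len(scores)


# Hypothetical respondents, for illustration only.
ages = [1, 4, 2, 3]
broad = [collapse_age_group(a) for a in ages]

# Five hypothetical Likert responses from one person.
happiness = mean_score([4, 5, 3, 4, 4])
```

The first transformation changes how an existing variable is coded, and the second builds a new variable from several existing ones.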
SPSS provides several ways of transforming data, and this can be done either with the pull-down menus or with syntax files. If you are new to SPSS you may find the pull-down menus easier, but for those with some familiarity with it the syntax files are usually quicker and more efficient to use. In this section we will cover both approaches.
To explain how data transformation is achieved, we will use an example. Imagine that you design a questionnaire to measure how satisfied people are with their career. To keep things simple we will assume that this questionnaire contains just the following four items:
1 I am very happy with my current career
2 I wish I had chosen a different career
3 I never think I would be better off in a different career
4 I am convinced that I am in the wrong career
For each question, 10 respondents are asked to indicate whether they strongly agree, agree, are unsure, disagree, or strongly disagree. The responses to all questions are coded as follows:
Strongly disagree = 1
Disagree = 2
Not sure = 3
Agree = 4
Strongly agree = 5
With the adoption of this coding system it is clear that questions 2 and 4 are negatively keyed, because here greater agreement (and therefore higher scores) indicates relatively higher unhappiness with career choice. The results are shown in Table 4.3.
Before analysing the questionnaire responses you wish to transform the variables in various ways. As explained below, the Recode and Compute commands are very useful for doing so.
RECODE
Recode is used when you wish to change the way in which a variable is coded. For example, you may have coded males as 1 and females as 2 and wish to change this coding to males = 0 and females = 1. In doing so you can choose to overwrite the original variable (in which case your original codings are permanently deleted), or you can opt to create a new variable containing the recoded version, leaving the original variable untouched.
In questionnaire surveys a common use of recoding occurs when questions with Likert scale responses are both positively and negatively keyed. For example, in developing a scale to measure how happy someone is with their career, the researcher may use four positively keyed questions (e.g. I am very content with the career path I have chosen) and four negatively keyed questions (e.g. I wish I had chosen a different career). If a 5-point Likert scale is used, and responses are coded 1 to 5 (e.g. strongly disagree = 1, disagree = 2, in between = 3, agree = 4, strongly agree = 5), it is necessary to recode the negatively keyed questions so that the greater the agreement the lower the score obtained (e.g. a response of 5 is recoded 1, 4 is recoded 2 etc.). Upon completion, higher scores will always indicate higher levels of happiness with career. It then becomes possible to combine the scores across all eight questions to obtain an overall score for happiness with career.
To do this, start SPSS if it is not already running, and then enter the data in Table 4.3 (see Figure 4.11) and then do the following:
➢ Click on Transform.
➢ Click on Recode.
➢ Click on Into Different Variables.
A dialog box will appear with your variables in the white box on the left hand side.
➢ Click on the first variable you wish to recode (in this case ‘cq2’), and then on the black triangle to the left of the box headed Numeric Variable† --> Output Variable.
You will see that cq2 --> ? appears in the large white rectangular box. You have now specified the variable you wish to have recoded, and the next step is to provide the name of the variable you wish to place the recoded version of ‘cq2’ in (referred to in SPSS as the output variable). On the left hand side of the dialog box you will see the heading Output Variable, and beneath this Name and Label.
AN INTRODUCTION TO SPSS
Table 4.3 The responses of 10 people to four career choice questions
Case Gender cq1 cq2 cq3 cq4
1 1 3 2 2 1
†Input Variable in SPSS Version 12.
In the small white box under Name type in the name you want to give the output variable. This name should be something which allows you to keep track of the fact that this new output variable is a recoded version of ‘cq2’. Let’s call it ‘cq2rec’ (meaning variable ‘cq2’ recoded) in this case.
➢ So type ‘cq2rec’ in the white box under Name. If you wish to, you can also give this variable a label (perhaps ‘Career question 2 recoded’) by typing this in the white box headed Label.
➢ Click on Change.
You will see that the output variable has been moved into the large white rectangle which now contains the following: cq2 --> cq2rec (see Figure 4.12).
The next step is to indicate the relationship between the values of the original variable (in this case ‘cq2’) and the new output variable (in this case ‘cq2rec’). Here we wish to change the coding as follows: 5 = 1, 4 = 2, 3 = 3, 2 = 4, 1 = 5. This will ensure that the more people agree with the negatively keyed ‘cq2’ variable, the lower the score they will obtain, and the more they disagree with it, the higher the score they will obtain. To do this:
➢ Click on Old and New Values. A dialog box will appear. Click on the white circle to the left of Value under Old Value to highlight this, and then enter a 1 in the white rectangle to the right of Value.
Figure 4.11 Data entered for transformation
We now need to give the value of the output variable when the original variable is coded in the way indicated under Old Value, Value (i.e. in this case 1). Here we want people who are coded 1 for ‘cq2’ to be coded as 5 for ‘cq2rec’. So we enter a 5 in the rectangular box to the right of New Value, Value.
➢ Click on Add. The coding of the value of the original variable (in this example 1 for ‘cq2’) is now linked to a code for the new variable (in this example 5 for ‘cq2rec’) in the white box headed Old --> New.
➢ You now need to repeat the previous two steps for each of the other recodings (i.e. 2 = 4, 3 = 3, 4 = 2 and 5 = 1). When you have finished, all five of the recodings should be seen in the white box headed Old --> New (see Figure 4.13).
➢ Click on Continue.
➢ Click on OK.
The new variable ‘cq2rec’ will be created. You can view this in Data View. Because in this example ‘cq4’ is also negatively keyed, you now need to go through the same procedure again to recode this, perhaps into a variable called ‘cq4rec’.
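The recoding just described amounts to a lookup from old values to new values, written into a new variable so that the original is preserved. A minimal Python sketch of that idea follows; it is not SPSS syntax, the function name `recode` is hypothetical, and the responses shown are invented for illustration rather than taken from Table 4.3.

```python
# Recoding map from the text: old value -> new value.
REVERSE_KEY = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}


def recode(values, mapping):
    """Return a new list with each value recoded; the input list is untouched.

    In this sketch, a value that is absent from the mapping becomes None.
    """
    return [mapping.get(v) for v in values]


# Hypothetical responses to cq2 (illustrative only).
cq2 = [2, 5, 1, 3, 4]
cq2rec = recode(cq2, REVERSE_KEY)

# On a scale coded 1 to 5, reversing is the same as subtracting from 6.
same_as_six_minus = all(REVERSE_KEY[v] == 6 - v for v in range(1, 6))
```

Because `recode` builds a new list, `cq2` is left intact, which mirrors the choice of Into Different Variables rather than overwriting the original.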
Figure 4.12 Specifying variables for recoding
Sometimes you may wish to recode a variable only for some cases and not others. For example, you might want to recode the score that women get on a particular variable, but leave the score that men get on it intact. To do so you should follow the same procedure as that set out above for Recode, except that before clicking on OK (i.e. the last step) you should click on If. Here you can enter the name of the variable on which recoding depends and the value it should have if recoding is to go ahead. For example, if for some reason you only wanted ‘cq2’ to be recoded into ‘cq2rec’ for men, and not for women, you could enter gender = 1 in the If dialog box and then click on Continue. As a consequence of the command you have entered, the recoding would only take place for men.
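Conditional recoding can be pictured as applying the same lookup only to cases that satisfy a condition. The Python sketch below illustrates this under stated assumptions: it is not SPSS syntax, `recode_if` is a hypothetical helper, the data are invented, and leaving unselected cases as None is a choice made for this sketch rather than a statement about how SPSS treats them.

```python
REVERSE_KEY = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}


def recode_if(values, condition_values, required, mapping):
    """Recode values[i] only where condition_values[i] equals `required`.

    Cases that do not meet the condition are given None in the output
    (an assumption of this sketch).
    """
    return [
        mapping.get(v) if c == required else None
        for v, c in zip(values, condition_values)
    ]


# Hypothetical data: gender coded 1 = male, 2 = female.
gender = [1, 2, 1, 2]
cq2 = [2, 5, 4, 1]

# Recode cq2 only for men (gender = 1).
cq2rec = recode_if(cq2, gender, 1, REVERSE_KEY)
```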
COMPUTE
Figure 4.13 Specifying old and new variables when recoding
Another commonly used method for transforming data is the Compute command. Compute is used when you want to create a new variable that results from some arithmetic function, a function usually involving existing variables. For example, having recoded ‘cq2’ and ‘cq4’, you are in a position to work out the mean score that each of the respondents has obtained across the four career happiness variables.
To do so:
➢ Click on Transform.
➢ Click on Compute.
A dialog box will appear. In the white box at the top left headed Target Variable write in the name of the new variable you wish to have created. In this example, because we are going to compute the mean score across the four career questions, we will type in ‘careerav’ (standing for career average). If you wish to give the newly created variable a label (in this case perhaps ‘Mean career score’) click on Type & Label and enter this in the white box next to Label and then press Continue.
Next you need to type in the expression that will create the new target variable. In this example we want to find the mean across the four questions so we enter:
(cq1 + cq2rec + cq3 + cq4rec)/4
This indicates that the four career questions are to be added together and then divided by four to find the mean score (see Figure 4.14). If there had been six questions we would, of course, have divided by six. The brackets are used because by default SPSS carries out multiplications and divisions before additions and subtractions. When brackets are used, the computations inside the brackets are carried out first. You can either type this instruction in directly in the white box headed Numeric Expression or you can click on the variables as they appear in the white rectangle on the bottom left to transfer them across to the numeric expression box, and use the keypad to enter the brackets and + signs. Of course, if your intention was to compute some other score for the target variable, you would enter whatever numeric expression you wanted in order to do so.
Figure 4.14 Entering a complete instruction
➢ Click on OK.
The newly created variable of ‘careerav’ will be computed, and you can view it in Data View.
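To see why the brackets matter, the same arithmetic can be worked through in Python, which also carries out division before addition. This is only an illustrative sketch and not SPSS syntax; the function name `career_average` and the four scores are hypothetical.

```python
def career_average(cq1, cq2rec, cq3, cq4rec):
    """Mean of the four career items, using the recoded cq2 and cq4."""
    return (cq1 + cq2rec + cq3 + cq4rec) / 4


# One hypothetical respondent (illustrative scores only).
with_brackets = career_average(3, 4, 2, 5)   # (3 + 4 + 2 + 5) / 4

# Without brackets only the last score is divided by 4,
# so the result is 3 + 4 + 2 + (5 / 4), which is not the mean.
without_brackets = 3 + 4 + 2 + 5 / 4
```

Here the bracketed version gives 3.5, whereas the unbracketed version gives 10.25, which lies outside the 1 to 5 range that a mean of these items can take.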