SES can be seen as a composite measure (Baker 2014), and therefore can be measured in a number of different ways, including family income, parental occupation, housing tenure and
parental education11. The two measures of SES I used in this empirical chapter were parental occupational classification and household income. The advantage of using parental
occupational classification as the measure of SES is that it is relatively stable, and may therefore be a good indicator of permanent socioeconomic position. In the context of this empirical chapter, the main advantage of using household income is the fact that it is a broadly continuous measure, and therefore applicable for use in the CI, which ranks individuals according to their socioeconomic position.
I also considered using parental educational attainment as a measure of SES to compare inequalities between the cohort studies, as the age at which the mother left full-time education was collected in approximately the same manner in the NCDS, BCS and MCS. However, as argued by Feinstein et al. (2008), although years of schooling can be considered a
‘functional proxy’ for the level of education, this basic measure does not fully take account of the type or quality of educational attainment, and it is the level of qualification rather than the years of schooling that will lead to socioeconomic dividends through signalling effects.
One issue with cross-cohort comparisons such as those in this chapter is that the different studies collect apparently similar variables in very different ways. This is
exemplified in the ways in which information concerning parental occupational classification and household income has been collected in the NCDS, BCS and MCS. For instance, in both the NCDS and the BCS, the question regarding occupational classification refers to the father (also referred to as the ‘male head’). However, in the MCS, there are survey questions relating to both the mother and father, enabling the calculation of the highest occupational classification in the family, which may be a more appropriate measure to use in modern society, given the significant increase in the number of women in the labour market over time.
The measures of income are also collected in different ways in the NCDS, BCS and MCS. In the empirical analysis, I attempted, where possible, to capture a measure of permanent household income, in order to minimise potential biases from short term income shocks. Blau (1999) has argued that the effect of current income levels may be relatively small compared to that of a permanent income measure.
For the NCDS, the measure of income I used was the ‘Permanent Parental Income’ variable provided by the Centre for Longitudinal Studies and originally funded by the Economic and Social Research Council (ESRC) in order to aid research into income dynamics and health inequalities (Taylor 2000). This income variable was created because, although the NCDS contains extremely detailed and high quality information on a variety of child and family outcomes throughout childhood, the family income of the child was only collected at the age of 16. Although the age-16 income variable has been used in studies relating to child cognitive ability, for example by Gregg and Macmillan (2010), there are several associated problems. Firstly, Benzeval et al. (1997) have argued that such a measure may not be an accurate reflection of living standards in earlier childhood, when the cognitive tests are undertaken. Secondly, earnings are grouped into a small number of bands, with the highest band having no upper limit. Although work has been undertaken in order to convert this banded measure into a continuous one (Blanden and Gregg 2004), this analysis resulted in only 77 unique income categories being generated. Thirdly, the interview for the third wave of the NCDS happened to be conducted during the ‘Three-Day Week’ of 1974, and there is therefore confusion as to whether the respondents were reporting their usual salary or the reduced figure (Micklewright 1988).
In order to overcome these problems, Taylor (2000) has calculated a measure of permanent income, using information on parental characteristics deemed to have a permanent impact on family income levels, such as parental years of education, parental occupational class and whether parents were absent during childhood. In an estimated income equation, the above measures were used as key explanatory variables, along with controls for parents’ age and region, similar to the study of Dearden et al. (1997). Using the grouped dependent variable technique of Stewart (1983), a total measure of log family income was calculated, with this measure taking into account the bounded nature of the income question.
However, there are several issues associated with this measure of income in the NCDS. Firstly, this method assumes that factors such as occupational class and parental education only affect child outcomes indirectly through income, which may be considered a strong assumption. Secondly, cohort members with missing information on parental occupational class and cohort members with no natural parents in wave 3 of the survey were not given a value and therefore excluded from analysis. Although the estimated values of non-natural parents were imputed using the mean difference between the reported mother’s and
father’s age at birth, there may be concerns that the missing information regarding parental occupational class may bias the empirical estimates using the income measure. However, detailed studies of non-response in the NCDS, such as Nathan (1999) and Hawkes and Plewis (2006), have shown that no significant bias is generated from this missing data, and
descriptive statistics, as displayed in Table 4.3, show that the distribution of fathers’ social class at 1960 (as measured by the NSSEC-5) does not significantly change between the full sample and the sample in which there is missing data on the income variable.
Table 4.3 - Distribution of parental occupation in the NCDS

                                     Full Estimation Sample     Income Estimation Sample
                                     Observations        %      Observations        %
Managerial/Professional                       566     5.18               361     4.89
Lower Managerial/Higher Technical           1,602    14.67             1,088    14.75
Intermediate Occupations                    6,144    56.26             4,169    56.53
Small Employers/Own Account                 1,927    17.64             1,303    17.67
Lower Supervisory/Technical                   682     6.24               454     6.16
Total                                      10,921   100                7,375   100
Although a measure of permanent income has been generated in the NCDS, no such measure of income has been generated in the BCS. The income measure in the BCS was collected using a single banded income question, and was only collected at the ages of 10 and 16. Micklewright and Schnepf (2010) have argued that the reliability of single income questions such as the one used in the BCS could be brought into question, as such questions may be poor at capturing income when one individual is asked to report the income for the whole household. Although a significant amount of work has been undertaken to generate a measure of income comparable to the NCDS at the age of 16 (Blanden and Gregg 2004), it is once more unclear how representative such a measure would be in the earlier waves. Due to the absence of an appropriate measure of income, I did not consider the BCS for analysis using the CI, and therefore I could not use this study when comparing the level of
socioeconomic inequality in cognitive ability over time, despite the comparability of the vocabulary ability cognitive test at age 5.
For the MCS, the measure of income I used was equivalised family income, calculated from information in waves 3-5, when the cohort children were 5, 7 and 11 respectively. Similar to
the measure of income in the NCDS, the original income question in all three of the waves required the main respondent to choose from a number of income bands. Rather than using these measures, I used the CLS-provided OECD equivalised measures for all three waves, found in the MCS list of derived variables. The OECD equivalence formula (Hagenaars et al., 1994) can be given by:
\[
\frac{\text{Household Income}}{1 + (0.5 \times \text{Additional Adults}) + (0.3 \times \text{Number of Children})} \tag{4.14}
\]
This particular equivalence method assigns a value of 1 to the household head, 0.5 to every additional adult, and 0.3 to each child. For instance, a family of two adults and two children is given a value of 2.1, and a single parent family with four children is given a value of 2.2. The total level of household income is then divided by this value to create an equivalised income measure.
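As an illustration only, the weighting scheme just described can be sketched in a few lines of Python. The function and argument names below are hypothetical and are not the names of the CLS derived variables; the sketch simply reproduces the weights of 1, 0.5 and 0.3 stated above.

```python
def oecd_equivalence_scale(additional_adults: int, children: int) -> float:
    """Scale value: 1 for the household head, 0.5 per additional adult, 0.3 per child."""
    if additional_adults < 0 or children < 0:
        raise ValueError("household counts cannot be negative")
    return 1.0 + 0.5 * additional_adults + 0.3 * children


def equivalised_income(household_income: float, additional_adults: int, children: int) -> float:
    """Divide total household income by the household's scale value."""
    return household_income / oecd_equivalence_scale(additional_adults, children)


# The two households used as examples in the text.
couple_two_children = oecd_equivalence_scale(additional_adults=1, children=2)
single_parent_four_children = oecd_equivalence_scale(additional_adults=0, children=4)

print(round(couple_two_children, 2), round(single_parent_four_children, 2))
```

Rounded to two decimal places, the two scale values match the 2.1 and 2.2 quoted in the text.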
To account for the fact that a permanent income measure combines the income measures from different waves of data over a span of six years, I also adjusted these measures for inflation. This was calculated using wave 3 as the base year (2006) and the end of year Great Britain inflation rates (ONS 2016). Following this, I calculated income measures for the separate waves. The income measure used in the 4th wave (when the cohort children are 7) was the average income across the 3rd and 4th waves. The income measure used in the 5th wave of data (when the cohort children are 11) was the average income across the 3rd, 4th and 5th waves.
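The deflating and averaging steps can be sketched as follows. This is a minimal illustration under stated assumptions: the income and price index figures are invented for the example (they are not ONS or MCS values), the inflation rates are assumed to have already been compounded into a price index with wave 3 as the base, and all names are hypothetical.

```python
def deflate(nominal: float, price_index: float, base_index: float) -> float:
    """Express a nominal amount in base-period prices."""
    return nominal * base_index / price_index


def running_average(values: list[float]) -> list[float]:
    """Average of all values up to and including each position."""
    averages = []
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        averages.append(total / count)
    return averages


waves = (3, 4, 5)
# Invented figures, for illustration only.
nominal_income = {3: 400.0, 4: 420.0, 5: 450.0}
price_index = {3: 100.0, 4: 105.0, 5: 120.0}  # wave 3 is the base period

real_income = [deflate(nominal_income[w], price_index[w], price_index[3]) for w in waves]

# Wave 4 uses the mean of waves 3-4; wave 5 uses the mean of waves 3-5.
permanent_income = dict(zip(waves, running_average(real_income)))
```

The running average means that the wave 3 value is simply the wave 3 real income, while later waves pool all earlier real incomes, which is the pattern described in the paragraph above.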
As demonstrated above, although unavoidable, the income measures for the different cohort studies were collected in different ways, complicating direct cross-cohort comparisons. Previous studies that have used measures of income in cross-cohort
comparisons, such as Blanden and Machin (2004; 2010) and Gregg and Macmillan (2010), either standardised the calculated income measures to mean 0, SD 1 (in an attempt to ensure that changes in income inequality or the variance of income across the cohorts did not drive the results) or converted the income measure into quintiles. Because I used the CI as the empirical strategy in this chapter, the income of the cohort children was ranked in the calculation, converting household income from a cardinal scale into an ordinal measure.
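The difference between standardising income and ranking it can be illustrated with a short sketch. The fractional rank used here, (position - 0.5) / n with ties sharing their average position, is one common convention for concentration indices and is an assumption on my part; it is not necessarily the exact ranking rule used in the empirical analysis. The use of the population standard deviation is likewise an assumption, and the income values are invented.

```python
from statistics import mean, pstdev


def standardise(values: list[float]) -> list[float]:
    """Rescale to mean 0 and (population) standard deviation 1."""
    centre = mean(values)
    spread = pstdev(values)
    return [(v - centre) / spread for v in values]


def fractional_rank(values: list[float]) -> list[float]:
    """Rank in (0, 1): (position - 0.5) / n, with ties given their average position."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the block while the next sorted value is tied with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_position = ((start + 1) + (end + 1)) / 2  # 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = (average_position - 0.5) / n
        start = end + 1
    return ranks


incomes = [100.0, 200.0, 400.0, 800.0]
stretched = [100.0, 200.0, 400.0, 1600.0]  # same ordering, wider spread

same_ranks = fractional_rank(incomes) == fractional_rank(stretched)
```

In this toy example `same_ranks` is true: widening the gap at the top of the distribution changes the standardised values but leaves the ranks untouched, which is the sense in which ranking makes the measure ordinal.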
4.5.3 Other explanatory variables
I also included a number of variables that may attenuate the relationship between SES and child cognitive ability in the various regression models. The different surveys have collected information regarding similar variables in a variety of ways, and as a consequence the variables I included were relatively limited. My choice of explanatory variables was also partially guided by the studies of Goisis et al. (2017a; 2017b), who have used the NCDS, BCS and MCS to examine the changes over time in the relationship between birth weight and cognitive ability (2017a) and between maternal age and cognitive ability (2017b).
The first child characteristic I included was a dummy variable for gender, as boys and girls may excel at different aspects of cognitive ability. Several empirical studies, including Hedges and Nowell (1995), Weiss et al. (2003) and Halpern (2013), have shown that there may be significant gender differences in child cognitive test scores, with the extent of the difference dependent on the cognitive assessment in question. A further child characteristic I
controlled for was ethnicity, as several studies (for instance Todd and Wolpin 2007) have shown that there may be significant ethnic disparities in child cognitive ability. It is
particularly important to control for ethnicity in the context of cross-cohort comparisons, as the MCS is substantially more ethnically diverse than both the NCDS and BCS. I also included categorical variables for region, in order to account for potential spatial variation in child outcomes, which may occur due to localised educational policies (Taylor et al., 2013). Several empirical studies, including Black et al. (2005), have shown that early life factors, such as having a low birth weight and being a preterm birth, may also be significantly associated with a number of short and long-term outcomes, including child cognitive ability. Furthermore, Goisis et al. (2017a) found that the association between low birth weight and cognitive ability has weakened over time, with a significantly higher level of correlation in the NCDS than the MCS. Although the impact of such early life factors may not be strictly causal, it is thought that such factors may proxy for the early environment experienced by the child.
These two early life variables are also likely to be highly correlated due to the fact that one of the distinctive determinants of a low birth weight is being a preterm birth. However, it is important to control for both factors, as although being a preterm birth may also be picked up by variation in birth weight, there are other issues that may contribute to a low birth
weight, for instance genetics and maternal behaviours such as smoking cigarettes and drinking alcohol. I included low birth weight as a dummy variable with the value of 1 if the cohort child weighs below 2500g at birth, and 0 otherwise, and preterm birth as a dummy variable with the value of 1 if gestational age is lower than 259 days (37 weeks) and 0 otherwise.
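The two early life dummies can be written down directly from the thresholds given above. The sketch below is illustrative only: the function names are hypothetical, and returning `None` for a missing measurement is my assumption about how missing values might be handled, not a description of the actual coding.

```python
from typing import Optional

LOW_BIRTH_WEIGHT_GRAMS = 2500   # threshold stated in the text
PRETERM_GESTATION_DAYS = 259    # 37 weeks, as stated in the text


def low_birth_weight_dummy(birth_weight_grams: Optional[float]) -> Optional[int]:
    """1 if birth weight is below 2500g, 0 otherwise; None if missing (assumption)."""
    if birth_weight_grams is None:
        return None
    return 1 if birth_weight_grams < LOW_BIRTH_WEIGHT_GRAMS else 0


def preterm_dummy(gestation_days: Optional[int]) -> Optional[int]:
    """1 if gestational age is below 259 days, 0 otherwise; None if missing (assumption)."""
    if gestation_days is None:
        return None
    return 1 if gestation_days < PRETERM_GESTATION_DAYS else 0
```

Both thresholds are strict inequalities, so a child weighing exactly 2500g or born at exactly 259 days is coded 0.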
As well as child characteristics, I also included a small number of maternal characteristics in the empirical models. The first maternal characteristic I controlled for was maternal age. As Fergusson and Lynskey (1993) have shown, maternal age may affect child outcomes through two main pathways. Firstly, children of younger mothers are more likely to be born into poorly educated, socially disadvantaged families. Secondly, the same children are also less likely to be exposed to a stable home environment. Furthermore, Goisis et al. (2017b) have shown that the relationship between maternal age and cognitive ability has changed over time, with the correlation negative in the NCDS and positive in the MCS. Ideally, I would have liked to also include paternal age, but the inclusion of this variable would have resulted in a large amount of missing data across the three cohort studies. To capture any non-linear effects of maternal age, I entered this variable into the model in both a linear and quadratic form. I also included a dummy variable for marital status, which acts as a proxy variable for the stability of the household environment. In the NCDS and BCS this was measured as whether or not the mother was married, and in the MCS this was measured as whether or not the mother was married, cohabiting or single.
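To make the linear and quadratic entry of maternal age concrete, a generic specification can be written as below. The notation is purely illustrative and is an assumption on my part: $y_i$ stands for the cognitive outcome of child $i$, $\text{MatAge}_i$ for maternal age, and $\mathbf{x}_i$ for the remaining controls; none of these symbols is taken from the models estimated in this chapter.

```latex
\[
  y_i = \beta_0 + \beta_1\,\text{MatAge}_i + \beta_2\,\text{MatAge}_i^{2}
        + \mathbf{x}_i'\boldsymbol{\gamma} + \varepsilon_i
\]
\[
  \frac{\partial y_i}{\partial\,\text{MatAge}_i} = \beta_1 + 2\beta_2\,\text{MatAge}_i ,
  \qquad
  \text{MatAge}^{*} = -\frac{\beta_1}{2\beta_2} \quad (\beta_2 \neq 0)
\]
```

Under this form the marginal association with maternal age varies with age itself, and changes sign at $\text{MatAge}^{*}$ whenever that value lies within the observed range of ages.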
I also considered two markers of maternal health related behaviour: whether the mother smoked at all during pregnancy and whether the mother breastfed the child at any point. As Fergusson and Lloyd (1991) have shown, although the relationship between smoking in pregnancy and child cognitive ability may not be strictly causal, it may mediate itself through the home environment. Horwood and Fergusson (1998) have shown that although the relationship between breastfeeding and cognitive ability may be small, it is long lived and may extend into late childhood. I included both as dummy variables, taking the value of 1 if the mother engages in the respective activities, and 0 otherwise.
Finally, I controlled for three further sociodemographic variables. The first of these was family size, as a number of studies (including Hanushek 1992) have shown this measure to be associated with a number of child outcomes, including cognitive ability. Although there is significant debate about whether this relationship is causal (this issue is discussed in great
detail in Chapter 5), I included this variable to further account for the potential impact of family structure on child cognitive ability.
The second sociodemographic variable I included was maternal employment. A number of studies (such as Waldfogel et al., 2002) have shown that maternal employment may be associated with child outcomes such as cognitive ability through a number of pathways, such as maternal allocation of time or the resources available to the household. It is again
particularly important to include such variables in the context of cross-cohort comparisons, given the significant changes in maternal employment levels in the UK over time. I included this variable as a dummy taking the value of 1 if the mother is employed and 0 otherwise12. The final variable I controlled for was a proxy measure of maternal education. A number of authors, for example Carneiro et al. (2013), have shown that maternal education levels may be significantly associated with child outcomes, with this association potentially mediated through maternal achievement beliefs or the ability to provide a stimulating home
environment for their children (Davis-Kean 2005). Once more, it is particularly important to include this variable in the analysis, given the increases in levels of maternal education over time. Due to data limitations, I included this measure as a dummy variable taking the value of 1 if the mother stayed in formal education past the minimum school leaving age at the time, and 0 otherwise.
Clearly, there is a wide range of other control variables that I would have also wanted to include in the empirical analysis, such as a variety of household measures relating to the home learning environment. However, due to the different survey structures, finding comparable variables for such measures was difficult, and therefore this was unfortunately not possible. Definitions for the variables that were included in the empirical analysis are shown in Tables 4.4-4.6.