The fixed effects estimating equation incorporates the impact of observed characteristics and time-invariant unobserved individual heterogeneity on the log of real hourly wage, and is given by:
Equation 11
$$\ln w_{it} = \alpha + \beta X_{it} + \mu_i + \varepsilon_{it}$$
where $w_{it}$ is the real hourly wage for individual $i$; $\alpha$ is an intercept term; $\beta$ is a vector of coefficients for the individual’s characteristics; $X_{it}$ are observed labour market characteristics of the individual; $\mu_i$ are the individual unobserved effects or heterogeneity; $t$ refers to time; and $\varepsilon_{it}$ is a random error term.
Estimating Equation 11 without the time-invariant unobserved individual effects produces biased estimates of $\beta$, because individuals make labour market decisions, differ in productivity and motivation, and face discrimination, all of which are unobserved. Suppose that $\mu_i$ denotes discrimination against women, which has a positive impact on the gender wage gap and a negative impact on the hourly earnings of women. Then the coefficient on the gender dummy variable estimated without the individual effects will be negatively biased as a result of omitting the unobserved $\mu_i$. Once unobserved heterogeneity is controlled for, the gender wage gap is expected to decline. Failure to control for unobserved characteristics therefore results in omitted variable bias in the coefficient estimate $\hat{\beta}$.
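The direction of this bias can be illustrated with the standard omitted variable bias result. The expression below is a sketch for a single regressor rather than a derivation taken from this thesis; the notation is assumed here, with $x_{it}$ standing for one observed characteristic, $\mu_i$ for the omitted individual effect, and $\tilde{\beta}$ for the pooled OLS slope estimated without $\mu_i$.

```latex
\operatorname{plim}\,\tilde{\beta}
  = \beta + \frac{\operatorname{Cov}(x_{it}, \mu_i)}{\operatorname{Var}(x_{it})}
```

The second term is zero only when the omitted effect is uncorrelated with the regressor. If $x_{it}$ is the female dummy and $\mu_i$ lowers women’s wages, the covariance is negative and the estimated coefficient is biased downwards.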
As such, panel data techniques (fixed effects models) are utilised in this analysis to control for time-invariant unobserved individual heterogeneity by estimating Equation 11. The results from the fixed effects estimation are compared to results from OLS estimation of Equation 2 and Equation 3 using pooled person-year observations. The standard errors of the estimations are robust to heteroskedasticity and are clustered by respondents’ cross-wave identifier.
This conventional method of accounting for unobserved effects assumes that $\mu_i$ is a parameter that needs to be estimated for each individual $i$. Therefore, an intercept for each person ($\alpha + \mu_i$) is estimated. Here, the individual slopes are not allowed to vary, as the slope coefficient on $X_{it}$ is the same from one individual to the next.
The implementation of fixed effects models with individual slopes and intercepts is left for future work.
Equation 11 is estimated for men and women separately and shows the returns to male and female labour market characteristics while accounting for the time-invariant unobserved individual heterogeneity. More importantly, estimating Equation 11 separately by gender allows for a more robust estimation of the gender wage gap. The coefficient estimates ($\hat{\beta}$) on the time-varying explanatory variables ($X_{it}$) in Equation 11 show the impact of these variables on average hourly wages net of the effect of time-constant variables. For example, the coefficient estimate on experience presents the impact of an additional year of experience in the labour market on average hourly wages, net of the impact of time-constant factors such as motivation and innate ability.
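The fixed effects idea can be illustrated with a minimal within-estimator sketch. This is not the estimation code used in the thesis: the function name `within_estimator`, the simulated panel, and all parameter values are hypothetical, and the clustered standard errors mentioned above are not computed. Each variable is demeaned by person, which removes the intercept and any time-invariant individual effect before OLS is applied.

```python
import numpy as np

def within_estimator(y, X, ids):
    """Fixed effects (within) estimator of the common slope coefficients."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    # Map each person identifier to a group index 0..n-1.
    _, group = np.unique(ids, return_inverse=True)
    counts = np.bincount(group)
    # Person-specific means of the outcome and of each regressor.
    y_bar = np.bincount(group, weights=y) / counts
    X_bar = np.column_stack(
        [np.bincount(group, weights=X[:, k]) / counts for k in range(X.shape[1])]
    )
    # Demeaning removes the intercept and any time-invariant individual effect.
    y_dm = y - y_bar[group]
    X_dm = X - X_bar[group]
    beta, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return beta

# Simulated panel (hypothetical data): the individual effect mu is
# correlated with the regressor x, so pooled OLS is biased.
rng = np.random.default_rng(0)
n_people, n_waves = 300, 6
ids = np.repeat(np.arange(n_people), n_waves)
mu = rng.normal(size=n_people)
x = 0.8 * mu[ids] + rng.normal(size=ids.size)
log_wage = 1.0 + 0.05 * x + mu[ids] + 0.1 * rng.normal(size=ids.size)

beta_fe = within_estimator(log_wage, x[:, None], ids)

# Pooled OLS with an intercept, ignoring the individual effect.
design = np.column_stack([np.ones_like(x), x])
beta_ols = np.linalg.lstsq(design, log_wage, rcond=None)[0]

print("fixed effects slope:", beta_fe[0])
print("pooled OLS slope:  ", beta_ols[1])
```

In this simulation the true slope is 0.05. The within estimator should recover a value close to it, while the pooled OLS slope absorbs part of the omitted individual effect.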
To measure the impact of the occupation and industry of employment on men’s and women’s earnings, and therefore on the gender wage gap, while accounting for the time-invariant unobserved individual heterogeneity, workplace controls are estimated using Equation 12.
Equation 12
$$\ln w_{it} = \alpha + \beta X_{it} + \gamma Z_{it} + \mu_i + \varepsilon_{it}$$
where $w_{it}$ is the real hourly wage for individual $i$; $\alpha$ is an intercept term; $\beta$ is a vector of coefficients for the individual’s characteristics; $X_{it}$ are observed labour market characteristics of the individual; $\mu_i$ are the individual unobserved effects or heterogeneity; $Z_{it}$ represents the workplace characteristics (occupation and industry) and $\gamma$ represents the workplace specific effect; $t$ refers to time; and $\varepsilon_{it}$ is a random error term.
When estimating Equation 4 and Equation 12, it is assumed that the workplace and individual characteristics are not correlated, so that both effects can be observed. The workplace specific effect ($\gamma$) will also capture unobserved individual effects that are common to employees in an industry and occupation. However, the remaining idiosyncratic effects cannot be identified and will be captured by the residual. If the remaining individual effects are uncorrelated with workplace specific effects, the estimates will not be biased. However, it is understood that the choice of industry and occupation can be endogenous.
3.6 Gender wage gap
The gender wage gap is computed using Equation 13 and the relevant coefficient estimates for men and women from the estimations of the earnings functions discussed above. Equation 13 expresses the gap in terms of the exponentiated predicted mean log wages of men, $\exp(\sum_k \hat{\beta}_k^m \bar{X}_k^m)$, and of women, $\exp(\sum_k \hat{\beta}_k^f \bar{X}_k^f)$, where $\exp$ represents the exponential, $\bar{X}_k^m$ and $\bar{X}_k^f$ denote the male and female means of labour market characteristic $k$, and $\hat{\beta}_k^m$ and $\hat{\beta}_k^f$ are the corresponding male and female estimated coefficients (as denoted by the male ($m$) and female ($f$) superscripts).
3.7 Counterfactual wage decompositions
3.7.1 Mean decomposition
The decomposition undertaken at the sample mean in Chapters 5, 6, and 7 is an extension of the approach developed by Blinder (1973) and Oaxaca (1973), which is widely used and accepted in decomposing the mean gender wage gap. This extension follows Biewen (2014); refer to Appendix A for more details about the Biewen (2014) decomposition and a comparison of this method with more common decomposition methods. The Biewen (2014) decomposition is written as:
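Equation 14
$$\overline{\ln w^m} - \overline{\ln w^f} = \sum_k \hat{\beta}_k^f (\bar{X}_k^m - \bar{X}_k^f) + \sum_k \bar{X}_k^f (\hat{\beta}_k^m - \hat{\beta}_k^f) + \sum_k (\hat{\beta}_k^m - \hat{\beta}_k^f)(\bar{X}_k^m - \bar{X}_k^f)$$
where $\overline{\ln w^m}$ and $\overline{\ln w^f}$ are the male and female mean log real hourly wages, respectively. $\bar{X}_k^m$ and $\bar{X}_k^f$ denote the male and female means of labour market characteristic $k$, respectively, and $\hat{\beta}_k^m$ and $\hat{\beta}_k^f$ are the corresponding male and female estimated coefficients.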
The first component in Equation 14 of the wage decomposition is the endowment component. This component attributes the difference in wages between two groups to differences in labour market characteristics (endowments) such as education, work experience, and tenure. This component is generally referred to as the “explained” component of the decomposition. This part of the equation also represents the impact on the gender wage gap from a change in mean characteristics holding all else constant. The second component in Equation 14 is the coefficient component, which attributes the differences in wages between men and women to the rewards that they receive for their labour market characteristics (coefficients). This component is referred to as the “unexplained” component. The unexplained component is generally a measure of discrimination, but also includes the effects of group differences in time-invariant unobserved individual heterogeneity that the model does not capture (Jann 2008). Further, this component of the equation represents the impact on the gender wage gap from a change in the mean returns holding all else constant.
The final component in Equation 14 is the interaction component, which attributes the differences in wages between the two groups to the simultaneous impact of coefficients and endowments. As the impact of endowments and coefficients on earnings is not independent, the interaction between the two terms assists in explaining the interrelated impact of coefficients and endowments on the difference in earnings between men and women (Biewen 2014). In other words, this component represents the overall change in the gender wage gap that cannot be explained by changing the endowment or the coefficient component in isolation.
To undertake the mean decomposition, denote the male and female labour market returns by $\beta^m$ and $\beta^f$ and their characteristics by $X^m$ and $X^f$, respectively. This allows for the estimation of three counterfactual densities. The first is the female log wage density that would arise if women retained their labour market characteristics but were paid like men. The second is the female log wage density that would arise if women obtained men’s labour market characteristics but continued to be paid like women. The third is the female log wage density that would arise from a simultaneous change in women’s labour market characteristics and rewards for those labour market characteristics.
In “non-discriminatory” situations where men and women possess the same productive characteristics, the returns to the labour market characteristics of men and women would be equal ($\beta^m = \beta^f$) and no wage gap would be evident. Therefore, observed wage differences can be attributed to unequal treatment by gender, or to other time-invariant unobserved individual heterogeneity that the model fails to capture. A positive (negative) sign implies that market returns to characteristics for men are higher (lower) than the returns to characteristics for women.
The steps for the mean decomposition procedure are summarised below:
1. Use the log wage Equation 2 and Equation 3 to estimate $\hat{\beta}^m$ and $\hat{\beta}^f$ using the male and female datasets, respectively.
2. Estimate female wages by using women’s average characteristics ($\bar{X}^f$) and the estimated coefficients $\hat{\beta}^f$ from step (1). This produces women’s actual average earnings.
3. Predict the first wage density by using men’s average characteristics ($\bar{X}^m$) and the estimated coefficients $\hat{\beta}^f$ from step (1). This predicts the counterfactual wages of women if they had men’s labour market characteristics and continued to be paid like women.
4. Predict the second wage density by using women’s average characteristics ($\bar{X}^f$) and the estimated coefficients $\hat{\beta}^m$ from step (1). This produces the counterfactual wages of women if they retained their labour market characteristics and were paid like men.
5. Predict the third wage density by solving $\sum_k (\hat{\beta}_k^m - \hat{\beta}_k^f)(\bar{X}_k^m - \bar{X}_k^f)$. This produces the counterfactual wages of women if their labour market characteristics and returns to labour market characteristics changed to be the same as men’s.
6. To calculate the explained component of the decomposition, take the difference between women’s counterfactual wages from step (3) and women’s actual wages from step (2).
7. To calculate the unexplained component of the decomposition, take the difference between women’s counterfactual wages from step (4) and women’s actual wages from step (2).
8. To calculate the interaction component of the decomposition, take the difference between women’s counterfactual wages from step (5) and women’s actual wages from step (2).
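The arithmetic behind the three components can be sketched as follows. This is an illustration rather than the thesis’s code: the function name `threefold_decomposition` and every number are hypothetical, and the interaction term is computed directly as the sum of products of coefficient and characteristic differences, which is an assumption about how the steps are implemented.

```python
import numpy as np

def threefold_decomposition(beta_m, beta_f, xbar_m, xbar_f):
    """Endowment, coefficient and interaction components of the mean log wage gap."""
    beta_m, beta_f = np.asarray(beta_m, float), np.asarray(beta_f, float)
    xbar_m, xbar_f = np.asarray(xbar_m, float), np.asarray(xbar_f, float)
    actual_f = xbar_f @ beta_f                 # women's actual mean log wage
    cf_characteristics = xbar_m @ beta_f       # men's characteristics, women's returns
    cf_returns = xbar_f @ beta_m               # women's characteristics, men's returns
    endowment = cf_characteristics - actual_f  # "explained" component
    coefficient = cf_returns - actual_f        # "unexplained" component
    interaction = (beta_m - beta_f) @ (xbar_m - xbar_f)
    return endowment, coefficient, interaction

# Hypothetical values: intercept, years of education, years of experience.
beta_m = [1.90, 0.070, 0.020]
beta_f = [1.85, 0.065, 0.015]
xbar_m = [1.0, 12.5, 18.0]
xbar_f = [1.0, 13.0, 15.0]

endowment, coefficient, interaction = threefold_decomposition(
    beta_m, beta_f, xbar_m, xbar_f
)
raw_gap = np.dot(xbar_m, beta_m) - np.dot(xbar_f, beta_f)
print(endowment, coefficient, interaction, raw_gap)
```

By construction the three components add up to the raw gap in predicted mean log wages, which gives a quick consistency check on any implementation.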
3.7.2 Distributional decomposition
The distributional decomposition is undertaken in Chapters 5 and 7 using an Oaxaca-Blinder type bootstrap procedure developed by Machado and Mata (2005), which is applied to decompose the gender wage gap along the earnings distribution. This decomposition attributes the gender wage gap to differences in endowments and coefficients. This procedure involves the estimation of marginal wage densities that are consistent with conditional densities.
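The decomposition is written as:
Equation 15
$$\ln w^m(\theta) - \ln w^f(\theta) = \sum_k \hat{\beta}_k^f(\theta)\,\big(X_k^m(\theta) - X_k^f(\theta)\big) + \sum_k X_k^f(\theta)\,\big(\hat{\beta}_k^m(\theta) - \hat{\beta}_k^f(\theta)\big)$$
where $\ln w^m(\theta)$ and $\ln w^f(\theta)$ are the log real hourly wages of men and women, respectively, at the quantile ($\theta$) of interest. $X_k^m(\theta)$ and $X_k^f(\theta)$ denote the male and female values of labour market characteristic $k$ at the respective quantile ($\theta$) of interest, and $\hat{\beta}_k^m(\theta)$ and $\hat{\beta}_k^f(\theta)$ are the corresponding male and female estimated coefficients at the respective quantile ($\theta$) of interest.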
The steps for the distributional decomposition are summarised below:
1. Sample the $\theta$th quantile of interest using a standard uniform distribution.
2. Using the separate female and male datasets, estimate the log real hourly wage (Equation 6) at each quantile ($\theta$) from step (1) to obtain $\hat{\beta}^m(\theta)$ and $\hat{\beta}^f(\theta)$ for men and women, respectively.
3. Use women’s average characteristics at each quantile to predict female wages by using women’s estimated coefficients from step (2) at the respective quantiles.
4. Use men’s average characteristics and women’s estimated coefficients from step (2) at each quantile to predict the first set of wage densities. This calculates the counterfactual earnings of women if they had men’s labour market characteristics and continued to be paid like women at the respective quantiles.
5. Use women’s characteristics and men’s estimated coefficients from step (2) at each quantile to predict the second set of wage densities. This calculates the counterfactual earnings of women if they retained their labour market characteristics and were paid like men at the respective quantiles.
6. To calculate the explained component of the decomposition, take the difference between the female’s counterfactual wages from step (4) and female’s actual wages from step (3) at each quantile.
7. To calculate the unexplained component of the decomposition, take the difference between the female’s counterfactual wages from step (5) and female’s actual wages from step (3) at each quantile.
Steps (4) and (5) generate the two female counterfactual wages. The total number of draws (with replacement) is set to 5000.
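Once quantile-specific coefficients are available, the final steps of the procedure reduce to simple array arithmetic. The sketch below covers only that last part and is not the full Machado and Mata (2005) bootstrap: it assumes the coefficients have already been estimated by a quantile regression routine, and the function name `quantile_decomposition` and all numbers are hypothetical.

```python
import numpy as np

def quantile_decomposition(beta_m, beta_f, x_m, x_f):
    """Explained and unexplained components at each quantile.

    All inputs have shape (n_quantiles, n_characteristics).
    """
    beta_m, beta_f = np.asarray(beta_m, float), np.asarray(beta_f, float)
    x_m, x_f = np.asarray(x_m, float), np.asarray(x_f, float)
    actual_f = np.sum(x_f * beta_f, axis=1)   # women's predicted wages
    cf_chars = np.sum(x_m * beta_f, axis=1)   # men's characteristics, women's returns
    cf_returns = np.sum(x_f * beta_m, axis=1) # women's characteristics, men's returns
    explained = cf_chars - actual_f
    unexplained = cf_returns - actual_f
    return explained, unexplained

# Hypothetical inputs for two quantiles; columns are intercept and experience.
beta_f = [[1.0, 0.02], [1.2, 0.03]]
beta_m = [[1.1, 0.03], [1.4, 0.04]]
x_f = [[1.0, 10.0], [1.0, 12.0]]
x_m = [[1.0, 12.0], [1.0, 15.0]]

explained, unexplained = quantile_decomposition(beta_m, beta_f, x_m, x_f)
print(explained, unexplained)
```

Each row corresponds to one quantile, so the output shows how the explained and unexplained components vary along the wage distribution.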
3.7.3 Wellington decomposition
The Wellington (1993) decomposition is undertaken in Chapter 5 to attribute the change in the gender wage gap between 2001 and 2012 to endowment and coefficient components. This technique extends the one-period Oaxaca-Blinder decomposition into a two-period decomposition:
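Equation 16
$$\left(\overline{\ln w}_{12}^{m} - \overline{\ln w}_{01}^{m}\right) - \left(\overline{\ln w}_{12}^{f} - \overline{\ln w}_{01}^{f}\right) = \left[\hat{\beta}_{12}^{m}\left(\bar{X}_{12}^{m} - \bar{X}_{01}^{m}\right) - \hat{\beta}_{12}^{f}\left(\bar{X}_{12}^{f} - \bar{X}_{01}^{f}\right)\right] + \left[\bar{X}_{01}^{m}\left(\hat{\beta}_{12}^{m} - \hat{\beta}_{01}^{m}\right) - \bar{X}_{01}^{f}\left(\hat{\beta}_{12}^{f} - \hat{\beta}_{01}^{f}\right)\right]$$
where $\overline{\ln w}^{m}$ and $\overline{\ln w}^{f}$ refer to the male and female mean log real hourly wages in 2012 dollars, respectively. Subscripts 01 and 12 refer to 2001 and 2012, and superscripts $m$ and $f$ refer to male and female. $\hat{\beta}$ refers to the coefficient estimates, and $\bar{X}$ refers to the average characteristics for men and women in 2001 and 2012, as denoted by the subscripts.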
The first term of the decomposition expresses the change in the gender wage gap due to changes in the characteristics evaluated at the 2012 returns (coefficients). This component of the decomposition answers the question ‘if the returns to the independent variables were constant at their [2012] levels, what proportion of the wage gap can be accounted for by changes in the means?’ (Wellington 1993, p.393). The second term shows the proportion of the gender wage gap that can be explained by the changes in the coefficients between 2001 and 2012 evaluated at the 2001 means.
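The two-period decomposition can be sketched numerically as follows. This is an illustration under stated assumptions, not the thesis’s implementation: the function name `wellington_decomposition` and all values are hypothetical, characteristic changes are valued at 2012 coefficients, and coefficient changes are valued at 2001 means, as described above.

```python
import numpy as np

def wellington_decomposition(beta_m01, beta_m12, beta_f01, beta_f12,
                             x_m01, x_m12, x_f01, x_f12):
    """Split the 2001-2012 change in the gender wage gap into two parts."""
    args = [beta_m01, beta_m12, beta_f01, beta_f12, x_m01, x_m12, x_f01, x_f12]
    b_m01, b_m12, b_f01, b_f12, X_m01, X_m12, X_f01, X_f12 = (
        np.asarray(a, dtype=float) for a in args
    )
    # Changes in mean characteristics, valued at 2012 coefficients.
    endowment = b_m12 @ (X_m12 - X_m01) - b_f12 @ (X_f12 - X_f01)
    # Changes in coefficients, valued at 2001 mean characteristics.
    coefficient = X_m01 @ (b_m12 - b_m01) - X_f01 @ (b_f12 - b_f01)
    return endowment, coefficient

# Hypothetical values: intercept and years of experience.
beta_m01, beta_m12 = [2.00, 0.020], [2.10, 0.022]
beta_f01, beta_f12 = [1.90, 0.015], [2.05, 0.020]
x_m01, x_m12 = [1.0, 17.0], [1.0, 18.0]
x_f01, x_f12 = [1.0, 13.0], [1.0, 15.5]

endowment, coefficient = wellington_decomposition(
    beta_m01, beta_m12, beta_f01, beta_f12, x_m01, x_m12, x_f01, x_f12
)
# Change in the gap: change in men's mean log wage minus change in women's.
gap_change = (np.dot(x_m12, beta_m12) - np.dot(x_m01, beta_m01)) - (
    np.dot(x_f12, beta_f12) - np.dot(x_f01, beta_f01)
)
print(endowment, coefficient, gap_change)
```

With these weights the two components sum exactly to the change in the predicted gap, so a negative total indicates that the gap narrowed between the two years.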
In Chapter 5, a “distributional Wellington decomposition” is proposed and is specified by Equation 17.
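Equation 17
$$\left(\ln w_{12}^{m}(\theta) - \ln w_{01}^{m}(\theta)\right) - \left(\ln w_{12}^{f}(\theta) - \ln w_{01}^{f}(\theta)\right) = \left[\hat{\beta}_{12}^{m}(\theta)\left(X_{12}^{m}(\theta) - X_{01}^{m}(\theta)\right) - \hat{\beta}_{12}^{f}(\theta)\left(X_{12}^{f}(\theta) - X_{01}^{f}(\theta)\right)\right] + \left[X_{01}^{m}(\theta)\left(\hat{\beta}_{12}^{m}(\theta) - \hat{\beta}_{01}^{m}(\theta)\right) - X_{01}^{f}(\theta)\left(\hat{\beta}_{12}^{f}(\theta) - \hat{\beta}_{01}^{f}(\theta)\right)\right]$$
where $\ln w^{m}(\theta)$ and $\ln w^{f}(\theta)$ refer to the log real hourly wages of men and women in 2012 dollars, respectively, at the quantile ($\theta$) of interest. Subscripts 01 and 12 refer to 2001 and 2012, superscripts $m$ and $f$ refer to male and female, and $\theta$ refers to the quantile. $\hat{\beta}(\theta)$ refers to the estimated coefficients at the quantile ($\theta$) of interest and $X(\theta)$ refers to the characteristics of individuals at the quantile ($\theta$) of interest.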
The advantage of implementing the Wellington decomposition along the wage distribution is that the decomposition results will show how the gender wage gap has changed between 2001 and 2012 at each quantile of the wage distribution.
Appendix A
3.8 Decomposition methods
3.8.1 Oaxaca-Blinder decomposition
The Oaxaca (1973) and Blinder (1973) decomposition method is one of the most commonly used decomposition methods in the labour market literature. It is used to decompose the gender wage gap into endowment and coefficient components. In general, this decomposition can be written as:
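Equation 18
$$\bar{W}^m - \bar{W}^f = \beta^f(\bar{X}^m - \bar{X}^f) + \bar{X}^m(\beta^m - \beta^f)$$
or as
Equation 19
$$\bar{W}^m - \bar{W}^f = \beta^m(\bar{X}^m - \bar{X}^f) + \bar{X}^f(\beta^m - \beta^f)$$
where $\bar{W}^m$ and $\bar{W}^f$ are the male and female mean wages, respectively. $\bar{X}^m$ and $\bar{X}^f$ denote the male and female mean labour market characteristics, respectively, and $\beta^m$ and $\beta^f$ are the male and female estimated coefficients, respectively.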
The Oaxaca-Blinder decomposition has been generalised to various settings by Gomulka and Stern (1990), Fairlie (2005), Yun (2004), Machado and Mata (2005), Biewen and Jenkins (2005), and Bauer, Göhlmann and Sinning (2007).
3.9 Biewen decomposition
Equation 18 answers the question of why women’s average wages are less than men’s by attributing the difference to two sources: first, women’s less favourable characteristics (the first term) and, second, women’s lower returns (the second term). A third component that is not incorporated within this equation is one that attributes the difference between men’s and women’s mean earnings to a combined effect of characteristics and returns. This interaction term is presented in Equation 20 and was first introduced by Winsborough and Dickinson (1971). The interaction term would have a value of zero if $\bar{X}^m = \bar{X}^f$ or if $\beta^m = \beta^f$. However, it would be inappropriate to assign the interaction term to either the impact of characteristics or returns (Jones & Kelley 1984). As discussed by Jones and Kelley (1984), in Equation 18 the interaction term is assigned to the returns effect, which is linked to the idea that the gender wage gap would be zero if women’s returns increased to be equal to men’s. In Equation 19, by contrast, it is expected that the gender wage gap would be zero if men’s returns were reduced to be the same as women’s. Realistically, neither case appears completely plausible. The Biewen (2014) decomposition is represented by Equation 20, which includes an interaction term.
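Equation 20
$$\bar{W}^m - \bar{W}^f = \beta^f(\bar{X}^m - \bar{X}^f) + \bar{X}^f(\beta^m - \beta^f) + (\bar{X}^m - \bar{X}^f)(\beta^m - \beta^f)$$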
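Expanding this equation to incorporate multiple explanatory variables, Equation 20 can be written as:
Equation 21
$$\overline{\ln w^m} - \overline{\ln w^f} = \sum_k \hat{\beta}_k^f (\bar{X}_k^m - \bar{X}_k^f) + \sum_k \bar{X}_k^f (\hat{\beta}_k^m - \hat{\beta}_k^f) + \sum_k (\hat{\beta}_k^m - \hat{\beta}_k^f)(\bar{X}_k^m - \bar{X}_k^f)$$
where $\overline{\ln w^m}$ and $\overline{\ln w^f}$ are the male and female mean log real hourly wages, respectively. $\bar{X}_k^m$ and $\bar{X}_k^f$ denote the male and female means of labour market characteristic $k$, respectively, and $\hat{\beta}_k^m$ and $\hat{\beta}_k^f$ are the corresponding male and female estimated coefficients.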
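How the two-component forms absorb the interaction term can be checked algebraically. The lines below are a sketch for a single characteristic, with $\bar{X}^g$ and $\beta^g$ ($g = m, f$) denoting group means and coefficients. They assume that the returns term in Equation 18 is weighted by men’s mean characteristics and that the characteristics term in Equation 19 is weighted by men’s coefficients, which is what the assignment of the interaction term described above implies.

```latex
\begin{aligned}
\bar{X}^m(\beta^m - \beta^f)
  &= \bar{X}^f(\beta^m - \beta^f) + (\bar{X}^m - \bar{X}^f)(\beta^m - \beta^f) \\
\beta^m(\bar{X}^m - \bar{X}^f)
  &= \beta^f(\bar{X}^m - \bar{X}^f) + (\bar{X}^m - \bar{X}^f)(\beta^m - \beta^f)
\end{aligned}
```

The first line shows the interaction folded into the returns effect and the second shows it folded into the characteristics effect. Reporting the interaction separately avoids having to choose between the two.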
In this thesis, Equation 21 is used to decompose the gender wage gap at the sample mean. The Machado and Mata (2005) method is used to decompose the distributional gender wage gap and the Wellington (1993) decomposition is used to decompose changes in the gender wage gap over time. The distributional decomposition and the dynamic decomposition used in this thesis do not incorporate an interaction term. The incorporation of an interaction term within these decompositions is left for future research.
Using the Biewen (2014) decomposition method has implications for how the decomposition results are interpreted. As noted in Chapter 3, the three components on the right hand side of Equation 21 are interpreted as:
1. The endowment component, which attributes the difference in wages between men and women to differences in labour market characteristics (endowments) such as education, work experience, and tenure. This component is generally referred to as the “explained” component of the decomposition and represents the impact on the gender wage gap from a change in mean characteristics, holding all else constant.
2. The coefficient component, which attributes the difference in wages between men and women to the rewards that they receive for their labour market characteristics (coefficients). This component is referred to as the “unexplained” component and is generally used as a measure of discrimination. It also includes the effects of group differences in unobserved individual heterogeneity that the model does not capture (Jann 2008) and represents the impact on the gender wage gap from a change in the mean returns, holding all else constant.
3. The interaction component, which attributes the differences in wages between the two groups to the simultaneous impact of coefficients and endowments. As the impact of endowments and coefficients on earnings is not independent, the interaction between the two terms assists in explaining the interrelated impact of coefficients and endowments on the difference in earnings between men and women (Biewen 2014). In other words, this component represents the overall change in the gender wage gap that cannot be explained by changing the endowment or the coefficient component in isolation.
As outlined by Biewen (2014), the disadvantage of the most commonly used decomposition methods in the literature is that they are sequential and lead to path-dependence. That is, the decomposition results depend on the ordering of the contributing components. Further, conventional decomposition methods do not allow for an interaction between different components and are not aggregation consistent. This implies that, if one factor is disaggregated into multiple sub-factors, the contribution of the components of the decomposition may change.
The decomposition proposed by Biewen (2014) is path-independent, and allows for aggregation consistency. This can be written in general notation as:
Equation 22
( ( ( ( ( where the components on the right hand side denote: 1) the impact of factor one holding all else constant ( ; 2) the impact of the second factor holding all else constant
( ; and 3) the interaction effect that would occur if both factor one and factor