All regressions used in the pooled OLS and the selection model are specified using robust standard errors. Using robust standard errors does not change the estimated coefficients, but it adjusts for serial correlation and heterogeneity that may exist between the error terms (Yun, 2003). If there is unobserved heterogeneity, the assumption of no serial correlation is violated and the error terms for different years will be correlated. Since the panel data used in this minor dissertation is a cross section observed over time, it is important to adjust, by means of robust standard errors, for any unobserved heterogeneity that might arise.
This section outlines the three econometric models that will be used to analyse the effects of trade union participation on wages in South Africa between 2008 and 2015. These techniques are pooled OLS regression, the selection model and the Oaxaca-Blinder decomposition. These methods are well suited to this minor dissertation because they combine individual-level factors, industry-specific factors and location-related factors in analysing wage differentials.
The first estimation model to be used is pooled OLS with a union dummy variable that captures the effect union participation has on wages. One drawback of the pooled OLS method is that it does not allow the estimation of differences in unobservable worker characteristics between union and non-union members (Addison & Hirsch, 1986). This implies that, if the unobservable variables influencing earnings are correlated with the decision to join a trade union, the estimates obtained using pooled OLS regression will be biased (Armstrong & Steenkamp, 2008).
Union participation in this minor dissertation is accounted for by using a selection mechanism (the Cragg hurdle model) as proposed by Cragg (1971). The Cragg hurdle model is used because it features the selection mechanism of the Heckman and Tobit models. The Heckman and Tobit models assume that union participation and an earnings decision can be modelled as one equation. The Cragg hurdle model, however, relaxes this assumption and models the two decisions separately (Eakins, 2014). The last method applied in this minor dissertation is the Oaxaca-Blinder decomposition, which is used primarily to measure whether there is discrimination in the labour market, i.e. to see whether the wage advantage of union members is indeed a union wage premium, or whether there are other differences in work characteristics that can also explain the wage gap.
4.3.1 Pooled OLS Regression
Pooled OLS wage regression estimates the impact selected variables have on wages (Blackburn, 2005). It is a single wage equation with an exogenous union membership dummy variable.
$$\log W_{it} = \alpha + \beta X_{it} + \delta \mu_{it} + \eta_{it} \tag{1}$$
Where:
The dependent variable (𝑙𝑜𝑔𝑊𝑖𝑡) is the log of monthly wages of individual i and 𝛸𝑖𝑡 consists of a vector of individual worker i’s characteristics such as education, gender, race, age, job tenure and geographical location, which may determine differences in wages. 𝛼 is the estimated intercept. Equation 1 also captures the union membership effect on monthly wages through the inclusion of a dummy variable (𝜇𝑖𝑡) and 𝜂𝑖𝑡 is the error term.
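The specification in equation 1 can be illustrated with a minimal sketch. The data below are hypothetical (a single education covariate and a union dummy, with log wages generated without noise), and the helper names `solve_linear_system` and `pooled_ols` are illustrative assumptions rather than the estimation routine used in this minor dissertation. The sketch only shows that the coefficient on the union dummy shifts the intercept of the log-wage equation.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so that row operations act on both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def pooled_ols(rows, log_wages):
    """Return OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, log_wages)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical observations: (years of education, union dummy).
workers = [(8, 0), (10, 0), (12, 0), (14, 0), (9, 1), (11, 1), (13, 1), (15, 1)]
# Assumed "true" parameters, chosen only for illustration.
TRUE_ALPHA, TRUE_BETA, TRUE_DELTA = 7.0, 0.10, 0.20

# Each design row is [1, X_it, mu_it], mirroring equation 1.
design = [[1.0, float(educ), float(union)] for educ, union in workers]
log_w = [TRUE_ALPHA + TRUE_BETA * educ + TRUE_DELTA * union for educ, union in workers]

alpha_hat, beta_hat, delta_hat = pooled_ols(design, log_w)
print(f"alpha={alpha_hat:.3f}, beta={beta_hat:.3f}, delta={delta_hat:.3f}")
```

Because the data are generated without an error term, the fitted coefficients coincide with the assumed parameters up to rounding. With real data the estimates would differ and robust standard errors would be computed in addition.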
Equation 1 assumes the 𝛽𝑖 coefficients to be the same for both unionised and non-unionised workers. That means individual worker i’s characteristics are assumed to be rewarded the same, whether an individual is unionised or not (Armstrong & Steenkamp, 2008). This implies that the wage setting mechanisms are the same in the union and non-union sectors. However, this is not the case, because unions usually bargain for higher wages for their members. The wages of union members will differ from those of non-union members, even when controlling for the different endowments of workers. One is expected to find evidence of a union wage premium due to union power (Casale & Posel, 2009).
Due to this limitation the minor dissertation uses a union/non-union model, allowing separate earnings regimes for the different sectors. This is done in order to assess if, after controlling for similar key determinants of wages, the residuals of the two models will be the same or not. The minor dissertation therefore estimates the pooled OLS wage equations for the pooled sample of union members (U) and pooled sample of non-union workers (N), separately:
$$\log W_{it}^{u} = \alpha^{u} + \beta^{u} X_{it}^{u} + \eta_{it}^{u} \tag{2}$$
$$\log W_{it}^{n} = \alpha^{n} + \beta^{n} X_{it}^{n} + \eta_{it}^{n} \tag{3}$$
Where:
𝑙𝑜𝑔𝑊𝑖𝑡 is the log of monthly wages for individual i in state j = U, N. 𝛸𝑖𝑡 is a vector of individual i’s personal and job characteristics, such as age, education, occupation, economic sector and location, that influence wages in each sector. 𝜂𝑖𝑡 is the error term and 𝛼 is the estimated intercept. Equation 2 represents an individual i who is a member of a trade union and who selected to be in the union sector (U), and equation 3 represents an individual i who does not belong to a trade union and falls under the non-union sector (N).
A union/non-union model will capture the union premium more accurately in sectors having different earnings structures. Equation 4 below will explain how union premium changes as you move from the pooled OLS to the union/non-union model.
The union/non-union model estimates the union premium by:
$$\hat{u} = \exp\left[(\beta^{u} - \beta^{n})\bar{X}\right] - 1 \tag{4}$$
𝑋̅ may be a vector of mean characteristics from the union sample, the non-union sample or the pooled sample (Azam & Rospabe, 2005). Using the non-union vector as the reference group means that the union premium should be interpreted as the additional earnings that non-union members would earn if they joined a union, given their existing attributes. If the union vector is used, the union premium represents the amount by which union members’ earnings would fall in the absence of union membership, given their existing attributes (Armstrong & Steenkamp, 2008).
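As a worked illustration of equation 4, the sketch below evaluates the premium at two different reference vectors. All coefficient values and mean characteristics are hypothetical, the intercept is folded into the coefficient vectors by giving the mean vector a leading 1 (an assumption made here for convenience), and the function name `union_premium` is illustrative only.

```python
import math


def union_premium(beta_union, beta_nonunion, x_bar):
    """Equation 4: exp[(beta_u - beta_n) . x_bar] - 1."""
    gap = sum((bu - bn) * x for bu, bn, x in zip(beta_union, beta_nonunion, x_bar))
    return math.exp(gap) - 1.0


# Hypothetical coefficients ordered as [intercept, education, tenure].
beta_u = [7.20, 0.08, 0.010]
beta_n = [7.00, 0.09, 0.012]

# Hypothetical mean characteristics; the leading 1.0 multiplies the intercept.
x_bar_nonunion = [1.0, 10.0, 5.0]
x_bar_union = [1.0, 12.0, 8.0]

# Premium non-union workers would gain if paid under the union wage structure.
premium_at_nonunion_means = union_premium(beta_u, beta_n, x_bar_nonunion)
# Amount by which union members' earnings would fall without membership.
premium_at_union_means = union_premium(beta_u, beta_n, x_bar_union)

print(f"{premium_at_nonunion_means:.4f} {premium_at_union_means:.4f}")
```

The two figures differ because the choice of reference vector changes the weights applied to the coefficient differences, which is why the interpretation of the premium depends on the reference group.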
The minor dissertation did not use fixed effects as a test for individual heterogeneity, because it has been argued that in the fixed effects model time-invariant variables such as gender are absorbed by the intercept, and that fixed effects estimates are particularly sensitive to measurement error in changing union status, which can bias the estimated coefficients towards zero (Freeman, 1980). Hence the union/non-union model is used rather than the fixed effects model. Moreover, Bell and Jones (2013) argued that the random effects model is preferable in most instances, because it analyses and separates both the within and between components of an effect explicitly and assesses how those effects vary over time and space, rather than assuming heterogeneity away with fixed effects. Gujarati and Porter (2009) add that heterogeneity is not a technical problem calling for an econometric solution, but a reflection of the fact that we have not started on our proper business of trying to understand what is going on.
Trade union membership is treated as an exogenous variable when the pooled OLS model is estimated, thus causing estimators to be biased and inconsistent (Armstrong & Steenkamp, 2008).
Pooled OLS regression is also insufficient if one is interested in estimating what is happening at different points in the wage distribution (Yu, Lu & Stander, 2003).
Certain conditions apply to OLS estimates. These include the assumption that the regression errors are homoscedastic and normally distributed. Blackburn (2005) rejected the normality assumption for the error terms, indicating that estimates will be biased. Despite the listed shortcomings, pooled OLS regression serves as a useful benchmark against which to compare other techniques.
Since pooled OLS estimates might be biased and inconsistent, the Cragg hurdle model will be estimated. This selection model is included in the earnings function in an effort to obtain unbiased estimates of coefficients (Armstrong and Steenkamp, 2008).
4.3.2 Selection Model (Cragg hurdle Model)
One reason for wage gaps is endogeneity in union status, and this endogeneity is controlled for in this minor dissertation. The endogeneity arises because workers choose to join unions and employers may choose particular workers for union jobs, indicating that union members may not be a random sample of all workers (Casale and Posel, 2009). The selection model allows for endogenous union status (Ui) and controls for possible reverse causality between earnings potential and union participation (Azam & Rospabe, 2007), because selection into unions cannot be determined separately from the wages earned by unionised members (Bhorat, Goga & Van der Westhuizen, 2012). Endogeneity arises when the likely correlation between the independent choice variable and unobserved variables captured by the error term is not controlled for (Armstrong and Steenkamp, 2008).
This study controls for the endogeneity by using a selection model known as the Cragg hurdle model, which can be defined as a technique that fits a linear or an exponential hurdle model for a bounded dependent variable (Wooldridge, 2002). Wooldridge (2002) further explained that the Cragg hurdle model combines a selection model that determines the boundary points of the dependent variable with an outcome model that determines its non-bounded values. Separate independent covariates are permitted for each model. This implies that observations where the dependent variable is equal to one of the boundary values are not the result of an inability to observe the distribution above or below a certain point (Belotti, Deb, Manning, & Norton, 2015).
Cragg hurdle models are characterised by the relationship 𝑦𝑖 = 𝑠𝑖ℎ𝑖∗, where 𝑦𝑖 is the observed value of the dependent variable. The selection variable, 𝑠𝑖, is 1 if the dependent variable is not bounded and 0 otherwise. In the Cragg hurdle model, the lower limit that binds the dependent variable is 0, so the selection model is:
$$s_i = \begin{cases} 1 & \text{if } z_i\gamma + \varepsilon_i > 0 \\ 0 & \text{otherwise} \end{cases} \tag{5}$$
where 𝑧𝑖 is a vector of explanatory variables, 𝛾 is a vector of coefficients, and 𝜀𝑖 is a standard normal error term. The Cragg hurdle model allows a different lower limit to be specified, and conditional heteroscedasticity of the random error 𝜀𝑖 is allowed if the heteroscedasticity sub-option is specified in the selection model. The continuous latent variable ℎ𝑖∗ is observed only if 𝑠𝑖 = 1. The outcome model for the minor dissertation is the exponential model, as proposed in Cragg (1971):
$$h_i^{*} = \exp(x_i\beta + v_i) \tag{6}$$
where 𝑥𝑖 is a vector of explanatory variables, 𝛽 is a vector of coefficients, and 𝑣𝑖 is an error term. In the exponential model, 𝑣𝑖 has a normal distribution. The Cragg hurdle model allows for conditional heteroscedasticity of the random error 𝑣𝑖 if the user specifies the heteroscedasticity option. The parameters and regressors in the models for ℎ𝑖∗ and for 𝑠𝑖 may differ.
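The two-part structure of equations 5 and 6 can be made concrete with a small simulation. The sketch below draws observations from the data-generating process 𝑦𝑖 = 𝑠𝑖ℎ𝑖∗; it does not estimate the model. The coefficient values, the single education covariate and the standard deviation of 0.5 for the outcome error are all hypothetical assumptions chosen for illustration.

```python
import math
import random


def simulate_hurdle(z, x, gamma, beta, rng):
    """Draw one (s_i, y_i) pair with y_i = s_i * h_i* (equations 5 and 6)."""
    eps = rng.gauss(0.0, 1.0)  # standard normal error in the selection model
    v = rng.gauss(0.0, 0.5)    # normal error in the outcome model (sd assumed)
    # Equation 5: the hurdle is crossed when z_i * gamma + eps_i > 0.
    s = 1 if sum(zi * g for zi, g in zip(z, gamma)) + eps > 0 else 0
    # Equation 6: exponential outcome model for the latent variable.
    h_star = math.exp(sum(xi * b for xi, b in zip(x, beta)) + v)
    return s, s * h_star


rng = random.Random(0)   # fixed seed so the draws are reproducible
gamma = [-0.5, 0.1]      # hypothetical selection coefficients
beta = [7.0, 0.08]       # hypothetical outcome coefficients

draws = []
for _ in range(1000):
    educ = rng.uniform(6, 16)  # hypothetical years of education
    # The same covariate enters both parts here, but with different coefficients.
    draws.append(simulate_hurdle([1.0, educ], [1.0, educ], gamma, beta, rng))

share_selected = sum(s for s, _ in draws) / len(draws)
print(f"share crossing the hurdle: {share_selected:.3f}")
```

Observations that do not cross the hurdle are recorded at the lower limit of 0, while those that do cross it take strictly positive values from the exponential outcome model.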
The selection model differs from the pooled OLS regression model, where Ui enters as an exogenous dummy variable (Azam & Rospabe, 2007). According to Armstrong and Steenkamp (2008), endogenous union membership should be modelled as a latent variable Ui*. This latent variable is supposed to be a linear function of a set of exogenous variables. The set of exogenous variables should include at least one variable that relates to union status and not to wages. However, this leads to a number of problems, for instance if unobservable variables influencing wages are correlated with the union membership decision. The sample will then not be random and the estimates obtained will be biased. If workers in union jobs differ in unobservable ways from workers in non-union jobs, and if these omitted characteristics are related to earnings, then OLS estimates of the wage gap between the union and the non-union sectors will be biased. It can therefore be suggested that a selection model like that used by Cragg (1971) be included as one of the estimation techniques. Cragg (1971) proposed the Cragg hurdle model in a study that explained the demand for durable goods. The model has been used to analyse individual decisions, such as money donated to charity, cigarette consumption and time spent volunteering. The Cragg hurdle model is therefore used in this minor dissertation, as it mainly models an individual’s decisions, such as union participation.
The reviewed studies have typically used an exclusion restriction to predict union membership. The exclusion restriction that was consistently used in the selection equations across the studies reviewed was a dummy variable for whether the individual lives with other union members. The rationale for this exclusion restriction, as explained by Casale and Posel (2009), is that it reflects “household-specific tastes for unionisation, such as the willingness to invest union dues and time in meetings for the sake of long-term security and wage gains”. It may also reflect firm strategies of recruitment of family members by employers. However, this minor dissertation will use the exclusion restriction as per equation 7 below.
According to Armstrong and Steenkamp (2008) the earnings regression can be specified as follows:
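$$w_{it} = \alpha + \beta X_{it} + \delta U_{it} + \rho_{it}\,\delta\,\varepsilon_{it}\left\{\frac{\phi(\gamma Z_{it})}{\Phi(\gamma Z_{it})}\right\} + \eta_{it} \tag{7}$$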
Where:
𝑤𝑖𝑡 is monthly wages for individual i and 𝑋𝑖𝑡 is a vector of individual i’s personal and job characteristics, such as age, education, occupation, economic sector and location, that influence wages in each sector. Equation 7 also captures the effect endogenous union status might have on monthly earnings through the inclusion of a dummy variable (𝑈𝑖𝑡). Variable 𝜌𝑖𝑡 is known as the self-selection variable and indicates the union membership decision through the ratio 𝜙(𝛾𝑍𝑖𝑡)/Φ(𝛾𝑍𝑖𝑡), while 𝜂𝑖𝑡 is the error term.
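The ratio of the standard normal density to the standard normal distribution function that appears in equation 7, commonly called the inverse Mills ratio, can be computed directly. The sketch below uses only the Python standard library; the coefficient and covariate values, including the hypothetical "lives with a union member" dummy in the selection vector, are assumptions for illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def linear_index(gamma, z):
    """Linear index gamma * Z_it from the selection equation."""
    return sum(g * zi for g, zi in zip(gamma, z))


def selection_term(index):
    """Ratio phi(index) / Phi(index) evaluated at the linear index."""
    return _STD_NORMAL.pdf(index) / _STD_NORMAL.cdf(index)


# Hypothetical selection coefficients: [constant, lives with a union member].
gamma = [-0.3, 0.8]
z_without = [1.0, 0.0]
z_with = [1.0, 1.0]

term_without = selection_term(linear_index(gamma, z_without))
term_with = selection_term(linear_index(gamma, z_with))
print(f"{term_without:.4f} {term_with:.4f}")
```

The term falls as the linear index rises, so under these assumed coefficients an individual who is more likely to be a union member receives a smaller selection correction.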
The selection model (Cragg hurdle model) models the individual’s decision to participate as a union member, as well as the earnings decision. This selection model assumes that, for an individual to attain the union wage premium, they must first participate as a member of a trade union. Secondly, an individual must be employed in the labour market so that he or she can receive wages (Eakins, 2014). The returns to productive characteristics are assumed constant over union and non-union workers. This implies that there is an intercept effect, as was the case with the pooled OLS estimates (Armstrong and Steenkamp, 2008). Since the selection model draws from the pooled OLS estimates, the model will suffer from the same problems as pooled OLS. Furthermore, attempts to control for selection bias in our estimations are complicated because selection may also occur at other stages of the employment decision. This is particularly relevant in South Africa, where very high unemployment rates mean that employment and labour force participation are not synonymous (Casale and Posel, 2009). Because none of these methods has proved superior, the sensitivity of the results to the different specifications will be tested.
4.3.3 Oaxaca-Blinder Decomposition
The Oaxaca-Blinder (1973) decomposition compares the wage structures of union and non-union members. It then decomposes the mean wage gap between union and non-union members into explained and unexplained components (Oaxaca & Ransom, 2001). The explained component arises from the differences in the average productive characteristics of union and non-union members. The unexplained component results from differences in the compensation structures between union and non-union members. The unexplained component is believed to be an estimate of discrimination in the labour market (Oaxaca & Ransom, 2001). This minor dissertation does not expect the union wage premium to be explained by differences in worker characteristics.
This minor dissertation has two groups, union (A) and non-union (B) workers, an outcome variable, log wages (Y), and a set of predictors such as education, job tenure and further variables that are discussed below. The decomposition equation that is drawn from the wage equations of union and non-union workers is specified by Jann (2008) as follows:
Based on the linear model
$$Y_\ell = X_\ell'\beta_\ell + \varepsilon_\ell, \quad E(\varepsilon_\ell) = 0, \quad \ell \in \{A, B\} \tag{8}$$
where 𝑋 is a vector containing the predictors and a constant, 𝛽 contains the slope parameters and the intercept, and 𝜀 is the error, the mean outcome difference can be expressed as the difference in the linear prediction at the group-specific means of the regressors. That is
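$$R = E(Y_A) - E(Y_B) = E(X_A)'\beta_A - E(X_B)'\beta_B \tag{9}$$
since
$$E(Y_\ell) = E(X_\ell'\beta_\ell + \varepsilon_\ell) = E(X_\ell'\beta_\ell) + E(\varepsilon_\ell) = E(X_\ell)'\beta_\ell$$
with $E(\beta_\ell) = \beta_\ell$ and $E(\varepsilon_\ell) = 0$ by assumption.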
To identify the contribution of group differences in predictors to the overall outcome difference, equation (9) can be rearranged, as follows (Daymont & Andrisani 1984):
$$R = [E(X_A) - E(X_B)]'\beta_B + E(X_B)'(\beta_A - \beta_B) + [E(X_A) - E(X_B)]'(\beta_A - \beta_B) \tag{10}$$
This is the three-fold decomposition that this minor dissertation will use, in which the outcome difference is divided into three parts (Jann, 2008):
𝑅 = 𝐸 + 𝐶 + 𝐼
The first summand
𝐸 = [𝐸(𝑋𝐴) − 𝐸(𝑋𝐵)]′ 𝛽𝐵
amounts to the part of the differential that is due to group differences in the predictors (the endowments effect). The second component
𝐶 = 𝐸(𝑋𝐵) ′ (𝛽𝐴 − 𝛽𝐵)
measures the contribution of differences in the coefficients (including differences in the intercept). The third summand
𝐼 = [𝐸(𝑋𝐴) − 𝐸(𝑋𝐵)]′ (𝛽𝐴 − 𝛽𝐵)
is an interaction term accounting for the fact that differences in endowments and coefficients exist simultaneously between the two groups.
The decomposition in equation (10) is formulated from the viewpoint of the non-union group. That is, the group differences in the predictors are weighted by the coefficients of the non-union group to determine the endowments effect (E). In other words, the E component measures the expected change in the non-union group’s mean outcome if non-union workers had the union group’s predictor levels. Similarly, for the second component (C), the differences in coefficients are weighted by the non-union group’s predictor levels. That is, the second component measures the expected change in the non-union group’s mean outcome if non-union workers had the union group’s coefficients (Jann, 2008).
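The three-fold decomposition in equation (10) can be checked numerically. The sketch below uses hypothetical group means and coefficients (ordered as constant, education, tenure) with union workers as group A and non-union workers as group B; the function names are illustrative and the numbers are not results from this minor dissertation. It confirms that the endowments, coefficients and interaction components add up to the raw gap in equation (9).

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def threefold_decomposition(mean_x_a, mean_x_b, beta_a, beta_b):
    """Return (R, E, C, I) from equations 9 and 10, viewed from group B."""
    dx = [a - b for a, b in zip(mean_x_a, mean_x_b)]  # E(X_A) - E(X_B)
    db = [a - b for a, b in zip(beta_a, beta_b)]      # beta_A - beta_B
    raw_gap = dot(mean_x_a, beta_a) - dot(mean_x_b, beta_b)
    endowments = dot(dx, beta_b)       # E: predictor gap at B's coefficients
    coefficients = dot(mean_x_b, db)   # C: coefficient gap at B's predictors
    interaction = dot(dx, db)          # I: both gaps occurring together
    return raw_gap, endowments, coefficients, interaction


# Hypothetical means and coefficients: [constant, education, tenure].
mean_x_union = [1.0, 12.0, 8.0]
mean_x_nonunion = [1.0, 10.0, 5.0]
beta_union = [7.20, 0.08, 0.010]
beta_nonunion = [7.00, 0.09, 0.012]

R, E, C, I = threefold_decomposition(
    mean_x_union, mean_x_nonunion, beta_union, beta_nonunion
)
print(f"R={R:.3f} E={E:.3f} C={C:.3f} I={I:.3f}")
```

With these assumed inputs the endowments component is positive, the coefficients component is positive and the interaction is slightly negative, and the three components sum to the raw log-wage gap.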
The Oaxaca-Blinder decomposition technique is used, because decomposition using the pooled wage structure results in a more adequate reflection of competitive structures in the labour market, and it also produces the lowest standard errors for estimated differences (Neumark, 1988).