A possible problem related to the estimation of (42) is that the various price indices are highly correlated. If the explanatory variables are not independent of each other but correlated to a “large” extent, the estimated coefficients lose predictive power. The variance of the estimators increases, which inflates the standard errors and reduces significance. In addition, the coefficients become sensitive to changes in model specification. These are symptoms of a high degree of multicollinearity.
To test whether multicollinearity was a problem, I regressed each explanatory
variable j on the −j other variables. The Variance Inflation Factor (VIF),
explanatory variables, indicating a severe multicollinearity problem. Three possible remedies were considered.
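As an illustration of the auxiliary-regression approach, the VIF of regressor j is conventionally defined as 1 / (1 − R²_j), where R²_j is the R-squared from regressing variable j on the remaining regressors. The following is a minimal sketch in Python with NumPy, not the code used for the estimates in this paper: the function name `vif` is hypothetical and the data are synthetic log price series sharing a common trend, not the Statistics Norway indices. A commonly cited (and somewhat arbitrary) rule of thumb treats values above 10 as a warning sign.

```python
import numpy as np

def vif(X):
    """Variance Inflation Factor for each column of X (n observations x k regressors)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Auxiliary regression of variable j on a constant and all other regressors.
        Z = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ coef
        ssr = resid @ resid
        sst = (y - y.mean()) @ (y - y.mean())
        r2 = 1.0 - ssr / sst
        out[j] = 1.0 / (1.0 - r2)
    return out

# Synthetic illustration only: three log price series driven by one common trend.
rng = np.random.default_rng(0)
trend = np.linspace(0.0, 0.3, 11)
log_prices = np.column_stack([trend + rng.normal(0.0, 0.005, 11) for _ in range(3)])
vifs = vif(log_prices)
print(vifs)
```

Series that move together this closely produce very large VIFs, whereas mutually uncorrelated regressors give values close to one.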
The first-best approach to a multicollinearity problem is always to obtain more and “better” (less correlated) data (Wooldridge 2009). Unfortunately, because different standards (consumption good classifications) were used for consumer surveys before 1998, this was not possible.
The second option considered was Tikhonov regularization, also known as ridge regression. Ridge regression produces estimates with lower variance that are, however, somewhat biased. The rationale for running a ridge regression is the bias-variance tradeoff: unbiased estimators with high variance can be traded for somewhat biased coefficients with lower variance. I did not, however, carry out this estimation here.
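For reference, the textbook form of the ridge estimator is given below. The symbols X (regressor matrix), y (dependent variable) and λ (penalty parameter) are introduced here for illustration only and do not refer to any estimation carried out in this paper.

```latex
\hat{\beta}^{\text{ridge}} = (X'X + \lambda I)^{-1} X'y, \qquad \lambda \ge 0
```

Setting λ = 0 returns the OLS estimator, while a larger λ shrinks the coefficients towards zero, which lowers their variance at the cost of introducing bias.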
The third and chosen method actually consists of two separate operations. Since the price indices are highly correlated, some form of standardization could reduce the correlation, for instance by looking at how much the price of consumption good category j has changed relative to the overall price level. Each price variable was therefore deflated by the overall consumer price index P. The interpretation of the γ's would then be the effect on demand of an increase in the price of good j relative to the average good. Unfortunately, the correlations were reduced, but not by much: when the VIF estimations were repeated, the problem appeared to persist. The final remedy, since all the price indices tell much the same story, was to drop all prices except the one corresponding to the dependent variable. The price effect, γi, then measures the effect of a 1% increase in the price of good category i, relative to the average good, on good category i's budget share.
The model estimated will therefore be the one given below, where subscript ei denotes equivalent income.
$$ w_i = \alpha_i + \gamma_i \log\frac{P_i}{P} + \beta_i \log\frac{X_{ei}}{P} \qquad (43) $$
The interpretation of the coefficients is as follows. αi is the default budget share, i.e. what the budget share would be if the price of the good were identical to the average price and real income were 1. βi measures the change in the budget share of good i when real income increases by 1%. γi is the price effect already covered above.
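To make the specification concrete, the sketch below fits a single budget-share equation of this form by least squares. It is an illustration under stated assumptions, not the estimation code behind the reported results: the function name `estimate_budget_share`, the sample size and the "true" coefficients are all hypothetical, and the data are synthetic and noise-free so that the fit can be checked against known values.

```python
import numpy as np

def estimate_budget_share(w, p_i, p, x_e):
    """OLS of budget share w on a constant, log(p_i / p) and log(x_e / p).

    Returns the estimated (alpha, gamma, beta).
    """
    w, p_i, p, x_e = (np.asarray(a, dtype=float) for a in (w, p_i, p, x_e))
    Z = np.column_stack([np.ones_like(w), np.log(p_i / p), np.log(x_e / p)])
    coef, *_ = np.linalg.lstsq(Z, w, rcond=None)
    return tuple(coef)

# Synthetic data (hypothetical values, chosen only for illustration).
rng = np.random.default_rng(1)
n = 50
p = np.exp(rng.normal(0.0, 0.1, n))           # overall price index
p_i = p * np.exp(rng.normal(0.0, 0.05, n))    # price index of good i
x_e = p * np.exp(rng.normal(12.0, 0.3, n))    # nominal equivalent income
true_coefs = (0.5, 0.02, -0.03)               # assumed alpha, gamma, beta
w = (true_coefs[0]
     + true_coefs[1] * np.log(p_i / p)
     + true_coefs[2] * np.log(x_e / p))       # noise-free budget shares

alpha_hat, gamma_hat, beta_hat = estimate_budget_share(w, p_i, p, x_e)
print(alpha_hat, gamma_hat, beta_hat)
```

Because no noise is added, the estimates coincide with the assumed coefficients up to numerical precision; with real data they would of course carry sampling error.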
To obtain these coefficients, I use data from the Consumer Survey of Statistics Norway for the period 1998-2008. These data report average consumption expenditures on various categories, for different household types, for each year.¹⁶ The sector-specific and general price indices are obtained from Statistics Norway’s price statistics for the same period. The Consumer Surveys cover five types of households: single-individual households, couples with children aged 0-6, couples with children aged 7-19, couples without children, and single providers with young (0-19) children.
¹⁶ As a matter of fact, Consumer Surveys are undertaken within 3-year intervals. The numbers reported as the average for year t in this paper are actually the average of the Consumer Survey over the period t−1 to t+1.
Since (43) is expressed in equivalent-income form, budgets have to be scaled according to the expected equivalence scale of the respective household types. The expected equivalence scales are therefore calculated as follows: the probability of having X children, given that the household is in the relevant group, is multiplied by the corresponding equivalence scale, and the products are summed over X. For households with three or more children, data are not reported, so these are given an equivalence scale corresponding to a similar household with three children. All data are taken from Statistics Norway’s demographic statistics.
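The calculation of an expected equivalence scale can be sketched as a probability-weighted average. The example below is illustrative only: the function name, the probabilities and the scale values are hypothetical and are not taken from Statistics Norway, and dividing the household budget by the scale is an assumption about how the scaling is applied.

```python
def expected_equivalence_scale(child_probs, scale_by_children):
    """Probability-weighted average of equivalence scales.

    child_probs: {number of children: probability}, conditional on the household group.
    scale_by_children: {number of children: equivalence scale}.
    """
    total = sum(child_probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("conditional probabilities must sum to one")
    return sum(prob * scale_by_children[k] for k, prob in child_probs.items())

# Hypothetical inputs; the key 3 stands for "three or more children",
# which is assigned the scale of a household with exactly three children.
probs = {1: 0.4, 2: 0.4, 3: 0.2}
scales = {1: 1.8, 2: 2.1, 3: 2.4}

scale = expected_equivalence_scale(probs, scales)
# Assumption: equivalent income is the household budget divided by the scale.
equivalent_budget = 600_000 / scale
print(scale, equivalent_budget)
```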
In principle, (43) can be estimated in multiple ways. Two approaches are especially popular when estimating such systems. The first is standard OLS, equation by equation. The alternative is to estimate the system by Feasible Generalized Least Squares (known as Seemingly Unrelated Regressions). However, the OLS coefficients are consistent provided that the error term is uncorrelated with the regressors and has an expectation of zero. I therefore estimate the unrestricted system by OLS.
In addition to multicollinearity, I was worried about heteroskedasticity, because low-income households are likely to be budget constrained. While heteroskedasticity does not affect the unbiasedness of the estimators, it makes inference methods less reliable, so it should be tested for. When estimating the system, I systematically obtained the residuals and regressed their squares on the explanatory variables, as in a Breusch-Pagan test, for every equation. If the explanatory variables had been able to predict the squared residuals to some extent, heteroskedasticity would have been a problem and Generalized Least Squares would have had to be applied. However, I did not detect heteroskedasticity, and the t-statistics are thus reliable, at least as far as heteroskedasticity is concerned.
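The test procedure can be sketched as follows, using the n·R² (LM) form of the Breusch-Pagan test: the squared OLS residuals are regressed on the explanatory variables, and n times the R-squared of that auxiliary regression is compared with a chi-squared distribution whose degrees of freedom equal the number of regressors. This is a minimal illustration on synthetic data, not the code behind the results reported here. The function names are hypothetical, and the p-value helper uses the closed form exp(−x/2), which is valid only for two degrees of freedom (two regressors, as in a single equation of the system).

```python
import math
import numpy as np

def ols_residuals(y, X):
    """Residuals from OLS of y on a constant and the columns of X."""
    Z = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return y - Z @ coef

def breusch_pagan_lm(resid, X):
    """LM statistic: n * R^2 from regressing squared residuals on a constant and X."""
    u2 = np.asarray(resid, dtype=float) ** 2
    n = u2.shape[0]
    Z = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(Z, u2, rcond=None)
    ssr = np.sum((u2 - Z @ coef) ** 2)
    sst = np.sum((u2 - u2.mean()) ** 2)
    return n * (1.0 - ssr / sst)

def chi2_pvalue_2df(stat):
    """Upper-tail probability of a chi-squared variable with exactly 2 degrees of freedom."""
    return math.exp(-stat / 2.0)

# Synthetic data: two regressors, one homoskedastic and one heteroskedastic error.
rng = np.random.default_rng(2)
n = 500
X = np.column_stack([rng.normal(size=n), rng.uniform(0.0, 2.0, size=n)])
signal = 1.0 + X @ np.array([0.5, -0.5])
y_homo = signal + rng.normal(size=n)
y_hetero = signal + rng.normal(size=n) * X[:, 1]   # error spread grows with regressor 2

lm_homo = breusch_pagan_lm(ols_residuals(y_homo, X), X)
lm_hetero = breusch_pagan_lm(ols_residuals(y_hetero, X), X)
print(lm_homo, chi2_pvalue_2df(lm_homo))
print(lm_hetero, chi2_pvalue_2df(lm_hetero))
```

A statistic above roughly 5.99, the 5% critical value of a chi-squared distribution with two degrees of freedom, would lead to rejecting homoskedasticity.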
The results from estimating (43) are presented in table 22.
Perhaps the most surprising result is that the sign of the income effect on food is positive, although most studies report food to be a necessity. This is likely caused by the fact that, in my sample, families with children have higher equivalent income than those without. Other than this, most coefficients are significant and have an intuitive sign.
To use these estimated demand equations, I calculate the indirect taxes paid in each income group as follows. I estimate expenditures on the different consumption categories by entering the estimated average wages from table 4, net of the savings rates from Halvorsen (2011), into the estimated demand system, and then multiply the results by the indirect tax rates taken from table 3. The final results are given in appendix D. All estimated budget shares add up to a value in the interval [0.97, 1.06], with the largest inaccuracies in the lowest deciles. The majority of the budget shares add up to approximately one.
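The steps of this calculation can be sketched as below. This is an illustration under stated assumptions rather than the procedure actually implemented: the function name, the coefficients, the tax rates and the income figures are all hypothetical, and the tax rate is assumed to be applied directly to expenditure in each category.

```python
import math

def indirect_taxes_paid(income, savings_rate, price_index, coefs, rel_prices, tax_rates):
    """Indirect tax paid per consumption category for one income group.

    coefs: {category: (alpha, gamma, beta)} from the estimated demand system.
    rel_prices: {category: P_i / P}; tax_rates: {category: indirect tax rate}.
    """
    total_expenditure = income * (1.0 - savings_rate)      # income net of savings
    log_real_expenditure = math.log(total_expenditure / price_index)
    taxes = {}
    for category, (alpha, gamma, beta) in coefs.items():
        share = (alpha
                 + gamma * math.log(rel_prices[category])
                 + beta * log_real_expenditure)
        # Assumption: tax paid = budget share * total expenditure * tax rate.
        taxes[category] = share * total_expenditure * tax_rates[category]
    return taxes

# Hypothetical inputs, chosen so that both log terms are zero and shares equal alpha.
coefs = {"food": (0.2, 0.01, 0.02), "other": (0.8, -0.01, -0.02)}
taxes = indirect_taxes_paid(
    income=100.0,
    savings_rate=0.1,
    price_index=90.0,
    coefs=coefs,
    rel_prices={"food": 1.0, "other": 1.0},
    tax_rates={"food": 0.1, "other": 0.2},
)
print(taxes)
```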