8. Materials and methods
8.20. Preparation and execution of tests
8.20.2. Module thickness
In light of the previous discussion, a combination of hedonic price regressions and LS regressions would be required to determine the full value of neighborhood amenities and features. Ideally, both types of estimation should be done jointly, but that would require information on all variables to be available for the same observation sample. Thus, for each individual $n$ in the sample, it would be necessary to have data on the two main dependent variables (housing rents or prices $p_n$, and subjective well-being or life satisfaction $W_n$) and on the four main sets of explanatory variables: (1) the individual’s personal and family characteristics $F_n$, (2) income $y_n$, (3) housing features $H_n$, and (4) neighborhood amenities and characteristics $Z_{j(n)}$. As explained, the basic hedonic and LS regressions, respectively, are
$$p_n = p(H_n, Z_{j(n)}) \tag{3.11}$$
$$W_n = W(F_n, y_n, H_n, Z_{j(n)}). \tag{3.12}$$
Notice that combining all the explanatory variables of subjective well-being gives the vector of life satisfaction aspects $x$ defined above, that is, $x_n = (F_n, y_n, H_n, Z_{j(n)})$.
In practice, simultaneous estimation of both regressions with a fully consistent data set lies beyond the scope of this book because the national data sets to be merged worked with different definitions and universes. For this reason, the two basic regressions are estimated independently in the country cases in this volume (which limits interpretation of the results).
As mentioned in chapter 2, the hedonic regression usually estimated is of the following form:
$$\ln(p_n) = \text{constant} + \gamma_1 H_n + \gamma_2 Z_{j(n)} + v_n, \qquad v_n = \delta_{j(n)} + \eta_n. \tag{3.13}$$
In addition to the explanatory variables, the equation includes the composite error term $v_n$, which combines a neighborhood-specific error component $\delta_{j(n)}$, in which $j(n)$ stands for $n$’s neighborhood, and a house-specific error component $\eta_n$. The neighborhood-specific error component is common to all houses in the same neighborhood $j$, and it represents all those amenity characteristics that do not vary within a neighborhood.
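The structure of the composite error in (3.13) can be illustrated with a small simulation. This is only a sketch under stated assumptions: NumPy is assumed to be available, the numbers of neighborhoods and houses, the coefficient values, and the error variances are hypothetical, and scalar house and neighborhood variables stand in for what would be vectors in practice. It estimates the regression by pooled OLS and then averages the residuals by neighborhood, which should track the shared component.

```python
import numpy as np

rng = np.random.default_rng(0)
J, per_nbhd = 50, 20                       # hypothetical: 50 neighborhoods, 20 houses each
nbhd = np.repeat(np.arange(J), per_nbhd)   # j(n): neighborhood of house n
N = nbhd.size

H = rng.normal(size=N)                     # house feature (scalar for simplicity)
Z = rng.normal(size=J)                     # neighborhood amenity, constant within j
delta = rng.normal(scale=0.2, size=J)      # neighborhood-specific error component
eta = rng.normal(scale=0.1, size=N)        # house-specific error component

const, gamma1, gamma2 = 1.0, 0.4, 0.3      # assumed "true" values, for illustration
log_p = const + gamma1 * H + gamma2 * Z[nbhd] + delta[nbhd] + eta

# Pooled OLS of ln(p_n) on a constant, H_n and Z_j(n).
X = np.column_stack([np.ones(N), H, Z[nbhd]])
coef, *_ = np.linalg.lstsq(X, log_p, rcond=None)

# Residuals within a neighborhood share delta_j, so their neighborhood
# averages should be strongly correlated with the simulated delta.
resid = log_p - X @ coef
mean_resid = np.bincount(nbhd, weights=resid) / np.bincount(nbhd)
corr = np.corrcoef(mean_resid, delta)[0, 1]
```

Because houses in the same neighborhood share part of their error, their residuals are not independent, which is worth keeping in mind when standard errors are computed.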
Because estimating the hedonic price regression at the neighborhood or city level does not present any further major conceptual difficulties, the rest
an urban quality of life index: theory and methods 77
of this section is devoted to the estimation of the LS regression, considering a few novel methods that have been suggested and applied in the literature. Several of the studies in this volume use these novel methods, known as cardinal ordinary least squares (COLS) and probit-adapted ordinary least squares (POLS), that were introduced by van Praag and Ferrer-i-Carbonell (2008b). The main conclusion of this section is that all methods yield approximately the same estimates of the subjective trade-off ratios $-\alpha_1/\alpha_2$.
In practical econometrics, the estimation problem can be summarized as follows: There is a variable $W$ to be explained (in this case, life satisfaction self-reported on a scale of 0 to 10, or a monotonic transformation $w$ of $W$) and a set of explanatory variables $(F_n, y_n, H_n, Z_{j(n)})$. Recall that $F$ stands for individual and family characteristics, $y$ is income, $H$ stands for house characteristics, and $Z$ stands for neighborhood characteristics. Then one may stipulate an approximate relationship:
$$w_n = \alpha_1 F_n + \alpha_2 y_n + \alpha_3 H_n + \alpha_4 Z_{j(n)} + \eta_n, \tag{3.14}$$
where $\eta_n$ stands for the residual error, that is, the difference between $w_n$ and the structural estimate $(\alpha_1 F_n + \alpha_2 y_n + \alpha_3 H_n + \alpha_4 Z_{j(n)})$. The best estimates of the unknown parameters $\alpha$ are then those values that minimize the sum of squared residuals.5
$$\sum_{n=1}^{N} \left[ w_n - \left( \alpha_1 F_n + \alpha_2 y_n + \alpha_3 H_n + \alpha_4 Z_{j(n)} \right) \right]^2, \tag{3.15}$$
where $N$ stands for the total number of households. The problem is which transformation $w$ of $W$ (that is, which cardinalization) should be taken. In a growing number of papers, researchers simply take the response values from 0 to 10.
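The least-squares step in (3.14) and (3.15) can be sketched as follows, assuming NumPy is available. The sample size, the coefficient values, and the use of one scalar regressor each for $F$, $y$, $H$, and $Z$ are illustrative assumptions and are not taken from the country studies.

```python
import numpy as np

def ols(X, w):
    """Return the coefficient vector that minimizes sum((w - X @ a) ** 2)."""
    a_hat, *_ = np.linalg.lstsq(X, w, rcond=None)
    return a_hat

# Hypothetical data: one column each for F, y, H, Z (scalars for simplicity).
rng = np.random.default_rng(0)
N = 500
X = rng.normal(size=(N, 4))
alpha_true = np.array([0.5, 1.0, 0.3, -0.2])   # assumed values, for illustration only
eta = rng.normal(scale=0.1, size=N)            # residual error
w = X @ alpha_true + eta                       # cardinalized satisfaction w_n

alpha_hat = ols(X, w)
ssr = float(np.sum((w - X @ alpha_hat) ** 2))  # minimized value of (3.15)
tradeoff = -alpha_hat[0] / alpha_hat[1]        # trade-off ratio -alpha_1/alpha_2
```

Any other coefficient vector, including the one used to generate the data, gives a sum of squared residuals at least as large as `ssr`.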
In the older versions of satisfaction questions, numerical response categories sometimes were avoided; instead, the answers were cast in verbal ratings, such as “not satisfactory,” “somewhat unsatisfactory,” “satisfactory,” and “very satisfactory.” In that case, the dependent variable is a verbal rating. Although verbal ratings can be converted into positions on a numerical scale, that step clearly introduces some arbitrariness. Whereas some authors employ such conversions, others use probit or logit specifications, maintaining that the ordinary least squares (OLS) specification would create an arbitrary cardinalization of life satisfaction.6
The probit and logit specifications, however, imply arbitrary cardinalizations as well. Recall that the probit model assumes a latent model:
$$w_n = \alpha_1 F_n + \alpha_2 y_n + \alpha_3 H_n + \alpha_4 Z_{j(n)} + \eta_n, \tag{3.16}$$
where it is assumed that the response categories correspond with a partition of the real axis into $T$ intervals $(-\infty, \mu_1], (\mu_1, \mu_2], \ldots, (\mu_{t-1}, \mu_t], \ldots, (\mu_{T-1}, \infty)$, such that $w_n$ belongs to the $t_n$th interval, where $t_n$ stands for $n$’s response category, and $\eta$ is assumed to be distributed as a standard normal random variable with mean 0 and variance 1. By definition, the chance of observing a response in interval $t$ by respondent $n$ is then
$$P_n(t) = N[\mu_t - (\alpha_1 F_n + \alpha_2 y_n + \alpha_3 H_n + \alpha_4 Z_{j(n)})] - N[\mu_{t-1} - (\alpha_1 F_n + \alpha_2 y_n + \alpha_3 H_n + \alpha_4 Z_{j(n)})]. \tag{3.17}$$
Here $N[\cdot]$ denotes the standard normal distribution function. The probit estimates are found by maximizing with respect to $\alpha$ and $\mu$ the sample probability, that is, the product of the chances $\prod_{n=1}^{N} P_n(t_n)$. The latent cardinalization is caused by the choice of the distribution of $\eta$. A similar story holds for the logit, where the assumption that the error follows a normal distribution function is replaced by the logistic distribution function. Again, here $w$ may be interpreted as an (ordinal) utility level, and equation (3.18) may be interpreted as describing an indifference surface corresponding to a specific satisfaction level $w_n$:
$$w_n = \alpha_1 F_n + \alpha_2 y_n + \alpha_3 H_n + \alpha_4 Z_{j(n)}. \tag{3.18}$$
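The response probabilities in (3.17) and the sample likelihood that the probit estimator maximizes can be sketched with the Python standard library. The function names, the threshold values, and the value of the structural index below are hypothetical. The sketch evaluates the likelihood for given parameters and does not carry out the maximization.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # plays the role of N[.] in (3.17): mean 0, variance 1

def response_prob(t, index, cuts):
    """P_n(t): probability that the latent w_n falls in interval t.

    t     -- response category, 1..T
    index -- structural part a1*F_n + a2*y_n + a3*H_n + a4*Z_j(n)
    cuts  -- increasing interior thresholds [mu_1, ..., mu_{T-1}]
    """
    upper = cuts[t - 1] if t <= len(cuts) else math.inf
    lower = cuts[t - 2] if t >= 2 else -math.inf
    return STD_NORMAL.cdf(upper - index) - STD_NORMAL.cdf(lower - index)

def log_likelihood(responses, indices, cuts):
    """Log of the product of P_n(t_n) over all respondents."""
    return sum(math.log(response_prob(t, x, cuts))
               for t, x in zip(responses, indices))

# Hypothetical example with T = 4 categories and one respondent's index.
cuts = [-1.0, 0.0, 1.0]
probs = [response_prob(t, 0.3, cuts) for t in range(1, 5)]
```

For any index and any increasing thresholds, the probabilities over the T categories sum to 1, which is a quick sanity check on an implementation.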
At first glance, there seems to be a serious problem because it is not clear which specification should be chosen for the estimation. In practice, however, the problem is minimal because the two methods yield about the same gradient vectors $\alpha$, except for a multiplicative factor. Indeed, Amemiya (1981) found that probit and logit specifications yield the same estimates, apart from a multiplicative factor (see also Ferrer-i-Carbonell and Frijters 2004).
Although the numerical estimates look rather different when using different cardinalizations, the trade-off ratios look very similar when the estimators are “normalized.” Let $\hat{\alpha}^{(1)}, \hat{\alpha}^{(2)}$ be two estimators of the gradient vector corresponding to two cardinalizations. Then their ratios may be easily compared by “normalizing”7 both vectors, that is, by dividing them by their respective norms $\|\hat{\alpha}\| = \sqrt{\sum_i \hat{\alpha}_i^2}$. Likewise, van Praag and Ferrer-i-Carbonell (2008b) present estimation results for a satisfaction question, using four different cardinalizations (namely ordered probit, OLS, COLS, and POLS), showing that the four estimates yield about the same trade-off ratios.
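The normalization described above can be illustrated with a few lines of Python. The two coefficient vectors are hypothetical and are constructed so that the second is exactly 2.5 times the first. Real estimates from different cardinalizations would be only approximately proportional.

```python
import math

def normalize(a):
    """Divide a coefficient vector by its Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in a))
    return [x / norm for x in a]

# Hypothetical estimates of (alpha_1, ..., alpha_4) from two cardinalizations.
alpha_a = [0.50, 1.00, 0.30, -0.20]
alpha_b = [1.25, 2.50, 0.75, -0.50]   # alpha_a multiplied by 2.5

unit_a = normalize(alpha_a)
unit_b = normalize(alpha_b)

# The trade-off ratio -alpha_1/alpha_2 does not depend on the scale factor.
ratio_a = -alpha_a[0] / alpha_a[1]
ratio_b = -alpha_b[0] / alpha_b[1]
```

After normalization the two vectors coincide, and the trade-off ratios agree even without normalizing, because any common multiplicative factor cancels in a ratio.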
In sum, regardless of the cardinalization method used, approximately the same estimate of the gradient of the satisfaction indifference curve will be found (apart from a method-specific proportionality factor) and, as a
consequence, approximately the same subjective trade-off ratios. The reliability of those estimates, in terms of their standard errors, will be about the same as well. In plain language, there are many methods that yield similar results.
All of these techniques also may be used to assess the effect of urban amenities. Instead of considering life satisfaction as the dependent variable, such an analysis would focus on satisfaction with the urban environment. Variables typically considered in this domain of satisfaction include such features as “public street lighting” and “vandalism in the neighborhood.” The problem with this type of estimated equation is that researchers frequently include too many correlated explanatory variables (mostly dummy variables), which leads to statistically nonsignificant estimates for many effects. The chapters that follow, however, also provide many meaningful estimates and, thus, offer one of the first large-scale and consequential studies of the urban environment in the literature.
A final point regarding econometric methods is the possible cardinalization of LS variables. An individual who is very satisfied with his or her life would report 8 or 9 (on a 10-point scale), and an unsatisfied individual in all likelihood would report low numbers; but the relative magnitude of the answers is not meaningful if an ordinal interpretation is followed. A problem is that ordinal scales do not allow for interpersonal or intertemporal comparisons. Normative statements comparing individuals’ levels of happiness are possible only if some strong assumptions are made. First, it must be assumed that respondents translate the wording of questions emotionally in the same way and that they evaluate parallel situations similarly. This assumption has been examined, with roughly positive results (see van Praag 1991). Second, one cardinalization and, hence, one cardinal utility function must be agreed on for all individuals.
A cardinal measure is required to compare or analyze life satisfaction between individuals. For example, a simple statistic, such as national average life satisfaction (that is, the average of individual life satisfactions), is based on the implicit assumption of cardinality. Therefore, a careful approach to cardinalization is relevant. One way to cardinalize is to transform the responses on the LS scale, using a probability distribution function that has a range between 0 and 1. This choice has nothing to do with a probabilistic content of the phenomenon under consideration; rather, it concerns only the analytical suitability of this procedure. A normal distribution function of the type $N[\alpha_1 F_n + \alpha_2 y_n + \alpha_3 H_n + \alpha_4 Z_{j(n)}; 0, \sigma]$, which can vary between 0 and 1, seems a reasonable choice. This formulation gives rise to the cardinal median transformation8:
$$\bar{w}_n = \frac{\alpha_1 F_n + \alpha_2 y_n + \alpha_3 H_n + \alpha_4 Z_{j(n)}}{\sigma}, \tag{3.19}$$
where $\bar{w}_n$ is the average between the upper and the lower bounds for each
satisfaction questions, one can run a simple linear regression model on sat- isfaction data. Different alternatives likewise can be obtained to estimate the same model. In van Praag and Ferrer-i-Carbonell (2008b), the COLS and POLS methods are described in detail. These methods are merely two of many different possible cardinalizations of the satisfaction answers. When a specific cardinalization is accepted, it makes sense to consider average happiness in a society (see, for example, Easterlin 1974 and van Praag and Ferrer-i-Carbonell 2008a) or the inequality of the distribution of happiness in a population.
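As an illustration of what such a cardinalization can look like in code, the sketch below replaces each ordered response category by the conditional mean of a standard normal variable over the interval implied by the sample shares of the categories. This construction is an assumption made here in the spirit of POLS; the exact definitions of COLS and POLS should be taken from van Praag and Ferrer-i-Carbonell (2008b). The function name and the sample answers are hypothetical.

```python
from collections import Counter
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def pols_scores(responses):
    """Map each observed category to E[z | lower < z <= upper], z ~ N(0, 1),
    with thresholds chosen so that interval probabilities match sample shares."""
    counts = Counter(responses)
    n = len(responses)
    scores = {}
    seen = 0
    lower_pdf = 0.0  # normal density at the lower bound; it is 0 at -infinity
    for cat in sorted(counts):
        seen += counts[cat]
        # Upper threshold solves Phi(mu) = cumulative share; top category ends at +inf.
        upper_pdf = (STD_NORMAL.pdf(STD_NORMAL.inv_cdf(seen / n))
                     if seen < n else 0.0)
        share = counts[cat] / n
        scores[cat] = (lower_pdf - upper_pdf) / share
        lower_pdf = upper_pdf
    return scores

answers = [3, 5, 5, 7, 7, 7, 8, 8, 9, 10]   # hypothetical 0-10 responses
scores = pols_scores(answers)
w = [scores[a] for a in answers]            # cardinalized dependent variable
```

The resulting values `w` can then serve as the dependent variable in a linear regression of the form (3.14). By construction they average to zero over the sample and increase with the response category.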