Advanced Macroeconometrics
Ignacio Lobato (ITAM)
ECONOMETRIC MODELLING
Economic models: establish relations between economic variables, and traditionally distinguish between two types of variables: endogenous variables, which are determined inside the model, and exogenous variables, which are determined outside the model. Examples: corn output as a function of soil quality, amount of fertilizer, amount of labour, amount of rain, etc.; salaries as a function of education, age, experience, ability, place of residence, etc.
We aim to measure (via estimation and testing) some aspect of the dependence between one endogenous variable, Y, and a set of explanatory variables, Z, which can be endogenous or exogenous.
This dependence is fully contained in the conditional distribution of Y given Z, but typically we will be interested in just some aspect of it. Typically the aspect of interest will reflect some measure of central dependence, such as the conditional mean or the conditional median.
Other aspects, such as conditional quantiles, are also employed in econometrics.
Econometric Models
In econometrics we conceive of economic variables as random variables.
The dependence between random variables appears in the conditional distribution.
An econometric model establishes some restrictions on the conditional distribution, or on some aspects of it (such as the CEF or the conditional variance).
These restrictions may or may not come from economic theory.
A theory-based example: the basic intertemporal consumption-based asset pricing model establishes the first-order condition

$$E_t\left[\beta_0 \left(\frac{c_{t+1}}{c_t}\right)^{-\alpha_0} r_{t+1} - 1\right] = 0,$$

where c_t is the representative agent's consumption at time t, r_t is the gross return to bonds at time t, and E_t denotes the conditional expectation given the information set publicly available at time t. The parameter α_0 corresponds to the representative agent's relative risk aversion coefficient, whereas β_0 is the discount factor.
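As an illustration of how such a conditional moment restriction is taken to data, here is a minimal method-of-moments sketch in Python. Everything in it is hypothetical: the series are simulated stand-ins for consumption growth and returns, the instruments and starting values are arbitrary, and the identity weighting matrix is the simplest choice; the code only shows the mechanics of turning the Euler equation into sample moment conditions.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data: simulated stand-ins for c_{t+1}/c_t and r_{t+1}
rng = np.random.default_rng(0)
T = 500
cons_growth = np.exp(rng.normal(0.02, 0.05, T))
returns = np.exp(rng.normal(0.03, 0.10, T))

def moments(params):
    """Sample analogue of E[z_t (beta (c_{t+1}/c_t)^(-alpha) r_{t+1} - 1)] = 0."""
    beta, alpha = params
    u = beta * cons_growth**(-alpha) * returns - 1.0     # Euler equation error
    z = np.column_stack([np.ones(T - 1), returns[:-1]])  # instruments dated t
    return z.T @ u[1:] / (T - 1)

def objective(params):
    g = moments(params)
    return g @ g   # identity weighting: just-identified method of moments

est = minimize(objective, x0=np.array([0.95, 2.0]), method="Nelder-Mead")
print(est.x)       # estimates of (beta_0, alpha_0)
```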
In order to analyze the aspect of interest of the conditional distribution of Y given Z, we can use the notation

$$Y = m(Z) + U,$$

where Y is the endogenous variable, Z = (Z_1, ..., Z_p)' is a p × 1 vector, which may include endogenous and exogenous variables, and U is an error term with E(U) = 0.
The relation between U and Z determines what the model m(Z) will mean, that is, the aspect of interest of the conditional distribution of Y given Z.
Note that assumptions can be made on the error term U, or alternatively one can state the assumptions directly on the economic random variables (see Goldberger or Wooldridge).
That is, one assumes that E(Y|Z) = Z'β, say, instead of writing Y = Z'β + ε and assuming that E(ε|Z) = 0.
Both are equivalent, but it may be more natural to place assumptions directly on economic variables rather than on some transformation of them (ε = Y − Z'β).
The relation between U and Z determines the kind of model that we have in mind, that is, the meaning of m(Z).
In the next slides we will argue that selecting m(Z) = E(Y|Z) is common practice in econometrics because of its simplicity and convenience.
In this case U = Y − m(Z) satisfies the mean independence assumption E(U|Z) = 0.
The selection/interpretation of a particular m(Z) can be rationalized/justified in a prediction framework.
Imagine you observe Z and want to predict Y using some function of Z, say h(Z).
In order to select the optimal h you need to introduce some loss function that weights the errors you can make (the errors are Y − h(Z)).
A loss function is a distance function between the random variables Y and h(Z).
The conditional expectation function (CEF), i.e. the regression function, is the optimal h when you choose the mean squared error loss function E(ε²), where ε is the error, ε = Y − h(Z),
that is,

$$E(Y|Z) = \arg\min_{h(Z)} E\left[(Y - h(Z))^2\right].$$
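This optimality is easy to verify by simulation. Below is a minimal sketch, with a made-up design in which E(Y|Z) = Z² by construction, showing that the CEF attains a smaller mean squared error than an arbitrary competing predictor:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
z = rng.uniform(0, 2, n)
y = z**2 + rng.normal(0, 1, n)       # by construction E(Y|Z) = Z^2 and V(U) = 1

mse_cef = np.mean((y - z**2) ** 2)   # squared error loss of the CEF itself
mse_alt = np.mean((y - 2 * z) ** 2)  # loss of some other predictor h(Z) = 2Z

print(mse_cef)  # close to 1, the irreducible error variance
print(mse_alt)  # strictly larger: no h(Z) can beat the CEF under MSE loss
```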
The conditional median function (median regression) is the optimal h when you choose the mean absolute error loss function E|ε|, that is,

$$\text{median}(Y|Z) = \arg\min_{h(Z)} E\left|Y - h(Z)\right|.$$
The α-th conditional quantile function is the optimal h when you choose to minimize the quantile (check) loss function E[ρ_α(Y − h(Z))], where ρ_α(u) = u(α − 1{u < 0}); this check loss is written out in the loss-function slides below.
The regression model can be represented in terms of an error term, i.e.

$$Y = m(Z) + U \quad \text{with} \quad E(U|Z) = 0,$$

which implies that:
1. E(U) = 0.
2. For any h such that E[h(Z)²] < ∞, C(U, h(Z)) = E(U h(Z)) = 0.
Both implications follow from the law of iterated expectations: E(U h(Z)) = E[E(U|Z) h(Z)] = 0.
Quantile regression: the α-quantile regression function is defined as

$$Q_{Y|Z,\alpha}(z) = \inf\left\{y : F_{Y|Z}(y|z) \ge \alpha\right\}.$$

Median regression: α = 1/2.
Notice that, for Y continuous, F_{Y|Z}(Q_{Y|Z,α}(z)|z) = α.
Mode regression: let f_{Y|Z} be the conditional density of Y given Z; then

$$M(z) = \arg\max_y f_{Y|Z}(y|z).$$
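A crude numerical illustration of the conditional mode (a sketch only: the skewed simulated design and the localization window are arbitrary choices made for illustration, not a recommended estimator):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(2)
n = 5_000
z = rng.uniform(0, 1, n)
y = 2 * z + rng.gamma(2.0, 1.0, n)   # right-skewed errors: conditional mode < mean

# Approximate f_{Y|Z}(.|z0) with a kernel density over observations near z0
z0 = 0.5
local = y[np.abs(z - z0) < 0.05]
kde = gaussian_kde(local)
grid = np.linspace(local.min(), local.max(), 400)
print(grid[np.argmax(kde(grid))])    # M(z0): the maximizer of the estimated density
print(local.mean())                  # E(Y|Z = z0) is larger, due to the right skew
```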
Parametric Models
Typically econometric models are parametric, that is, the model proposes a particular functional (parametric) form for the relation between the economic random variables (for the CEF, for the median regression, for the quantile regression, etc.).
This form depends on a vector of parameters, which are unknown but can be estimated from the data. For example,

$$Y = m_\theta(Z) + U \quad \text{for some } \theta \in \Theta \subset \mathbb{R}^s.$$
For instance, the linear model: θ = (β_0, β')', β = (β_1, ..., β_p)', and m_θ(Z) = β_0 + Z'β; i.e.

$$Y = \beta_0 + Z'\beta + U.$$
What does this model mean?
It depends on which aspect is modeled (which loss function is chosen)
Examples:
* Linear regression model: E(Y|Z) = β_0 + Z'β; when Z_j increases by one unit and the rest of the explanatory variables remain fixed, Y varies in mean by β_j units.
* Linear quantile regression model: Q_{Y|Z,α}(Z) = β_0 + Z'β; when Z_j increases by one unit and the rest of the explanatory variables remain fixed, Y varies in the α-th conditional quantile by β_j units (the median for α = 0.5).
* Linear mode regression model: M(Z) = β_0 + Z'β; when Z_j increases by one unit and the rest of the explanatory variables remain fixed, Y varies in mode by β_j units. A sketch contrasting the first two models appears below.
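The following sketch fits the first two examples on simulated data using statsmodels; the heteroskedastic design is hypothetical and chosen only so that both models are correctly specified:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 2_000
z = rng.uniform(0, 10, n)
u = rng.normal(0, 1, n) * (1 + 0.3 * z)   # symmetric, median-zero, heteroskedastic errors
y = 1.0 + 0.5 * z + u

X = sm.add_constant(z)
ols = sm.OLS(y, X).fit()                  # linear regression model: E(Y|Z) = b0 + b1*Z
med = sm.QuantReg(y, X).fit(q=0.5)        # linear median regression (alpha = 0.5)
print(ols.params)                         # both recover roughly (1.0, 0.5) here,
print(med.params)                         # since the errors are symmetric around zero
```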
Data structures
Empirical evidence: observations of economic agents' characteristics and decisions, at the individual or aggregate level.
Data → W_n = {w_1, ..., w_n}, where w_i = (y_i, z_i')' is the i-th observation of the variable W = (Y, Z')'.
Data generating process (DGP): the probabilistic model that explains how the data are generated according to the particular sampling scheme. Variables are considered random, and their joint (or, often, conditional) distribution is the DGP.
Cross-sectional data: data coming from a random sampling scheme; i.e. the random experiment consists of drawing an individual at random from an infinite population.
Sample space Ω: all individuals in the population.
Random vector: W = W(ω), ω ∈ Ω; W : Ω → R^{1+p}.
We can consider the data set W_n = {w_1, ..., w_n} as the realization of n random vectors {W_1, ..., W_n}, which are independent copies of W.
Conditional Expectations
For a vector of random variables W = (Y, Z')' with joint distribution

$$F_W(w) = \Pr(W \le w) = \Pr(Y \le y, Z_1 \le z_1, ..., Z_p \le z_p) = E\left[1\{Y \le y\}\, 1\{Z_1 \le z_1\} \cdots 1\{Z_p \le z_p\}\right],$$

with w = (y, z')', z = (z_1, ..., z_p)', and 1{A} the indicator function of the event A.
Notice that

$$\Pr(Y \le y \mid Z \le z) = \frac{F_W(w)}{F_Z(z)}.$$
The conditional distribution of Y given Z = z, evaluated at y, is given by

$$F_{Y|Z}(y|z) = \frac{\Pr(Y \le y, Z = z)}{\Pr(Z = z)} \quad \text{when } Z \text{ is discrete},$$

or

$$F_{Y|Z}(y|z) = \lim_{h \to 0} \frac{\Pr(Y \le y, Z \in [z, z+h])}{\Pr(Z \in [z, z+h])} \quad \text{when } Z \text{ is continuous}.$$
⟹ The conditional density of Y given Z (when (Y, Z) is continuous) is

$$f_{Y|Z}(y|z) = \frac{f_W(w)}{f_Z(z)}.$$
The regression model (conditional mean):

$$E(Y|Z=z) = \int_{\mathbb{R}} y\, F_{Y|Z}(dy|z).$$
E(Y|Z=z) is a function of z ∈ R^p.
E(Y|Z) is a transformation of the random variable Z. It is, of course, a random variable, i.e. E(Y|Z) : Ω → R.
Partial effect: the partial effect on E(Y|Z=z) of a marginal change in Z_j, say ∆Z_j, holding the rest of the explanatory variables constant at Z = z, is

$$\Delta_j E(Y|Z=z) = E(Y|Z = z + i_j \Delta Z_j) - E(Y|Z=z),$$

with i_j a vector of zeros with a 1 in the j-th position.
Notice that

$$\lim_{\Delta Z_j \to 0} \frac{\Delta_j E(Y|Z=z)}{\Delta Z_j} = \frac{\partial}{\partial z_j} E(Y|Z=z),$$

so that, for small ∆Z_j,

$$\frac{\Delta_j E(Y|Z=z)}{\Delta Z_j} \approx \frac{\partial}{\partial z_j} E(Y|Z=z).$$
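A finite-difference check of this approximation on a known CEF (the functional form below is hypothetical, chosen only for illustration):

```python
import numpy as np

def cef(z):
    # Hypothetical CEF for illustration: E(Y|Z = z) = 2 + 3*z1 + z2^2
    return 2.0 + 3.0 * z[0] + z[1] ** 2

z = np.array([1.0, 2.0])
i2 = np.array([0.0, 1.0])   # i_j with j = 2: zeros with a 1 in the second position
dz = 1e-6

# Delta_2 E(Y|Z=z) / dz approximates the partial derivative d/dz2 E(Y|Z=z) = 2*z2
print((cef(z + i2 * dz) - cef(z)) / dz)   # approximately 4.0
```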
Conditional expectations of generic functions: for a generic function (where now we consider Y to be a vector of dimension m; until now m = 1)

$$h : \mathbb{R}^m \times \mathbb{R}^p \to \mathbb{R},$$

the conditional expectation of h(Y, Z) given Z = z is

$$E(h(Y,Z)|Z=z) = \int_{\mathbb{R}^m} h(y,z)\, F_{Y|Z}(dy|z).$$
Examples:
Conditional variance V(Y|Z): h(Y,Z) = (Y − E(Y|Z))² and

$$V(Y|Z=z) = E\left[(Y - E(Y|Z=z))^2 \mid Z=z\right].$$

Conditional covariance C[(Y_1, Y_2)|Z]: h(Y,Z) = (Y_1 − E(Y_1|Z))(Y_2 − E(Y_2|Z)) and

$$C[(Y_1,Y_2)|Z=z] = E\left[(Y_1 - E(Y_1|Z=z))(Y_2 - E(Y_2|Z=z)) \mid Z=z\right].$$
Main properties of conditional expectations (see "tarea 1"):
1. Y ≥ 0 a.s. ⟹ E(Y|Z) ≥ 0 a.s.
2. |E(Y|Z)| ≤ E(|Y| | Z) a.s.
3. ‖V(Z)‖ = 0 ⟹ E(Y|Z) = E(Y|Z = E(Z)) = E(Y) a.s.
4. Y = h(Z) a.s. ⟹ E(Y|Z) = h(Z) a.s.
5. E[E(Y|Z)] = E(Y).
6. E[E(Y|Z_1, ..., Z_l, Z_{l+1}, ..., Z_p) | Z_1, ..., Z_l] = E(Y|Z_1, ..., Z_l) a.s.
7. Y and Z independent ⟹ E(Y|Z) = E(Y).
8. E(|h(Z) Y|) < ∞ ⟹ E(Y h(Z)|Z) = h(Z) E(Y|Z) a.s.
9. E(Y²) < ∞ ⟹ [E(Y|Z)]² ≤ E(Y² | Z) a.s.
10. E(Y²) < ∞ ⟹ E[(Y − E(Y|Z))²] ≤ E[(Y − h(Z))²] for any h such that E[h(Z)²] < ∞.
11. If (Y, Z) are independent of V, then E(Y|Z, V) = E(Y|Z) a.s.
12. C(Y, Z) = C(E(Y|Z), Z).
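Properties 5 and 12 are easy to verify numerically. A minimal sketch with a made-up design in which E(Y|Z) = exp(Z) by construction:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1_000_000
z = rng.normal(0, 1, n)
y = np.exp(z) + rng.normal(0, 1, n)   # by construction E(Y|Z) = exp(Z)
cef = np.exp(z)

# Property 5 (law of iterated expectations): E[E(Y|Z)] = E(Y)
print(cef.mean(), y.mean())

# Property 12: C(Y, Z) = C(E(Y|Z), Z)
print(np.cov(y, z)[0, 1], np.cov(cef, z)[0, 1])
```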
Linear Projectors and Loss Functions
The correlation model is a linear model with errors satisfying

$$E(U) = 0 \quad \text{and} \quad E(ZU) = 0 \quad \text{(orthogonality condition)}.$$

Then

$$\beta_0 = E(Y) - E(Z)'\beta \quad \text{and} \quad \beta = V(Z)^{-1} C(Y, Z).$$

The linear projector of Y on (1, Z')' is

$$L(Y|1, Z=z) = \beta_0 + z'\beta,$$

which is the best way of predicting Y from a linear combination of the random vector Z; best in the sense of minimizing the mean squared error.
Notice that

$$(\beta_0, \beta')' = \arg\min_{(b_0, b) \in \mathbb{R}^{p+1}} E\left[(Y - b_0 - Z'b)^2\right].$$

⟹ Among all the linear predictors of Y given Z, consisting of all possible linear combinations of Z, L(Y|1, Z) is the one with the smallest mean squared error.
L(Y|1, Z) is also known as the best linear predictor.
An obvious statement: if the regression model is linear, then L(Y|1, Z=z) = E(Y|Z=z).
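A sketch computing the best linear predictor from the sample analogues of the moment formulas above. The data generating process is hypothetical and deliberately nonlinear, so here L(Y|1,Z) and E(Y|Z) differ:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50_000
z = rng.normal(0, 1, (n, 2))
y = 1.0 + z @ np.array([2.0, -1.0]) + z[:, 0] ** 2 + rng.normal(0, 1, n)  # nonlinear CEF

# Sample analogues of beta = V(Z)^{-1} C(Y, Z) and beta0 = E(Y) - E(Z)'beta
Vz = np.cov(z, rowvar=False)               # 2 x 2 sample variance matrix of Z
Cyz = np.cov(z, y, rowvar=False)[:-1, -1]  # sample covariances C(Y, Z_j)
beta = np.linalg.solve(Vz, Cyz)
beta0 = y.mean() - z.mean(axis=0) @ beta
print(beta0, beta)                         # coefficients of L(Y|1,Z)
```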
Linear projector through the origin:

$$L(Y|Z) = Z'\delta, \quad \text{with} \quad \delta = E(ZZ')^{-1} E(ZY).$$

It may be the case that E(Y|Z=z) is not a constant function but L(Y|1, Z) = β_0 a.s.
Loss function: we can consider other linear projectors in terms of alternative loss functions L, a distance function between the random variables Y and b_0 + Z'b. The linear projector in terms of the loss function L is

$$(\beta_0, \beta')' = \arg\min_{(b_0, b) \in \mathbb{R}^{p+1}} E\left[L(Y, b_0 + Z'b)\right].$$

The squared error loss function L(a,b) = (a − b)² results in the linear predictor above.
α-th quantile loss function:

$$L(a,b) = \alpha (a-b) 1\{a > b\} - (1-\alpha)(a-b) 1\{a < b\} = \frac{1}{2}\left[|a-b| + (2\alpha - 1)(a-b)\right] \propto |a-b| + (2\alpha - 1)(a-b).$$

This looks complicated, but take α = 0.5 and we recover the absolute value loss function: for α = 1/2, L(a,b) ∝ |a − b|, and

$$(\beta_0, \beta')' = \arg\min_{(b_0, b) \in \mathbb{R}^{p+1}} E\left|Y - b_0 - Z'b\right|.$$
Linear Model and Error Term
In the linear model

$$Y = \beta_0 + Z'\beta + U:$$

If E(U|Z) = 0, the model is the linear regression model.
If U is independent of Z, the model is the classical linear regression model.
If U has α-th conditional quantile equal to zero, the model is the α-th quantile regression model.
If the conditional distribution of Y given Z is Gaussian with constant variance, the model is the normal linear regression model.
If E(UZ) = 0, the model is the correlation model.
Variance Decomposition
$$V(Y) = E[V(Y|Z)] + V[E(Y|Z)].$$
Error term variance decomposition: with

$$Y = L(Y|1,Z) + U,$$

we have

$$V(Y) = V[L(Y|1,Z)] + V(U).$$
Coefficient of determination:

$$R^2 = \frac{V[L(Y|1,Z)]}{V(Y)} = 1 - \frac{V(U)}{V(Y)}.$$
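Both decompositions and the two expressions for R² can be checked on simulated data (a minimal sketch; the linear design is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
z = rng.normal(0, 1, n)
y = 1.0 + 2.0 * z + rng.normal(0, 3, n)

X = np.column_stack([np.ones(n), z])
b = np.linalg.lstsq(X, y, rcond=None)[0]   # sample best linear predictor coefficients
fitted = X @ b                             # L(Y|1,Z)
u = y - fitted                             # the error term U

print(np.var(y), np.var(fitted) + np.var(u))                   # V(Y) = V(L) + V(U)
print(np.var(fitted) / np.var(y), 1 - np.var(u) / np.var(y))   # the two forms of R^2
```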