6.1 Notes on the GB2

This section gives the reader further insight into the GB2 distribution. It draws on the work of McDonald (1984), Kleiber & Kotz (2003), European Union (2011a) and Graf & Nedyalkova (2011, 2013).

Figure 6.1 – Links between the GB2 and other well-known distributions.

As mentioned in Section 5.3, many authors agree that the GB2(a, b, p, q) distribution is very flexible and well suited to fitting income data. Figure 6.1 illustrates this flexibility: with various reparametrizations, or by fixing some parameters, the GB2 can be seen as a generalization of many other well-known distributions often used to model income. In particular, setting p = q = 1 gives the Fisk or log-logistic distribution, q = 1 the Dagum, p = 1 the Singh-Maddala, a = 1 the Beta2, p = a = 1 the Lomax and q = a = 1 the inverse Lomax. By letting q tend to infinity and applying the variable substitutions shown in Figure 6.1, one can derive the Generalized Gamma and the Lognormal. If, in addition, a = 1, one obtains the Gamma; with p = 1, the Weibull; and with a = p = 1, the Exponential.

6.1.1 GB2 distribution function

The density function of a random variable following a GB2 is given in Section 5.3. Now, if $B(p, q) = \int_0^1 t^{p-1}(1-t)^{q-1}\,dt$ is the beta function (defined only for $p > 0$ and $q > 0$), let $I_z(p, q)$ be the incomplete beta ratio function, given by:

$$I_z(p, q) = \frac{1}{B(p, q)} \int_0^z t^{p-1}(1-t)^{q-1}\,dt, \qquad 0 \le z \le 1.$$

Then the distribution function of a random variable following a GB2 can be expressed as follows:

$$F_{GB2}(y; a, b, p, q) = P[Y < y] = I_z(p, q) \qquad (6.1)$$

where $z = \frac{(y/b)^a}{1 + (y/b)^a}$, or, more precisely, $z = u/(1+u)$ with $u = (y/b)^a$. Since the incomplete beta function, $B_z(p, q) = \int_0^z t^{p-1}(1-t)^{q-1}\,dt$, is connected with the hypergeometric function¹ by $B_z(p, q) = \frac{z^p}{p}\,{}_2F_1[p, 1-q;\, p+1;\, z]$, we can also express (6.1) as follows:

$$F_{GB2}(y; a, b, p, q) = \frac{z^p}{p\,B(p, q)}\,{}_2F_1[p, 1-q;\, p+1;\, z].$$
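As a rough illustration of (6.1) and of its hypergeometric form, the following Python sketch evaluates the GB2 distribution function with a truncated series for ${}_2F_1$. The function names (`hyp2f1_series`, `beta_fn`, `gb2_cdf`) and the tolerance are assumptions of this sketch and do not come from the source. For $z > 1/2$ the sketch uses the standard symmetry $I_z(p, q) = 1 - I_{1-z}(q, p)$ so that the series argument never exceeds 1/2; this is a numerical convenience added here, and the code is not hardened against overflow for extreme arguments.

```python
import math

def hyp2f1_series(a, b, c, z, tol=1e-12, max_terms=10000):
    """Truncated Gauss series 2F1(a, b; c; z), intended for |z| <= 1/2."""
    term, total = 1.0, 1.0
    for n in range(max_terms):
        # ratio of consecutive terms: (a+n)(b+n) / ((c+n)(n+1)) * z
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def beta_fn(p, q):
    """Beta function B(p, q) computed through log-gamma for stability."""
    return math.exp(math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q))

def _ibeta_series(z, p, q):
    """I_z(p, q) = z^p / (p B(p, q)) * 2F1(p, 1 - q; p + 1; z)."""
    return z ** p / (p * beta_fn(p, q)) * hyp2f1_series(p, 1.0 - q, p + 1.0, z)

def gb2_cdf(y, a, b, p, q):
    """GB2 distribution function (6.1) with z = u / (1 + u), u = (y / b)^a."""
    if y <= 0.0:
        return 0.0
    u = (y / b) ** a
    z = u / (1.0 + u)
    if z <= 0.5:
        return _ibeta_series(z, p, q)
    # symmetry I_z(p, q) = 1 - I_{1-z}(q, p) keeps the series argument small
    return 1.0 - _ibeta_series(1.0 - z, q, p)
```

A simple sanity check is the Dagum case q = 1, where (6.1) reduces to $(1 + (y/b)^{-a})^{-p}$, and the Singh-Maddala case p = 1, where it reduces to $1 - (1 + (y/b)^a)^{-q}$.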

6.1.2 Moments of the GB2 distribution

Let $z_\alpha$ be the $\alpha$-quantile of the Beta(p, q) distribution, and let $u_\alpha = z_\alpha/(1 - z_\alpha)$. Then the $\alpha$-quantile of the GB2(a, b, p, q) is given by:

$$y_\alpha = b\,u_\alpha^{1/a}. \qquad (6.2)$$

If $Z$ is a random variable following a standard Beta(p, q) distribution, and writing $U = Z/(1 - Z)$, then $Y = b\,U^{1/a}$ follows a GB2(a, b, p, q).
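The representation $Y = b\,U^{1/a}$ suggests a direct way to simulate GB2 draws from Beta draws. The sketch below is only an illustration under that representation; the function name `rgb2`, the seed handling and the boundary guard are assumptions of this example, not part of the source.

```python
import random
import statistics

def rgb2(n, a, b, p, q, seed=None):
    """Draw n values from GB2(a, b, p, q) via Y = b * (Z / (1 - Z))**(1/a), Z ~ Beta(p, q)."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        z = rng.betavariate(p, q)
        if 0.0 < z < 1.0:  # guard against boundary values in floating point
            draws.append(b * (z / (1.0 - z)) ** (1.0 / a))
    return draws

# For p = q the Beta median is 1/2, so by (6.2) the GB2 median equals b.
sample = rgb2(20000, a=3.0, b=2.0, p=1.0, q=1.0, seed=1)
print(statistics.median(sample))  # expected to lie close to b = 2
```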

Let $Y$ be a random variable following a GB2. Then, recalling that $B(p, q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)}$, the $k$-th order moment is given by:

$$E(Y^k) = b^k\,\frac{\Gamma(p + k/a)\,\Gamma(q - k/a)}{\Gamma(p)\,\Gamma(q)} = b^k\,\frac{B(p + k/a,\; q - k/a)}{B(p, q)} \qquad (6.3)$$

where the moments exist for $-ap < k < aq$, and $\Gamma(z) = \int_0^\infty t^{z-1}e^{-t}\,dt = 2\int_0^\infty e^{-t^2}t^{2z-1}\,dt$ is the gamma function, which is defined only for $z > 0$. One notes in particular that the expectation, if it exists, depends linearly on $b$. On the other hand, the variance of any distribution can be expressed as $E(Y^2) - E(Y)^2$. If it exists, by (6.3), it tends to 0 for $a \to \infty$. In this case, as mentioned in Section 5.3, the expectation tends towards $b$. Therefore, for large values of $a$, the probability mass of the GB2 density is concentrated around $b$.

¹ The hypergeometric function is defined by: ${}_pF_q[a_1, a_2, \dots, a_p;\; b_1, b_2, \dots, b_q;\; x] = \sum_{n=0}^{\infty} \frac{(a_1)_n \cdots (a_p)_n}{(b_1)_n \cdots (b_q)_n}\,\frac{x^n}{n!}$, with $(a)_n = a(a+1)(a+2)\cdots(a+n-1)$, $(a)_0 = 1$.
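The moment formula (6.3) is straightforward to evaluate numerically. The following sketch, with hypothetical function names `gb2_moment` and `gb2_variance`, works on the log-gamma scale and checks the existence condition $-ap < k < aq$ before computing anything.

```python
import math

def gb2_moment(k, a, b, p, q):
    """k-th moment of GB2(a, b, p, q) following (6.3)."""
    if not (-a * p < k < a * q):
        raise ValueError("the moment of order k exists only for -a*p < k < a*q")
    log_ratio = (math.lgamma(p + k / a) + math.lgamma(q - k / a)
                 - math.lgamma(p) - math.lgamma(q))
    return b ** k * math.exp(log_ratio)

def gb2_variance(a, b, p, q):
    """Variance as E(Y^2) - E(Y)^2; requires the second moment to exist."""
    return gb2_moment(2, a, b, p, q) - gb2_moment(1, a, b, p, q) ** 2

# The mean is linear in b, and for large a the distribution concentrates around b.
print(gb2_moment(1, 2.0, 1.0, 1.0, 1.0))    # Fisk with a = 2: pi / 2
print(gb2_moment(1, 200.0, 2.0, 1.0, 1.0))  # close to b = 2
print(gb2_variance(200.0, 2.0, 1.0, 1.0))   # close to 0
```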

The incomplete moment of order $k$ is given by

$$F_{GB2(k)}(y; a, b, p, q) = \frac{E\left(Y^k\,\mathbb{1}_{\{Y < y\}}\right)}{E(Y^k)} = F_{GB2}\left(y; a, b, p + \tfrac{k}{a}, q - \tfrac{k}{a}\right). \qquad (6.4)$$

6.1.3 Indicators of poverty and social exclusion in the EU-SILC and GB2 Distribution

As mentioned in Section 5.3, an advantage of a parametric estimation of the income distribution, in particular with a GB2, is that there are explicit formulae for the inequality measures as functions of the four parameters of the law fitted to the data (McDonald, 1984; European Union, 2011a; Graf & Nedyalkova, 2011, 2013). The parametric expressions of the five inequality measures defined in Eurostat (2003) and used in our article are now presented. The empirical definitions of these indicators are given in Sections 3.2.1 to 3.2.7.

The at-risk-of-poverty threshold (ARPT)

Let $q_{50}$ be the median of the GB2(a, b, p, q), i.e. $F_{GB2}(q_{50}) = 0.5$, computed as detailed in (6.2). Then the ARPT is given by:

$$ARPT(a, b, p, q) = 0.6\,q_{50}.$$

The at-risk-of-poverty rate (ARPR)

The at-risk-of-poverty rate is the proportion of the population below the poverty line. As it is scale-independent, the parameter $b$ can be chosen arbitrarily, e.g. $b = 1$. Graf & Nedyalkova (2013) show that for large values of $a$ (approximately $a \ge 9$) the ARPR becomes insensitive to the value of $q$. We have often been in this situation with the SILC 2009 data used. If $Y$ is a random variable following a GB2(a, b, p, q):

$$\begin{aligned} ARPR(a, p, q) &= P(Y < 0.6\,q_{50}) = P(Y < ARPT) \\ &= F_{GB2}(0.6\,q_{50}) = F_{GB2}(ARPT; a, 1, p, q), \end{aligned}$$

where $F_{GB2}$ is the distribution function, see (6.1). Note that $ARPT = q_{GB2}(ARPR; a, 1, p, q)$, where $q_{GB2}$ is the quantile function of the considered GB2.

Relative median at-risk-of-poverty gap (RMPG)

It is the relative difference between the poverty line and the median of the poor (those below the threshold). Let $m_p = F_{GB2}^{-1}(ARPR/2) = q_{GB2}(ARPR/2)$ be the median of the poor; then

$$RMPG(A, a, p, q) = \frac{0.6\,q_{50} - m_p}{0.6\,q_{50}} = \frac{ARPT - m_p}{ARPT},$$

where $A = ARPR(a, p, q)$ is the at-risk-of-poverty rate. The RMPG is defined as one minus the ratio between the median income of people having an income below the ARPT and 60% of the median income of the population:

$$RMPG(A, a, p, q) = 1 - \frac{q_{GB2}(A/2; a, 1, p, q)}{q_{GB2}(A; a, 1, p, q)} = 1 - \frac{q_{GB2}(A/2; a, 1, p, q)}{ARPT},$$

where $q_{GB2} = F_{GB2}^{-1}$ is the quantile function of the considered GB2.
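To make the chain ARPT, ARPR, RMPG concrete, the sketch below evaluates the three indicators in the Dagum special case (q = 1), chosen here only because the distribution function (6.1) and the quantile (6.2) then have closed forms: $F(y) = (1 + (y/b)^{-a})^{-p}$ and $z_\alpha = \alpha^{1/p}$. The function names are hypothetical, and the general GB2 case would require an incomplete beta routine instead.

```python
def dagum_cdf(y, a, b, p):
    """GB2 distribution function (6.1) in the Dagum case q = 1."""
    return (1.0 + (y / b) ** (-a)) ** (-p) if y > 0.0 else 0.0

def dagum_quantile(alpha, a, b, p):
    """Quantile (6.2) in the Dagum case, where z_alpha = alpha**(1/p)."""
    z = alpha ** (1.0 / p)
    return b * (z / (1.0 - z)) ** (1.0 / a)

def poverty_indicators(a, b, p):
    """Return (ARPT, ARPR, RMPG) for a Dagum(a, b, p) income distribution."""
    median = dagum_quantile(0.5, a, b, p)
    arpt = 0.6 * median                                # poverty threshold
    arpr = dagum_cdf(arpt, a, b, p)                    # share below the threshold
    median_poor = dagum_quantile(arpr / 2.0, a, b, p)  # median of the poor
    rmpg = (arpt - median_poor) / arpt
    return arpt, arpr, rmpg

# ARPR and RMPG do not depend on the scale b; ARPT is proportional to it.
print(poverty_indicators(3.0, 1.0, 1.0))
print(poverty_indicators(3.0, 1234.5, 1.0))
```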

Quintile share ratio (QSR or S80/S20)

Let $q_{80}$ and $q_{20}$ be the 80th and 20th percentiles of the GB2 distribution. The quintile share ratio is the ratio of the cumulated incomes of the 20% richest over the cumulated incomes of the 20% poorest:

$$QSR = \frac{E(Y \mid Y > q_{80})}{E(Y \mid Y < q_{20})}.$$

It can be expressed using the incomplete moments of order one, see (6.4) with $k = 1$:

$$QSR(a, p, q) = \frac{1 - F_{GB2(1)}(q_{80}; a, 1, p, q)}{F_{GB2(1)}(q_{20}; a, 1, p, q)} = \frac{1 - F_{GB2}\left(q_{80}; a, 1, p + \tfrac{1}{a}, q - \tfrac{1}{a}\right)}{F_{GB2}\left(q_{20}; a, 1, p + \tfrac{1}{a}, q - \tfrac{1}{a}\right)}.$$

Gini Index

There are several possible definitions. If $X$ and $Y$ are two independent random variables with distribution $F$,

$$GINI(F) = \frac{E(|X - Y|)}{2E(X)}.$$

It is an inequality index measuring the expected absolute difference between two independently selected income values, relative to twice the mean income. The Gini index of a GB2 distribution is given in McDonald (1984):

$$GINI(a, p, q) = \frac{B(2p + 1/a,\; 2q - 1/a)}{B(p, q)\,B(p + 1/a,\; q - 1/a)} \left\{ \frac{1}{p}\,G_1 - \frac{1}{p + 1/a}\,G_2 \right\}$$

where

$$G_1 = {}_3F_2[1,\; p+q,\; 2p + 1/a;\;\; p+1,\; 2(p+q);\;\; 1] \quad \text{and} \quad G_2 = {}_3F_2[1,\; p+q,\; 2p + 1/a;\;\; p + 1 + 1/a,\; 2(p+q);\;\; 1],$$

and ${}_3F_2$ is the hypergeometric function defined in Section 6.1.1. The Gini index is defined only if the expectation exists, i.e. when $q - 1/a > 0$, see (6.3). Direct application of the above formula for the Gini may lead to convergence problems. An efficient algorithm that may be used to calculate the Gini index is given in Graf (2009c) and implemented in the R-package GB2 (function gb2.gini).

In some cases, this index takes a simpler form (see European Union, 2011a; Kleiber & Kotz, 2003):

• Beta-2 distribution ($a = 1$): $GINI(p, q) = \frac{2B(2p,\; 2q - 1)}{p\,B^2(p, q)}$,

• Dagum distribution ($q = 1$): $GINI(a, p) = \frac{\Gamma(p)\,\Gamma(2p + 1/a)}{\Gamma(2p)\,\Gamma(p + 1/a)} - 1$,

• Singh-Maddala distribution ($p = 1$): $GINI(a, q) = 1 - \frac{\Gamma(q)\,\Gamma(2q - 1/a)}{\Gamma(q - 1/a)\,\Gamma(2q)}$,

• Lognormal distribution: $GINI(\sigma) = 2\Phi(\sigma/\sqrt{2}) - 1$, $\Phi$ being the cumulative distribution function of the standard normal distribution.
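The special cases listed above can be evaluated with the standard library alone. The sketch below uses hypothetical function names and assumes the parameters satisfy the existence condition $q - 1/a > 0$ (so $a > 1$ for the Dagum case and $q > 1$ for the Beta-2 case); it does not check this. It is not the algorithm of Graf (2009c) for the general GB2.

```python
import math

def _log_beta(x, y):
    """log B(x, y) through log-gamma."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def gini_beta2(p, q):
    """Beta-2 case (a = 1): 2 B(2p, 2q - 1) / (p B(p, q)^2)."""
    return 2.0 * math.exp(_log_beta(2 * p, 2 * q - 1) - 2 * _log_beta(p, q)) / p

def gini_dagum(a, p):
    """Dagum case (q = 1)."""
    return math.exp(math.lgamma(p) + math.lgamma(2 * p + 1 / a)
                    - math.lgamma(2 * p) - math.lgamma(p + 1 / a)) - 1.0

def gini_singh_maddala(a, q):
    """Singh-Maddala case (p = 1)."""
    return 1.0 - math.exp(math.lgamma(q) + math.lgamma(2 * q - 1 / a)
                          - math.lgamma(q - 1 / a) - math.lgamma(2 * q))

def gini_lognormal(sigma):
    """Lognormal case: 2 Phi(sigma / sqrt(2)) - 1, with Phi the standard normal CDF."""
    x = sigma / math.sqrt(2.0)
    phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return 2.0 * phi - 1.0
```

One consistency check is the Fisk distribution (p = q = 1), which is both a Dagum and a Singh-Maddala: the two formulas then agree and reduce to 1/a.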

6.1.4 Note on the variance-covariance matrix of totals of the GB2 estimated scores

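This is a short addition to complete the explanation of formula (5.17) given in Section 5.7. As said there, $\hat V_p(\hat U)$ is the estimated design variance of the totals of the scores evaluated at $\hat\theta$. It can be developed as follows:

$$\hat V_p(\hat U) = \widehat{\mathrm{var}}\left[\sum_{k \in r} w_k u_k(y_k; \hat\theta)\right] = \widehat{\mathrm{var}}\begin{pmatrix} \sum_{i=1}^n w_i u_{ia}(y_i; \hat\theta_n) \\ \sum_{i=1}^n w_i u_{ib}(y_i; \hat\theta_n) \\ \sum_{i=1}^n w_i u_{ip}(y_i; \hat\theta_n) \\ \sum_{i=1}^n w_i u_{iq}(y_i; \hat\theta_n) \end{pmatrix} := \widehat{\mathrm{var}}\begin{pmatrix} \hat U_a \\ \hat U_b \\ \hat U_p \\ \hat U_q \end{pmatrix}$$

$$= \begin{pmatrix} \hat V_p(\hat U_a) & \widehat{\mathrm{cov}}_p(\hat U_a, \hat U_b) & \widehat{\mathrm{cov}}_p(\hat U_a, \hat U_p) & \widehat{\mathrm{cov}}_p(\hat U_a, \hat U_q) \\ \widehat{\mathrm{cov}}_p(\hat U_b, \hat U_a) & \hat V_p(\hat U_b) & \widehat{\mathrm{cov}}_p(\hat U_b, \hat U_p) & \widehat{\mathrm{cov}}_p(\hat U_b, \hat U_q) \\ \widehat{\mathrm{cov}}_p(\hat U_p, \hat U_a) & \widehat{\mathrm{cov}}_p(\hat U_p, \hat U_b) & \hat V_p(\hat U_p) & \widehat{\mathrm{cov}}_p(\hat U_p, \hat U_q) \\ \widehat{\mathrm{cov}}_p(\hat U_q, \hat U_a) & \widehat{\mathrm{cov}}_p(\hat U_q, \hat U_b) & \widehat{\mathrm{cov}}_p(\hat U_q, \hat U_p) & \hat V_p(\hat U_q) \end{pmatrix}$$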

Note that we can obtain the covariances by simply calculating variances, since for two random variables $X$ and $Y$ we can write

$$\mathrm{var}(X - Y) = \mathrm{var}(X) + \mathrm{var}(Y) - 2\,\mathrm{cov}(X, Y),$$

thus

$$\mathrm{cov}(X, Y) = \frac{\mathrm{var}(X) + \mathrm{var}(Y) - \mathrm{var}(X - Y)}{2}.$$
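The identity above means that a routine which only returns variances is enough to fill a covariance matrix. The sketch below illustrates this with plain sample variances standing in for the design-based variance estimator of the source (an assumption of this example); the function names are hypothetical.

```python
import statistics

def cov_from_variances(x, y):
    """cov(X, Y) = (var(X) + var(Y) - var(X - Y)) / 2, using only a variance routine."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    return (statistics.variance(x) + statistics.variance(y)
            - statistics.variance(diff)) / 2.0

def cov_direct(x, y):
    """Usual sample covariance, used here only as a cross-check."""
    mx, my = statistics.mean(x), statistics.mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)

x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 1.0, 0.0, 5.0]
print(cov_from_variances(x, y), cov_direct(x, y))  # the two values agree
```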

It follows that, to obtain the estimated variance-covariance matrix $\hat V_p(\hat U)$ of the GB2 scores evaluated at $\hat\theta$, it is enough to compute the diagonal terms $\hat V_p(\hat U_a)$, $\hat V_p(\hat U_b)$, $\hat V_p(\hat U_p)$, $\hat V_p(\hat U_q)$ as well as $\hat V_p(\hat U_a - \hat U_b)$, $\hat V_p(\hat U_a - \hat U_p)$, $\hat V_p(\hat U_a - \hat U_q)$, $\hat V_p(\hat U_b - \hat U_p)$, $\hat V_p(\hat U_b - \hat U_q)$ and $\hat V_p(\hat U_p - \hat U_q)$. All these variances of totals are estimated via linearization as described in Massiani (2013a) in order to take the variability of the survey weights into account.

6.2 Some remarks about the Durbin-Wu-Hausman test

This section shares some research we have done on the Durbin-Wu-Hausman tests. It raises some interesting questions and draws first conclusions, which could later lead to concrete results. It should be read as a complement to Sections 5.4.1 and 5.4.2, and it also expands on the conclusions of Section 5.9. We use the notation of Sections 5.4.1 and 5.4.2 without re-introducing it here.

It was found that, in practice, the success of generalized calibration depends decisively on 1) having good exogenous instruments $Z_1$ and 2) properly sorting the columns of $X = [X_1|X_2]$ into the endogenous auxiliary variables, $X_1$, and the exogenous ones, $X_2$, which can be instruments for themselves. We must be able to classify the columns of matrix $X$ into $X_1$ and $X_2$ and, in addition, have instruments $Z_1$ that are able to compensate for the endogeneity of $X_1$, i.e. that satisfy the conditions stated in Section 5.4.1. If we can do this, generalized calibration will provide much better weights² than conventional calibration; otherwise, generalized calibration can be very risky (i.e. lead to worse GB2 fits than those obtained using weights stemming from conventional calibration).

We would like to have a way to test whether the auxiliary and instrumental variables are endogenous or not. In econometrics, this problem is well known and has been explored, for example, in the context of solving simultaneous equations. The standard method is to use Durbin-Wu-Hausman (DWH) tests. These tests were developed by Durbin (1954), Wu (1973) and Hausman (1978).

For the time being, we will set sampling theory aside and use the conventional notation of classical statistics. Indeed, to our knowledge, the DWH tests have not yet been adapted to a sampling framework. So, let us consider the linear model

$$y = X\beta + \epsilon. \qquad (6.5)$$

There has been considerable work done on the DWH tests since then (see Ruud, 1984, for a review and many more recent ones). These tests are often interpreted as testing for endogeneity or exogeneity of the columns of $X$ not belonging to the subspace generated by $Z = [Z_1|X_2]$, and it is also with this intention that we would like to apply them. However, what is truly being tested is not the endogeneity or exogeneity of the columns of $X$ but rather the effect of a possible endogeneity on the components of $\beta$ estimated by regressing $y$ on $X$.

The basic idea of DWH tests is to build a test on a vector of contrasts, i.e. the difference between two vectors of estimated parameters. One will be consistent under less restrictive conditions than the other.

It is assumed that the model to be tested is (6.5) with $\epsilon \sim \text{iid}\; N(0, \sigma^2 I)$, $I$ being the identity matrix, where there are $n$ observations and $J$ regressors. In this context the principle of the DWH test is to compare two estimators $\hat\beta_0$ and $\hat\beta_A$.

² Best in the sense that, in our framework, they can lead to fit and find the GB2 distribution that would be obtained by fitting on the complete data.

Under the null hypothesis $H_0$:

• There is a consistent estimator $\hat\beta_0$, asymptotically normal and efficient³,

• There is another estimator $\hat\beta_A$ which is consistent under both $H_0$ and $H_A$ but which is inefficient.

If $H_0$ is true, both estimators are unbiased and therefore we prefer the one that is efficient.

Under $H_A$:

• $\hat\beta_0$ is biased and inconsistent (the 0-model is misspecified),

• $\hat\beta_A$ remains consistent.

Consequently, under $H_A$, the vector of contrasts $\hat q = \hat\beta_A - \hat\beta_0$ is large in absolute value relative to its standard deviation. In the case that interests us, $\hat\beta_0$ is the Ordinary Least Squares (OLS) estimator

$$\hat\beta_{OLS} = (X'X)^{-1}X'y \qquad (6.6)$$

and $\hat\beta_A$ is another linear estimator⁴, for example the Instrumental Variable (IV) estimator:

$$\hat b_{IV} = (X'P_Z X)^{-1}X'P_Z'\,y. \qquad (6.7)$$

The matrix $P_Z$ projects orthogonally onto the hyperplane $L_Z$, where $Z = [Z_1|X_2]$ is the matrix of instruments.

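To build his test, Hausman (1978) shows that, if $H_0$ is true, $\mathrm{cov}(\hat q, \hat\beta_0) = 0$ or, equivalently, $\mathrm{cov}(\hat\beta_A, \hat\beta_0) = \mathrm{var}(\hat\beta_0)$. This is easily verified when comparing the OLS estimator with the IV estimator. Indeed, in this case, under $H_0$ and with $\epsilon \sim \text{iid}\; N(0, \sigma_\epsilon^2 I)$, writing $\hat X = P_Z X$:

$$\begin{aligned} \mathrm{cov}(\hat b_{IV}, \hat\beta_{OLS}) &= \mathrm{cov}\left((\hat X'\hat X)^{-1}\hat X'y,\; (X'X)^{-1}X'y\right) \\ &= \mathrm{cov}\left((\hat X'\hat X)^{-1}\hat X'\epsilon,\; (X'X)^{-1}X'\epsilon\right) \\ &= (\hat X'\hat X)^{-1}\hat X'\,E(\epsilon\epsilon')\,X(X'X)^{-1} \\ &= (X'P_Z X)^{-1}X'P_Z'X(X'X)^{-1}\sigma_\epsilon^2 \\ &= \sigma_\epsilon^2 (X'X)^{-1} = \mathrm{var}(\hat\beta_{OLS}). \end{aligned}$$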

Hausman also shows that the matrix $C = \mathrm{var}(\hat\beta_A) - \mathrm{var}(\hat\beta_0) \ge 0$ is positive semi-definite. $C$ is called the precision matrix; its rank is the number of endogenous variables tested, i.e. the number of variables in $X_1$, say $k$. The test statistic is defined by:

$$H = \hat q'\,C^{-}\hat q = \|\hat q\|^2_{C^{-}}$$

where $C^{-}$ is the Moore-Penrose generalized inverse of $C$. $H$ follows a $\chi^2_k$ distribution if $\sigma^2$ is known. In the case where this error variance must be estimated, $H/k \sim F_{k,\,n-J}$. Too large a value of $H$ will result in rejecting $H_0$.

³ Efficient: the estimator reaches the Cramér-Rao bound on the sample of finite size.

⁴ We can take any other estimator $\hat\beta_A = (X'AX)^{-1}X'Ay$ where $A$ is a symmetric matrix of size $n \times n$ which is assumed, to simplify, to have a rank at least equal to the number of variables in $X$ (otherwise, we could not estimate all components of $\hat\beta_A$, and we could compare only its estimated part to the corresponding sub-vector of $\hat\beta_{OLS}$). The projection […]

Intuitively, recall that if $X \sim N(\mu, \Sigma)$, then the square of the $\Sigma^{-1}$-norm of the vector of contrasts of the variables in $X$ relative to their expectation $\mu$, $\|X - \mu\|^2_{\Sigma^{-1}}$, follows a chi-squared distribution: $(X - \mu)'\Sigma^{-1}(X - \mu) \sim \chi^2_r$, where $r = \mathrm{rank}(\Sigma)$. Moreover, a Fisher distribution is obtained as the ratio of two independent chi-squared variables, each divided by its degrees of freedom.

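When comparing the OLS regression to IV, the test is written:

$$H = \frac{1}{k}\,(\hat b_{IV} - \hat\beta_{OLS})'\left[\mathrm{Var}(\hat b_{IV}) - \mathrm{Var}(\hat\beta_{OLS})\right]^{-}(\hat b_{IV} - \hat\beta_{OLS}) \sim F_{k,\,n-J}. \qquad (6.8)$$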

Hausman also shows that the above-described test is equivalent to testing $H_0: \alpha = 0$ or $H_0: \delta = 0$ in one of the following two regressions:

$$y = [X_1|X_2]\beta + P_Z X_1\alpha + r_1 = X\beta + \hat X_1\alpha + r_1 \qquad (6.9)$$

or

$$y = [X_1|X_2]\breve\beta + Q_Z X_1\delta + r_2 = X\breve\beta + v\delta + r_2 \qquad (6.10)$$

where $Q_Z = I - P_Z$, with $I$ the identity matrix, i.e. $v$ are the residuals of the projection of the endogenous variables $X_1$ onto the hyperplane $L_Z$ spanned by the instrumental variables $Z = [Z_1|X_2]$: $X_1 = \hat X_1 + v = P_Z X_1 + Q_Z X_1$.

One then performs a Student t-test on $\alpha$ or $\delta$ if a single variable is tested, or a Fisher test on $\alpha$ or $\delta$ if several potentially endogenous variables are tested at once.

Using the Frisch-Waugh-Lovell (FWL) theorem (see for example Davidson & MacKinnon, 1993), it can be shown that $\|r_1\|^2 = \|r_2\|^2$, so the tests on $\alpha$ or $\delta$ are numerically identical. These tests may be easier to conduct in some cases. However, in our study we used the Hausman test (6.8) on the parameters estimated by the OLS and IV regressions and did not test the significance of the residuals $v$, see (6.10), or of $\hat X_1$, see (6.9), obtained from the projection of the endogenous variables onto the hyperplane $L_Z$.
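The numerical equivalence of the two augmented regressions can be checked on a toy example. The sketch below assumes the simplest possible setting, which is not the one used in the study: a single potentially endogenous regressor x, a single instrument z, no exogenous block $X_2$ and no intercept. All names and data are hypothetical. In this setting $\hat x = P_Z x$ is proportional to z and $v = x - \hat x$, so both regressions span the same column space.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def ols_two_columns(y, c1, c2):
    """OLS of y on columns c1, c2 (no intercept) via the 2x2 normal equations.
    Returns (coef1, coef2, residual sum of squares, t-statistic of coef2)."""
    s11, s12, s22 = dot(c1, c1), dot(c1, c2), dot(c2, c2)
    t1, t2 = dot(c1, y), dot(c2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * t1 - s12 * t2) / det
    b2 = (s11 * t2 - s12 * t1) / det
    resid = [yi - b1 * u - b2 * w for yi, u, w in zip(y, c1, c2)]
    rss = dot(resid, resid)
    s2 = rss / (len(y) - 2)                   # residual variance estimate
    t_stat = b2 / math.sqrt(s2 * s11 / det)   # s11 / det is the (2,2) entry of (C'C)^{-1}
    return b1, b2, rss, t_stat

def augmented_regressions(y, x, z):
    """Fit the scalar analogues of (6.9) and (6.10)."""
    gamma = dot(z, x) / dot(z, z)
    x_hat = [gamma * zi for zi in z]            # P_Z x
    v = [xi - xh for xi, xh in zip(x, x_hat)]   # Q_Z x
    fit_alpha = ols_two_columns(y, x, x_hat)    # y = x*beta + x_hat*alpha + r1
    fit_delta = ols_two_columns(y, x, v)        # y = x*beta~ + v*delta + r2
    return fit_alpha, fit_delta

z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.2, 1.9, 3.5, 3.8, 5.1, 6.4]
y = [2.0, 4.1, 6.5, 8.2, 9.6, 13.0]
fit_alpha, fit_delta = augmented_regressions(y, x, z)
print(fit_alpha[2], fit_delta[2])  # identical residual sums of squares
print(fit_alpha[3], fit_delta[3])  # t-statistics of equal size and opposite sign
```

Since $\hat x = x - v$, the coefficients satisfy $\delta = -\alpha$, which is why the two t-statistics differ only in sign.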

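To compute (6.8) we use the following approximation:

$$H = \frac{1}{k}\,(\hat b_{IV} - \hat\beta_{OLS})'\left[\hat V(\hat b_{IV}) - \hat V(\hat\beta_{OLS})\right]^{-}(\hat b_{IV} - \hat\beta_{OLS}), \qquad (6.11)$$

where $\hat V(\hat b_{IV})$ and $\hat V(\hat\beta_{OLS})$ are consistent estimates of $\mathrm{Var}(\hat b_{IV})$ and $\mathrm{Var}(\hat\beta_{OLS})$ respectively.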
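A minimal sketch of (6.11) in the scalar case may help fix ideas. It assumes one regressor x, one instrument z, no intercept (so k = 1 and J = 1), and a single error-variance estimate taken from the OLS residuals for both variance terms. That last choice is an assumption of this sketch, made because it keeps the difference of variances non-negative by the Cauchy-Schwarz inequality; the source does not specify which consistent estimates were used. Names and data are hypothetical.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def hausman_scalar(y, x, z):
    """Scalar version of the contrast test: returns (b_ols, b_iv, H)."""
    n = len(y)
    sxx, szx, szz = dot(x, x), dot(z, x), dot(z, z)
    b_ols = dot(x, y) / sxx      # (X'X)^{-1} X'y
    b_iv = dot(z, y) / szx       # (X'P_Z X)^{-1} X'P_Z y for a single instrument
    resid = [yi - b_ols * xi for yi, xi in zip(y, x)]
    s2 = dot(resid, resid) / (n - 1)   # error variance from OLS residuals
    v_ols = s2 / sxx
    v_iv = s2 * szz / szx ** 2
    c = v_iv - v_ols                   # non-negative by Cauchy-Schwarz
    h = (b_iv - b_ols) ** 2 / c
    return b_ols, b_iv, h

x = [1.0, 0.0, 1.0, 0.0]
z = [1.0, 1.0, 0.0, 0.0]
# errors orthogonal to both x and z: the two estimators coincide and H = 0
print(hausman_scalar([3.0, -1.0, 1.0, 1.0], x, z))
# errors correlated with x but orthogonal to z: OLS moves away from IV and H > 0
print(hausman_scalar([3.0, -1.0, 3.0, -1.0], x, z))
```

Under $H_0$, the value H would be compared with a quantile of the $F_{1,\,n-1}$ distribution; the standard library has no such quantile function, so this step is left out.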
