
The theoretical results developed in this paper suggest that identification of the Poisson-Logit and NB2-Logit models is problematic, in the sense that, without further parametric assumptions, two identical “global” maxima exist. However, identification is achieved if the true events follow the NB1 distribution, or if the reporting process is specified as a Probit model. This section discusses further implications of these results and offers practical advice for researchers who intend to use these models.

As explained in subsection 1.5.1, a first way to achieve identification is through sign restrictions on the reporting process. We stress that this type of restriction becomes more appropriate the more certain we are about the theoretical result that determines the sign of the “restricted” coefficient. For instance, in the example of subsection 1.5.1, if information about the FS-HCA were publicly available, it could increase the wage of outside offers as well, making the change in the wage differential uncertain. Moreover, in practice, given correct specification of the conditional mean,1 the effect of the “restricted” variable should be statistically significant.

In fact, the more significant the effect, the more certain we can be about the sign that appears in the two models. Finally, in many cases the “restricted variable” is not directly observed, and the researcher is therefore forced to use proxy variables. In the previous example, FS-HCA is not observed in practice, but it can be approximated by “job experience”. However, it is unclear whether general “job experience” captures the true effect of FS-HCA.

1 By correct specification we mean not only that the true mean is given by $\lambda_i\Lambda_i$, but also that both processes include all the required information. That is, we do not include irrelevant variables, and we do not omit variables that must be included.

More interestingly, the results of subsection 1.5.3 showed that when exclusion restrictions are imposed on the count process, identification of the whole model is achieved, since there cannot be two linearly dependent sets of parameters that lead to the same likelihood value. However, even in this case, it is clear from (1.21) that the identification problem is exactly restored when $\phi = 0$, or when the regressors excluded from the Poisson part, $q_i$, are perfectly collinear with the remaining elements of this vector, $x_{1i}$. Hence, the closer we move towards either of these two conditions, the smaller the effect of the exclusion restriction and the more difficult identification becomes. In practice, we find that if the exclusion restriction is very “weak”, meaning that the variable excluded from the count process has a very small effect on the reporting process, another local maximum probably exists, with a likelihood value very close to the global one and estimated parameters very close to $\theta^* = (\beta + \gamma, -\gamma)$. If a second maximum does exist, the estimation procedure, using zeros or conventional Poisson estimates as starting values for the count process, will always converge to one of the two, either the global or the local maximum. A researcher who estimates the Poisson-Logit or the NB2-Logit model by MLE on real data and is unaware of these problems may be puzzled by estimates with unexpected signs or implausible values. Section 1.7 presents a detailed example of this situation.
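The form of $\theta^*$ follows from a property of the logistic function. As a short worked step, assume that both processes share the same regressors ($x_{1i} = x_{2i} = x_i$) and write the observed mean as $\mu_i = \lambda_i\Lambda_i$, with $\lambda_i = e^{x_i'\beta}$ and $\Lambda_i = \Lambda(x_i'\gamma)$. The notation $\mu_i(\beta,\gamma)$ is introduced here only for illustration.

```latex
\begin{align*}
\mu_i(\beta,\gamma)
  &= e^{x_i'\beta}\,\Lambda(x_i'\gamma)
   = e^{x_i'\beta}\,\frac{e^{x_i'\gamma}}{1+e^{x_i'\gamma}} \\
  &= e^{x_i'(\beta+\gamma)}\,\frac{1}{1+e^{x_i'\gamma}}
   = e^{x_i'(\beta+\gamma)}\,\Lambda(-x_i'\gamma)
   = \mu_i(\beta+\gamma,\,-\gamma).
\end{align*}
```

Because the Poisson-Logit likelihood depends on the parameters only through $\mu_i$, the two parameter sets give the same likelihood value in this case. A weak exclusion restriction breaks the equality only slightly, which is why a second maximum can survive close to $\theta^*$.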

Thus, although an appropriate restriction guarantees identification of $\theta$, it does not guarantee that the global maximum has been found. Therefore, estimation of the above models must always be accompanied by a thorough search for alternative maxima. A useful way of searching for other candidate maxima is the following. First, the model is estimated using randomly chosen starting values for the coefficients of the reporting process and conventional Poisson or NB2 estimates for the coefficients of the count process. This makes the estimation smoother and helps avoid numerical errors in the optimization procedure. Unless further problems exist, the model will converge to a log-likelihood value $\ln \hat{L}$, corresponding to estimates $\hat{\theta} = (\hat{\beta}, \hat{\gamma})$. According to the theoretical results, the other maximum will be close to $\hat{\theta}^* = (\hat{\beta} + \hat{\gamma}, -\hat{\gamma})$. Hence, $\hat{\theta}^*$ can be used as starting values for a second estimation. If the second maximum exists, it will be found by this second estimation, with log-likelihood value $\ln \tilde{L}^*$ and $\tilde{\theta}^* \approx \hat{\theta}^*$. If we find both maxima, we accept the set of parameters with the higher log-likelihood, that is, the set that maximizes the likelihood of obtaining the observed data. It is also worth noting that different numerical algorithms sometimes work better for different models or different data, in the sense that they produce fewer numerical errors and achieve convergence more easily.1 Therefore, if a numerical algorithm does not perform well, it is practical to run the same model with alternative numerical optimizers before concluding that something is wrong with the model.
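The two-step search described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function names (`neg_loglik`, `mirror`, `two_start_search`), the simulated data, and the layout in which the first columns of `X2` repeat the columns of `X1` are all hypothetical, and SciPy's BFGS optimizer is simply one convenient choice.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln


def neg_loglik(theta, y, X1, X2):
    """Negative Poisson-Logit log-likelihood with mean exp(X1 b) * Lambda(X2 g)."""
    k1 = X1.shape[1]
    beta, gamma = theta[:k1], theta[k1:]
    # log Lambda(z) = -log(1 + exp(-z)), evaluated stably with logaddexp
    log_mu = X1 @ beta - np.logaddexp(0.0, -(X2 @ gamma))
    return -np.sum(y * log_mu - np.exp(log_mu) - gammaln(y + 1.0))


def mirror(theta, k1):
    """Candidate second maximum (beta + gamma, -gamma).

    Assumes (hypothetical layout) that the first k1 columns of X2 are the
    columns of X1, so only those gamma entries are added to beta.
    """
    beta, gamma = theta[:k1], theta[k1:]
    return np.concatenate([beta + gamma[:k1], -gamma])


def two_start_search(y, X1, X2, rng):
    """Fit once, restart from the mirrored estimates, keep the better fit."""
    k1, k2 = X1.shape[1], X2.shape[1]
    # Zeros for the count part (Poisson estimates are the other option named
    # in the text) and random draws for the reporting part.
    start = np.concatenate([np.zeros(k1), rng.normal(scale=0.5, size=k2)])
    first = minimize(neg_loglik, start, args=(y, X1, X2), method="BFGS")
    second = minimize(neg_loglik, mirror(first.x, k1), args=(y, X1, X2),
                      method="BFGS")
    best = first if first.fun <= second.fun else second
    return first, second, best


# Simulated data with a deliberately weak exclusion restriction (illustrative).
rng = np.random.default_rng(0)
n = 2000
x, q = rng.normal(size=n), rng.normal(size=n)
X1 = np.column_stack([np.ones(n), x])     # count process regressors
X2 = np.column_stack([np.ones(n), x, q])  # reporting process also includes q
beta0, gamma0 = np.array([1.0, 0.5]), np.array([0.3, -0.8, 0.2])
y = rng.poisson(np.exp(X1 @ beta0) * expit(X2 @ gamma0))

first, second, best = two_start_search(y, X1, X2, rng)
print("log-likelihoods:", -first.fun, -second.fun)
print("accepted estimates:", best.x)
```

Comparing the two printed log-likelihood values shows how close the candidate maxima are; the estimates with the higher value are the ones to accept.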

In footnote 2 of page 14, we mentioned that the likelihood function of the Poisson-Logit model is not always globally concave, which might lead to multimodality. Therefore, we cannot always be certain that only two maxima exist; in practice, there might be cases with more than two. The method described above would not reach a hypothetical third local maximum. One way to reach a potential third maximum is to use random starting values for all the parameters and to experiment with different numerical algorithms. If this is repeated many times, it is highly likely that the estimation procedure will converge to every candidate maximum.
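A random multi-start search of this kind might look as follows. The sketch is generic and rests on assumptions: `multistart` and `toy_objective` are hypothetical names, the toy function is only a stand-in for a negative log-likelihood with two known optima, and the choice of BFGS and Nelder-Mead, the number of starts, and the duplicate tolerance are arbitrary.

```python
import numpy as np
from scipy.optimize import minimize


def multistart(objective, dim, n_starts, methods, rng, scale=2.0, tol=1e-2):
    """Minimize `objective` from random starts with several optimizers.

    Returns the distinct converged solutions as (value, x) pairs, best first.
    """
    found = []
    for _ in range(n_starts):
        start = rng.normal(scale=scale, size=dim)
        for method in methods:
            res = minimize(objective, start, method=method)
            if not res.success:
                continue  # skip runs that report a numerical failure
            # Keep the solution only if it is not a duplicate of an earlier one.
            if all(np.linalg.norm(res.x - x) > tol for _, x in found):
                found.append((float(res.fun), res.x))
    return sorted(found, key=lambda pair: pair[0])


def toy_objective(t):
    """Toy stand-in for a negative log-likelihood; minima at (+1, 0) and (-1, 0)."""
    return (t[0] ** 2 - 1.0) ** 2 + t[1] ** 2


rng = np.random.default_rng(1)
candidates = multistart(toy_objective, dim=2, n_starts=20,
                        methods=["BFGS", "Nelder-Mead"], rng=rng)
for value, x in candidates:
    print(f"objective = {value:.6f} at x = {np.round(x, 3)}")
```

In an application, `objective` would be the negative log-likelihood of the model, and the returned list would be the set of candidate maxima to compare.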

Of course, a researcher using over-dispersed data could assume that the observed data are generated by an NB1-Logit model and avoid using sign or exclusion restrictions. However, the NB1-Logit MLE is less robust than the Poisson-Logit MLE, since it does not fall within the linear exponential family (LEF). Therefore, consistency of the estimated parameters requires not only correct specification of $\mu_i$, but also that the data are truly generated by an NB1-Logit process. Most importantly, as mentioned in Section 1.5.3, identification is achieved by assuming a specific form of heteroskedasticity for $\alpha_i$. Therefore, the estimates will be inconsistent if the variance form is misspecified.

1 [...] analytic second derivatives and performs very well if the likelihood function is globally concave. The Berndt-Hall-Hall-Hausman (BHHH) algorithm uses only first derivatives (the outer product of the score), which results in lower computational intensity, and the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm, which is a refinement of Davidon-Fletcher-Powell (DFP), also uses only first-order derivatives. For details, see Chapter 10 in Cameron and Trivedi (2005).

As explained in subsection 1.5.5, the Poisson-Probit model is identified even when $x_{1i} = x_{2i}$. In spite of this, since the shape of the standard normal pdf is very similar to that of the logistic probability function, it is quite possible that multiple maxima still exist, with likelihood values very close to each other. The procedures described above can help the researcher check for alternative maxima, since $e^{x_i'\beta}\Phi(x_i'\gamma) \simeq e^{x_i'\beta}\Lambda(x_i'\gamma/s) \simeq e^{x_i'(\beta+\gamma/s)}\Phi(-x_i'\gamma)$, where $s$ is a scaling parameter ($\approx 3^{1/2}/\pi$; see Maddala, 1983) used so that the Probit and Logit parameters are approximately the same. Finally, the Poisson-CLogLog model is also identified even when $x_{1i} = x_{2i}$. However, the empirical results (which are not presented in Section 1.8 but are available on request) show that, again, a second maximum exists with a likelihood value very close to the global one and estimates very close to $\hat{\theta}^*$. This might be because, conditional on the vector $x_i$, the distribution of the Logit model is similar to the distribution of the CLogLog model.
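The Probit-Logit approximation used above can be checked numerically. The following sketch assumes an illustrative grid of index values and an arbitrary value for $x_i'\beta$; it only compares the functions involved and does not estimate any model.

```python
import numpy as np
from scipy.special import expit, ndtr

s = np.sqrt(3.0) / np.pi          # scaling parameter from the text
z = np.linspace(-6.0, 6.0, 2401)  # grid of index values x_i'gamma (illustrative)

# Step 1: Phi(z) is approximated by Lambda(z / s).
gap_cdf = np.max(np.abs(ndtr(z) - expit(z / s)))

# Step 2: the Logit mirror identity is exact:
# exp(xb) * Lambda(z / s) == exp(xb + z / s) * Lambda(-z / s).
xb = 0.3  # illustrative value of x_i'beta
lhs = np.exp(xb) * expit(z / s)
rhs = np.exp(xb + z / s) * expit(-z / s)
mirror_exact = np.allclose(lhs, rhs)

print(f"max |Phi(z) - Lambda(z/s)| = {gap_cdf:.4f}")
print(f"mirror identity holds numerically: {mirror_exact}")
```

The first step is only an approximation, with a gap that stays below 0.03 on this grid, while the second step is an exact identity. This is consistent with the Poisson-Probit model being formally identified while still admitting a second maximum with a nearby likelihood value.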