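For the linear model, an affine diffusion process is considered. The process is specified as
\[
\mu(x) =
\begin{pmatrix}
0.75 - x_1 + 0.5x_2 + x_3 \\
0.75 + 0.5x_1 - 2x_2 + 0.25x_3 \\
1.5 + 0.25x_1 + x_2 - 3x_3
\end{pmatrix},
\qquad
\Sigma(x) =
\begin{pmatrix}
\sqrt{0.3x_1 + 0.3x_2} & 0 & 0 \\
0 & \sqrt{0.3x_2} & 0 \\
0 & 0 & \sqrt{0.3x_3}
\end{pmatrix}.
\]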
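To make the specification concrete, the following is a minimal sketch of how such a process could be simulated. The Euler-Maruyama scheme, the number of substeps, the clipping of negative arguments under the square root and all function names are assumptions made for illustration; the text does not state which discretization was used.

```python
import math
import random


def drift(x):
    """Drift vector mu(x) of the affine specification."""
    x1, x2, x3 = x
    return (
        0.75 - x1 + 0.5 * x2 + x3,
        0.75 + 0.5 * x1 - 2.0 * x2 + 0.25 * x3,
        1.5 + 0.25 * x1 + x2 - 3.0 * x3,
    )


def diffusion_diag(x):
    """Diagonal of Sigma(x); negative arguments are clipped at 0 (assumption)."""
    x1, x2, x3 = x
    return (
        math.sqrt(max(0.3 * x1 + 0.3 * x2, 0.0)),
        math.sqrt(max(0.3 * x2, 0.0)),
        math.sqrt(max(0.3 * x3, 0.0)),
    )


def simulate_path(x0, n_per_unit, T, substeps=10, seed=0):
    """Euler-Maruyama path, recorded n_per_unit times per unit of time on [0, T]."""
    rng = random.Random(seed)
    dt = 1.0 / (n_per_unit * substeps)
    x = tuple(x0)
    path = [x]
    for step in range(1, n_per_unit * T * substeps + 1):
        mu = drift(x)
        sig = diffusion_diag(x)
        # Sigma is diagonal, so each component gets its own Gaussian increment.
        x = tuple(
            xi + mi * dt + si * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            for xi, mi, si in zip(x, mu, sig)
        )
        if step % substeps == 0:
            path.append(x)
    return path
```

A call such as `simulate_path((2.3, 1.1, 1.1), 35, 30)` would return the starting value plus 35 × 30 recorded observations; the starting value here is a hypothetical choice.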
The estimation is performed on the cube G = [0.95, 4.25]×[0.50, 1.85]×[0.60, 1.65].
Simulations give P(Xt ∈ G) = 0.86. On average, the estimation is therefore based on about 900 observations for the small sample and about 4300 observations for the large sample.
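The observation counts can be checked with a short calculation. The sketch below assumes that the total sample size is n × T (n observations per unit of time over a time span T), which is inferred from the reported numbers rather than stated in this section; the function name is hypothetical.

```python
def expected_in_cube(n_per_unit, T, p_in_cube=0.86):
    """Expected number of observations inside G, assuming n * T observations in total."""
    return p_in_cube * (n_per_unit * T)


small = expected_in_cube(35, 30)    # 0.86 * 1050, about 903
large = expected_in_cube(50, 100)   # 0.86 * 5000, about 4300
```

Under this assumption the small sample yields roughly 903 observations in G, which the text rounds to 900.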
First, the results for the drift function are presented. The normalized function is given by
µ1,1(x1) = 2.32 − x1,
where the mean was calculated by simulations. The bandwidth is given by $h = (h_1^0, h_2^0, h_3^0)' \times T^{-1/5}$. The results in Table 4.1 indicate that the bandwidths h2 and h3 have an influence on the estimator of µ1,1(x1). As expected from the asymptotic
4.5 Simulation 95
Table 4.1: MISE for estimating µ1,1(x1) in the linear specification

                        n = 35, T = 30                  n = 50, T = 100
                        Nadaraya-Watson Local Linear    Nadaraya-Watson Local Linear
(h_1^0, h_2^0, h_3^0)   Mean    Median  Mean    Median  Mean    Median  Mean    Median
(1.2, 0.38, 0.42)       0.512   0.340   1.478   0.506   0.143   0.118   0.180   0.145
(1.2, 0.54, 0.42)       0.434   0.305   0.791   0.454   0.146   0.113   0.179   0.139
(1.2, 0.54, 0.55)       0.406   0.314   0.863   0.438   0.140   0.114   0.171   0.144
(1.7, 0.54, 0.55)       0.278   0.183   1.106   0.350   0.108   0.090   0.138   0.111
(1.7, 0.54, 0.68)       0.313   0.218   1.234   0.410   0.099   0.086   0.137   0.114
(1.7, 0.70, 0.55)       0.254   0.174   0.729   0.389   0.096   0.068   0.136   0.097
(1.7, 0.70, 0.68)       0.285   0.208   0.634   0.366   0.101   0.083   0.132   0.105
(2.2, 0.70, 0.68)       0.204   0.158   0.714   0.302   0.099   0.074   0.118   0.078
(2.2, 0.86, 0.68)       0.212   0.149   0.625   0.325   0.097   0.080   0.115   0.086
(2.2, 0.70, 0.80)       0.199   0.154   0.508   0.243   0.102   0.079   0.123   0.082
96 4. ESTIMATING ADDITIVE DIFFUSIONS
[Figure 4.1: four panels in a 2 × 2 layout; horizontal axis x1 from 1.0 to 4.0, vertical axis µ1,1(x1) from −4 to 2.]
Figure 4.1: Estimators of the drift part µ1,1(x1) for n = 35, T = 30 (left column) and n = 50, T = 100 (right column). Nadaraya-Watson estimators are in the first row, local linear estimators in the second. Each panel shows the true function (solid), the pointwise 0.25 and 0.75 quantiles of the estimates over 201 simulations (dashed-dotted) and the MISE-median estimator over the simulations (dashed). The bandwidth is given by $h = (1.7, 0.85, 0.66)' \times T^{-1/5}$.
results, this effect is smaller for the large sample. However, the influence of h1 on the MISE of the estimator of µ1,1(x1) is stronger than the influence of h2 and h3. This finding gives evidence that the recommended bandwidth selection procedure leads to reliable results even in small samples.
The mean and the median differ considerably, which indicates a number of outliers in the simulations. The outliers arise because in some simulations the cube G is not entirely filled with observations. In that case the density estimators can be very close to zero (see footnote 2). This causes problems for the marginal estimators and for the integration over the estimated conditional densities in the algorithm. The backfitting estimators can then be dominated by a few extreme values that are based on too few observations in a local neighborhood. However, this simulation effect decreases with an increasing number of observations. In practice, the cube G would be selected such that there are enough observations to avoid this problem. Thus, the median is the more reliable measure of the performance of the estimator.
2 The convention 0/0 = 0 is used in the implementation.
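The role of the 0/0 = 0 convention can be illustrated with a univariate Nadaraya-Watson estimator. This is only a sketch under stated assumptions: the Epanechnikov kernel, the function names and the toy data are hypothetical, and the estimator in the text is a multivariate backfitting procedure, not this simple smoother.

```python
def epanechnikov(u):
    """Compactly supported kernel (an assumed choice, not taken from the text)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def nadaraya_watson(x0, xs, ys, h):
    """Kernel-weighted average of ys at x0, with the convention 0/0 = 0."""
    num = 0.0
    den = 0.0
    for x, y in zip(xs, ys):
        w = epanechnikov((x - x0) / h)
        num += w * y
        den += w
    # If no observation falls within distance h of x0, both sums are zero.
    return num / den if den > 0.0 else 0.0


xs = [1.0, 1.1, 1.2]
ys = [2.0, 2.0, 2.0]
inside = nadaraya_watson(1.1, xs, ys, 0.5)   # neighborhood contains data
empty = nadaraya_watson(3.0, xs, ys, 0.5)    # empty neighborhood, returns 0.0
```

When the neighborhood is nearly empty rather than empty, the denominator is tiny but positive, and a single observation determines the estimate. This is the mechanism by which a few simulations can produce extreme values.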
In all settings the Nadaraya-Watson estimator outperforms the local linear estimator. This effect can only partly be attributed to the increasing variance of the local linear estimator at boundary points, because some boundary points are excluded in the calculation of the MISE. In the large sample in particular, no boundary points are used to estimate the MISE, yet the Nadaraya-Watson estimator still performs better.
Figure 4.1 underlines these findings. The upper row displays the Nadaraya-Watson estimator and the lower row the local linear estimator. The left column shows the results for the small sample and the right column the results for the large sample. The local linear estimators show a large variance near the boundary, which is smaller for the large sample, but their performance is still worse than that of the Nadaraya-Watson estimators.
The second important finding is that the Nadaraya-Watson estimator seems to exhibit a larger bias, because the interquartile range of the estimators seems to follow a different slope than the true function. Theoretically, this can be explained by the difference in the bias behavior of the two estimators. Recall from Theorem 4.1 that the bias of the Nadaraya-Watson estimator is given only implicitly as an additive projection of the first derivative of the component function and of the density. In contrast, the bias of the local linear estimator is zero for this data generating process, because the second derivative of µ1,1(x1) is zero. The bias effect is reduced for the large sample. Recall, however, that the MISE of the Nadaraya-Watson estimator is always smaller. Therefore its variance must be much smaller in finite samples.
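The difference in bias behavior can be seen in a noise-free toy example. The sketch below is not the backfitting estimator of the text: it uses a univariate equidistant design on [0.95, 4.25], an Epanechnikov kernel and a fixed bandwidth of 0.5, all of which are assumptions for illustration. It shows that a local linear fit reproduces a linear function exactly, while a local constant (Nadaraya-Watson) fit is biased at the left boundary.

```python
def epanechnikov(u):
    """Compactly supported kernel (an assumed choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def nadaraya_watson(x0, xs, ys, h):
    """Local constant fit at x0 (0/0 = 0 convention)."""
    num = den = 0.0
    for x, y in zip(xs, ys):
        w = epanechnikov((x - x0) / h)
        num += w * y
        den += w
    return num / den if den > 0.0 else 0.0


def local_linear(x0, xs, ys, h):
    """Intercept of a kernel-weighted least squares line centered at x0."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = epanechnikov(d / h)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    det = s0 * s2 - s1 * s1
    return (s2 * t0 - s1 * t1) / det if det > 1e-12 else 0.0


# Noise-free observations of the normalized drift component 2.32 - x1.
xs = [0.95 + 3.3 * i / 100 for i in range(101)]
ys = [2.32 - x for x in xs]

x0 = xs[0]                                    # left boundary of the grid
nw_left = nadaraya_watson(x0, xs, ys, 0.5)    # averages only points to the right
ll_left = local_linear(x0, xs, ys, 0.5)       # recovers 2.32 - x0 up to rounding
```

At the boundary the local constant fit averages function values from one side only and therefore lies below the true value, whereas the local linear fit removes this first-order term. In the text the situation is more involved, since the bias of the Nadaraya-Watson estimator also depends on the density of the process.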
The results for estimating the diffusion function a11,1(x1) = 0.3(x1 − 2.32) are given in Table 4.2 and Figure 4.2. The bandwidth is now given by $h = (h_1^0, h_2^0, h_3^0)' \times (nT)^{-1/5}$ because of the faster rate of convergence of the diffusion estimator. The findings from Table 4.2 are similar to those for the drift estimator. In all settings the Nadaraya-Watson estimator has a smaller MISE than the local linear estimator; however, the difference decreases with increasing sample size. The effect of the bandwidth constant on the MISE is much smaller than for the drift function, which is presumably due to the faster rate of convergence.
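The two bandwidth rules differ only in the rate factor: T^(−1/5) for the drift and (nT)^(−1/5) for the diffusion. A small sketch of this scaling follows; the function name and the `target` argument are hypothetical and introduced only for illustration.

```python
def bandwidth(h0, n, T, target):
    """Scale the bandwidth constants h0 by the rate factor for drift or diffusion."""
    if target == "drift":
        rate = T            # drift bandwidth shrinks with the time span only
    elif target == "diffusion":
        rate = n * T        # diffusion bandwidth shrinks with n * T
    else:
        raise ValueError("target must be 'drift' or 'diffusion'")
    scale = rate ** (-1.0 / 5.0)
    return tuple(c * scale for c in h0)


h0 = (1.7, 0.70, 0.68)
h_drift = bandwidth(h0, n=35, T=30, target="drift")
h_diffusion = bandwidth(h0, n=35, T=30, target="diffusion")
```

For the same constants the diffusion bandwidth is smaller by the factor n^(−1/5), which reflects the faster rate of convergence mentioned above.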
Next, compare the results of Figure 4.2 to the estimation of the drift function.
One can see that the bias of the Nadaraya-Watson estimator is still present, but the effect is much smaller. The interquartile range of the Nadaraya-Watson estimator is still smaller than that of the local linear estimator and the magnitude of this
Table 4.2: MISE for estimating a11,1(x1) in the linear specification

                        n = 35, T = 30                  n = 50, T = 100
                        Nadaraya-Watson Local Linear    Nadaraya-Watson Local Linear
(h_1^0, h_2^0, h_3^0)   Mean    Median  Mean    Median  Mean    Median  Mean    Median
(1.2, 0.38, 0.42)       0.453   0.403   0.506   0.421   0.388   0.378   0.394   0.384
(1.2, 0.54, 0.42)       0.446   0.400   0.475   0.421   0.386   0.382   0.391   0.386
(1.2, 0.54, 0.55)       0.433   0.389   0.456   0.406   0.390   0.385   0.395   0.393
(1.7, 0.54, 0.55)       0.383   0.365   0.434   0.393   0.370   0.367   0.380   0.380
(1.7, 0.54, 0.68)       0.405   0.366   0.512   0.387   0.374   0.374   0.384   0.381
(1.7, 0.70, 0.55)       0.421   0.365   0.458   0.397   0.376   0.373   0.385   0.379
(1.7, 0.70, 0.68)       0.445   0.381   0.487   0.406   0.373   0.368   0.383   0.381
(2.2, 0.70, 0.68)       0.372   0.358   0.413   0.399   0.361   0.355   0.375   0.369
(2.2, 0.86, 0.68)       0.406   0.378   0.505   0.405   0.369   0.363   0.383   0.371
(2.2, 0.70, 0.80)       0.392   0.362   0.448   0.404   0.371   0.364   0.387   0.383
[Figure 4.2: four panels in a 2 × 2 layout; horizontal axis x1 from 1.0 to 4.0, vertical axis a11,1(x1) from −0.5 to 1.0.]
Figure 4.2: Estimators of the diffusion part a11,1(x1) for n = 35, T = 30 (left column) and n = 50, T = 100 (right column). Nadaraya-Watson estimators are in the first row, local linear estimators in the second. Each panel shows the true function (solid), the pointwise 0.25 and 0.75 quantiles of the estimates over 201 simulations (dashed-dotted) and the MISE-median estimator over the simulations (dashed). The bandwidth is given by $h = (1.7, 0.70, 0.66)' \times (nT)^{-1/5}$.
difference is much smaller than for the drift estimation. All these findings highlight the faster rate of convergence for the estimation of the diffusion function.