3.4.1 Design of Simulations
In this section I present a small simulation study to investigate the effect of the choice
of hd on the performance of the LDNP estimator. For all of the simulations in this section
I have used the Gaussian kernel for Kr and the Epanechnikov kernel for Kd, since the latter is the built-in kernel for the monoproc procedure in the R package monoProc that calculates the DNP
estimate. I use the four functions that I introduced in Section 3.3, p2(x)-p5(x). I do not use the
function p6(x) for reasons described in Chapter 4: essentially, the MISE-optimal bandwidth for this function was so small that it caused computational problems when trying to calculate
estimates of the psychometric function. I use these four functions to generate binomial samples,
assuming a binomial distribution with mean pi(x) and r repeats at each of n equally spaced stimulus levels x1, ..., xn in (0,1).
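As a rough illustration, the sampling scheme just described could be sketched in Python as follows. This is not the code used for the simulations: the spacing rule x_i = i/(n+1) and the logistic curve example_psychometric are assumptions made only for this example, standing in for the functions p2(x)-p5(x) defined in Section 3.3.

```python
import math
import random


def stimulus_levels(n):
    """Return n equally spaced points strictly inside (0, 1)."""
    # Assumption: x_i = i / (n + 1); the text only says "equally spaced in (0,1)".
    return [i / (n + 1) for i in range(1, n + 1)]


def example_psychometric(x):
    """Hypothetical increasing curve, a stand-in for one of p2(x)-p5(x)."""
    return 1.0 / (1.0 + math.exp(-10.0 * (x - 0.5)))


def simulate_sample(p, n, r, rng):
    """Draw Binomial(r, p(x_i)) successes at each level.

    Returns the stimulus levels and the observed proportions of successes.
    """
    xs = stimulus_levels(n)
    props = []
    for x in xs:
        # A Binomial(r, p(x)) count is the sum of r Bernoulli(p(x)) trials.
        successes = sum(1 for _ in range(r) if rng.random() < p(x))
        props.append(successes / r)
    return xs, props


rng = random.Random(1)  # fixed seed so the sketch is reproducible
xs, props = simulate_sample(example_psychometric, n=10, r=5, rng=rng)
```

Each call to simulate_sample produces one sample for a given combination of r and n; repeating the call gives the replicate samples for that combination.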
I considered the effect of varying values of both r and n. I considered n = 5,10,20,50
samples for each model, and for each combination of r and n. A more detailed description of the setup of these simulations can be found in Chapter 4.
For each sample I fitted the unconstrained regression estimate described in Section 1.5, using
a bandwidth that minimised the asymptotic MISE for the psychometric function pi(x). If the estimate was not monotone then I used the LDNP monotonicity constraint described in
this thesis to fit monotone regression estimates. If the unconstrained regression estimate was
already monotone then the sample was discarded and excluded from any further analysis.
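The screening step above can be sketched as follows. The estimator of Section 1.5 is not reproduced here, so a Nadaraya-Watson fit with a Gaussian kernel is used purely as an assumed stand-in. The helper names, the evaluation grid and the bandwidth h = 0.1 are all hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian density, used here as the regression kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nw_estimate(x0, xs, ys, h):
    """Nadaraya-Watson estimate at x0 (stand-in for the Section 1.5 estimator)."""
    weights = [gaussian_kernel((x0 - x) / h) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def is_monotone_increasing(values, tol=1e-12):
    """True if the sequence never decreases by more than tol."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))


def needs_monotonising(xs, ys, h, grid_size=101):
    """Evaluate the unconstrained fit on a grid over [0, 1] and flag dips."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    fit = [nw_estimate(g, xs, ys, h) for g in grid]
    return not is_monotone_increasing(fit)


xs = [0.2, 0.4, 0.6, 0.8]
increasing = [0.1, 0.3, 0.7, 0.9]   # monotone data: sample would be discarded
with_dip = [0.1, 0.6, 0.3, 0.9]     # dip at x = 0.6: sample would be kept
keep_first = needs_monotonising(xs, increasing, h=0.1)
keep_second = needs_monotonising(xs, with_dip, h=0.1)
```

Only samples for which needs_monotonising returns True would go on to the monotone fitting stage. Checking on a finite grid is an approximation, so a very fine dip between grid points could be missed.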
I used four different choices of the second bandwidth: hd = √hr, hr, hr², hr³. In total then for each non-monotone sample I created four monotone estimates. I then compared these estimates
to see which choice of hd produced the best results. I compared the square bias, the variance, the MSE and the MISE for each of the methods.
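For concreteness, the comparison criteria can be computed from replicate estimates along the following lines. This is a minimal sketch rather than the code used for the thesis. It assumes that every replicate estimate has been evaluated on a common grid, that the variance uses divisor m (so that MSE equals square bias plus variance exactly), and that the MISE is approximated by the trapezoidal rule. The function names are hypothetical.

```python
def pointwise_metrics(estimates, truth):
    """Pointwise square bias, variance and MSE across replicate curves.

    estimates: list of m curves, each a list of values on a common grid.
    truth: the true function evaluated on the same grid.
    """
    m = len(estimates)
    sq_bias, variance, mse = [], [], []
    for j, p_true in enumerate(truth):
        vals = [est[j] for est in estimates]
        mean = sum(vals) / m
        b2 = (mean - p_true) ** 2
        var = sum((v - mean) ** 2 for v in vals) / m  # divisor m, not m - 1
        sq_bias.append(b2)
        variance.append(var)
        mse.append(b2 + var)
    return sq_bias, variance, mse


def mise(mse, grid):
    """Trapezoidal approximation to the integral of the pointwise MSE."""
    return sum(
        0.5 * (mse[k] + mse[k + 1]) * (grid[k + 1] - grid[k])
        for k in range(len(grid) - 1)
    )


grid = [0.0, 0.5, 1.0]
truth = [0.2, 0.5, 0.8]
estimates = [[0.1, 0.5, 0.9], [0.3, 0.5, 0.7]]  # two toy replicate curves
sq_bias, variance, mse_curve = pointwise_metrics(estimates, truth)
total = mise(mse_curve, grid)
```

Running pointwise_metrics once per choice of hd, on the same set of non-monotone samples, gives curves that can be compared directly across the four bandwidths.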
3.4.2 Results of Simulations
The numbers of samples that required monotonisation are recorded in Tables 4.1-4.6 in Chapter 4
(see the discussion there for details). For each model, and each combination of r and n,
I have compared the square bias, variance and MSE for the LDNP estimates with each of the
four possible choices for the value of hd.
Figures (3.6)-(3.9) show the results for the models p2(x)-p5(x) when comparing the effect of changing the value of n. In each of these plots the value of r is kept constant at r = 5. For
the function p2(x) it is clear that the four choices of bandwidth hd have almost no effect on the
performance of the LDNP estimate except for very small values of x. All four choices result in comparable estimates of the psychometric function. This is also true for functions p4(x) and
p5(x). In all three cases the choice of bandwidth hd makes little difference to the final estimate.
This suggests that for these models the sample sizes are not large enough for the asymptotic theory of Section 3.3 to come into effect.
Function p3(x) is a little more interesting. When n = 5 the bandwidth
hd = √hr seems to perform better than the competing choices in terms of the square bias,
whereas when n = 50 the four estimates are largely the same. This agrees with the asymptotic theory that as
n → ∞ it is desirable to choose a bandwidth such that hd = o(hr). However, these simulation results suggest that the asymptotic theory does not hold for small sample sizes and that for small
samples the choice of hd is less crucial.
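To make the condition hd = o(hr) concrete, it helps to look at the ratio hd/hr for each of the four candidates. The display below is a short worked check, not a result from the thesis. It reads the four candidates as the square root, first, second and third powers of hr, and it assumes the usual kernel-smoothing setting in which hr → 0 as n → ∞.

```latex
\[
\frac{h_d}{h_r} =
\begin{cases}
h_r^{-1/2} \to \infty, & h_d = \sqrt{h_r},\\[2pt]
1, & h_d = h_r,\\[2pt]
h_r \to 0, & h_d = h_r^{2},\\[2pt]
h_r^{2} \to 0, & h_d = h_r^{3},
\end{cases}
\qquad \text{as } h_r \to 0 .
\]
```

Under this reading only the squared and cubed choices satisfy hd = o(hr). As a purely illustrative number, hr = 0.2 would give candidate values of roughly 0.447, 0.2, 0.04 and 0.008.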
Figures (3.10)-(3.13) show the results for the models p2(x)-p5(x) when comparing the effect
of changing the value of r. In these plots the sample size is kept constant at n = 10. Functions
p2(x), p4(x) and p5(x) all seem to show that for small sample sizes the choice of hd has no noticeable effect on the MSE performance of the LDNP method. Any of the values of hd
studied would provide similar estimates of the psychometric function. For the function p3(x)
there again seems to be a slight advantage in choosing hd = √hr, although this advantage was small compared to the other options.
The main conclusion from these simulations was that the asymptotic results stated in this
chapter do not hold for small sample sizes. We have a hint from the function p3(x) that as the sample size increases the best choice of bandwidth would be one such that hd = o(hr), but the
asymptotic results stated in this chapter seem to need sample sizes larger than n = 50 to be of
much use. Sample sizes this large are not common in psychometric studies. However, the simulation study has shown that there is no disadvantage in choosing a bandwidth hd such that hd = o(hr).
Bandwidths of this sort performed just as well as other choices.
I would conclude, then, that since the asymptotic theory shows that for large sample sizes it is best to choose hd such that hd = o(hr), and the simulation study shows that this choice
of bandwidth is not a bad choice in small sample sizes, it makes sense to choose hd = hr³ or