The previous sections discussed the problems with the standard HAC robust covariance matrix of Newey and West (1987) and showed that they can be overcome with the correct asymptotic theory. This section discusses the last required ingredient: the number of lags or, more precisely, the kernel bandwidth.
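To make the role of the number of lags concrete, the following is a minimal sketch of a Bartlett-kernel long-run variance in the spirit of Newey and West (1987). It is an illustration only: it treats the simplest case of testing the mean of a single return series rather than a regression alpha, and the function names `newey_west_lrv` and `hac_t_stat` are hypothetical, not taken from the text.

```python
import numpy as np

def newey_west_lrv(x, lags):
    """Bartlett-kernel long-run variance of a 1-D series (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    u = x - x.mean()                       # demeaned series
    lrv = u @ u / n                        # lag-0 autocovariance
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1.0)    # Bartlett weight, declines linearly in j
        gamma_j = u[j:] @ u[:-j] / n       # lag-j sample autocovariance
        lrv += 2.0 * weight * gamma_j
    return lrv

def hac_t_stat(x, lags):
    """t-statistic for a zero mean using the HAC standard error above."""
    x = np.asarray(x, dtype=float)
    return x.mean() / np.sqrt(newey_west_lrv(x, lags) / x.size)
```

With `lags=0` the estimator reduces to the ordinary sample variance. For positively autocorrelated series, adding lags typically raises the variance estimate and lowers the t-statistic.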
Figure 2.3: Proportion of Significant Signals Depending on Adjustment of Standard Errors. The figure shows the proportion of significant signals at the 5%, 1%, and 0.1% level as a function of the adjustment of standard errors and the block length in the bootstrap. Lines show the proportion as a function of lags in the Newey and West (1987) adjustment for autocorrelation and heteroskedasticity. Squares, triangles, and circles stand for bootstrapped values using alphas, unadjusted t-statistics, and HAC adjusted t-statistics as in Newey and West (1987), respectively. The figure is based on 1,590 fundamental signals, 1,497 data-mined and 93 from published studies. The sample is restricted to industrial stocks with a price over $1 and a capitalization larger than the bottom NYSE decile at the end of the previous June. It spans July 1963 to December 2016. The equal-weighted long-short portfolios are constructed by buying stocks in the top decile of the signals and shorting stocks in the bottom decile of the signals. The alphas are estimated with the Fama-French five-factor model.

Andrews and Monahan (1992) and Newey and West (1994) have proposed automatic rules to select the bandwidth. The problem with these rules is that they were optimized to deliver an accurate standard error, not optimal confidence interval coverage in tests. Sun et al. (2008) explain the problem: “For typical economic time series, the optimal bandwidth that minimizes a weighted average of type I and type II errors is larger by an order of magnitude than the bandwidth that minimizes the asymptotic mean squared error of the corresponding long-run variance estimator.” The automatic selection rules thus tend to select too few lags, which results in undercoverage of confidence intervals and rejection of too many hypotheses. Sun et al. (2008) also provide a new rule for automatic selection that overcomes these problems. It is, however, not suited for the type of bootstrap that we use here and is again based on asymptotic behaviour. The problem of bandwidth selection does not disappear for the bootstrap; it translates into the selection of block length.24
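The way block length takes over the role of bandwidth can be illustrated with a resampling sketch. The text does not specify the exact block scheme, so the code below assumes a moving-block bootstrap with overlapping blocks of fixed length; the function name `moving_block_indices` is hypothetical.

```python
import numpy as np

def moving_block_indices(n, block_length, rng):
    """Draw n resampled time indices as concatenated overlapping blocks.

    Assumption: a moving-block scheme; the source does not confirm which
    block bootstrap variant is used.
    """
    n_blocks = -(-n // block_length)                       # ceil(n / block_length)
    # Each block starts at a uniformly drawn admissible position.
    starts = rng.integers(0, n - block_length + 1, size=n_blocks)
    # Expand each start into block_length consecutive indices.
    idx = (starts[:, None] + np.arange(block_length)[None, :]).ravel()
    return idx[:n]                                         # trim to the sample size
```

Within each block the original time ordering is preserved, so dependence up to roughly `block_length` periods survives the resampling. A block length of 1 reduces to the i.i.d. bootstrap, while very long blocks leave few distinct blocks to draw from.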
We next turn to assessing the impact of the choice of bandwidth on the number of significant signals. Figure 2.3 shows how the proportion of signals significant at the 5% (red), 1% (light blue), and 0.1% (dark blue) level evolves depending on the number of lags and the estimator of the covariance matrix. We consider only equal-weighted portfolios of data-mined signals and anomalies here, as equal-weighting is the overwhelmingly common choice in the literature (McLean and Pontiff (2016)). The performance of anomalies is adjusted for the five Fama-French factors. The horizontal lines correspond to the Newey and West (1987) estimator with critical values from the normal distribution. We also provide results for three specifications of the naive bootstrap. Critical values for alphas without any standardization are depicted with squares. The upward-facing triangles show the naive bootstrap for the t-statistic with standard errors without any adjustment for heteroskedasticity or autocorrelation. The circles then describe the proportion of significant signals with the Newey and West (1987)
Omitted Strategy Bias in Anomalies Research
HAC robust estimator and bootstrapped critical values.
One apparent feature is that a larger number of lags leads to fewer significant signals. The number generally drops steadily over the first 24 lags. It levels off after 24 lags and reaches its minimum at about 60. It starts increasing after that because the block length starts causing problems with the randomness of the sample. The increase is mainly in the tail of the distribution. The decline in the number of significant signals can be substantial. With the Newey and West (1987) adjustment and critical values from the normal distribution, 10% of all signals are significant at the 0.1% level. This drops to 0.3% for bootstrapped critical values and 24 lags. Notably, the critical values from the normal distribution seem to over-reject the null hypothesis for any number of lags. This is in line with the evidence in the previous subsection. The green vertical lines correspond to the mean number of lags selected by Andrews and Monahan (1992) and Newey and West (1994). The mean selected number of lags is 5.5 and 10.4, respectively.25 The proportion of significant signals is then given by the intersection of these vertical lines with the lines for the Newey and West (1987) adjustment.
There is also a large discrepancy between different versions of the bootstrap. The simplest version, without studentization, tends to reject the most signals, possibly due to the distorted size of the tests.26 It can also be heavily distorted by large outliers. Studentization should improve the properties of the bootstrap, and its importance is emphasized in Davison and Hall (1993), Götze et al. (1996), and Romano and Wolf (2006).

The drop in the number of significant signals with the number of lags in the HAC robust adjustment does not necessarily imply improper test size at small numbers of lags. This is because there is a trade-off between the type I and type II error rates. The type I error rate decreases with the number of lags but the type II error rate increases. The test then has poor power and rejects fewer truly significant signals. The bootstrap without HAC robust adjustment should capture the role of block size in the bootstrap, since that is the only thing that is changing. This should in turn capture the effect of autocorrelation on standard errors without any distortion in power as in the case of the Newey and West (1987) adjustment. The optimal number of lags for the naive bootstrap with HAC adjustment therefore appears to be around six, where the number of rejected hypotheses is similar to the minimum number of rejected hypotheses without the HAC adjustment. A larger number of lags then probably leads to poor power of the tests. We also provide the proportion of signals significant with the Kiefer et al. (2000) estimator, denoted by inverted triangles. It is higher than for the naive bootstrap with the Newey and West (1987) adjustment and 24 lags. This hints that the optimal number of lags is lower than that. We will next turn to simulations to study the power and size of the tests in a controlled environment.
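The studentized variant discussed above, with bootstrapped critical values for a HAC adjusted t-statistic, can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used for Figure 2.3: it tests the mean of a single series rather than a factor-model alpha, assumes a moving-block scheme, and centers each bootstrap statistic at the sample mean. All function names are hypothetical.

```python
import numpy as np

def lrv_bartlett(x, lags):
    """Bartlett-kernel long-run variance of a 1-D array."""
    u = x - x.mean()
    n = x.size
    lrv = u @ u / n
    for j in range(1, lags + 1):
        lrv += 2.0 * (1.0 - j / (lags + 1.0)) * (u[j:] @ u[:-j]) / n
    return lrv

def hac_t(x, lags, center=0.0):
    """HAC t-statistic for the hypothesis that the mean equals `center`."""
    return (x.mean() - center) / np.sqrt(lrv_bartlett(x, lags) / x.size)

def bootstrap_critical_value(x, lags, block_length, level=0.05,
                             n_boot=999, seed=0):
    """Two-sided critical value for |t| from a studentized block bootstrap."""
    x = np.asarray(x, dtype=float)
    n = x.size
    rng = np.random.default_rng(seed)
    n_blocks = -(-n // block_length)              # ceil(n / block_length)
    offsets = np.arange(block_length)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        starts = rng.integers(0, n - block_length + 1, size=n_blocks)
        xb = x[(starts[:, None] + offsets).ravel()[:n]]
        # Studentize inside each replication and center at the sample mean,
        # so the bootstrap distribution mimics the null distribution.
        stats[b] = abs(hac_t(xb, lags, center=x.mean()))
    return np.quantile(stats, 1.0 - level)
```

A signal would then be called significant when `abs(hac_t(x, lags))` exceeds the returned critical value, instead of a normal quantile such as 1.96. Setting `lags=0` gives the bootstrap for the unadjusted t-statistic, in which only the block length varies.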
25 Note that the number of lags is lower for value-weighted portfolios, at 3 and 9 lags, respectively.
26 Shao and Politis (2013) showed that this version of the bootstrap does not have a normal distribution in finite samples, analogously to fixed-b asymptotics for HAC errors.