

In the preceding sections, we have taken the existence of a training and a validation sample as given. It turns out, however, that the precise validation scheme is an issue that deserves more attention. If the training sample and the validation sample are indeed separate samples, the scheme is typically referred to as a sample split. The drawback of a sample split is that either the model fitted to the training sample is estimated inefficiently (if the validation sample is a large part of the whole sample) or the validation exercise suffers from limited data (if the validation sample is small).

A solution to this problem is the use of cross-validation, where each observation is used for both model estimation and validation. For instance, if we use classical leave-one-out cross-validation to estimate a calibration model (Van Houwelingen & Le Cessie, 1990), we estimate the model without the $i$th observation to imitate an out-of-sample prediction for the default probability of observation $i$, $\widehat{PD}^{OS}_i$, $i = 1, \ldots, n$. Then, the calibration model is fitted to the whole sample using only $\widehat{PD}^{OS}_i$ or its linear part (see above) as predictors. Similarly, with respect to nonparametric calibration, buckets can be built according to $\widehat{PD}^{OS}_i$, and the whole sample can be used to estimate the default probability for each bucket and to compare it with the bucket averages of $\widehat{PD}^{OS}_i$.

Leave-one-out cross-validation requires the model to be re-estimated $n$ times and is thus computationally quite intensive. The computational burden can be considerably decreased by using $K$-fold cross-validation, where the sample is split into $K$ approximately equally sized parts and in each step one of these subsamples is held out to simulate the corresponding out-of-sample predictions.
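The K-fold idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `kfold_splits` and `out_of_sample_pd` are hypothetical, the folds are contiguous index ranges (so the observations are assumed to be in arbitrary order), and an intercept-only "model" (the default rate of the training folds) stands in for whatever default model is actually estimated. Setting `k` equal to the sample size gives leave-one-out cross-validation.

```python
from typing import List, Sequence, Tuple


def kfold_splits(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k contiguous folds of near-equal size.

    Returns one (training indices, validation indices) pair per fold.
    """
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    base, extra = divmod(n, k)
    splits = []
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional observation.
        size = base + (1 if fold < extra else 0)
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        splits.append((train, val))
        start += size
    return splits


def out_of_sample_pd(defaults: Sequence[int], k: int) -> List[float]:
    """Toy out-of-sample PDs from an intercept-only model (an assumption).

    Each observation receives the default rate of the folds it was NOT part of.
    """
    n = len(defaults)
    pd_os = [0.0] * n
    for train, val in kfold_splits(n, k):
        rate = sum(defaults[i] for i in train) / len(train)
        for i in val:
            pd_os[i] = rate
    return pd_os


if __name__ == "__main__":
    # Leave-one-out is the special case k == n.
    print(out_of_sample_pd([1, 0, 0, 0], k=4))
```

In a real application the intercept-only rate would be replaced by a fitted default model, but the bookkeeping of which observations are held out in which step stays the same.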

An important assumption underlying ordinary cross-validation is the independence of the observations.18 In many credit risk applications, including our empirical study in chapter 3, this assumption will not be met. For the case of stationary dependent data, Burman et al. (1994) introduced block cross-validation. Within this approach, for the prediction of the observation in some period $t$ one estimates the model omitting the observations $t-B, \ldots, t, \ldots, t+B$. $B$ is selected such that approximate independence between the training and the validation sample is achieved. Importantly, the rationale here is to simulate predictions for an observation of a "process that has the same distribution as [the original process] but is independent of it" (Burman et al., 1994, p. 351). This is arguably not the most relevant situation, at least with respect to credit default predictions. Rather, in practice one usually uses the information up to some period $t$ to make predictions for the subsequent periods of the same process. Note that this may well mean that there is some dependence between the training sample (which includes all observations up to period $t$) and the outcomes in periods $t+1, \ldots$.

18 Although ordinary cross-validation is generally invalid for dependent data, Burman & Nolan (1992) show that in certain cases it can still be saved even when the data exhibit dependencies.

An alternative approach that takes the latter argument into account is the application of recursive or rolling-window estimation schemes. Within the recursive approach, one estimates the model with all the information available up to period $t$ to make out-of-sample predictions for the upcoming periods and then increases $t$ step by step to generate a series of predictions. The size of the estimation window thus increases by one period in each step. In contrast, under a rolling-window approach the size of the estimation window is fixed, and in each step one period is added at the end and one period is omitted at the beginning. The recursive approach is also known as forward validation (Hjorth, 1982) or prequential analysis (Dawid, 1984). In the credit risk context, Stein (2004) argues in favor of the recursive scheme on the grounds that it is closest to the actual application of default prediction models in practice.

As a tool to analyze possible shrinkage effects, recursive and rolling-window validation schemes have important drawbacks which are similar to (albeit less pronounced than) the problems of a sample split. On the one hand, when one starts building validation samples at an early point in time, the first models are estimated on a rather small dataset. This will usually result in an overestimation of the shrinkage effect, since the amount of shrinkage decreases with the sample size. On the other hand, when the validation period starts late, only a rather small amount of data can be used for validation purposes. If the validation samples are used to estimate shrinkage parameters, this will result in inefficient estimation.
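To make the difference between the three schemes discussed above concrete, the following Python sketch lists which calendar periods enter the training sample when predicting from period t. The function names and the 1-based period numbering are assumptions made for illustration only; the text does not prescribe an implementation.

```python
from typing import List


def block_cv_training_periods(t: int, T: int, B: int) -> List[int]:
    """Block cross-validation: omit periods t-B, ..., t, ..., t+B."""
    return [s for s in range(1, T + 1) if abs(s - t) > B]


def recursive_training_periods(t: int) -> List[int]:
    """Recursive scheme: use every period up to and including t."""
    return list(range(1, t + 1))


def rolling_training_periods(t: int, window: int) -> List[int]:
    """Rolling-window scheme: use the last `window` periods ending in t."""
    if t < window:
        raise ValueError("fewer than `window` periods are available up to t")
    return list(range(t - window + 1, t + 1))


if __name__ == "__main__":
    T = 10
    print(block_cv_training_periods(5, T, B=2))
    print(recursive_training_periods(4))
    print(rolling_training_periods(4, window=3))
```

The sketch also shows the drawback described above: for small t the recursive and rolling-window training samples contain only a few periods, whereas block cross-validation always keeps most of the sample.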

To overcome the problems of both block cross-validation and recursive or rolling-window validation we propose a new kind of validation scheme which we call circular rolling-window (CRW) validation. The precise procedure is as follows:

1. Choose a block length $B$ so that it is reasonable to assume that observations in period $t$ and period $t+B$ are approximately independent. Choose $B$ such that $B \geq H$, where $H$ denotes the prediction horizon.

2. For calendar period $t$, estimate the model after omitting all information from periods $t+1, \ldots, t+B$. This includes a possible adjustment of the lifetimes and censoring indicators for the observations in period $t$ and before, as these may contain information about the omitted periods.

3. Use the model estimated in step 2 to make out-of-sample predictions from period $t$ (with a horizon of $H$).

4. Let $t$ run from the first period to period $T$, where $T$ is the last calendar period in the sample.
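The period bookkeeping of the four steps above can be sketched as follows. This is a minimal illustration, not the procedure's reference implementation: the names `crw_training_periods` and `crw_schedule` are hypothetical, periods are numbered 1 to T, and the adjustment of lifetimes and censoring indicators from step 2 is deliberately left out because it depends on the structure of the data.

```python
from typing import Dict, List


def crw_training_periods(t: int, T: int, B: int, H: int) -> List[int]:
    """Training periods for CRW validation when predicting from period t.

    Periods t+1, ..., t+B are omitted; periods 1, ..., t and
    t+B+1, ..., T are kept (the latter is the "circular" part).
    """
    if not 1 <= t <= T:
        raise ValueError("t must lie in 1..T")
    if H < 1 or B < H:
        raise ValueError("need B >= H >= 1")
    return [s for s in range(1, T + 1) if s <= t or s > t + B]


def crw_schedule(T: int, B: int, H: int) -> Dict[int, List[int]]:
    """Step 4: let t run from the first period to the last period T."""
    return {t: crw_training_periods(t, T, B, H) for t in range(1, T + 1)}


if __name__ == "__main__":
    for t, periods in crw_schedule(T=6, B=2, H=2).items():
        print(t, periods)
```

Every period from 1 to T appears as a prediction origin, and each training sample lacks at most B periods, which is the property that distinguishes the scheme from the recursive and rolling-window approaches.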

The CRW method differs from block cross-validation only in that one omits a block on the right-hand side (the future) alone and not on both sides of period $t$ (for the reasons discussed above). It is also very closely related to a recursive or rolling-window estimation scheme. The differences here are that for the CRW approach the periods $t+B+1, \ldots, T$ are additionally attached to each training sample (thereby motivating the name "circular") and that the validation period already starts with the first period, i.e. there are more validation periods.

An application of the CRW procedure to calibration analyses is straightforward. The CRW method produces out-of-sample default probabilities for each observation in the sample, which can be used for a nonparametric or parametric calibration analysis as described in the preceding sections. We will apply the CRW approach in section 3.3.4, which also involves some further discussion from a practical point of view.

Of course, the CRW scheme is also an interesting option for the evaluation of out-of-sample discriminatory power. However, it is more important in the context of calibration for two reasons. First, compared to the recursive scheme, the size of the training samples is much closer to the full sample size, which is important as samples that are too small would lead to an overestimation of the shrinkage effect. This kind of systematic bias is usually not present in the context of discrimination. Second, a calibration analysis serves not only to test predictive accuracy but potentially also to recalibrate the model. Since the final estimates may thus depend on the validation exercise, it is important to validate as efficiently as possible. This is achieved by the CRW method, as every period in the sample is used as a validation period.
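As an illustration of how out-of-sample default probabilities can feed a nonparametric calibration check, the sketch below sorts observations by their out-of-sample PD, forms buckets of near-equal size, and compares the average PD in each bucket with the realized default rate. The function name `bucket_calibration` and the equal-count bucketing rule are assumptions; the text does not specify how buckets are built.

```python
from typing import List, Sequence, Tuple


def bucket_calibration(
    pd_os: Sequence[float], defaults: Sequence[int], n_buckets: int
) -> List[Tuple[float, float, int]]:
    """Return (mean out-of-sample PD, realized default rate, size) per bucket.

    Buckets are formed by sorting on the out-of-sample PD and cutting the
    sorted sample into n_buckets groups of near-equal size (an assumption).
    """
    n = len(pd_os)
    if n != len(defaults):
        raise ValueError("pd_os and defaults must have the same length")
    if not 1 <= n_buckets <= n:
        raise ValueError("need 1 <= n_buckets <= number of observations")
    order = sorted(range(n), key=lambda i: pd_os[i])
    base, extra = divmod(n, n_buckets)
    rows = []
    start = 0
    for b in range(n_buckets):
        size = base + (1 if b < extra else 0)
        members = order[start:start + size]
        mean_pd = sum(pd_os[i] for i in members) / size
        default_rate = sum(defaults[i] for i in members) / size
        rows.append((mean_pd, default_rate, size))
        start += size
    return rows


if __name__ == "__main__":
    pds = [0.01, 0.02, 0.03, 0.04, 0.2, 0.3]
    outcomes = [0, 0, 0, 0, 1, 0]
    for mean_pd, rate, size in bucket_calibration(pds, outcomes, n_buckets=2):
        print(f"mean PD {mean_pd:.3f}  default rate {rate:.3f}  n={size}")
```

A well-calibrated model would show bucket default rates close to the bucket averages of the out-of-sample PDs; systematic gaps point to the shrinkage effects discussed above.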

Finally, it is important to note that methods like block cross-validation were originally designed for time series data. Typical credit default datasets, including the ones that are used in this work, have a panel structure, so that the question of transferability arises. More precisely, while block cross-validation and CRW validation clearly simulate predictions for new periods, predictions for new obligors are of interest as well. In many relevant datasets, however, such predictions for new obligors are automatically made by these methods, since new obligors enter the dataset over time, so that a prediction for a new period will usually also involve predictions for new obligors. This is also likely to be the relevant case in practice, where in a new period some obligors are already known to a lender and some additional obligors appear. Thus, as it is closely related to the actual prediction processes in practice, the application of CRW to panel data seems to be appropriate.