DEL DIRECTOR DEL DEPARTAMENTO

CAPÍTULO VI: DE LAS ÁREAS DE CONO CIMIENTO.

This section assesses and compares the goodness-of-fit between the multilevel credit scoring models and the logistic regression scorecard. I compute and report several measures of the fit of an estimated statistical model which are commonly applied in econometrics. Following Akaike (1974) and Schwartz (1978) I calculate and report Akaike Information criterion (AIC) and Schwarz criterion or Bayesian Information criterion (BIC). AIC and BIC criteria are deviance-based measures of fit of an estimated model. Generally, these criteria are applied to select the model which provides the best fit among the range of the fitted models while keeping the model parsimonious at the same time.

Table 3.5 reports the AIC and BIC criteria for the multilevel credit scoring models and the logistic regression scorecard. The model with the smallest values of both AIC and BIC criteria provides the best fit.

Postestimation diagnostics AIC BIC

Scorecard 1 2991.3 3090.2

Scorecard 2 2957.1 3062.6

Scorecard 3 2927.1 3045.7

Scorecard 4 2909.2 3041.0

Scorecard 5 2884.5 3029.4

Table 3.5. Postestimation diagnostic statistics: Akaike information criterion (AIC) and Bayesian information criterion (BIC).

According to the information criteria the multilevel scorecards (scorecard 2-5) outperform the conventional logit scorecard. It is also true that among the multilevel models AIC and BIC values decrease with the degree of the model’s complexity. Credit scorecards which include more microenvironment-specific effects and group-level characteristics show a superior classification performance.

A flexible version of a scoring model with multiple random-coefficients, microenvironment-level variables and interactions (scorecard 5) is preferred by the information criteria.

Next, I compute several scalar measures which aim to assess the predictive accuracy of the probability forecasts. Following Krämer and Güttler (2008) I use the predicted probabilities for the set of credit scoring models and apply a Brier score as well as logarithmic and spherical scores to check the accuracy of the forecasts.

The Brier score is the mean squared difference between the predicted probabilities and the observed binary outcomes (Brier (1950), Murphy (1973), Jolliffe and Stephenson (2003)). It is one of the oldest and most commonly used techniques for assessing the quality of the probability forecasts of a binary event (default/non-default).

The formula for the calculation of a Brier score is given in [3.1]. It cali- brates the average squared deviation of the predicted probabilities Ee_d from the actually observed outcomes _!. Lower values for the score indicate higher accuracy. The estimated Brier scores for the credit scorecards are reported in the second column in Table 3.6.

A6_A +9aA_ _C∑ C ! Eed , gh_A_ ! i 1, G_jklmD _{0 , a G_jklmDn} . [3.1] The logarithmic score is another measure of the forecasting accuracy of a model. The calculation of the score is shown in [3.2]. The logarithmic score values are always negative. For _! 1 , ln |Ee _d _! 1| is close to zero when Ee_d approaches one; for _! 0 it is close to zero when E

e

₆

is small. Accordingly, the scoring rule imposes that a model with the closest to zero logarithmic score shows the best performance. The third column in Table 3.6 presents the values of the logarithmic scores for the credit scoring models.

`akA6Dhp69 b9aA_ _C ∑ ln |EC!q e d ! 1|. [3.2] A slightly modified version of the logarithmic score is a spherical score which was introduced by Roby (1965). The calculation of the score is shown in [3.3]. The logarithmic score approaches unity when the predicted probabilities

are close to the observed outcomes. The values of the spherical scores for the credit scoring models are provided in the last column in Table 3.6.

+Eh_A69km b9aA_ _C

∑

|ZsUtr )H|

/Zu₎3_UHZ_s_r 3

.

[3.3]

Predictive accuracy

scores: Brier score

Logarithmic score Spherical score Scorecard 1 0.08090 -0.301 0.910 Scorecard 2 0.06736 -0.235 0.926 Scorecard 3 0.06252 -0.208 0.932 Scorecard 4 0.05663 -0.187 0.938 Scorecard 5 _0.05652 _-0.186 _0.939

Table 3.6. Score measures of predictive accuracy for the logistic regression and the multilevel credit scoring models: the Brier scores, logarithmic scores and spherical scores.

The results of the Brier scores confirm that the logistic scoring model produces the crudest forecasts yielding the highest per observation error. It is also true, that among the multilevel scorecards (scorecard 2-5), models with more microenvironment-specific effects provide a better calibration of the probabilities of default. The smallest error of the forecasts (0.05652) is produced by the flexible version of a credit scoring model (scorecard 5) which includes multiple area- specific coefficients, group-level variables and interactions.

Similar conclusion is made after comparing the logarithmic and spherical scores. The spherical scores are reported in the last column in the table. The best results of the logarithmic and spherical scores are given by the scorecard 5. It is also true that the score values increase with the degree of the model complexity.

To summarize the results of the predictive accuracy measures and the goodness-of-fit check, I conclude that the multilevel credit scoring models outperform the logistic regression scorecard. It is evident that the results of different postestimation diagnostics provide the same ranking to the credit scoring models discussed in the previous chapter. This confirms the main

contribution of this thesis is to introduce a multilevel scorecard which improves the forecasting quality of a scoring model.

Multilevel credit scoring is more efficient because it allows specifying a two-level structure where borrowers are nested within microenvironments and modelling random-effects. Microenvironment-specific effects vary across groups and show the impact of the economic and demographic conditions in the living areas on the riskiness of borrowers. These area-specific effects are viewed as unobserved determinants of default. Accordingly, including them in the scoring model improves the predictive quality and provides better fit to the data.

Importantly, microenvironment-specific effects capture the information on unobserved determinants of credit worthiness of individuals which impact the probability of default additionally to the observed characteristics measured at the borrower-level or group-level. This implies that two identical borrowers with the same personal characteristics but different living area conditions (microenvironments) are going to have different forecasts of probabilities because they are ex- posed to different area-specific hazards.

In the next section I apply a graphical illustration of the fitted model results in order to analyse the quality of borrowers and microenvironment- specific effects in the living areas with different economic and demographic conditions.

3.3 Predictive quality comparison: bivariate

In document B.O.U.H. 2001, núm. 21-22 - 1 de julio (página 67-69)