
The minimum distance classification method of SAIS was found to perform far more consistently than the discriminant analysis and polynomial techniques (refer to Chapter 2). Therefore, in this chapter SAIS uses the minimum distance classification method for building a scorecard.

The performance of SAIS is tested on three consumer credit datasets: Australian, German and Thomas credit approval datasets. The first two datasets (Australian and German) are publicly available benchmark datasets from the University of California Irvine (Blake & Merz 1998). They have also been used in the Statlog project, which is designed to test and evaluate statistical and logical learning algorithms on large-scale and commercially important applications, such as classification and prediction, in order to determine to what extent the various techniques meet industry needs (Michie, Spiegelhalter & Taylor 1994). The objectives of the project are threefold (Michie et al. 1994):

1. to provide critical performance measurements on available classification procedures;

2. to indicate the nature and scope of further development required for particular methods to meet the expectations of industrial users;

3. to indicate the most promising avenues of development for the commercially immature approaches.

The Thomas dataset, on the other hand, is obtained from Thomas et al. (2002).

These three datasets are selected as they have been used in a number of credit scoring studies, making it possible to compare the performance of our scorecard against others. Table 5.1 gives a description of these datasets. All three datasets contain different numbers of records and variable types. However, only the Australian dataset can be considered as a near balanced dataset, having a near equal distribution of ‘good’ and ‘bad’ classes. The German and Thomas datasets are unbalanced since they have an average bad : good ratio of 1 : 2.6. By using three types of credit scoring dataset, we are able to determine if SAIS is suitable for developing scorecards which can be used in different environmental settings.

Table 5.1: Datasets used for experiments

| Dataset | Number of attributes | Variable type | Number of records (n) | Classes | n with missing attributes |
|---|---|---|---|---|---|
| Australian | 15 | 6 continuous, 9 categorical | 690 | 307 good, 383 bad | 37 |
| German | 20 | 7 continuous, 13 categorical | 1000 | 700 good, 300 bad | - |
| Thomas | 14 | 10 continuous, 4 categorical | 1225 | 902 good, 323 bad | - |
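The class ratios quoted above can be checked directly from the counts in Table 5.1. The short Python sketch below does only that arithmetic; the dictionary and function names are illustrative and are not part of SAIS.

```python
# Class counts copied from Table 5.1 as (good, bad) per dataset.
CLASS_COUNTS = {
    "Australian": (307, 383),
    "German": (700, 300),
    "Thomas": (902, 323),
}


def good_per_bad(good: int, bad: int) -> float:
    """Return the number of 'good' records for every 'bad' record."""
    return good / bad


ratios = {name: good_per_bad(g, b) for name, (g, b) in CLASS_COUNTS.items()}

# Average over the two unbalanced datasets only (German and Thomas).
unbalanced_mean = (ratios["German"] + ratios["Thomas"]) / 2

for name, ratio in ratios.items():
    print(f"{name}: bad : good = 1 : {ratio:.2f}")
print(f"German/Thomas average: bad : good = 1 : {unbalanced_mean:.1f}")
```

The German and Thomas ratios average to roughly 1 : 2.6, while the Australian ratio is close to 1 : 0.8, which is why only that dataset is treated as near balanced.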

5.2.1 Data Preprocessing

A stepwise regression analysis is performed on the three datasets in order to select the most relevant explanatory variables. An explanation of this regression method was provided in Section 2.5.2. It is chosen as the main method for variable selection because it is used by most financial institutions in the field of credit scoring.

A full list of the predictor variables of the three datasets used in this study after data preprocessing is shown in Table 5.2. While it was possible to obtain descriptive information on the attributes used in the German and Thomas datasets, it was not possible to do so for the Australian dataset due to confidentiality issues. The adjusted R² for each dataset has also been included. It is a measure of the percentage of explained variation in the dependent variable that takes into account the relationship between the number of cases and the number of independent variables in a regression model (Groebner, Shannon, Fry & Smith 2008). The Australian dataset has the highest adjusted R², meaning that its predictor variables are more able to explain the dependent variable.
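For readers who want the adjustment spelled out, the sketch below uses the standard textbook formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of cases and k the number of independent variables. It is assumed here that this is the form used in the study; the function name and the example inputs are hypothetical and are not results from the three datasets.

```python
def adjusted_r2(r2: float, n_cases: int, n_predictors: int) -> float:
    """Standard adjusted R^2: penalises R^2 for the number of predictors.

    r2           -- ordinary coefficient of determination
    n_cases      -- number of observations (n)
    n_predictors -- number of independent variables (k)
    """
    dof = n_cases - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more cases than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_cases - 1) / dof


# Hypothetical inputs, chosen only to show the direction of the adjustment.
example = adjusted_r2(r2=0.5, n_cases=100, n_predictors=5)
print(f"R^2 = 0.500 -> adjusted R^2 = {example:.3f}")
```

With a fixed R², adding predictors or reducing the number of cases lowers the adjusted value, which is the penalty the paragraph above describes.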

Table 5.2: Attributes used for experiments

| Num | Australian dataset | German dataset | Thomas dataset |
|---|---|---|---|
| 1 | A2 | Status of checking account | Year of birth |
| 2 | A3 | Duration | Number of dependants |
| 3 | A4 | Credit history | Home phone |
| 4 | A5 | Credit amount | Spouse’s income |
| 5 | A6 | Saving account bonds | Applicant’s income |
| 6 | A8 | Present employment since | Residential status |
| 7 | A9 | Instalment rate in percentage of disposable income | Mortgage balance outstanding |
| 8 | A10 | Personal status and sex | Outgoings on loans |
| 9 | A11 | Other debtors/guarantors | Outgoings on hire purchase |
| 10 | A12 | Property | Outgoings on credit cards |
| 11 | A14 | Other instalment plans | |
| 12 | A15 | Housing | |
| 13 | | Number of existing credits at this bank | |
| 14 | | Telephone | |
| 15 | | Foreign worker | |
| | adjusted R² = 0.594 | adjusted R² = 0.227 | adjusted R² = 0.058 |

5.2.2 Experimental Procedure

A 10-fold cross-validation (CV) technique is used to partition each dataset into training and testing sets (refer to Figure 4.6). Ten different sets of data, each containing one portion as the testing set and nine portions as the training set, are therefore generated. SAIS is run 600 times on the 10 training sets of each dataset and the results obtained indicate that, on average, the performance of the classifier becomes near-constant after about 120 iterations.

The classifier is then run on the 10 testing sets of each dataset, with each set of data producing a classification performance for SAIS. The 10 classification results are averaged to yield an overall classification performance of the model. Since SAIS is non-deterministic, in that some degree of randomness is used in evolving the B-cells, two runs are unlikely to produce identical results. Therefore, the experiment described above is performed 10 times, that is 10×10-fold CV, and the results obtained are again averaged.
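The repeated cross-validation procedure can be sketched as follows. This is a minimal outline under stated assumptions, not the SAIS implementation: it assumes the 10 folds are built once and reused across the 10 repetitions (so that only the classifier's own randomness varies), and `evaluate` is a hypothetical callback that trains on the training indices, tests on the testing indices, and returns one performance score.

```python
import random
from statistics import mean
from typing import Callable, List


def k_fold_indices(n: int, k: int, rng: random.Random) -> List[List[int]]:
    """Shuffle record indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_cv(
    n_records: int,
    evaluate: Callable[[List[int], List[int], int], float],
    k: int = 10,
    repeats: int = 10,
    seed: int = 0,
) -> float:
    """Average the k-fold score over several repetitions (10x10-fold by default)."""
    folds = k_fold_indices(n_records, k, random.Random(seed))
    run_means = []
    for run in range(repeats):
        scores = []
        for i, test_idx in enumerate(folds):
            # The other k-1 folds together form the training set.
            train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
            # 'run' lets a stochastic classifier use a different seed per repetition.
            scores.append(evaluate(train_idx, test_idx, run))
        run_means.append(mean(scores))  # one k-fold CV result
    return mean(run_means)              # average over all repetitions


def dummy_evaluate(train_idx: List[int], test_idx: List[int], run: int) -> float:
    """Placeholder for a real classifier: returns the testing fraction."""
    return len(test_idx) / (len(train_idx) + len(test_idx))


print(repeated_cv(1000, dummy_evaluate))
```

For a deterministic method, calling the same function with `repeats=1` corresponds to a single 1×10-fold CV.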

The same training and testing sets are used by both discriminant analysis and logistic regression, and the experiments are carried out using the Statistical Package for Social Sciences¹ (SPSS) software. However, unlike in the case of SAIS, only 1×10-fold CV is performed on discriminant analysis and logistic regression. This is because these two methods are deterministic: given the same training and testing datasets, they will always produce the same results.

Because feature selection is performed as part of data preprocessing, the final results are likely to differ from those obtained without any data preprocessing. The same training and testing sets were therefore also applied to AIRS (Watkins et al. 2004) and to some of the most common algorithms found in the Waikato Environment for Knowledge Analysis² (WEKA) software. WEKA contains several standard machine learning techniques and has been widely used by researchers and industrial scientists. The objectives of the WEKA project are to (Witten & Frank 2005):

• make machine learning techniques generally available;

• apply machine learning techniques to practical problems that matter to New Zealand industry;

• develop new machine learning algorithms and give them to the world;

• contribute to a theoretical framework for the field.

The WEKA algorithms used in this chapter are described as follows:

J48: C4.5 decision tree learner (implements C4.5 revision 8)

MultilayerPerceptron (MLP): Backpropagation neural network

IBk: k-nearest-neighbour classifier

LWL: General algorithm for locally weighted learning

¹ http://www.spss.com

5.2.3 Performance Measure

The ROC curves and their associated AUC and Gini coefficient are the most appropriate measures of credit scoring model performance (refer to Chapter 2). However, SAIS has so far been designed to produce only class decisions, i.e. a ‘good’ or a ‘bad’. In other words, SAIS is a discrete classifier and as a result cannot produce ROC curves. This is because when such a discrete classifier is applied to the testing data, it produces a single confusion matrix (refer to Figure 2.12), which in turn corresponds to a single ROC point (Fawcett 2003).

It should, however, be noted that while ROC curves cannot be generated, ROC graphs can be obtained. These are two-dimensional graphs in which the true positive (TP) rate and the false positive (FP) rate are plotted on the y- and x-axes respectively. They demonstrate the trade-offs between benefits (TP) and costs (FP). An ROC graph is plotted for each dataset. The single ROC point of each classifier is obtained by averaging all the TP and FP rates of each testing dataset.
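As an illustration of how a discrete classifier yields one ROC point per testing set, and how those points are averaged, consider the sketch below. The confusion-matrix counts are invented for the example, the function names are hypothetical, and which class is treated as 'positive' is an assumption left to the reader.

```python
from typing import Iterable, Tuple

# A confusion matrix is given as (TP, FN, FP, TN) counts.
Confusion = Tuple[int, int, int, int]


def roc_point(tp: int, fn: int, fp: int, tn: int) -> Tuple[float, float]:
    """Return (FP rate, TP rate), i.e. the (x, y) coordinates on an ROC graph."""
    tp_rate = tp / (tp + fn)  # share of actual positives classified as positive
    fp_rate = fp / (fp + tn)  # share of actual negatives classified as positive
    return fp_rate, tp_rate


def mean_roc_point(matrices: Iterable[Confusion]) -> Tuple[float, float]:
    """Average the per-testing-set ROC points into a single point."""
    points = [roc_point(*m) for m in matrices]
    n = len(points)
    return sum(p[0] for p in points) / n, sum(p[1] for p in points) / n


# Invented counts for two testing sets, for illustration only.
example_matrices = [(60, 10, 12, 18), (63, 7, 9, 21)]
x, y = mean_roc_point(example_matrices)
print(f"Averaged ROC point: FP rate = {x:.3f}, TP rate = {y:.3f}")
```

A classifier whose averaged point lies closer to the top-left corner of the graph (low FP rate, high TP rate) offers the better trade-off between benefits and costs.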

Four other performance measures, described in Section 2.6, are also used. These performance measures and the justifications for their use are as follows:

1. Percent correctly classified (PCC)

The ‘percent correctly classified’ measure is used because other researchers, who have worked with the three datasets used in this chapter, have also used it as their main measure of performance. In order to ensure comparability between our experiments and theirs, this performance measure has to be used.

2. gmean

The gmean is chosen since it has been used as a performance measure in a number of research studies involving unbalanced datasets, as is the case for the German and Thomas datasets which do not have an approximately equal number of ‘good’ and ‘bad’ classes.

3. F-measure

The F-measure weighs the effectiveness and accuracy of the algorithm equally.

4. Kappa statistic

The Kappa statistic is selected since it is a common measure used in WEKA and four WEKA algorithms are used and compared against SAIS.
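The four measures listed above can all be derived from a single confusion matrix. The sketch below uses the commonly cited definitions (gmean as the geometric mean of the TP rate and the true negative rate, F-measure as the harmonic mean of precision and recall, and Cohen's Kappa); it is assumed, not confirmed here, that these match the definitions in Section 2.6. The counts in the example are invented.

```python
from math import sqrt
from typing import Dict


def performance_measures(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Compute PCC, gmean, F-measure and Kappa from confusion-matrix counts.

    Assumes both classes are present and at least one positive prediction is made,
    so that none of the denominators is zero.
    """
    n = tp + fn + fp + tn
    pcc = (tp + tn) / n                      # proportion correctly classified
    sensitivity = tp / (tp + fn)             # TP rate (recall)
    specificity = tn / (tn + fp)             # true negative rate
    gmean = sqrt(sensitivity * specificity)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    kappa = (pcc - expected) / (1 - expected)
    return {"PCC": pcc, "gmean": gmean, "F-measure": f_measure, "Kappa": kappa}


# Invented counts, for illustration only.
for name, value in performance_measures(tp=60, fn=10, fp=12, tn=18).items():
    print(f"{name}: {value:.3f}")
```

On an unbalanced testing set the PCC can stay high even when the minority class is poorly recognised, whereas the gmean and Kappa drop, which is one reason for reporting them side by side.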