CAPÍTULO 3. MACHINE LEARNING METHODS FOR MEDICAL IMAGE CLASSIFICATION
3.3. Training The Models
Once the data is preprocessed, we will start training the models. To do this we will train the five models (the four neural networks and the SVM) with different parameters.
CNN Methodology
For the CNN models we will use the pre-built models available in the Keras library for ResNet50, InceptionV3, VGG16 and VGG19, but, as explained in section (2), the images available for training the model, even after data augmentation, are not enough to train a good CNN model from scratch, so we will use the fine-tuning technique.
For this reason we will load the 'imagenet' weights, which were trained during the 'ImageNet Large Scale Visual Recognition Challenge' [19], where the models were trained with nearly 1.2 million images, plus another 50,000 images for validation and 100,000 for testing. During this competition the models were trained for object detection and object classification; we are interested only in the latter, in which they achieve a mean accuracy higher than 90%.
Once the weights are loaded, we will remove the last layer of each CNN, which is the classification layer, and add a new classification layer that classifies the previous layer's output into 2 different classes. The first parameter to be modified in the search for the best possible model will be the activation function of the classification layer. We will train the models with the following activation functions: linear, sigmoid and softmax.
Then, before training starts, the model has to be compiled with the binary cross-entropy function, which computes the loss value for the backpropagation algorithm as −(y log(p) + (1 − y) log(1 − p)). To compile the model we also have to choose the optimizer (choosing between the SGD, Adam and Nadam optimizers) and the learning rate.
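The head-replacement and compilation steps above can be sketched with the Keras API. This is a minimal sketch, not the exact thesis code: the activation, optimizer and learning rate shown here are just one of the configurations tried, and the pretrained 'imagenet' weights are replaced by `weights=None` only to keep the sketch light.

```python
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Base CNN without its original 1000-class classification layer.
# The thesis loads weights="imagenet"; None here avoids the download.
base = ResNet50(weights=None, include_top=False, pooling="avg",
                input_shape=(224, 224, 3))

# New 2-class classification head; "softmax" is one of the three
# activations tried (the others being "sigmoid" and "linear").
head = Dense(2, activation="softmax")(base.output)
model = Model(inputs=base.input, outputs=head)

# Compile with binary cross-entropy, an optimizer and a learning rate.
model.compile(loss="binary_crossentropy",
              optimizer=Adam(learning_rate=1e-4),
              metrics=["accuracy"])
```

The same pattern applies to InceptionV3, VGG16 and VGG19, swapping the application class.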
26 Trabajo de Fin de Grado
Finally, we need to prepare the preprocessed data for training the model. To do this we will create, as explained before, an ImageDataGenerator data structure to scale the data and select the random cropping zoom range. In addition, we will set the validation split to 10% of the training samples.

Then we will start training the model. Because of how long the algorithm takes in every training iteration (epoch), we will save the model every 10 epochs with the Keras library function, which saves the weights and the compilation state. Thus, to train a 30-epoch model we will load the 20-epoch model and train it for 10 more epochs. This allows us to stop training a model and resume it later, or to retrain a model in much less time than it would take from the beginning.
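A sketch of this data preparation and checkpointed training loop follows; the zoom range value and the directory layout are assumptions, and the `fit`/`save` loop is shown commented out because it needs the actual image folders.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Generator that rescales pixel values, applies the random zoom used
# for cropping-style augmentation (0.2 is an assumed value), and
# reserves 10% of the training samples for validation.
datagen = ImageDataGenerator(rescale=1.0 / 255,
                             zoom_range=0.2,
                             validation_split=0.1)

# "train_dir" is a hypothetical directory with one subfolder per class:
# train_gen = datagen.flow_from_directory(train_dir, target_size=(224, 224),
#                                         subset="training")
# val_gen   = datagen.flow_from_directory(train_dir, target_size=(224, 224),
#                                         subset="validation")
#
# Training in 10-epoch blocks; model.save() stores both the weights and
# the compilation state, so a 30-epoch model is obtained by loading the
# 20-epoch checkpoint and training for 10 more epochs:
# for block in range(3):
#     model.fit(train_gen, validation_data=val_gen, epochs=10)
#     model.save(f"model_{(block + 1) * 10}_epochs.h5")
```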
Our first try will be to freeze all layers except the last one (the classifier). Frozen layers will not be trained any more, that is, their weights will not be modified. This way we use the CNN's pretrained weights as a feature extractor, allowing us to get good results by training the models for only a few epochs (between 1 and 10).
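Freezing everything except the classifier can be sketched as below; a tiny stand-in network is used here instead of the pretrained CNN so the sketch stays self-contained.

```python
from tensorflow.keras.layers import Dense, Flatten, Input
from tensorflow.keras.models import Sequential

# Small stand-in model; in the thesis this is the pretrained CNN with
# its new 2-class classification head.
model = Sequential([Input(shape=(8, 8, 3)),
                    Flatten(),
                    Dense(16, activation="relu"),
                    Dense(2, activation="softmax")])  # classifier head

# Freeze every layer except the classification layer: frozen layers
# keep their pretrained weights, acting as a fixed feature extractor.
for layer in model.layers[:-1]:
    layer.trainable = False

# Recompile so the new trainable state takes effect.
model.compile(loss="binary_crossentropy", optimizer="adam")
```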
After this, we will build new models without freezing any layer, also modifying the parameters explained before (the activation function of the classification layer, the optimizer and its learning rate). These models will be trained for 20 to 50 epochs.
Finally, we will test every model with the testing samples and choose the best parameter configuration. All of this is summarized in figure (3.4).
Figure 3.4: Model Configuration Schema
SVM Methodology
To prepare the pre-processed images to train the SVM models, we will standardize every image in the augmented dataset and then flatten each one from a 224x224 matrix to a 1D array of 1x50176. Then we will build different models, changing the kernel. The kernels used will be, as stated in section (2), the linear, sigmoid, RBF, and polynomial kernels, the latter with degrees from 2 to 8. Then we will test the models to find the kernel that fits this dataset best.
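The flattening and kernel sweep can be sketched with scikit-learn; random arrays stand in for the real images, and the per-feature `StandardScaler` is one plausible reading of the standardization step.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
images = rng.normal(size=(40, 224, 224))      # toy stand-in images
labels = rng.integers(0, 2, size=40)          # 0 = Normal, 1 = Abnormal

# Flatten each 224x224 matrix into a 1x50176 vector, then standardize.
X = images.reshape(len(images), -1)
X = StandardScaler().fit_transform(X)

# One SVM per kernel: linear, sigmoid, RBF, and polynomial degrees 2-8.
models = {}
for kernel in ["linear", "sigmoid", "rbf"]:
    models[kernel] = SVC(kernel=kernel).fit(X, labels)
for degree in range(2, 9):
    models[f"poly-{degree}"] = SVC(kernel="poly", degree=degree).fit(X, labels)
```

Each fitted model would then be evaluated on the testing samples to pick the best kernel.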
Finally, we will use the SVM models to classify the features extracted by the CNNs.
To do this we will remove, as explained in CNN methodology, the last layer of each Neural
Network (the classification layer). This gives an array of dimension 1x4096 in the VGG16 and VGG19 models, that is, the CNN extracts 4096 features from each image, which will be used to classify the image with the SVM. On the other hand, the ResNet50 and InceptionV3 models each return an array of 2048 features. With these features (which are already flattened), we will train other SVM models.
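Using a CNN as a feature extractor for an SVM can be sketched as follows; a small functional stand-in replaces the real pretrained network, with 2048 features mimicking the ResNet50/InceptionV3 case.

```python
import numpy as np
from sklearn.svm import SVC
from tensorflow.keras.layers import Dense, Flatten, Input
from tensorflow.keras.models import Model

# Small stand-in CNN; in the thesis this is VGG16/VGG19 (4096 features)
# or ResNet50/InceptionV3 (2048 features).
inp = Input(shape=(8, 8, 3))
x = Flatten()(inp)
feats = Dense(2048, activation="relu")(x)       # penultimate layer
out = Dense(2, activation="softmax")(feats)     # classification layer
cnn = Model(inp, out)

# Remove the classification layer: the model's output is now the
# flattened feature vector the SVM will be trained on.
extractor = Model(inp, feats)

rng = np.random.default_rng(0)
images = rng.normal(size=(16, 8, 8, 3)).astype("float32")
labels = rng.integers(0, 2, size=16)

features = extractor.predict(images, verbose=0)  # shape (16, 2048)
svm = SVC(kernel="rbf").fit(features, labels)
```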
Ensembles
Once we have trained and tested the models, we will analyze their results. To do this we will set a criterion to measure each model's performance, because focusing only on the overall accuracy could lead us to choose a worse model than if we considered other criteria. This criterion is explained in section (4).
Then we will choose the 3 algorithms that achieve the best results according to the chosen criterion. We will not use the same algorithm with different parameters, i.e., we will not combine copies of the VGG19 model with different activation functions and/or optimizers. This is because we want to improve on our stand-alone models under the assumption that an image is more likely to be classified correctly if it has been classified by different models. However, if (at least) two of the models that make up an ensemble classify every image the same way (that is, if they extract the same features), then the ensemble will classify each image the same way as those stand-alone models.
We will build two different ensembles, applying to their results the same performance criterion as to the stand-alone models. Both ensembles will receive an array with the predictions made by the stand-alone models for the testing data, obtained with the predict(testing data) function for both the CNN and the SVM models. This array will contain, for each input image, a list of two numbers between zero and one, corresponding to the probability of the image belonging to class 0 (Normal) or class 1 (Abnormal) according to that stand-alone model.
Then, the ensembles will be:
Ensemble 1: The first ensemble will not take into account the probability of each image belonging to each class. Instead, the ensemble will classify each image as it has been classified by the majority of the stand-alone models, that is, the image will be classified as zero if it has been classified as zero by 2 (or more) of the stand-alone models that form the ensemble, and likewise for classifying an image as one.
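The majority vote can be sketched as a small function over the stacked `predict` outputs; the function name and the toy probabilities are illustrative only.

```python
import numpy as np

def ensemble_majority(predictions):
    """Majority vote over per-model class probabilities.

    predictions: array of shape (n_models, n_images, 2), each entry a
    [P(class 0), P(class 1)] pair returned by a model's predict().
    """
    votes = np.argmax(predictions, axis=2)  # per-model hard labels
    # Class 1 wins when more than half of the models (2 of 3) vote for it.
    return (votes.sum(axis=0) > predictions.shape[0] / 2).astype(int)

preds = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # model 1
    [[0.6, 0.4], [0.7, 0.3]],   # model 2
    [[0.4, 0.6], [0.1, 0.9]],   # model 3
])
# Image 1: votes 0, 0, 1 -> class 0; image 2: votes 1, 0, 1 -> class 1.
print(ensemble_majority(preds))  # [0 1]
```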
Ensemble 2: The second ensemble will not necessarily classify an image as 2 of the stand-alone models do. Instead, it will compute the score of an image belonging to a class as the sum of the probabilities given by the stand-alone models. The result of this sum will be between 0 and 3, so if an image gets a result higher than 1.5 for a class, it will be classified as that class. This is, in fact, thresholding the average probability of belonging to each class.
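Summing the probabilities and thresholding at 1.5 can be sketched as below; the example input is chosen so that this ensemble disagrees with a plain majority vote, which is exactly the situation where the two schemes differ.

```python
import numpy as np

def ensemble_average(predictions):
    """Sum P(class 1) over the models (range 0..3 for 3 models) and
    classify as class 1 when the sum exceeds half the model count,
    i.e. when the average probability exceeds 0.5."""
    summed = predictions.sum(axis=0)  # shape (n_images, 2)
    return (summed[:, 1] > predictions.shape[0] / 2).astype(int)

# Two of three models barely prefer class 1, one strongly prefers class 0:
# a majority vote would say class 1, but the summed probability of
# class 1 is 0.55 + 0.55 + 0.1 = 1.2 < 1.5, so this ensemble says 0.
preds = np.array([
    [[0.45, 0.55]],   # model 1
    [[0.45, 0.55]],   # model 2
    [[0.90, 0.10]],   # model 3
])
print(ensemble_average(preds))  # [0]
```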
Finally, we will analyze all the results, looking for the best trained model, and we will discuss the future work needed to use our model in a real diagnosis service.
4. Results
First, let us define how we are going to measure each model's performance. Then we will analyze each model's results, evaluating them and choosing the three best models for our ensembles.