2.12.1 Features and Classes

A support vector machine (SVM), implemented in the LIBSVM toolbox for MATLAB (http://www.csie.ntu.edu.tw/~cjlin/libsvm/; Chang and Lin, 2011), was used to assess decodability by means of pattern classification across the four sequential positions. The SVM was applied to the trials of each participant individually. Each trial consisted of spectral source power values, which were the result of the beamformer analysis.

The beamformer projected the sensor time courses into source space, averaged over the time window and frequency band of interest. The source space of each participant consisted of 6783 voxels covering the brain volume, so each trial contained 6783 source power values, one per voxel. For the pattern classification, the trials were reduced to the peak voxels, each representing a brain location revealed by the permutation ANOVA. For the ABA, six peak voxels were defined and used as features for the pattern classification.
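The reduction from the full source space to the peak voxels can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB code used in the study; the voxel indices and the helper name `extract_features` are hypothetical and chosen only for the example.

```python
N_VOXELS = 6783  # size of the source space, as stated in the text

# Hypothetical indices of the six peak voxels (illustration only).
peak_voxels = [120, 845, 2310, 3999, 5120, 6700]


def extract_features(trial, peak_indices):
    """Keep only the source power values at the peak voxels."""
    return [trial[i] for i in peak_indices]


# A placeholder trial: one source power value per voxel.
trial = [float(i) for i in range(N_VOXELS)]
features = extract_features(trial, peak_voxels)
print(len(features))
```

Each trial is thereby reduced from 6783 values to a six-dimensional feature vector.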

Before the trials were assigned to the four classes, the source power values were rescaled to the range from zero to one by dividing each value by the maximum value of the data set. This is sufficient because power values are never negative. The trials were then separated into four classes corresponding to the four sequential positions.
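A minimal sketch of this scaling step, written in Python for illustration, is given below. The function name `scale_by_max` and the example values are assumptions and are not taken from the original analysis code.

```python
def scale_by_max(trials):
    """Divide every value by the global maximum of the data set.

    Assumes non-negative power values, so the result lies in [0, 1].
    """
    global_max = max(value for trial in trials for value in trial)
    if global_max <= 0:
        raise ValueError("expected at least one positive power value")
    return [[value / global_max for value in trial] for trial in trials]


# Two placeholder trials with two features each.
trials = [[2.0, 4.0], [1.0, 8.0]]
scaled = scale_by_max(trials)
print(scaled)  # [[0.25, 0.5], [0.125, 1.0]]
```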

2.12.2 SVM Parameter Settings

The data set was divided into five equal subsets for a fivefold cross-validation. In each fold, four subsets were pooled to train the classifier (80% of the data), and the remaining subset was used to test the classifier (20%, the hold-out data). This procedure was repeated five times, such that each subset was tested once with a classifier trained on the remaining four. For each fold, an inner cross-validation was applied to find the best parameters. The parameters were searched with the libsvm function that searches for the best kernel, its regularization parameter C (cost parameter), and the decision boundary parameter gamma. This approach is described in the tutorial offered by Kittipat Kampa (https://sites.google.com/site/kittipat/libsvm_matlab). The code for the inner cross-validation was adopted from his libsvm_demo_script 13. The concept of outer and inner cross-validation was proposed by Nowotny and colleagues (Nowotny, 2014) for best feature selection. Here, the inner cross-validation was used to find the best parameters instead. The best feature selection was computed by a sequential classification approach that is described in the next section.
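The structure of the outer and inner cross-validation can be sketched as follows. This is an illustrative Python outline, not the MATLAB/LIBSVM code used in the study. The callable `fit_and_score` is a hypothetical stand-in for training an SVM on the training indices and returning its accuracy on the evaluation indices, and the grid of (C, gamma) pairs is an assumption.

```python
import random


def k_fold_splits(indices, n_folds=5, seed=0):
    """Shuffle the indices and return one (train, test) pair per fold."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    splits = []
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((train, test))
    return splits


def nested_cv(n_trials, param_grid, fit_and_score):
    """Outer loop estimates accuracy; inner loop selects the parameters."""
    outer_accuracies = []
    for train, test in k_fold_splits(range(n_trials)):

        def inner_score(params):
            # Inner cross-validation uses the outer training data only.
            inner = k_fold_splits(train, seed=1)
            scores = [fit_and_score(params, tr, va) for tr, va in inner]
            return sum(scores) / len(scores)

        best_params = max(param_grid, key=inner_score)
        outer_accuracies.append(fit_and_score(best_params, train, test))
    return sum(outer_accuracies) / len(outer_accuracies)


# Toy stand-in for the classifier: pretends (C=1, gamma=0.1) works best.
def toy_fit_and_score(params, train_idx, eval_idx):
    return 1.0 if params == (1, 0.1) else 0.5


param_grid = [(c, g) for c in (0.1, 1, 10) for g in (0.01, 0.1)]
splits = k_fold_splits(range(100))
mean_accuracy = nested_cv(100, param_grid, toy_fit_and_score)
print(mean_accuracy)
```

With 100 trials, every outer fold trains on 80 trials and tests on the remaining 20, and the hold-out trials are never used for the parameter search.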

Decodability is expressed as a mean accuracy. The mean accuracy was computed across the five independently computed accuracies of the outer cross-validation. To estimate the significance of the mean accuracy, a randomization approach was applied. The training and test data sets of the inner cross-validation were permuted, and the permuted data were then processed in the same way as the non-permuted data: the best classification parameters were searched and then tested on the permuted hold-out test data set. This was repeated 100 times for each inner cross-validation, giving 500 accuracy values in total. From these 500 values, means were computed over five accuracy values at a time, one from each inner cross-validation, which reduced the distribution to 100 mean accuracy values. The distribution of these accuracies from randomized data served as the distribution under the null hypothesis. The procedure then continued with a standard permutation method (see Chapter 2.7.2 Permutation Test) with an alpha level of 5%.
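The reduction from 500 permuted accuracies to a null distribution of 100 mean accuracies can be sketched as follows. The Python code is illustrative only. The accuracy values are random placeholders, and the one-sided p-value with a +1 correction is an assumption, since the exact formula of the permutation test is given in Chapter 2.7.2 and not here.

```python
import random


def null_mean_accuracies(permuted_acc):
    """permuted_acc[fold][perm] -> one mean accuracy per permutation."""
    n_folds = len(permuted_acc)
    n_perm = len(permuted_acc[0])
    return [
        sum(permuted_acc[f][p] for f in range(n_folds)) / n_folds
        for p in range(n_perm)
    ]


def permutation_p_value(observed, null_values):
    """One-sided p-value with a +1 correction (assumed variant)."""
    exceed = sum(1 for value in null_values if value >= observed)
    return (exceed + 1) / (len(null_values) + 1)


rng = random.Random(0)
# Placeholder accuracies: 5 folds x 100 permutations = 500 values,
# drawn around 0.25 as a stand-in for four-class chance performance.
permuted_acc = [[rng.uniform(0.15, 0.35) for _ in range(100)] for _ in range(5)]

null = null_mean_accuracies(permuted_acc)
p_value = permutation_p_value(0.40, null)  # 0.40 is a made-up observed accuracy
print(len(null), p_value < 0.05)
```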

2.12.3 Sequential Classification Procedure

The best features were identified by a sequential classification approach. The aim of this approach was to identify the sequence of voxels that was most informative for the pattern classification. The procedure is described in the following.

First, the pattern classification was applied to each voxel and each participant separately. The number of voxels therefore determined the number of pattern classifications applied per participant. In other words, each pattern classification used a single voxel as its only feature. Then, the median accuracy across participants was computed for each feature, and the best feature was the one with the highest median accuracy. Next, the pattern classification was applied to two-dimensional feature vectors, each consisting of the best feature from the one-dimensional classification and one of the remaining features. Continuing in this way, the last run considered all features in a single pattern classification. In each run, the best feature was again determined by the median accuracy across participants. Each subsequent pattern classification, applied to an n-dimensional feature vector, used feature combinations that always included the best features of the previous runs. This continued until all voxels were included as features in the pattern classification.
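The sequential procedure corresponds to a greedy forward selection, which can be sketched as follows. The Python code is an illustration under stated assumptions: the callable `accuracy(participant, subset)` is a hypothetical stand-in for running the cross-validated SVM on the given feature subset, and the toy scoring function below exists only to make the example runnable. Features are indexed from zero.

```python
from statistics import median


def sequential_selection(n_features, n_participants, accuracy):
    """Add one feature per run, chosen by median accuracy across participants."""
    selected = []
    history = []
    remaining = list(range(n_features))
    while remaining:

        def median_accuracy(candidate):
            subset = selected + [candidate]
            return median(accuracy(p, subset) for p in range(n_participants))

        best = max(remaining, key=median_accuracy)
        history.append((selected + [best], median_accuracy(best)))
        selected.append(best)
        remaining.remove(best)
    return history


# Toy stand-in: each feature adds a fixed amount to a 0.25 baseline.
gains = [0.02, 0.04, 0.10, 0.01, 0.06, 0.03]


def toy_accuracy(participant, subset):
    return 0.25 + sum(gains[f] for f in subset)


history = sequential_selection(6, 41, toy_accuracy)
for subset, score in history:
    print(subset, round(score, 2))
```

The returned history lists the chosen feature set for every dimensionality, so the dimensionality with the highest median accuracy can be read off afterwards.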

For a better understanding, I will give the following real-life example. The beamformer revealed six peak voxels whose power changed significantly across the sequential positions. With the pattern classification approach introduced above, one can answer the question of whether some of the peak voxels were more relevant than others or whether all peak voxels were equally relevant. To this end, each voxel was used as the feature of a single pattern classification. Thus, six pattern classifications, each with a unique one-dimensional feature, were applied for each participant. This yields as many accuracy values as participants times features: 41 participants times 6 features equals 246 accuracy values. That means that for each feature there were 41 accuracy values, one per participant. For each feature, the median accuracy was computed across participants. This revealed that the third feature reached the highest median accuracy. Next, this third feature was combined with each of the remaining features to build five two-dimensional features, resulting in these feature pairings: 3-1, 3-2, 3-4, 3-5 and 3-6. Five pattern classifications were then applied for each participant, each with a unique two-dimensional feature vector, and the median accuracy across participants was computed again. The best two-dimensional feature, determined by the highest median accuracy, was then extended in the same way to four feature triplets, that is, to a three-dimensional feature classification. This continued until the last pattern classification was applied to a six-dimensional feature vector. In the end, this procedure determined a sequence of the best voxels and identified which feature dimensionality yielded the highest median accuracy.
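The counts in this example can be checked with a few lines of Python. The variable names are chosen for illustration, and features are numbered from one here to match the text.

```python
n_participants = 41
n_features = 6

# First run: one classification per feature and participant.
first_run_accuracies = n_participants * n_features

# Second run: the best feature (the third) paired with each remaining one.
best_feature = 3
pairs = [(best_feature, other)
         for other in range(1, n_features + 1) if other != best_feature]

# Number of candidate feature sets per run, from 1-D up to 6-D.
candidates_per_run = [n_features - k for k in range(n_features)]

print(first_run_accuracies)     # 246
print(pairs)                    # [(3, 1), (3, 2), (3, 4), (3, 5), (3, 6)]
print(candidates_per_run)       # [6, 5, 4, 3, 2, 1]
print(sum(candidates_per_run))  # 21 classifications per participant
```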
