
prediction model for hepatotoxicity

As a first overview of the data, a hierarchical clustering was performed with all time points tested (2 h, 6 h, 1 d, 3 d, 5 d, and 9 d after dosing). As shown in Figure 62, no clear separation was achieved at any of these time points. On days one and five, Rosi and Clo separated from the other experiments, but on day one the livertoxic compound Tet also grouped with them. All other experiments were organized in two large groups, but clearly not on the basis of toxicity. At later time points, cells treated with all three doses of Dex separated from the other experiments and formed their own cluster. These findings were confirmed by other exploratory methods, such as PCA (Figure 63). The results reinforce the difficulty of establishing a model based on global gene expression. Toxic compounds also have specific mechanisms of action with specific gene expression changes, and these differences can be hidden by the large number of unaffected genes. To establish a model capable of discriminating between the two defined groups, other techniques are needed.

Figure 62: Hierarchical clustering of global gene expression data from compound-treated primary rat hepatocytes. Shown are the results from cells dosed for 1 d, 5 d and 9 d with the previously described model compounds. No obvious separation of toxic and non-toxic compounds was achieved at any time point.

The normalized data were grouped by compound, time point, and dose. Finally, two groups, toxic and non-toxic, were defined according to the previously defined toxicity (see Figure 58). First, the possibility of creating a functional classification model was tested. To this end, training sets were created for all time points and for the high and low doses separately as well as for both together. The classification was conducted with four different classification algorithms to account for any potential

[Figure legend: Non-livertoxic, Livertoxic, FC Liver, FaO-cells]

analysis and the K-nearest neighbour analysis, all of which are supervised learning methods, were used. They were applied to the same dataset that was used for the training, but in this case, the leave-one-out cross-validation method was applied. This means that the training set was applied 1,000 times to the whole dataset, but in every run, 15% of the dataset was removed and the remaining data were classified. The accuracy of this classification was checked afterwards and misclassification rates were calculated for each of these algorithms. This rate is the percentage of samples that were allocated to the wrong group. At the same time, genes were ranked according to their importance for this discrimination and the number of genes needed for best results was calculated. Results are shown in Figure 64.
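The resampling scheme described above (1,000 runs, 15% of the samples withheld in each run) can be illustrated with a minimal sketch. It is not the software used in this work: the function names, the toy data, the choice of k = 3, and the reading that the withheld samples are the ones being classified are all assumptions made for illustration. Strictly speaking, withholding a fixed fraction per run is a repeated random hold-out rather than a leave-one-out scheme.

```python
import math
import random

def knn_predict(train, query, k=3):
    """Majority label among the k training samples closest to `query` (Euclidean)."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = sum(label for _, label in nearest)  # labels: 0 = non-toxic, 1 = toxic
    return 1 if votes * 2 > len(nearest) else 0

def misclassification_rate(samples, runs=1000, holdout=0.15, k=3, seed=0):
    """Repeated random hold-out: each run withholds a fraction of the samples,
    uses the rest as training data, and counts wrong calls on the withheld part."""
    rng = random.Random(seed)
    n_out = max(1, round(holdout * len(samples)))
    wrong = total = 0
    for _ in range(runs):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        held_out, train = shuffled[:n_out], shuffled[n_out:]
        for features, label in held_out:
            wrong += knn_predict(train, features, k) != label
            total += 1
    return wrong / total

# Hypothetical toy data: two features per sample, two well-separated groups.
toy_rng = random.Random(1)
toy = [([toy_rng.gauss(0, 1), toy_rng.gauss(0, 1)], 0) for _ in range(20)]
toy += [([toy_rng.gauss(6, 1), toy_rng.gauss(6, 1)], 1) for _ in range(20)]
rate = misclassification_rate(toy, runs=200)
print(f"misclassification rate: {rate:.3f}")
```

The returned value is the fraction of withheld samples that were assigned to the wrong group, which corresponds to the misclassification rate reported per algorithm.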

[Figure 64 plots: panels for low dose, high dose, and low and high dose combined, at 1 d and 9 d after dosing; axis ticks run from 0 to 50 and from 1 to 10^5.]

Figure 64: Construction of the classification models and gene rankings. Four different algorithms were applied to discriminate between two previously defined groups (toxic and non-toxic). For each algorithm, the misclassification rate and the number of genes needed for best results were calculated.

In most cases, the K-nearest neighbour algorithm gave the best predictions. Generally, the misclassification rates were lower for the samples treated for 9 d than for samples treated for shorter times. When only the low dose samples were analysed, a misclassification rate of approximately 32% was found one day after dosing. This result improved only slightly, and not significantly, at later time points. Taking only the high dose groups into the model resulted in a misclassification rate of 19% after one day of dosing and 11% after 9 d. The best results were obtained with samples dosed for 9 d in culture when both doses were taken together into the model. In this case, the misclassification rate was reduced to 7.5%, and only 724 genes were needed to reach this rate.
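How a ranked gene list leads to a "number of genes needed for best results" can be sketched as follows. The ranking score (absolute difference of group means divided by the summed standard deviations), the 1-nearest-neighbour rule, the candidate gene counts, and the toy data are assumptions chosen for illustration; the ranking criterion actually used is not specified here. Genes are re-ranked inside every fold so that the held-out sample never influences the ranking.

```python
import math
import random
import statistics

def rank_genes(X, y):
    """Gene indices ordered by |mean difference| / (sd_0 + sd_1), best first."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 0]
        b = [row[g] for row, lab in zip(X, y) if lab == 1]
        spread = statistics.stdev(a) + statistics.stdev(b) or 1e-12
        scores.append((abs(statistics.mean(a) - statistics.mean(b)) / spread, g))
    return [g for _, g in sorted(scores, reverse=True)]

def loo_error(X, y, n_genes):
    """Leave-one-out error of a 1-nearest-neighbour rule on the top n_genes."""
    wrong = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        top = rank_genes(X_tr, y_tr)[:n_genes]  # ranking uses training samples only
        query = [X[i][g] for g in top]
        nearest = min(range(len(X_tr)),
                      key=lambda j: math.dist([X_tr[j][g] for g in top], query))
        wrong += y_tr[nearest] != y[i]
    return wrong / len(X)

# Hypothetical toy data: 12 samples, 30 genes, only genes 0-2 differ between groups.
rng = random.Random(0)
y = [0] * 6 + [1] * 6
X = [[rng.gauss(6.0 * lab if g < 3 else 0.0, 1.0) for g in range(30)] for lab in y]

errors = {n: loo_error(X, y, n) for n in (1, 3, 10, 30)}
best_n = min(errors, key=errors.get)  # smallest candidate with the lowest error
print(errors, best_n)
```

Scanning the error over increasing gene counts in this way gives curves of the kind summarized in Figure 64, with the minimum marking the number of genes needed.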

Figure 65 shows examples of the results of the cross-validations, 1 d and 9 d after dosing. The reduction in misclassified samples at the later time point is clearly visible. Whereas in the early samples the algorithm produced both false positive and false negative predictions, at later time points there were no false positive predictions. Only three samples were misclassified, all of which were low dose samples. One biological replicate each of Tro, Tet and ANIT was wrongly predicted to be non-toxic. However, the whole group was still classified as toxic. All three groups had a classifier output of below 0.5, which means that they were

[Figure panel titles: 1 d after dosing, 9 d after dosing]

to increase robustness of the model by tolerating single experiments to be misclassified but retaining the overall correct result.

The main objective of this study was to determine whether it would be possible to distinguish between hepatotoxic and non-hepatotoxic compounds with the help of an in vitro system and global gene expression analysis. The clustering analysis of the global gene expression data alone did not allow such discrimination. By using the support vector machine algorithm together with a cross-validation, it was possible to obtain a subset of genes that allowed the discrimination, with a misclassification rate of only 7.5%. These results show the advantage of longer term dosing for the establishment of gene expression changes, which contribute to the discrimination of the two groups. Short term experiments only show the acute effects of a compound, such as inflammatory or immune responses. This is not sufficient in in vitro experiments, because certain cell types are lacking and specific mechanisms may therefore be missing. Dosing for longer times has the advantage of increasing compound-specific gene expression changes and therefore enables the discrimination algorithms to find basic differences between toxic and non-toxic compounds in the dataset.

At the same time, the combination of two different dosing schemes also contributed to a better model. This could simply be due to the fact that more data were available to the algorithm, making the comparison more valid. Additionally, by combining high and low doses, further information hidden in the global gene expression dataset may become accessible to the algorithm. It is noticeable that the low dose samples alone were poorly distinguishable by the algorithms but improved the result for the whole dataset. This suggests that the effect is not just additive but that the low dose samples introduce additional information into the calculation. For future applications, these results imply that large datasets and, if possible, two (or more) doses are required for this kind of calculation.

As detailed above, the aim of such prediction models is the classification of new data from novel compounds. This would not be possible with simple clustering methods, but by ranking genes according to their contribution to the discrimination of the predefined groups and generating classifiers, this goal was achieved.

For the verification of this prediction model, the potential hepatotoxicity of EMD X was predicted. Dosing and data acquisition for this compound were conducted exactly as described for the model compounds. Additionally, the same classifier was applied to the whole dataset, including the data from AAP and from the model compounds used for the calculation of this model, as a retrospective verification of the previously analyzed data.

With altogether 120 experiments, the calculated misclassification rate of 7.5% would allow nine experiments to be wrongly classified (partly shown in Figure 66). Overall, only eight experiments were misclassified. For most compounds, all experiments were classified correctly, independent of the dose. For Tet, two out of five low dose experiments and one of the high dose experiments were misclassified. Even so, because of the five biological replicates, the majority of these experiments were still correctly classified, resulting in an overall correct classification for Tet. The new compound, EMD X, was classified as hepatotoxic. All experiments were clearly allocated to this group, resulting in a robust classification. This result agreed with previously obtained results from other in-house studies (data not shown).
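The arithmetic behind the nine tolerated misclassifications and the replicate-based group call can be written out as a short sketch. The function names and the majority-vote rule are assumptions for illustration; the text only states that the majority of the five replicates determined the overall classification. The high dose Tet group is assumed to have five replicates as well.

```python
from collections import Counter

def allowed_misclassifications(n_experiments, rate):
    """Number of experiments a given misclassification rate corresponds to."""
    return round(n_experiments * rate)

def group_call(replicate_calls):
    """Majority vote over replicate-level calls; a tie is reported as 'undecided'."""
    counts = Counter(replicate_calls).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "undecided"
    return counts[0][0]

# 120 experiments at a misclassification rate of 7.5%
n_allowed = allowed_misclassifications(120, 0.075)

# Tet: two of five low dose replicates and one high dose replicate were misclassified
tet_low = ["toxic"] * 3 + ["non-toxic"] * 2
tet_high = ["toxic"] * 4 + ["non-toxic"] * 1

print(n_allowed)             # 9
print(group_call(tet_low))   # toxic
print(group_call(tet_high))  # toxic
```

A vote over replicates tolerates single misclassified experiments as long as most replicates of a group are assigned correctly.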

Another interesting result was obtained by the classification of AAP. Even though no toxicity was detected in the cytotoxicity tests (LDH and ATP test), the compound was still classified as hepatotoxic in both high and low dose treatment groups based on the global gene expression. A closer look at the single experiments revealed that at both doses, one experiment was classified as non-toxic and two as toxic. The classifier output in most cases was equivocal, suggesting a borderline classification. This means that the classification of this compound is less robust than that of EMD X. Nevertheless, the classification showed an effect which could not be detected by cytotoxicity tests but is well known in vivo.

Figure 66: Result of the classification of data gained from primary rat hepatocytes treated for 9 d with Tro,

3.4.5 Analysis of the top ranked genes of the prediction