Two recent works (Gombar and Hall, 2013; Lombardo and Jing, 2016) tested their models on external test sets and provided the complete set of predictions obtained, which allows a direct comparison of those models’ predictive power with that of this work. Gombar and Hall (2013) used the same dataset used here to build the QSAR models, whereas Lombardo and Jing (2016) used a larger Vss dataset (N = 1096). Comparing this work’s predictive performance against that of Gombar and Hall (scenario 1) allows assessing the value of introducing physiological information into modelling, as this is the major difference between the two modelling routines; other, secondary changes are present, such as the removal of problematic compounds, but these are considered minor. On the other hand, a comparison against Lombardo and Jing (2016) (scenario 2) allows determining the value of increasing chemical space through a larger number of observations (quantitative improvement) versus providing enriched input through the addition of physiological information (qualitative improvement).

In scenario 1, the best model in this chapter (8a) showed improved performance in all calculated measures for an external set of 30 compounds taken from Gombar and Hall (2013), as shown in Table 6.7 and Appendix III, Figure A3.4. Considering the smaller training set used in this study (Gombar and Hall: N = 569; this work: N = 398), the better performance of model 8a in external prediction is notable, since, other things being equal, more data should lead to better performance. The superior performance of model 8a may therefore be attributed to the availability of physiological input during training, or to the modelling scheme used in this work, as these are the major differences between the two works.

Table 6.7. Summary of predictive performance from Gombar and Hall (2013) and this work (model 8a), evaluated on an external dataset (N = 30).

                              Models in Gombar and Hall (2013)
Measure   m_8a (this work)    SVM        MLR
MAE       0.205               0.264      0.422
GMFE      1.604               1.835      2.641
MFE       1.869               1.995      5.430
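As a reading aid for the measures reported in Table 6.7, the following minimal Python sketch shows one way they could be computed. It is not taken from this work and rests on stated assumptions: that observed and predicted Vss are available as log10 values, that MAE is the mean absolute error on that log scale, that GMFE is the geometric mean fold error (so that GMFE = 10^MAE, which is consistent with the tabulated values, e.g. 10^0.205 ≈ 1.60), and that MFE is the arithmetic mean of the per-compound fold errors. The function name and the example values are hypothetical.

```python
def vss_error_metrics(log_obs, log_pred):
    """Return (MAE, GMFE, MFE) for paired log10 Vss values.

    Assumed definitions (not confirmed by the source text):
      MAE  = mean of |log10(pred) - log10(obs)|
      GMFE = 10 ** MAE (geometric mean of the fold errors)
      MFE  = arithmetic mean of the fold errors 10 ** |log error|
    """
    if len(log_obs) != len(log_pred) or len(log_obs) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Absolute error of each compound on the log10 scale.
    abs_log_err = [abs(p - o) for o, p in zip(log_obs, log_pred)]
    n = len(abs_log_err)

    mae = sum(abs_log_err) / n
    gmfe = 10 ** mae
    mfe = sum(10 ** e for e in abs_log_err) / n
    return mae, gmfe, mfe


# Hypothetical log10(Vss) values for three compounds (illustration only).
example_obs = [0.0, 0.5, -0.3]
example_pred = [0.1, 0.2, -0.3]
mae, gmfe, mfe = vss_error_metrics(example_obs, example_pred)
print(f"MAE={mae:.3f}  GMFE={gmfe:.3f}  MFE={mfe:.3f}")
```

Under these definitions MFE can never be smaller than GMFE, and a few large mispredictions inflate MFE much more than GMFE, which would be consistent with the MLR column of Table 6.7 (GMFE 2.641 versus MFE 5.430).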

Regarding scenario 2, with the external set of 34 compounds obtained from Lombardo and Jing (2016), model 8a showed performance comparable to that of the other models (Table 6.8), even though it was trained on a considerably smaller training set (N = 398 versus N = 1096). Model 8a also reduced some of the extreme mispredictions made by the other models; these compounds were still mispredicted by model 8a, but to a lesser extent (see Appendix III, Figure A3.5, for the observed versus predicted plots of the models in Table 6.8). Furthermore, 53% of model 8a’s predictions show a smaller error than those of RF_33, the model with the smallest MAE in Lombardo and Jing (2016). This supports the validity of the selected (best) model and may also indicate the value brought by accounting for physiological processes.

To further determine the value of using physiological features in the modelling of Vss in a larger chemical space, the best model from this chapter and the best model from Lombardo et al. were both retrained with Lombardo’s entire dataset of 1096 compounds, and the resulting models were used to predict the external test set of scenario 2. Both models use the random forest algorithm. They differ in the features used (different Volsurf+ molecular descriptors) and in the algorithm parameters, which were kept as per each original study: no minimum node size for this work’s model versus a minimum node size of 10 for Lombardo et al., and descriptor sampling per split set to WEKA’s default for the current model versus 11 for Lombardo et al. To test the effect of physiological descriptors (PDs), both models were tested with and without these descriptors. Despite the different conditions, the current set of parameters and Lombardo et al.’s set of parameters led to the same conclusion: including PDs improves predictive performance across all measures, as summarized in Table 6.9.

Table 6.8. Summary of predictive performance measures from Lombardo and Jing (2016) and this work, evaluated on an external dataset (N = 34).

                              Models in Lombardo and Jing (2016)
Measure   m_8a (this work)    RF_33      PLS_11     Consensus RF_33 and PLS_11
MAE       0.305               0.302      0.363      0.317
GMFE      2.017               2.003      2.308      2.073
MFE       2.276               2.300      2.970      2.510

It should be noted that the difference between the retrained m_8a and the retrained Lombardo’s model can be explained by the fact that the latter was selected in the original publication as the best (standalone) model based on its performance on this same test set. Comparing the two models directly is therefore not fair, as Lombardo’s model is bound to be superior for this particular test set; this is why this study focuses on comparing the presence and absence of PDs within each model.

Table 6.9. Summary of predictive performances from the different variants of the Vss modelling conditions. All performances result from testing the models on a fixed, common dataset.

                                                         Retrained m_8a model      Retrained Lombardo’s model
Measure                               m_8a (this work)   MDs only    MDs & PDs     MDs only    MDs & PDs
MAE                                   0.305              0.322       0.318         0.300       0.293
GMFE                                  2.017              2.104       2.080         1.993       1.962
MFE                                   2.276              2.728       2.689         2.290       2.253
Number of predictions with the
smallest error                        8                  7           7             4           5

Lastly, and somewhat surprisingly, model 8a produced the smallest prediction error for more compounds than any other model in Table 6.9 (8 compounds, against 4 to 7 for each of the other four models).
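The per-compound comparisons used in this section (the share of predictions for which one model has a smaller error than another, and the number of predictions for which each model has the smallest error) can be sketched as follows. This is an illustrative Python sketch, not the procedure used in this work: the function names, the model labels and the error values are hypothetical, and the handling of ties (a tied compound is credited to no model) is an assumption.

```python
def fraction_smaller_error(errors_a, errors_b):
    """Fraction of compounds for which model A's absolute error is
    strictly smaller than model B's (ties count against A)."""
    if len(errors_a) != len(errors_b) or len(errors_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    wins = sum(1 for a, b in zip(errors_a, errors_b) if abs(a) < abs(b))
    return wins / len(errors_a)


def count_smallest_error(errors_by_model):
    """For each model, count the compounds on which it has the strictly
    smallest absolute error. Assumption: ties are credited to no model."""
    names = list(errors_by_model)
    n = len(errors_by_model[names[0]])
    if any(len(errors_by_model[name]) != n for name in names):
        raise ValueError("all models must cover the same compounds")

    counts = {name: 0 for name in names}
    for i in range(n):
        abs_errs = {name: abs(errors_by_model[name][i]) for name in names}
        best = min(abs_errs.values())
        winners = [name for name, err in abs_errs.items() if err == best]
        if len(winners) == 1:
            counts[winners[0]] += 1
    return counts


# Hypothetical signed log10 errors for four compounds (illustration only).
example_errors = {
    "model_a": [0.1, -0.4, 0.2, 0.3],
    "model_b": [0.2, 0.1, 0.2, -0.5],
    "model_c": [0.3, 0.3, 0.5, 0.1],
}
print(fraction_smaller_error(example_errors["model_a"], example_errors["model_b"]))
print(count_smallest_error(example_errors))
```

Because ties are dropped in this sketch, the per-model counts need not add up to the number of compounds in the test set.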

There is an alternative theoretical hypothesis that transport holds no significant additional value based on the fact that many correct Vss predictions are made from compounds that undergo protein-mediated transport (Berellini et al., 2009). However, the impact of transport may vary across compounds, and a given compound that is transported and generates a 2-fold error is perceived as being correctly predicted. Perhaps accounting for the transport effect in this case would reduce the error from 2-fold to closer to 1 (perfect prediction). Indeed, this is what the current work demonstrates, whereby the addition of physiological

6.4. Conclusions

Modelling distribution using only the chemical information of compounds has proven difficult, since such an approach does not successfully account for the specific interactions between drugs and the physiological system that govern Vd. When modelling Vss, mispredictions are generally attributed to transport or tissue binding processes, and this is to some extent the general assumption even for unexplainable mispredictions, as seen in the literature (del Amo et al., 2013; Lombardo and Jing, 2016). This highlights the importance of addressing transporters in modelling drug distribution.

This chapter explored the impact of using key physiological processes as input information in the modelling of human Vss. However, as descriptors of this nature are obtained experimentally, it was proposed that physiological features could be modelled in a prior step (some of which was done in chapters 4 and 5), and that the learned (predicted) responses could be used to complete the data on experimental responses. At the limit, predicted responses could potentially be used as the standalone source of physiological information. The physiological parameters used in this work capture information about the potential of drugs to be transported by ABC or SLC transporters (substrate/non-substrate data) and the potential of drugs to accumulate in tissues through drug-induced phospholipidosis (again a categorical variable).

It was observed that, across different variations of regression methods or feature selection techniques, adding physiological descriptors improves the predictive performance of Vss in the great majority of cases. Additionally, it was observed that using predicted physiological data to fill in missing experimental observations, specifically regarding phospholipidosis, improved the predictive performance in the majority of cases, when compared to using just experimentally observed PL responses.

To validate the main premise of this chapter that physiological descriptors are useful features in Vd modelling, the best model obtained in this study was compared to: (1) a model built on the same dataset as the one used here, and (2) a model built on a considerably larger dataset, both only using molecular descriptors. Direct comparisons were possible through testing on two relatively small external datasets (one used by each of the mentioned models), which revealed that the best model in this work performed better than, or similarly to, previous models, and the incorporation of physiological descriptors improves models obtained by both methods.

The work presented in this chapter not only shows the value of using transporter and phospholipidosis data as input descriptors for the modelling of Vd, but also sets a precedent for the possibility of predicting physiological responses and using those predictions to complete missing data, in order to aid the learning of the Vd QSAR model.

7. Accounting for Transporter Binding, Transporter