The first step for building the model was to select the state variables to represent the feedback structure of the pathway. Since there was a lack of quantitative time course data in the literature, the model was chosen to be a minimal description of this system. The use of purpose-built mutants allowed several simplifying assumptions to be made, such as the decoupling that exists between nutritional monitoring and Ste11 regulation in the cyr1∆ strain (Valbuena and Moreno, 2010). Ste11 is constitutively expressed in these cells; however, its mere presence in the cell is not sufficient to exert its transcription factor functions, which can only begin after exposure to pheromone. This required that at least two possible states of Ste11 be considered, and since it was not clear whether the main determinant of Ste11 state was its subcellular localisation or its phosphorylation status, the two states were broadly labelled as (TF) active and inactive (Qin et al., 2003; Kjærulff et al., 2005). Although there are many steps between pheromone detection by the receptors and Ste11 activation, the amount of pheromone sensed by cells is known to affect the rate of Ste11-mediated transcription (Didmon et al., 2002), so this was reflected in the model by making the rate of Ste11 activation directly dependent on pheromone dose. To complete the model, the feedback loops of the pathway were included by incorporating a Ste11 enhancement of its own production (Sugimoto et al., 1991; Kunitomo et al., 2000), and the Ste11-dependent expression of sxa2, whose gene product is responsible for the enzymatic inactivation of P-factor pheromone (Imai and Yamamoto, 1994; Ladds et al., 1996).
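The feedback structure described above can be illustrated with a minimal sketch. The code below is not the model developed in this work: the rate laws, parameter values, initial conditions and the function name `simulate` are all hypothetical placeholders chosen only to show how the four ingredients (dose-dependent Ste11 activation, Ste11 positive feedback on its own production, Ste11-dependent Sxa2 expression, and Sxa2-mediated pheromone inactivation) could be wired together, here with saturation terms of Hill coefficient 1 and a simple forward Euler integration.

```python
def simulate(pheromone_dose, t_end=50.0, dt=0.01):
    """Forward Euler integration of a toy four-variable pheromone model.

    Every constant below is a hypothetical placeholder, not a fitted value.
    """
    k_basal = 0.1   # constitutive Ste11 production
    k_fb = 0.5      # maximal Ste11-dependent Ste11 production (positive feedback)
    K_fb = 1.0      # half-saturation of the positive feedback (Hill coefficient 1)
    k_act = 0.5     # Ste11 activation rate per unit pheromone
    d_ste11 = 0.1   # Ste11 turnover (both states)
    k_sxa2 = 1.0    # Ste11-dependent Sxa2 production
    d_sxa2 = 0.1    # Sxa2 turnover
    k_deg = 0.2     # Sxa2-mediated pheromone inactivation

    # State: pheromone, inactive Ste11, active Ste11, Sxa2
    P, S_in, S_act, X = pheromone_dose, 1.0, 0.0, 0.0

    for _ in range(int(round(t_end / dt))):
        activation = k_act * P * S_in
        dP = -k_deg * X * P
        dS_in = (k_basal + k_fb * S_act / (K_fb + S_act)
                 - activation - d_ste11 * S_in)
        dS_act = activation - d_ste11 * S_act
        dX = k_sxa2 * S_act - d_sxa2 * X
        P += dt * dP
        S_in += dt * dS_in
        S_act += dt * dS_act
        X += dt * dX

    return {"pheromone": P, "ste11_inactive": S_in,
            "ste11_active": S_act, "sxa2": X}


no_pheromone = simulate(0.0)
with_pheromone = simulate(1.0)
```

In this sketch, Ste11 remains inactive and no Sxa2 is produced in the absence of pheromone, whereas a non-zero dose activates Ste11, induces Sxa2 and thereby depletes the pheromone signal.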
6.2.1
Gaps in knowledge give rise to a family of model variants
Deriving the model from limited knowledge leaves open the possibility that alternative descriptions are better suited to the system in question. In section 3.2.1 a number of possible model variants were defined based on the model details for which no optimal choice was clear. These details included the level of detail that should be represented, and how important different aspects of the model were for explaining the actual signalling dynamics. The expectation was that the data generated at later stages of the modelling cycle would be used to resolve the ambiguities between the model variants.
6.2.2
Preliminary assessment of the new model
Before continuing with further efforts to develop the model, it was desirable to judge how well it could describe existing data, and to set the expectations of how much data would need to be generated to fully constrain the model dynamics.
Since most studies in this area to date have produced qualitative data, only one suitable data set could be found in the literature to which the model could be fitted. The study by Ladds et al. (1996) performed a relative quantification time course of Sxa2 production in response to several doses of pheromone. Although the time-consuming nature of the experiments used to acquire these data only allowed for single replicates to be obtained, it was expected that having information for multiple doses would make this data set highly informative for parameter estimation. Fitting the model variants to the Ladds et al. (1996) data gave similar results in most cases, which highlighted the need for an objective way to discriminate between model fits. Typical approaches to model selection are broadly categorised as either selection criteria or statistical hypothesis testing. Although each has advantages and disadvantages that usually have to be weighed, the large number of models that had to be compared simultaneously made the choice easier, as hypothesis testing is usually restricted to pairwise comparisons (Motulsky and Christopoulos, 2004). Here the AIC score was used, as it is one of the most widely used criteria for model selection and has been shown to compare favourably against other alternatives (Aho et al., 2014). One of its advantages is that it not only indicates which model is more likely, but can also estimate how much more likely a model is compared to its competitors. The use of AIC scores to compare model variants at this stage of the project, with only one data set available for parameter estimation, did not provide any conclusive evidence for or against particular model variants; however, it gave a clear indication that without additional data only the simplest alternatives should be considered.
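The way AIC scores quantify relative support can be sketched as follows. The example assumes the common least-squares form of the criterion, AIC = n ln(RSS/n) + 2k, which may differ from the exact form used for the fits reported here (for instance, a small-sample correction could have been applied). The function names, the residual sums of squares, the parameter counts and the number of data points are all made up for illustration and are not results from this work.

```python
import math


def aic_least_squares(rss, n_points, n_params):
    """AIC for a least-squares fit with Gaussian errors (up to a constant)."""
    return n_points * math.log(rss / n_points) + 2 * n_params


def akaike_weights(aic_scores):
    """Convert AIC scores into relative likelihoods that sum to 1."""
    best = min(aic_scores)
    relative = [math.exp(-0.5 * (score - best)) for score in aic_scores]
    total = sum(relative)
    return [r / total for r in relative]


# Hypothetical fits: variant name -> (residual sum of squares, free parameters)
n_points = 24
fits = {
    "simple": (1.20, 5),
    "extra_feedback": (1.15, 7),
    "free_hill": (1.18, 8),
}

names = list(fits)
scores = [aic_least_squares(rss, n_points, k) for rss, k in fits.values()]
weights = dict(zip(names, akaike_weights(scores)))
best_variant = max(weights, key=weights.get)
```

With these invented numbers the more complex variants fit only marginally better, so the penalty on the number of parameters dominates and the simplest variant receives the largest Akaike weight. The ratio of two weights gives an estimate of how much more likely one variant is than another.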
Nonetheless, in one particular case, the choice between fixing Hill coefficients or leaving them as free parameters, a decision could be reached without resorting to AIC comparisons. A simulation analysis showed that deviations of the Hill coefficients away from a value of 1 resulted in model dynamics that were unambiguously incorrect, and that these effects could not be compensated for by other parameters (Figure 3.10). On the basis of these observations, this choice was resolved in favour of fixed Hill coefficients.
6.2.3
Unidentifiable parameters and the possibility of a fully identifiable model
From the initial fits of models to the Ladds et al. (1996) data, it became clear that parameter non-identifiabilities were affecting the model, and several approaches were used to gauge the extent of dependency between the model parameters. A cross-correlation calculation showed which pairs of parameters were affected the most (Figure 3.12), and the resulting relationships were visualised using a Monte Carlo approach (Figure 3.13). This revealed several striking relationships where any parameter value was seemingly allowed, pointing towards the presence of structural non-identifiabilities. Other relationships showed that despite being strongly correlated, the parameter values were still confined to a small region of parameter space, which suggested that confidence intervals could be calculated to show how well determined each parameter was.
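The kind of relationship that such an analysis can expose is illustrated by the following sketch. It is one simple way of sampling acceptable fits and is not necessarily the procedure used to produce Figures 3.12 and 3.13. The toy model y = a·b·x, the synthetic data, the sampling ranges and the acceptance tolerance are all assumptions chosen so that only the product a·b is constrained by the data.

```python
import math
import random


def sample_acceptable_fits(n_samples=20000, tol=0.05, seed=0):
    """Randomly sample (log a, log b) and keep pairs that fit the data well."""
    rng = random.Random(seed)
    x = [1.0, 2.0, 3.0]
    y = [2.0 * xi for xi in x]  # synthetic data generated with a*b = 2
    accepted = []
    for _ in range(n_samples):
        log_a = rng.uniform(-2.0, 2.0)
        log_b = rng.uniform(-2.0, 2.0)
        a, b = math.exp(log_a), math.exp(log_b)
        rss = sum((yi - a * b * xi) ** 2 for xi, yi in zip(x, y))
        if rss < tol:
            accepted.append((log_a, log_b))
    return accepted


def pearson(pairs):
    """Pearson correlation coefficient of a list of (u, v) pairs."""
    n = len(pairs)
    mean_u = sum(u for u, _ in pairs) / n
    mean_v = sum(v for _, v in pairs) / n
    cov = sum((u - mean_u) * (v - mean_v) for u, v in pairs)
    var_u = sum((u - mean_u) ** 2 for u, _ in pairs)
    var_v = sum((v - mean_v) ** 2 for _, v in pairs)
    return cov / math.sqrt(var_u * var_v)


accepted = sample_acceptable_fits()
correlation = pearson(accepted)
spread_log_a = (max(u for u, _ in accepted) - min(u for u, _ in accepted))
```

In this toy case the accepted samples lie along a line in log-parameter space with a correlation close to -1, and the accepted values of a span almost the whole sampled range. This is the signature of a structural non-identifiability: the individual parameters can take seemingly any value as long as their product is preserved.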
In principle, it is possible to compute confidence intervals through the pseudo-volumes of the data clouds in the Monte Carlo analysis (Balsa-Canto et al., 2008); however, this analysis is computationally very expensive, so a more efficient alternative was sought. A literature survey revealed PLE analysis as a viable option, which in addition to being reported as robust for use with small or noisy samples, also had the benefit of having a readily available free implementation to perform the calculations (Maiwald and Timmer, 2008; Raue et al., 2009, 2015).
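The principle behind a profile likelihood calculation can be sketched on two toy models. This is an illustration only and does not reproduce the implementation cited above. For each fixed value of the parameter of interest, the remaining parameter is re-optimised (here in closed form), and the confidence interval is the set of values for which the profile stays below a chi-square threshold. The data, the assumed known noise level `SIGMA`, the parameter grid and all function names are hypothetical.

```python
CHI2_95_1DOF = 3.84  # 95% quantile of the chi-square distribution, 1 d.o.f.
SIGMA = 0.2          # assumed known measurement noise (hypothetical)

x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]  # made-up data


def rss(predictions):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))


def profile_product(a):
    """Model y = a*b*x: for fixed a, the optimal b has a closed form."""
    b = sum(xi * yi for xi, yi in zip(x, y)) / (a * sum(xi * xi for xi in x))
    return rss([a * b * xi for xi in x])


def profile_line(a):
    """Model y = a*x + b: for fixed slope a, the optimal intercept is a mean."""
    b = sum(yi - a * xi for xi, yi in zip(x, y)) / len(x)
    return rss([a * xi + b for xi in x])


def confidence_interval(profile, grid):
    """Range of grid values whose profile lies below the 95% threshold."""
    values = [profile(a) / SIGMA ** 2 for a in grid]
    best = min(values)
    inside = [a for a, v in zip(grid, values) if v - best <= CHI2_95_1DOF]
    return inside[0], inside[-1]


grid = [0.1 + 0.01 * i for i in range(391)]  # a from 0.1 to 4.0
ci_line = confidence_interval(profile_line, grid)
ci_product = confidence_interval(profile_product, grid)
```

For the straight-line model the profile is a parabola and the interval is finite in both directions. For the product model the profile is flat, so the interval extends to both ends of the grid, which corresponds to a structurally non-identifiable parameter. A practically non-identifiable parameter would lie between these cases, with a profile that has a minimum but fails to cross the threshold in at least one direction.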
The results of the PLE analysis were consistent with those obtained through the other methods, and also illustrated that the methods complemented one another, as each provided a different perspective on the non-identifiabilities. Most parameters exhibited some type of non-identifiability, either structural or practical, with only two parameters having finite confidence intervals in both directions (Figure 3.14). These results then prompted the question of whether a fully identifiable model was possible, and if so, which measurements would be required to achieve it. To answer this question definitively, a structural identifiability analysis was performed using the STAUS method (Evans et al., 2002). Although it was shown that full identifiability was possible, repeating the analysis with many different observation functions suggested that all model species had to be measured, at least in some combination, to resolve all the non-identifiabilities in the model (section 3.5).