Recall that the expression of genes is regulated by TFs, which bind to promoter regions and act either to activate or to repress transcription of the target gene. Signalling molecules bind to receptors on a cell’s surface to cause changes in the levels of intracellular factors, which also regulate transcription of their downstream targets. These target genes may also encode TFs or signals,
CHAPTER1: INTRODUCTION
leading to a gene regulatory network (GRN). We have already provided motivation for studying the GRN underlying mesendoderm formation and now give an overview of the types of mathematical model used to analyse the behaviour of a GRN. Mathematical simulations have several advantages over wet lab experiments: they are quicker and cheaper, and they can uncover the dynamics of components that are difficult to measure experimentally. In recent years GRNs have been studied by both experimental and theoretical biologists in single cells (for example the λ phage switch [40]) and in multicellular systems (for example the segment polarity network in Drosophila [91]). There are three main classes of model used to analyse GRNs in a single cell: logical models, continuous models and stochastic models [69]. Here we give an overview of each class of model, before introducing multicellular models in the next section.
1.10.1 Overview of modelling frameworks
In this section, we give an overview of the types of mathematical model available for GRNs (as reviewed in [39, 69, 129, 137]), along with a detailed description of the ordinary differential equation (ODE) approach that we will use to develop the mathematical models in this thesis.
Logic-based methods provide the simplest modelling framework, focusing on the behaviour of the network topology rather than on changes in the expression levels of genes. Boolean networks were first proposed as a method for exploring the behaviour of gene networks by Kauffman [70, 71]. In a Boolean model, each gene $x_i$ is a node in the network which is assigned a binary variable such that it is either ‘ON’ ($x_i = 1$) or ‘OFF’ ($x_i = 0$). Time is defined by a number of discrete time steps ($t_1, t_2, \ldots, t_n$) and the state of each gene is updated synchronously according to a defined set of rules. Formally, for a set of genes ($x_1, x_2, \ldots, x_n$) the state of each gene at time $t+1$ is determined by a Boolean function $f_i$ of the states of the genes at time $t$, such that
$$x_i(t+1) = f_i(x_1(t), x_2(t), \ldots, x_n(t)). \tag{1.10.1}$$
Boolean networks have been applied to a range of biological networks, including the segment polarity network in Drosophila melanogaster, for which the Boolean model produces patterns in good agreement with experimental data [2]. Advantages of Boolean networks include fast computational analysis and the need for only qualitative data about the structure of the network to build a model. The main disadvantage is that, although a Boolean network can be used to explore steady states and the robustness of a system, it does not take into account changes in the levels of gene expression on a continuous scale.
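The synchronous update rule described above can be illustrated with a minimal sketch. The three-gene network below is hypothetical and is not taken from any of the cited models: gene x1 is assumed to be a constant input, while x2 and x3 are each activated by x1 and repressed by the other. The function names `step`, `trajectory` and `fixed_points` are illustrative only.

```python
from itertools import product


def step(state):
    """Synchronously update all genes from the state at time t.

    The rules are hypothetical and chosen only for illustration.
    """
    x1, x2, x3 = state
    return (
        x1,                  # f_1: the input gene keeps its value
        int(x1 and not x3),  # f_2: activated by x1, repressed by x3
        int(x1 and not x2),  # f_3: activated by x1, repressed by x2
    )


def trajectory(state, n_steps):
    """Return the list of states visited over n_steps synchronous updates."""
    states = [state]
    for _ in range(n_steps):
        states.append(step(states[-1]))
    return states


def fixed_points():
    """Enumerate all 2**3 states and keep those unchanged by an update."""
    return [s for s in product((0, 1), repeat=3) if step(s) == s]


if __name__ == "__main__":
    print("steady states:", fixed_points())
    print("trajectory from (1, 0, 0):", trajectory((1, 0, 0), 4))
```

In this toy network the steady states with x1 = 1 have exactly one of x2 and x3 switched on, whereas the state (1, 0, 0) alternates with (1, 1, 1) indefinitely. Such oscillations can arise purely from the synchronous update scheme, which is one reason to treat Boolean results as qualitative.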
Real biological systems produce continuous rather than discrete-valued data. Continuous models, based on systems of ODEs, give real-valued levels of gene expression over a continuous timescale, producing simulations which can be directly compared with experimental data. In a continuous modelling framework, a single ODE represents the rate of change of an mRNA (or protein) (x) in the network as a nonlinear function (f(x)) of the other mRNAs (or proteins) in a
network, leading to a system of coupled ODEs of the form:
$$\dot{x} = f(x). \tag{1.10.2}$$
The representation of activation, repression and degradation of an mRNA (or protein) as part of f(x) is introduced in section 1.11. Time-delay ODEs allow for delays between the onset of transcription and the synthesis of the protein. Advantages of using a continuous approach instead of Boolean networks include the ability to give a more detailed representation of the molecular mechanisms underlying gene regulation, which can be analysed using dynamical systems theory, e.g. bifurcation analysis. However, a large number of kinetic parameters are required to solve such models, and realistic values for these parameters are not always known. As a result, models are either restricted to qualitative analysis or rely on computational techniques to estimate the parameters (see section 1.13 for more details).
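As a minimal sketch of the continuous approach, the following code integrates a single ODE of the form dx/dt = f(x) with a forward-Euler scheme. The specific form of f used here, Hill-type activation by a fixed input s minus linear degradation, is an assumption made for illustration and is not necessarily the form introduced in section 1.11. All parameter names and values (`k`, `K`, `n`, `d`) are hypothetical.

```python
def hill_activation(s, K, n):
    """Assumed Hill-type activation term, rising from 0 towards 1 as s increases."""
    return s**n / (K**n + s**n)


def steady_state(s, k=1.0, K=0.5, n=2, d=0.1):
    """Analytical steady state of dx/dt = k*H(s) - d*x, obtained by setting dx/dt = 0."""
    return k * hill_activation(s, K, n) / d


def simulate(x0, s, k=1.0, K=0.5, n=2, d=0.1, dt=0.01, t_end=100.0):
    """Forward-Euler integration of dx/dt = k*H(s) - d*x from x(0) = x0.

    Returns the value of x at time t_end.
    """
    x = x0
    for _ in range(int(round(t_end / dt))):
        dxdt = k * hill_activation(s, K, n) - d * x  # production minus degradation
        x += dt * dxdt
    return x


if __name__ == "__main__":
    print("simulated x(t_end):", simulate(x0=0.0, s=0.5))
    print("analytical steady state:", steady_state(s=0.5))
```

Even this one-gene sketch needs four kinetic parameters, which illustrates why parameter estimation becomes a central issue for larger networks.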
The logical and continuous models described above are deterministic, i.e. they do not take into account stochastic processes. When the number of molecules in a system is small, stochastic effects can become significant. In stochastic models of gene regulation, gene expression levels are governed by a master equation, which describes how the probability that the network is in a particular state changes over time. The master equation is difficult to solve, and is usually studied using stochastic simulation algorithms. Stochastic models are more computationally demanding to simulate than deterministic models, and require more detailed experimental data to fit. In developing embryos, stochastic fluctuations in the levels of individual genes are not important, since the number of mRNA/protein molecules is large and degradation rates are slow [28].
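To make the idea of a stochastic simulation algorithm concrete, the sketch below applies Gillespie's direct method, one widely used algorithm of this kind, to a single species that is produced at a constant rate and degraded at a rate proportional to its copy number. The text above does not name a particular algorithm, so this choice, the birth-death model itself and all parameter values are assumptions made for illustration.

```python
import random


def gillespie_birth_death(k, d, n0, t_end, seed=0):
    """Simulate production (rate k) and degradation (rate d*n) of one species.

    Returns the event times and the copy number after each event.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_birth = k
        a_death = d * n
        a_total = a_birth + a_death
        if a_total == 0.0:
            break  # no reaction can occur
        t += rng.expovariate(a_total)  # exponentially distributed waiting time
        if t > t_end:
            break
        # Choose which reaction fires, with probability proportional to its rate.
        if rng.random() * a_total < a_birth:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


def time_average(times, counts, t_start, t_end):
    """Time-weighted mean copy number over the window [t_start, t_end]."""
    total = 0.0
    for i, n in enumerate(counts):
        left = max(times[i], t_start)
        right = min(times[i + 1] if i + 1 < len(times) else t_end, t_end)
        if right > left:
            total += n * (right - left)
    return total / (t_end - t_start)


if __name__ == "__main__":
    times, counts = gillespie_birth_death(k=10.0, d=0.1, n0=0, t_end=500.0)
    print("time-averaged copy number:", time_average(times, counts, 100.0, 500.0))
    print("deterministic steady state k/d:", 10.0 / 0.1)
```

For this model the copy number fluctuates around the deterministic steady state k/d, and the relative size of the fluctuations shrinks as the mean copy number grows, in line with the argument that stochastic effects matter most when molecule numbers are small.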
Models of mesoderm and endoderm specification currently available in the literature consist of systems of ODEs for GRNs in Xenopus [95, 123] and the sea urchin [76, 77]. As already mentioned, a key concept in the differentiation of the primary germ layers is the formation of two populations of cells representing mesoderm and endoderm. In these models, the two populations of cells correspond to stable steady states, with mutual antagonism between mesodermal and endodermal genes determining which state dominates. These models are discussed in more detail in chapter 3.
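The link between mutual antagonism and two stable steady states can be sketched with a generic two-gene model in which each gene represses the other. This is not one of the published models cited above: the equations, the dimensionless parameters `a` and `n`, and the function name `simulate_toggle` are all assumptions chosen for illustration.

```python
def simulate_toggle(x0, y0, a=4.0, n=2, dt=0.01, t_end=50.0):
    """Forward-Euler integration of a hypothetical mutual-repression model:

        dx/dt = a / (1 + y**n) - x
        dy/dt = a / (1 + x**n) - y

    Returns the pair (x, y) at time t_end.
    """
    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        dx = a / (1.0 + y**n) - x  # x is repressed by y and decays linearly
        dy = a / (1.0 + x**n) - y  # y is repressed by x and decays linearly
        x, y = x + dt * dx, y + dt * dy
    return x, y


if __name__ == "__main__":
    # Two initial conditions on either side of the symmetric state.
    print("x initially higher:", simulate_toggle(1.5, 1.0))
    print("y initially higher:", simulate_toggle(1.0, 1.5))
```

With these parameter values, the gene that starts at the higher level ends up strongly expressed and suppresses the other, so the two initial conditions settle into two different stable steady states. In this toy model the initial bias plays the role of whatever upstream input tips the balance between the two fates.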