The literature that deals with the subfunctionalization process from a mathematical modelling perspective is limited. The early (and widely cited) work of Force [42] and Force and Lynch [82; 83] introduced the assumption of Poisson rates of mutation in the regulatory and coding regions, and derived some of the measures we covered in Section 3.2. Hughes and Liberles [53] were responsible for perhaps the most detailed analysis since the work of Force and Lynch [42; 82; 83]. In particular they presented an approximation to what they call the pseudogenization hazard rate, but until our recent contribution [109] no mathematical model for the overall process was explicitly set out.

Using their approximate pseudogenization hazard rate, Hughes and Liberles [53] characterised the subfunctionalization process as having a broadly concave decreasing hazard rate. They contrasted this with the convex decreasing hazard rate associated with neofunctionalization (derived by a similar approximation), which they argued was more in line with empirical reality.

Subsequently (e.g. Konrad et al. [66], Teufel et al. [117]), the analysis of hazard rates by Hughes and Liberles [53] has been used as a reference to define phenomenological approximations to the rate of pseudogenization for gene duplicates evolving under subfunctionalization. The approximations are phenomenological in the sense that functions are chosen to produce the shape properties discussed by Hughes and Liberles [53] without further analysis of the mechanics of the biological model.

We contend in [109] that this focus on hazard rates is slightly misplaced. Since the datasets that are ultimately analysed detect only pseudogenization, and not subfunctionalization, it should be the function in Equation (3.34), rather than the hazard rate (or approximations to it), which is fit to data, and hence which is of principal interest.

60 Hughes and Liberles approximation to the hazard rate

This difference is partly semantic, since the function Hughes and Liberles [53] wrote is in practice an approximation to what we call the cause-specific pseudogenization rate for most of its domain, before switching to an approximation to what we call the pseudogenization rate (or pseudogenization modified-cause-specific hazard rate). To explicate, Hughes and Liberles [53] applied the following approximation (using the notation introduced in Section 3.2):

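$$\lambda^z_t \approx \frac{P^z_i}{E(\Delta T_i)} \quad \text{for } t_{i-1} \le t < t_i, \tag{3.43}$$

where the fixed points $t_i$ are evaluated using

$$t_0 = 0 \quad \text{and} \quad t_i = t_{i-1} + E(\Delta T_i) \quad \text{for } 1 \le i \le z. \tag{3.44}$$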

That is, the (approximating) assumption was made that the hazard rates are piecewise constant within such specified time intervals $[t_{i-1}, t_i]$. For $t > t_z$, $\lambda^z_t$ was assumed to be 0. They wrote,

$$\lambda^z_t = \begin{cases} 2u_c & \text{for } 0 \le t < t_1 \\ u_c & \text{for } t_1 \le t < t_{z-1} \\ u_c + u_r & \text{for } t_{z-1} \le t < t_z \\ 0 & \text{for } t \ge t_z. \end{cases} \tag{3.45}$$
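As an illustration of how (3.44) and (3.45) fit together, the following minimal Python sketch builds the fixed points by cumulative summation and evaluates the piecewise-constant rate at a given time. The function names and the numerical waiting times are hypothetical. The values of $E(\Delta T_i)$ are defined in Section 3.2 and are not recomputed here, so they are passed in as an argument.

```python
from itertools import accumulate


def fixed_points(expected_waits):
    """Return [t_0, ..., t_z] from [E(dT_1), ..., E(dT_z)], as in (3.44)."""
    return [0.0] + list(accumulate(expected_waits))


def hl_rate(t, expected_waits, u_c, u_r):
    """Evaluate the piecewise-constant approximation (3.45) at time t >= 0."""
    z = len(expected_waits)
    if z < 2:
        raise ValueError("this sketch assumes z >= 2")
    ts = fixed_points(expected_waits)
    if t < ts[1]:
        return 2 * u_c        # 0 <= t < t_1
    if t < ts[z - 1]:
        return u_c            # t_1 <= t < t_{z-1}
    if t < ts[z]:
        return u_c + u_r      # t_{z-1} <= t < t_z
    return 0.0                # t >= t_z


# Made-up expected waiting times for z = 4, for illustration only.
waits = [0.5, 0.25, 0.25, 1.0]
print(fixed_points(waits))            # [0.0, 0.5, 0.75, 1.0, 2.0]
for t in (0.2, 0.6, 1.5, 2.0):
    print(t, hl_rate(t, waits, u_c=1.0, u_r=0.5))
```

With these illustrative inputs the rate steps through 2.0, 1.0 and 1.5 before dropping to 0.0 at $t = t_z = 2$.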

No weight is given to the possibility that subfunctionalization has occurred for $t < t_z$, and for $t \ge t_z$ no weight is given to the possibility that it has not occurred. While it is true that by the time $z$ mutations have occurred either subfunctionalization or pseudogenization must have occurred, this approximation implicitly assumes that subfunctionalization occurs at the expected time of the $z$th mutation $t = t_z$ exactly. If the rate remained at $u_c + u_r$ for all $t > t_{z-1}$, so that

$$\lambda^z_t = \begin{cases} 2u_c & \text{for } 0 \le t < t_1 \\ u_c & \text{for } t_1 \le t < t_{z-1} \\ u_c + u_r & \text{for } t \ge t_{z-1}, \end{cases} \tag{3.46}$$

this would be a reasonable approximation to the cause-specific hazard rate. However, switching to 0 after time $t_z$ indicates that the intent was to approximate something akin to our pseudogenization rate instead. Hughes and Liberles [53] plotted this approximation averaged over a range of $z$ to attempt to infer the shape of the true hazard function; Figure 3.2 shows our recreation of such a plot, similar to Fig. 7 in their paper. Although some of the other examples they looked at ended in short periods of convex decrease, they nonetheless characterised the pseudogenization rate of the subfunctionalization model as a ‘broadly concave decreasing’ function. This characterization is at odds with the predictions of our model, which we discuss further in Section 3.5.
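The two piecewise forms (3.45) and (3.46) agree for $t < t_z$ and differ only afterwards. One way to see the consequence is to integrate each from 0 to $t$: the integral of (3.45) stops growing at $t_z$, whereas that of (3.46) keeps growing linearly. The Python sketch below does this exactly for piecewise-constant rates. The function name, the `truncate` flag and the fixed points are hypothetical and chosen only for illustration.

```python
def cumulative_rate(t, ts, u_c, u_r, truncate):
    """Integrate the piecewise-constant rate from 0 to t.

    ts = [t_0, ..., t_z] are the fixed points of (3.44).
    truncate=True uses (3.45), where the rate is 0 for t >= t_z;
    truncate=False uses (3.46), where the rate stays at u_c + u_r.
    """
    z = len(ts) - 1
    last_end = ts[z] if truncate else float("inf")
    # (start, end, rate) for each constant segment
    segments = [
        (ts[0], ts[1], 2 * u_c),
        (ts[1], ts[z - 1], u_c),
        (ts[z - 1], last_end, u_c + u_r),
    ]
    total = 0.0
    for start, end, rate in segments:
        if t > start:
            total += rate * (min(t, end) - start)
    return total


# Made-up fixed points for z = 4, for illustration only.
ts = [0.0, 0.5, 0.75, 1.0, 2.0]
for t in (1.0, 2.0, 4.0):
    print(t,
          cumulative_rate(t, ts, 1.0, 0.5, truncate=True),
          cumulative_rate(t, ts, 1.0, 0.5, truncate=False))
```

For these inputs both integrals equal 3.0 at $t = t_z = 2$. At $t = 4$ the truncated form is still 3.0 while the untruncated form has reached 6.0.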

Figure 3.2: A partial recreation of Hughes and Liberles [53] ‘Fig. 7’ showing their approximation to the mean rate of pseudogenization $\lambda^Z_t$ with $Z \sim \mathrm{Uni}(2, 16)$, $u_c = 1$, $u_r = 0.5$. [Plot not reproduced here: horizontal axis $t$ from 0 to 10; vertical axis $\lambda^Z_t$ from 0 to 2.]
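The averaging behind a plot of this kind can be sketched in Python as follows. This is a sketch under stated assumptions, not the code used to produce the figure: $\mathrm{Uni}(2, 16)$ is read here as the discrete uniform distribution on the integers 2 to 16, all function names are hypothetical, and the expected waiting times $E(\Delta T_i)$ from Section 3.2 are supplied by a caller-provided function. The `unit_waits` placeholder does not reflect the model and will not reproduce the figure's shape.

```python
from itertools import accumulate


def hl_rate(t, expected_waits, u_c, u_r):
    """Piecewise-constant approximation (3.45) for a single z >= 2."""
    z = len(expected_waits)
    ts = [0.0] + list(accumulate(expected_waits))  # fixed points (3.44)
    if t < ts[1]:
        return 2 * u_c
    if t < ts[z - 1]:
        return u_c
    if t < ts[z]:
        return u_c + u_r
    return 0.0


def mean_rate(t, waits_for, u_c=1.0, u_r=0.5, z_values=range(2, 17)):
    """Average (3.45) over z, weighting each value in z_values equally."""
    rates = [hl_rate(t, waits_for(z), u_c, u_r) for z in z_values]
    return sum(rates) / len(rates)


def unit_waits(z):
    """PLACEHOLDER ONLY: unit expected waiting times, not the model's values."""
    return [1.0] * z


# Evaluate the averaged approximation on a coarse grid of times.
for t in (0.5, 1.5, 5.5, 20.0):
    print(t, mean_rate(t, unit_waits))
```

Replacing `unit_waits` with a function that returns the model's $E(\Delta T_i)$ for each $z$ would give the averaged curve for the stated parameters.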

The focus on shape properties of this approximation is important since this characterization has been used as the basis for various continuous-phenomenological approximations (e.g. in [66; 117; 116]). Different parameterizations of these models are intended to represent the different biological models in the literature, based on the shape properties associated with the parameters. Subfunctionalization is associated with parameters which yield a concave decreasing function, while neofunctionalization is associated with a sigmoid shape ending in a period of convex decrease. We contend that subfunctionalization actually produces behaviour very similar to that which is usually associated with neofunctionalization, and as such this approach to distinguishing between the two biological models may be flawed.

62 Shape properties of the pseudogenization rate function

3.5 Shape properties of the pseudogenization rate