IV. Results
4.1. Prepare the technical dossier for creating the private conservation area of
Fitness
The virtual ecosystem EcoSim allows the study of the complex relationship between species genetic diversity and species fitness through an evolutionary process, and is not limited to investigating this relationship under particular environmental conditions or at specific time periods, as is the case in most biological studies [82] [83] [84]. In EcoSim the environment is dynamic, and this is reflected in the adaptation and genomic evolution of individuals. Thus, many factors affect the genetic diversity of individuals and the fitness of populations, and these factors can change over time and differ from one species to another. Because we model the long-term evolution of many species in a dynamic environment, the correlation between genetic diversity and species fitness may change over time. Shannon entropy is used as a measure of genetic diversity, as presented in Section 4.2. At every time step, entropy and fitness are calculated for all existing species. To investigate their possible correlations, Spearman's cross-correlation [85] between genetic diversity and fitness is first calculated for all prey species. The Spearman measure ranks the two sets of values and tests for a linear relationship between the ranks. A perfect Spearman correlation of +1 or -1 occurs when each of the variables is a perfect monotone function of the other. The Spearman correlation coefficient is computed as follows:

\[ \rho = 1 - \frac{6 \sum_{i=1}^{N} d_i^{2}}{N^{3} - N} \qquad (6.1) \]
where N is the number of items, and d_i is the difference between each population's rank of fitness and its rank of diversity. A value of -1 represents perfect negative correlation, 0 denotes no correlation, and +1 represents perfect positive correlation.
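As an illustration of Equation 6.1, the following minimal Python sketch ranks two series and applies the formula directly. The function names `ranks` and `spearman` are hypothetical, and the sketch assumes that neither series contains tied values, since the closed form above is exact only in that case.

```python
def ranks(values):
    """Return the 1-based rank of each value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(diversity, fitness):
    """Spearman coefficient from Equation 6.1 (assumes no tied values)."""
    n = len(diversity)
    if n != len(fitness) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    # d_i is the difference between the two ranks of item i.
    d_squared = sum((a - b) ** 2
                    for a, b in zip(ranks(diversity), ranks(fitness)))
    return 1 - 6 * d_squared / (n ** 3 - n)


# A monotone but non-linear relationship still gives a perfect coefficient.
print(spearman([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```

Because only ranks enter the formula, any strictly increasing relationship yields +1 and any strictly decreasing one yields -1.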
Figure 6.1: Correlation values between entropy and fitness for different prey species. The x-axis represents the different time shifts. The y-axis represents the correlation values.

A positive correlation indicates that either low fitness accompanies low diversity or high fitness accompanies high diversity. Conversely, if high fitness is associated with low diversity, a negative correlation is detected. In the studied evolutionary ecosystem simulation, the effect of diversity on fitness is not immediate: there must be a time shift between a variation in genetic diversity and its effect on fitness. Also, since the direction of causality between the two attributes is not known in advance, the correlations in both shift directions are calculated. The Spearman correlation coefficient is computed between the two time series for every possible shift between -s and +s time steps. In essence, the entropy at time t is correlated with the fitness at time t + k, where the shift k ranges from -s to +s.
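The shifted correlation described above can be sketched as follows. This is an illustrative Python outline, not EcoSim's own code: the names `spearman` and `cross_correlation` and the parameter `max_shift` (the bound s) are hypothetical, and the Spearman helper assumes no tied values.

```python
def spearman(x, y):
    """Spearman coefficient via the rank-difference formula (no ties)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        result = [0] * len(values)
        for rank, index in enumerate(order, start=1):
            result[index] = rank
        return result

    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n ** 3 - n)


def cross_correlation(entropy, fitness, max_shift):
    """Map each shift k in [-max_shift, max_shift] to the correlation
    between entropy[t] and fitness[t + k] over the overlapping samples."""
    n = len(entropy)
    result = {}
    for k in range(-max_shift, max_shift + 1):
        if k >= 0:
            e, f = entropy[:n - k], fitness[k:]
        else:
            e, f = entropy[-k:], fitness[:n + k]
        result[k] = spearman(e, f)
    return result


# Toy series in which fitness follows entropy two steps later.
entropy = [1, 3, 2, 5, 4, 7, 6, 9]
fitness = [0.5, 0.25, 1, 3, 2, 5, 4, 7]
cc = cross_correlation(entropy, fitness, max_shift=2)
print(max(cc, key=cc.get))
```

A positive best shift means that changes in entropy precede changes in fitness; a negative one means the reverse.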
Although many factors besides genetic diversity might affect fitness, a strong correlation between entropy and fitness was found for all prey species. The cross-correlation charts for some prey species are presented in Fig. 6.1. The x-axis in these charts represents the different shifts of the time series, and the y-axis represents the cross-correlation value at the corresponding shift. The figure shows not only that different species have different cross-correlation values, but also that the same species correlates differently depending on the time shift. Note that multiple factors affect the behaviour of species, including the dynamic environment, co-evolution and parameters that change with time. The correlation values for the same species may thus vary through the course of evolution, allowing us to investigate biologically meaningful relationships that may not be feasible to study by experimentation. This observation encouraged additional analyses in which the two time series are divided into time frame windows and correlations are measured only within a specific time frame rather than over the entire time series. In other words, the time series were split into sliding windows of 200 time steps, centered at every time step, within which all possible correlations are calculated for the different shifts up to ±s. The correlation value with the highest magnitude (whether positive or negative) is then chosen and assigned to the species at that time step.
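One way to read the windowing procedure is sketched below in Python. The details are assumptions made for illustration: the entropy window is held fixed while the fitness window is shifted, time steps too close to either end of the series are skipped, and the Spearman helper assumes no tied values. The function names are hypothetical.

```python
def spearman(x, y):
    """Spearman coefficient via the rank-difference formula (no ties)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        result = [0] * len(values)
        for rank, index in enumerate(order, start=1):
            result[index] = rank
        return result

    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n ** 3 - n)


def strongest_windowed_correlation(entropy, fitness, center, window, max_shift):
    """Return (rho, shift) with the largest |rho| for the window centered
    at `center`, or None if the shifted windows would leave the series."""
    half = window // 2
    start, stop = center - half, center + half
    if start - max_shift < 0 or stop + max_shift > len(entropy):
        return None
    best = None
    for k in range(-max_shift, max_shift + 1):
        rho = spearman(entropy[start:stop], fitness[start + k:stop + k])
        if best is None or abs(rho) > abs(best[0]):
            best = (rho, k)
    return best


# Toy series: fitness is the negated entropy, delayed by one time step.
entropy = [5, 1, 9, 3, 7, 2, 8, 4, 10, 6, 12, 11]
fitness = [0.5] + [-e for e in entropy[:-1]]
print(strongest_windowed_correlation(entropy, fitness,
                                     center=6, window=4, max_shift=2))
```

With the settings reported in this section, `window` would be 200 and `max_shift` would be 25; real series are likely to contain ties, in which case a tie-aware rank correlation should replace the helper above.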
The results of five different runs of the simulation are presented, each one containing 16,000 time steps and generating around 110,000 instances on average. The correlation values are assigned to three classes. Correlation values between -0.5 and 0.5 belong to class WEAK CORR, representing either no or weak correlation. Correlation values above 0.5 are high positive (HIGHP) and correlation values below -0.5 are high negative (HIGHN). These correlation classes are calculated for all instances (the set of all species at every time step) in every run, and the percentage of each class is reported for a window of 200 and a maximum shift of 25 in both directions. The averages over the five runs were 26.8%, 38.4% and 34.6% for classes HIGHP, HIGHN and WEAK CORR respectively.
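The discretization into three classes can be sketched in a few lines of Python. The function names are hypothetical, and assigning the boundary values -0.5 and 0.5 to WEAK CORR is an assumption, since the text does not say which class they fall into.

```python
from collections import Counter

LABELS = ("HIGHP", "HIGHN", "WEAK CORR")


def classify(rho, threshold=0.5):
    """Map a correlation value to one of the three classes.
    Values exactly at +/- threshold are treated as weak (an assumption)."""
    if rho > threshold:
        return "HIGHP"
    if rho < -threshold:
        return "HIGHN"
    return "WEAK CORR"


def class_percentages(correlations):
    """Percentage of instances falling into each class."""
    counts = Counter(classify(rho) for rho in correlations)
    total = len(correlations)
    return {label: 100.0 * counts[label] / total for label in LABELS}


# Eight toy instances: two high positive, two high negative, four weak.
print(class_percentages([0.9, -0.7, 0.1, 0.5, -0.5, 0.6, -0.9, 0.0]))
```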
To better validate the calculations, variations in the window and shift values were investigated. A window of 200 and a maximum shift of 20 in both directions gave 17%, 29.6% and 53.4% on average over the five runs for the HIGHP, HIGHN and WEAK CORR classes respectively. Increasing the window and maximum shift to 400 and 50 was also tested; the average percentages were 23.7%, 27.5% and 48.8% for the HIGHP, HIGHN and WEAK CORR classes respectively. Increasing the shift values increases the percentage of high correlation instances, as more time is needed to detect an increase in fitness after an increase in genetic diversity. Also note that increasing the window does not necessarily increase the share of high correlation values, as some fluctuations in the entropy or fitness time series could exist. The shift values that lead to the highest correlation values were also examined. It was found that 37.7% of instances in the five runs obtained their highest correlations from a positive shift between 10 and 25. In addition, an average of 38.7% over the five runs found their highest correlation at a negative shift between -10 and -25. Together (37.7% + 38.7% = 76.4%), this shows that for more than 76% of the cases a shift of 10 to 25 time steps was sufficient to see the effect of genetic diversity on fitness or vice versa. These values correspond roughly to one to three "biological generations" (the average life span of an individual), which seems a reasonable time in which to observe the effect of genetic variations in a population.
In order to validate the correlation results, an additional test was performed. First, both the fitness and genetic diversity time series were randomized, and the same correlation calculation was performed on them. Windows of 400 with a maximum shift of 50 were used, and Spearman's cross-correlation was calculated for all instances from five different simulation runs in order to compare with the results for the original time series. The resulting correlation values were discretized in the same way, leading to 100% WEAK CORR, 0% HIGHP and 0% HIGHN. These results further validate the high correlation results obtained between genetic diversity and fitness.
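The randomization control can be outlined as below. This Python sketch is illustrative only: it assumes that "randomized" means an independent random permutation of each series, it uses a single whole-series cross-correlation instead of the windowed procedure, and the names `shuffled_copy` and `randomized_control` are hypothetical. The Spearman helper again assumes no tied values.

```python
import random


def spearman(x, y):
    """Spearman coefficient via the rank-difference formula (no ties)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        result = [0] * len(values)
        for rank, index in enumerate(order, start=1):
            result[index] = rank
        return result

    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n ** 3 - n)


def shuffled_copy(series, rng):
    """Return a randomly permuted copy, leaving the input untouched."""
    copy = list(series)
    rng.shuffle(copy)
    return copy


def randomized_control(entropy, fitness, max_shift, seed=0):
    """Strongest shifted correlation after shuffling both series."""
    rng = random.Random(seed)  # seeded so the control is reproducible
    e = shuffled_copy(entropy, rng)
    f = shuffled_copy(fitness, rng)
    n = len(e)
    best = 0.0
    for k in range(-max_shift, max_shift + 1):
        if k >= 0:
            rho = spearman(e[:n - k], f[k:])
        else:
            rho = spearman(e[-k:], f[:n + k])
        if abs(rho) > abs(best):
            best = rho
    return best
```

Shuffling destroys the temporal ordering while preserving each series' value distribution, so any strong correlation that survives would point to an artifact of the procedure rather than a real relationship.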
The findings discussed above, with very high values for both negative and positive correlations, support the claim that genetic diversity has a great influence on the well-being of species. High positive correlation values mean that an increase in genetic diversity is accompanied by an increase in species fitness. There are many ways to interpret these results. For instance, a newly forming species in EcoSim with a small but sufficient population size would gradually increase its genetic diversity, which would then correlate positively with its fitness. These results may also reflect the fact that individuals in EcoSim adapt to their constantly changing environment. Adaptation could be mirrored by an increase in the similarity of the species' FCMs (and thus a decrease in entropy) as new behaviors arise for the new environment and then diffuse throughout the population. On the other hand, negative correlations imply that a species decreases its diversity, which may happen once individuals have adapted to their environment, in order to reach stability. To further validate these results and investigate the reasons behind these correlation values, the next step was to build a classifier that could predict the correlation values.