Chapter III. Analysis of Results
3.1 General profile of the port's tourism demand
Each unique combination of the varying initial conditions for virus properties was simulated with a 5-year monitoring duration and a 4-day sampling frequency (Table 4.3). The replicate iterations were then averaged for each combination at each time point. Because of the high data volume (>1 million data points) and limited processing capability, the data space was reduced by removing most of the data between crop years (days 1–99 and 256–365). Further data reduction varied with the type of analysis. For instance, the median of each non-normally distributed variable was determined for each within-treatment crop year to expedite qualitative analyses.
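The reduction steps described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the study: the record layout `(treatment, iteration, day, value)`, the function names, and the assumption that absolute simulation days map onto 365-day years with days 100–255 retained are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, median

DAYS_PER_YEAR = 365                  # assumed year length
SEASON_START, SEASON_END = 100, 255  # days kept in each year (assumed inclusive)


def day_of_year(day):
    """Map an absolute simulation day (1-based) to a day of year in 1..365."""
    return (day - 1) % DAYS_PER_YEAR + 1


def crop_year(day):
    """Map an absolute simulation day (1-based) to a crop year starting at 1."""
    return (day - 1) // DAYS_PER_YEAR + 1


def reduce_records(records):
    """records: iterable of (treatment, iteration, day, value) tuples.

    Returns {(treatment, crop_year): median of the replicate-averaged values}.
    """
    # Step 1: average the replicate iterations at each time point.
    by_point = defaultdict(list)
    for treatment, _iteration, day, value in records:
        by_point[(treatment, day)].append(value)
    averaged = {key: mean(vals) for key, vals in by_point.items()}

    # Step 2: drop the days between crop years (1-99 and 256-365).
    # Step 3: take the median within each treatment and crop year.
    by_year = defaultdict(list)
    for (treatment, day), value in averaged.items():
        if SEASON_START <= day_of_year(day) <= SEASON_END:
            by_year[(treatment, crop_year(day))].append(value)
    return {key: median(vals) for key, vals in by_year.items()}
```

Averaging before filtering keeps the two steps independent, so the same averaged values could feed other reductions.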
4.4.1 Statistical Approach
Exploratory data analysis using the Shapiro-Wilk test for normality found that nearly all treatments of the 20 independent variables did not follow normal distributions in anywhere from one to all of the virus factor combinations (crop year, infection rate, mutation rate, and virulence), and a significant portion showed some level of heteroscedasticity. Therefore, non-parametric significance tests were used, such as the Kruskal-Wallis test, the Kolmogorov-Smirnov test, and permutation tests for multi-factorial analysis of variance (ANOVA) using the R package lmPerm [133].

Table 4.3: SCNSim parameters

Simulation Properties
  Sampling Frequency     4 days
  Iterations             10
  Simulation duration    5 years
Virus Properties
  Mutation rates         0, 0.1, 0.2, 0.4, 0.6, 0.8
  Virulence              0.1, 0.5, 1, 1.5, 2, 2.5, 4
  Transmissibility       0.5
  Infection Rate         0.2, 0.8
  Durability             0.5
  Viral Load             0.5
The criteria defined below, which characterize virus agent effectiveness, were compared between different virus treatments, or pathotypes. Typically, a pathotype describes a group of organisms with the same type of pathogenicity towards a specific host. In our data, we refer to each unique combination of virus properties as a pathotype. Results from the virus treatment simulations were compared to control simulations, which were run in the same way but without any viral properties. Statistical analyses were performed using R Statistical Software [134, 135, 136, 137].
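To illustrate the idea behind a permutation test, the sketch below compares one treatment against the control. It is a simplified two-group stand-in written in Python, not the multi-factorial permutation ANOVA provided by lmPerm. The function name, the choice of the difference in means as the test statistic, and the default number of permutations are assumptions made for this example.

```python
import random
from statistics import mean


def permutation_test(treatment, control, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns an estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = abs(mean(treatment) - mean(control))
    pooled = list(treatment) + list(control)
    n_treat = len(treatment)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = abs(mean(pooled[:n_treat]) - mean(pooled[n_treat:]))
        if diff >= observed:
            extreme += 1

    # The add-one correction avoids reporting an impossible p-value of 0.
    return (extreme + 1) / (n_permutations + 1)
```

Because the null distribution is built from the data themselves, no normality or equal-variance assumption is needed, which is the property that motivates permutation methods for these data.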
4.4.2 Dimensionality Reduction
SCNSim produces a dataset of 31 variables, which not only complicates analysis but also exceeds the computational limits of many software packages and the processing capability of commercially available computer hardware. Scaled principal component analysis (PCA) was performed with the prcomp function in R [134] to simplify the dataset by qualitatively reducing the number of variables. The principal components (PCs) that made up the eigensystem subspace accounting for up to 80% of the sample space variation were selected for further testing. Variables were assigned to each PC by examining their index of loadings (IL):

\[ IL_{ij} = \frac{u_{ij}^{2}\,\lambda_{i}^{2}}{s_{j}} \tag{4.1} \]

where IL_ij is the index of loading of the ith PC and the jth variable, u_ij is the loading of the jth variable in the ith PC, λ_i is the eigenvalue of the ith PC, and s_j is the standard deviation of the jth variable, which equals 1 when a scaled correlation matrix is used.
The PCA results revealed that principal components 1–4 accounted for 81.34% of the variance in the virus treatments data space (Table B.2). Many of the population variables (‘J1’, ‘J2’, etc.) were found to be closely correlated and therefore redundant. The variables ‘Nematodes’ and ‘Cyst’ were chosen to represent the SCN population, ‘Virulence’ and ‘Transmissibility’ were used to describe the viral epizoology, and ‘Fraction Infected’ and ‘Death by virus’ were used to describe disease impacts on SCN.
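The two selection steps in this subsection (choosing PCs by cumulative variance, then scoring variables with the index of loadings in Equation 4.1) can be sketched as follows. The sketch is in Python rather than R and starts from eigenvalues and loadings that a PCA routine such as prcomp would already have produced. The function names are hypothetical, and reading the 80% criterion as "the smallest number of leading PCs whose cumulative share reaches 80%" is an assumption.

```python
def select_components(eigenvalues, threshold=0.80):
    """Return how many leading PCs are needed to reach the variance threshold.

    eigenvalues must be sorted from largest to smallest.
    """
    total = sum(eigenvalues)
    cumulative = 0.0
    for count, value in enumerate(eigenvalues, start=1):
        cumulative += value / total
        if cumulative >= threshold:
            return count
    return len(eigenvalues)


def index_of_loadings(loadings, eigenvalues, std_devs=None):
    """Compute IL[i][j] = u_ij**2 * lambda_i**2 / s_j (Equation 4.1).

    loadings[i][j] is the loading of variable j on PC i. If std_devs is
    omitted, every s_j is taken as 1, as for a scaled correlation matrix.
    """
    if std_devs is None:
        std_devs = [1.0] * len(loadings[0])
    return [
        [u ** 2 * lam ** 2 / s for u, s in zip(row, std_devs)]
        for row, lam in zip(loadings, eigenvalues)
    ]
```

Variables with the largest IL values on a retained PC would then be the candidates to carry forward.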
Finally, the nematode mortality rate attributed to viruses, d, was calculated by dividing ‘Death by virus’ by ‘Nematodes’ at each point in time and expressing the result as a percentage (4.2).

\[ d = \frac{\text{No. Virus-Caused Deaths}}{\text{No. Nematodes}} \times 100\% \tag{4.2} \]
The mortality rate d is distinct from the baseline mortality rate, b, and from the enhanced mortality rate, which is equivalent to the total mortality rate.
4.4.3 Replication Ratio: Modified Reproduction Ratio
We adapted the basic reproduction ratio, R0 (2.4), to the outputs of SCNSim and renamed it the viral replication ratio, Rv (4.3), to distinguish it from R0. Recall the basic reproduction ratio:
\[ R_0 = \frac{\beta S}{b + d + \rho} \tag{2.4} \]
where each of the terms is determined in the context of a population initially at maximum capacity, and birth rates are not considered. Therefore, with the exception of the mortality rate terms b and d, the total population in that setting is relatively unchanging compared to the nematode population in SCNSim. Thus, all the terms that make up Rv are normalized to the total nematode population:
\[ R_v = \frac{\beta \cdot (1 - i)}{b + d + \tilde{\rho}_{95}} \tag{4.3} \]

where β is the dynamic transmissibility (as opposed to β0), i is the prevalence or infected portion of the nematode population, b is the baseline mortality rate sans any viral infection, d is the mortality rate attributed to viral infection, and ρ̃95 is the avirulent proportion of i; the proportion of infected nematodes with high (≥95%) health. b was derived by averaging the 10 iterations of the fraction of dead nematodes to live nematodes in the control data while preserving the time-scale resolution.

[Figure 4.2 plot. Axes: PC 1 (43.03% of variance) and PC 2 (16.80% of variance). Variables plotted: Transmissibility, Virulence, Virus Load, Nematodes, Prevalence, Death by virus, Eggs per Cyst, Cyst, Dead, J1, J2, J3, J4F, J4M, M, F, Temperature, Health.]
Figure 4.2: PCA correlation circle showing PC1 and PC2 for all predictors. Orange-colored variables were selected for further analysis.
Because SCNSim was designed without a fitness parameter in mind, it currently does not record the number of cured virus infections, so we are unable to determine a recovery ratio ρ equivalent to the one used in (2.4). Therefore an approximation, ρ̃95, representing the proportion of nematodes with an avirulent infection, was created by taking the median of the subset of i where SCN Health ≥ 95. In future improvements of SCNSim, either ρ or the number of cured viruses will be included in the data output.
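As a minimal sketch of how Rv could be evaluated at a single time point, the Python functions below apply Equations 4.2 and 4.3 directly. The function and argument names are hypothetical. The sketch also assumes that every term is supplied as a fraction of the total nematode population (so d is used without the ×100% of Equation 4.2) and that ρ̃95 has already been computed elsewhere.

```python
def mortality_rate(virus_deaths, nematodes):
    """Virus-attributed mortality d as a fraction (Equation 4.2 without x100%)."""
    return virus_deaths / nematodes


def replication_ratio(beta, prevalence, b, d, rho95):
    """Viral replication ratio Rv = beta * (1 - i) / (b + d + rho95) (Equation 4.3).

    beta       -- dynamic transmissibility
    prevalence -- infected fraction i of the nematode population
    b          -- baseline mortality rate from the control runs
    d          -- mortality rate attributed to viral infection
    rho95      -- approximate avirulent proportion of i (health >= 95)
    """
    denominator = b + d + rho95
    if denominator <= 0:
        raise ValueError("b + d + rho95 must be positive")
    return beta * (1.0 - prevalence) / denominator
```

A value above 1 indicates that the virus is replicating through the population faster than infections are removed, and a value below 1 indicates the opposite.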
Rv was plotted with respect to the mutation rate, µ, in order to find the critical mutation rates, µcrit, at the error threshold. Each µcrit was determined by linearly interpolating the mutation rate at Rv = 1 using the slope between the two points straddling the threshold. The direction of the slope at each of these critical mutation rates was also noted.
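The interpolation step can be sketched in Python as below. The function name and return format are hypothetical, and the sketch assumes the mutation rates are sorted and distinct. Points lying exactly on the threshold are not treated as straddling it, which is a simplification.

```python
def critical_mutation_rates(mu, rv, threshold=1.0):
    """Find mutation rates where Rv crosses the threshold by linear interpolation.

    mu -- mutation rates in increasing order
    rv -- replication ratio observed at each mutation rate
    Returns a list of (mu_crit, direction) pairs, where direction is
    "rising" if Rv increases through the threshold and "falling" otherwise.
    """
    crossings = []
    points = list(zip(mu, rv))
    for (m0, r0), (m1, r1) in zip(points, points[1:]):
        # A strict sign change means the two points straddle the threshold.
        if (r0 - threshold) * (r1 - threshold) < 0:
            slope = (r1 - r0) / (m1 - m0)
            mu_crit = m0 + (threshold - r0) / slope
            direction = "rising" if slope > 0 else "falling"
            crossings.append((mu_crit, direction))
    return crossings
```

Returning every crossing rather than only the first allows for curves that dip below Rv = 1 and recover at a higher mutation rate.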