In all previous analyses of the dataset, the five socio-demographic covariates were assumed to have a linear relationship with the stillbirth rate. In this section, splines were used to explore whether a non-linear relationship existed.
Once more, a mixed model was considered, containing the fixed socio-demographic covariates and age groups in addition to a random effect for county. As in all preceding analyses using mixed models, the suppressed stillbirth counts were singularly imputed to be 1.
A systematic trial-and-error approach, described in Section 5.1.3, was used to determine the most appropriate knot placements for the spline model. Firstly, each continuous covariate was investigated one at a time within the model, with the other covariates remaining linear, to determine the optimal number and placement of knots for each covariate using B-splines.
For all covariates except country of birth, knots were found that improved the model fit and suggested a piecewise linear effect for that covariate. The most influential knot selections (according to AIC) are displayed in Table 5.3.
Table 5.3: Optimal knot placements for the piecewise splines model for England and Wales
Variable          Num. Interior Knots   Interior Knots   Boundary Knots
Ethnicity         1                     3                0, 82
NS-SeC            2                     36, 83           8.8, 100
Marital status    2                     37, 86           13, 99
Multiple births   3                     1.4, 2.6, 3.5    0.4, 11
Higher order splines (quadratic, cubic and higher polynomials) were then considered, but none improved the model fits.
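To make the piecewise linear construction concrete, the following is a minimal Python sketch of a degree-1 B-spline ("hat") basis, evaluated on the NS-SeC knots from Table 5.3. It is an illustration under stated assumptions, not a description of the software actually used for the analysis: the function name `linear_bspline_basis` is hypothetical, and the basis is assumed to omit the column attached to the lower boundary knot, so that a covariate with k interior knots contributes k + 1 coefficients.

```python
def linear_bspline_basis(x, interior_knots, boundary_knots):
    """Degree-1 B-spline ("hat") basis at x, without the first column.

    Returns one value per interior knot plus one for the upper boundary knot.
    Assumption: the column for the lower boundary knot is dropped, so the
    lower boundary acts as the baseline (all basis values are zero there).
    """
    lower, upper = boundary_knots
    if not lower <= x <= upper:
        raise ValueError("x lies outside the boundary knots")
    knots = [lower, *interior_knots, upper]
    basis = []
    for j in range(1, len(knots)):
        left, peak = knots[j - 1], knots[j]
        if left <= x <= peak:
            # Rising edge of the hat centred on knots[j]
            value = (x - left) / (peak - left)
        elif j + 1 < len(knots) and peak < x <= knots[j + 1]:
            # Falling edge of the hat centred on knots[j]
            value = (knots[j + 1] - x) / (knots[j + 1] - peak)
        else:
            value = 0.0
        basis.append(value)
    return basis


# NS-SeC knots from Table 5.3: interior 36 and 83, boundary 8.8 and 100
nssec_at_knot = linear_bspline_basis(36, [36, 83], (8.8, 100))
nssec_midway = linear_bspline_basis(59.5, [36, 83], (8.8, 100))
```

Under this assumed parameterisation, each coefficient is the log-odds at its knot relative to the lower boundary knot, and the fitted value between two knots is a linear blend of the two neighbouring coefficients.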
Model Selection of Splines
The B-spline functions for each variable were then added to the existing mixed effects model one at a time in a bi-directional stepwise manner, beginning with the covariate whose spline function gave the largest drop in AIC. This covariate was NS-SeC, measuring the percentage of women in each stratum with a lower NS-SeC status. The variable with the second largest spline effect was marital status, followed by the percentage of multiple birth cases and then ethnicity. Country of birth was considered both as a linear covariate and in piecewise segments, but neither improved the fit and so it was removed from the model entirely. This model selection process can be seen in Table 5.4.
Table 5.4: Model selection for piecewise spline functions on covariates for England and Wales
Model                                     AIC     Parameters
Model with linear covariates              10671   12
...+ NS-SeC spline...                     10659   14
...+ marital status spline...             10658   16
...+ multiple births spline...            10654   19
...+ ethnicity spline...                  10653   20
...remove country of birth entirely       10653   19
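The steps in Table 5.4 can be read through the standard definition AIC = 2k − 2 log L, where k is the number of parameters and L the maximised likelihood. The short Python sketch below recomputes, for each step, the change in AIC, the change in parameter count and the implied change in −2 log L. The function name `summarise` is hypothetical, and because the reported AIC values are rounded to integers the derived differences are only approximate.

```python
# (description, AIC, number of parameters), as reported in Table 5.4
steps = [
    ("linear covariates", 10671, 12),
    ("+ NS-SeC spline", 10659, 14),
    ("+ marital status spline", 10658, 16),
    ("+ multiple births spline", 10654, 19),
    ("+ ethnicity spline", 10653, 20),
    ("- country of birth", 10653, 19),
]


def summarise(steps):
    """Return (step, change in AIC, change in k, change in -2 log L) per step."""
    rows = []
    for (_, prev_aic, prev_k), (name, aic, k) in zip(steps, steps[1:]):
        delta_aic = aic - prev_aic
        delta_k = k - prev_k
        # AIC = 2k - 2 log L, so the change in -2 log L is delta_aic - 2 * delta_k
        delta_minus_2_log_lik = delta_aic - 2 * delta_k
        rows.append((name, delta_aic, delta_k, delta_minus_2_log_lik))
    return rows


for row in summarise(steps):
    print(row)
```

Read this way, the NS-SeC spline accounts for most of the improvement (a drop of 12 in AIC for two extra parameters), while the marital status and ethnicity splines each lower the rounded AIC by only 1.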
Optimal Model using Splines
Table 5.5 shows results for the final splines model with its piecewise linear splines (again, using the singularly imputed data with suppressed birth counts fixed to 1).
Table 5.5: Model selected splines model (singularly imputed data) for England and Wales
Random Effects              Variance   SD        Num. Groups
County                      0.0058     0.07601   112

Fixed Effects               OR      95% CI         p-value
Baseline
% non-white: 3%             1.258   (1.02, 1.56)   0.0331
% non-white: 82%            2.125   (1.69, 2.68)   <0.001
% NS-SeC group 2/3: 36%     1.354   (1.20, 1.53)   <0.001
% NS-SeC group 2/3: 83%     1.702   (1.45, 2.01)   <0.001
% NS-SeC group 2/3: 100%    1.490   (1.11, 2.01)   0.0084
% unmarried: 37%            1.147   (1.05, 1.26)   0.0037
% unmarried: 86%            1.305   (1.11, 1.54)   0.0017
% unmarried: 99%            1.619   (1.27, 2.07)   <0.001
% multiple births: 1.4%     1.208   (1.04, 1.40)   0.0117
% multiple births: 2.6%     1.165   (1.00, 1.36)   0.0474
% multiple births: 3.5%     1.258   (1.07, 1.48)   0.0052
% multiple births: 11%      1.185   (0.93, 1.51)   0.1689
Age Group
<20 yrs                     1.063   (0.87, 1.30)   0.5518
20-24 yrs                   0.961   (0.89, 1.04)   0.3288
30-34 yrs                   1.112   (1.04, 1.19)   0.0016
35-39 yrs                   1.349   (1.25, 1.46)   <0.001
40 yrs +                    1.789   (1.63, 1.96)   <0.001

AIC 10653 (20 parameters)
The intercept refers to a hypothetical baseline aggregate group of mothers aged 25-29 whose demographic corresponds to the minimum values for each covariate in the observed data (see Table 4.2 for these minimum values). All of the included socio-demographic variable estimates appear significant with the exception of one spline interval for multiple births.
Interpreting spline estimates
The spline odds ratio estimates can be a little challenging to interpret (see Equation 5.12 in Section 5.1.3 for more detail). Figure 5.4 aids understanding with a plot of the log-odds for the spline function for each covariate. The plots show the optimal knots for each covariate (where the lines connect) and their effect on the stillbirth log-odds as the covariate percentage increases. The observed minimum value for each covariate is treated as the baseline; the log-odds estimates are calculated only for the variable's range in the data.
Taking the baby's ethnicity as an example, the log-odds for a group with 3% non-white babies compared to the baseline minimum of 0% is 0.229 (corresponding to the odds ratio estimate of 1.258 in Table 5.5). Since the effect is linear between 0% and 3%, this suggests that the log-odds of stillbirth increase by 0.076 for each additional percentage point of non-white babies in the group up to the knot at 3%, and so for covariate values x between 0 and 3, the log-odds are 0.076x.
At the next knot, the odds ratio for a group with 82% non-white babies is 2.125, and so the log-odds is 0.754. The linear rate of change between the knots at x = 3 and x = 82 must therefore be (0.754 − 0.229)/(82 − 3) ≈ 0.0066, i.e. the log-odds increase by 0.0066 for each additional percentage point of non-white babies in the group after 3%. For example, a group with 70% non-white babies would have log-odds of 0.229 + 0.0066 × (70 − 3) ≈ 0.671, corresponding to an odds ratio of exp(0.671) ≈ 1.96 for stillbirth compared to an all-white group.
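The interpolation described above can be sketched in a few lines of Python. The knot positions and odds ratios are those reported for ethnicity in Tables 5.3 and 5.5; the function names `log_odds` and `odds_ratio` are hypothetical, and the sketch assumes only that the log-odds are linear between adjacent knots and equal to zero at the baseline of 0%.

```python
import math
from bisect import bisect_right

# Knot positions (% non-white) and odds ratios relative to the baseline,
# from Tables 5.3 and 5.5; the baseline (0%) has odds ratio 1 by definition.
ETHNICITY_KNOTS = [0.0, 3.0, 82.0]
ETHNICITY_ODDS_RATIOS = [1.0, 1.258, 2.125]


def log_odds(x, knots=ETHNICITY_KNOTS, odds_ratios=ETHNICITY_ODDS_RATIOS):
    """Linearly interpolate the log-odds between the two knots around x."""
    if not knots[0] <= x <= knots[-1]:
        raise ValueError("x lies outside the observed range of the covariate")
    # Index i of the segment [knots[i], knots[i + 1]] that contains x
    i = min(bisect_right(knots, x) - 1, len(knots) - 2)
    lo = math.log(odds_ratios[i])
    hi = math.log(odds_ratios[i + 1])
    slope = (hi - lo) / (knots[i + 1] - knots[i])
    return lo + slope * (x - knots[i])


def odds_ratio(x, knots=ETHNICITY_KNOTS, odds_ratios=ETHNICITY_ODDS_RATIOS):
    """Odds ratio at x relative to the baseline (lowest knot)."""
    return math.exp(log_odds(x, knots, odds_ratios))


# Slope of the log-odds on the second segment (3% to 82%)
second_slope = (log_odds(82.0) - log_odds(3.0)) / (82.0 - 3.0)
```

Note that the interpolation is carried out on the log-odds scale and exponentiated only at the end; adding a log-odds slope directly to an odds ratio would mix the two scales.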
Looking at the overall variable effects in Figure 5.4, for ethnicity and marital status the log-odds of stillbirth increase monotonically as the percentage of non-white babies and of unmarried women increases, respectively. For socio-economic position, the log-odds generally increase as the percentage of mothers in NS-SeC Group 2/3 increases, but reverse somewhat when this percentage goes beyond 83%.
Figure 5.4: Effect of covariates' piecewise linear spline functions on the log-odds of stillbirth. Panels: (a) ethnicity, knot at 3%; (b) socio-economic position, knots at 36% and 83%; (c) marital status, knots at 37% and 86%; (d) multiple births, knots at 1.4%, 2.6% and 3.5%.
The effect of multiple births on stillbirth seems to be the most changeable as the percentage of multiple birth events in the group increases. Figure 5.4d shows three turning points between 1.4% and 3.5% at which the log-odds switch between sharply increasing and decreasing. Practically speaking, they do not all seem necessary. A simplified version of this spline was investigated that removed the middle knot at 2.6% entirely, but this resulted in an increased AIC value and therefore a worse-fitting model. We note that almost half of the observed groups have multiple birth percentages that fall between 1.4% and 3.5%, perhaps justifying this seemingly erratic behaviour of the piecewise splines.
There were no knots found in the covariate for country of birth that improved the model in comparison to its linear version.
It was possible that the non-linear relationships discovered between stillbirth and the covariates could be better explained by splitting the data into categories rather than into segments of different linear functions. A version of the optimal model was therefore constructed that treated each covariate as categorical, with the categories divided at the same knot points. This resulted in a model with an AIC value of 10892, much higher than the spline model's 10653. This suggests that the spline model, with its segments of linear functions, captures the non-linear relationships between the variables more appropriately.