Further exploration of the Gaussian process fitting procedure

3.2 Model definition

3.2.4 Further exploration of the Gaussian process fitting procedure

spatial transmissibility surfaces depicted in Fig 3.13 also reveal patterns roughly similar to those of the transmissibility surface in Fig 3.11, with elevated transmissibility in the southeastern US.

Under the first scenario, with spatial length scale of 100 km, the transmissibility surface is more locally variable than the surfaces produced using longer spatial length scales. This is to be expected, since a length scale of 100 km yields a very flexible transmissibility surface.

The transmissibility surface produced under the second scenario, with spatial length scale of 500 km, is less locally variable than the transmissibility surfaces produced using shorter length scales. The overall range of transmissibility values is also smaller, with Exp[ξ^T] ranging from 0.3 to 3.5, rather than from about 0.2 to over 6 for the SE scenarios with shorter length scale. This reduced variability is due to the surface's higher rigidity.

The transmissibility surface produced under the third scenario, with spatial length scale of 200 km and a RQ covariance function, is similar to the transmissibility surface produced using a SE covariance function and the same spatial length scale (Fig 3.11). This suggests that the estimated transmissibility surface is somewhat robust to the choice of covariance function, though it should be noted that both covariance functions belong to the same general class, yielding a surface with infinite mean-square differentiability everywhere (see §3.1.3).
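The effect of the length scale on surface flexibility can be illustrated with a minimal sketch. It assumes the standard unit-variance forms of the squared exponential (SE) and rational quadratic (RQ) covariance functions; the definitions actually used are those of §3.1.3, which is not reproduced here. The function names, the RQ shape parameter alpha, and the 300 km comparison distance are illustrative choices rather than values from the text.

```python
import numpy as np

def se_kernel(r, length_scale):
    """Squared exponential correlation at distance r (same units as length_scale)."""
    return np.exp(-0.5 * (r / length_scale) ** 2)

def rq_kernel(r, length_scale, alpha=1.0):
    """Rational quadratic correlation; alpha is a hypothetical shape parameter."""
    return (1.0 + r ** 2 / (2.0 * alpha * length_scale ** 2)) ** (-alpha)

# Correlation between two locations 300 km apart (illustrative distance)
distance_km = 300.0
for length_scale_km in (100.0, 200.0, 500.0):
    corr = se_kernel(distance_km, length_scale_km)
    print(f"SE, l = {length_scale_km:.0f} km: correlation = {corr:.3f}")

# RQ kernel with the same 200 km length scale and alpha = 1
print(f"RQ, l = 200 km: correlation = {rq_kernel(distance_km, 200.0):.3f}")
```

Under these assumed forms, a short length scale leaves locations 300 km apart nearly uncorrelated, so the surface is free to vary locally, whereas a 500 km length scale ties them together strongly. The RQ form approaches the SE form as alpha grows large, which is consistent with the two kernels producing similar surfaces at the same length scale.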

We proceed using the mean posterior estimates for ξ^T and ξ^S obtained using a SE covariance function with spatial length scale of 200 km and temporal length scale of 8 half-weeks. Fig 3.14 depicts the difference between the actual and expected outbreak onset times by location using the new model, Eq 3.45, with these posterior mean estimates substituted in.

Comparing with Fig 3.7 indicates that including ξ^T and ξ^S resolves many of the systematic discrepancies. Fig 3.15 depicts the expected and true cumulative number of locations infected over time under the new model. The shapes of the curves now match closely. The gap between the curves points to a remaining model mis-specification. In general, there are always a few more true outbreaks than expected, likely due to mid-range jumps of infection that the model cannot reliably predict. There appear to be about 50 such jumps at any given time; adding 50 to the expected cumulative onset curve makes the two curves match almost perfectly, except at the very beginning and very end of the epidemic. The discrepancy may be due to a mis-specification in the shape of the transmission kernel, and suggests that exploring more flexible kernel forms may be warranted.

[Fig 3.12 plot area, three panels. x-axis: Aug to Dec. y-axis: Exp(ξt), with ticks at 0.5, 1, 2, 4 (upper left and bottom) and 0.25, 0.5, 1, 2, 4 (upper right).]

Fig. 3.12 Mean exponentiated temporal transmissibility values, Exp[ξ^T] (black lines), with ±2 standard deviations (grey bands), using a SE covariance function with spatial length scale of l = 100 km and temporal length scale l = 8 half-weeks (upper left), a SE covariance function with spatial length scale l = 500 km and temporal length scale l = 8 half-weeks (upper right), and a rational quadratic covariance function with spatial length scale l = 200 km and temporal length scale l = 8 half-weeks (bottom). The exponentiated transmissibility adjustment Exp[ξ^T] may be interpreted as a multiplicative factor for the transmissibility term β_d (see Eq 3.45) that varies over time. Values above 1 indicate higher-than-average transmissibility, and values lower than 1 indicate lower-than-average transmissibility. All three plots resemble the temporal transmissibility surface fit using a SE kernel with spatial length scale l = 200 km and temporal length scale l = 8 half-weeks (Fig 3.10).

[Fig 3.13 map area, three panels. Colour scale: Exp[ξ^S], with ticks at 0.25, 0.5, 1, 2, 4 (top and bottom) and 0.5, 1, 2 (middle).]

Fig. 3.13 Maps of the mean exponentiated geographic transmissibility values, Exp[ξ^S], using a SE covariance function with spatial length scale l = 100 km (top), l = 500 km (middle), and a rational quadratic covariance function with length scale l = 200 km (bottom). In all three scenarios, the temporal characteristic length scale is 8 half-weeks. The exponentiated transmissibility adjustment Exp[ξ^S] may be interpreted as a multiplicative factor for the transmissibility term β_d (see Eq 3.45) that varies across space. A SE covariance function with characteristic distance of l = 100 km yields a patchy spatial transmissibility surface, though there is still a high concentration of ZIPs in the southeast with elevated transmissibility. A SE covariance function with characteristic distance of l = 500 km yields a surface with relatively less variation than the shorter distance scales. A RQ covariance function with l = 200 km yields a spatial transmissibility surface that closely resembles the one estimated using the SE kernel with the same length scale, depicted in Fig 3.11.

[Fig 3.14 map area. Colour scale: difference in weeks, from -6 to 6.]

Fig. 3.14 Difference in weeks between the observed and expected epidemic onset time in each ZIP under the transmissibility-adjusted model, Eq 3.45. The area of each disc is proportional to the magnitude of the difference. Blue/purple discs correspond to ZIPs where the true epidemic onset time is later than the expected onset time, and green/red discs correspond to ZIPs where the opposite is true. The band of locations in Missouri, Kentucky, and Virginia with later-than-expected onsets in Fig 3.7 is no longer as apparent.

[Fig 3.15 plot area. x-axis: Aug to Nov. y-axis: cumulative number of locations infected, 0 to 800.]

Fig. 3.15 Expected (black) and observed (blue) cumulative number of locations infected over time under the transmissibility-adjusted model, Eq 3.45. The shapes of the two curves match closely. The lag between the expected and observed cumulative number of onsets may be due to mid-range jumps for which the transmission model still does not fully account. If this is true, then there were approximately 50 of these mid-range outbreaks at any given time during the outbreak; adding 50 to the expected cumulative number of outbreaks causes the two curves to match almost perfectly, except at the very beginning and very end of the epidemic.

geographic transmission model Eq 3.37 with known transmissibility β_d, and the transmissibility surfaces can be estimated using the synthetic onsets. In this section, two scenarios are considered. First, the transmissibility β_d is held constant across all locations and for all time, to ensure that the Gaussian process fits do not yield spurious patterns. Second, the transmissibility β_d is increased in the southeastern US (HHS regions 4 and 6) to check that the Gaussian process fits can correctly identify authentic differences in transmissibility. In both scenarios, β_d is held constant across time.

Outbreaks are simulated from the best mechanistic transmission model Eq 3.37 with β_ds, ν, and θ all equal to zero. As for the simulations presented in §3.2.2, θ is held at zero to avoid having to simulate full epidemic curves in each ZIP. Other parameter values are fixed at µ = 0.25, ρ = 62 km, and γ = 7.8, equal to the values used to produce Fig 3.6.

Also following §3.2.2, epidemics are seeded in the four locations with onset in the first week of the true outbreak, and β₀ is fixed at 0, so that no additional seeding occurs. For the constant-transmissibility scenario, β_d is fixed at 0.61 for all locations throughout the epidemic. For the spatially-varying transmissibility scenario, β_d is fixed at 1.64 in the 244 ZIPs in HHS regions 4 and 6, and at 1.64/4 = 0.41 in the 590 ZIPs in the rest of the country. This makes the mean β_d across all locations equal to 0.61, and makes β_d in the southeast four times higher than it is in the rest of the country.
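The text does not state which kind of mean is intended. A quick check with the stated ZIP counts and β_d values suggests that the figure of 0.61 corresponds to the geometric mean (equivalently, the mean taken on the log scale), whereas the arithmetic mean comes out somewhat higher. The sketch below uses only the numbers given above; the variable names are illustrative, and the interpretation is an inference rather than something the source confirms.

```python
import math

# Values stated in the text for the spatially-varying scenario
n_southeast, beta_southeast = 244, 1.64
n_rest, beta_rest = 590, 1.64 / 4

n_total = n_southeast + n_rest

# Arithmetic mean of beta_d across all ZIPs
arithmetic_mean = (n_southeast * beta_southeast + n_rest * beta_rest) / n_total

# Geometric mean, i.e. the exponentiated mean of log(beta_d)
mean_log = (n_southeast * math.log(beta_southeast)
            + n_rest * math.log(beta_rest)) / n_total
geometric_mean = math.exp(mean_log)

print(f"arithmetic mean: {arithmetic_mean:.3f}")
print(f"geometric mean:  {geometric_mean:.3f}")
```

A log-scale mean would be a natural choice here, since the Gaussian process adjustments enter the model through Exp[ξ] and therefore act multiplicatively on β_d.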

Three epidemics are simulated under each scenario. For each simulation, posterior Gaussian process estimates of ξ^T and ξ^S are generated using the procedures described in §3.2.3, using a SE covariance function with spatial length scale l = 200 km and temporal length scale l = 8 half-weeks. Figs 3.16-3.17 depict the mean posterior ξ^T and ξ^S for the constant β_d scenario, and Figs 3.18-3.19 depict the mean ξ^T and ξ^S for the scenario with elevated β_d in the southeast. The fits for the constant β_d scenario in Figs 3.16-3.17 show no substantial variation in transmissibility over time or space, as expected.

For the second scenario with elevated transmissibility in the southeast, the temporal transmissibility surfaces ξ^T are significantly elevated at the start of the outbreak, and decrease near the fourth week of the simulation. This is because three of the four outbreak seeds are in the southeastern US (see Fig 3.6), so the early epidemic is associated with high transmissibility in all three cases. The spatial transmissibility surfaces in Fig 3.19 also reveal elevated transmissibility in the southeastern US. Interestingly, the transmissibility surfaces here have smaller ranges than the transmissibility surfaces obtained using the true onset times (see Figs 3.10 and 3.11).

Despite the true transmissibility in the southeast being four times higher than in the rest of the country, the posterior mean estimates for Exp[ξ^S] in that region are just over 1.25. This underestimate of the true elevation in transmissibility may be due in part to rigidity in the posterior process imposed by the correlation structure, and may also be exacerbated by the smooth process having difficulty fitting to the sharp, sudden increase in transmissibility in the southeast. If it is true that the Gaussian process fits generally underestimate the overall variation in transmissibility, then the true transmissibility elevation in the southeastern US during the autumn 2009 A/H1N1pdm pandemic wave may have been very pronounced, in order to produce posterior mean ξ^S estimates well over 4.
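The tendency of a smooth Gaussian process to damp a sharp step can be illustrated with a toy one-dimensional example. This is not the fitting procedure of §3.2.3, which estimates ξ^S through the transmission model likelihood; it is a sketch under simplifying assumptions, in which the log-transmissibility adjustment is treated as if it were observed directly with Gaussian noise along a hypothetical 3000 km transect. The noise variance, prior variance, grid, and position of the step are all invented for illustration; only the 200 km length scale and the β_d values come from the text.

```python
import numpy as np

def se_cov(x1, x2, length_scale, variance=1.0):
    """Squared exponential covariance matrix between two sets of 1-D points."""
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

# Hypothetical transect: one point every 50 km from 0 to 3000 km
x = np.linspace(0.0, 3000.0, 61)

# True log adjustment relative to 0.61: a step at 2000 km (assumed position)
true_log = np.where(x >= 2000.0, np.log(1.64 / 0.61), np.log(0.41 / 0.61))

# Zero-mean GP prior with SE covariance; noise_var is an assumed value
noise_var = 0.5
K = se_cov(x, x, length_scale=200.0)
post_mean = K @ np.linalg.solve(K + noise_var * np.eye(x.size), true_log)

true_contrast = np.exp(true_log.max() - true_log.min())
fitted_contrast = np.exp(post_mean.max() - post_mean.min())
print(f"true high/low ratio:   {true_contrast:.2f}")
print(f"fitted high/low ratio: {fitted_contrast:.2f}")
print(f"largest true jump between neighbours:   {np.abs(np.diff(true_log)).max():.2f}")
print(f"largest fitted jump between neighbours: {np.abs(np.diff(post_mean)).max():.2f}")
```

In this toy setting the posterior mean is pulled toward the zero prior mean and the edge of the step is spread over neighbouring locations. The degree of damping depends on the assumed noise level, so the sketch shows only the direction of the effect, not the magnitude seen in the simulations.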

[Fig 3.16 plot area, three panels. x-axis: Week, 0 to 16. y-axis: Exp(ξt), with ticks at 0.75, 1, 1.25, 1.5.]

Fig. 3.16 Mean exponentiated temporal transmissibility values, Exp[ξ^T] (black lines), with ±2 standard deviations (grey bands), for three epidemic simulations using the transmission model Eq 3.37 with parameter values β₀ = 0, β_d = 0.61, µ = 0.25, ρ = 62 km, γ = 7.8, and θ = 0. The outbreaks are seeded in the four locations with earliest onset during the true autumn wave of the 2009 A/H1N1pdm outbreak in the US (see Fig 3.6). The uncertainty bands all overlap with the horizontal line at Exp[ξ^T] = 1, correctly identifying no substantial variation in transmissibility over time.

In document Geographic and demographic transmission patterns of the 2009 A/H1N1 influenza pandemic in the United States (pages 106-111)