Given a mathematical model that describes outbreak onset times in distinct metapopulation patches as an additive force of infection from each previously-infected patch, a transmission
network may be constructed that describes all possible routes of transmission. Reversing this transmission network in a particular way yields a Markov chain that can be used to trace the epidemic to its most likely sites of introduction. This section describes this strategy in detail.
4.2.1 Characterising the forward transmission network
Suppose an epidemic occurs onnmetapopulation patches, or ‘locations’. Each location’s outbreak onset time is observed. Without loss of generality, let the locations be assigned indicesi=1, . . . ,nin order of their outbreak onset time, from earliest to latest. Consider a functionλithat characterises the force of infection on locationiat its outbreak onset time as the sum of the forces exerted by all previously-infected locations, plus some background risk of seeding from outside the metapopulation. That is,
λi=β0+
i−1
∑
j=1λi,j (4.1)
whereβ0 is the force from external seeding, andλi,j is the force exerted on locationiby location j. The first outbreak (i=1) could only have been triggered by external seeding, so define∑0j=1λ1,j=0.
The partial forces of infectionλi,jcan be visualized as a transmission network, as depicted in the left-hand diagram in Fig 4.2. Locations are represented by nodes (circles), connected with arrows that indicate possible transmission pathways. In addition, n ‘seeding states’
(clouds) are introduced, each of which exerts a force ofβ0on a single location. Summing the forces of all of the arrows going into a nodeiyields Eq 4.1.
4.2.2 Reversing the infection process
To identify epidemic hubs, transmission chains are traced probabilistically back to likely points of introduction. This is done by reversing the direction of the transmission network and noting that, with the proper normalisations, the resulting ‘reverse transmission network’
represents a Markov chain for which the probability of transitioning from stateito state jis equivalent to the probability that locationiwas infected by ‘parent’ location j. The aim is to trace each outbreak to a most probable parent, then to a most probable parent’s parent, and so on, until the outbreak is ultimately traced back to a first ancestor, where the infection was introduced into the system – that is, a hub.
Fig. 4.2 Forward transmission network (left) and reverse transmission network (right) for an idealised outbreak taking place on three locations. Circles represent real locations, and clouds represent ‘seeding states’ – conceptual reservoirs of infection that contribute infective force from outside the population. In reality, it is more natural to think of a single external reservoir of infection that contributes a constant forceβ0on all locations; however, artificially separating the reservoirs is mathematically convenient. In this example, the outbreak begins in location 1, then infects location 2, and finally infects location 3, in three subsequent time steps. In the left-hand diagram, arrows denote possible transmission paths, and arrow labels give the partial forces of infection. In the right-hand diagram, arrows point towards possible
‘parent’ outbreaks, and arrow labels give the probability that the location at the tip of the arrow directly sparked the outbreak in the location at the tail of the arrow. Definitions of the arrow weights are given in §4.2.2. In this simplified setting, location 1 would be a hub, since the outbreaks in locations 2 and 3 can be traced back to the seeding state attached to location 1 with high probability.
Mathematically, this back-stepping procedure is done by taking powers of the reverse transmission network’s transition matrix. Thei,jth entry of the transition matrix gives the probability that the outbreak in location i was immediately triggered by location j (i.e.
that location jwas a ‘parent’ to the outbreak in locationi). Thei,jth entry of the squared transition matrix gives the probability that the outbreak in locationiwas triggered by location jvia any one intervening location (i.e., that location jwas a ‘grandparent’ to the outbreak in locationi). As these powers approach infinity, each outbreak is traced back to a most likely
‘seeding state’. In the limit, thei,jth entry of the pth power of the transition matrix gives the probability that the outbreak in locationiwas initially triggered by a seeding event in location j.
To illustrate the procedure, refer again to the idealised outbreak depicted in Fig 4.2.
Reversing the arrows in the left-hand diagram gives the reverse transmission network (right- hand diagram), where each arrow now points towards a possible contributor of infection. The transition probabilities are denoted
τi j =P(transmission from jtoi) = λi j λi
and
σi=P(external seeding ini) =β0 λi.
Theτi j represent the probability that the outbreak in locationicame from parent location j, and theσirepresent the probability that the outbreak in locationiwas due to a seeding event.
Defineτττn×nto be the matrix whosei,jth entry isτi j. Note thatτi j=0 for all j≥i, soτττ is strictly lower triangular. Also defineσσσn×nto be the matrix withσ1,σ2, . . . ,σnalong the diagonal and with zeros elsewhere. The transition matrixM2n×2nthat describes the reverse transmission network can be written using these matrices:
M= I 0
σ σ σ τττ
! .
The first n elements of the state space of M correspond to the seeding states (clouds in Fig 4.2), and the remainingnelements correspond to the real locations. Entry Mi j is the probability that ‘parent’ location jdirectly sparked locationi’s outbreak (or, equivalently, the probability that the reverse transmission process transitions from stateito state j). The identity matrix in the upper left block indicates that the seeding states are ultimate sources of infection; they can only transition to themselves. Similarly, the 000 matrix in the upper right
block indicates that transmission cannot occur from a real location to a seeding state. Theσσσ matrix in the lower left block captures the probability of a seeding event in each real location.
Theτττ matrix in the lower right captures the transmission probabilities between real locations.
Note that, as required, the row sums ofMall equal 1.
The pth power ofMcontains the probabilities of transitioning between any two nodes viap−1 intermediate steps. Finding the ultimate ancestor of each location’s outbreak, then, requires calculating limp→∞Mp≡M∞. Sinceτττ is strictly lower triangular (has zeros along its diagonal),τττm=000 form≥n+1. Thus,Mm=M∞form≥n+1, yielding
M∞= I 000
(I+τττ+τττ2+. . .)σσσ 000
! .
Element(M∞)i,j gives the probability that state jwas the ultimate source of the outbreak in locationi. The identity matrix in the upper left block indicates that seeding states are sources unto themselves; this is so by definition. Each real location’s ultimate source is a seeding state, since the real→real transitions in the lower-right block of the matrix all go to zero. The lower-left block ofM∞contains the values of greatest interest. Denote this block
Pn×n≡(I+τττ+τττ2+. . .)σσσ. The entriesPi,j are the probabilities that external seeding in
location jultimately led to an outbreak in locationi. The row sums∑jPi j equal 1 for all j. The column sums ofP, denotedCj=∑iPi j, can be interpreted as the expected number of outbreaks triggered by seeding in location j. Hubs are locations with highσ and highC values – that is, locations where external seeding probably triggered an outbreak, which then led to significant onward spread.
This result may also be derived in a more compact, though perhaps less intuitively satisfying, way. We seek a matrixPsuch that entryPi,j is the probability that seeding in location j caused the outbreak in locationi, via any number of intermediate steps. It is helpful to introduce notation for the following events:
j˚⇒i: “Seeding in location jultimately caused the outbreak in locationi”
j→i: “The outbreak in location jimmediately caused the outbreak in locationi”
So,Pi j =Pr(j˚⇒i). Note that Pr(j˚⇒ j) =σj. Fori̸= j, the law of total probability states that
Pr(j˚⇒i) =
∑
k
Pr(j˚⇒k)Pr(k→i).
The probabilities Pr(k→i)are thei,kth entries of the transition matrixτ. Taken together, this implies that
P=σσσ+τττP. (4.2)
Solving forPgives
P= (I−τττ)−1σσσ = (I+τττ+τττ2+. . .)σσσ. (4.3)
Sinceτττis strictly lower triangular,τττp=000 for p>n, and so the infinite sum is guaranteed to converge. Eq 4.3 matches the earlier result forP.