CHAPTER 2. SYSTEM CHARACTERISTICS
2.9 Definition of the system's use cases
2.9.3 Textual description of the System Use Cases (CU)
We start by considering the basic Cormack–Jolly–Seber (CJS) model, where both the survival probabilities and recapture probabilities vary with time.
Let φj denote the probability of survival between times j and j + 1, and let pj denote the probability of recapture at time j. For a capture history with K = 7, we would have the following setup:
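$$1 \xrightarrow{\phi_1} \underset{p_2}{2} \xrightarrow{\phi_2} \underset{p_3}{3} \xrightarrow{\phi_3} \underset{p_4}{4} \xrightarrow{\phi_4} \underset{p_5}{5} \xrightarrow{\phi_5} \underset{p_6}{6} \xrightarrow{\phi_6} \underset{p_7}{7}.$$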
Numbers 1 to 7 denote the sampling occasions. In this example, we can think of them as being j = 1, 2, ..., 7. The survival probabilities $\{\phi_j\}_{j=1}^{6}$ are labelled above the arrows, as each φj is defined to be the survival probability between times j and j + 1. Beneath each j, we have pj, which is defined as the probability of recapture at time j.
Using the capture history from Equation (3.1), we have
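$$0 \longrightarrow 1 \xrightarrow{\phi_2} \underset{1-p_3}{0} \xrightarrow{\phi_3} \underset{p_4}{1} \xrightarrow{\phi_4} \underset{p_5}{1} \xrightarrow{\phi_5} \underset{p_6}{1} \longrightarrow 0.$$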
3.2. CAPTURE–RECAPTURE
We replace j with the capture history at time j and change the labels of φj and pj accordingly. For example, the individual was not captured at time j = 3, and so we change the label beneath position 3 from p3 to 1 − p3. The individual was last seen at time j = 6, since at position 7 we have a record of 0. So we no longer have φ6 between positions 6 and 7. It follows that the probability of this capture history, Pr(CHi), is given by
$$\phi_2(1-p_3)\,\phi_3 p_4\,\phi_4 p_5\,\phi_5 p_6\,(1-\phi_6) + \phi_2(1-p_3)\,\phi_3 p_4\,\phi_4 p_5\,\phi_5 p_6\,\phi_6(1-p_7). \tag{3.2}$$
As the fate of the animal is not known after occasion 6, the first term of the above probability represents the probability that the animal did not survive after time 6, and the second term represents the probability that the animal had survived after time 6, but it was not recaptured at time 7.
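Because both terms share the same factor for occasions 2 to 6, Equation (3.2) can also be written in factorised form. The rearrangement below is added for illustration only and uses no symbols beyond those in Equation (3.2):

```latex
\begin{align*}
\Pr(CH_i) &= \phi_2(1-p_3)\,\phi_3 p_4\,\phi_4 p_5\,\phi_5 p_6\,
             \bigl[(1-\phi_6) + \phi_6(1-p_7)\bigr] \\
          &= \phi_2(1-p_3)\,\phi_3 p_4\,\phi_4 p_5\,\phi_5 p_6\,(1-\phi_6 p_7).
\end{align*}
```

The bracketed factor, 1 − φ6 p7, is the probability that the animal is not seen at occasion 7, whichever of the two fates applies.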
The probability of a capture history from the CJS model can be written in the following form (Pledger et al., 2003),
$$\Pr(CH_i) = \sum_{d=\ell_i}^{K} \left\{ \left( \prod_{j=f_i}^{d-1} \phi_j \right) (1-\phi_d) \left( \prod_{j=f_i+1}^{d} p_j^{x_{ij}} (1-p_j)^{1-x_{ij}} \right) \right\}. \tag{3.3}$$
Note. We define the empty products $\prod_{j=f_i}^{d-1}$ and $\prod_{j=f_i+1}^{d}$ to be 1 when $f_i > d-1$ and $f_i + 1 > d$ respectively. We also set φK = 0, since we do not have any information on survival after the study between times K and K + 1, and p1 = 1, as recapture can only start from time 2. There are K − 1 survival probabilities and K − 1 recapture probabilities. The CJS model has a total of 2K − 2 parameters.
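Equation (3.3) can be evaluated numerically with a few lines of Python. The sketch below is illustrative only: the function name `cjs_history_prob`, the dictionary layout and the parameter values are hypothetical. It also assumes that f_i and ℓ_i are the first and last occasions with x_ij = 1, which is our reading of notation defined outside this passage.

```python
from math import prod


def cjs_history_prob(history, phi, p):
    """Evaluate Pr(CH_i) from Equation (3.3), conditional on first capture.

    history : string of '0'/'1' characters, one per occasion 1..K
    phi, p  : dicts keyed by occasion 1..K, with phi[K] = 0 and p[1] = 1
    """
    x = [int(ch) for ch in history]
    K = len(x)
    caught = [j for j in range(1, K + 1) if x[j - 1] == 1]
    if not caught:
        raise ValueError("the CJS model conditions on at least one capture")
    f, last = caught[0], caught[-1]  # assumed meaning of f_i and l_i

    total = 0.0
    for d in range(last, K + 1):  # d is the summation index of Equation (3.3)
        # prod() of an empty iterable is 1, matching the empty-product convention
        survive = prod(phi[j] for j in range(f, d))
        detect = prod(p[j] if x[j - 1] else 1 - p[j] for j in range(f + 1, d + 1))
        total += survive * (1 - phi[d]) * detect
    return total


# Hypothetical parameter values for K = 7 (phi[7] = 0 and p[1] = 1 by convention).
phi = {1: 0.9, 2: 0.8, 3: 0.7, 4: 0.6, 5: 0.5, 6: 0.4, 7: 0.0}
p = {1: 1.0, 2: 0.3, 3: 0.35, 4: 0.4, 5: 0.45, 6: 0.5, 7: 0.55}
prob = cjs_history_prob("0101110", phi, p)
```

For the history 0101110 the loop runs over d = 6 and d = 7, which reproduces the two terms of Equation (3.2).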
Example 3.2.2 (CJS Model). Consider a study with length K = 3. For completeness, we consider all $2^3 - 1 = 7$ observable capture histories. Using Equation (3.3), we then obtain the following probabilities,
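As a worked sketch, substituting K = 3, φ3 = 0 and p1 = 1 into Equation (3.3) and expanding term by term should give the seven expressions below. This again assumes that f_i and ℓ_i denote the first and last capture occasions:

```latex
\begin{align*}
\Pr(001) &= 1, \\
\Pr(010) &= (1-\phi_2) + \phi_2(1-p_3), \\
\Pr(011) &= \phi_2 p_3, \\
\Pr(100) &= (1-\phi_1) + \phi_1(1-p_2)(1-\phi_2) + \phi_1(1-p_2)\phi_2(1-p_3), \\
\Pr(101) &= \phi_1(1-p_2)\phi_2 p_3, \\
\Pr(110) &= \phi_1 p_2(1-\phi_2) + \phi_1 p_2 \phi_2(1-p_3), \\
\Pr(111) &= \phi_1 p_2 \phi_2 p_3.
\end{align*}
```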
We do not consider the probability Pr(000) in the CJS model as we do not observe it. The CJS model is constructed to be conditional on the first capture.
Among the 7 probabilities above, we observe that
$$\Pr(001) = 1, \tag{3.5a}$$
$$\Pr(010) + \Pr(011) = 1, \tag{3.5b}$$
$$\Pr(100) + \Pr(101) + \Pr(110) + \Pr(111) = 1. \tag{3.5c}$$
It follows that there are only four independent capture histories (with associated probabilities). This means that we only have four pieces of usable information from K = 3 years of data. In practice, if there are missing capture histories, we could have less usable information.
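The three sum-to-one constraints can be checked numerically. The sketch below hard-codes the K = 3 probabilities obtained by expanding Equation (3.3) with φ3 = 0 and p1 = 1. The function name `cjs_k3_probs` and the parameter values are hypothetical, chosen only for illustration.

```python
def cjs_k3_probs(phi1, phi2, p2, p3):
    """Probabilities of the seven observable histories for K = 3."""
    # Probability of not being seen at occasion 3, given alive at occasion 2.
    not_seen_at_3 = (1 - phi2) + phi2 * (1 - p3)
    return {
        "001": 1.0,
        "010": not_seen_at_3,
        "011": phi2 * p3,
        "100": (1 - phi1) + phi1 * (1 - p2) * not_seen_at_3,
        "101": phi1 * (1 - p2) * phi2 * p3,
        "110": phi1 * p2 * not_seen_at_3,
        "111": phi1 * p2 * phi2 * p3,
    }


# Hypothetical parameter values.
probs = cjs_k3_probs(phi1=0.7, phi2=0.6, p2=0.4, p3=0.5)

# Sum the probabilities within each year of first capture.
totals = {}
for history, value in probs.items():
    first = history.index("1") + 1
    totals[first] = totals.get(first, 0.0) + value
```

Each entry of `totals` should equal 1 up to rounding error, so three of the seven probabilities are determined by the others, leaving four independent ones.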
Figure 3.1 shows a tree diagram of the CJS capture histories. At sampling occasion j, j = 1, ..., K, there are $2^{j-1}$ terms that correspond to animals first caught in year 1 (more generally, $2^{j-f}$ terms for animals first caught in year f ≤ j), and the probabilities of the capture histories that share a year of first capture sum to 1. In general, for a K-year study, we have $2^K - 1$ observable capture histories, but only $2^K - K - 1$ independent capture histories can be used to estimate the parameters. This can be interpreted as there being $2^K$ ways of arranging 0s and 1s, less K constraints and less 1 invalid arrangement (the one corresponding to a string of 0s). In other words, among the $2^K - 1$ observable histories, there are K constraints, one for each year of first capture. Hence, there are only $2^K - K - 1$ independent capture histories.
[Figure 3.1: Illustration of capture histories for CJS models of capture–recapture data. Tree diagram with columns K = 1, K = 2, K = 3 and K = 4. Each history branches into two at the next occasion (1 into 10 and 11, 10 into 100 and 101, and so on), and a new first-capture history (01, 001, 0001) enters at each occasion.]
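The counting argument can be checked by brute-force enumeration. The following sketch lists every 0/1 string of length K, drops the all-zero string, and groups the rest by year of first capture. The helper name `count_histories` is hypothetical.

```python
from collections import Counter
from itertools import product


def count_histories(K):
    """Return (observable, histories per first-capture year, independent)."""
    histories = ["".join(bits) for bits in product("01", repeat=K)]
    observable = [h for h in histories if "1" in h]  # drop the all-zero string
    by_first = Counter(h.index("1") + 1 for h in observable)
    # One sum-to-one constraint per year of first capture.
    independent = len(observable) - len(by_first)
    return len(observable), dict(by_first), independent


observable, by_first, independent = count_histories(3)
```

For K = 3 this gives 7 observable histories, split 4, 2 and 1 across first-capture years 1, 2 and 3, and 4 independent histories, in agreement with $2^K - K - 1$.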
The CJS model assumes that all animals have the same {φj} and {pj}. We can relax this assumption by introducing different groups of animals. Within each group, the animals share the same survival and recapture probabilities.
Pledger (2000) noted that it is not necessary to assume that the animals actually belong to different groups. Often the groups are essentially an artefact to detect heterogeneity in the data if it is present.
To describe heterogeneity of both capture and survival, it is assumed that there are C classes with probabilities {wc} of an animal being in class c. Each animal in class c has survival probability {φjc} and capture probability {pjc} for c = 1, . . . , C and j = 1, . . . , K. The mixture version of Equation (3.3) was derived in Pledger et al. (2003) to give the following probability of a capture history
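$$\Pr(CH_i) = \sum_{d=\ell_i}^{K} \sum_{c=1}^{C} \left\{ w_c \left( \prod_{j=f_i}^{d-1} \phi_{jc} \right) (1-\phi_{dc}) \left( \prod_{j=f_i+1}^{d} p_{jc}^{x_{ij}} (1-p_{jc})^{1-x_{ij}} \right) \right\}, \tag{3.6}$$
where $\sum_{c=1}^{C} w_c = 1$ is a proper constraint. For example, we choose to set $w_C = 1 - \sum_{c=1}^{C-1} w_c$.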
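Since the sums over d and c in Equation (3.6) can be interchanged, the mixture probability is a weighted average of class-specific CJS probabilities. The sketch below computes it that way. The names `class_prob` and `mixture_history_prob`, the data layout (one dictionary of parameters per class) and the numerical values are all hypothetical. As before, f_i and ℓ_i are assumed to be the first and last capture occasions.

```python
from math import prod


def class_prob(x, phi_c, p_c):
    """Inner sum over d in Equation (3.6) for a single class c."""
    K = len(x)
    caught = [j for j in range(1, K + 1) if x[j - 1]]
    f, last = caught[0], caught[-1]  # assumed meaning of f_i and l_i
    return sum(
        prod(phi_c[j] for j in range(f, d))
        * (1 - phi_c[d])
        * prod(p_c[j] if x[j - 1] else 1 - p_c[j] for j in range(f + 1, d + 1))
        for d in range(last, K + 1)
    )


def mixture_history_prob(history, w, phi, p):
    """Pr(CH_i) under C classes: weights w[c], parameter dicts phi[c], p[c]."""
    if abs(sum(w) - 1.0) > 1e-9:
        raise ValueError("class weights must sum to 1")
    x = [int(ch) for ch in history]
    return sum(w_c * class_prob(x, phi_c, p_c) for w_c, phi_c, p_c in zip(w, phi, p))


# Hypothetical values for K = 3, C = 2 (phi_Kc = 0 and p_1c = 1 by convention).
w = [0.3, 0.7]
phi = [{1: 0.9, 2: 0.8, 3: 0.0}, {1: 0.5, 2: 0.4, 3: 0.0}]
p = [{1: 1.0, 2: 0.6, 3: 0.7}, {1: 1.0, 2: 0.2, 3: 0.3}]
pr_011 = mixture_history_prob("011", w, phi, p)
```

With C = 1 and w = [1.0] the function reduces to the single-class probability of Equation (3.3).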
When heterogeneity is present, we model {φjc} and {pjc} through different link functions g(φjc) (and/or g(pjc)), using the notation defined in Notation 3.2.1.
Notation 3.2.1. For the survival probabilities {φjc}, we use µφ as baseline from group 1, τφj as a time component and ηφc as a heterogeneous component.
Since we have µφ describing the baseline from group 1, we set up the constraints τφ1 = 0 and ηφ1 = 0, so that each τφj is the difference between the time component at j and the baseline, and each ηφc is the difference between the heterogeneous component of group c and the baseline.
Similarly for the capture probabilities {pjc}, we use µp as baseline from group 1, τpj as a time component and ηpc as a heterogeneous component, with constraints τp2 = 0 (since the recapture probability starts from time j = 2) and ηp1 = 0. By convention, we set φKc = 0 and p1c = 1 for all c. That is, the survival probability after the length of study K is 0 and the capture probability at time 1 is 1 for all groups.
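For K = 4, C = 3, for example, we can model φjc using a linear link function,
$$\phi_{jc} = \mu_\phi + \tau_{\phi j} + \eta_{\phi c}, \tag{3.7a}$$
$$[\phi_{jc}] = \begin{bmatrix} \mu_\phi & \mu_\phi + \eta_{\phi 2} & \mu_\phi + \eta_{\phi 3} \\ \mu_\phi + \tau_{\phi 2} & \mu_\phi + \tau_{\phi 2} + \eta_{\phi 2} & \mu_\phi + \tau_{\phi 2} + \eta_{\phi 3} \\ \mu_\phi + \tau_{\phi 3} & \mu_\phi + \tau_{\phi 3} + \eta_{\phi 2} & \mu_\phi + \tau_{\phi 3} + \eta_{\phi 3} \\ 0 & 0 & 0 \end{bmatrix}, \tag{3.7b}$$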
or using a logistic link function,
$$\log\left(\frac{\phi_{jc}}{1-\phi_{jc}}\right) = \mu_\phi + \tau_{\phi j} + \eta_{\phi c}.$$
In the matrix [φjc], each row describes the time effect and each column describes the heterogeneous effect.
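The matrix [φjc] can be assembled from the additive components in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `phi_matrix`, the list-of-lists layout and the parameter values are hypothetical, and the logistic branch applies the standard inverse logit to µφ + τφj + ηφc.

```python
from math import exp


def phi_matrix(mu, tau, eta, link="linear"):
    """Build the K x C matrix [phi_jc] for the additive model.

    tau : time effects for j = 1..K-1, with tau[0] = 0
    eta : class effects for c = 1..C, with eta[0] = 0
    """
    if link == "linear":
        inverse = lambda v: v
    elif link == "logistic":
        inverse = lambda v: 1.0 / (1.0 + exp(-v))
    else:
        raise ValueError("link must be 'linear' or 'logistic'")
    # Rows index time j, columns index class c.
    rows = [[inverse(mu + t + e) for e in eta] for t in tau]
    rows.append([0.0] * len(eta))  # phi_Kc = 0 by convention
    return rows


# Hypothetical values for K = 4, C = 3.
tau = [0.0, 0.1, -0.1]
eta = [0.0, 0.2, -0.2]
linear = phi_matrix(0.5, tau, eta, link="linear")
logistic = phi_matrix(0.5, tau, eta, link="logistic")
```

Under the linear link the entries are only valid probabilities if the parameter values keep them in [0, 1], whereas the logistic link maps any real-valued sum into (0, 1).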
It is also possible to include interactions between the time component and the heterogeneous component in the model by setting
$$\phi_{jc} = \mu_\phi + \tau_{\phi j} + \eta_{\phi c} + (\tau\eta)_{\phi jc}, \tag{3.9}$$
where the matrix (τ η)φ takes the following form
(τ η)φ =
Extra constraints are needed on (τη)φ, such as requiring each row and column to sum to zero. For simplicity, we set the first column and last row of (τη)φ to zero as constraints. Note that using different constraints can result in different point estimates (for example, each entry in the matrix [φjc]). Models with interactions are often more complicated than they should be and have too many parameters for successful model fitting (Pledger et al., 2010). We will discuss this feature at more length in Section 6.4. A similar setup can be used for pjc.
Often we model probabilities using the logistic link function, so that the point estimates are between 0 and 1. When different link functions are used, we have different point estimates of the parameters. In particular, when using the method of maximum–likelihood estimation, we will get different results from using different link functions. The effect of using different link functions will be discussed in more detail in Section 6.3.
Note. Depending on the values taken by K and C, models can have different numbers of parameters. Table 3.2 summarises the numbers of model parameters of the 25 models given in Pledger et al. (2003). We use ‘·’ for a constant parameter, ‘t’ for a time varying parameter, ‘hC’ for a heterogeneous parameter with C classes and ‘×’ for a parameter with interactions.
The square brackets indicate that both parameters are heterogeneous but they share the same {wc}; see Pledger et al. (2003).
The authors give the total number of estimable parameters in the last column of Table 1 in Pledger et al. (2003) as the column ‘Total’. In our table, we present the column ‘Total’ as the total number of parameters in the model. We will discuss the problem of ‘estimable parameters’ in Chapter 4 when considering parameter redundancy.