Capture-recapture (dCJS)
Cormack-Jolly-Seber models give the probability of a capture history for each of many individuals, conditional on first capture, based on parameters for survival and detection probability. nimbleEcology
provides a distribution for individual capture histories, with variants for time-independent and time-dependent survival and/or detection probability. Of course, the survival and detection parameters for the CJS probability may themselves depend on other parameters and/or random effects. The rest of this summary will focus on one individual’s capture history.
-
Define \(\phi_t\) as survival from time \(t-1\) to \(t\) and \(\mathbf{\phi} = (\phi_1, \ldots, \phi_T)\), where \(T\) is the length of the series. We use “time” and “sampling occasion” as synonyms in this context, so \(T\) is the number of sampling occasions. (Be careful with indexing. Sometimes you might see \(\phi_t\) defined as survival from time \(t\) to \(t+1\).) Define \(p_t\) as detection probability at time \(t\) and \(\mathbf{p} = (p_1, \ldots, p_T)\). Define the capture history as \(\mathbf{y} = y_{1:T} = (y_1, \ldots, y_T)\), where each \(y_t\) is 0 or 1. The notation \(y_{i:j}\) means the sequence of observations from time \(i\) to time \(j\). The first observation, \(y_0 = 1\), is not included in \(\mathbf{y}\) because it is known to be 1 since this model conditions on first capture.
+
Define \(\phi_t\) as survival from time \(t\) to \(t+1\) and \(\mathbf{\phi} = (\phi_1, \ldots, \phi_T)\), where \(T\) is the length of the series. We use “time” and “sampling occasion” as synonyms in this context, so \(T\) is the number of sampling occasions. (Be careful with indexing. Sometimes you might see \(\phi_t\) defined as survival from time \(t-1\) to \(t\).) Define \(p_t\) as detection probability at time \(t\) and \(\mathbf{p} = (p_1, \ldots, p_T)\). Define the capture history as \(\mathbf{y} = y_{1:T} = (y_1, \ldots, y_T)\), where each \(y_t\) is 0 or 1. The notation \(y_{i:j}\) means the sequence of observations from time \(i\) to time \(j\). The first observation, \(y_0 = 1\), is included in \(\mathbf{y}\).
There are multiple ways to write the CJS probability. We will do so in a state-space format because that leads to the more general DHMM case next. The probability of observations given parameters, \(P(\mathbf{y} | \mathbf{\phi}, \mathbf{p})\), is factored as: \[
P(\mathbf{y} | \mathbf{\phi}, \mathbf{p}) = \prod_{t = 1}^T P(y_{t} | y_{1:t-1}, \mathbf{\phi}, \mathbf{p})
\]
Each factor \(P(y_{t} | y_{1:t-1}, \mathbf{\phi}, \mathbf{p})\) is calculated as: \[
P(y_{t} | y_{1:t-1}, \mathbf{\phi}, \mathbf{p}) = I(y_t = 1) (A_t p_t) + I(y_t = 0) (A_t (1-p_t) + (1-A_t))
\] The indicator function \(I(y_t = 1)\) is 1 if it \(y_t\) is 1, 0 otherwise, and vice versa for \(I(y_t = 0)\). Here \(A_t\) is the probability that the individual is alive at time \(t\) given \(y_{1:t-1}\), the data up to the previous time. This is calculated as: \[
-A_t = G_{t-1} \phi_t
+A_t = G_{t-1} \phi_{t-1}
\] where \(G_{t}\) is the probability that the individual is alive at time \(t\) given \(y_{1:t}\), the data up to the current time. This is calculated as: \[
G_{t} = I(y_t = 1) 1 + I(y_t = 0) \frac{A_t (1-p_t)}{A_t (1-p_t) + (1-A_t)}
\] The sequential calculation is initialized with \(G_0 = 1\). For time step \(t\), we calculate \(A_t\), then \(P(y_t | y_{1:t-1}, \mathbf{\phi}, \mathbf{p})\), then \(G_t\), leaving us ready for time step \(t+1\). This is a trivial hidden Markov model where the latent state, alive or dead, is not written explicitly.
@@ -556,7 +556,7 @@
CJS distributions in nimbleEcology
Arguments to dCJS_sv
are named. As in R, this is optional but helpful. Without names, the order matters.
probSurvive
is provided as a scalar value, assuming there is a variable called phi
.
-
In variants where probSurvive
is a vector (dOcc_vs
and dOcc_vv
), the \(t^{\mbox{th}}\) element of probSurvive
is \(phi_t\) above, namely the probability of survival from occasion \(t-1\) to \(t\).
+
In variants where probSurvive
is a vector (dOcc_vs
and dOcc_vv
), the \(t^{\mbox{th}}\) element of probSurvive
is \(phi_t\) above, namely the probability of survival from occasion \(t\) to \(t+1\).
probCapture
is provided as a vector value, assuming there is a matrix variable called p
. The value of probCapture
could be any vector from any variable in the model.
probCapture[t]
(i.e., the \(t^{\mbox{th}}\) element of probCapture
, which is p[i, t]
in this example) is \(p_t\) above, namely the probability of capture, if alive, at time \(t\).
The length parameter len
is in some cases redundant (the information could be determined by the length of other inputs), but nevertheless it is required.
diff --git a/man/dCJS.Rd b/man/dCJS.Rd
index fea2d7d..d17e542 100644
--- a/man/dCJS.Rd
+++ b/man/dCJS.Rd
@@ -30,14 +30,16 @@ rCJS_vv(n, probSurvive, probCapture, len = 0)
}
\arguments{
\item{x}{capture-history vector of 0s (not captured) and 1s (captured).
-Do not include the initial capture, which is assumed to have occurred
-prior to \code{x[1]}.}
+Include the initial capture, so \code{x[1]} should equal 1.}
\item{probSurvive}{survival probability, either a time-independent scalar
-(for dCJS_s*) or a time-dependent vector (for dCJS_v*).}
+(for dCJS_s*) or a time-dependent vector (for dCJS_v*) with length
+\code{len - 1}.}
\item{probCapture}{capture probability, either a time-independent scalar
-(for dCJS_*s) or a time-dependent vector (for dCJS_*v).}
+(for dCJS_*s) or a time-dependent vector (for dCJS_*v) with length \code{len}.
+If a vector, first element is ignored, as the total probability is conditioned
+on the capture at \code{t = 1}.}
\item{len}{length of capture history. Should equal \code{length(x)}}
@@ -62,18 +64,19 @@ These nimbleFunctions provide distributions that can be used directly in R or
in \code{nimble} hierarchical models (via \code{\link[nimble]{nimbleCode}}
and \code{\link[nimble]{nimbleModel}}).
-The letters following the 'dCJS_' indicate whether survival and/or
-capture probabilities, in that order, are scalar (s, meaning the
-probability applies to every \code{x[t]}) or vector (v, meaning the
-probability is a vector aligned with \code{x}). When
-\code{probCapture} and/or \code{probSurvive} is a vector, they must
-be the same length as \code{x}.
+The letters following the 'dCJS_' indicate whether survival and/or capture
+probabilities, in that order, are scalar (s, meaning the probability applies
+to every \code{x[t]}) or vector (v, meaning the probability is a vector
+aligned with \code{x}). When \code{probCapture} and/or \code{probSurvive} is
+a vector, they must be the same length as \code{x}.
It is important to use the time indexing correctly for survival.
-\code{probSurvive[t]} is the survival probabilty from time
-\code{t-1} to time \code{t}. Time indexing for detection is more
-obvious: \code{probDetect[t]} is the detection probability at time
-\code{t}.
+\code{probSurvive[t]} is the survival probabilty from time \code{t} to time
+\code{t + 1}. When a vector, \code{probSurvive} may have length greater than
+\code{length(x) - 1}, but all values beyond that index are ignored.
+
+Time indexing for detection is more obvious: \code{probDetect[t]} is the
+detection probability at time \code{t}.
When called from R, the \code{len} argument to \code{dCJS_**} is not
necessary. It will default to the length of \code{x}. When used in
@@ -85,9 +88,10 @@ For more explanation, see
\code{vignette("Introduction_to_nimbleEcology")}).
Compared to writing \code{nimble} models with a discrete latent state for
-true alive/dead status at each time and a separate scalar datum for each observation, use
-of these distributions allows one to directly sum (marginalize) over the
-discrete latent states and calculate the probability of the detection history for one individual jointly.
+true alive/dead status at each time and a separate scalar datum for each
+observation, use of these distributions allows one to directly sum
+(marginalize) over the discrete latent states and calculate the probability
+of the detection history for one individual jointly.
These are \code{nimbleFunction}s written in the format of user-defined
distributions for NIMBLE's extension of the BUGS model language. More
@@ -118,23 +122,23 @@ will make a similar invocation of \code{rCJS_ss}, with \code{n = 1}.
If both survival and capture probabilities are time-dependent, use
-\code{captures[i,1:T] ~ dCSJ_vv(survive[1:T], capture[1:T], T)}
+\code{captures[i,1:T] ~ dCSJ_vv(survive[1:(T-1)], capture[1:T], T)}
and so on for each combination of time-dependent and time-independent parameters.
}
\examples{
\donttest{
# Set up constants and initial values for defining the model
-dat <- c(1,1,0,0) # A vector of observations
-probSurvive <- 0.6
+dat <- c(1,1,0,0,0) # A vector of observations
+probSurvive <- c(0.6, 0.3, 0.3, 0.1)
probCapture <- 0.4
# Define code for a nimbleModel
nc <- nimbleCode({
- x[1:4] ~ dCJS_ss(probSurvive, probCapture, len = 4)
+ x[1:4] ~ dCJS_vs(probSurvive, probCapture, len = 4)
probSurvive ~ dunif(0,1)
- probCapture ~ dunif(0,1)
+ for (i in 1:4) probSurvive[i] ~ dunif(0, 1)
})
# Build the model, providing data and initial values
diff --git a/man/dDHMM.Rd b/man/dDHMM.Rd
index 7251029..8d04dfb 100644
--- a/man/dDHMM.Rd
+++ b/man/dDHMM.Rd
@@ -73,7 +73,8 @@ the number of time intervals.
\code{probTrans} has dimension S x S x (T - 1). \code{probTrans}[i, j, t] is
the probability that an individual in state \code{i} at time \code{t} takes on
-state \code{j} at time \code{t+1}.
+state \code{j} at time \code{t+1}. The length of the third dimension may be greater
+than (T - 1) but all values indexed greater than T - 1 will be ignored.
\code{initStates} has length S. \code{initStates[i]} is the
probability of being in state \code{i} at the first observation time.
diff --git a/man/dDynOcc.Rd b/man/dDynOcc.Rd
index 42a30a5..44463a2 100644
--- a/man/dDynOcc.Rd
+++ b/man/dDynOcc.Rd
@@ -142,7 +142,14 @@ The first two letters following the 'dDynOcc_' indicate whether the
probabilities of persistence and colonization are a constant scalar (s)
or time-indexed vector (v). For example, \code{dDynOcc_svm} takes scalar
persistence probability \code{probPersist} with a vector of colonization
-probabilities \code{probColonize[1:T]}.
+probabilities \code{probColonize[1:(T-1)]}.
+
+When vectors, \code{probColonize} and \code{probPersist} may be of any
+length greater than \code{length(x) - 1}. Only the first \code{length(x) - 1}
+indices are used, each corresponding to the transition from time t to t+1
+(e.g. \code{probColonize[2]} describes the transition probability from
+t = 2 to t = 3). All extra values are ignored. This is to make it easier to
+use one distribution for many sites, some requiring probabilities of length 1.
The third letter in the suffix indicates whether the detection probability
is a constant (scalar), time-dependent (vector), or both time-dependent and