diff --git a/inst/doc/Introduction_to_nimbleEcology.html b/inst/doc/Introduction_to_nimbleEcology.html index 69d58f5..645c179 100644 --- a/inst/doc/Introduction_to_nimbleEcology.html +++ b/inst/doc/Introduction_to_nimbleEcology.html @@ -12,7 +12,7 @@ - + Introduction to nimbleEcology @@ -303,7 +303,7 @@

Introduction to nimbleEcology

Perry de Valpine and Ben Goldstein

-

2019-09-23

+

2019-10-10

@@ -417,20 +417,20 @@

Simulate data

simNodes <- occupancy_model$getDependencies(c("psi", "p"), self = FALSE) occupancy_model$simulate(simNodes) occupancy_model$z -#> [1] 0 0 1 1 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 0 1 1 1 0 0 1 0 1 1 1 1 1 1 1 1 -#> [36] 1 1 1 1 1 1 0 1 1 0 1 0 1 1 1 +#> [1] 1 0 1 1 1 0 0 1 0 1 0 0 0 1 1 0 0 1 1 0 1 1 1 1 1 1 0 1 1 0 1 1 1 0 1 +#> [36] 0 1 1 1 1 1 0 1 1 1 1 1 1 1 1 head(occupancy_model$y, 10) ## first 10 rows #> [,1] [,2] [,3] [,4] [,5] -#> [1,] 0 0 0 0 0 +#> [1,] 0 0 1 0 0 #> [2,] 0 0 0 0 0 -#> [3,] 0 0 1 1 0 +#> [3,] 0 0 0 0 1 #> [4,] 0 0 0 0 0 -#> [5,] 0 0 0 0 0 -#> [6,] 0 0 0 1 0 +#> [5,] 0 0 1 0 0 +#> [6,] 0 0 0 0 0 #> [7,] 0 0 0 0 0 -#> [8,] 1 0 0 0 0 -#> [9,] 0 1 0 0 0 -#> [10,] 0 0 0 0 0 +#> [8,] 0 1 0 0 1 +#> [9,] 0 0 0 0 0 +#> [10,] 0 0 0 1 1 occupancy_model$setData('y') ## set "y" as data
@@ -484,15 +484,15 @@

Do it all for the new version of the model

#> [.quosures rlang #> c.quosures rlang #> print.quosures rlang -#> ── Attaching packages ─────────────────────────────────────────────────────────────── tidyverse 1.2.1 ── +#> ── Attaching packages ───────────────────────────────────────────────────────────── tidyverse 1.2.1 ── #> ✔ ggplot2 3.1.1 ✔ purrr 0.3.2 #> ✔ tibble 2.1.2 ✔ dplyr 0.8.1 #> ✔ tidyr 0.8.3 ✔ stringr 1.4.0 #> ✔ readr 1.3.1 ✔ forcats 0.4.0 -#> ── Conflicts ────────────────────────────────────────────────────────────────── tidyverse_conflicts() ── +#> ── Conflicts ──────────────────────────────────────────────────────────────── tidyverse_conflicts() ── #> ✖ dplyr::filter() masks stats::filter() #> ✖ dplyr::lag() masks stats::lag() -

+

The posterior density plots show that the familiar, conventional version of the model yields the same posterior distribution as the new version, which uses dOcc_s.

@@ -513,7 +513,7 @@

Use the new version for something different: maximum likelihood or maximum a #> compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details. #> compilation finished. optim(c(0.5, 0.5), COccLogLik$run, control = list(fnscale = -1))$par -#> [1] 0.7256951 0.1377345 +#> [1] 0.7166556 0.1506934
@@ -527,14 +527,14 @@

A note on static typing

Capture-recapture (dCJS)

Cormack-Jolly-Seber models give the probability of a capture history for each of many individuals, conditional on first capture, based on parameters for survival and detection probability. nimbleEcology provides a distribution for individual capture histories, with variants for time-independent and time-dependent survival and/or detection probability. Of course, the survival and detection parameters for the CJS probability may themselves depend on other parameters and/or random effects. The rest of this summary will focus on one individual’s capture history.

-

Define \(\phi_t\) as survival from time \(t-1\) to \(t\) and \(\mathbf{\phi} = (\phi_1, \ldots, \phi_T)\), where \(T\) is the length of the series. We use “time” and “sampling occasion” as synonyms in this context, so \(T\) is the number of sampling occasions. (Be careful with indexing. Sometimes you might see \(\phi_t\) defined as survival from time \(t\) to \(t+1\).) Define \(p_t\) as detection probability at time \(t\) and \(\mathbf{p} = (p_1, \ldots, p_T)\). Define the capture history as \(\mathbf{y} = y_{1:T} = (y_1, \ldots, y_T)\), where each \(y_t\) is 0 or 1. The notation \(y_{i:j}\) means the sequence of observations from time \(i\) to time \(j\). The first observation, \(y_0 = 1\), is not included in \(\mathbf{y}\) because it is known to be 1 since this model conditions on first capture.

+

Define \(\phi_t\) as survival from time \(t\) to \(t+1\) and \(\mathbf{\phi} = (\phi_1, \ldots, \phi_T)\), where \(T\) is the length of the series. We use “time” and “sampling occasion” as synonyms in this context, so \(T\) is the number of sampling occasions. (Be careful with indexing. Sometimes you might see \(\phi_t\) defined as survival from time \(t-1\) to \(t\).) Define \(p_t\) as detection probability at time \(t\) and \(\mathbf{p} = (p_1, \ldots, p_T)\). Define the capture history as \(\mathbf{y} = y_{1:T} = (y_1, \ldots, y_T)\), where each \(y_t\) is 0 or 1. The notation \(y_{i:j}\) means the sequence of observations from time \(i\) to time \(j\). The first observation, \(y_0 = 1\), is included in \(\mathbf{y}\).

There are multiple ways to write the CJS probability. We will do so in a state-space format because that leads to the more general DHMM case next. The probability of observations given parameters, \(P(\mathbf{y} | \mathbf{\phi}, \mathbf{p})\), is factored as: \[ P(\mathbf{y} | \mathbf{\phi}, \mathbf{p}) = \prod_{t = 1}^T P(y_{t} | y_{1:t-1}, \mathbf{\phi}, \mathbf{p}) \]

Each factor \(P(y_{t} | y_{1:t-1}, \mathbf{\phi}, \mathbf{p})\) is calculated as: \[ P(y_{t} | y_{1:t-1}, \mathbf{\phi}, \mathbf{p}) = I(y_t = 1) (A_t p_t) + I(y_t = 0) (A_t (1-p_t) + (1-A_t)) \] The indicator function \(I(y_t = 1)\) is 1 if it \(y_t\) is 1, 0 otherwise, and vice versa for \(I(y_t = 0)\). Here \(A_t\) is the probability that the individual is alive at time \(t\) given \(y_{1:t-1}\), the data up to the previous time. This is calculated as: \[ -A_t = G_{t-1} \phi_t +A_t = G_{t-1} \phi_{t-1} \] where \(G_{t}\) is the probability that the individual is alive at time \(t\) given \(y_{1:t}\), the data up to the current time. This is calculated as: \[ G_{t} = I(y_t = 1) 1 + I(y_t = 0) \frac{A_t (1-p_t)}{A_t (1-p_t) + (1-A_t)} \] The sequential calculation is initialized with \(G_0 = 1\). For time step \(t\), we calculate \(A_t\), then \(P(y_t | y_{1:t-1}, \mathbf{\phi}, \mathbf{p})\), then \(G_t\), leaving us ready for time step \(t+1\). This is a trivial hidden Markov model where the latent state, alive or dead, is not written explicitly.

@@ -556,7 +556,7 @@

CJS distributions in nimbleEcology

  • Arguments to dCJS_sv are named. As in R, this is optional but helpful. Without names, the order matters.
  • probSurvive is provided as a scalar value, assuming there is a variable called phi.
  • -
  • In variants where probSurvive is a vector (dOcc_vs and dOcc_vv), the \(t^{\mbox{th}}\) element of probSurvive is \(phi_t\) above, namely the probability of survival from occasion \(t-1\) to \(t\).
  • +
  • In variants where probSurvive is a vector (dOcc_vs and dOcc_vv), the \(t^{\mbox{th}}\) element of probSurvive is \(phi_t\) above, namely the probability of survival from occasion \(t\) to \(t+1\).
  • probCapture is provided as a vector value, assuming there is a matrix variable called p. The value of probCapture could be any vector from any variable in the model.
  • probCapture[t] (i.e., the \(t^{\mbox{th}}\) element of probCapture, which is p[i, t] in this example) is \(p_t\) above, namely the probability of capture, if alive, at time \(t\).
  • The length parameter len is in some cases redundant (the information could be determined by the length of other inputs), but nevertheless it is required.
  • diff --git a/man/dCJS.Rd b/man/dCJS.Rd index fea2d7d..d17e542 100644 --- a/man/dCJS.Rd +++ b/man/dCJS.Rd @@ -30,14 +30,16 @@ rCJS_vv(n, probSurvive, probCapture, len = 0) } \arguments{ \item{x}{capture-history vector of 0s (not captured) and 1s (captured). -Do not include the initial capture, which is assumed to have occurred -prior to \code{x[1]}.} +Include the initial capture, so \code{x[1]} should equal 1.} \item{probSurvive}{survival probability, either a time-independent scalar -(for dCJS_s*) or a time-dependent vector (for dCJS_v*).} +(for dCJS_s*) or a time-dependent vector (for dCJS_v*) with length +\code{len - 1}.} \item{probCapture}{capture probability, either a time-independent scalar -(for dCJS_*s) or a time-dependent vector (for dCJS_*v).} +(for dCJS_*s) or a time-dependent vector (for dCJS_*v) with length \code{len}. +If a vector, first element is ignored, as the total probability is conditioned +on the capture at \code{t = 1}.} \item{len}{length of capture history. Should equal \code{length(x)}} @@ -62,18 +64,19 @@ These nimbleFunctions provide distributions that can be used directly in R or in \code{nimble} hierarchical models (via \code{\link[nimble]{nimbleCode}} and \code{\link[nimble]{nimbleModel}}). -The letters following the 'dCJS_' indicate whether survival and/or -capture probabilities, in that order, are scalar (s, meaning the -probability applies to every \code{x[t]}) or vector (v, meaning the -probability is a vector aligned with \code{x}). When -\code{probCapture} and/or \code{probSurvive} is a vector, they must -be the same length as \code{x}. +The letters following the 'dCJS_' indicate whether survival and/or capture +probabilities, in that order, are scalar (s, meaning the probability applies +to every \code{x[t]}) or vector (v, meaning the probability is a vector +aligned with \code{x}). When \code{probCapture} and/or \code{probSurvive} is +a vector, they must be the same length as \code{x}. It is important to use the time indexing correctly for survival. -\code{probSurvive[t]} is the survival probabilty from time -\code{t-1} to time \code{t}. Time indexing for detection is more -obvious: \code{probDetect[t]} is the detection probability at time -\code{t}. +\code{probSurvive[t]} is the survival probabilty from time \code{t} to time +\code{t + 1}. When a vector, \code{probSurvive} may have length greater than +\code{length(x) - 1}, but all values beyond that index are ignored. + +Time indexing for detection is more obvious: \code{probDetect[t]} is the +detection probability at time \code{t}. When called from R, the \code{len} argument to \code{dCJS_**} is not necessary. It will default to the length of \code{x}. When used in @@ -85,9 +88,10 @@ For more explanation, see \code{vignette("Introduction_to_nimbleEcology")}). Compared to writing \code{nimble} models with a discrete latent state for -true alive/dead status at each time and a separate scalar datum for each observation, use -of these distributions allows one to directly sum (marginalize) over the -discrete latent states and calculate the probability of the detection history for one individual jointly. +true alive/dead status at each time and a separate scalar datum for each +observation, use of these distributions allows one to directly sum +(marginalize) over the discrete latent states and calculate the probability +of the detection history for one individual jointly. These are \code{nimbleFunction}s written in the format of user-defined distributions for NIMBLE's extension of the BUGS model language. More @@ -118,23 +122,23 @@ will make a similar invocation of \code{rCJS_ss}, with \code{n = 1}. If both survival and capture probabilities are time-dependent, use -\code{captures[i,1:T] ~ dCSJ_vv(survive[1:T], capture[1:T], T)} +\code{captures[i,1:T] ~ dCSJ_vv(survive[1:(T-1)], capture[1:T], T)} and so on for each combination of time-dependent and time-independent parameters. } \examples{ \donttest{ # Set up constants and initial values for defining the model -dat <- c(1,1,0,0) # A vector of observations -probSurvive <- 0.6 +dat <- c(1,1,0,0,0) # A vector of observations +probSurvive <- c(0.6, 0.3, 0.3, 0.1) probCapture <- 0.4 # Define code for a nimbleModel nc <- nimbleCode({ - x[1:4] ~ dCJS_ss(probSurvive, probCapture, len = 4) + x[1:4] ~ dCJS_vs(probSurvive, probCapture, len = 4) probSurvive ~ dunif(0,1) - probCapture ~ dunif(0,1) + for (i in 1:4) probSurvive[i] ~ dunif(0, 1) }) # Build the model, providing data and initial values diff --git a/man/dDHMM.Rd b/man/dDHMM.Rd index 7251029..8d04dfb 100644 --- a/man/dDHMM.Rd +++ b/man/dDHMM.Rd @@ -73,7 +73,8 @@ the number of time intervals. \code{probTrans} has dimension S x S x (T - 1). \code{probTrans}[i, j, t] is the probability that an individual in state \code{i} at time \code{t} takes on -state \code{j} at time \code{t+1}. +state \code{j} at time \code{t+1}. The length of the third dimension may be greater +than (T - 1) but all values indexed greater than T - 1 will be ignored. \code{initStates} has length S. \code{initStates[i]} is the probability of being in state \code{i} at the first observation time. diff --git a/man/dDynOcc.Rd b/man/dDynOcc.Rd index 42a30a5..44463a2 100644 --- a/man/dDynOcc.Rd +++ b/man/dDynOcc.Rd @@ -142,7 +142,14 @@ The first two letters following the 'dDynOcc_' indicate whether the probabilities of persistence and colonization are a constant scalar (s) or time-indexed vector (v). For example, \code{dDynOcc_svm} takes scalar persistence probability \code{probPersist} with a vector of colonization -probabilities \code{probColonize[1:T]}. +probabilities \code{probColonize[1:(T-1)]}. + +When vectors, \code{probColonize} and \code{probPersist} may be of any +length greater than \code{length(x) - 1}. Only the first \code{length(x) - 1} +indices are used, each corresponding to the transition from time t to t+1 +(e.g. \code{probColonize[2]} describes the transition probability from +t = 2 to t = 3). All extra values are ignored. This is to make it easier to +use one distribution for many sites, some requiring probabilities of length 1. The third letter in the suffix indicates whether the detection probability is a constant (scalar), time-dependent (vector), or both time-dependent and