-
Notifications
You must be signed in to change notification settings - Fork 4
/
Copy path19-stan.Rtex
95 lines (76 loc) · 3.6 KB
/
19-stan.Rtex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
\section{Appendix}
\frame{
\sffamily
\frametitle{Stan example: Cox regression}
\begin{itemize}
\item {\bf Example (Cox regression):} Recall that standard Cox regression specifies the hazard function as:
$$ \lambda(t|z) = \lambda_0(t)\exp(z^\top \beta) $$
for an unspecified baseline hazard, $\lambda_0(t)$.
With right-censoring, one has the likelihood (for one patient):
$$
(\lambda_0(w_i) \exp(z_i^\top \beta))^{v_i} \exp\left(
-\exp(z_i^\top \beta) \int_0^{w_i} \lambda_0(u)du \right)
$$
where $w_i$ is the minimum of the failure time and the censoring time for the patient and $v_i$ is 1 if the person is not censored.
It turns out this likelihood can be written as a Poisson distribution when one assumes piecewise constant baseline hazard -- we'll use this in the implementation.
\end{itemize}
}
\frame{
\sffamily
\frametitle{Stan example: Bayesian Cox regression}
\begin{itemize}
\item {\bf Example (Cox regression):} A Bayesian treatment requires either a prior for $\lambda_0(t)$ or use of the Kalbfleisch (1978) result that Cox's partial likelihood can be used as an approximation to an actual likelihood, with a prior placed simply on the regression coefficients.
Considering the fully Bayesian approach, a canonical solution has been to assume a piecewise constant hazard model with independent jumps at the failure times. The size of the jumps is centered on those in a parametric hazard model, $\lambda_0^*(t)$ (such as a Weibull) using a gamma distribution for conjugacy.
$$
\lambda_0(t)dt \sim \mbox{Gamma}(c\lambda_0^*(t)dt, c)
$$
This construction is called a gamma process.
\end{itemize}
}
\frame{
\sffamily
\frametitle{Stan example: Stan overview}
Stan focuses on HMC, so its primary computation is of the full log posterior density.
\begin{itemize}
\item \texttt{model} block of code uses imperative syntax (order matters) to encode calculation of log posterior.
\item BUGS-style distributional statements are allowed.
\item Stan also has its own language for encoding computation, including full linear algebra.
\end{itemize}
Stan generates C++ from the model code to implement HMC for the model and then compiles it to an executable, similar to NIMBLE.
Note: Stan cannot handle discrete parameters (discrete data are fine), so any implementation would require marginalization over such parameters.
}
\frame{
\sffamily
\frametitle{Stan example: Stan overview}
Other blocks of Stan code for a model describe:
\begin{itemize}
\item \texttt{data}: input data and what NIMBLE calls constants
\item \texttt{transformed data}: fixed hyperparameters and transformations of input data
\item \texttt{parameters}: list of unknown parameters
\item \texttt{generated quantities}: posterior functionals of parameters and/or data
\end{itemize}
}
\begin{frame}[fragile]
\sffamily
\frametitle{Stan example: Stan code for Cox regression}
Here's the core \texttt{model} block for Cox regression for the BUGS leukemia example.
{\footnotesize
\begin{verbatim}
model {
beta ~ normal(0, 1000);
for(j in 1:NT) { ## unique times
## gamma process prior for baseline hazard, centered on exponential
dL0[j] ~ gamma(c * r * (t[j + 1] - t[j]), c);
for(i in 1:N) { ## patients
if (Y[i, j] != 0)
## Poisson representation of likelihood with piecewise
## baseline hazard; Y[i,j] is observation process:
## Y[i,j]=1 if patient i is observed and 0 if censored
dN[i, j] ~ poisson(Y[i, j] * exp(beta * Z[i]) * dL0[j]);
}
}
}
\end{verbatim}
}
See \texttt{hmc-cox.R} for full Stan implementation.
\end{frame}