Title: | R Interface to 'CmdStan' |
---|---|
Description: | A lightweight interface to 'Stan' <https://mc-stan.org>. The 'CmdStanR' interface is an alternative to 'RStan' that calls the command line interface for compilation and running algorithms instead of interfacing with C++ via 'Rcpp'. This has many benefits including always being compatible with the latest version of Stan, fewer installation errors, fewer unexpected crashes in RStudio, and a more permissive license. |
Authors: | Jonah Gabry [aut], Rok Češnovar [aut], Andrew Johnson [aut, cre], Steve Bronder [aut], Ben Bales [ctb], Mitzi Morris [ctb], Mikhail Popov [ctb], Mike Lawrence [ctb], William Michael Landau [ctb], Jacob Socolar [ctb], Martin Modrák [ctb], Ven Popov [ctb] |
Maintainer: | Andrew Johnson <[email protected]> |
License: | BSD_3_clause + file LICENSE |
Version: | 0.8.1.9000 |
Built: | 2024-12-14 18:32:39 UTC |
Source: | https://github.com/stan-dev/cmdstanr |
Stan Development Team
CmdStanR: the R interface to CmdStan.
CmdStanR (cmdstanr package) is an interface to Stan (mc-stan.org) for R users. It provides the necessary objects and functions to compile a Stan program and run Stan's algorithms from R via CmdStan, the shell interface to Stan (mc-stan.org/users/interfaces/cmdstan).
The RStan interface (rstan package) is an in-memory interface to Stan and relies on R packages like Rcpp and inline to call C++ code from R. On the other hand, the CmdStanR interface does not directly call any C++ code from R, instead relying on the CmdStan interface behind the scenes for compilation, running algorithms, and writing results to output files.
Advantages of RStan:
Allows other developers to distribute R packages with pre-compiled Stan programs (like rstanarm) on CRAN. (Note: As of 2023, this can mostly be achieved with CmdStanR as well. See Developing using CmdStanR.)
Avoids use of R6 classes, which may result in more familiar syntax for many R users.
CRAN binaries available for Mac and Windows.
Advantages of CmdStanR:
Compatible with latest versions of Stan. Keeping up with Stan releases is complicated for RStan, often requiring non-trivial changes to the rstan package and new CRAN releases of both rstan and StanHeaders. With CmdStanR the latest improvements in Stan will be available from R immediately after updating CmdStan using cmdstanr::install_cmdstan().
Running Stan via external processes results in fewer unexpected crashes, especially in RStudio.
Less memory overhead.
More permissive license. RStan uses the GPL-3 license while the license for CmdStanR is BSD-3, which is a bit more permissive and is the same license used for CmdStan and the Stan C++ source code.
CmdStanR requires a working version of CmdStan. If you already have CmdStan installed see cmdstan_model() to get started, otherwise see install_cmdstan() to install CmdStan. The vignette Getting started with CmdStanR demonstrates the basic functionality of the package.
For a list of global options see cmdstanr_global_options.
Maintainer: Andrew Johnson [email protected] (ORCID)
Authors:
Jonah Gabry [email protected]
Rok Češnovar [email protected]
Steve Bronder
Other contributors:
Ben Bales [contributor]
Mitzi Morris [contributor]
Mikhail Popov [contributor]
Mike Lawrence [contributor]
William Michael Landau [email protected] (ORCID) [contributor]
Jacob Socolar [contributor]
Martin Modrák [contributor]
Ven Popov [contributor]
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Useful links:
Report bugs at https://github.com/stan-dev/cmdstanr/issues
## Not run:
library(cmdstanr)
library(posterior)
library(bayesplot)
color_scheme_set("brightblue")

# Set path to CmdStan
# (Note: if you installed CmdStan via install_cmdstan() with default settings
# then setting the path is unnecessary but the default below should still work.
# Otherwise use the `path` argument to specify the location of your
# CmdStan installation.)
set_cmdstan_path(path = NULL)

# Create a CmdStanModel object from a Stan program,
# here using the example model that comes with CmdStan
file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(file)
mod$print()
# Print with line numbers. This can be set globally using the
# `cmdstanr_print_line_numbers` option.
mod$print(line_numbers = TRUE)

# Data as a named list (like RStan)
stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))

# Run MCMC using the 'sample' method
fit_mcmc <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  parallel_chains = 2
)

# Use 'posterior' package for summaries
fit_mcmc$summary()

# Check sampling diagnostics
fit_mcmc$diagnostic_summary()

# Get posterior draws
draws <- fit_mcmc$draws()
print(draws)

# Convert to data frame using posterior::as_draws_df
as_draws_df(draws)

# Plot posterior using bayesplot (ggplot2)
mcmc_hist(fit_mcmc$draws("theta"))

# Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
# and also demonstrate specifying data as a path to a file instead of a list
my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
fit_optim <- mod$optimize(data = my_data_file, seed = 123)
fit_optim$summary()

# Run 'optimize' again with 'jacobian=TRUE' and then draw from Laplace approximation
# to the posterior
fit_optim <- mod$optimize(data = my_data_file, jacobian = TRUE)
fit_laplace <- mod$laplace(data = my_data_file, mode = fit_optim, draws = 2000)
fit_laplace$summary()

# Run 'variational' method to use ADVI to approximate posterior
fit_vb <- mod$variational(data = stan_data, seed = 123)
fit_vb$summary()
mcmc_hist(fit_vb$draws("theta"))

# Run 'pathfinder' method, a new alternative to the variational method
fit_pf <- mod$pathfinder(data = stan_data, seed = 123)
fit_pf$summary()
mcmc_hist(fit_pf$draws("theta"))

# Run 'pathfinder' again with more paths, fewer draws per path,
# better covariance approximation, and fewer LBFGS iterations
fit_pf <- mod$pathfinder(data = stan_data, num_paths=10, single_path_draws=40,
                         history_size=50, max_lbfgs_iters=100)

# Specifying initial values as a function
fit_mcmc_w_init_fun <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  refresh = 0,
  init = function() list(theta = runif(1))
)
fit_mcmc_w_init_fun_2 <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  refresh = 0,
  init = function(chain_id) {
    # silly but demonstrates optional use of chain_id
    list(theta = 1 / (chain_id + 1))
  }
)
fit_mcmc_w_init_fun_2$init()

# Specifying initial values as a list of lists
fit_mcmc_w_init_list <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  refresh = 0,
  init = list(
    list(theta = 0.75), # chain 1
    list(theta = 0.25)  # chain 2
  )
)
fit_optim_w_init_list <- mod$optimize(
  data = stan_data,
  seed = 123,
  init = list(
    list(theta = 0.75)
  )
)
fit_optim_w_init_list$init()
## End(Not run)
Create a draws object from a CmdStanR fitted model object
Create a draws object supported by the posterior package. These methods are just wrappers around CmdStanR's $draws() method provided for convenience.
## S3 method for class 'CmdStanMCMC'
as_draws(x, ...)
## S3 method for class 'CmdStanMLE'
as_draws(x, ...)
## S3 method for class 'CmdStanLaplace'
as_draws(x, ...)
## S3 method for class 'CmdStanVB'
as_draws(x, ...)
## S3 method for class 'CmdStanGQ'
as_draws(x, ...)
## S3 method for class 'CmdStanPathfinder'
as_draws(x, ...)
x | A CmdStanR fitted model object. |
... | Optional arguments passed to the |
To subset iterations, chains, or draws, use the posterior::subset_draws() method after creating the draws object.
## Not run:
fit <- cmdstanr_example()
as_draws(fit)

# posterior's as_draws_*() methods will also work
posterior::as_draws_rvars(fit)
posterior::as_draws_list(fit)
## End(Not run)
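The subsetting step mentioned above can be sketched as follows. To keep the sketch runnable without a CmdStan installation, posterior's built-in example_draws() is used as a stand-in for as_draws(fit); that substitution, and the variable name "mu" from that example dataset, are assumptions made for illustration only.

```r
library(posterior)

# Stand-in for as_draws(fit): posterior's bundled example draws
# (assumption: used only so the sketch runs without CmdStan).
draws <- example_draws()

# Keep one variable, the first two chains, and the first 10 iterations.
sub <- subset_draws(draws, variable = "mu", chain = 1:2, iteration = 1:10)

# Inspect what is left after subsetting.
print(niterations(sub))
print(nchains(sub))
print(variables(sub))
```

With a real fit, the same subset_draws() call would be applied to the result of as_draws(fit), using the variable names of that model.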
Convert CmdStanMCMC to mcmc.list
This function converts a CmdStanMCMC object to an mcmc.list object compatible with the coda package. This is primarily intended for users of Stan coming from BUGS/JAGS who are used to coda for plotting and diagnostics. In general we recommend the more recent MCMC diagnostics in posterior and the ggplot2-based plotting functions in bayesplot, but for users who prefer coda this function provides compatibility.
as_mcmc.list(x)
x | A CmdStanMCMC object. |
An mcmc.list object compatible with the coda package.
## Not run:
fit <- cmdstanr_example()
x <- as_mcmc.list(fit)
## End(Not run)
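Once an mcmc.list is available, the usual coda tools apply to it. The sketch below illustrates this with a simulated two-chain mcmc.list standing in for the result of as_mcmc.list(fit); the simulated draws and the helper make_chain() are hypothetical and exist only so the code runs without CmdStan.

```r
library(coda)

set.seed(1)

# Hypothetical helper: one chain of simulated draws for two quantities.
# (Assumption: stands in for a chain extracted from a CmdStanMCMC object.)
make_chain <- function() {
  mcmc(cbind(theta = rnorm(500), lp__ = rnorm(500)))
}

# Stand-in for x <- as_mcmc.list(fit)
x <- mcmc.list(make_chain(), make_chain())

# Standard coda summaries and diagnostics
print(summary(x))
print(effectiveSize(x))
print(gelman.diag(x, multivariate = FALSE))
```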
These are generic functions intended primarily for developers of packages that interface with CmdStanR. Developers can define methods on top of these generics to coerce objects into CmdStanR's fitted model objects.
as.CmdStanMCMC(object, ...)
as.CmdStanMLE(object, ...)
as.CmdStanLaplace(object, ...)
as.CmdStanVB(object, ...)
as.CmdStanPathfinder(object, ...)
as.CmdStanGQ(object, ...)
as.CmdStanDiagnose(object, ...)
object | The object to be coerced. |
... | Additional arguments to pass to methods. |
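As a rough sketch of how a package developer might define a method for one of these generics: the class name my_wrapper, its constructor new_my_wrapper(), and the choice to store the fit in a $fit element are all hypothetical. The placeholder object below only carries the class attribute and is not a real fit; it is there so the sketch runs without CmdStan.

```r
# Hypothetical wrapper class: a list holding a CmdStanMCMC object in $fit.
new_my_wrapper <- function(fit) {
  structure(list(fit = fit), class = "my_wrapper")
}

# S3 method for the as.CmdStanMCMC() generic: return the stored fit.
as.CmdStanMCMC.my_wrapper <- function(object, ...) {
  object$fit
}

# Placeholder carrying only the class attribute
# (assumption: a real package would store an actual CmdStanMCMC fit here).
placeholder_fit <- structure(list(), class = "CmdStanMCMC")
w <- new_my_wrapper(placeholder_fit)

# Calling the method directly; with cmdstanr attached, as.CmdStanMCMC(w)
# would dispatch to this method.
fit <- as.CmdStanMCMC.my_wrapper(w)
print(class(fit))
```

Inside a package the method would normally be registered through the NAMESPACE file rather than called directly.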
Create a new CmdStanModel object from a file containing a Stan program or from an existing Stan executable. The CmdStanModel object stores the path to a Stan program and compiled executable (once created), and provides methods for fitting the model using Stan's algorithms.
See the compile and ... arguments for control over whether and how compilation happens.
cmdstan_model(stan_file = NULL, exe_file = NULL, compile = TRUE, ...)
stan_file | (string) The path to a |
exe_file | (string) The path to an existing Stan model executable. Can be provided instead of or in addition to |
compile | (logical) Do compilation? The default is |
... | Optionally, additional arguments to pass to the |
A CmdStanModel object.
install_cmdstan(), $compile(), $check_syntax()
A CmdStanDiagnose object is the object returned by the $diagnose() method of a CmdStanModel object.
CmdStanDiagnose objects have the following associated methods:
Method | Description |
---|---|
$gradients() | Return gradients from diagnostic mode. |
$lp() | Return the total log probability density (target). |
$init() | Return user-specified initial values. |
$metadata() | Return a list of metadata gathered from the CmdStan CSV files. |
$save_output_files() | Save output CSV files to a specified location. |
$save_data_file() | Save JSON data file to a specified location. |
Other fitted model objects: CmdStanGQ, CmdStanLaplace, CmdStanMCMC, CmdStanMLE, CmdStanPathfinder, CmdStanVB
## Not run:
test <- cmdstanr_example("logistic", method = "diagnose")

# retrieve the gradients
test$gradients()
## End(Not run)
A CmdStanGQ object is the fitted model object returned by the $generate_quantities() method of a CmdStanModel object.
CmdStanGQ objects have the following associated methods, all of which have their own (linked) documentation pages.
Method | Description |
---|---|
$draws() | Return the generated quantities as a draws_array. |
$metadata() | Return a list of metadata gathered from the CmdStan CSV files. |
$code() | Return Stan code as a character vector. |

Method | Description |
---|---|
$summary() | Run posterior::summarise_draws(). |

Method | Description |
---|---|
$save_object() | Save fitted model object to a file. |
$save_output_files() | Save output CSV files to a specified location. |
$save_data_file() | Save JSON data file to a specified location. |

Method | Description |
---|---|
$time() | Report the total run time. |
$output() | Return the stdout and stderr of all chains or pretty print the output for a single chain. |
$return_codes() | Return the return codes from the CmdStan runs. |
Other fitted model objects: CmdStanDiagnose, CmdStanLaplace, CmdStanMCMC, CmdStanMLE, CmdStanPathfinder, CmdStanVB
## Not run:
# first fit a model using MCMC
mcmc_program <- write_stan_file(
  "data {
    int<lower=0> N;
    array[N] int<lower=0,upper=1> y;
  }
  parameters {
    real<lower=0,upper=1> theta;
  }
  model {
    y ~ bernoulli(theta);
  }"
)
mod_mcmc <- cmdstan_model(mcmc_program)

data <- list(N = 10, y = c(1,1,0,0,0,1,0,1,0,0))
fit_mcmc <- mod_mcmc$sample(data = data, seed = 123, refresh = 0)

# stan program for standalone generated quantities
# (could keep model block, but not necessary so removing it)
gq_program <- write_stan_file(
  "data {
    int<lower=0> N;
    array[N] int<lower=0,upper=1> y;
  }
  parameters {
    real<lower=0,upper=1> theta;
  }
  generated quantities {
    array[N] int y_rep = bernoulli_rng(rep_vector(theta, N));
  }"
)

mod_gq <- cmdstan_model(gq_program)
fit_gq <- mod_gq$generate_quantities(fit_mcmc, data = data, seed = 123)
str(fit_gq$draws())

library(posterior)
as_draws_df(fit_gq$draws())
## End(Not run)
A CmdStanLaplace object is the fitted model object returned by the $laplace() method of a CmdStanModel object.
CmdStanLaplace objects have the following associated methods, all of which have their own (linked) documentation pages.
Method | Description |
---|---|
$draws() | Return approximate posterior draws as a draws_matrix. |
$mode() | Return the mode as a CmdStanMLE object. |
$lp() | Return the total log probability density (target) computed in the model block of the Stan program. |
$lp_approx() | Return the log density of the approximation to the posterior. |
$init() | Return user-specified initial values. |
$metadata() | Return a list of metadata gathered from the CmdStan CSV files. |
$code() | Return Stan code as a character vector. |

Method | Description |
---|---|
$summary() | Run posterior::summarise_draws(). |

Method | Description |
---|---|
$save_object() | Save fitted model object to a file. |
$save_output_files() | Save output CSV files to a specified location. |
$save_data_file() | Save JSON data file to a specified location. |
$save_latent_dynamics_files() | Save diagnostic CSV files to a specified location. |

Method | Description |
---|---|
$time() | Report the run time of the Laplace sampling step. |
$output() | Pretty print the output that was printed to the console. |
$return_codes() | Return the return codes from the CmdStan runs. |
Other fitted model objects: CmdStanDiagnose, CmdStanGQ, CmdStanMCMC, CmdStanMLE, CmdStanPathfinder, CmdStanVB
A CmdStanMCMC object is the fitted model object returned by the $sample() method of a CmdStanModel object.
Like CmdStanModel objects, CmdStanMCMC objects are R6 objects.
CmdStanMCMC objects have the following associated methods, all of which have their own (linked) documentation pages.
Method | Description |
---|---|
$draws() | Return posterior draws using formats from the posterior package. |
$sampler_diagnostics() | Return sampler diagnostics as a draws_array. |
$lp() | Return the total log probability density (target). |
$inv_metric() | Return the inverse metric for each chain. |
$init() | Return user-specified initial values. |
$metadata() | Return a list of metadata gathered from the CmdStan CSV files. |
$num_chains() | Return the number of MCMC chains. |
$code() | Return Stan code as a character vector. |

Method | Description |
---|---|
$print() | Run posterior::summarise_draws(). |
$summary() | Run posterior::summarise_draws(). |
$diagnostic_summary() | Get summaries of sampler diagnostics and warning messages. |
$cmdstan_summary() | Run and print CmdStan's bin/stansummary. |
$cmdstan_diagnose() | Run and print CmdStan's bin/diagnose. |
$loo() | Run loo::loo.array() for approximate LOO-CV. |

Method | Description |
---|---|
$save_object() | Save fitted model object to a file. |
$save_output_files() | Save output CSV files to a specified location. |
$save_data_file() | Save JSON data file to a specified location. |
$save_latent_dynamics_files() | Save diagnostic CSV files to a specified location. |

Method | Description |
---|---|
$output() | Return the stdout and stderr of all chains or pretty print the output for a single chain. |
$time() | Report total and chain-specific run times. |
$return_codes() | Return the return codes from the CmdStan runs. |

Method | Description |
---|---|
$expose_functions() | Expose Stan functions for use in R. |
$init_model_methods() | Expose methods for log-probability, gradients, parameter constraining and unconstraining. |
$log_prob() | Calculate log-prob. |
$grad_log_prob() | Calculate log-prob and gradient. |
$hessian() | Calculate log-prob, gradient, and hessian. |
$constrain_variables() | Transform a set of unconstrained parameter values to the constrained scale. |
$unconstrain_variables() | Transform a set of parameter values to the unconstrained scale. |
$unconstrain_draws() | Transform all parameter draws to the unconstrained scale. |
$variable_skeleton() | Helper function to re-structure a vector of constrained parameter values. |
Other fitted model objects: CmdStanDiagnose, CmdStanGQ, CmdStanLaplace, CmdStanMLE, CmdStanPathfinder, CmdStanVB
A CmdStanMLE object is the fitted model object returned by the $optimize() method of a CmdStanModel object.
CmdStanMLE objects have the following associated methods, all of which have their own (linked) documentation pages.
Method | Description |
---|---|
$draws() | Return the point estimate as a 1-row draws_matrix. |
$mle() | Return the point estimate as a numeric vector. |
$lp() | Return the total log probability density (target). |
$init() | Return user-specified initial values. |
$metadata() | Return a list of metadata gathered from the CmdStan CSV files. |
$code() | Return Stan code as a character vector. |

Method | Description |
---|---|
$summary() | Run posterior::summarise_draws(). |

Method | Description |
---|---|
$save_object() | Save fitted model object to a file. |
$save_output_files() | Save output CSV files to a specified location. |
$save_data_file() | Save JSON data file to a specified location. |

Method | Description |
---|---|
$time() | Report the total run time. |
$output() | Pretty print the output that was printed to the console. |
$return_codes() | Return the return codes from the CmdStan runs. |

Method | Description |
---|---|
$expose_functions() | Expose Stan functions for use in R. |
$init_model_methods() | Expose methods for log-probability, gradients, parameter constraining and unconstraining. |
$log_prob() | Calculate log-prob. |
$grad_log_prob() | Calculate log-prob and gradient. |
$hessian() | Calculate log-prob, gradient, and hessian. |
$constrain_variables() | Transform a set of unconstrained parameter values to the constrained scale. |
$unconstrain_variables() | Transform a set of parameter values to the unconstrained scale. |
$unconstrain_draws() | Transform all parameter draws to the unconstrained scale. |
$variable_skeleton() | Helper function to re-structure a vector of constrained parameter values. |
Other fitted model objects: CmdStanDiagnose, CmdStanGQ, CmdStanLaplace, CmdStanMCMC, CmdStanPathfinder, CmdStanVB
A CmdStanModel object is an R6 object created by the cmdstan_model() function. The object stores the path to a Stan program and compiled executable (once created), and provides methods for fitting the model using Stan's algorithms.
CmdStanModel objects have the following associated methods, many of which have their own (linked) documentation pages:
Method | Description |
---|---|
$stan_file() | Return the file path to the Stan program. |
$code() | Return Stan program as a character vector. |
$print() | Print readable version of Stan program. |
$check_syntax() | Check Stan syntax without having to compile. |
$format() | Format and canonicalize the Stan model code. |

Method | Description |
---|---|
$compile() | Compile Stan program. |
$exe_file() | Return the file path to the compiled executable. |
$hpp_file() | Return the file path to the .hpp file containing the generated C++ code. |
$save_hpp_file() | Save the .hpp file containing the generated C++ code. |
$expose_functions() | Expose Stan functions for use in R. |

Method | Description |
---|---|
$diagnose() | Run CmdStan's "diagnose" method to test gradients, return CmdStanDiagnose object. |

Method | Description |
---|---|
$sample() | Run CmdStan's "sample" method, return CmdStanMCMC object. |
$sample_mpi() | Run CmdStan's "sample" method with MPI, return CmdStanMCMC object. |
$optimize() | Run CmdStan's "optimize" method, return CmdStanMLE object. |
$variational() | Run CmdStan's "variational" method, return CmdStanVB object. |
$pathfinder() | Run CmdStan's "pathfinder" method, return CmdStanPathfinder object. |
$generate_quantities() | Run CmdStan's "generate quantities" method, return CmdStanGQ object. |
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
## Not run:
library(cmdstanr)
library(posterior)
library(bayesplot)
color_scheme_set("brightblue")

# Set path to CmdStan
# (Note: if you installed CmdStan via install_cmdstan() with default settings
# then setting the path is unnecessary but the default below should still work.
# Otherwise use the `path` argument to specify the location of your
# CmdStan installation.)
set_cmdstan_path(path = NULL)

# Create a CmdStanModel object from a Stan program,
# here using the example model that comes with CmdStan
file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(file)
mod$print()
# Print with line numbers. This can be set globally using the
# `cmdstanr_print_line_numbers` option.
mod$print(line_numbers = TRUE)

# Data as a named list (like RStan)
stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))

# Run MCMC using the 'sample' method
fit_mcmc <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  parallel_chains = 2
)

# Use 'posterior' package for summaries
fit_mcmc$summary()

# Check sampling diagnostics
fit_mcmc$diagnostic_summary()

# Get posterior draws
draws <- fit_mcmc$draws()
print(draws)

# Convert to data frame using posterior::as_draws_df
as_draws_df(draws)

# Plot posterior using bayesplot (ggplot2)
mcmc_hist(fit_mcmc$draws("theta"))

# Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
# and also demonstrate specifying data as a path to a file instead of a list
my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
fit_optim <- mod$optimize(data = my_data_file, seed = 123)
fit_optim$summary()

# Run 'optimize' again with 'jacobian=TRUE' and then draw from Laplace approximation
# to the posterior
fit_optim <- mod$optimize(data = my_data_file, jacobian = TRUE)
fit_laplace <- mod$laplace(data = my_data_file, mode = fit_optim, draws = 2000)
fit_laplace$summary()

# Run 'variational' method to use ADVI to approximate posterior
fit_vb <- mod$variational(data = stan_data, seed = 123)
fit_vb$summary()
mcmc_hist(fit_vb$draws("theta"))

# Run 'pathfinder' method, a new alternative to the variational method
fit_pf <- mod$pathfinder(data = stan_data, seed = 123)
fit_pf$summary()
mcmc_hist(fit_pf$draws("theta"))

# Run 'pathfinder' again with more paths, fewer draws per path,
# better covariance approximation, and fewer LBFGS iterations
fit_pf <- mod$pathfinder(data = stan_data, num_paths = 10, single_path_draws = 40,
                         history_size = 50, max_lbfgs_iters = 100)

# Specifying initial values as a function
fit_mcmc_w_init_fun <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  refresh = 0,
  init = function() list(theta = runif(1))
)
fit_mcmc_w_init_fun_2 <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  refresh = 0,
  init = function(chain_id) {
    # silly but demonstrates optional use of chain_id
    list(theta = 1 / (chain_id + 1))
  }
)
fit_mcmc_w_init_fun_2$init()

# Specifying initial values as a list of lists
fit_mcmc_w_init_list <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  refresh = 0,
  init = list(
    list(theta = 0.75), # chain 1
    list(theta = 0.25)  # chain 2
  )
)
fit_optim_w_init_list <- mod$optimize(
  data = stan_data,
  seed = 123,
  init = list(
    list(theta = 0.75)
  )
)
fit_optim_w_init_list$init()
## End(Not run)
A CmdStanPathfinder object is the fitted model object returned by the $pathfinder() method of a CmdStanModel object.
CmdStanPathfinder objects have the following associated methods, all of which have their own (linked) documentation pages.

Method | Description |
---|---|
$draws() | Return approximate posterior draws as a draws_matrix. |
$lp() | Return the total log probability density (target) computed in the model block of the Stan program. |
$lp_approx() | Return the log density of the approximation to the posterior. |
$init() | Return user-specified initial values. |
$metadata() | Return a list of metadata gathered from the CmdStan CSV files. |
$code() | Return Stan code as a character vector. |

Method | Description |
---|---|
$summary() | Run posterior::summarise_draws(). |
$cmdstan_summary() | Run and print CmdStan's bin/stansummary. |

Method | Description |
---|---|
$save_object() | Save fitted model object to a file. |
$save_output_files() | Save output CSV files to a specified location. |
$save_data_file() | Save JSON data file to a specified location. |
$save_latent_dynamics_files() | Save diagnostic CSV files to a specified location. |

Method | Description |
---|---|
$time() | Report the total run time. |
$output() | Pretty print the output that was printed to the console. |
$return_codes() | Return the return codes from the CmdStan runs. |

The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other fitted model objects: CmdStanDiagnose, CmdStanGQ, CmdStanLaplace, CmdStanMCMC, CmdStanMLE, CmdStanVB
Fit models for use in examples
cmdstanr_example(
  example = c("logistic", "schools", "schools_ncp"),
  method = c("sample", "optimize", "laplace", "variational", "pathfinder", "diagnose"),
  ...,
  quiet = TRUE,
  force_recompile = getOption("cmdstanr_force_recompile", default = FALSE)
)

print_example_program(example = c("logistic", "schools", "schools_ncp"))
example |
(string) The name of the example. The currently available examples are
To print the Stan code for a given |
method |
(string) Which fitting method should be used? The default is
the |
... |
Arguments passed to the chosen |
quiet |
(logical) If |
force_recompile |
Passed to the $compile() method. |
The fitted model object returned by the selected method.
## Not run:
print_example_program("logistic")
fit_logistic_mcmc <- cmdstanr_example("logistic", chains = 2)
fit_logistic_mcmc$summary()

fit_logistic_optim <- cmdstanr_example("logistic", method = "optimize")
fit_logistic_optim$summary()

fit_logistic_vb <- cmdstanr_example("logistic", method = "variational")
fit_logistic_vb$summary()

print_example_program("schools")
fit_schools_mcmc <- cmdstanr_example("schools")
fit_schools_mcmc$summary()

print_example_program("schools_ncp")
fit_schools_ncp_mcmc <- cmdstanr_example("schools_ncp")
fit_schools_ncp_mcmc$summary()

# optimization fails for hierarchical model
cmdstanr_example("schools", "optimize", quiet = FALSE)
## End(Not run)
These options can be set via options() for an entire R session.
cmdstanr_draws_format: Which format provided by the posterior package should be used when returning the posterior or approximate posterior draws? The default depends on the model fitting method. See draws for more details.
cmdstanr_force_recompile: Should the default be to recompile models even if there were no Stan code changes since last compiled? See compile for more details. The default is FALSE.
cmdstanr_max_rows: The maximum number of rows of output to print when using the $print() method. The default is 10.
cmdstanr_print_line_numbers: Should line numbers be included when printing a Stan program? The default is FALSE.
cmdstanr_no_ver_check: Should the check for a more recent version of CmdStan be disabled? The default is FALSE.
cmdstanr_output_dir: The directory where CmdStan should write its output CSV files when fitting models. The default is a temporary directory. Files in a temporary directory are removed as part of R garbage collection, while files in an explicitly defined directory are not automatically deleted.
cmdstanr_verbose: Should more information be printed when compiling or running models, including showing how CmdStan was called internally? The default is FALSE.
cmdstanr_warn_inits: Should a warning be thrown if initial values are only provided for a subset of parameters? The default is TRUE.
cmdstanr_write_stan_file_dir: The directory where write_stan_file() should write Stan files. The default is a temporary directory. Files in a temporary directory are removed as part of R garbage collection, while files in an explicitly defined directory are not automatically deleted.
mc.cores: The number of cores to use for various parallelization tasks (e.g. running MCMC chains, installing CmdStan). The default depends on the use case and is documented with the methods that make use of mc.cores.
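As a minimal sketch, the options listed above can be set with base R's `options()` and read back with `getOption()`. The values used here (20 rows, 2 cores, a `stan-output` subdirectory of the session's temporary directory) are illustrative assumptions only, not recommended settings.

```r
# Illustrative values only; choose settings that suit your own workflow.
out_dir <- file.path(tempdir(), "stan-output")  # hypothetical directory name
dir.create(out_dir, showWarnings = FALSE)

# options() returns the previous values, so they can be restored later
old <- options(
  cmdstanr_max_rows = 20,              # print up to 20 rows with $print()
  cmdstanr_print_line_numbers = TRUE,  # include line numbers when printing Stan code
  cmdstanr_output_dir = out_dir,       # keep CmdStan CSV files in out_dir
  mc.cores = 2                         # cores for parallel tasks
)

getOption("cmdstanr_max_rows")

# Restore the previous values when done
options(old)
```

Saving the return value of `options()` and passing it back afterwards keeps the change scoped to the code that needs it.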
A CmdStanVB object is the fitted model object returned by the $variational() method of a CmdStanModel object.
CmdStanVB objects have the following associated methods, all of which have their own (linked) documentation pages.

Method | Description |
---|---|
$draws() | Return approximate posterior draws as a draws_matrix. |
$lp() | Return the total log probability density (target) computed in the model block of the Stan program. |
$lp_approx() | Return the log density of the variational approximation to the posterior. |
$init() | Return user-specified initial values. |
$metadata() | Return a list of metadata gathered from the CmdStan CSV files. |
$code() | Return Stan code as a character vector. |

Method | Description |
---|---|
$summary() | Run posterior::summarise_draws(). |
$cmdstan_summary() | Run and print CmdStan's bin/stansummary. |

Method | Description |
---|---|
$save_object() | Save fitted model object to a file. |
$save_output_files() | Save output CSV files to a specified location. |
$save_data_file() | Save JSON data file to a specified location. |
$save_latent_dynamics_files() | Save diagnostic CSV files to a specified location. |

Method | Description |
---|---|
$time() | Report the total run time. |
$output() | Pretty print the output that was printed to the console. |
$return_codes() | Return the return codes from the CmdStan runs. |

Method | Description |
---|---|
$expose_functions() | Expose Stan functions for use in R. |
$init_model_methods() | Expose methods for log-probability, gradients, parameter constraining and unconstraining. |
$log_prob() | Calculate log-prob. |
$grad_log_prob() | Calculate log-prob and gradient. |
$hessian() | Calculate log-prob, gradient, and hessian. |
$constrain_variables() | Transform a set of unconstrained parameter values to the constrained scale. |
$unconstrain_variables() | Transform a set of parameter values to the unconstrained scale. |
$unconstrain_draws() | Transform all parameter draws to the unconstrained scale. |
$variable_skeleton() | Helper function to re-structure a vector of constrained parameter values. |

The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other fitted model objects: CmdStanDiagnose, CmdStanGQ, CmdStanLaplace, CmdStanMCMC, CmdStanMLE, CmdStanPathfinder
Write posterior draws objects to CSV files suitable for running standalone generated quantities with CmdStan.
draws_to_csv(
  draws,
  sampler_diagnostics = NULL,
  dir = tempdir(),
  basename = "fittedParams"
)
draws |
A |
sampler_diagnostics |
Either |
dir |
(string) An optional path to the directory where the CSV files will be written. If not set, temporary directory is used. |
basename |
(string) If |
draws_to_csv() generates a CSV suitable for running standalone generated quantities with CmdStan. The CSV file contains a single comment, #num_samples, which equals the number of iterations in the supplied draws object. The comment is followed by the column names. The first column is the lp__ value, followed by the sampler diagnostics and finally the other variables of the draws object. If the draws object does not contain the lp__ or sampler diagnostics variables, columns of zeros are created to conform with the requirements of CmdStan's standalone generated quantities method. The column names line is followed by the values of the draws, in the same order as the column names.
Paths to CSV files (one per chain).
## Not run:
draws <- posterior::example_draws()
draws_csv_files <- draws_to_csv(draws)
print(draws_csv_files)

# draws_csv_files <- draws_to_csv(draws,
#                                 sampler_diagnostics = sampler_diagnostics,
#                                 dir = "~/my_folder",
#                                 basename = "my-samples")
## End(Not run)
This provides a knitr engine for Stan, suitable for usage when attempting to render Stan chunks and compile the model code within to an executable with CmdStan. Use register_knitr_engine() to make this the default engine for stan chunks. See the vignette R Markdown CmdStan Engine for an example.
eng_cmdstan(options)
options |
(named list) Chunk options, as provided by |
## Not run:
knitr::knit_engines$set(stan = cmdstanr::eng_cmdstan)
## End(Not run)
Run CmdStan's stansummary and diagnose utilities
Run CmdStan's stansummary and diagnose utilities. These are documented in the CmdStan Guide:
https://mc-stan.org/docs/cmdstan-guide/stansummary.html
https://mc-stan.org/docs/cmdstan-guide/diagnose.html
Although these methods can be used for models fit using the $variational() method, much of the output is currently only relevant for models fit using the $sample() method.
See the $summary() method for computing similar summaries in R rather than calling CmdStan's utilities.
cmdstan_summary(flags = NULL)

cmdstan_diagnose()
flags |
An optional character vector of flags (e.g.
|
CmdStanMCMC, fit-method-summary
## Not run:
fit <- cmdstanr_example("logistic")
fit$cmdstan_diagnose()
fit$cmdstan_summary()
## End(Not run)
Return Stan code
code()
A character vector with one element per line of code.
CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanGQ
## Not run:
fit <- cmdstanr_example()
fit$code() # character vector
cat(fit$code(), sep = "\n") # pretty print
## End(Not run)
The $constrain_variables() method transforms input parameters to the constrained scale.
constrain_variables(
  unconstrained_variables,
  transformed_parameters = TRUE,
  generated_quantities = TRUE
)
unconstrained_variables | (numeric) A vector of unconstrained parameters to constrain. |
transformed_parameters | (logical) Whether to return transformed parameters implied by newly-constrained parameters (defaults to TRUE). |
generated_quantities | (logical) Whether to return generated quantities implied by newly-constrained parameters (defaults to TRUE). |
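
To illustrate what moving to the constrained scale means, the base-R sketch below applies the transforms Stan documents for two common constraints: `exp()` for a lower bound of zero and the inverse logit for the unit interval. The parameter names `sigma` and `theta` and the helper `constrain_sketch()` are hypothetical. This is not how `$constrain_variables()` is implemented; the method itself works from the constraints declared in your Stan program.

```r
# Hypothetical model with two parameters:
#   real<lower=0> sigma;           sigma = exp(u[1])
#   real<lower=0, upper=1> theta;  theta = inverse logit of u[2]
constrain_sketch <- function(u) {
  list(
    sigma = exp(u[1]),
    theta = plogis(u[2])  # plogis() is the inverse logit in base R
  )
}

# An unconstrained vector of zeros maps to sigma = 1 and theta = 0.5
constrained <- constrain_sketch(c(0, 0))
str(constrained)
```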
log_prob(), grad_log_prob(), constrain_variables(), unconstrain_variables(), unconstrain_draws(), variable_skeleton(), hessian()
## Not run:
fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)
fit_mcmc$constrain_variables(unconstrained_variables = c(0.5, 1.2, 1.1, 2.2))
## End(Not run)
Warnings and summaries of sampler diagnostics. To instead get the underlying values of the sampler diagnostics for each iteration and chain use the $sampler_diagnostics() method.
Currently parameter-specific diagnostics like R-hat and effective sample size are not handled by this method. Those diagnostics are provided via the $summary() method (using posterior::summarize_draws()).
diagnostic_summary(
  diagnostics = c("divergences", "treedepth", "ebfmi"),
  quiet = FALSE
)
diagnostics |
(character vector) One or more diagnostics to check. The
currently supported diagnostics are |
quiet |
(logical) Should warning messages about the diagnostics be
suppressed? The default is |
A list with as many named elements as diagnostics selected. The possible elements and their values are:
"num_divergent": A vector of the number of divergences per chain.
"num_max_treedepth": A vector of the number of times max_treedepth was hit per chain.
"ebfmi": A vector of E-BFMI values per chain.
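Because the return value is a plain named list of per-chain vectors, it can be checked programmatically. The sketch below uses made-up numbers in place of a real `fit$diagnostic_summary()` result, and the E-BFMI cutoff `ebfmi_threshold` is an assumed value chosen for illustration.

```r
# Hypothetical values standing in for the result of fit$diagnostic_summary()
diag <- list(
  num_divergent = c(0, 3),
  num_max_treedepth = c(0, 0),
  ebfmi = c(0.95, 0.18)
)

ebfmi_threshold <- 0.2  # assumed cutoff for illustration; adjust as needed

# One logical flag per type of problem
problems <- c(
  divergences = any(diag$num_divergent > 0),
  treedepth = any(diag$num_max_treedepth > 0),
  low_ebfmi = any(diag$ebfmi < ebfmi_threshold)
)
print(problems)

# Indices of chains that triggered at least one check
flagged_chains <- which(
  diag$num_divergent > 0 |
    diag$num_max_treedepth > 0 |
    diag$ebfmi < ebfmi_threshold
)
print(flagged_chains)
```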
CmdStanMCMC and the $sampler_diagnostics() method
## Not run:
fit <- cmdstanr_example("schools")
fit$diagnostic_summary()
fit$diagnostic_summary(quiet = TRUE)
## End(Not run)
Extract posterior draws after MCMC or approximate posterior draws after variational approximation using formats provided by the posterior package.
The variables include the parameters, transformed parameters, and generated quantities from the Stan program as well as lp__, the total log probability (target) accumulated in the model block.
draws(
  variables = NULL,
  inc_warmup = FALSE,
  format = getOption("cmdstanr_draws_format")
)
variables |
(character vector) Optionally, the names of the variables (parameters, transformed parameters, and generated quantities) to read in.
|
inc_warmup |
(logical) Should warmup draws be included? Defaults to
|
format |
(string) The format of the returned draws or point estimates. Must be a valid format from the posterior package. The defaults are the following.
To use a different format it can be specified as the full name of the
format from the posterior package (e.g. Changing the default format: To change the default format for an entire
R session use Note about efficiency: For models with a large number of parameters
(20k+) we recommend using the |
Depends on the value of format. The defaults are:
For MCMC, a 3-D draws_array object (iteration x chain x variable).
For standalone generated quantities, a 3-D draws_array object (iteration x chain x variable).
For variational inference, a 2-D draws_matrix object (draw x variable) because there are no chains. An additional variable lp_approx__ is also included, which is the log density of the variational approximation to the posterior evaluated at each of the draws.
For optimization, a 1-row draws_matrix with one column per variable. These are not actually draws, just point estimates stored in the draws_matrix format. See $mle() to extract them as a numeric vector.
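The difference between the 3-D (iteration x chain x variable) and 2-D (draw x variable) layouts can be sketched with plain base-R arrays. The dimensions and variable names below are made up, and the objects are ordinary arrays rather than posterior's `draws_array` and `draws_matrix` classes; in practice the posterior package's own converters should be used.

```r
set.seed(1)
n_iter <- 5
n_chain <- 2
var_names <- c("alpha", "beta[1]", "beta[2]")  # hypothetical variables
n_var <- length(var_names)

# 3-D layout: iteration x chain x variable
arr <- array(
  rnorm(n_iter * n_chain * n_var),
  dim = c(n_iter, n_chain, n_var),
  dimnames = list(iteration = NULL, chain = NULL, variable = var_names)
)
dim(arr)

# 2-D layout: stack the chains so that each row is one draw
mat <- matrix(
  as.vector(arr),
  nrow = n_iter * n_chain,
  ncol = n_var,
  dimnames = list(draw = NULL, variable = var_names)
)
dim(mat)

# Row 6 of the matrix holds iteration 1 of chain 2
all.equal(unname(mat[6, ]), unname(arr[1, 2, ]))
```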
CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanGQ
## Not run:
# logistic regression with intercept alpha and coefficients beta
fit <- cmdstanr_example("logistic", method = "sample")

# returned as 3-D array (see ?posterior::draws_array)
draws <- fit$draws()
dim(draws)
str(draws)

# can easily convert to other formats (data frame, matrix, list)
# using the posterior package
head(posterior::as_draws_matrix(draws))

# or can specify 'format' argument to avoid manual conversion
# matrix format combines all chains
draws <- fit$draws(format = "matrix")
head(draws)

# can select specific parameters
fit$draws("alpha")
fit$draws("beta") # selects entire vector beta
fit$draws(c("alpha", "beta[2]"))

# can be passed directly to bayesplot plotting functions
bayesplot::color_scheme_set("brightblue")
bayesplot::mcmc_dens(fit$draws(c("alpha", "beta")))
bayesplot::mcmc_scatter(fit$draws(c("beta[1]", "beta[2]")), alpha = 0.3)

# example using variational inference
fit <- cmdstanr_example("logistic", method = "variational")
head(fit$draws("beta")) # a matrix by default
head(fit$draws("beta", format = "df"))
## End(Not run)
The $grad_log_prob() method provides access to the Stan model's log_prob function and its derivative.
grad_log_prob(
  unconstrained_variables,
  jacobian = TRUE,
  jacobian_adjustment = NULL
)
unconstrained_variables | (numeric) A vector of unconstrained parameters. |
jacobian | (logical) Whether to include the log-density adjustments from un/constraining variables. |
jacobian_adjustment | Deprecated. Please use |
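
As a worked illustration of a log density and its gradient on the unconstrained scale, the base-R sketch below uses a hypothetical Bernoulli model with a flat prior on `theta` in (0, 1), where `u = logit(theta)`. The functions `log_prob_sketch()` and `grad_sketch()` are assumptions written for this example and are not part of CmdStanR. The values need not match what `$grad_log_prob()` returns for a real model, which is computed by Stan's automatic differentiation.

```r
s <- 2   # hypothetical data: number of successes
n <- 10  # hypothetical data: number of trials

# Log density as a function of the unconstrained value u
log_prob_sketch <- function(u, jacobian = TRUE) {
  theta <- plogis(u)
  lp <- s * log(theta) + (n - s) * log(1 - theta)
  # log|d theta / d u| = log(theta) + log(1 - theta)
  if (jacobian) lp <- lp + log(theta) + log(1 - theta)
  lp
}

# Analytic derivative of log_prob_sketch() with respect to u
grad_sketch <- function(u, jacobian = TRUE) {
  theta <- plogis(u)
  g <- s - n * theta
  if (jacobian) g <- g + (1 - 2 * theta)
  g
}

# Check the analytic gradient against a central finite difference
u <- 0.5
h <- 1e-6
fd <- (log_prob_sketch(u + h) - log_prob_sketch(u - h)) / (2 * h)
c(analytic = grad_sketch(u), finite_difference = fd)
```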
log_prob(), grad_log_prob(), constrain_variables(), unconstrain_variables(), unconstrain_draws(), variable_skeleton(), hessian()
## Not run:
fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)
fit_mcmc$grad_log_prob(unconstrained_variables = c(0.5, 1.2, 1.1, 2.2))
## End(Not run)
Return the data frame containing the gradients for all parameters.
gradients()
A list of lists. See Examples.
## Not run:
test <- cmdstanr_example("logistic", method = "diagnose")

# retrieve the gradients
test$gradients()
## End(Not run)
The $hessian() method provides access to the Stan model's log_prob, its derivative, and its hessian.
hessian(unconstrained_variables, jacobian = TRUE, jacobian_adjustment = NULL)
unconstrained_variables | (numeric) A vector of unconstrained parameters. |
jacobian | (logical) Whether to include the log-density adjustments from un/constraining variables. |
jacobian_adjustment | Deprecated. Please use |
log_prob(), grad_log_prob(), constrain_variables(), unconstrain_variables(), unconstrain_draws(), variable_skeleton(), hessian()
## Not run:
fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)
# fit_mcmc$init_model_methods(hessian = TRUE)
# fit_mcmc$hessian(unconstrained_variables = c(0.5, 1.2, 1.1, 2.2))
## End(Not run)
Return user-specified initial values. If the user provided initial values files or R objects (list of lists or function) via the init argument when fitting the model then these are returned (always in the list of lists format). Currently it is not possible to extract initial values generated automatically by CmdStan, although CmdStan may support this in the future.
init()
A list of lists. See Examples.
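The list-of-lists shape can be previewed in base R without fitting a model. The sketch below assumes a single parameter named `theta` and two chains, and simply evaluates an init function once per chain. It mimics the shape of the result only and is not CmdStanR's internal code.

```r
n_chains <- 2  # assumed number of chains

# An init function without arguments, evaluated once per chain
init_fun <- function() list(theta = runif(1))
set.seed(123)
inits_a <- lapply(seq_len(n_chains), function(i) init_fun())

# An init function that takes the chain id
init_fun_id <- function(chain_id) list(theta = 1 / (chain_id + 1))
inits_b <- lapply(seq_len(n_chains), init_fun_id)

# One inner list per chain, each holding named parameter values
str(inits_b)
```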
CmdStanMCMC, CmdStanMLE, CmdStanVB
## Not run:
init_fun <- function() list(alpha = rnorm(1), beta = rnorm(3))
fit <- cmdstanr_example("logistic", init = init_fun, chains = 2)
str(fit$init())

# partial inits (only specifying for a subset of parameters)
init_list <- list(
  list(mu = 10, tau = 2),
  list(mu = -10, tau = 1)
)
fit <- cmdstanr_example("schools_ncp", init = init_list, chains = 2, adapt_delta = 0.9)

# only user-specified inits returned
str(fit$init())
## End(Not run)
The $init_model_methods() method compiles and initializes the log_prob, grad_log_prob, constrain_variables, unconstrain_variables and unconstrain_draws functions. These are then available as methods of the fitted model object. This requires the additional Rcpp package, which is not required for fitting models using CmdStanR.
Note: many compiler warnings may be emitted during compilation, but these can be ignored so long as they are warnings and not errors.
init_model_methods(seed = 1, verbose = FALSE, hessian = FALSE)
seed | (integer) The random seed to use when initializing the model. |
verbose | (logical) Whether to show verbose logging during compilation. |
hessian | (logical) Whether to expose the (experimental) hessian method. |
log_prob(), grad_log_prob(), constrain_variables(), unconstrain_variables(), unconstrain_draws(), variable_skeleton(), hessian()
## Not run:
fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)
## End(Not run)
Extract the inverse metric (mass matrix) for each MCMC chain.
inv_metric(matrix = TRUE)
matrix |
(logical) If a diagonal metric was used, setting |
A list of length equal to the number of MCMC chains. See the matrix argument for details.
## Not run:
fit <- cmdstanr_example("logistic")
fit$inv_metric()
fit$inv_metric(matrix=FALSE)

fit <- cmdstanr_example("logistic", metric = "dense_e")
fit$inv_metric()
## End(Not run)
The $log_prob()
method provides access to the Stan model's
log_prob
function.
log_prob(unconstrained_variables, jacobian = TRUE, jacobian_adjustment = NULL)
unconstrained_variables |
(numeric) A vector of unconstrained parameters. |
jacobian |
(logical) Whether to include the log-density adjustments from un/constraining variables. |
jacobian_adjustment |
Deprecated. Please use |
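To illustrate what the jacobian argument toggles, here is a minimal base-R sketch. It is not the CmdStanR implementation. It assumes a hypothetical one-parameter model (theta constrained to (0, 1), a Bernoulli likelihood and a flat prior) and hand-codes the logit transform that Stan applies to interval-constrained parameters. The names log_prob_sketch, y and u are invented for this example.

```r
# Hypothetical data: 10 Bernoulli outcomes.
y <- c(1, 1, 0, 0, 0, 1, 0, 1, 0, 0)

# u is theta on the unconstrained (logit) scale.
log_prob_sketch <- function(u, jacobian = TRUE) {
  theta <- plogis(u)  # map back to the constrained scale (inverse logit)
  lp <- sum(dbinom(y, size = 1, prob = theta, log = TRUE))
  if (jacobian) {
    # log |d theta / d u| for the inverse-logit transform
    lp <- lp + log(theta) + log1p(-theta)
  }
  lp
}

log_prob_sketch(0.5, jacobian = TRUE)
log_prob_sketch(0.5, jacobian = FALSE)
```

The two calls differ only by the log-Jacobian term, which is the adjustment that arises from un/constraining the variable.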
log_prob()
, grad_log_prob()
, constrain_variables()
,
unconstrain_variables()
, unconstrain_draws()
, variable_skeleton()
,
hessian()
## Not run:
fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)
fit_mcmc$log_prob(unconstrained_variables = c(0.5, 1.2, 1.1, 2.2))
## End(Not run)
The $loo()
method computes approximate LOO-CV using the
loo package. In order to use this method you must compute and save
the pointwise log-likelihood in your Stan program. See loo::loo.array()
and the loo package vignettes
for details.
loo(variables = "log_lik", r_eff = TRUE, moment_match = FALSE, ...)
variables |
(character vector) The name(s) of the variable(s) in the
Stan program containing the pointwise log-likelihood. The default is to
look for |
r_eff |
(multiple options) How to handle the
|
moment_match |
(logical) Whether to use a
moment-matching correction for problematic
observations. The default is |
... |
Other arguments (e.g., |
The object returned by loo::loo.array()
or
loo::loo_moment_match.default()
.
The loo package website with documentation and vignettes.
## Not run:
# the "logistic" example model has "log_lik" in generated quantities
fit <- cmdstanr_example("logistic")
loo_result <- fit$loo(cores = 2)
print(loo_result)
## End(Not run)
The $lp()
method extracts lp__
, the total log probability
(target
) accumulated in the model block of the Stan program. For
variational inference the log density of the variational approximation to
the posterior is available via the $lp_approx()
method. For
Laplace approximation the unnormalized density of the approximation to
the posterior is available via the $lp_approx()
method.
See the Increment log density and Distribution Statements sections of the Stan Reference Manual for details on when normalizing constants are dropped from log probability calculations.
lp()
lp_approx()
A numeric vector with length equal to the number of (post-warmup)
draws or length equal to 1
for optimization.
lp__
is the unnormalized log density on Stan's unconstrained space.
This will in general be different than the unnormalized model log density
evaluated at a posterior draw (which is on the constrained space). lp__
is
intended to diagnose sampling efficiency and evaluate approximations.
For variational inference lp_approx__
is the log density of the variational
approximation to lp__
(also on the unconstrained space). It is exposed in
the variational method for performing the checks described in Yao et al.
(2018) and implemented in the loo package.
For Laplace approximation lp_approx__
is the unnormalized density of the
Laplace approximation. It can be used to perform the same checks as in the
case of the variational method described in Yao et al. (2018).
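As a rough illustration of how lp__ and lp_approx__ combine, the base-R sketch below forms log importance ratios and an importance-sampling effective sample size. The numbers are synthetic stand-ins generated with rnorm(), not output from a real fit, and the variable names are invented here. The Pareto k diagnostic from Yao et al. (2018) is computed by the loo package (for example via loo::psis() on the log ratios) and is not reproduced in this sketch.

```r
set.seed(1)
# Synthetic stand-ins for fit_vb$lp() and fit_vb$lp_approx(); not real output.
lp        <- rnorm(1000, mean = -65, sd = 1.0)
lp_approx <- rnorm(1000, mean = -2,  sd = 0.8)

# Log importance ratios of the target density to the approximation
# (known only up to an additive constant).
log_ratios <- lp - lp_approx

# Self-normalised importance weights, computed stably by subtracting the max.
w <- exp(log_ratios - max(log_ratios))
w <- w / sum(w)

# Importance-sampling effective sample size: near length(w) when the
# approximation matches the target well, much smaller when it does not.
ess <- 1 / sum(w^2)
ess
```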
Yao, Y., Vehtari, A., Simpson, D., and Gelman, A. (2018). Yes, but did it work?: Evaluating variational inference. Proceedings of the 35th International Conference on Machine Learning, PMLR 80:5581–5590.
CmdStanMCMC
, CmdStanMLE
, CmdStanLaplace
, CmdStanVB
## Not run:
fit_mcmc <- cmdstanr_example("logistic")
head(fit_mcmc$lp())

fit_mle <- cmdstanr_example("logistic", method = "optimize")
fit_mle$lp()

fit_vb <- cmdstanr_example("logistic", method = "variational")
plot(fit_vb$lp(), fit_vb$lp_approx())
## End(Not run)
The $metadata()
method returns a list of information gathered
from the CSV output files, including the CmdStan configuration used when
fitting the model. See Examples and read_cmdstan_csv()
.
metadata()
CmdStanMCMC
, CmdStanMLE
, CmdStanVB
, CmdStanGQ
## Not run:
fit_mcmc <- cmdstanr_example("logistic", method = "sample")
str(fit_mcmc$metadata())

fit_mle <- cmdstanr_example("logistic", method = "optimize")
str(fit_mle$metadata())

fit_vb <- cmdstanr_example("logistic", method = "variational")
str(fit_vb$metadata())
## End(Not run)
The $mle()
method is only available for CmdStanMLE
objects.
It returns the penalized maximum likelihood estimate (posterior mode) as a
numeric vector with one element per variable. The returned vector does not
include lp__
, the total log probability (target
) accumulated in the
model block of the Stan program, which is available via the
$lp()
method and also included in the
$draws()
method.
mle(variables = NULL)
variables |
(character vector) The variables (parameters, transformed parameters, and generated quantities) to include. If NULL (the default) then all variables are included. |
A numeric vector. See Examples.
## Not run:
fit <- cmdstanr_example("logistic", method = "optimize")
fit$mle("alpha")
fit$mle("beta")
fit$mle("beta[2]")
## End(Not run)
The $num_chains()
method returns the number of MCMC chains.
num_chains()
An integer.
## Not run:
fit_mcmc <- cmdstanr_example(chains = 2)
fit_mcmc$num_chains()
## End(Not run)
For MCMC, the $output()
method returns the stdout and stderr
of all chains as a list of character vectors if id=NULL
. If the id
argument is specified it instead pretty prints the console output for a
single chain.
For optimization and variational inference $output()
just pretty prints
the console output.
output(id = NULL)
id |
(integer) The chain id. Ignored if the model was not fit using MCMC. |
CmdStanMCMC
, CmdStanMLE
, CmdStanVB
, CmdStanGQ
## Not run:
fit_mcmc <- cmdstanr_example("logistic", method = "sample")
fit_mcmc$output(1)
out <- fit_mcmc$output()
str(out)

fit_mle <- cmdstanr_example("logistic", method = "optimize")
fit_mle$output()

fit_vb <- cmdstanr_example("logistic", method = "variational")
fit_vb$output()
## End(Not run)
The $profiles()
method returns a list of data frames with
profiling data if any profiling data was written to the profile CSV files.
See save_profile_files()
to control where the files are saved.
Support for profiling Stan programs is available with CmdStan >= 2.26 and requires adding profiling statements to the Stan program.
profiles()
A list of data frames with profiling data if the profiling CSV files were created.
CmdStanMCMC
, CmdStanMLE
, CmdStanVB
, CmdStanGQ
## Not run:
# first fit a model using MCMC
mcmc_program <- write_stan_file(
  'data {
    int<lower=0> N;
    array[N] int<lower=0,upper=1> y;
  }
  parameters {
    real<lower=0,upper=1> theta;
  }
  model {
    profile("likelihood") {
      y ~ bernoulli(theta);
    }
  }
  generated quantities {
    array[N] int y_rep;
    profile("gq") {
      y_rep = bernoulli_rng(rep_vector(theta, N));
    }
  }
'
)
mod_mcmc <- cmdstan_model(mcmc_program)

data <- list(N = 10, y = c(1,1,0,0,0,1,0,1,0,0))
fit <- mod_mcmc$sample(data = data, seed = 123, refresh = 0)

fit$profiles()
## End(Not run)
The $return_codes()
method returns a vector of return codes
from the CmdStan run(s). A return code of 0 indicates a successful run.
return_codes()
An integer vector of return codes with length equal to the number of CmdStan runs (number of chains for MCMC and one otherwise).
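A typical use of the returned vector is to check whether any run failed before working with the results. The sketch below uses a hand-written stand-in for fit$return_codes(); the value 70 is an arbitrary non-zero code chosen for illustration, not a documented CmdStan code.

```r
# Stand-in for fit$return_codes() from a hypothetical 4-chain run.
codes <- c(0L, 0L, 70L, 0L)

# Positions of the runs that did not finish with return code 0.
failed <- which(codes != 0)
if (length(failed) > 0) {
  message("Runs with non-zero return codes: ", paste(failed, collapse = ", "))
}

all_ok <- all(codes == 0)
```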
CmdStanMCMC
, CmdStanMLE
, CmdStanVB
, CmdStanGQ
## Not run:
# example with return codes all zero
fit_mcmc <- cmdstanr_example("schools", method = "sample")
fit_mcmc$return_codes() # should be all zero

# example of non-zero return code (optimization fails for hierarchical model)
fit_opt <- cmdstanr_example("schools", method = "optimize")
fit_opt$return_codes() # should be non-zero
## End(Not run)
Extract the values of sampler diagnostics for each iteration and
chain of MCMC. To instead get summaries of these diagnostics and associated
warning messages use the
$diagnostic_summary()
method.
sampler_diagnostics(
  inc_warmup = FALSE,
  format = getOption("cmdstanr_draws_format", "draws_array")
)
inc_warmup |
(logical) Should warmup draws be included? Defaults to |
format |
(string) The draws format to return. See draws for details. |
Depends on format
, but the default is a 3-D
draws_array
object (iteration x chain x
variable). The variables for Stan's default MCMC algorithm are
"accept_stat__"
, "stepsize__"
, "treedepth__"
, "n_leapfrog__"
,
"divergent__"
, "energy__"
.
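To make the iteration x chain x variable layout concrete, the following sketch builds a plain base-R array with the same dimension order and summarises two of the diagnostics. The array is a hand-made stand-in with invented values, not a real draws_array, and a real draws_array object may behave differently under subsetting.

```r
# Stand-in for fit$sampler_diagnostics(): 100 iterations x 4 chains x 6 variables.
vars <- c("accept_stat__", "stepsize__", "treedepth__",
          "n_leapfrog__", "divergent__", "energy__")
diag_arr <- array(
  0,
  dim = c(100, 4, length(vars)),
  dimnames = list(iteration = NULL, chain = as.character(1:4), variable = vars)
)
diag_arr[c(10, 57), 2, "divergent__"] <- 1  # pretend chain 2 had two divergences
diag_arr[, , "treedepth__"] <- 3            # pretend every iteration reached depth 3

# Divergent transitions per chain (sum over the iteration dimension).
divergences_per_chain <- colSums(diag_arr[, , "divergent__"])
# Largest tree depth reached in any chain.
max_treedepth <- max(diag_arr[, , "treedepth__"])
```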
## Not run:
fit <- cmdstanr_example("logistic")
sampler_diagnostics <- fit$sampler_diagnostics()
str(sampler_diagnostics)

library(posterior)
as_draws_df(sampler_diagnostics)

# or specify format to get a data frame instead of calling as_draws_df
fit$sampler_diagnostics(format = "df")
## End(Not run)
This method is a wrapper around base::saveRDS()
that ensures
that all posterior draws and diagnostics are saved when saving a fitted
model object. Because the contents of the CmdStan output CSV files are only
read into R lazily (i.e., as needed), the $save_object()
method is the
safest way to guarantee that everything has been read in before saving.
See the "Saving fitted model objects" section of the Getting started with CmdStanR vignette for some suggestions on faster model saving for large models.
save_object(file, ...)
file |
(string) Path where the file should be saved. |
... |
Other arguments to pass to |
CmdStanMCMC
, CmdStanMLE
, CmdStanVB
, CmdStanGQ
## Not run:
fit <- cmdstanr_example("logistic")

temp_rds_file <- tempfile(fileext = ".RDS")
fit$save_object(file = temp_rds_file)
rm(fit)

fit <- readRDS(temp_rds_file)
fit$summary()
## End(Not run)
All fitted model objects have methods for saving (moving to a specified location) the files created by CmdStanR to hold CmdStan output CSV files and input data files. These methods move the files from their current location (possibly the temporary directory) to a user-specified location. The paths stored in the fitted model object will also be updated to point to the new file locations.
The versions without the save_
prefix (e.g., $output_files()
) return
the current file paths without moving any files.
save_output_files(dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)

save_latent_dynamics_files(
  dir = ".",
  basename = NULL,
  timestamp = TRUE,
  random = TRUE
)

save_profile_files(dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)

save_data_file(dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)

save_config_files(dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)

save_metric_files(dir = ".", basename = NULL, timestamp = TRUE, random = TRUE)

output_files(include_failed = FALSE)

profile_files(include_failed = FALSE)

latent_dynamics_files(include_failed = FALSE)

data_file()

config_files(include_failed = FALSE)

metric_files(include_failed = FALSE)
dir |
(string) Path to directory where the files should be saved. |
basename |
(string) Base filename to use. See Details. |
timestamp |
(logical) Should a timestamp be added to the file name(s)?
Defaults to |
random |
(logical) Should random alphanumeric characters be added to the
end of the file name(s)? Defaults to |
include_failed |
(logical) Should CmdStan runs that failed also be
included? The default is |
The $save_*
methods print a message with the new file paths and (invisibly)
return a character vector of the new paths (or NA
for any that couldn't be
copied). They also have the side effect of setting the internal paths in the
fitted model object to the new paths.
The methods without the save_
prefix return character vectors of file
paths without moving any files.
For $save_output_files() the files moved to dir will have names of the form basename-timestamp-id-random, where
basename is the user's provided basename argument;
timestamp is of the form format(Sys.time(), "%Y%m%d%H%M");
id is the MCMC chain id (or 1 for non MCMC);
random contains six random alphanumeric characters.
For $save_latent_dynamics_files()
everything is the same as for
$save_output_files()
except "-diagnostic-"
is included in the new
file name after basename
.
For $save_profile_files()
everything is the same as for
$save_output_files()
except "-profile-"
is included in the new
file name after basename
.
For $save_metric_files()
everything is the same as for
$save_output_files()
except "-metric-"
is included in the new
file name after basename
.
For $save_config_files()
everything is the same as for
$save_output_files()
except "-config-"
is included in the new
file name after basename
.
For $save_data_file()
no id
is included in the file name because even
with multiple MCMC chains the data file is the same.
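The naming scheme described above can be sketched in base R as follows. The helper sketch_output_name() is hypothetical and only mirrors the documented basename-timestamp-id-random pattern for output files. It does not reproduce CmdStanR's internal code, the infixes such as "-profile-", or any file extension.

```r
# Hypothetical helper mirroring the documented pattern basename-timestamp-id-random.
sketch_output_name <- function(basename, id, timestamp = TRUE, random = TRUE) {
  parts <- basename
  if (timestamp) {
    parts <- c(parts, format(Sys.time(), "%Y%m%d%H%M"))
  }
  parts <- c(parts, id)
  if (random) {
    # six random alphanumeric characters
    chars <- sample(c(letters, LETTERS, 0:9), size = 6, replace = TRUE)
    parts <- c(parts, paste(chars, collapse = ""))
  }
  paste(parts, collapse = "-")
}

sketch_output_name("banana", id = 1)
sketch_output_name("lettuce", id = 2, timestamp = FALSE, random = FALSE)
```

With timestamp = FALSE and random = FALSE only the basename and id remain, so saving twice to the same directory with the same basename would produce identical names.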
CmdStanMCMC
, CmdStanMLE
, CmdStanVB
, CmdStanGQ
## Not run:
fit <- cmdstanr_example()
fit$output_files()
fit$data_file()

# just using tempdir for the example
my_dir <- tempdir()
fit$save_output_files(dir = my_dir, basename = "banana")
fit$save_output_files(dir = my_dir, basename = "tomato", timestamp = FALSE)
fit$save_output_files(dir = my_dir, basename = "lettuce", timestamp = FALSE, random = FALSE)
## End(Not run)
The $summary()
method runs
summarise_draws()
from the posterior
package and returns the output. For MCMC, only post-warmup draws are
included in the summary.
There is also a $print()
method that prints the same summary stats but
removes the extra formatting used for printing tibbles and returns the
fitted model object itself. The $print()
method may also be faster than
$summary()
because it is designed to only compute the summary statistics
for the variables that will actually fit in the printed output whereas
$summary()
will compute them for all of the specified variables in order
to be able to return them to the user. See Examples.
summary(variables = NULL, ...)
variables |
(character vector) The variables to include. |
... |
Optional arguments to pass to |
The $summary()
method returns the tibble data frame created by
posterior::summarise_draws()
.
The $print()
method returns the fitted model object itself (invisibly),
which is the standard behavior for print methods in R.
CmdStanMCMC
, CmdStanMLE
, CmdStanLaplace
, CmdStanVB
, CmdStanGQ
## Not run:
fit <- cmdstanr_example("logistic")
fit$summary()
fit$print()
fit$print(max_rows = 2) # same as print(fit, max_rows = 2)

# include only certain variables
fit$summary("beta")
fit$print(c("alpha", "beta[2]"))

# include all variables but only certain summaries
fit$summary(NULL, c("mean", "sd"))

# can use functions created from formulas
# for example, calculate Pr(beta > 0)
fit$summary("beta", prob_gt_0 = ~ mean(. > 0))

# can combine user-specified functions with
# the default summary functions
fit$summary(variables = c("alpha", "beta"),
  posterior::default_summary_measures()[1:4],
  quantiles = ~ quantile2(., probs = c(0.025, 0.975)),
  posterior::default_convergence_measures()
)

# the functions need to calculate the appropriate
# value for a matrix input
fit$summary(variables = "alpha", dim)

# the usual [stats::var()] is therefore not directly suitable as it
# will produce a covariance matrix unless the data is converted to a vector
fit$print(c("alpha", "beta"), var2 = ~var(as.vector(.x)))
## End(Not run)
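The remark in the examples about stats::var() and matrix input can be checked without fitting a model. The sketch below uses a random matrix as a stand-in for the draws of a single variable, with iterations in rows and chains in columns. The 250 x 4 shape is an assumption made for illustration.

```r
set.seed(123)
# Stand-in for the draws of one variable: 250 iterations (rows) x 4 chains (columns).
x <- matrix(rnorm(1000), nrow = 250, ncol = 4)

# Applied to a matrix, var() returns a covariance matrix between the columns.
cov_between_chains <- var(x)
dim(cov_between_chains)

# Flattening first gives a single variance over all draws pooled together.
pooled_variance <- var(as.vector(x))
pooled_variance
```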
Report the run time in seconds. For MCMC, additional information is provided about the run times of individual chains and the warmup and sampling phases. For Laplace approximation, the reported time includes only the time for drawing the approximate sample and does not include the time taken to run the $optimize() method.
time()
A list with elements
total
: (scalar) The total run time. For MCMC this may be different than
the sum of the chain run times if parallelization was used.
chains
: (data frame) For MCMC only, timing info for the individual
chains. The data frame has columns "chain_id"
, "warmup"
, "sampling"
,
and "total"
.
CmdStanMCMC
, CmdStanMLE
, CmdStanVB
, CmdStanGQ
## Not run:
fit_mcmc <- cmdstanr_example("logistic", method = "sample")
fit_mcmc$time()

fit_vb <- cmdstanr_example("logistic", method = "variational")
fit_vb$time()

fit_mle <- cmdstanr_example("logistic", method = "optimize", jacobian = TRUE)
fit_mle$time()

# use fit_mle to draw samples from laplace approximation
fit_laplace <- cmdstanr_example("logistic", method = "laplace", mode = fit_mle)
fit_laplace$time() # just time for drawing sample not for running optimize
fit_laplace$time()$total + fit_mle$time()$total # total time
## End(Not run)
The $unconstrain_draws()
method transforms all parameter draws
to the unconstrained scale. The method returns a list for each chain,
containing the parameter values from each iteration on the unconstrained
scale. If called with no arguments, then the draws within the fit object
are unconstrained. Alternatively, either an existing draws object or a
character vector of paths to CSV files can be passed.
unconstrain_draws(
  files = NULL,
  draws = NULL,
  format = getOption("cmdstanr_draws_format", "draws_array"),
  inc_warmup = FALSE
)
files |
(character vector) The paths to the CmdStan CSV files. These can be files generated by running CmdStanR or running CmdStan directly. |
draws |
A |
format |
(string) The format of the returned draws. Must be a valid format from the posterior package. |
inc_warmup |
(logical) Should warmup draws be included? Defaults to
|
log_prob()
, grad_log_prob()
, constrain_variables()
,
unconstrain_variables()
, unconstrain_draws()
, variable_skeleton()
,
hessian()
## Not run:
fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)

# Unconstrain all internal draws
unconstrained_internal_draws <- fit_mcmc$unconstrain_draws()

# Unconstrain external CmdStan CSV files
unconstrained_csv <- fit_mcmc$unconstrain_draws(files = fit_mcmc$output_files())

# Unconstrain existing draws object
unconstrained_draws <- fit_mcmc$unconstrain_draws(draws = fit_mcmc$draws())
## End(Not run)
The $unconstrain_variables()
method transforms input
parameters to the unconstrained scale.
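For intuition about what "unconstrained scale" means, the base-R sketch below hand-codes the scalar transforms described in the Stan Reference Manual for lower-bounded and interval-bounded parameters. The function names are invented for this example, and the method itself calls the compiled model rather than R code like this. Parameters declared without constraints are left unchanged by the transform.

```r
# Lower-bounded parameter, e.g. real<lower=a>: unconstrained value is log(x - a).
unconstrain_lower <- function(x, lower = 0) log(x - lower)
constrain_lower   <- function(u, lower = 0) exp(u) + lower

# Interval-bounded parameter, e.g. real<lower=a, upper=b>:
# logit of the value rescaled to (0, 1).
unconstrain_interval <- function(x, lower = 0, upper = 1) {
  qlogis((x - lower) / (upper - lower))
}
constrain_interval <- function(u, lower = 0, upper = 1) {
  lower + (upper - lower) * plogis(u)
}

unconstrain_lower(2)                                     # equals log(2)
unconstrain_interval(0.25)                               # equals log(0.25 / 0.75)
constrain_interval(unconstrain_interval(3, 1, 5), 1, 5)  # round trip
```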
unconstrain_variables(variables)
variables |
(list) A list of parameter values to transform, in the same
format as provided to the |
log_prob()
, grad_log_prob()
, constrain_variables()
,
unconstrain_variables()
, unconstrain_draws()
, variable_skeleton()
,
hessian()
## Not run:
fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)
fit_mcmc$unconstrain_variables(list(alpha = 0.5, beta = c(0.7, 1.1, 0.2)))
## End(Not run)
relist
The $variable_skeleton()
method returns the variable skeleton
needed by utils::relist()
to re-structure a vector of constrained
parameter values to a named list.
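The role of the skeleton is easiest to see with utils::relist() on its own. In the sketch below the skeleton is written by hand as a stand-in for the value returned by $variable_skeleton(), assuming a model with a scalar alpha and a length-3 beta. A real skeleton may also carry array dimensions for each variable.

```r
# Hand-written stand-in for fit$variable_skeleton().
skeleton <- list(alpha = numeric(1), beta = numeric(3))

# Flat vector of constrained values, in the same order as the skeleton.
flat <- c(0.5, 0.7, 1.1, 0.2)

# Re-structure the flat vector into a named list shaped like the skeleton.
params <- utils::relist(flat, skeleton)
str(params)
```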
variable_skeleton(transformed_parameters = TRUE, generated_quantities = TRUE)
transformed_parameters |
(logical) Whether to include transformed
parameters in the skeleton (defaults to |
generated_quantities |
(logical) Whether to include generated quantities
in the skeleton (defaults to |
log_prob()
, grad_log_prob()
, constrain_variables()
,
unconstrain_variables()
, unconstrain_draws()
, variable_skeleton()
,
hessian()
## Not run:
fit_mcmc <- cmdstanr_example("logistic", method = "sample", force_recompile = TRUE)
fit_mcmc$variable_skeleton()
## End(Not run)
The install_cmdstan()
function attempts to download and
install the latest release of CmdStan.
Installing a previous release or a new release candidate is also possible
by specifying the version
or release_url
argument.
See the first few sections of the CmdStan
installation guide
for details on the C++ toolchain required for installing CmdStan.
The rebuild_cmdstan()
function cleans and rebuilds the CmdStan
installation. Use this function in case of any issues when compiling models.
The cmdstan_make_local()
function is used to read/write makefile flags
and variables from/to the make/local
file of a CmdStan installation.
Writing to the make/local file can be used to permanently add makefile flags/variables to an installation, for example to add specific compiler switches or to change the C++ compiler. A change to the make/local file should typically be followed by calling rebuild_cmdstan().
The check_cmdstan_toolchain()
function attempts to check for the required
C++ toolchain. It is called internally by install_cmdstan()
but can also
be called directly by the user.
install_cmdstan(
  dir = NULL,
  cores = getOption("mc.cores", 2),
  quiet = FALSE,
  overwrite = FALSE,
  timeout = 1200,
  version = NULL,
  release_url = NULL,
  release_file = NULL,
  cpp_options = list(),
  check_toolchain = TRUE,
  wsl = FALSE
)

rebuild_cmdstan(
  dir = cmdstan_path(),
  cores = getOption("mc.cores", 2),
  quiet = FALSE,
  timeout = 600
)

cmdstan_make_local(dir = cmdstan_path(), cpp_options = NULL, append = TRUE)

check_cmdstan_toolchain(fix = FALSE, quiet = FALSE)
dir |
(string) The path to the directory in which to install CmdStan.
The default is to install it in a directory called |
cores |
(integer) The number of CPU cores to use to parallelize building
CmdStan and speed up installation. If |
quiet |
(logical) For |
overwrite |
(logical) Should CmdStan still be downloaded and installed
even if an installation of the same version is found in |
timeout |
(positive real) Timeout (in seconds) for the build stage of the installation. |
version |
(string) The CmdStan release version to install. The default
is |
release_url |
(string) The URL for the specific CmdStan release or
release candidate to install. See https://github.com/stan-dev/cmdstan/releases.
The URL should point to the tarball ( |
release_file |
(string) A file path to a CmdStan release tar.gz file
downloaded from the releases page: https://github.com/stan-dev/cmdstan/releases.
For example: |
cpp_options |
(list) Any makefile flags/variables to be written to
the |
check_toolchain |
(logical) Should |
wsl |
(logical) Should CmdStan be installed and run through the Windows
Subsystem for Linux (WSL). The default is |
append |
(logical) For |
fix |
For |
For cmdstan_make_local()
, if cpp_options=NULL
then the existing
contents of make/local
are returned without writing anything, otherwise
the updated contents are returned.
## Not run:
check_cmdstan_toolchain()

# install_cmdstan(cores = 4)

cpp_options <- list(
  "CXX" = "clang++",
  "CXXFLAGS+= -march=native",
  PRECOMPILED_HEADERS = TRUE
)
# cmdstan_make_local(cpp_options = cpp_options)
# rebuild_cmdstan()
## End(Not run)
The $check_syntax() method of a CmdStanModel object checks the Stan program for syntax errors and returns TRUE (invisibly) if parsing succeeds. If invalid syntax is found, an error is thrown.
check_syntax(
  pedantic = FALSE,
  include_paths = NULL,
  stanc_options = list(),
  quiet = FALSE
)
pedantic |
(logical) Should pedantic mode be turned on? The default is
|
include_paths |
(character vector) Paths to directories where Stan
should look for files specified in |
stanc_options |
(list) Any other Stan-to-C++ transpiler options to be
used when compiling the model. See the documentation for the
|
quiet |
(logical) Should informational messages be suppressed? The
default is |
The $check_syntax()
method returns TRUE
(invisibly) if the model
is valid.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods:
model-method-compile
,
model-method-diagnose
,
model-method-expose_functions
,
model-method-format
,
model-method-generate-quantities
,
model-method-laplace
,
model-method-optimize
,
model-method-pathfinder
,
model-method-sample
,
model-method-sample_mpi
,
model-method-variables
,
model-method-variational
## Not run:
file <- write_stan_file("
data {
  int N;
  array[N] int y;
}
parameters {
  // should have <lower=0> but omitting to demonstrate pedantic mode
  real lambda;
}
model {
  y ~ poisson(lambda);
}
")
mod <- cmdstan_model(file, compile = FALSE)

# the program is syntactically correct, however...
mod$check_syntax()

# pedantic mode will warn that lambda should be constrained to be positive
# and that lambda has no prior distribution
mod$check_syntax(pedantic = TRUE)
## End(Not run)
The $compile() method of a CmdStanModel object checks the syntax of the Stan program, translates the program to C++, and creates a compiled executable. To just check the syntax of a Stan program without compiling it use the $check_syntax() method instead.
In most cases the user does not need to explicitly call the $compile() method as compilation will occur when calling cmdstan_model(). However it is possible to set compile=FALSE in the call to cmdstan_model() and subsequently call the $compile() method directly.
After compilation, the paths to the executable and the .hpp file containing the generated C++ code are available via the $exe_file() and $hpp_file() methods. The default is to create the executable in the same directory as the Stan program and to write the generated C++ code in a temporary directory. To save the C++ code to a non-temporary location use $save_hpp_file(dir).
compile( quiet = TRUE, dir = NULL, pedantic = FALSE, include_paths = NULL, user_header = NULL, cpp_options = list(), stanc_options = list(), force_recompile = getOption("cmdstanr_force_recompile", default = FALSE), compile_model_methods = FALSE, compile_standalone = FALSE, dry_run = FALSE, compile_hessian_method = FALSE, threads = FALSE )
quiet |
(logical) Should the verbose output from CmdStan during
compilation be suppressed? The default is TRUE. |
dir |
(string) The path to the directory in which to store the CmdStan
executable (or |
pedantic |
(logical) Should pedantic mode be turned on? The default is FALSE. |
include_paths |
(character vector) Paths to directories where Stan
should look for files specified in |
user_header |
(string) The path to a C++ file (with a .hpp extension) to compile with the Stan model. |
cpp_options |
(list) Any makefile options to be used when compiling the
model ( |
stanc_options |
(list) Any Stan-to-C++ transpiler options to be used
when compiling the model. See the Examples section below as well as the
|
force_recompile |
(logical) Should the model be recompiled even if it was
not modified since it was last compiled? The default is |
compile_model_methods |
(logical) Compile additional model methods
( |
compile_standalone |
(logical) Should functions in the Stan model be
compiled for use in R? If |
dry_run |
(logical) If |
compile_hessian_method |
(logical) Should the (experimental) |
threads |
Deprecated and will be removed in a future release. Please
turn on threading via |
The $compile() method is called for its side effect of creating the executable and adding its path to the CmdStanModel object, but it also returns the CmdStanModel object invisibly.
After compilation, the $exe_file(), $hpp_file(), and $save_hpp_file() methods can be used and return file paths.
The $check_syntax() method to check Stan syntax or enable pedantic mode without compiling.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods: model-method-check_syntax, model-method-diagnose, model-method-expose_functions, model-method-format, model-method-generate-quantities, model-method-laplace, model-method-optimize, model-method-pathfinder, model-method-sample, model-method-sample_mpi, model-method-variables, model-method-variational
## Not run:
file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")

# by default compilation happens when cmdstan_model() is called.
# to delay compilation until calling the $compile() method set compile=FALSE
mod <- cmdstan_model(file, compile = FALSE)
mod$compile()
mod$exe_file()

# turn on threading support (for using functions that support within-chain parallelization)
mod$compile(force_recompile = TRUE, cpp_options = list(stan_threads = TRUE))
mod$exe_file()

# turn on pedantic mode (new in Stan v2.24)
file_pedantic <- write_stan_file("
parameters {
  real sigma;  // pedantic mode will warn about missing <lower=0>
}
model {
  sigma ~ exponential(1);
}
")
mod <- cmdstan_model(file_pedantic, pedantic = TRUE)
## End(Not run)
The $diagnose() method of a CmdStanModel object runs Stan's basic diagnostic feature that will calculate the gradients of the initial state and compare them with gradients calculated by finite differences. Discrepancies between the two indicate that there is a problem with the model or initial states or else there is a bug in Stan.
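The comparison described above can be illustrated in base R without CmdStan. The following is only a sketch under stated assumptions: the toy model, the hand-derived gradient (standing in for the gradient Stan computes), the helper names, and the use of central differences are all choices made for this illustration, not a description of Stan's internals. The step size and threshold of 1e-6 mirror the documented defaults of epsilon and error.

```r
# Toy model: y ~ normal(mu, exp(log_sigma)) with flat priors.
# theta = c(mu, log_sigma) is on the unconstrained scale.
y <- c(1.2, -0.4, 0.7, 2.1, 0.3)

log_density <- function(theta) {
  sum(dnorm(y, mean = theta[1], sd = exp(theta[2]), log = TRUE))
}

# Hand-derived gradient (hypothetical stand-in for the model's gradient).
analytic_gradient <- function(theta) {
  sigma <- exp(theta[2])
  resid <- y - theta[1]
  c(sum(resid) / sigma^2,
    -length(y) + sum(resid^2) / sigma^2)
}

# Central finite differences, perturbing one coordinate at a time.
finite_diff_gradient <- function(f, theta, epsilon = 1e-6) {
  vapply(seq_along(theta), function(i) {
    step <- replace(numeric(length(theta)), i, epsilon)
    (f(theta + step) - f(theta - step)) / (2 * epsilon)
  }, numeric(1))
}

theta0 <- c(0.5, -0.2)  # the "initial state" being checked
comparison <- data.frame(
  param = c("mu", "log_sigma"),
  analytic = analytic_gradient(theta0),
  finite_diff = finite_diff_gradient(log_density, theta0)
)
comparison$error <- comparison$analytic - comparison$finite_diff
comparison$ok <- abs(comparison$error) < 1e-6  # error threshold
print(comparison)
```

A row with ok = FALSE would correspond to the kind of discrepancy that points to a problem with the model, the initial state, or the gradient code.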
diagnose( data = NULL, seed = NULL, init = NULL, output_dir = getOption("cmdstanr_output_dir"), output_basename = NULL, epsilon = NULL, error = NULL )
data |
(multiple options) The data to use for the variables specified in the data block of the Stan program. One of the following:
|
seed |
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
In the case of multi-chain sampling the single |
init |
(multiple options) The initialization method to use for the variables declared in the parameters block of the Stan program. One of the following:
|
output_dir |
(string) A path to a directory where CmdStan should write
its output CSV files. For MCMC there will be one file per chain; for other
methods there will be a single file. For interactive use this can typically
be left at
|
output_basename |
(string) A string to use as a prefix for the names of
the output CSV files of CmdStan. If |
epsilon |
(positive real) The finite difference step size. Default value is 1e-6. |
error |
(positive real) The error threshold. Default value is 1e-6. |
A CmdStanDiagnose object.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods: model-method-check_syntax, model-method-compile, model-method-expose_functions, model-method-format, model-method-generate-quantities, model-method-laplace, model-method-optimize, model-method-pathfinder, model-method-sample, model-method-sample_mpi, model-method-variables, model-method-variational
## Not run:
test <- cmdstanr_example("logistic", method = "diagnose")

# retrieve the gradients
test$gradients()
## End(Not run)
The $expose_functions() method of a CmdStanModel object will compile the functions in the Stan program's functions block and expose them for use in R. This can also be specified via the compile_standalone argument to the $compile() method.
This method is also available for fitted model objects (CmdStanMCMC, CmdStanVB, etc.). See Examples.
Note: there may be many compiler warnings emitted during compilation but these can be ignored so long as they are warnings and not errors.
expose_functions(global = FALSE, verbose = FALSE)
global |
(logical) Should the functions be added to the Global
Environment? The default is FALSE. |
verbose |
(logical) Should detailed information about generated code be
printed to the console? Defaults to FALSE. |
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods: model-method-check_syntax, model-method-compile, model-method-diagnose, model-method-format, model-method-generate-quantities, model-method-laplace, model-method-optimize, model-method-pathfinder, model-method-sample, model-method-sample_mpi, model-method-variables, model-method-variational
## Not run:
stan_file <- write_stan_file(
  "
  functions {
    real a_plus_b(real a, real b) {
      return a + b;
    }
  }
  parameters {
    real x;
  }
  model {
    x ~ std_normal();
  }
  "
)
mod <- cmdstan_model(stan_file)
mod$expose_functions()
mod$functions$a_plus_b(1, 2)

fit <- mod$sample(refresh = 0)
fit$expose_functions() # already compiled because of above but this would compile them otherwise
fit$functions$a_plus_b(1, 2)
## End(Not run)
The $format() method of a CmdStanModel object runs stanc's auto-formatter on the model code. It either saves the formatted model directly back to the file or prints it for inspection.
format( overwrite_file = FALSE, canonicalize = FALSE, backup = TRUE, max_line_length = NULL, quiet = FALSE )
overwrite_file |
(logical) Should the formatted code be written back
to the input model file? The default is |
canonicalize |
(list or logical) Defines whether or not the compiler
should 'canonicalize' the Stan model, removing things like deprecated syntax.
Default is |
backup |
(logical) If |
max_line_length |
(integer) The maximum length of a line when formatting.
The default is |
quiet |
(logical) Should informational messages be suppressed? The
default is |
The $format() method returns TRUE (invisibly) if the model is valid.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods: model-method-check_syntax, model-method-compile, model-method-diagnose, model-method-expose_functions, model-method-generate-quantities, model-method-laplace, model-method-optimize, model-method-pathfinder, model-method-sample, model-method-sample_mpi, model-method-variables, model-method-variational
## Not run:
# Example of removing unnecessary whitespace
file <- write_stan_file("
data {
  int N;
  array[N] int y;
}
parameters {
  real lambda;
}
model {
  target += poisson_lpmf(y | lambda);
}
")

# set compile=FALSE then call format to fix old syntax
mod <- cmdstan_model(file, compile = FALSE)
mod$format(canonicalize = list("deprecations"))

# overwrite the original file instead of just printing it
mod$format(canonicalize = list("deprecations"), overwrite_file = TRUE)
mod$compile()
## End(Not run)
The $generate_quantities() method of a CmdStanModel object runs Stan's standalone generated quantities to obtain generated quantities based on previously fitted parameters.
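Conceptually, standalone generated quantities evaluates the generated quantities block once per previously fitted draw. The base R sketch below imitates that idea for a Bernoulli model; the theta draws are simulated placeholders (not output from a real fit) and the object names are hypothetical.

```r
set.seed(123)
N <- 10

# Placeholder for previously fitted parameter draws (hypothetical values).
theta_draws <- rbeta(1000, 3, 9)

# For each draw of theta, simulate one replicated data set of length N,
# analogous to y_rep = bernoulli_rng(rep_vector(theta, N)) in Stan.
y_rep <- t(vapply(
  theta_draws,
  function(theta) rbinom(N, size = 1, prob = theta),
  integer(N)
))

dim(y_rep)  # one row per draw, one column per observation
```

With cmdstanr itself the same role is played by passing a fitted object as fitted_params, as in the package's own example for this method.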
generate_quantities( fitted_params, data = NULL, seed = NULL, output_dir = getOption("cmdstanr_output_dir"), output_basename = NULL, sig_figs = NULL, parallel_chains = getOption("mc.cores", 1), threads_per_chain = NULL, opencl_ids = NULL )
fitted_params |
(multiple options) The parameter draws to use. One of the following:
NOTE: if you plan on making many calls to |
data |
(multiple options) The data to use for the variables specified in the data block of the Stan program. One of the following:
|
seed |
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
In the case of multi-chain sampling the single |
output_dir |
(string) A path to a directory where CmdStan should write
its output CSV files. For MCMC there will be one file per chain; for other
methods there will be a single file. For interactive use this can typically
be left at
|
output_basename |
(string) A string to use as a prefix for the names of
the output CSV files of CmdStan. If |
sig_figs |
(positive integer) The number of significant figures used
when storing the output values. By default, CmdStan represents the output
values with 6 significant figures. The upper limit for |
parallel_chains |
(positive integer) The maximum number of MCMC chains
to run in parallel. If |
threads_per_chain |
(positive integer) If the model was
compiled with threading support, the number of
threads to use in parallelized sections within an MCMC chain (e.g., when
using the Stan functions |
opencl_ids |
(integer vector of length 2) The platform and device IDs of
the OpenCL device to use for fitting. The model must be compiled with
|
A CmdStanGQ object.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods: model-method-check_syntax, model-method-compile, model-method-diagnose, model-method-expose_functions, model-method-format, model-method-laplace, model-method-optimize, model-method-pathfinder, model-method-sample, model-method-sample_mpi, model-method-variables, model-method-variational
## Not run:
# first fit a model using MCMC
mcmc_program <- write_stan_file(
  "data {
    int<lower=0> N;
    array[N] int<lower=0,upper=1> y;
  }
  parameters {
    real<lower=0,upper=1> theta;
  }
  model {
    y ~ bernoulli(theta);
  }"
)
mod_mcmc <- cmdstan_model(mcmc_program)

data <- list(N = 10, y = c(1,1,0,0,0,1,0,1,0,0))
fit_mcmc <- mod_mcmc$sample(data = data, seed = 123, refresh = 0)

# stan program for standalone generated quantities
# (could keep model block, but not necessary so removing it)
gq_program <- write_stan_file(
  "data {
    int<lower=0> N;
    array[N] int<lower=0,upper=1> y;
  }
  parameters {
    real<lower=0,upper=1> theta;
  }
  generated quantities {
    array[N] int y_rep = bernoulli_rng(rep_vector(theta, N));
  }"
)

mod_gq <- cmdstan_model(gq_program)
fit_gq <- mod_gq$generate_quantities(fit_mcmc, data = data, seed = 123)
str(fit_gq$draws())

library(posterior)
as_draws_df(fit_gq$draws())
## End(Not run)
The $laplace() method of a CmdStanModel object produces a sample from a normal approximation centered at the mode of a distribution in the unconstrained space. If the mode is a maximum a posteriori (MAP) estimate, the sample provides an estimate of the mean and standard deviation of the posterior distribution. If the mode is a maximum likelihood estimate (MLE), the sample provides an estimate of the standard error of the likelihood. Whether the mode is the MAP or MLE depends on the value of the jacobian argument when running optimization. See the CmdStan User’s Guide for more details.
Any argument left as NULL will default to the default value used by the installed version of CmdStan.
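To make the idea of a normal approximation centered at the mode in the unconstrained space concrete, here is a minimal base R sketch for a Bernoulli model with an implicit uniform prior on theta. It is an illustration under assumptions, not CmdStan's implementation: the logit parameterization, the use of stats::optim with a numerical Hessian, and all object names are choices made for this example.

```r
set.seed(123)
y <- c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1)

# Negative log density in the unconstrained space u = logit(theta),
# including the Jacobian term (corresponding to jacobian = TRUE).
neg_log_density <- function(u) {
  theta <- plogis(u)
  -(sum(dbinom(y, size = 1, prob = theta, log = TRUE)) +
      log(theta) + log1p(-theta))
}

# Locate the mode and the curvature of the negative log density there.
opt <- optim(par = 0, fn = neg_log_density, method = "BFGS", hessian = TRUE)
mode_u <- opt$par
sd_u <- sqrt(1 / opt$hessian[1, 1])

# Draw from the normal approximation, then map back to the constrained scale.
draws_u <- rnorm(2000, mean = mode_u, sd = sd_u)
draws_theta <- plogis(draws_u)
c(mean = mean(draws_theta), sd = sd(draws_theta))
```

The draws are normal only on the unconstrained scale; after the inverse-logit transform they are guaranteed to lie in (0, 1), which is why the approximation is built before transforming back.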
laplace( data = NULL, seed = NULL, refresh = NULL, init = NULL, save_latent_dynamics = FALSE, output_dir = getOption("cmdstanr_output_dir"), output_basename = NULL, sig_figs = NULL, threads = NULL, opencl_ids = NULL, mode = NULL, opt_args = NULL, jacobian = TRUE, draws = NULL, show_messages = TRUE, show_exceptions = TRUE, save_cmdstan_config = NULL )
data |
(multiple options) The data to use for the variables specified in the data block of the Stan program. One of the following:
|
seed |
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
In the case of multi-chain sampling the single |
refresh |
(non-negative integer) The number of iterations between
printed screen updates. If |
init |
(multiple options) The initialization method to use for the variables declared in the parameters block of the Stan program. One of the following:
|
save_latent_dynamics |
Ignored for this method. |
output_dir |
(string) A path to a directory where CmdStan should write
its output CSV files. For MCMC there will be one file per chain; for other
methods there will be a single file. For interactive use this can typically
be left at
|
output_basename |
(string) A string to use as a prefix for the names of
the output CSV files of CmdStan. If |
sig_figs |
(positive integer) The number of significant figures used
when storing the output values. By default, CmdStan represents the output
values with 6 significant figures. The upper limit for |
threads |
(positive integer) If the model was
compiled with threading support, the number of
threads to use in parallelized sections (e.g., when
using the Stan functions |
opencl_ids |
(integer vector of length 2) The platform and device IDs of
the OpenCL device to use for fitting. The model must be compiled with
|
mode |
(multiple options) The mode to center the approximation at. One of the following:
In all cases the total time reported by |
opt_args |
(named list) A named list of optional arguments to pass to
$optimize() if |
jacobian |
(logical) Whether or not to enable the Jacobian adjustment
for constrained parameters. The default is |
draws |
(positive integer) The number of draws to take. |
show_messages |
(logical) When |
show_exceptions |
(logical) When |
save_cmdstan_config |
(logical) When |
A CmdStanLaplace object.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods: model-method-check_syntax, model-method-compile, model-method-diagnose, model-method-expose_functions, model-method-format, model-method-generate-quantities, model-method-optimize, model-method-pathfinder, model-method-sample, model-method-sample_mpi, model-method-variables, model-method-variational
## Not run:
file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(file)
mod$print()

stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
fit_mode <- mod$optimize(data = stan_data, jacobian = TRUE)
fit_laplace <- mod$laplace(data = stan_data, mode = fit_mode)
fit_laplace$summary()

# if mode isn't specified optimize is run internally first
fit_laplace <- mod$laplace(data = stan_data)
fit_laplace$summary()

# plot approximate posterior
bayesplot::mcmc_hist(fit_laplace$draws("theta"))
## End(Not run)
The $optimize() method of a CmdStanModel object runs Stan's optimizer to obtain a (penalized) maximum likelihood estimate or a maximum a posteriori estimate (if jacobian=TRUE). See the CmdStan User's Guide for more details.
Any argument left as NULL will default to the default value used by the installed version of CmdStan. See the CmdStan User’s Guide for more details on the default arguments. The default values can also be obtained by checking the metadata of an example model, e.g., cmdstanr_example(method="optimize")$metadata().
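The effect of the Jacobian adjustment on the reported mode can be seen in a small base R sketch. It assumes a Bernoulli model with a uniform prior on theta and a logit transform to the unconstrained scale; it uses stats::optimize, not CmdStan, and the helper names are hypothetical.

```r
y <- c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1)

# Log density as a function of the unconstrained parameter u = logit(theta).
log_density <- function(u, jacobian) {
  theta <- plogis(u)
  lp <- sum(dbinom(y, size = 1, prob = theta, log = TRUE))
  if (jacobian) {
    # log |d theta / d u| for the inverse-logit transform
    lp <- lp + log(theta) + log1p(-theta)
  }
  lp
}

# Maximize over u, then report the mode on the constrained scale.
find_mode <- function(jacobian) {
  opt <- stats::optimize(log_density, interval = c(-10, 10),
                         jacobian = jacobian, maximum = TRUE)
  plogis(opt$maximum)
}

theta_no_jacobian <- find_mode(FALSE)  # analytic value: 2/10
theta_jacobian <- find_mode(TRUE)      # analytic value: 3/12
c(no_jacobian = theta_no_jacobian, jacobian = theta_jacobian)
```

Without the adjustment the optimum coincides with the maximum likelihood estimate 2/10; with it, the optimum is the mode of the density over the unconstrained parameter, which maps back to 3/12 here.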
optimize( data = NULL, seed = NULL, refresh = NULL, init = NULL, save_latent_dynamics = FALSE, output_dir = getOption("cmdstanr_output_dir"), output_basename = NULL, sig_figs = NULL, threads = NULL, opencl_ids = NULL, algorithm = NULL, jacobian = FALSE, init_alpha = NULL, iter = NULL, tol_obj = NULL, tol_rel_obj = NULL, tol_grad = NULL, tol_rel_grad = NULL, tol_param = NULL, history_size = NULL, show_messages = TRUE, show_exceptions = TRUE, save_cmdstan_config = NULL )
data |
(multiple options) The data to use for the variables specified in the data block of the Stan program. One of the following:
|
seed |
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
In the case of multi-chain sampling the single |
refresh |
(non-negative integer) The number of iterations between
printed screen updates. If |
init |
(multiple options) The initialization method to use for the variables declared in the parameters block of the Stan program. One of the following:
|
save_latent_dynamics |
(logical) Should auxiliary diagnostic information
about the latent dynamics be written to temporary diagnostic CSV files?
This argument replaces CmdStan's |
output_dir |
(string) A path to a directory where CmdStan should write
its output CSV files. For MCMC there will be one file per chain; for other
methods there will be a single file. For interactive use this can typically
be left at
|
output_basename |
(string) A string to use as a prefix for the names of
the output CSV files of CmdStan. If |
sig_figs |
(positive integer) The number of significant figures used
when storing the output values. By default, CmdStan represents the output
values with 6 significant figures. The upper limit for |
threads |
(positive integer) If the model was
compiled with threading support, the number of
threads to use in parallelized sections (e.g., when
using the Stan functions |
opencl_ids |
(integer vector of length 2) The platform and device IDs of
the OpenCL device to use for fitting. The model must be compiled with
|
algorithm |
(string) The optimization algorithm. One of |
jacobian |
(logical) Whether or not to use the Jacobian adjustment for
constrained variables. For historical reasons, the default is |
init_alpha |
(positive real) The initial step size parameter. |
iter |
(positive integer) The maximum number of iterations. |
tol_obj |
(positive real) Convergence tolerance on changes in objective function value. |
tol_rel_obj |
(positive real) Convergence tolerance on relative changes in objective function value. |
tol_grad |
(positive real) Convergence tolerance on the norm of the gradient. |
tol_rel_grad |
(positive real) Convergence tolerance on the relative norm of the gradient. |
tol_param |
(positive real) Convergence tolerance on changes in parameter value. |
history_size |
(positive integer) The size of the history used when approximating the Hessian. Only available for L-BFGS. |
show_messages |
(logical) When |
show_exceptions |
(logical) When |
save_cmdstan_config |
(logical) When |
A CmdStanMLE object.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods: model-method-check_syntax, model-method-compile, model-method-diagnose, model-method-expose_functions, model-method-format, model-method-generate-quantities, model-method-laplace, model-method-pathfinder, model-method-sample, model-method-sample_mpi, model-method-variables, model-method-variational
## Not run:
library(cmdstanr)
library(posterior)
library(bayesplot)
color_scheme_set("brightblue")

# Set path to CmdStan
# (Note: if you installed CmdStan via install_cmdstan() with default settings
# then setting the path is unnecessary but the default below should still work.
# Otherwise use the `path` argument to specify the location of your
# CmdStan installation.)
set_cmdstan_path(path = NULL)

# Create a CmdStanModel object from a Stan program,
# here using the example model that comes with CmdStan
file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(file)
mod$print()
# Print with line numbers. This can be set globally using the
# `cmdstanr_print_line_numbers` option.
mod$print(line_numbers = TRUE)

# Data as a named list (like RStan)
stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))

# Run MCMC using the 'sample' method
fit_mcmc <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  parallel_chains = 2
)

# Use 'posterior' package for summaries
fit_mcmc$summary()

# Check sampling diagnostics
fit_mcmc$diagnostic_summary()

# Get posterior draws
draws <- fit_mcmc$draws()
print(draws)

# Convert to data frame using posterior::as_draws_df
as_draws_df(draws)

# Plot posterior using bayesplot (ggplot2)
mcmc_hist(fit_mcmc$draws("theta"))

# Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
# and also demonstrate specifying data as a path to a file instead of a list
my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
fit_optim <- mod$optimize(data = my_data_file, seed = 123)
fit_optim$summary()

# Run 'optimize' again with 'jacobian=TRUE' and then draw from Laplace approximation
# to the posterior
fit_optim <- mod$optimize(data = my_data_file, jacobian = TRUE)
fit_laplace <- mod$laplace(data = my_data_file, mode = fit_optim, draws = 2000)
fit_laplace$summary()

# Run 'variational' method to use ADVI to approximate posterior
fit_vb <- mod$variational(data = stan_data, seed = 123)
fit_vb$summary()
mcmc_hist(fit_vb$draws("theta"))

# Run 'pathfinder' method, a new alternative to the variational method
fit_pf <- mod$pathfinder(data = stan_data, seed = 123)
fit_pf$summary()
mcmc_hist(fit_pf$draws("theta"))

# Run 'pathfinder' again with more paths, fewer draws per path,
# better covariance approximation, and fewer LBFGS iterations
fit_pf <- mod$pathfinder(data = stan_data, num_paths=10, single_path_draws=40,
                         history_size=50, max_lbfgs_iters=100)

# Specifying initial values as a function
fit_mcmc_w_init_fun <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  refresh = 0,
  init = function() list(theta = runif(1))
)
fit_mcmc_w_init_fun_2 <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  refresh = 0,
  init = function(chain_id) {
    # silly but demonstrates optional use of chain_id
    list(theta = 1 / (chain_id + 1))
  }
)
fit_mcmc_w_init_fun_2$init()

# Specifying initial values as a list of lists
fit_mcmc_w_init_list <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  refresh = 0,
  init = list(
    list(theta = 0.75), # chain 1
    list(theta = 0.25)  # chain 2
  )
)
fit_optim_w_init_list <- mod$optimize(
  data = stan_data,
  seed = 123,
  init = list(
    list(theta = 0.75)
  )
)
fit_optim_w_init_list$init()
## End(Not run)
The $pathfinder() method of a CmdStanModel object runs Stan's Pathfinder algorithms. Pathfinder is a variational method for approximately sampling from differentiable log densities. Starting from a random initialization, Pathfinder locates normal approximations to the target density along a quasi-Newton optimization path in the unconstrained space, with local covariance estimated using the negative inverse Hessian estimates produced by the LBFGS optimizer. Pathfinder selects the normal approximation with the lowest estimated Kullback-Leibler (KL) divergence to the true posterior. Finally, Pathfinder draws from that normal approximation and returns the draws transformed to the constrained scale. See the CmdStan User’s Guide for more details.

Any argument left as NULL takes the default value used by the installed version of CmdStan.
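As a rough illustration of the selection step described above, the following base R sketch picks, from a few hand-written normal approximations, the one with the highest Monte Carlo ELBO estimate, which is equivalent to the lowest estimated KL divergence up to a constant. This is a toy under stated assumptions and not Stan's implementation: the one-dimensional target, the candidate list, and the helper `estimate_elbo()` are all hypothetical, and the real algorithm builds its candidates from the LBFGS path.

```r
set.seed(123)

# Hypothetical toy target: log density of a Normal(2, 0.5).
log_target <- function(x) dnorm(x, mean = 2, sd = 0.5, log = TRUE)

# Hypothetical candidate normal approximations, standing in for the
# approximations Pathfinder would construct along its optimization path.
candidates <- list(
  list(mean = 0.0, sd = 1.0),
  list(mean = 1.5, sd = 0.7),
  list(mean = 2.0, sd = 0.5)
)

# Monte Carlo ELBO estimate: E_q[log p(x) - log q(x)] using draws from q.
estimate_elbo <- function(cand, n_draws = 1000) {
  x <- rnorm(n_draws, mean = cand$mean, sd = cand$sd)
  mean(log_target(x) - dnorm(x, mean = cand$mean, sd = cand$sd, log = TRUE))
}

elbos <- vapply(candidates, estimate_elbo, numeric(1))

# The highest ELBO corresponds to the lowest estimated KL(q || p).
best <- candidates[[which.max(elbos)]]
print(elbos)
print(best)
```

The `n_draws` argument of the toy helper plays a role loosely analogous to `num_elbo_draws`: more draws give a less noisy ELBO estimate at higher cost.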
pathfinder( data = NULL, seed = NULL, refresh = NULL, init = NULL, save_latent_dynamics = FALSE, output_dir = getOption("cmdstanr_output_dir"), output_basename = NULL, sig_figs = NULL, opencl_ids = NULL, num_threads = NULL, init_alpha = NULL, tol_obj = NULL, tol_rel_obj = NULL, tol_grad = NULL, tol_rel_grad = NULL, tol_param = NULL, history_size = NULL, single_path_draws = NULL, draws = NULL, num_paths = 4, max_lbfgs_iters = NULL, num_elbo_draws = NULL, save_single_paths = NULL, psis_resample = NULL, calculate_lp = NULL, show_messages = TRUE, show_exceptions = TRUE, save_cmdstan_config = NULL )
data |
(multiple options) The data to use for the variables specified in the data block of the Stan program. One of the following:
|
seed |
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
In the case of multi-chain sampling the single |
refresh |
(non-negative integer) The number of iterations between
printed screen updates. If |
init |
(multiple options) The initialization method to use for the variables declared in the parameters block of the Stan program. One of the following:
|
save_latent_dynamics |
(logical) Should auxiliary diagnostic information
about the latent dynamics be written to temporary diagnostic CSV files?
This argument replaces CmdStan's |
output_dir |
(string) A path to a directory where CmdStan should write
its output CSV files. For MCMC there will be one file per chain; for other
methods there will be a single file. For interactive use this can typically
be left at
|
output_basename |
(string) A string to use as a prefix for the names of
the output CSV files of CmdStan. If |
sig_figs |
(positive integer) The number of significant figures used
when storing the output values. By default, CmdStan represents the output
values with 6 significant figures. The upper limit for |
opencl_ids |
(integer vector of length 2) The platform and device IDs of
the OpenCL device to use for fitting. The model must be compiled with
|
num_threads |
(positive integer) If the model was
compiled with threading support, the number of
threads to use in parallelized sections (e.g., for multi-path pathfinder
as well as |
init_alpha |
(positive real) The initial step size parameter. |
tol_obj |
(positive real) Convergence tolerance on changes in objective function value. |
tol_rel_obj |
(positive real) Convergence tolerance on relative changes in objective function value. |
tol_grad |
(positive real) Convergence tolerance on the norm of the gradient. |
tol_rel_grad |
(positive real) Convergence tolerance on the relative norm of the gradient. |
tol_param |
(positive real) Convergence tolerance on changes in parameter value. |
history_size |
(positive integer) The size of the history used when approximating the Hessian. |
single_path_draws |
(positive integer) Number of draws a single
pathfinder should return. The number of draws PSIS sampling samples from
will be equal to |
draws |
(positive integer) Number of draws to return after performing
Pareto smoothed importance sampling (PSIS). This should be smaller than
|
num_paths |
(positive integer) Number of single pathfinders to run. |
max_lbfgs_iters |
(positive integer) The maximum number of iterations for LBFGS. |
num_elbo_draws |
(positive integer) Number of draws to make when calculating the ELBO of the approximation at each iteration of LBFGS. |
save_single_paths |
(logical) Whether to save the results of single pathfinder runs in multi-pathfinder. |
psis_resample |
(logical) Whether to perform Pareto smoothed importance sampling.
If |
calculate_lp |
(logical) Whether to calculate the log probability of the draws.
If |
show_messages |
(logical) When |
show_exceptions |
(logical) When |
save_cmdstan_config |
(logical) When |
A CmdStanPathfinder
object.
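The relationship between num_paths, single_path_draws, and draws can be checked with simple arithmetic before fitting. The sketch below assumes that the pool PSIS resamples from contains single_path_draws draws from each of the num_paths paths. The first two values are those used in the package example; `draws = 200` and the variable names are illustrative only.

```r
# Settings taken from the package example (more paths, fewer draws per path).
num_paths <- 10
single_path_draws <- 40

# Assumption: PSIS resamples from the pooled draws of all single paths.
pool_size <- num_paths * single_path_draws

# Illustrative request for the final number of draws.
draws <- 200

# The requested number of draws should not exceed the pool size.
draws_ok <- draws <= pool_size
cat("pool size:", pool_size, "- draws fits in pool:", draws_ok, "\n")
```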
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods: model-method-check_syntax, model-method-compile, model-method-diagnose, model-method-expose_functions, model-method-format, model-method-generate-quantities, model-method-laplace, model-method-optimize, model-method-sample, model-method-sample_mpi, model-method-variables, model-method-variational

    ## Not run:
    library(cmdstanr)
    library(posterior)
    library(bayesplot)
    color_scheme_set("brightblue")

    # Set path to CmdStan
    # (Note: if you installed CmdStan via install_cmdstan() with default settings
    # then setting the path is unnecessary but the default below should still work.
    # Otherwise use the `path` argument to specify the location of your
    # CmdStan installation.)
    set_cmdstan_path(path = NULL)

    # Create a CmdStanModel object from a Stan program,
    # here using the example model that comes with CmdStan
    file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
    mod <- cmdstan_model(file)
    mod$print()
    # Print with line numbers. This can be set globally using the
    # `cmdstanr_print_line_numbers` option.
    mod$print(line_numbers = TRUE)

    # Data as a named list (like RStan)
    stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))

    # Run MCMC using the 'sample' method
    fit_mcmc <- mod$sample(
      data = stan_data,
      seed = 123,
      chains = 2,
      parallel_chains = 2
    )

    # Use 'posterior' package for summaries
    fit_mcmc$summary()

    # Check sampling diagnostics
    fit_mcmc$diagnostic_summary()

    # Get posterior draws
    draws <- fit_mcmc$draws()
    print(draws)

    # Convert to data frame using posterior::as_draws_df
    as_draws_df(draws)

    # Plot posterior using bayesplot (ggplot2)
    mcmc_hist(fit_mcmc$draws("theta"))

    # Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
    # and also demonstrate specifying data as a path to a file instead of a list
    my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
    fit_optim <- mod$optimize(data = my_data_file, seed = 123)
    fit_optim$summary()

    # Run 'optimize' again with 'jacobian=TRUE' and then draw from Laplace approximation
    # to the posterior
    fit_optim <- mod$optimize(data = my_data_file, jacobian = TRUE)
    fit_laplace <- mod$laplace(data = my_data_file, mode = fit_optim, draws = 2000)
    fit_laplace$summary()

    # Run 'variational' method to use ADVI to approximate posterior
    fit_vb <- mod$variational(data = stan_data, seed = 123)
    fit_vb$summary()
    mcmc_hist(fit_vb$draws("theta"))

    # Run 'pathfinder' method, a new alternative to the variational method
    fit_pf <- mod$pathfinder(data = stan_data, seed = 123)
    fit_pf$summary()
    mcmc_hist(fit_pf$draws("theta"))

    # Run 'pathfinder' again with more paths, fewer draws per path,
    # better covariance approximation, and fewer LBFGS iterations
    fit_pf <- mod$pathfinder(data = stan_data, num_paths=10, single_path_draws=40,
                             history_size=50, max_lbfgs_iters=100)

    # Specifying initial values as a function
    fit_mcmc_w_init_fun <- mod$sample(
      data = stan_data,
      seed = 123,
      chains = 2,
      refresh = 0,
      init = function() list(theta = runif(1))
    )
    fit_mcmc_w_init_fun_2 <- mod$sample(
      data = stan_data,
      seed = 123,
      chains = 2,
      refresh = 0,
      init = function(chain_id) {
        # silly but demonstrates optional use of chain_id
        list(theta = 1 / (chain_id + 1))
      }
    )
    fit_mcmc_w_init_fun_2$init()

    # Specifying initial values as a list of lists
    fit_mcmc_w_init_list <- mod$sample(
      data = stan_data,
      seed = 123,
      chains = 2,
      refresh = 0,
      init = list(
        list(theta = 0.75), # chain 1
        list(theta = 0.25)  # chain 2
      )
    )
    fit_optim_w_init_list <- mod$optimize(
      data = stan_data,
      seed = 123,
      init = list(
        list(theta = 0.75)
      )
    )
    fit_optim_w_init_list$init()

    ## End(Not run)

The $sample() method of a CmdStanModel object runs Stan's main Markov chain Monte Carlo algorithm.

Any argument left as NULL takes the default value used by the installed version of CmdStan. See the CmdStan User’s Guide for more details.

After model fitting, any diagnostics specified via the diagnostics argument are checked and warnings are printed if warranted.
sample( data = NULL, seed = NULL, refresh = NULL, init = NULL, save_latent_dynamics = FALSE, output_dir = getOption("cmdstanr_output_dir"), output_basename = NULL, sig_figs = NULL, chains = 4, parallel_chains = getOption("mc.cores", 1), chain_ids = seq_len(chains), threads_per_chain = NULL, opencl_ids = NULL, iter_warmup = NULL, iter_sampling = NULL, save_warmup = FALSE, thin = NULL, max_treedepth = NULL, adapt_engaged = TRUE, adapt_delta = NULL, step_size = NULL, metric = NULL, metric_file = NULL, inv_metric = NULL, init_buffer = NULL, term_buffer = NULL, window = NULL, fixed_param = FALSE, show_messages = TRUE, show_exceptions = TRUE, diagnostics = c("divergences", "treedepth", "ebfmi"), save_metric = NULL, save_cmdstan_config = NULL, cores = NULL, num_cores = NULL, num_chains = NULL, num_warmup = NULL, num_samples = NULL, validate_csv = NULL, save_extra_diagnostics = NULL, max_depth = NULL, stepsize = NULL )
data |
(multiple options) The data to use for the variables specified in the data block of the Stan program. One of the following:
|
seed |
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
In the case of multi-chain sampling the single |
refresh |
(non-negative integer) The number of iterations between
printed screen updates. If |
init |
(multiple options) The initialization method to use for the variables declared in the parameters block of the Stan program. One of the following:
|
save_latent_dynamics |
(logical) Should auxiliary diagnostic information
about the latent dynamics be written to temporary diagnostic CSV files?
This argument replaces CmdStan's |
output_dir |
(string) A path to a directory where CmdStan should write
its output CSV files. For MCMC there will be one file per chain; for other
methods there will be a single file. For interactive use this can typically
be left at
|
output_basename |
(string) A string to use as a prefix for the names of
the output CSV files of CmdStan. If |
sig_figs |
(positive integer) The number of significant figures used
when storing the output values. By default, CmdStan represents the output
values with 6 significant figures. The upper limit for |
chains |
(positive integer) The number of Markov chains to run. The default is 4. |
parallel_chains |
(positive integer) The maximum number of MCMC chains
to run in parallel. If |
chain_ids |
(integer vector) A vector of chain IDs. Must contain as many
unique positive integers as the number of chains. If not set, the default
chain IDs are used (integers starting from |
threads_per_chain |
(positive integer) If the model was
compiled with threading support, the number of
threads to use in parallelized sections within an MCMC chain (e.g., when
using the Stan functions |
opencl_ids |
(integer vector of length 2) The platform and device IDs of
the OpenCL device to use for fitting. The model must be compiled with
|
iter_warmup |
(positive integer) The number of warmup iterations to run
per chain. Note: in the CmdStan User's Guide this is referred to as
|
iter_sampling |
(positive integer) The number of post-warmup iterations
to run per chain. Note: in the CmdStan User's Guide this is referred to as
|
save_warmup |
(logical) Should warmup iterations be saved? The default
is |
thin |
(positive integer) The period between saved samples. This should typically be left at its default (no thinning) unless memory is a problem. |
max_treedepth |
(positive integer) The maximum allowed tree depth for the NUTS engine. See the Tree Depth section of the CmdStan User's Guide for more details. |
adapt_engaged |
(logical) Do warmup adaptation? The default is |
adapt_delta |
(real in |
step_size |
(positive real) The initial step size for the discrete approximation to continuous Hamiltonian dynamics. This is further tuned during warmup. |
metric |
(string) One of |
metric_file |
(character vector) The paths to JSON or Rdump files (one
per chain) compatible with CmdStan that contain precomputed inverse
metrics. The |
inv_metric |
(vector, matrix) A vector (if |
init_buffer |
(nonnegative integer) Width of initial fast timestep adaptation interval during warmup. |
term_buffer |
(nonnegative integer) Width of final fast timestep adaptation interval during warmup. |
window |
(nonnegative integer) Initial width of slow timestep/metric adaptation interval. |
fixed_param |
(logical) When |
show_messages |
(logical) When |
show_exceptions |
(logical) When |
diagnostics |
(character vector) The diagnostics to automatically check
and warn about after sampling. Setting this to an empty string These diagnostics are also available after fitting. The
Diagnostics like R-hat and effective sample size are not currently
available via the |
save_metric |
(logical) When |
save_cmdstan_config |
(logical) When |
cores , num_cores , num_chains , num_warmup , num_samples , save_extra_diagnostics , max_depth , stepsize , validate_csv
|
Deprecated and will be removed in a future release. |
A CmdStanMCMC
object.
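The defaults for parallel_chains and chain_ids in the usage above are ordinary R expressions, so their values can be inspected without fitting a model. A minimal base R sketch follows; the choice of 4 cores is illustrative.

```r
chains <- 4

# With the mc.cores option unset, the default for parallel_chains is 1,
# meaning chains run one after another.
old <- options(mc.cores = NULL)
default_serial <- getOption("mc.cores", 1)

# Setting mc.cores changes the default used for parallel_chains.
options(mc.cores = 4)
default_parallel <- getOption("mc.cores", 1)

# The default chain IDs are simply 1, 2, ..., chains.
chain_ids <- seq_len(chains)

print(c(serial = default_serial, parallel = default_parallel))
print(chain_ids)

# Restore whatever value mc.cores had before this sketch ran.
options(old)
```

Setting `options(mc.cores = ...)` once per session is therefore one way to avoid passing parallel_chains on every call.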
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods: model-method-check_syntax, model-method-compile, model-method-diagnose, model-method-expose_functions, model-method-format, model-method-generate-quantities, model-method-laplace, model-method-optimize, model-method-pathfinder, model-method-sample_mpi, model-method-variables, model-method-variational

    ## Not run:
    library(cmdstanr)
    library(posterior)
    library(bayesplot)
    color_scheme_set("brightblue")

    # Set path to CmdStan
    # (Note: if you installed CmdStan via install_cmdstan() with default settings
    # then setting the path is unnecessary but the default below should still work.
    # Otherwise use the `path` argument to specify the location of your
    # CmdStan installation.)
    set_cmdstan_path(path = NULL)

    # Create a CmdStanModel object from a Stan program,
    # here using the example model that comes with CmdStan
    file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
    mod <- cmdstan_model(file)
    mod$print()
    # Print with line numbers. This can be set globally using the
    # `cmdstanr_print_line_numbers` option.
    mod$print(line_numbers = TRUE)

    # Data as a named list (like RStan)
    stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))

    # Run MCMC using the 'sample' method
    fit_mcmc <- mod$sample(
      data = stan_data,
      seed = 123,
      chains = 2,
      parallel_chains = 2
    )

    # Use 'posterior' package for summaries
    fit_mcmc$summary()

    # Check sampling diagnostics
    fit_mcmc$diagnostic_summary()

    # Get posterior draws
    draws <- fit_mcmc$draws()
    print(draws)

    # Convert to data frame using posterior::as_draws_df
    as_draws_df(draws)

    # Plot posterior using bayesplot (ggplot2)
    mcmc_hist(fit_mcmc$draws("theta"))

    # Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
    # and also demonstrate specifying data as a path to a file instead of a list
    my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
    fit_optim <- mod$optimize(data = my_data_file, seed = 123)
    fit_optim$summary()

    # Run 'optimize' again with 'jacobian=TRUE' and then draw from Laplace approximation
    # to the posterior
    fit_optim <- mod$optimize(data = my_data_file, jacobian = TRUE)
    fit_laplace <- mod$laplace(data = my_data_file, mode = fit_optim, draws = 2000)
    fit_laplace$summary()

    # Run 'variational' method to use ADVI to approximate posterior
    fit_vb <- mod$variational(data = stan_data, seed = 123)
    fit_vb$summary()
    mcmc_hist(fit_vb$draws("theta"))

    # Run 'pathfinder' method, a new alternative to the variational method
    fit_pf <- mod$pathfinder(data = stan_data, seed = 123)
    fit_pf$summary()
    mcmc_hist(fit_pf$draws("theta"))

    # Run 'pathfinder' again with more paths, fewer draws per path,
    # better covariance approximation, and fewer LBFGS iterations
    fit_pf <- mod$pathfinder(data = stan_data, num_paths=10, single_path_draws=40,
                             history_size=50, max_lbfgs_iters=100)

    # Specifying initial values as a function
    fit_mcmc_w_init_fun <- mod$sample(
      data = stan_data,
      seed = 123,
      chains = 2,
      refresh = 0,
      init = function() list(theta = runif(1))
    )
    fit_mcmc_w_init_fun_2 <- mod$sample(
      data = stan_data,
      seed = 123,
      chains = 2,
      refresh = 0,
      init = function(chain_id) {
        # silly but demonstrates optional use of chain_id
        list(theta = 1 / (chain_id + 1))
      }
    )
    fit_mcmc_w_init_fun_2$init()

    # Specifying initial values as a list of lists
    fit_mcmc_w_init_list <- mod$sample(
      data = stan_data,
      seed = 123,
      chains = 2,
      refresh = 0,
      init = list(
        list(theta = 0.75), # chain 1
        list(theta = 0.25)  # chain 2
      )
    )
    fit_optim_w_init_list <- mod$optimize(
      data = stan_data,
      seed = 123,
      init = list(
        list(theta = 0.75)
      )
    )
    fit_optim_w_init_list$init()

    ## End(Not run)

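The init function that takes chain_id in the example above can be evaluated on its own to see which initial values each chain would receive. This is a small base R sketch, assuming CmdStanR calls the function once per chain with that chain's ID; chain IDs 1 and 2 correspond to chains = 2 with the default chain_ids.

```r
# Same init function as in the example: the value depends on the chain ID.
init_fun <- function(chain_id) {
  list(theta = 1 / (chain_id + 1))
}

# Evaluate it for the default chain IDs of a two-chain run.
chain_ids <- seq_len(2)
inits <- lapply(chain_ids, init_fun)

# The result has the same shape as the "list of lists" form of init:
# one named list of parameter values per chain.
str(inits)
```

Both values lie strictly between 0 and 1, which matters here because theta is a probability in the Bernoulli example model.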
The $sample_mpi() method of a CmdStanModel object is identical to the $sample() method but with support for MPI (message passing interface). MPI is aimed at users with large computer clusters. For other users, the $sample() method provides both parallelization of chains and threading support for within-chain parallelization.
In order to use MPI with Stan, an MPI implementation must be installed. For Unix systems the most commonly used implementations are MPICH and OpenMPI. The implementations provide an MPI C++ compiler wrapper (for example mpicxx), which is required to compile the model.
An example of compiling with MPI:

    mpi_options <- list(STAN_MPI=TRUE, CXX="mpicxx", TBB_CXX_TYPE="gcc")
    mod <- cmdstan_model("model.stan", cpp_options = mpi_options)

The C++ options that must be supplied to the compile call are:
STAN_MPI: Enables the use of MPI with Stan if TRUE.
CXX: The name of the MPI C++ compiler wrapper. Typically "mpicxx".
TBB_CXX_TYPE: The C++ compiler the MPI wrapper wraps. Typically "gcc" on Linux and "clang" on macOS.
In the call to the $sample_mpi() method it is also possible to provide the name of the MPI launcher (mpi_cmd, defaulting to "mpiexec") and any other MPI launch arguments (mpi_args). In most cases, it is enough to only define the number of processes. To use n_procs processes specify mpi_args = list("n" = n_procs).
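To make the effect of mpi_args more concrete, the sketch below shows one way a named list could be turned into launcher arguments. The helper `build_mpi_args()` is hypothetical and is not part of CmdStanR; it assumes each list element becomes a `-name value` pair passed to the launcher.

```r
# Hypothetical helper (not a CmdStanR function): flatten a named list
# into a character vector of "-name", "value" pairs.
build_mpi_args <- function(mpi_args) {
  unlist(lapply(names(mpi_args), function(arg) {
    c(paste0("-", arg), as.character(mpi_args[[arg]]))
  }))
}

n_procs <- 4
launcher_args <- build_mpi_args(list("n" = n_procs))

# Under the assumption above, this corresponds to a launch of the form
# `mpiexec -n 4 <model executable and its arguments>`.
cat("mpiexec", launcher_args, "\n")
```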
sample_mpi( data = NULL, mpi_cmd = "mpiexec", mpi_args = NULL, seed = NULL, refresh = NULL, init = NULL, save_latent_dynamics = FALSE, output_dir = getOption("cmdstanr_output_dir"), output_basename = NULL, chains = 1, chain_ids = seq_len(chains), iter_warmup = NULL, iter_sampling = NULL, save_warmup = FALSE, thin = NULL, max_treedepth = NULL, adapt_engaged = TRUE, adapt_delta = NULL, step_size = NULL, metric = NULL, metric_file = NULL, inv_metric = NULL, init_buffer = NULL, term_buffer = NULL, window = NULL, fixed_param = FALSE, sig_figs = NULL, show_messages = TRUE, show_exceptions = TRUE, diagnostics = c("divergences", "treedepth", "ebfmi"), save_cmdstan_config = NULL, validate_csv = TRUE )
data |
(multiple options) The data to use for the variables specified in the data block of the Stan program. One of the following:
|
mpi_cmd |
(string) The MPI launcher used for launching MPI
processes. The default launcher is |
mpi_args |
(list) A list of arguments to use when launching MPI
processes. For example, |
seed |
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
In the case of multi-chain sampling the single |
refresh |
(non-negative integer) The number of iterations between
printed screen updates. If |
init |
(multiple options) The initialization method to use for the variables declared in the parameters block of the Stan program. One of the following:
|
save_latent_dynamics |
(logical) Should auxiliary diagnostic information
about the latent dynamics be written to temporary diagnostic CSV files?
This argument replaces CmdStan's |
output_dir |
(string) A path to a directory where CmdStan should write
its output CSV files. For MCMC there will be one file per chain; for other
methods there will be a single file. For interactive use this can typically
be left at
|
output_basename |
(string) A string to use as a prefix for the names of
the output CSV files of CmdStan. If |
chains |
(positive integer) The number of Markov chains to run. For $sample_mpi() the default is 1 (see the usage above). |
chain_ids |
(integer vector) A vector of chain IDs. Must contain as many
unique positive integers as the number of chains. If not set, the default
chain IDs are used (integers starting from |
iter_warmup |
(positive integer) The number of warmup iterations to run
per chain. Note: in the CmdStan User's Guide this is referred to as
|
iter_sampling |
(positive integer) The number of post-warmup iterations
to run per chain. Note: in the CmdStan User's Guide this is referred to as
|
save_warmup |
(logical) Should warmup iterations be saved? The default
is |
thin |
(positive integer) The period between saved samples. This should typically be left at its default (no thinning) unless memory is a problem. |
max_treedepth |
(positive integer) The maximum allowed tree depth for the NUTS engine. See the Tree Depth section of the CmdStan User's Guide for more details. |
adapt_engaged |
(logical) Do warmup adaptation? The default is |
adapt_delta |
(real in |
step_size |
(positive real) The initial step size for the discrete approximation to continuous Hamiltonian dynamics. This is further tuned during warmup. |
metric |
(string) One of |
metric_file |
(character vector) The paths to JSON or Rdump files (one
per chain) compatible with CmdStan that contain precomputed inverse
metrics. The |
inv_metric |
(vector, matrix) A vector (if |
init_buffer |
(nonnegative integer) Width of initial fast timestep adaptation interval during warmup. |
term_buffer |
(nonnegative integer) Width of final fast timestep adaptation interval during warmup. |
window |
(nonnegative integer) Initial width of slow timestep/metric adaptation interval. |
fixed_param |
(logical) When |
sig_figs |
(positive integer) The number of significant figures used
when storing the output values. By default, CmdStan represents the output
values with 6 significant figures. The upper limit for |
show_messages |
(logical) When |
show_exceptions |
(logical) When |
diagnostics |
(character vector) The diagnostics to automatically check
and warn about after sampling. Setting this to an empty string These diagnostics are also available after fitting. The
Diagnostics like R-hat and effective sample size are not currently
available via the |
save_cmdstan_config |
(logical) When |
validate_csv |
Deprecated. Use |
A CmdStanMCMC object.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
The Stan Math Library's documentation (mc-stan.org/math) for more details on MPI support in Stan.
Other CmdStanModel methods: model-method-check_syntax, model-method-compile, model-method-diagnose, model-method-expose_functions, model-method-format, model-method-generate-quantities, model-method-laplace, model-method-optimize, model-method-pathfinder, model-method-sample, model-method-variables, model-method-variational
## Not run: 
# mpi_options <- list(STAN_MPI=TRUE, CXX="mpicxx", TBB_CXX_TYPE="gcc")
# mod <- cmdstan_model("model.stan", cpp_options = mpi_options)
# fit <- mod$sample_mpi(..., mpi_args = list("n" = 4))
## End(Not run)
The $variables() method of a CmdStanModel object returns a list, each element representing a Stan model block: data, parameters, transformed_parameters and generated_quantities.
Each element contains a list of variables, with each variable represented as a list with information on its scalar type (real or int) and number of dimensions.
transformed data is not included, as variables in that block are not part of the model's input or output.
variables()
The $variables() method returns a list with information on input and output variables for each of the Stan model blocks.
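The nested structure described above can be flattened into a table for quick inspection. The sketch below does not call CmdStanR: it uses a hand-built stand-in for the returned list (based on the bernoulli example model), and the field names `type` and `dimensions`, as well as the helper `variables_table()`, are assumptions made for illustration.

```r
# Hand-built stand-in for the list returned by mod$variables().
# The field names `type` and `dimensions` are assumptions for this sketch.
vars <- list(
  data = list(
    N = list(type = "int", dimensions = 0),
    y = list(type = "int", dimensions = 1)
  ),
  parameters = list(
    theta = list(type = "real", dimensions = 0)
  ),
  transformed_parameters = list(),
  generated_quantities = list()
)

# Hypothetical helper: one row per variable, with its block, type and dimensions
variables_table <- function(vars) {
  rows <- lapply(names(vars), function(block) {
    block_vars <- vars[[block]]
    if (length(block_vars) == 0) {
      return(NULL)  # block declares no variables
    }
    data.frame(
      block = block,
      name = names(block_vars),
      type = vapply(block_vars, function(v) v$type, character(1)),
      dimensions = vapply(block_vars, function(v) as.numeric(v$dimensions), numeric(1)),
      stringsAsFactors = FALSE,
      row.names = NULL
    )
  })
  # Drop empty blocks before binding the per-block tables together
  do.call(rbind, Filter(Negate(is.null), rows))
}

tab <- variables_table(vars)
print(tab)
```

With a real model, the same helper could be applied to the result of mod$variables(), provided the field names match the assumption above.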
Other CmdStanModel methods: model-method-check_syntax, model-method-compile, model-method-diagnose, model-method-expose_functions, model-method-format, model-method-generate-quantities, model-method-laplace, model-method-optimize, model-method-pathfinder, model-method-sample, model-method-sample_mpi, model-method-variational
## Not run: 
file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")

# create a `CmdStanModel` object, compiling the model is not required
mod <- cmdstan_model(file, compile = FALSE)

mod$variables()
## End(Not run)
The $variational() method of a CmdStanModel object runs Stan's Automatic Differentiation Variational Inference (ADVI) algorithms. The approximation is a Gaussian in the unconstrained variable space. Stan implements two ADVI algorithms: the algorithm="meanfield" option uses a fully factorized Gaussian for the approximation; the algorithm="fullrank" option uses a Gaussian with a full-rank covariance matrix for the approximation. See the CmdStan User’s Guide for more details.
Any argument left as NULL will default to the default value used by the installed version of CmdStan.
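To make the difference between the two approximating families concrete, the sketch below counts the variational parameters each one needs for n unconstrained variables: a fully factorized Gaussian has n means and n scales, while a full-rank Gaussian has n means plus the n(n+1)/2 entries of a Cholesky factor. The helpers `advi_param_count()` and `run_advi()` are hypothetical names for this illustration; `run_advi()` is only defined here, not called, and assumes `mod` is a compiled CmdStanModel.

```r
# Number of variational parameters for n unconstrained variables
advi_param_count <- function(n, algorithm = c("meanfield", "fullrank")) {
  algorithm <- match.arg(algorithm)
  switch(algorithm,
    meanfield = 2 * n,               # n means + n scales
    fullrank = n + n * (n + 1) / 2   # n means + Cholesky factor entries
  )
}

advi_param_count(10, "meanfield")
advi_param_count(10, "fullrank")

# Hypothetical wrapper: selects the algorithm explicitly and leaves every
# other argument at NULL so that CmdStan's own defaults apply.
run_advi <- function(mod, data, algorithm = "fullrank", seed = 123) {
  mod$variational(data = data, algorithm = algorithm, seed = seed)
}
```

The quadratic growth of the full-rank count is one reason the mean-field option can be preferable for models with many parameters, at the cost of ignoring posterior correlations.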
variational(
  data = NULL,
  seed = NULL,
  refresh = NULL,
  init = NULL,
  save_latent_dynamics = FALSE,
  output_dir = getOption("cmdstanr_output_dir"),
  output_basename = NULL,
  sig_figs = NULL,
  threads = NULL,
  opencl_ids = NULL,
  algorithm = NULL,
  iter = NULL,
  grad_samples = NULL,
  elbo_samples = NULL,
  eta = NULL,
  adapt_engaged = NULL,
  adapt_iter = NULL,
  tol_rel_obj = NULL,
  eval_elbo = NULL,
  output_samples = NULL,
  draws = NULL,
  show_messages = TRUE,
  show_exceptions = TRUE,
  save_cmdstan_config = NULL
)
data |
(multiple options) The data to use for the variables specified in the data block of the Stan program. One of the following:
|
seed |
(positive integer(s)) A seed for the (P)RNG to pass to CmdStan.
In the case of multi-chain sampling the single |
refresh |
(non-negative integer) The number of iterations between
printed screen updates. If |
init |
(multiple options) The initialization method to use for the variables declared in the parameters block of the Stan program. One of the following:
|
save_latent_dynamics |
(logical) Should auxiliary diagnostic information
about the latent dynamics be written to temporary diagnostic CSV files?
This argument replaces CmdStan's |
output_dir |
(string) A path to a directory where CmdStan should write
its output CSV files. For MCMC there will be one file per chain; for other
methods there will be a single file. For interactive use this can typically
be left at
|
output_basename |
(string) A string to use as a prefix for the names of
the output CSV files of CmdStan. If |
sig_figs |
(positive integer) The number of significant figures used
when storing the output values. By default, CmdStan represents the output
values with 6 significant figures. The upper limit for |
threads |
(positive integer) If the model was
compiled with threading support, the number of
threads to use in parallelized sections (e.g., when using the Stan
functions |
opencl_ids |
(integer vector of length 2) The platform and device IDs of
the OpenCL device to use for fitting. The model must be compiled with
|
algorithm |
(string) The algorithm. Either |
iter |
(positive integer) The maximum number of iterations. |
grad_samples |
(positive integer) The number of samples for Monte Carlo estimate of gradients. |
elbo_samples |
(positive integer) The number of samples for Monte Carlo estimate of ELBO (objective function). |
eta |
(positive real) The step size weighting parameter for adaptive step size sequence. |
adapt_engaged |
(logical) Do warmup adaptation? |
adapt_iter |
(positive integer) The maximum number of adaptation iterations. |
tol_rel_obj |
(positive real) Convergence tolerance on the relative norm of the objective. |
eval_elbo |
(positive integer) Evaluate ELBO every Nth iteration. |
output_samples |
(positive integer) Use |
draws |
(positive integer) Number of approximate posterior samples to draw and save. |
show_messages |
(logical) When |
show_exceptions |
(logical) When |
save_cmdstan_config |
(logical) When |
A CmdStanVB object.
The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.
The Stan and CmdStan documentation:
Stan documentation: mc-stan.org/users/documentation
CmdStan User’s Guide: mc-stan.org/docs/cmdstan-guide
Other CmdStanModel methods: model-method-check_syntax, model-method-compile, model-method-diagnose, model-method-expose_functions, model-method-format, model-method-generate-quantities, model-method-laplace, model-method-optimize, model-method-pathfinder, model-method-sample, model-method-sample_mpi, model-method-variables
## Not run: 
library(cmdstanr)
library(posterior)
library(bayesplot)
color_scheme_set("brightblue")

# Set path to CmdStan
# (Note: if you installed CmdStan via install_cmdstan() with default settings
# then setting the path is unnecessary but the default below should still work.
# Otherwise use the `path` argument to specify the location of your
# CmdStan installation.)
set_cmdstan_path(path = NULL)

# Create a CmdStanModel object from a Stan program,
# here using the example model that comes with CmdStan
file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(file)
mod$print()
# Print with line numbers. This can be set globally using the
# `cmdstanr_print_line_numbers` option.
mod$print(line_numbers = TRUE)

# Data as a named list (like RStan)
stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))

# Run MCMC using the 'sample' method
fit_mcmc <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  parallel_chains = 2
)

# Use 'posterior' package for summaries
fit_mcmc$summary()

# Check sampling diagnostics
fit_mcmc$diagnostic_summary()

# Get posterior draws
draws <- fit_mcmc$draws()
print(draws)

# Convert to data frame using posterior::as_draws_df
as_draws_df(draws)

# Plot posterior using bayesplot (ggplot2)
mcmc_hist(fit_mcmc$draws("theta"))

# Run 'optimize' method to get a point estimate (default is Stan's LBFGS algorithm)
# and also demonstrate specifying data as a path to a file instead of a list
my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.json")
fit_optim <- mod$optimize(data = my_data_file, seed = 123)
fit_optim$summary()

# Run 'optimize' again with 'jacobian=TRUE' and then draw from Laplace approximation
# to the posterior
fit_optim <- mod$optimize(data = my_data_file, jacobian = TRUE)
fit_laplace <- mod$laplace(data = my_data_file, mode = fit_optim, draws = 2000)
fit_laplace$summary()

# Run 'variational' method to use ADVI to approximate posterior
fit_vb <- mod$variational(data = stan_data, seed = 123)
fit_vb$summary()
mcmc_hist(fit_vb$draws("theta"))

# Run 'pathfinder' method, a new alternative to the variational method
fit_pf <- mod$pathfinder(data = stan_data, seed = 123)
fit_pf$summary()
mcmc_hist(fit_pf$draws("theta"))

# Run 'pathfinder' again with more paths, fewer draws per path,
# better covariance approximation, and fewer LBFGS iterations
fit_pf <- mod$pathfinder(data = stan_data, num_paths=10, single_path_draws=40,
                         history_size=50, max_lbfgs_iters=100)

# Specifying initial values as a function
fit_mcmc_w_init_fun <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  refresh = 0,
  init = function() list(theta = runif(1))
)
fit_mcmc_w_init_fun_2 <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  refresh = 0,
  init = function(chain_id) {
    # silly but demonstrates optional use of chain_id
    list(theta = 1 / (chain_id + 1))
  }
)
fit_mcmc_w_init_fun_2$init()

# Specifying initial values as a list of lists
fit_mcmc_w_init_list <- mod$sample(
  data = stan_data,
  seed = 123,
  chains = 2,
  refresh = 0,
  init = list(
    list(theta = 0.75), # chain 1
    list(theta = 0.25)  # chain 2
  )
)
fit_optim_w_init_list <- mod$optimize(
  data = stan_data,
  seed = 123,
  init = list(
    list(theta = 0.75)
  )
)
fit_optim_w_init_list$init()
## End(Not run)
read_cmdstan_csv() is used internally by CmdStanR to read CmdStan's output CSV files into R. It can also be used by CmdStan users as a more flexible and efficient alternative to rstan::read_stan_csv(). See the Value section for details on the structure of the returned list.
It is also possible to create CmdStanR's fitted model objects directly from CmdStan CSV files using the as_cmdstan_fit() function.
read_cmdstan_csv(
  files,
  variables = NULL,
  sampler_diagnostics = NULL,
  format = getOption("cmdstanr_draws_format", NULL)
)

as_cmdstan_fit(
  files,
  check_diagnostics = TRUE,
  format = getOption("cmdstanr_draws_format")
)
files |
(character vector) The paths to the CmdStan CSV files. These can be files generated by running CmdStanR or running CmdStan directly. |
variables |
(character vector) Optionally, the names of the variables (parameters, transformed parameters, and generated quantities) to read in.
|
sampler_diagnostics |
(character vector) Works the same way as
|
format |
(string) The format for storing the draws or point estimates. The default depends on the method used to fit the model. See draws for details, in particular the note about speed and memory for models with many parameters. |
check_diagnostics |
(logical) For models fit using MCMC, should
diagnostic checks be performed after reading in the files? The default is
|
as_cmdstan_fit() returns a CmdStanMCMC, CmdStanMLE, CmdStanLaplace or CmdStanVB object. Some methods typically defined for those objects will not work (e.g. save_data_file()) but the important methods like $summary(), $draws(), $sampler_diagnostics() and others will work fine.
read_cmdstan_csv() returns a named list with the following components:
metadata: A list of the meta information from the run that produced the CSV file(s). See Examples below.
The other components in the returned list depend on the method that produced the CSV file(s).
For sampling the returned list also includes the following components:
time: Run time information for the individual chains. The returned object is the same as for the $time() method except the total run time can't be inferred from the CSV files (the chains may have been run in parallel) and is therefore NA.
inv_metric: A list (one element per chain) of inverse mass matrices or their diagonals, depending on the type of metric used.
step_size: A list (one element per chain) of the step sizes used.
warmup_draws: If save_warmup was TRUE when fitting the model then a draws_array (or different format if format is specified) of warmup draws.
post_warmup_draws: A draws_array (or different format if format is specified) of post-warmup draws.
warmup_sampler_diagnostics: If save_warmup was TRUE when fitting the model then a draws_array (or different format if format is specified) of warmup draws of the sampler diagnostic variables.
post_warmup_sampler_diagnostics: A draws_array (or different format if format is specified) of post-warmup draws of the sampler diagnostic variables.
For optimization the returned list also includes the following components:
point_estimates: Point estimates for the model parameters.
For laplace and variational inference the returned list also includes the following components:
draws: A draws_matrix (or different format if format is specified) of draws from the approximate posterior distribution.
For standalone generated quantities the returned list also includes the following components:
generated_quantities: A draws_array of the generated quantities.
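Because the component holding the estimates has a different name for each method, code that handles CSV files from several methods needs to check which one is present. The following is a minimal sketch of such a check, relying only on the component names listed above. The helper `extract_estimates()` is hypothetical and not part of CmdStanR, and the two lists passed to it are hand-built stand-ins for what read_cmdstan_csv() returns.

```r
# Hypothetical helper: return the main results component of a list returned
# by read_cmdstan_csv(), whichever method produced the CSV file(s).
extract_estimates <- function(x) {
  candidates <- c(
    "post_warmup_draws",     # sampling
    "point_estimates",       # optimization
    "draws",                 # laplace and variational inference
    "generated_quantities"   # standalone generated quantities
  )
  for (component in candidates) {
    if (!is.null(x[[component]])) {
      return(list(component = component, value = x[[component]]))
    }
  }
  stop("No draws or point estimates found in the supplied list.")
}

# Hand-built stand-ins for illustration only
optim_like <- list(metadata = list(), point_estimates = c(theta = 0.2))
vb_like <- list(metadata = list(), draws = matrix(0.25, nrow = 2, ncol = 1))

res_optim <- extract_estimates(optim_like)
res_vb <- extract_estimates(vb_like)
res_optim$component
res_vb$component
```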
## Not run: 
# Generate some CSV files to use for demonstration
fit1 <- cmdstanr_example("logistic", method = "sample", save_warmup = TRUE)
csv_files <- fit1$output_files()
print(csv_files)

# Creating fitted model objects

# Create a CmdStanMCMC object from the CSV files
fit2 <- as_cmdstan_fit(csv_files)
fit2$print("beta")

# Using read_cmdstan_csv
#
# Read in everything
x <- read_cmdstan_csv(csv_files)
str(x)

# Don't read in any of the sampler diagnostic variables
x <- read_cmdstan_csv(csv_files, sampler_diagnostics = "")

# Don't read in any of the parameters or generated quantities
x <- read_cmdstan_csv(csv_files, variables = "")

# Read in only specific parameters and sampler diagnostics
x <- read_cmdstan_csv(
  csv_files,
  variables = c("alpha", "beta[2]"),
  sampler_diagnostics = c("n_leapfrog__", "accept_stat__")
)

# For non-scalar parameters all elements can be selected or only some elements,
# e.g. all of the vector "beta" but only one element of the vector "log_lik"
x <- read_cmdstan_csv(
  csv_files,
  variables = c("beta", "log_lik[3]")
)
## End(Not run)
Registers CmdStanR's knitr engine eng_cmdstan() for processing Stan chunks. Refer to the vignette R Markdown CmdStan Engine for a demonstration.
register_knitr_engine(override = TRUE)
override |
(logical) Override knitr's built-in, RStan-based engine for
Stan? The default is |
If override = TRUE (default), this registers CmdStanR's knitr engine as the engine for stan chunks, replacing knitr's built-in, RStan-based engine. If override = FALSE, this registers a cmdstan engine so that both engines may be used in the same R Markdown document. If the template supports syntax highlighting for the Stan language, the cmdstan chunks will have stan syntax highlighting applied to them.
See the vignette R Markdown CmdStan Engine for an example.
Note: When running chunks interactively in RStudio (e.g. when using R Notebooks), it has been observed that the built-in, RStan-based engine is used for stan chunks even when CmdStanR's engine has been registered in the session. When the R Markdown document is knit/rendered, the correct engine is used. As a workaround, when running chunks interactively, it is recommended to use the override = FALSE option and change stan chunks to be cmdstan chunks.
If you would like to keep stan chunks as stan chunks, it is possible to specify engine = "cmdstan" in the chunk options after registering the cmdstan engine with override = FALSE.
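The workaround described above could be placed in the setup chunk of an R Markdown document along the following lines. This is a sketch only: the wrapper `setup_stan_engines()` is a hypothetical name, and the function skips registration when either package is unavailable so that it can be run anywhere.

```r
# Hypothetical setup helper: registers CmdStanR's engine under the name
# "cmdstan" while leaving knitr's built-in engine for "stan" chunks in place.
setup_stan_engines <- function() {
  have_pkgs <- requireNamespace("cmdstanr", quietly = TRUE) &&
    requireNamespace("knitr", quietly = TRUE)
  if (!have_pkgs) {
    message("cmdstanr and knitr are both required; skipping registration.")
    return(invisible(FALSE))
  }
  cmdstanr::register_knitr_engine(override = FALSE)
  # Report whether an engine named "cmdstan" is now known to knitr
  invisible(is.function(knitr::knit_engines$get("cmdstan")))
}

registered <- setup_stan_engines()
```

After this, chunks declared with the cmdstan engine use CmdStanR, while chunks declared with the stan engine continue to use the RStan-based engine unless engine = "cmdstan" is given in their chunk options.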
Use the set_cmdstan_path() function to tell CmdStanR where the CmdStan installation is located. Once the path has been set, cmdstan_path() will return the full path to the CmdStan installation and cmdstan_version() will return the CmdStan version number. See Details for how to avoid manually setting the path in each R session.
set_cmdstan_path(path = NULL)

cmdstan_path()

cmdstan_version(error_on_NA = TRUE)
path |
(string) The full file path to the CmdStan installation. If
|
error_on_NA |
(logical) Should an error be thrown if CmdStan is not
found. The default is |
Before the package can be used it needs to know where the CmdStan installation is located. When the package is loaded it tries to help automate this to avoid having to manually set the path every session:
If the environment variable "CMDSTAN" exists at load time then its value will be automatically set as the default path to CmdStan for the R session. If the environment variable "CMDSTAN" is set, but a valid CmdStan is not found in the supplied path, the path is treated as a top folder that contains CmdStan installations. In that case, the CmdStan installation with the largest version number will be set as the path to CmdStan for the R session.
If no environment variable is found when loaded but any directory in the form ".cmdstan/cmdstan-[version]" (e.g., ".cmdstan/cmdstan-2.23.0"), exists in the user's home directory (Sys.getenv("HOME"), not the current working directory) then the path to the cmdstan with the largest version number will be set as the path to CmdStan for the R session. This is the same as the default directory that install_cmdstan() would use to install the latest version of CmdStan.
It is always possible to change the path after loading the package using set_cmdstan_path(path).
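The "largest version number" rule above compares versions numerically rather than alphabetically, so that 2.23.0 ranks above 2.9.1. The sketch below illustrates that selection in base R. It is not CmdStanR's actual implementation: the helper `latest_cmdstan_dir()` is a hypothetical name, and directories whose suffix is not a plain numeric version are simply ignored here.

```r
# Hypothetical helper: among sub-directories named "cmdstan-[version]" in
# `top_dir`, return the path of the one with the largest version number.
latest_cmdstan_dir <- function(top_dir) {
  dirs <- list.dirs(top_dir, full.names = TRUE, recursive = FALSE)
  dirs <- dirs[grepl("^cmdstan-", basename(dirs))]
  if (length(dirs) == 0) {
    return(NULL)
  }
  # strict = FALSE turns unparseable version strings into NA instead of failing
  versions <- numeric_version(sub("^cmdstan-", "", basename(dirs)), strict = FALSE)
  keep <- !is.na(versions)
  if (!any(keep)) {
    return(NULL)
  }
  dirs <- dirs[keep]
  versions <- versions[keep]
  dirs[which(versions == max(versions))[1]]
}

# Demonstration with empty directories created in a temporary folder
top <- tempfile("cmdstan-top-")
dir.create(file.path(top, "cmdstan-2.9.1"), recursive = TRUE)
dir.create(file.path(top, "cmdstan-2.23.0"), recursive = TRUE)
chosen <- latest_cmdstan_dir(top)
basename(chosen)
```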
A string. Either the file path to the CmdStan installation or the CmdStan version number.
CmdStan version string if available. If CmdStan is not found and error_on_NA is FALSE, cmdstan_version() returns NULL.
Convenience function for writing Stan code to a (possibly temporary) file with a .stan extension. By default, the file name is chosen deterministically based on a hash of the Stan code, and the file is not overwritten if it already has correct contents. This means that calling this function multiple times with the same Stan code will reuse the compiled model. This also however means that the function is potentially not thread-safe. Using hash_salt = Sys.getpid() should ensure thread-safety in the rare cases when it is needed.
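The idea of a content-derived file name can be illustrated with base R alone. The sketch below is not how write_stan_file() is implemented: the helper `hashed_stan_path()`, the "model_" prefix, and the use of an MD5 checksum are all assumptions chosen for illustration. It shows why identical code maps to the same file and why a salt yields a separate file.

```r
# Hypothetical helper: write `code` to a file whose name is derived from a
# hash of the salt plus the code, and skip writing if that file already exists.
hashed_stan_path <- function(code, dir = tempdir(), hash_salt = "") {
  code <- paste(code, collapse = "\n")
  # tools::md5sum() hashes files, so the salted code goes to a scratch file first
  scratch <- tempfile()
  on.exit(unlink(scratch))
  writeLines(paste0(hash_salt, code), scratch)
  hash <- unname(tools::md5sum(scratch))
  path <- file.path(dir, paste0("model_", hash, ".stan"))
  if (!file.exists(path)) {
    writeLines(code, path)
  }
  path
}

code <- "parameters { real<lower=0,upper=1> theta; }"
p1 <- hashed_stan_path(code)
p2 <- hashed_stan_path(code)                               # same code, same file
p3 <- hashed_stan_path(code, hash_salt = Sys.getpid())     # salted, separate file
```

Because p1 and p2 are the same path, anything keyed on the file (such as a compiled executable) can be reused, while the process-specific salt in p3 keeps concurrent R processes from writing to the same file.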
write_stan_file(
  code,
  dir = getOption("cmdstanr_write_stan_file_dir", tempdir()),
  basename = NULL,
  force_overwrite = FALSE,
  hash_salt = ""
)
code |
(character vector) The Stan code to write to the file. This can be a character vector of length one (a string) containing the entire Stan program or a character vector with each element containing one line of the Stan program. |
dir |
(string) An optional path to the directory where the file will be
written. If omitted, a global option |
basename |
(string) If |
force_overwrite |
(logical) If set to |
hash_salt |
(string) Text to add to the model code prior to hashing to
determine the file name if |
The path to the file.
# stan program as a single string
stan_program <- "
data {
  int<lower=0> N;
  array[N] int<lower=0,upper=1> y;
}
parameters {
  real<lower=0,upper=1> theta;
}
model {
  y ~ bernoulli(theta);
}
"

f <- write_stan_file(stan_program)
print(f)

lines <- readLines(f)
print(lines)
cat(lines, sep = "\n")

# stan program as character vector of lines
f2 <- write_stan_file(lines)
identical(readLines(f), readLines(f2))
Write data to a JSON file readable by CmdStan
write_stan_json(data, file, always_decimal = FALSE)
data |
(list) A named list of R objects. |
file |
(string) The path to where the data file should be written. |
always_decimal |
(logical) Force generate non-integers with decimal
points to better distinguish between integers and floating point values.
If |
write_stan_json() performs several conversions before writing the JSON file:
logical -> integer (TRUE -> 1, FALSE -> 0)
data.frame -> matrix (via data.matrix())
list -> array
table -> vector, matrix, or array (depending on dimensions of table)
The list to array conversion is intended to make it easier to prepare the data for certain Stan declarations involving arrays:
vector[J] v[K] (or equivalently array[K] vector[J] v as of Stan 2.27) can be constructed in R as a list with K elements where each element is a vector of length J
matrix[I,J] m[K] (or equivalently array[K] matrix[I,J] m as of Stan 2.27) can be constructed in R as a list with K elements where each element is an IxJ matrix
These can also be passed in from R as arrays instead of lists but the list option is provided for convenience. Unfortunately, for arrays with more than one dimension, e.g., vector[J] v[K,L] (or equivalently array[K,L] vector[J] v as of Stan 2.27), it is not possible to use an R list and an array must be used instead. For this example the array in R should have dimensions KxLxJ.
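The shapes involved in these conversions can be checked in base R before any file is written. The sketch below does not call write_stan_json(); it only builds R objects with the layouts described above, using small made-up sizes K = 2, L = 4 and J = 3.

```r
K <- 2
L <- 4
J <- 3

# array[K] vector[J] v: a list of K vectors of length J carries the same
# values as a K x J array in which row k holds the k-th vector
v_list <- list(1:3, 4:6)
v_array <- do.call(rbind, v_list)
dim(v_array)

# array[K,L] vector[J] v: a list cannot be used, so build a K x L x J array;
# v_multi[k, l, ] is then the vector stored at position (k, l)
v_multi <- array(0, dim = c(K, L, J))
v_multi[1, 2, ] <- c(0.1, 0.2, 0.3)
dim(v_multi)

# logical -> integer and data.frame -> matrix, mirroring the listed conversions
z <- as.integer(c(TRUE, FALSE))
df_as_matrix <- data.matrix(data.frame(a = 1:2, b = c(0.5, 1.5)))
```

Either v_list or v_array could be supplied as the value of v in the named list passed to write_stan_json(), whereas v_multi has to be supplied as an array.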
x <- matrix(rnorm(10), 5, 2)
y <- rpois(nrow(x), lambda = 10)
z <- c(TRUE, FALSE)
data <- list(N = nrow(x), K = ncol(x), x = x, y = y, z = z)

# write data to json file
file <- tempfile(fileext = ".json")
write_stan_json(data, file)

# check the contents of the file
cat(readLines(file), sep = "\n")

# demonstrating list to array conversion
# suppose x is declared as `vector[3] x[2]` (or equivalently `array[2] vector[3] x`)
# we can use a list of length 2 where each element is a vector of length 3
data <- list(x = list(1:3, 4:6))
file <- tempfile(fileext = ".json")
write_stan_json(data, file)
cat(readLines(file), sep = "\n")