| Title: | Create Mixture Models From Predictive Samples |
|---|---|
| Description: | Combines predictions from individual time series or panel data models into an ensemble using stacking (Yao, Vehtari, Simpson, and Gelman (2018) <doi:10.1214/17-BA1091>) based on the Continuous Ranked Probability Score (CRPS) (Gneiting and Raftery (2007) <doi:10.1198/016214506000001437>) over k-step ahead predictions. Predictions must be predictive distributions represented by samples, typically posterior predictive simulation draws from a Markov chain Monte Carlo (MCMC) algorithm. Given training data with observed values and predictive samples from different models, optimal stacking weights are computed to minimize expected cross-validation predictive error. These weights can then be used to generate samples from the mixture model by drawing from individual model predictions in the correct proportions. |
| Authors: | Nikos Bosse [aut, cre, cph] (ORCID: <https://orcid.org/0000-0002-7750-5280>), Yuling Yao [aut], Sam Abbott [aut] (ORCID: <https://orcid.org/0000-0001-8057-8037>), Sebastian Funk [aut] (ORCID: <https://orcid.org/0000-0002-2842-3406>) |
| Maintainer: | Nikos Bosse <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.2.9000 |
| Built: | 2026-06-01 07:51:11 UTC |
| Source: | https://github.com/epiforecasts/lopensemble |
Given true values and predictive samples from different models, 'crps_weights' returns the stacking weights that produce the ensemble minimising the Continuous Ranked Probability Score (CRPS).
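For intuition about the score being minimised, the CRPS of a forecast represented by samples can be estimated as the mean absolute error of the draws minus half the mean absolute difference between pairs of draws. The following is a minimal base R sketch of that estimator. The helper name `crps_from_samples` and the toy forecasts are assumptions made for illustration and are not part of the package.

```r
# Illustrative helper (hypothetical name, not exported by the package):
# sample-based CRPS estimate, mean|x - y| - 0.5 * mean|x - x'|.
crps_from_samples <- function(predicted, observed) {
  # Average absolute distance between each draw and the observed value
  accuracy <- mean(abs(predicted - observed))
  # Average absolute distance between all pairs of draws (forecast spread)
  spread <- mean(abs(outer(predicted, predicted, "-")))
  accuracy - 0.5 * spread
}

set.seed(42)
observed <- 1
sharp_forecast <- rnorm(500, mean = 1, sd = 0.5)   # centred on the truth
biased_forecast <- rnorm(500, mean = 3, sd = 0.5)  # same spread, wrong location

crps_from_samples(sharp_forecast, observed)   # lower score: better forecast
crps_from_samples(biased_forecast, observed)  # higher score: worse forecast
```

Lower values indicate better forecasts, so stacking weights that minimise the CRPS favour models whose samples sit close to the observed values without being needlessly dispersed.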
crps_weights(data, lambda = NULL, gamma = NULL, dirichlet_alpha = 1.001)
data |
a data.frame with the following entries:
|
lambda |
weights given to timepoints. If |
gamma |
weights given to regions. If |
dirichlet_alpha |
prior for the weights. Default is 1.001 |
Returns a vector with the model weights.
Strictly Proper Scoring Rules, Prediction, and Estimation, Tilmann Gneiting and Adrian E. Raftery, 2007, Journal of the American Statistical Association, Volume 102, Issue 477
Using Stacking to Average Bayesian Predictive Distributions, Yuling Yao, Aki Vehtari, Daniel Simpson, and Andrew Gelman, 2018, Bayesian Analysis 13, Number 3, pp. 917–1003
## Not run:
library("data.table")
splitdate <- as.Date("2020-03-28")
data <- setDT(example_data)
traindata <- data[date <= splitdate]
testdata <- data[date > splitdate]
weights <- crps_weights(traindata)
## End(Not run)
The function takes a data.frame with predictive samples generated from different models as well as weights corresponding to these models as input. It then returns predictive samples from a mixture model generated by stacking the original models using these weights.
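The idea of sampling from a mixture in the correct proportions can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's internal implementation: the two models, their samples, and the weights are invented for the example, and only a single prediction target is considered.

```r
# Hypothetical illustration; not the package's internal implementation.
set.seed(1)

# Predictive samples from two made-up models for a single target
samples <- list(
  model_a = rnorm(1000, mean = 0, sd = 1),
  model_b = rnorm(1000, mean = 3, sd = 1)
)
# Assumed stacking weights; they must be non-negative and sum to 1
weights <- c(model_a = 0.25, model_b = 0.75)

n_draws <- 1000
# Choose a model for every mixture draw with probability equal to its weight
picked <- sample(names(samples), size = n_draws, replace = TRUE, prob = weights)
# Take one predictive sample from the chosen model for each draw
mixture <- vapply(picked, function(m) sample(samples[[m]], size = 1), numeric(1))

# Share of draws contributed by each model, which should be close to the weights
table(picked) / n_draws
```

Each model contributes to the mixture roughly in proportion to its weight, so a model with weight 0.75 supplies about three quarters of the draws.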
mixture_from_samples(data, weights = NULL, ...)
data |
a data.frame with the following entries:
|
weights |
stacking weights used to combine the original models into a mixture model. If NULL (default), weights will first be estimated using [crps_weights()]. |
... |
any additional parameters to pass to [crps_weights()] if 'weights' is NULL. |
data.frame with samples from the mixture model. The following columns are returned:
observed, the true observed values, if they were given as input
predicted, predicted values corresponding to the true values in observed
model, the name of the model used to generate the corresponding predictions
geography (optional), the regions for which predictions are generated. If geography is missing, it will be assumed there are no geographical differences to take into account. Internally, regions will be ordered alphabetically
date, the date of the corresponding prediction / true value. Also works with numbers to indicate timesteps
Using Stacking to Average Bayesian Predictive Distributions, Yuling Yao, Aki Vehtari, Daniel Simpson, and Andrew Gelman, 2018, Bayesian Analysis 13, Number 3, pp. 917–1003
## Not run:
library("data.table")
data <- setDT(example_data)
weights <- c(0.2, 0.3, 0.4, 0.1)
mix <- mixture_from_samples(data, weights = weights)
## End(Not run)