Package 'forecast.vocs'

Title: Forecast Case and Sequence Notifications using Variant of Concern Strain Dynamics
Description: Contains models and tools to produce short-term forecasts for both case and sequence notifications assuming circulation of either one or two variants. Tools are also provided to allow the evaluation of the use of sequence data for short-term forecasts in both real-world settings and in user generated scenarios.
Authors: Sam Abbott [aut, cre], Sebastian Funk [ctb]
Maintainer: Sam Abbott <[email protected]>
License: MIT + file LICENSE
Version: 0.9.0
Built: 2024-12-04 05:44:48 UTC
Source: https://github.com/epiforecasts/forecast.vocs

Help Index


Add the forecast dates to a plot

Description

Add the forecast dates to a plot

Usage

add_forecast_dates(plot, forecast_dates = NULL)

Arguments

plot

ggplot2 object

forecast_dates

A data.frame in the format produced by extract_forecast_dates() (with at least a date variable and a Data unavailable variable). Specifies when data availability should be added to plots. May contain faceting variables.

Value

A ggplot2 plot with dates of data unavailability added.

See Also

Plotting functions plot.fv_forecast(), plot.fv_posterior(), plot_cases(), plot_default(), plot_growth(), plot_pairs(), plot_posterior(), plot_rt(), plot_theme(), plot_voc_advantage(), plot_voc_frac(), save_plots()


Launch shinystan

Description

Launch shinystan, an interactive tool for Stan model evaluation.

Usage

bp_launch_shinystan(fit)

Arguments

fit

List of output as returned by fv_sample().

See Also

Functions to explore and validate models fv_score_forecast(), plot_pairs()

Examples

obs <- filter_by_availability(
  germany_covid19_delta_obs,
  date = as.Date("2021-06-12")
)

dt <- fv_as_data_list(obs)
inits <- fv_inits(dt)
fit <- fv_sample(dt, init = inits, adapt_delta = 0.99, max_treedepth = 15)
bp_launch_shinystan(fit)

Check a data.frame

Description

Check a data.frame

Usage

check_dataframe(dataframe, req_vars, req_types, rows)

Arguments

dataframe

A data.frame to check.

req_vars

A character vector of variables that are required.

req_types

A character vector of types for each required variable.

rows

Integer specifying the number of rows the data.frame should have.

See Also

Functions used for checking inputs check_observations(), check_param(), check_quantiles()


Check observations are in the correct format

Description

Check observations are in the correct format

Usage

check_observations(obs)

Arguments

obs

A data.frame with the following variables: date, cases, seq_voc, seq_total, cases_available, and seq_available. Together, seq_available and cases_available must uniquely define data rows, but other rows can be duplicated based on data availability. This data format allows for multiple versions of case and sequence data for a given date with different reporting dates. This is important when using the package in evaluation settings or in real time, where data sources are liable to be updated as new data become available. See germany_covid19_delta_obs for an example of a supported data set.

See Also

Functions used for checking inputs check_dataframe(), check_param(), check_quantiles()

Examples

obs <- latest_obs(germany_covid19_delta_obs)
check_observations(obs)

Check a parameter is the correct type and length

Description

Check a parameter is the correct type and length

Usage

check_param(param, name = "param", type = "numeric", length)

Arguments

param

A parameter to check the format of.

name

A character string naming the variable to check.

type

A character string identifying the allowed parameter type (must be a type with an is.type function, except for Date).

length

Numeric, allowed length of the variable. Defaults to any allowed length.

See Also

Functions used for checking inputs check_dataframe(), check_observations(), check_quantiles()
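
The type rule described for check_param() (an is.type function must exist, with Date as the exception) can be sketched in base R. This is an illustrative sketch under stated assumptions, not the package's implementation: the helper name sketch_check_param() and its error messages are hypothetical.

```r
# Hypothetical helper illustrating the is.<type> lookup; not part of forecast.vocs.
sketch_check_param <- function(param, name = "param", type = "numeric", length) {
  if (type == "Date") {
    # base R has no is.Date(), so Dates need a special case
    ok <- inherits(param, "Date")
  } else {
    # look up is.numeric(), is.character(), is.logical(), ...
    is_type <- match.fun(paste0("is.", type))
    ok <- is_type(param)
  }
  if (!ok) {
    stop(name, " must be of type ", type, call. = FALSE)
  }
  # a missing `length` argument means any length is allowed
  if (!missing(length) && base::length(param) != length) {
    stop(name, " must be of length ", length, call. = FALSE)
  }
  invisible(TRUE)
}

sketch_check_param(c(0, 0.25), name = "r_init", type = "numeric", length = 2)
sketch_check_param(as.Date("2021-06-12"), name = "forecast_date", type = "Date")
```

Leaving length without a default and testing it with missing() is one way to express "any length allowed" without a sentinel value.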


Check Quantiles Required are Present

Description

Check Quantiles Required are Present

Usage

check_quantiles(posterior, req_probs = c(0.5, 0.95, 0.2, 0.8))

Arguments

posterior

A dataframe containing quantiles identified using the q5 naming scheme.

req_probs

A numeric vector of required probabilities.

See Also

Functions used for checking inputs check_dataframe(), check_observations(), check_param()
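
As an illustration of the q5 naming scheme, the sketch below assumes quantile columns are named "q" followed by the probability in percent (so 0.05 becomes q5). The helper sketch_has_quantiles() and the toy table are hypothetical and only show the kind of check described here.

```r
# Hypothetical check, assuming columns are named "q" + 100 * probability.
sketch_has_quantiles <- function(posterior, req_probs = c(0.5, 0.95, 0.2, 0.8)) {
  req_cols <- paste0("q", req_probs * 100)
  missing_cols <- setdiff(req_cols, names(posterior))
  if (length(missing_cols) > 0) {
    stop("Missing quantile columns: ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# toy summary table with made-up values
toy <- data.frame(q5 = 1, q20 = 2, q50 = 3, q80 = 4, q95 = 5)
sketch_has_quantiles(toy, req_probs = c(0.05, 0.2, 0.8, 0.95))
```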


Convert to stanfit object

Description

Convert to stanfit object

Usage

convert_to_stanfit(fit)

Arguments

fit

List of output as returned by fv_sample().

Value

The model fit as a stanfit object

See Also

Functions used for postprocessing of model fits extract_draws(), extract_forecast_dates(), fv_extract_forecast(), fv_posterior(), fv_tidy_posterior(), link_dates_with_posterior(), link_obs_with_posterior(), plot.fv_posterior(), print.fv_posterior(), quantiles_to_long(), summary.fv_posterior(), update_voc_label()

Examples

obs <- filter_by_availability(
  germany_covid19_delta_obs,
  date = as.Date("2021-06-12")
)
dt <- fv_as_data_list(obs)
inits <- fv_inits(dt)
fit <- fv_sample(dt, init = inits, adapt_delta = 0.99, max_treedepth = 15)
convert_to_stanfit(fit)

Define data availability scenarios

Description

Define data availability scenarios

Usage

define_scenarios(
  seq_lag = 0:3,
  seq_samples = seq(1, by = -0.25, length.out = 4),
  voc_scale = list(c(0, 0.5))
)

Arguments

seq_lag

The number of weeks that sequences lag the date. Default is to test 0 to 3 weeks of lag.

seq_samples

Fraction of samples to include (deterministic scaling). The default is to test all samples down to 25% of samples by 25% increments.

voc_scale

A list of mean and standard deviation pairs used to inform the prior for the additional transmissibility of the VOC. The default is a single uninformed (no prior knowledge) prior, (0, 0.5). A more informed pair, such as an adjusted growth prior of (0.74, 0.1), can be supplied instead.

Value

A data frame of scenario definitions with ids

See Also

Functions to define and create data scenarios generate_obs_scenario(), update_obs_availability()

Examples

define_scenarios()
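
To make the size of the default scenario set concrete, the base R sketch below builds a grid from the default argument values shown in the usage. It assumes scenarios are the full crossing of seq_lag, seq_samples, and voc_scale. That crossing is an assumption made for illustration, and the actual columns returned by define_scenarios() may differ.

```r
# Default argument values copied from the usage above.
seq_lag <- 0:3
seq_samples <- seq(1, by = -0.25, length.out = 4)  # 1, 0.75, 0.5, 0.25
voc_scale <- list(c(0, 0.5))

# Assumption: scenarios are the full crossing of the three arguments.
grid <- expand.grid(
  seq_lag = seq_lag,
  seq_samples = seq_samples,
  voc_scale = seq_along(voc_scale)
)
# replace the index with the actual (mean, sd) pairs as a list column
grid$voc_scale <- voc_scale[grid$voc_scale]
grid$id <- seq_len(nrow(grid))

# 4 lags x 4 sampling fractions x 1 prior = 16 rows
nrow(grid)
```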

Extract posterior draws

Description

Extract posterior draws

Usage

extract_draws(fit, ...)

Arguments

fit

A list as produced by fv_sample().

...

Additional parameters passed to cmdstanr::draws()

Value

A cmdstanr::draws() object from the posterior package.

See Also

Functions used for postprocessing of model fits convert_to_stanfit(), extract_forecast_dates(), fv_extract_forecast(), fv_posterior(), fv_tidy_posterior(), link_dates_with_posterior(), link_obs_with_posterior(), plot.fv_posterior(), print.fv_posterior(), quantiles_to_long(), summary.fv_posterior(), update_voc_label()

Examples

obs <- filter_by_availability(
  germany_covid19_delta_obs,
  date = as.Date("2021-06-12")
)
dt <- fv_as_data_list(obs)
inits <- fv_inits(dt)
fit <- fv_sample(dt, init = inits, adapt_delta = 0.99, max_treedepth = 15)
extract_draws(fit)

Extract forecast dates

Description

Extract forecast dates based on the availability of both case and sequence data.

Usage

extract_forecast_dates(posterior)

Arguments

posterior

A dataframe of posterior output as produced by fv_tidy_posterior(). For forecast dates to be extracted data with value_type == "cases" must be present.

Value

A data.frame containing at least two vectors: Data unavailable indicating the type of data that is missing, and date giving the date data was last available for.

See Also

Functions used for postprocessing of model fits convert_to_stanfit(), extract_draws(), fv_extract_forecast(), fv_posterior(), fv_tidy_posterior(), link_dates_with_posterior(), link_obs_with_posterior(), plot.fv_posterior(), print.fv_posterior(), quantiles_to_long(), summary.fv_posterior(), update_voc_label()

Examples

p <- fv_example(strains = 2, type = "posterior")

extract_forecast_dates(p)

Filter data based on availability and forecast date

Description

Filter data based on availability and forecast date

Usage

filter_by_availability(
  obs,
  date = max(obs$date),
  seq_date = date,
  case_date = date
)

Arguments

obs

A data.frame with the following variables: date, cases, seq_voc, seq_total, cases_available, and seq_available. Together, seq_available and cases_available must uniquely define data rows, but other rows can be duplicated based on data availability. This data format allows for multiple versions of case and sequence data for a given date with different reporting dates. This is important when using the package in evaluation settings or in real time, where data sources are liable to be updated as new data become available. See germany_covid19_delta_obs for an example of a supported data set.

date

Date at which to filter. Defaults to the maximum date in obs.

seq_date

Date from which to use available sequence data. Defaults to the date.

case_date

Date from which to use available case data. Defaults to the date.

Value

A data.frame of observations filtered for the latest available data for the specified dates of interest.

See Also

Preprocessing functions fv_dow_period(), latest_obs(), piecewise_steps()

Examples

options(mc.cores = 4)
obs <- filter_by_availability(
  germany_covid19_delta_obs,
  date = as.Date("2021-06-12")
)
dt <- rbind(
  update_obs_availability(obs, seq_lag = 3),
  update_obs_availability(obs, seq_lag = 1)
)
# filter out duplicates and up to the present date
filter_by_availability(dt)

# filter to only use sequence data up to the 12th of June
filter_by_availability(dt, seq_date = "2021-06-12")

# as above but only use case data up to the 1st of July
filter_by_availability(dt,
  seq_date = "2021-06-12",
  case_date = "2021-07-01"
)
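
The idea of keeping the latest version of each observation that was available at a given date can be sketched in base R on toy data. This is not the package's implementation: the helper sketch_latest_as_of() is hypothetical, it handles case data only, and all numbers are made up.

```r
# Toy data: cases for 2021-06-05 were first reported on 2021-06-05 and
# later revised on 2021-06-19. All numbers are made up for illustration.
toy <- data.frame(
  date = as.Date(c("2021-06-05", "2021-06-05", "2021-06-12")),
  cases = c(100, 120, 150),
  cases_available = as.Date(c("2021-06-05", "2021-06-19", "2021-06-12"))
)

# Hypothetical helper: keep, for each date, the most recent version that
# had been published on or before `as_of`.
sketch_latest_as_of <- function(obs, as_of) {
  as_of <- as.Date(as_of)
  visible <- obs[obs$cases_available <= as_of, ]
  visible <- visible[order(visible$date, visible$cases_available), ]
  # after sorting, the last row within each date is the latest version
  visible[!duplicated(visible$date, fromLast = TRUE), ]
}

sketch_latest_as_of(toy, "2021-06-12")  # sees the original 100 cases
sketch_latest_as_of(toy, "2021-06-26")  # sees the revised 120 cases
```

Storing an availability date alongside each observation is what allows a forecast made "as of" a past date to be reproduced later, after the data have been revised.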

Forecast using branching processes at a target date

Description

Forecast using branching processes at a target date

Usage

forecast(
  obs,
  forecast_date = max(obs$date),
  seq_date = forecast_date,
  case_date = forecast_date,
  data_list = forecast.vocs::fv_as_data_list,
  inits = forecast.vocs::fv_inits,
  fit = forecast.vocs::fv_sample,
  posterior = forecast.vocs::fv_tidy_posterior,
  extract_forecast = forecast.vocs::fv_extract_forecast,
  horizon = 4,
  r_init = c(0, 0.25),
  r_step = 1,
  r_forecast = TRUE,
  beta = c(0, 0.1),
  lkj = 0.5,
  period = NULL,
  special_periods = c(),
  voc_scale = c(0, 0.2),
  voc_label = "VOC",
  strains = 2,
  variant_relationship = "correlated",
  overdispersion = TRUE,
  models = NULL,
  likelihood = TRUE,
  output_loglik = FALSE,
  debug = FALSE,
  keep_fit = TRUE,
  scale_r = 1,
  digits = 3,
  timespan = 7,
  probs = c(0.05, 0.2, 0.8, 0.95),
  id = 0,
  ...
)

Arguments

obs

A data.frame with the following variables: date, cases, seq_voc, seq_total, cases_available, and seq_available. Together, seq_available and cases_available must uniquely define data rows, but other rows can be duplicated based on data availability. This data format allows for multiple versions of case and sequence data for a given date with different reporting dates. This is important when using the package in evaluation settings or in real time, where data sources are liable to be updated as new data become available. See germany_covid19_delta_obs for an example of a supported data set.

forecast_date

Date at which to forecast. Defaults to the maximum date in obs.

seq_date

Date from which to use available sequence data. Defaults to forecast_date.

case_date

Date from which to use available case data. Defaults to forecast_date.

data_list

A function that returns a list of data as ingested by the inits and fit function. Must use arguments as defined in fv_as_data_list(). If not supplied the package default fv_as_data_list() is used.

inits

A function that returns a function to sample initial conditions, with the same arguments as fv_inits(). If not supplied the package default fv_inits() is used.

fit

A function that fits the supplied model with the same arguments and return values as fv_sample(). If not supplied the package default fv_sample() is used which performs MCMC sampling using cmdstanr.

posterior

A function that summarises the output from the supplied fitting function. It should take the same arguments as fv_tidy_posterior() and return the same values, to the extent that downstream package functionality requires them. If not supplied the package default fv_tidy_posterior() is used.

extract_forecast

A function that extracts the forecast from the summarised posterior. If not supplied the package default fv_extract_forecast() is used.

horizon

Integer forecast horizon. Defaults to 4.

r_init

Numeric vector of length 2. Mean and standard deviation for the normal prior on the initial log growth rate.

r_step

Integer, defaults to 1. The number of observations between each change in the growth rate.

r_forecast

Logical, defaults to TRUE. Should the growth rate be forecast beyond the data horizon.

beta

Numeric vector, defaults to c(0, 0.1). Represents the mean and standard deviation of the normal prior (truncated at 1 and -1) on the weighting in the differenced AR process of the previous difference. Placing a tight prior around zero effectively reduces the AR process to a random walk on the growth rate.

lkj

Numeric, defaults to 0.5. The assumed prior covariance between variants' growth rates when using the "correlated" model. This sets the shape parameter for the Lewandowski-Kurowicka-Joe (LKJ) prior distribution. A value of 1 assigns a uniform prior for all correlations, values less than 1 indicate increased belief in strong correlations, and values greater than 1 indicate increased belief in weaker correlations. Our default setting places increased weight on some correlation between strains.

period

Function, defaults to NULL. If specified, should be a function that accepts a vector of dates. This can be used to assign periodic effects to dates, which will then be adjusted for in the case model. An example is adjusting for day of the week effects, for which fv_dow_period() can be used.

special_periods

A vector of dates to pass to the period function argument with the same name, to be treated as "special", for example holidays being treated as Sundays in fv_dow_period().

voc_scale

Numeric vector of length 2. Prior mean and standard deviation for the initial growth rate modifier due to the variant of concern.

voc_label

A character string, defaults to "VOC". Defines the label to assign to variant of concern specific parameters. Example usage is to rename parameters to use variant specific terminology.

strains

Integer number of strains to use. Defaults to 2. Current maximum is 2. A numeric vector can be passed if forecasts from multiple strain models are desired.

variant_relationship

Character string, defaulting to "correlated". Controls the relationship of strains with options being "correlated" (strains growth rates are correlated over time), "scaled" (a fixed scaling between strains), and "independent" (fully independent strains after initial scaling).

overdispersion

Logical, defaults to TRUE. Should the observations used include overdispersion.

models

A model as supplied by fv_model(). If not supplied the default for that strain is used. If multiple strain models are being forecast then models should be a list of models.

likelihood

Logical, defaults to TRUE. Should the likelihood be included in the model

output_loglik

Logical, defaults to FALSE. Should the log-likelihood be output. Disabling this will speed up fitting if evaluating the model fit is not required.

debug

Logical, defaults to FALSE. Should within model debug information be returned.

keep_fit

Logical, defaults to TRUE. Should the stan model fit be kept and returned. Dropping this can substantially reduce memory usage in situations where multiple models are being fit.

scale_r

Numeric, defaults to 1. Rescale the timespan over which the growth rate and reproduction number are calculated. An example use case is rescaling the growth rate from weekly to be scaled by the mean of the generation time (for COVID-19, for example, this would be 5.5 / 7).

digits

Numeric, defaults to 3. Number of digits to round summary statistics to.

timespan

Integer, defaults to 7. Indicates the number of days between each observation. Defaults to a week.

probs

A vector of numeric probabilities to produce quantile summaries for. By default these are the 5%, 20%, 80%, and 95% quantiles which are also the minimum set required for plotting functions to work (such as plot_cases(), plot_rt(), and plot_voc_frac()).

id

ID to assign to this forecast. Defaults to 0.

...

Additional parameters passed to fv_sample().

Value

A data.frame containing the output of fv_sample() in each row as well as the summarised posterior, forecast and information about the parameters specified.

See Also

Functions used for forecasting across models, dates, and scenarios forecast_across_dates(), forecast_across_scenarios(), forecast_n_strain(), plot.fv_forecast(), summary.fv_forecast(), unnest_posterior()

Examples

options(mc.cores = 4)

forecasts <- forecast(
  germany_covid19_delta_obs,
  forecast_date = as.Date("2021-06-12"),
  horizon = 4,
  strains = c(1, 2),
  adapt_delta = 0.99,
  max_treedepth = 15,
  variant_relationship = "scaled"
)

# inspect forecasts
forecasts

# extract the model summary
summary(forecasts, type = "model")

# plot case posterior predictions
plot(forecasts, log = TRUE)

# plot voc posterior predictions
plot(forecasts, type = "voc_frac")

# extract the case forecast
summary(forecasts, type = "cases", forecast = TRUE)
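
The role of the beta argument described above can be illustrated with a small simulation. The sketch assumes a differenced AR(1) form, in which each change in the growth rate is beta times the previous change plus noise. This form is inferred from the argument description rather than taken from the package's Stan code, and sketch_growth_path() with its noise scale is hypothetical.

```r
# Hypothetical simulator for a differenced AR(1) on the growth rate:
#   diff_t = beta * diff_{t-1} + eps_t,  r_t = r_{t-1} + diff_t
# Assumed form for illustration; see the package's Stan code for the model.
sketch_growth_path <- function(n, beta, r0 = 0, sd = 0.1) {
  eps <- rnorm(n, mean = 0, sd = sd)
  d <- numeric(n)
  d[1] <- eps[1]
  for (t in seq_len(n)[-1]) {
    d[t] <- beta * d[t - 1] + eps[t]
  }
  r0 + cumsum(d)
}

set.seed(1)
rw <- sketch_growth_path(20, beta = 0)    # beta = 0: plain random walk
set.seed(1)
ar <- sketch_growth_path(20, beta = 0.8)  # beta near 1: changes persist
```

With beta fixed at zero each change is pure noise, which is why a tight prior around zero reduces the process to a random walk on the growth rate.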

Forecast across multiple dates

Description

Forecast across multiple dates

Usage

forecast_across_dates(
  obs,
  forecast_dates = unique(obs[!is.na(seq_available)])$date[-c(1:3)],
  ...
)

Arguments

obs

A data.frame with the following variables: date, cases, seq_voc, seq_total, cases_available, and seq_available. Together, seq_available and cases_available must uniquely define data rows, but other rows can be duplicated based on data availability. This data format allows for multiple versions of case and sequence data for a given date with different reporting dates. This is important when using the package in evaluation settings or in real time, where data sources are liable to be updated as new data become available. See germany_covid19_delta_obs for an example of a supported data set.

forecast_dates

A list of dates to forecast at.

...

Additional parameters passed to forecast().

Value

A data.table each row containing the output from running forecast() on a single forecast date.

See Also

Functions used for forecasting across models, dates, and scenarios forecast_across_scenarios(), forecast_n_strain(), forecast(), plot.fv_forecast(), summary.fv_forecast(), unnest_posterior()

Examples

library(ggplot2)
options(mc.cores = 4)

forecasts <- forecast_across_dates(
  germany_covid19_delta_obs,
  forecast_dates = c(as.Date("2021-05-01"), as.Date("2021-06-12")),
  horizon = 4,
  strains = 2,
  adapt_delta = 0.99,
  max_treedepth = 15,
  variant_relationship = "scaled"
)

# inspect forecasts
forecasts

# unnest posteriors
posteriors <- unnest_posterior(forecasts)

# plot case posterior predictions
plot_cases(posteriors, log = TRUE) +
  facet_grid(vars(forecast_date), vars(voc_scale))

Forecast across multiple scenarios and dates

Description

Forecast across multiple scenarios and dates

Usage

forecast_across_scenarios(obs, scenarios, ...)

Arguments

obs

A data.frame with the following variables: date, cases, seq_voc, seq_total, cases_available, and seq_available. Together, seq_available and cases_available must uniquely define data rows, but other rows can be duplicated based on data availability. This data format allows for multiple versions of case and sequence data for a given date with different reporting dates. This is important when using the package in evaluation settings or in real time, where data sources are liable to be updated as new data become available. See germany_covid19_delta_obs for an example of a supported data set.

scenarios

A data.frame of scenarios as produced by define_scenarios(). If an obs variable is present this is used as the scenario data but otherwise generate_obs_scenario() is used to generate this data from the other variables in scenarios.

...

Additional parameters passed to forecast_across_dates().

Value

A data.table, each row containing the output from running forecast() on a single scenario for a single forecast date.

See Also

Functions used for forecasting across models, dates, and scenarios forecast_across_dates(), forecast_n_strain(), forecast(), plot.fv_forecast(), summary.fv_forecast(), unnest_posterior()

Examples

library(ggplot2)
options(mc.cores = 4)

scenarios <- define_scenarios(
  voc_scale = list(c(0, 0.5), c(0.5, 0.25)),
  seq_lag = 1,
  seq_samples = 1
)
scenarios

forecasts <- forecast_across_scenarios(
  germany_covid19_delta_obs,
  scenarios,
  forecast_dates = c(as.Date("2021-05-01"), as.Date("2021-06-12")),
  horizon = 4,
  strains = 2,
  adapt_delta = 0.99,
  max_treedepth = 15,
  variant_relationship = "scaled"
)

# inspect forecasts
forecasts

# unnest posteriors
posteriors <- unnest_posterior(forecasts)

# plot case posterior predictions
plot_cases(posteriors, log = TRUE) +
  facet_grid(vars(forecast_date))

Forecast for a single model and summarise

Description

Forecast for a single model and summarise

Usage

forecast_n_strain(
  data,
  model = NULL,
  inits = forecast.vocs::fv_inits,
  fit = forecast.vocs::fv_sample,
  posterior = forecast.vocs::fv_tidy_posterior,
  extract_forecast = forecast.vocs::fv_extract_forecast,
  strains = 2,
  voc_label = "VOC",
  probs = c(0.05, 0.2, 0.8, 0.95),
  digits = 3,
  scale_r = 1,
  timespan = 7,
  ...
)

Arguments

data

A list of data as produced by fv_as_data_list().

model

A cmdstanr model object as loaded by fv_model().

inits

A function that returns a function to sample initial conditions, with the same arguments as fv_inits(). If not supplied the package default fv_inits() is used.

fit

A function that fits the supplied model with the same arguments and return values as fv_sample(). If not supplied the package default fv_sample() is used which performs MCMC sampling using cmdstanr.

posterior

A function that summarises the output from the supplied fitting function. It should take the same arguments as fv_tidy_posterior() and return the same values, to the extent that downstream package functionality requires them. If not supplied the package default fv_tidy_posterior() is used.

extract_forecast

A function that extracts the forecast from the summarised posterior. If not supplied the package default fv_extract_forecast() is used.

strains

Integer number of strains. Defaults to 2. Current maximum is 2.

voc_label

A character string, defaults to "VOC". Defines the label to assign to variant of concern specific parameters. Example usage is to rename parameters to use variant specific terminology.

probs

A vector of numeric probabilities to produce quantile summaries for. By default these are the 5%, 20%, 80%, and 95% quantiles which are also the minimum set required for plotting functions to work (such as plot_cases(), plot_rt(), and plot_voc_frac()).

digits

Numeric, defaults to 3. Number of digits to round summary statistics to.

scale_r

Numeric, defaults to 1. Rescale the timespan over which the growth rate and reproduction number are calculated. An example use case is rescaling the growth rate from weekly to be scaled by the mean of the generation time (for COVID-19, for example, this would be 5.5 / 7).

timespan

Integer, defaults to 7. Indicates the number of days between each observation. Defaults to a week.

...

Additional parameters passed to fv_sample().

See Also

Functions used for forecasting across models, dates, and scenarios forecast_across_dates(), forecast_across_scenarios(), forecast(), plot.fv_forecast(), summary.fv_forecast(), unnest_posterior()
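
The arithmetic behind the scale_r example (5.5 / 7) can be written out as follows. The growth rate value is made up, and the final step, approximating the reproduction number as the exponential of the rescaled growth rate, is an assumption for illustration rather than something stated in this documentation.

```r
# Assumed values for illustration only.
r_weekly <- 0.2     # growth rate per 7-day timespan (made up)
scale_r <- 5.5 / 7  # mean generation time in days / days per timestep

# growth rate expressed per generation time rather than per week
r_generation <- r_weekly * scale_r

# Assumption: the reproduction number is approximated by
# exp(growth rate per generation); not confirmed by the text above.
rt_approx <- exp(r_generation)
```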


Format data for use with stan

Description

Format data for use with stan

Usage

fv_as_data_list(
  obs,
  horizon = 4,
  r_init = c(0, 0.25),
  r_step = 1,
  r_forecast = TRUE,
  beta = c(0, 0.5),
  lkj = 0.5,
  voc_scale = c(0, 0.2),
  period = NULL,
  special_periods = c(),
  variant_relationship = "correlated",
  overdispersion = TRUE,
  likelihood = TRUE,
  output_loglik = TRUE,
  debug = FALSE
)

Arguments

obs

A data frame with the following variables: date, cases, seq_voc, and seq_total.

horizon

Integer forecast horizon. Defaults to 4.

r_init

Numeric vector of length 2. Mean and standard deviation for the normal prior on the initial log growth rate.

r_step

Integer, defaults to 1. The number of observations between each change in the growth rate.

r_forecast

Logical, defaults to TRUE. Should the growth rate be forecast beyond the data horizon.

beta

Numeric vector, defaults to c(0, 0.5). Represents the mean and standard deviation of the normal prior (truncated at 1 and -1) on the weighting in the differenced AR process of the previous difference. Placing a tight prior around zero effectively reduces the AR process to a random walk on the growth rate.

lkj

Numeric, defaults to 0.5. The assumed prior covariance between variants' growth rates when using the "correlated" model. This sets the shape parameter for the Lewandowski-Kurowicka-Joe (LKJ) prior distribution. A value of 1 assigns a uniform prior for all correlations, values less than 1 indicate increased belief in strong correlations, and values greater than 1 indicate increased belief in weaker correlations. Our default setting places increased weight on some correlation between strains.

voc_scale

Numeric vector of length 2. Prior mean and standard deviation for the initial growth rate modifier due to the variant of concern.

period

Function, defaults to NULL. If specified, should be a function that accepts a vector of dates. This can be used to assign periodic effects to dates, which will then be adjusted for in the case model. An example is adjusting for day of the week effects, for which fv_dow_period() can be used.

special_periods

A vector of dates to pass to the period function argument with the same name, to be treated as "special", for example holidays being treated as Sundays in fv_dow_period().

variant_relationship

Character string, defaulting to "correlated". Controls the relationship of strains with options being "correlated" (strains growth rates are correlated over time), "scaled" (a fixed scaling between strains), and "independent" (fully independent strains after initial scaling).

overdispersion

Logical, defaults to TRUE. Should the observations used include overdispersion.

likelihood

Logical, defaults to TRUE. Should the likelihood be included in the model

output_loglik

Logical, defaults to TRUE. Should the log-likelihood be output. Disabling this will speed up fitting if evaluating the model fit is not required.

debug

Logical, defaults to FALSE. Should within model debug information be returned.

Value

A list as required by stan.

See Also

Functions used for modelling fv_inits(), fv_model(), fv_sample()

Examples

fv_as_data_list(latest_obs(germany_covid19_delta_obs))

Calculate the day of the week periodicity

Description

This helper function allows the user to generate a vector of day of the week periods.

Usage

fv_dow_period(t, start_date, specials = c(), special_to = "Sunday")

Arguments

t

An integer indicating the number of dates

start_date

A date indicating the start date

specials

A vector of special dates to modify the day of the week for.

special_to

A character string indicating which day of the week or other label to assign holidays. By default this is set to "Sunday"

Value

A vector indicating the period of the dates.

See Also

Preprocessing functions filter_by_availability(), latest_obs(), piecewise_steps()

Examples

fv_dow_period(t = 10, start_date = as.Date("2021-12-01"))
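
The behaviour described for this helper can be sketched in base R. This is a hypothetical re-implementation of the idea, not the package's code: the name sketch_dow_period() and the choice to return English day names as labels are assumptions.

```r
# Hypothetical re-implementation of the idea; not the package's code.
sketch_dow_period <- function(t, start_date, specials = c(),
                              special_to = "Sunday") {
  dates <- seq(as.Date(start_date), by = "day", length.out = t)
  day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
                 "Thursday", "Friday", "Saturday")
  # POSIXlt wday runs from 0 (Sunday) to 6 and does not depend on locale
  period <- day_names[as.POSIXlt(dates)$wday + 1]
  # relabel special dates, e.g. treat holidays like Sundays
  if (length(specials) > 0) {
    period[dates %in% as.Date(specials)] <- special_to
  }
  period
}

sketch_dow_period(
  t = 10, start_date = as.Date("2021-12-01"),
  specials = as.Date("2021-12-03")
)
```

Mapping holidays onto an existing label such as "Sunday" lets them share the reporting effect estimated for that day rather than needing a separate parameter.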

Load a package example

Description

Loads examples of posterior and forecast summaries produced using example scripts. Used to streamline examples, in package tests and to enable users to explore package functionality without needing to install cmdstanr.

Usage

fv_example(strains = 1, type = "posterior")

Arguments

strains

Integer number of strains. Defaults to 1. Current maximum is 2.

type

A character string indicating the example to load. Supported options are "posterior", "forecast", "observations", and "script" which are the output of fv_tidy_posterior(), fv_extract_forecast(), filter_by_availability (with the date argument set to "2021-08-26" applied to the germany_covid19_delta_obs package dataset), and the script used to generate these examples respectively.

Value

A data.table of summarised output

See Also

Package data sets germany_covid19_delta_obs

Examples

# Load the summarised posterior from an example fit of the one strain model
fv_example(strains = 1, type = "posterior")

# Load the summarised forecast from this posterior
fv_example(strains = 1, type = "forecast")

# Load the script used to generate these examples
# Optionally source this script to regenerate the example
readLines(fv_example(strains = 1, type = "script"))

Extract forecasts from a summarised posterior

Description

Uses the observed variable returned by fv_tidy_posterior() to return posterior predictions for forecast dates only.

Usage

fv_extract_forecast(posterior)

Arguments

posterior

A dataframe of posterior output as produced by fv_tidy_posterior(). For forecast dates to be extracted data with value_type == "cases" must be present.

Value

A data.frame of forecasts in the format returned by fv_tidy_posterior() but with fitting variables dropped.

See Also

Functions used for postprocessing of model fits convert_to_stanfit(), extract_draws(), extract_forecast_dates(), fv_posterior(), fv_tidy_posterior(), link_dates_with_posterior(), link_obs_with_posterior(), plot.fv_posterior(), print.fv_posterior(), quantiles_to_long(), summary.fv_posterior(), update_voc_label()

Examples

p <- fv_example(strains = 2, type = "posterior")

fv_extract_forecast(p)

Set up initial conditions for model

Description

Set up initial conditions for model

Usage

fv_inits(data, strains = 2)

Arguments

data

A list of data as produced by fv_as_data_list().

strains

Integer number of strains. Defaults to 2. Current maximum is 2.

Value

A function that when called returns a list of initial conditions for the package stan models.

See Also

Functions used for modelling fv_as_data_list(), fv_model(), fv_sample()

Examples

dt <- fv_as_data_list(latest_obs(germany_covid19_delta_obs))
inits <- fv_inits(dt)
inits
inits()
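
The pattern of returning a function rather than a fixed list can be sketched with a closure. The parameter and data names below (r_init, r_init_mean, r_init_sd) are made up for illustration and do not correspond to the package's Stan model, and sketch_inits() is hypothetical.

```r
# Hypothetical generator; parameter and data names are made up and do not
# correspond to the package's Stan model.
sketch_inits <- function(data) {
  function() {
    list(
      # draw the initial growth rate close to its prior mean
      r_init = rnorm(1, mean = data$r_init_mean, sd = data$r_init_sd * 0.1)
    )
  }
}

init_fn <- sketch_inits(list(r_init_mean = 0, r_init_sd = 0.25))
init_fn()  # a fresh random draw on each call
```

Because the sampler can call the returned function once per chain, each chain starts from a different random point while sharing the same data-dependent settings.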

Load and compile a strain model

Description

Load and compile a strain model

Usage

fv_model(model, include, strains = 2, compile = TRUE, verbose = FALSE, ...)

Arguments

model

A character string indicating the path to the model. If not supplied the package default model is used.

include

A character string specifying the path to any stan files to include in the model. If missing the package default is used.

strains

Integer number of strains. Defaults to 2. Current maximum is 2.

compile

Logical, defaults to TRUE. Should the model be loaded and compiled using cmdstanr::cmdstan_model().

verbose

Logical, defaults to FALSE. Should verbose messages be shown.

...

Additional arguments passed to cmdstanr::cmdstan_model().

Value

A cmdstanr model.

See Also

Functions used for modelling fv_as_data_list(), fv_inits(), fv_sample()

Examples

# one strain model
mod <- fv_model(strains = 1)

# two strain model
two_strain_mod <- fv_model(strains = 2)

Summarise the posterior

Description

A generic wrapper around posterior::summarise_draws() with opinionated defaults. See fv_tidy_posterior() for a more opinionated wrapper with further cleaning and tidying including linking to observed data, tidying parameter names, and transforming parameters for interpretability.

Usage

fv_posterior(fit, probs = c(0.05, 0.2, 0.8, 0.95), digits = 3, ...)

Arguments

fit

List of output as returned by fv_sample().

probs

A vector of numeric probabilities to produce quantile summaries for. By default these are the 5%, 20%, 80%, and 95% quantiles which are also the minimum set required for plotting functions to work (such as plot_cases(), plot_rt(), and plot_voc_frac()).

digits

Numeric, defaults to 3. Number of digits to round summary statistics to.

...

Additional arguments that may be passed but will not be used.
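
As a rough illustration of what a quantile summary at the default probs looks like, the base R sketch below summarises simulated draws for a single made-up parameter. It does not call the package or posterior::summarise_draws(), and the draws are not from a fitted model.

```r
set.seed(1)
# Simulated posterior draws for one parameter (illustrative only)
draws <- rnorm(4000, mean = 1.2, sd = 0.3)

probs <- c(0.05, 0.2, 0.8, 0.95)
summary_row <- round(
  c(mean = mean(draws), median = median(draws), quantile(draws, probs = probs)),
  digits = 3
)
summary_row
```

The 5% and 95% quantiles bound a 90% interval and the 20% and 80% quantiles bound a 60% interval.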

Value

A dataframe summarising the model posterior.

See Also

Functions used for postprocessing of model fits convert_to_stanfit(), extract_draws(), extract_forecast_dates(), fv_extract_forecast(), fv_tidy_posterior(), link_dates_with_posterior(), link_obs_with_posterior(), plot.fv_posterior(), print.fv_posterior(), quantiles_to_long(), summary.fv_posterior(), update_voc_label()

Examples

options(mc.cores = 4)
obs <- filter_by_availability(
  germany_covid19_delta_obs,
  date = as.Date("2021-06-12")
)
dt <- fv_as_data_list(obs)
inits <- fv_inits(dt)
fit <- fv_sample(dt, init = inits, adapt_delta = 0.99, max_treedepth = 15)
fv_posterior(fit)

Fit a branching process strain model

Description

Fit a branching process strain model

Usage

fv_sample(
  data,
  model = forecast.vocs::fv_model(strains = 2),
  diagnostics = TRUE,
  ...
)

Arguments

data

A list of data as produced by fv_as_data_list().

model

A cmdstanr model object as loaded by fv_model().

diagnostics

Logical, defaults to TRUE. Should fitting diagnostics be returned as a data.frame.

...

Additional parameters passed to the sample method of cmdstanr.

Value

A data.frame containing the cmdstanr fit, the input data, the fitting arguments, and optionally summary diagnostics.

See Also

Functions used for modelling fv_as_data_list(), fv_inits(), fv_model()

Examples

options(mc.cores = 4)

# format example data
obs <- filter_by_availability(
  germany_covid19_delta_obs,
  date = as.Date("2021-06-12")
)
dt <- fv_as_data_list(obs)

# single strain model
inits <- fv_inits(dt, strains = 1)
mod <- fv_model(strains = 1)
fit <- fv_sample(
  dt,
  model = mod, init = inits,
  adapt_delta = 0.99, max_treedepth = 15
)
fit

# two strain model
inits <- fv_inits(dt, strains = 2)

mod <- fv_model(strains = 2)

two_strain_fit <- fv_sample(dt,
  model = mod, init = inits,
  adapt_delta = 0.99, max_treedepth = 15
)
two_strain_fit

Evaluate forecasts using proper scoring rules

Description

Acts as a wrapper to scoringutils::score(). In particular, it filters the output of various forecast.vocs functions and links that output to observed data. See the documentation for the scoringutils package for more on forecast scoring, and the examples below for simple usage in the context of forecast.vocs. Name clashes between scoringutils variables and forecast.vocs variables are handled internally.

Usage

fv_score_forecast(forecast, obs, log = FALSE, check = TRUE, round_to = 3, ...)

Arguments

forecast

A posterior forecast or posterior prediction as returned by summary.fv_posterior(), summary.fv_forecast() or fv_extract_forecast(). Internally, case forecasts are selected using the value_type variable if present, and only overall or combined case counts (i.e. as returned by the 1 and 2 strain models) are kept. For more complex scoring it may be wise to implement a custom wrapper.

obs

A data frame of observed data as produced by latest_obs().

log

Logical, defaults to FALSE. Should scores be calculated on the log scale (with a 0.01 shift) for both observations and forecasts. Scoring in this way can be thought of as a relative score vs the more usual absolute measure. It may be useful when targets are on very different scales or when the forecaster is more interested in good all round performance versus good performance for targets with large values.

check

Logical, defaults to TRUE. Should scoringutils::check_forecasts() be used to check input forecasts.

round_to

Integer, defaults to 3. Number of digits to round scoring output to.

...

Additional arguments passed to scoringutils::score().
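
To illustrate why scoring on the log scale behaves like a relative measure, the base R sketch below compares a simple absolute error on the natural scale with the same error after a log transform with a 0.01 shift. Absolute error stands in for the proper scoring rules that scoringutils applies, and the numbers are made up.

```r
observed <- c(10, 1000)
predicted <- c(20, 1100)

# Absolute error on the natural scale
abs_err <- abs(predicted - observed)

# Absolute error after a log transform with a 0.01 shift
shift <- 0.01
log_err <- abs(log(predicted + shift) - log(observed + shift))

data.frame(observed, predicted, abs_err, log_err = round(log_err, 3))
```

On the natural scale the larger target dominates the error. On the log scale the small target, which was overpredicted by a factor of two, has the larger error.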

Value

A data.table as returned by scoringutils::score().

See Also

Functions to explore and validate models bp_launch_shinystan(), plot_pairs()

Examples

options(mc.cores = 4)
library(data.table)
library(scoringutils)

# Fit and forecast using both the one and two strain models
forecasts <- forecast(
  germany_covid19_delta_obs,
  forecast_date = as.Date("2021-06-12"),
  horizon = 4,
  strains = c(1, 2),
  adapt_delta = 0.99,
  max_treedepth = 15,
  variant_relationship = "scaled"
)

# Extract forecasts
forecasts <- summary(forecasts, target = "forecast", type = "cases")

# Filter for the latest available observations
obs <- latest_obs(germany_covid19_delta_obs)

# score on the absolute scale
scores <- fv_score_forecast(forecasts, obs)
summarise_scores(scores, by = "strains")

# score overall on a log scale
log_scores <- fv_score_forecast(forecasts, obs, log = TRUE)
summarise_scores(log_scores, by = "strains")

# score by horizon
summarise_scores(scores, by = c("strains", "horizon"))

# score by horizon on a log scale
summarise_scores(log_scores, by = c("strains", "horizon"))

Summarise the posterior tidily

Description

A very opinionated wrapper around posterior::summarise_draws() with cleaning and tidying including linking to observed data, tidying parameter names, and transforming parameters for interpretability. See fv_posterior() for a more generic solution.

Usage

fv_tidy_posterior(
  fit,
  probs = c(0.05, 0.2, 0.8, 0.95),
  digits = 3,
  voc_label = "VOC",
  scale_r = 1,
  timespan = 7
)

Arguments

fit

List of output as returned by fv_sample().

probs

A vector of numeric probabilities to produce quantile summaries for. By default these are the 5%, 20%, 80%, and 95% quantiles which are also the minimum set required for plotting functions to work (such as plot_cases(), plot_rt(), and plot_voc_frac()).

digits

Numeric, defaults to 3. Number of digits to round summary statistics to.

voc_label

A character string, defaults to "VOC". Defines the label to assign to variant of concern specific parameters. Example usage is to rename parameters to use variant specific terminology.

scale_r

Numeric, defaults to 1. Rescale the timespan over which the growth rate and reproduction number is calculated. An example use case is rescaling the growth rate from weekly to be scaled by the mean of the generation time (for COVID-19 for example this would be 5.5 / 7).

timespan

Integer, defaults to 7. Indicates the number of days between each observation. Defaults to a week.
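
The arithmetic behind the scale_r example can be sketched in base R as follows. The weekly growth rate is made up, and the step that approximates the reproduction number as the exponential of the growth rate per generation is an assumption made here for illustration, not a statement of how the package computes it.

```r
weekly_growth <- 0.2 # illustrative weekly growth rate
scale_r <- 5.5 / 7   # mean generation time expressed in weeks

# Growth rate rescaled from per week to per generation time
growth_per_generation <- weekly_growth * scale_r

# Assumption: reproduction number approximated as exp(growth per generation)
rt_approx <- exp(growth_per_generation)

round(c(growth = growth_per_generation, rt = rt_approx), 3)
```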

Value

A dataframe summarising the model posterior. Output is stratified by value_type with posterior summaries by case, voc, voc advantage vs non-voc over time, rt, growth, model, and the raw posterior summary.

See Also

Functions used for postprocessing of model fits convert_to_stanfit(), extract_draws(), extract_forecast_dates(), fv_extract_forecast(), fv_posterior(), link_dates_with_posterior(), link_obs_with_posterior(), plot.fv_posterior(), print.fv_posterior(), quantiles_to_long(), summary.fv_posterior(), update_voc_label()

Examples

options(mc.cores = 4)
obs <- filter_by_availability(
  germany_covid19_delta_obs,
  date = as.Date("2021-06-12")
)
dt <- fv_as_data_list(obs)
inits <- fv_inits(dt)
fit <- fv_sample(dt, init = inits, adapt_delta = 0.99, max_treedepth = 15)
fv_tidy_posterior(fit)

Generate Simulated Observations

Description

Generate simulated observations from the prior or posterior distributions of a forecast.vocs model. When a single strain model is used only case data is simulated but when a multiple strain model is used sequence data is also simulated.

Usage

generate_obs(
  obs,
  strains = 2,
  model = forecast.vocs::fv_model(strains = strains),
  data_list = forecast.vocs::fv_as_data_list,
  inits = forecast.vocs::fv_inits,
  fit = forecast.vocs::fv_sample,
  type = "prior",
  datasets = 10,
  ...
)

Arguments

obs

Observed data to use to parameterise the model and used for fitting when the posterior is required.

strains

Integer number of strains to use. Defaults to 2. Current maximum is 2. A numeric vector can be passed if forecasts from multiple strain models are desired.

model

A cmdstanr model object as loaded by fv_model().

data_list

A function that returns a list of data as ingested by the inits and fit function. Must use arguments as defined in fv_as_data_list(). If not supplied the package default fv_as_data_list() is used.

inits

A function that returns a function to sample initial conditions with the same arguments as fv_inits(). If not supplied the package default fv_inits() is used.

fit

A function that fits the supplied model with the same arguments and return values as fv_sample(). If not supplied the package default fv_sample() is used which performs MCMC sampling using cmdstanr.

type

A character string indicating the type of data to generate. Supported options are data based on the "prior" or data based on the "posterior" with the default being the prior.

datasets

Numeric, defaults to 10. Number of datasets to generate.

...

Additional arguments to pass fv_as_data_list().

Value

A dataframe with a sampled dataset on each row with the following variables: parameters (prior/posterior parameters used to generate the data), obs (simulated observed data), and data (the simulated data formatted using the supplied data_list function, by default fv_as_data_list(), with the same arguments as specified for simulation).

See Also

Functions to generate simulated data sample_sequences()

Examples

options(mc.cores = 4)
obs <- latest_obs(germany_covid19_delta_obs)

sim_obs <- generate_obs(obs, voc_scale = c(0.8, 0.1), r_init = c(-0.1, 0.05))

# fit a simulated dataset
sim_dt <- sim_obs$data[[1]]
inits <- fv_inits(sim_dt)
fit <- fv_sample(
  sim_dt,
  init = inits, adapt_delta = 0.95, max_treedepth = 15
)

# summarise and plot simulated fit
posterior <- fv_tidy_posterior(fit)

plot_cases(posterior, log = TRUE)

plot_voc_frac(posterior)

plot_rt(posterior)

Define observed data for a scenario

Description

Define observed data for a scenario

Usage

generate_obs_scenario(obs, seq_lag, seq_samples)

Arguments

obs

A dataframe of observations as returned by latest_obs() or similar.

seq_lag

Number, number of weeks to lag sequence data behind date of observation.

seq_samples

Fraction of sequence samples to include.

Value

A data.frame of scenario definitions with ids

See Also

Functions to define and create data scenarios define_scenarios(), update_obs_availability()

Examples

generate_obs_scenario(latest_obs(germany_covid19_delta_obs), 4, 0.1)

Test positive COVID-19 cases and sequences in Germany

Description

Test positive COVID-19 cases and sequences stratified by voc variant status summarised by week for Germany. Data is sourced from the RKI via the Germany/Poland forecasting hub.

Usage

germany_covid19_delta_obs

Format

An object of class data.table (inherits from data.frame) with 114 rows and 9 columns.

Value

A data.table with the following variables: date, location_name, location, cases, seq_total, seq_voc, share_voc, cases_available, and seq_available.

See Also

Package data sets fv_example()


Filter for latest observations of all types

Description

Filter for latest observations of all types

Usage

latest_obs(obs)

Arguments

obs

A data.frame with the following variables: date, cases, seq_voc, seq_total, cases_available, and seq_available. seq_available and cases_available must uniquely define data rows, but other rows can be duplicated based on data availability. This data format allows for multiple versions of case and sequence data for a given date with different reporting dates. This is important when using the package in evaluation settings or in real-time where data sources are liable to be updated as new data becomes available. See germany_covid19_delta_obs for an example of a supported data set.

Value

A data.frame of observations filtered for the latest available data.
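
The idea of keeping only the most recently available version of each observation can be sketched in base R as below. This is a simplified illustration with made-up data that handles case data only. It is not the package implementation, which also has to account for sequence availability.

```r
obs <- data.frame(
  date = as.Date(c("2021-06-05", "2021-06-05", "2021-06-12")),
  cases = c(100, 120, 90),
  cases_available = as.Date(c("2021-06-05", "2021-06-12", "2021-06-12"))
)

# For each date keep the row with the most recent availability date
latest <- do.call(rbind, lapply(split(obs, obs$date), function(d) {
  d[which.max(as.numeric(d$cases_available)), ]
}))
rownames(latest) <- NULL
latest
```

Here the count for 2021-06-05 was revised from 100 to 120 a week later, so only the revised row is kept.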

See Also

Preprocessing functions filter_by_availability(), fv_dow_period(), piecewise_steps()

Examples

dt <- rbind(
  update_obs_availability(germany_covid19_delta_obs, seq_lag = 3),
  update_obs_availability(germany_covid19_delta_obs, seq_lag = 1)
)
latest_obs(dt)

Calculate piecewise steps

Description

This helper function streamlines the calculation of piecewise steps. This may be useful when specifying random walks, AR processes, etc.

Usage

piecewise_steps(t, step, offset = 0, steps_post_offset = TRUE)

Arguments

t

Integer, the timespan over which to calculate steps

step

Integer, the frequency at which to step.

offset

Integer, the amount to offset steps. This can be used to start indexing steps from this offset.

steps_post_offset

Logical, defaults to TRUE. Should steps be added after the offset.

Value

A list containing two elements: n (the number of steps) and steps (the location of steps as a binary variable).

See Also

Preprocessing functions filter_by_availability(), fv_dow_period(), latest_obs()
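
The general shape of the return value, a step count plus a binary indicator of step locations, can be sketched in base R as below. toy_steps is a hypothetical helper written for this illustration. It ignores offset and steps_post_offset, and the package's exact indexing of steps may differ.

```r
# Hypothetical helper: mark every `step`-th time point, starting at the first
toy_steps <- function(t, step) {
  steps <- as.integer((seq_len(t) - 1) %% step == 0)
  list(n = sum(steps), steps = steps)
}

toy_steps(t = 8, step = 3)
```

A cumulative sum of the indicator, cumsum(steps), maps each time point to the piecewise segment it belongs to, which is one way such output can drive a piecewise constant random walk.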


Plot the posterior prediction for cases

Description

Plot the posterior prediction for cases

Usage

plot_cases(
  posterior,
  obs = NULL,
  forecast_dates = NULL,
  all_obs = FALSE,
  central = FALSE,
  col = NULL,
  log = TRUE
)

Arguments

posterior

A dataframe of posterior output as produced by fv_tidy_posterior(). For forecast dates to be extracted data with value_type == "cases" must be present.

obs

A data frame of observed data as produced by latest_obs().

forecast_dates

A data.frame in the format produced by extract_forecast_dates() (with at least a date variable and a Data unavailable variable). Specifies when date availability should be added to plots. May contain faceting variables.

all_obs

Logical, defaults to FALSE. Should all observations be plotted or just those in the date range of the estimates being plotted.

central

Logical, defaults to FALSE. Should the mean and median central estimates be plotted as dashed and solid lines respectively. Requires mean and median variables to be present in the input.

col

A character string denoting the variable to use to stratify the ribbon plot. Defaults to "type" which indicates the data stream.

log

Logical, defaults to TRUE. Should cases be plotted on the log 2 scale?

Value

A ggplot2 plot.

See Also

Plotting functions add_forecast_dates(), plot.fv_forecast(), plot.fv_posterior(), plot_default(), plot_growth(), plot_pairs(), plot_posterior(), plot_rt(), plot_theme(), plot_voc_advantage(), plot_voc_frac(), save_plots()

Examples

posterior <- fv_example(strains = 2, type = "posterior")

# default with log transform
plot_cases(posterior)

# without log transform
plot_cases(posterior, log = FALSE)

Default posterior plot

Description

Default posterior plot

Usage

plot_default(
  posterior,
  target,
  obs = NULL,
  forecast_dates = NULL,
  central = FALSE,
  all_obs = FALSE,
  ...
)

Arguments

posterior

A dataframe of posterior output as produced by fv_tidy_posterior(). For forecast dates to be extracted data with value_type == "cases" must be present.

target

A character string indicating which variable to extract from the posterior list.

obs

A data frame of observed data as produced by latest_obs().

forecast_dates

A data.frame in the format produced by extract_forecast_dates() (with at least a date variable and a Data unavailable variable). Specifies when date availability should be added to plots. May contain faceting variables.

central

Logical, defaults to FALSE. Should the mean and median central estimates be plotted as dashed and solid lines respectively. Requires mean and median variables to be present in the input.

all_obs

Logical, defaults to FALSE. Should all observations be plotted or just those in the date range of the estimates being plotted.

...

Additional arguments passed to ggplot2::aes()

Value

A ggplot2 plot.

See Also

Plotting functions add_forecast_dates(), plot.fv_forecast(), plot.fv_posterior(), plot_cases(), plot_growth(), plot_pairs(), plot_posterior(), plot_rt(), plot_theme(), plot_voc_advantage(), plot_voc_frac(), save_plots()


Plot the posterior prediction for the growth rate

Description

Plot the posterior prediction for the growth rate

Usage

plot_growth(posterior, forecast_dates = NULL, central = FALSE, col = NULL)

Arguments

posterior

A dataframe of posterior output as produced by fv_tidy_posterior(). For forecast dates to be extracted data with value_type == "cases" must be present.

forecast_dates

A data.frame in the format produced by extract_forecast_dates() (with at least a date variable and a Data unavailable variable). Specifies when date availability should be added to plots. May contain faceting variables.

central

Logical, defaults to FALSE. Should the mean and median central estimates be plotted as dashed and solid lines respectively. Requires mean and median variables to be present in the input.

col

A character string denoting the variable to use to stratify the ribbon plot. Defaults to "type" which indicates the data stream.

Value

A ggplot2 plot.

See Also

Plotting functions add_forecast_dates(), plot.fv_forecast(), plot.fv_posterior(), plot_cases(), plot_default(), plot_pairs(), plot_posterior(), plot_rt(), plot_theme(), plot_voc_advantage(), plot_voc_frac(), save_plots()

Examples

posterior <- fv_example(strains = 2, type = "posterior")
plot_growth(posterior)

Pairs plot of parameters of interest and fitting diagnostics

Description

Pairs plot of parameters of interest and fitting diagnostics

Usage

plot_pairs(
  fit,
  pars = c("r_init", "r_scale", "beta", "voc_beta", "voc_scale[1]", "init_cases[1]",
    "init_cases[2]", "eta[1]", "voc_eta[1]", "sqrt_phi[1]", "sqrt_phi[2]", "sqrt_phi"),
  diagnostics = TRUE,
  ...
)

Arguments

fit

List of output as returned by fv_sample().

pars

Character vector of parameters to try and include in the plot. Will only be included if present in the fitted model.

diagnostics

Logical, defaults to TRUE. Should fitting diagnostics be returned as a data.frame.

...

Additional parameters passed to bayesplot::mcmc_pairs().

Value

A ggplot2 based pairs plot of parameters of interest

See Also

Plotting functions add_forecast_dates(), plot.fv_forecast(), plot.fv_posterior(), plot_cases(), plot_default(), plot_growth(), plot_posterior(), plot_rt(), plot_theme(), plot_voc_advantage(), plot_voc_frac(), save_plots()

Functions to explore and validate models bp_launch_shinystan(), fv_score_forecast()

Examples

obs <- filter_by_availability(
  germany_covid19_delta_obs,
  date = as.Date("2021-06-12")
)
dt <- fv_as_data_list(obs)
inits <- fv_inits(dt)
fit <- fv_sample(dt, init = inits, adapt_delta = 0.99, max_treedepth = 15)
plot_pairs(fit)

Plot posterior predictions

Description

Plot posterior predictions

Usage

plot_posterior(
  posterior,
  obs = NULL,
  forecast_dates = NULL,
  central = FALSE,
  all_obs = FALSE,
  voc_label = "variant of concern"
)

Arguments

posterior

A dataframe of posterior output as produced by fv_tidy_posterior(). For forecast dates to be extracted data with value_type == "cases" must be present.

obs

A data frame of observed data as produced by latest_obs().

forecast_dates

A data.frame in the format produced by extract_forecast_dates() (with at least a date variable and a Data unavailable variable). Specifies when date availability should be added to plots. May contain faceting variables.

central

Logical, defaults to FALSE. Should the mean and median central estimates be plotted as dashed and solid lines respectively. Requires mean and median variables to be present in the input.

all_obs

Logical, defaults to FALSE. Should all observations be plotted or just those in the date range of the estimates being plotted.

voc_label

Character string giving the name to assign to the variant of concern. Defaults to "variant of concern".

Value

A named list of all supported package plots with sensible defaults.

See Also

Plotting functions add_forecast_dates(), plot.fv_forecast(), plot.fv_posterior(), plot_cases(), plot_default(), plot_growth(), plot_pairs(), plot_rt(), plot_theme(), plot_voc_advantage(), plot_voc_frac(), save_plots()

Examples

posterior <- fv_example(strains = 2, type = "posterior")
plot_posterior(posterior)

Plot the posterior prediction for the reproduction number

Description

Plot the posterior prediction for the reproduction number

Usage

plot_rt(posterior, forecast_dates = NULL, central = FALSE, col = NULL)

Arguments

posterior

A dataframe of posterior output as produced by fv_tidy_posterior(). For forecast dates to be extracted data with value_type == "cases" must be present.

forecast_dates

A data.frame in the format produced by extract_forecast_dates() (with at least a date variable and a Data unavailable variable). Specifies when date availability should be added to plots. May contain faceting variables.

central

Logical, defaults to FALSE. Should the mean and median central estimates be plotted as dashed and solid lines respectively. Requires mean and median variables to be present in the input.

col

A character string denoting the variable to use to stratify the ribbon plot. Defaults to "type" which indicates the data stream.

Value

A ggplot2 plot.

See Also

Plotting functions add_forecast_dates(), plot.fv_forecast(), plot.fv_posterior(), plot_cases(), plot_default(), plot_growth(), plot_pairs(), plot_posterior(), plot_theme(), plot_voc_advantage(), plot_voc_frac(), save_plots()

Examples

posterior <- fv_example(strains = 2, type = "posterior")
plot_rt(posterior)

Add the default plot theme

Description

Add the default plot theme

Usage

plot_theme(plot)

Arguments

plot

ggplot2 object

Value

A ggplot2 plot with the package theme applied.

See Also

Plotting functions add_forecast_dates(), plot.fv_forecast(), plot.fv_posterior(), plot_cases(), plot_default(), plot_growth(), plot_pairs(), plot_posterior(), plot_rt(), plot_voc_advantage(), plot_voc_frac(), save_plots()


Plot the posterior prediction for the transmission advantage for the variant of concern

Description

Plot the posterior prediction for the transmission advantage for the variant of concern

Usage

plot_voc_advantage(
  posterior,
  forecast_dates = NULL,
  central = FALSE,
  voc_label = "variant of concern",
  ...
)

Arguments

posterior

A dataframe of posterior output as produced by fv_tidy_posterior(). For forecast dates to be extracted data with value_type == "cases" must be present.

forecast_dates

A data.frame in the format produced by extract_forecast_dates() (with at least a date variable and a Data unavailable variable). Specifies when date availability should be added to plots. May contain faceting variables.

central

Logical, defaults to FALSE. Should the mean and median central estimates be plotted as dashed and solid lines respectively. Requires mean and median variables to be present in the input.

voc_label

Character string giving the name to assign to the variant of concern. Defaults to "variant of concern".

...

Additional parameters passed to plot_default().

Value

A ggplot2 plot.

See Also

Plotting functions add_forecast_dates(), plot.fv_forecast(), plot.fv_posterior(), plot_cases(), plot_default(), plot_growth(), plot_pairs(), plot_posterior(), plot_rt(), plot_theme(), plot_voc_frac(), save_plots()

Examples

posterior <- fv_example(strains = 2, type = "posterior")
plot_voc_advantage(posterior)

Plot the population posterior prediction for the fraction of samples with the variant of concern

Description

Plot the population posterior prediction for the fraction of samples with the variant of concern

Usage

plot_voc_frac(
  posterior,
  obs = NULL,
  forecast_dates = NULL,
  all_obs = FALSE,
  central = FALSE,
  voc_label = "variant of concern",
  logit = TRUE,
  ...
)

Arguments

posterior

A dataframe of posterior output as produced by fv_tidy_posterior(). For forecast dates to be extracted data with value_type == "cases" must be present.

obs

A data frame of observed data as produced by latest_obs().

forecast_dates

A data.frame in the format produced by extract_forecast_dates() (with at least a date variable and a Data unavailable variable). Specifies when date availability should be added to plots. May contain faceting variables.

all_obs

Logical, defaults to FALSE. Should all observations be plotted or just those in the date range of the estimates being plotted.

central

Logical, defaults to FALSE. Should the mean and median central estimates be plotted as dashed and solid lines respectively. Requires mean and median variables to be present in the input.

voc_label

Character string giving the name to assign to the variant of concern. Defaults to "variant of concern".

logit

Logical, defaults to TRUE. Should variant proportions be plotted on the logit scale.

...

Additional parameters passed to plot_default().
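
For readers unfamiliar with the logit scale, the base R sketch below shows the transform applied to a few illustrative proportions. It uses only qlogis() and plogis() from the stats package and does not call any forecast.vocs function.

```r
# Illustrative variant fractions
frac <- c(0.01, 0.1, 0.5, 0.9, 0.99)

# Logit transform: log(p / (1 - p))
logit_frac <- qlogis(frac)
round(logit_frac, 2)

# plogis() inverts the transform
all.equal(plogis(logit_frac), frac)
```

A variant share that grows logistically appears as a straight line on this scale, which makes early growth at small fractions easier to see.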

Value

A ggplot2 plot.

See Also

Plotting functions add_forecast_dates(), plot.fv_forecast(), plot.fv_posterior(), plot_cases(), plot_default(), plot_growth(), plot_pairs(), plot_posterior(), plot_rt(), plot_theme(), plot_voc_advantage(), save_plots()

Examples

posterior <- fv_example(strains = 2, type = "posterior")
plot_voc_frac(posterior)

Plot method for forecast

Description

plot method for class "fv_forecast". The type of plot produced can be controlled using the target and type arguments with the latter only being functional when target is set to "posterior" or "forecast".

Usage

## S3 method for class 'fv_forecast'
plot(x, obs = NULL, target = "posterior", type = "cases", ...)

Arguments

x

A data.table of output as produced by forecast() of class "fv_forecast".

obs

A data frame of observed data as produced by latest_obs().

target

A character string indicating the target object within the forecast() to produce plots for. Current options are: posterior predictions ("posterior"), posterior forecasts ("forecast"), and the model fit ("fit"). When "posterior" or "forecast" are used then plot.fv_posterior() is called whereas when "fit" is used plot_pairs() is used.

type

A character string indicating the type of plot required, defaulting to "cases". Current options are: "cases" which calls plot_cases(), "voc_frac" which calls plot_voc_frac(), "voc_advantage" which calls plot_voc_advantage(), "growth" which calls plot_growth(), "rt" which calls plot_rt(), and "all" which produces a list of all plots by calling plot_posterior().

...

Pass additional arguments to lower level plot functions.

Value

ggplot2 object

See Also

plot.fv_posterior

Functions used for forecasting across models, dates, and scenarios forecast_across_dates(), forecast_across_scenarios(), forecast_n_strain(), forecast(), summary.fv_forecast(), unnest_posterior()

Plotting functions add_forecast_dates(), plot.fv_posterior(), plot_cases(), plot_default(), plot_growth(), plot_pairs(), plot_posterior(), plot_rt(), plot_theme(), plot_voc_advantage(), plot_voc_frac(), save_plots()

Examples

options(mc.cores = 4)

forecasts <- forecast(
  germany_covid19_delta_obs,
  forecast_date = as.Date("2021-06-12"),
  horizon = 4,
  strains = c(1, 2),
  adapt_delta = 0.99,
  max_treedepth = 15,
  variant_relationship = "scaled"
)
# inspect forecasts
forecasts

# plot case posterior predictions
plot(forecasts, log = TRUE)

# plot case posterior predictions with central estimates
plot(forecasts, log = TRUE, central = TRUE)

# plot voc posterior predictions
plot(forecasts, type = "voc_frac")

Plot method for fv_tidy_posterior

Description

plot method for class "fv_posterior". This function wraps all lower level plot functions.

Usage

## S3 method for class 'fv_posterior'
plot(
  x,
  obs = NULL,
  type = "cases",
  forecast_dates = NULL,
  central = FALSE,
  all_obs = FALSE,
  voc_label = "variant of concern",
  ...
)

Arguments

x

A data.table of output as produced by fv_tidy_posterior().

obs

A data frame of observed data as produced by latest_obs().

type

A character string indicating the type of plot required, defaulting to "cases". Current options are: "cases" which calls plot_cases(), "voc_frac" which calls plot_voc_frac(), "voc_advantage" which calls plot_voc_advantage(), "growth" which calls plot_growth(), "rt" which calls plot_rt(), and "all" which produces a list of all plots by calling plot_posterior().

forecast_dates

A data.frame in the format produced by extract_forecast_dates() (with at least a date variable and a Data unavailable variable). Specifies when date availability should be added to plots. May contain faceting variables.

central

Logical, defaults to FALSE. Should the mean and median central estimates be plotted as dashed and solid lines respectively. Requires mean and median variables to be present in the input.

all_obs

Logical, defaults to FALSE. Should all observations be plotted or just those in the date range of the estimates being plotted.

voc_label

Character string giving the name to assign to the variant of concern. Defaults to "variant of concern".

...

Pass additional arguments to lower level plot functions.

Value

ggplot2 object

See Also

Functions used for postprocessing of model fits convert_to_stanfit(), extract_draws(), extract_forecast_dates(), fv_extract_forecast(), fv_posterior(), fv_tidy_posterior(), link_dates_with_posterior(), link_obs_with_posterior(), print.fv_posterior(), quantiles_to_long(), summary.fv_posterior(), update_voc_label()

Plotting functions add_forecast_dates(), plot.fv_forecast(), plot_cases(), plot_default(), plot_growth(), plot_pairs(), plot_posterior(), plot_rt(), plot_theme(), plot_voc_advantage(), plot_voc_frac(), save_plots()

Examples

posterior <- fv_example(strains = 2, type = "posterior")

# plot cases on the log scale
plot(posterior, type = "cases", log = TRUE)

# plot cases with central estimates
plot(posterior, type = "cases", log = FALSE, central = TRUE)

# plot fraction that have the variant of concern
plot(posterior, type = "voc_frac")

# plot the transmission advantage for the variant of concern
plot(posterior, type = "voc_advantage")

# plot the growth rates for both voc and non-voc cases
plot(posterior, type = "growth")

# plot the reproduction number estimates
plot(posterior, type = "rt")

Print method for fv_tidy_posterior

Description

print method for class "fv_posterior". Prints the available value types and then falls back to the data.table print method.

Usage

## S3 method for class 'fv_posterior'
print(x, ...)

Arguments

x

An output from fv_tidy_posterior().

...

Pass additional arguments to data.table printing method.

Value

A summary data.frame
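
The "print something extra, then fall back to the parent print method" behaviour can be sketched with base R S3 dispatch as below. The class toy_posterior and its data are hypothetical, and the fallback here is print.data.frame rather than the data.table method used by the package.

```r
toy <- data.frame(
  value_type = c("cases", "cases", "rt"),
  median = c(100, 120, 1.1)
)
class(toy) <- c("toy_posterior", class(toy))

print.toy_posterior <- function(x, ...) {
  cat("Available value types:", paste(unique(x$value_type), collapse = ", "), "\n")
  NextMethod() # fall back to the parent class print method
  invisible(x)
}

print(toy)
```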

See Also

fv_tidy_posterior

Functions used for postprocessing of model fits convert_to_stanfit(), extract_draws(), extract_forecast_dates(), fv_extract_forecast(), fv_posterior(), fv_tidy_posterior(), link_dates_with_posterior(), link_obs_with_posterior(), plot.fv_posterior(), quantiles_to_long(), summary.fv_posterior(), update_voc_label()

Examples

posterior <- fv_example(strains = 2, type = "posterior")

# case summary
posterior

Convert summarised quantiles from wide to long format

Description

Convert summarised quantiles from wide to long format

Usage

quantiles_to_long(posterior)

Arguments

posterior

A dataframe as output by fv_tidy_posterior(), fv_extract_forecast(), etc.

Value

A data frame of quantiles in long format.
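
The wide to long conversion can be sketched in base R as below. The column names (q5, q20, q80, q95, quantile, prediction) and the values are assumptions chosen for illustration and may not match the names used by the package.

```r
# Made-up wide summary with one column per quantile
wide <- data.frame(
  date = as.Date(c("2021-06-12", "2021-06-19")),
  q5 = c(80, 85), q20 = c(90, 97), q80 = c(110, 125), q95 = c(120, 140)
)

q_cols <- c("q5", "q20", "q80", "q95")

# Stack one block of rows per quantile column
long <- do.call(rbind, lapply(q_cols, function(col) {
  data.frame(
    date = wide$date,
    quantile = as.numeric(sub("^q", "", col)) / 100,
    prediction = wide[[col]]
  )
}))
long
```

The long format has one row per date and quantile, which is the shape typically expected by quantile-based scoring tools.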

See Also

Functions used for postprocessing of model fits convert_to_stanfit(), extract_draws(), extract_forecast_dates(), fv_extract_forecast(), fv_posterior(), fv_tidy_posterior(), link_dates_with_posterior(), link_obs_with_posterior(), plot.fv_posterior(), print.fv_posterior(), summary.fv_posterior(), update_voc_label()

Examples

posterior <- fv_example(strains = 2, type = "posterior")
long_posterior <- quantiles_to_long(posterior)
long_posterior

Sample Sequence Observation Model

Description

Sample Sequence Observation Model

Usage

sample_sequences(frac_voc, seq_total, phi)

Arguments

frac_voc

A numeric vector of expected proportions positive for the variant of concern.

seq_total

An integer vector of total sequences available.

phi

The overdispersion of the sampling process. If not supplied then no overdispersion is used (i.e a binomial observation model vs a beta binomial observation model).
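
The difference between the binomial and beta binomial observation models can be sketched in base R as below. How phi enters the beta distribution is an assumption made here (it is treated as a precision, giving shapes p * phi and (1 - p) * phi), so the package's parameterisation may differ. The inputs are made up.

```r
set.seed(42)
frac_voc <- c(0.1, 0.3, 0.6)
seq_total <- c(50L, 200L, 400L)

# Binomial observation model: no overdispersion
binom_obs <- rbinom(length(frac_voc), size = seq_total, prob = frac_voc)

# Beta binomial: first draw a noisy proportion from a beta distribution.
# Assumption: phi acts as a precision, so shapes are p * phi and (1 - p) * phi.
phi <- 20
noisy_frac <- rbeta(length(frac_voc), frac_voc * phi, (1 - frac_voc) * phi)
betabinom_obs <- rbinom(length(frac_voc), size = seq_total, prob = noisy_frac)

rbind(binom_obs, betabinom_obs)
```

Under this parameterisation, smaller values of phi give more variable proportions and therefore more overdispersed counts.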

Value

A vector of observed sequences positive for the variant of concern.

See Also

Functions to generate simulated data generate_obs()

Examples

# dummy sequence data
frac_voc <- seq(0, 1, by = 0.1)
seq_total <- seq(10, length.out = length(frac_voc), by = 100)

# binomial observation model
sample_sequences(frac_voc, seq_total)

# beta binomial observation model
sample_sequences(frac_voc, seq_total, 0.5)
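
The difference between the two observation models can be sketched in base R. This is a minimal illustration, not the package implementation: sketch_sample_sequences() is a hypothetical helper, and the way phi enters the beta distribution (as a precision, giving shape parameters frac_voc * phi and (1 - frac_voc) * phi) is an assumption that may differ from the parameterisation used by sample_sequences().

```r
# Hypothetical helper illustrating binomial vs beta-binomial sampling
sketch_sample_sequences <- function(frac_voc, seq_total, phi = NULL) {
  stopifnot(length(frac_voc) == length(seq_total))
  n <- length(frac_voc)
  if (is.null(phi)) {
    # Binomial model: no overdispersion
    return(rbinom(n, size = seq_total, prob = frac_voc))
  }
  # Beta-binomial model: first draw a noisy probability centred on
  # frac_voc (assumed parameterisation), then draw binomial counts
  prob <- rbeta(n, frac_voc * phi, (1 - frac_voc) * phi)
  rbinom(n, size = seq_total, prob = prob)
}

set.seed(42)
frac <- c(0.1, 0.5, 0.9)
total <- c(10L, 100L, 1000L)
binom_draws <- sketch_sample_sequences(frac, total)
betabinom_draws <- sketch_sample_sequences(frac, total, phi = 0.5)
```

Under this parameterisation, smaller values of phi spread the sampled probabilities further from frac_voc and so produce more variable counts.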

Save plots by name

Description

Save plots by name

Usage

save_plots(plots, save_path = NULL, type = "png", ...)

Arguments

plots

A named list of ggplot2 plots.

save_path

A character string indicating where to save plots if required.

type

A character string indicating the format to use to save plots.

...

Additional arguments passed to ggplot2::ggsave()

See Also

Plotting functions add_forecast_dates(), plot.fv_forecast(), plot.fv_posterior(), plot_cases(), plot_default(), plot_growth(), plot_pairs(), plot_posterior(), plot_rt(), plot_theme(), plot_voc_advantage(), plot_voc_frac()

Examples

posterior <- fv_example(strains = 2, type = "posterior")
p <- plot(posterior, type = "all")
save_plots(p, save_path = tempdir())

Summary method for forecast

Description

summary method for class "fv_forecast".

Usage

## S3 method for class 'fv_forecast'
summary(
  object,
  target = "posterior",
  type = "model",
  as_dt = FALSE,
  forecast = FALSE,
  ...
)

Arguments

object

A data.table output from forecast() of class "fv_forecast".

target

A character string indicating the target object within the forecast() to summarise. Current options are: posterior predictions ("posterior"), posterior forecasts ("forecast"), the model fit ("fit"), and the model diagnostics ("diagnostics"). When "posterior" or "forecast" are used then summary.fv_posterior() is called on the nested posterior or forecast.

type

A character string used to filter the summarised output and defaulting to "model". Current options are: "model" which returns a summary of key model parameters, "cases" which returns summarised cases, "voc_frac" which returns summarised estimates of the fraction of cases that have the variant of concern, "voc_advantage" which returns summarised estimates of the transmission advantage of the variant of concern, "growth" which returns summarised variant specific and overall growth rates, "rt" which returns summarised variant specific and overall reproduction number estimates, "raw" which returns a raw posterior summary, and "all" which returns all tidied posterior estimates.

as_dt

Logical, defaults to FALSE. Once any filtering has been applied, should summary() fall back to using the default data.table method?

forecast

Logical, defaults to FALSE. Should fv_extract_forecast() be used to return only forecasts rather than the complete posterior?

...

Additional summary arguments.

Value

A summary data.table.

See Also

summary.fv_posterior forecast unnest_posterior

Functions used for forecasting across models, dates, and scenarios forecast_across_dates(), forecast_across_scenarios(), forecast_n_strain(), forecast(), plot.fv_forecast(), unnest_posterior()

Examples

options(mc.cores = 4)

forecasts <- forecast(
  germany_covid19_delta_obs,
  forecast_date = as.Date("2021-06-12"),
  horizon = 4,
  strains = c(1, 2),
  adapt_delta = 0.99,
  max_treedepth = 15,
  variant_relationship = "scaled"
)
# inspect forecasts
forecasts

# extract the model summary
summary(forecasts, type = "model")

# extract the fit object
summary(forecasts, target = "fit")

# extract the case forecast
summary(forecasts, type = "cases", forecast = TRUE)

Summary method for fv_posterior

Description

summary method for class "fv_posterior" (as returned by fv_tidy_posterior()). Can be used to filter the posterior for variables of interest, to return forecasts only, and to summarise using the data.table method.

Usage

## S3 method for class 'fv_posterior'
summary(object, type = "model", forecast = FALSE, as_dt = FALSE, ...)

Arguments

object

An object of the class fv_posterior as returned by fv_tidy_posterior().

type

A character string used to filter the summarised output and defaulting to "model". Current options are: "model" which returns a summary of key model parameters, "cases" which returns summarised cases, "voc_frac" which returns summarised estimates of the fraction of cases that have the variant of concern, "voc_advantage" which returns summarised estimates of the transmission advantage of the variant of concern, "growth" which returns summarised variant specific and overall growth rates, "rt" which returns summarised variant specific and overall reproduction number estimates, "raw" which returns a raw posterior summary, and "all" which returns all tidied posterior estimates.

forecast

Logical, defaults to FALSE. Should fv_extract_forecast() be used to return only forecasts rather than the complete posterior?

as_dt

Logical, defaults to FALSE. Once any filtering has been applied, should summary() fall back to using the default data.table method?

...

Additional summary arguments.

Value

A summary data.table, unless type "all" is used, in which case the output is still of class "fv_posterior".

See Also

fv_tidy_posterior

Functions used for postprocessing of model fits convert_to_stanfit(), extract_draws(), extract_forecast_dates(), fv_extract_forecast(), fv_posterior(), fv_tidy_posterior(), link_dates_with_posterior(), link_obs_with_posterior(), plot.fv_posterior(), print.fv_posterior(), quantiles_to_long(), update_voc_label()

Examples

posterior <- fv_example(strains = 2, type = "posterior")

# case summary
summary(posterior, type = "cases")

# summary of the case summary
summary(posterior, type = "cases", as_dt = TRUE)

# case forecast only
summary(posterior, type = "cases", forecast = TRUE)

# voc fraction summary
summary(posterior, type = "voc_frac")

# voc advantage summary
summary(posterior, type = "voc_advantage")

# growth summary
summary(posterior, type = "growth")

# Rt summary
summary(posterior, type = "rt")

# model parameter summary
summary(posterior, type = "model")

# raw posterior values
summary(posterior, type = "raw")

Unnest posterior estimates from a forecast data.frame

Description

Unnest posterior predictions and forecasts from output produced by forecast() (or multiple combined calls) dropping diagnostic and fitting variables in the process.

Usage

unnest_posterior(forecasts, target = "posterior")

Arguments

forecasts

A data frame of forecasts as produced by forecast().

target

A character string indicating the list of outputs to unnest.

Value

An unnested data.frame of posterior estimates and other variables produced by forecast().

See Also

Functions used for forecasting across models, dates, and scenarios forecast_across_dates(), forecast_across_scenarios(), forecast_n_strain(), forecast(), plot.fv_forecast(), summary.fv_forecast()

Examples

library(data.table)
options(mc.cores = 4)
dt <- forecast(
  germany_covid19_delta_obs,
  forecast_date = as.Date("2021-06-12"),
  max_treedepth = 15, adapt_delta = 0.99
)

# unnest posterior predictions
posterior <- unnest_posterior(dt)
posterior

# unnest forecasts
forecasts <- unnest_posterior(dt, target = "forecast")
forecasts

Update observations based on availability

Description

Update observations based on availability

Usage

update_obs_availability(obs, cases_lag, seq_lag)

Arguments

obs

A data.frame with the following variables: date, cases, seq_voc, seq_total, cases_available, and seq_available. seq_available and cases_available must uniquely define data rows, but other rows can be duplicated based on data availability. This data format allows for multiple versions of case and sequence data for a given date with different reporting dates. This is important when using the package in evaluation settings or in real-time where data sources are liable to be updated as new data becomes available. See germany_covid19_delta_obs for an example of a supported data set.

cases_lag

Number of weeks that case data takes to be reported. Defaults to not altering the input data.

seq_lag

Number of weeks that sequence data takes to be reported. Defaults to not altering the input data.

Value

A data.frame of observations with updated case and sequence availability dates.

See Also

Functions to define and create data scenarios define_scenarios(), generate_obs_scenario()

Examples

update_obs_availability(
  germany_covid19_delta_obs,
  cases_lag = 2, seq_lag = 3
)
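
To make the effect of the lag arguments concrete, the sketch below applies a reporting lag in weeks to a toy data set using base R. It assumes that availability dates are recomputed as the observation date plus the lag in weeks; this is a guess at the behaviour rather than the package implementation, and sketch_update_availability() is a hypothetical helper.

```r
# Toy observations in the documented format (all values are made up)
obs <- data.frame(
  date = as.Date("2021-05-01") + 7 * 0:2,
  cases = c(100, 120, 150),
  seq_voc = c(1, 5, 20),
  seq_total = c(50, 60, 70)
)
obs$cases_available <- obs$date
obs$seq_available <- obs$date

# Hypothetical helper: shift availability by a number of weeks,
# leaving a column untouched when its lag is not supplied
sketch_update_availability <- function(obs, cases_lag = NULL, seq_lag = NULL) {
  if (!is.null(cases_lag)) {
    obs$cases_available <- obs$date + 7 * cases_lag
  }
  if (!is.null(seq_lag)) {
    obs$seq_available <- obs$date + 7 * seq_lag
  }
  obs
}

lagged <- sketch_update_availability(obs, cases_lag = 2, seq_lag = 3)
lagged
```

In this toy example, case data become available 14 days after the observation date and sequence data 21 days after it.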

Label the Variant of Concern

Description

Assign a custom label to the variant of concern in the output from fv_tidy_posterior().

Usage

update_voc_label(posterior, label, target_label = "VOC")

Arguments

posterior

A dataframe of posterior output as produced by fv_tidy_posterior(). For forecast dates to be extracted, data with value_type == "cases" must be present.

label

Character string indicating the new label to use for the variant of concern.

target_label

A character string defaulting to "VOC". Indicates the current label for the variant of concern.

Value

A list of data frames as returned by fv_tidy_posterior() but with updated labels.

See Also

Functions used for postprocessing of model fits convert_to_stanfit(), extract_draws(), extract_forecast_dates(), fv_extract_forecast(), fv_posterior(), fv_tidy_posterior(), link_dates_with_posterior(), link_obs_with_posterior(), plot.fv_posterior(), print.fv_posterior(), quantiles_to_long(), summary.fv_posterior()

Examples

p <- fv_example(strains = 2, type = "posterior")
p <- update_voc_label(p, "Delta")
summary(p, type = "cases")
summary(p, type = "model")
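
The relabelling step can be pictured as a text substitution over the character columns of the posterior. The base R sketch below works on a toy data frame; the column names (value_type, type, median), the "non-VOC" label, and the choice to substitute within every character column are assumptions for illustration, and sketch_update_label() is a hypothetical helper rather than the package code.

```r
# Toy posterior summary (column names and values are assumed)
posterior <- data.frame(
  value_type = c("cases", "cases", "voc_frac"),
  type = c("VOC", "non-VOC", "VOC"),
  median = c(120, 300, 0.29),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace target_label with label in all
# character columns using fixed (non-regex), case-sensitive matching
sketch_update_label <- function(posterior, label, target_label = "VOC") {
  is_chr <- vapply(posterior, is.character, logical(1))
  posterior[is_chr] <- lapply(
    posterior[is_chr], gsub,
    pattern = target_label, replacement = label, fixed = TRUE
  )
  posterior
}

relabelled <- sketch_update_label(posterior, "Delta")
relabelled
```

Because matching is case sensitive, lower-case identifiers such as "voc_frac" are left untouched while "VOC" and "non-VOC" become "Delta" and "non-Delta".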