Title: | Estimate Realtime Case Counts and Time-varying Epidemiological Parameters |
---|---|
Description: | To identify changes in the reproduction number, rate of spread, and doubling time during the course of outbreaks whilst accounting for potential biases due to delays in case reporting. |
Authors: | Sam Abbott [aut, cre], Joel Hellewell [aut], James Munday [aut], Robin Thompson [aut], Sebastian Funk [aut] |
Maintainer: | Sam Abbott <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.2.0 |
Built: | 2024-10-31 21:25:23 UTC |
Source: | https://github.com/epiforecasts/EpiNow |
Pulls the last n dates from a vector
add_dates(dates, n)
dates |
Character vector of dates to pull from. |
n |
Number of dates required |
Character vector of dates of length n
dates <- rep(1:10)
add_dates(dates, 3)
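The behaviour described for add_dates (pulling the last n entries of a vector) can be sketched in base R. This is a minimal illustration only: `last_n_dates` is a hypothetical helper, not the package source, and it assumes the dates are already in chronological order.

```r
# Hypothetical helper (not the EpiNow implementation):
# return the last n elements of a vector of dates.
last_n_dates <- function(dates, n) {
  utils::tail(dates, n)
}

# Ten consecutive days, stored as character strings
dates <- as.character(seq(as.Date("2020-03-01"), by = "day", length.out = 10))
last_three <- last_n_dates(dates, 3)
last_three
```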
Adjust Case Counts for Truncation
adjust_for_truncation( cases, cum_freq, dates, confidence_adjustment = NULL, samples )
cases |
Numeric vector of cases |
cum_freq |
Numeric vector of cumulative frequencies |
dates |
Character vector of dates |
confidence_adjustment |
Numeric vector of frequencies used to adjust confidence |
samples |
Numeric, number of samples to take |
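To make the idea of a truncation adjustment concrete, the sketch below upscales recent counts by the proportion of cases expected to have been reported so far. It is an assumption-laden illustration rather than the method used by adjust_for_truncation: it assumes that `cum_freq[k]` is the cumulative proportion reported within k - 1 days, that `cases` is in date order, and it gives a point estimate only (no sampling and no confidence adjustment).

```r
# Illustrative only: a point-estimate truncation adjustment.
# Assumption: cum_freq[1] applies to the most recent day, cum_freq[2] to the
# day before, and so on, so it is reversed to line up with date-ordered cases.
cases <- c(20, 18, 15, 9, 4)               # observed counts, oldest first
cum_freq <- c(0.4, 0.6, 0.75, 0.9, 1.0)    # cumulative proportion reported by delay

completeness <- rev(cum_freq)              # oldest day is most complete
adjusted <- cases / completeness           # upscale the incomplete recent days
adjusted
```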
This function removes nowcasts in the format produced by EpiNow from a target directory for the date supplied.
clean_nowcasts(date = NULL, nowcast_dir = NULL)
date |
Date object. Defaults to today's date. |
nowcast_dir |
Character string giving the filepath to the nowcast results directory. |
This general purpose function can be used to generate a country map for a single variable. It has few defaults but the data supplied must contain a region_code variable for linking to mapping data. This function requires the installation of the rnaturalearth package.
country_map( data = NULL, country = NULL, variable = NULL, variable_label = NULL, trans = "identity", fill_labels = NULL, scale_fill = NULL, show_caption = TRUE, ... )
data |
Dataframe containing variables to be mapped. Must contain a |
variable |
A character string indicating the variable to map data for. This must be supplied. |
trans |
A character string specifying the transform to use on the specified metric. Defaults to no transform ("identity"). Other options include log scaling ("log") and log base 10 scaling ("log10"). For a complete list of options see |
fill_labels |
A function to use to allocate legend labels. An example (used below) is |
scale_fill |
Function to use for scaling the fill. Defaults to a custom |
A ggplot2 object containing a country map.
Fit an integer adjusted exponential or gamma distribution
dist_fit(values = NULL, samples = NULL, dist = "exp")
values |
Numeric vector of values |
samples |
Numeric, number of samples to take |
dist |
Character string, which distribution to fit. Defaults to exponential ( |
This function acts as a skeleton for a truncated distribution defined by model type, maximum value and model parameters. It is designed to be used with the output from get_dist.
dist_skel(n, dist = FALSE, cum = TRUE, model, params, max_value = 120)
n |
Numeric vector, number of samples to take (or days for the probability density). |
dist |
Logical, defaults to |
cum |
Logical, defaults to |
model |
Character string, defining the model to be used. Supported options are exponential ("exp"), gamma ("gamma"), and log normal ("lognorm") |
params |
A list of parameters values (by name) required for each model. For the exponential model this is a rate parameter and for the gamma model this is alpha and beta. |
max_value |
Numeric, the maximum value to allow. Defaults to 120. Samples outside of this range are resampled. |
A vector of samples or a probability distribution.
## Exponential model
## Sample
dist_skel(10, model = "exp", params = list(rate = 1))
## Cumulative prob density
dist_skel(1:10, model = "exp", dist = TRUE, params = list(rate = 1))
## Probability density
dist_skel(1:10, model = "exp", dist = TRUE, cum = FALSE, params = list(rate = 1))
## Gamma model
dist_skel(10, model = "gamma", params = list(alpha = 1, beta = 2))
## Cumulative prob density
dist_skel(0:10, model = "gamma", dist = TRUE, params = list(alpha = 1, beta = 2))
## Probability density
dist_skel(0:10, model = "gamma", dist = TRUE, cum = FALSE, params = list(alpha = 2, beta = 2))
## Log normal model
dist_skel(10, model = "lognorm", params = list(mean = log(5), sd = log(2)))
## Cumulative prob density
dist_skel(0:10, model = "lognorm", dist = TRUE, params = list(mean = log(5), sd = log(2)))
## Probability density
dist_skel(0:10, model = "lognorm", dist = TRUE, cum = FALSE, params = list(mean = log(5), sd = log(2)))
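The max_value argument is documented as resampling any draw that falls outside the allowed range. A minimal base R sketch of that idea for the exponential case is shown here. `sample_truncated_exp` is a hypothetical helper, and flooring the draws to integers is an assumption about what "integer adjusted" means; it is not taken from the dist_skel source.

```r
# Hypothetical helper (not the dist_skel implementation):
# draw n integer-valued exponential samples and redraw any above max_value.
sample_truncated_exp <- function(n, rate, max_value = 120) {
  samples <- floor(stats::rexp(n, rate))   # assumption: discretise by flooring
  too_big <- samples > max_value
  while (any(too_big)) {
    # Redraw only the samples that exceeded the maximum
    samples[too_big] <- floor(stats::rexp(sum(too_big), rate))
    too_big <- samples > max_value
  }
  samples
}

set.seed(1)
x <- sample_truncated_exp(1000, rate = 0.1, max_value = 20)
range(x)
```

Resampling keeps the shape of the distribution below the cut-off, whereas clipping values to max_value would pile probability mass onto the maximum.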
Estimate time-varying measures and forecast
epi_measures_pipeline( nowcast = NULL, generation_times = NULL, min_est_date = NULL, gt_samples = 1, rt_samples = 5, rt_windows = 7, rate_window = 7, rt_prior = NULL, forecast_model = NULL, horizon = NULL, verbose = TRUE )
nowcast |
A nowcast as produced by |
generation_times |
A matrix with columns representing samples and rows representing the probability of the generation time being on that day. |
min_est_date |
Date to begin estimation. |
gt_samples |
Numeric, the number of samples to take from the generation times supplied. |
rt_samples |
Numeric, the number of samples to take from the estimated R distribution for each time point. |
rt_windows |
Numeric vector, windows over which to estimate time-varying R. The best performing window will be selected per serial interval sample by default (based on which window best forecasts current cases). |
rate_window |
Numeric, the window to use to estimate the rate of spread. |
rt_prior |
A list defining the reproduction number prior containing the mean ( |
forecast_model |
An uninitialised bsts model passed to |
horizon |
Numeric, defaults to 0. The horizon over which to forecast Rts and cases. |
verbose |
Logical, defaults to |
Estimate the doubling time
estimate_doubling_time(r)
r |
An estimate of the rate of change (r) |
A vector of numeric values
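For exponential growth at rate r per day, the standard relationship is that the doubling time equals log(2) / r. The sketch below applies that formula directly. `doubling_time` is a hypothetical stand-in used for illustration, and it is assumed (not confirmed by this page) that estimate_doubling_time follows the same relationship.

```r
# Hypothetical stand-in: doubling time under exponential growth at rate r.
doubling_time <- function(r) {
  log(2) / r
}

# A growth rate of 0.1 per day doubles cases roughly every 6.9 days;
# negative rates give negative values, which can be read as halving times.
dt <- doubling_time(c(0.1, 0.2, -0.1))
dt
```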
Estimate r
estimate_little_r(sample, min_time = NULL, max_time = NULL)
sample |
A datatable containing a numeric cases variable. |
min_time |
Numeric, minimum time to use to fit the model. |
max_time |
Numeric, maximum time to use to fit the model. |
A datatable containing an estimate of r, its standard deviation and a measure of the goodness of fit.
cases <- data.table::setDT(EpiSoon::example_obs_cases)[, cases := as.integer(cases)]
estimate_little_r(cases)
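One common way to estimate a growth rate from a case series is a log-linear count regression, where the coefficient on time is r. The sketch below uses a quasi-Poisson GLM on simulated data. The choice of model family and the variable names are assumptions made for illustration and may differ from what estimate_little_r does internally.

```r
# Illustrative only: estimate r as the slope of a log-linear count model.
time <- 0:13
cases <- round(10 * exp(0.2 * time))   # simulated counts growing at r = 0.2

fit <- stats::glm(cases ~ time, family = stats::quasipoisson())

# The coefficient on time is the estimated daily growth rate
r_hat <- unname(stats::coef(fit)["time"])
r_sd <- unname(summary(fit)$coefficients["time", "Std. Error"])
c(r = r_hat, sd = r_sd)
```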
Estimate r in a set time window
estimate_r_in_window( onsets = NULL, min_time = NULL, max_time = NULL, bootstrap_samples = 1000 )
onsets |
A list of sample datasets nested within the dataset sampled from. |
min_time |
Numeric, the minimum time to fit the model to. |
max_time |
Numeric, the maximum time to fit the model to. |
bootstrap_samples |
Numeric, defaults to 1000. The number of samples to take when bootstrapping little r to account for model uncertainty. |
A list of 3 dataframes containing estimates for little r, doubling time and model goodness of fit.
Estimate the time varying R0 - using EpiEstim
estimate_R0( cases = NULL, generation_times = NULL, rt_prior = NULL, windows = NULL, gt_samples = 100, rt_samples = 100, min_est_date = NULL, forecast_model = NULL, horizon = 0 )
cases |
A dataframe containing a list of local cases with the following variables: |
generation_times |
A matrix with columns representing samples and rows representing the probability of the generation time being on that day. |
rt_prior |
A list defining the reproduction number prior containing the mean ( |
windows |
Numeric vector, windows over which to estimate time-varying R. The best performing window will be selected per serial interval sample by default (based on which window best forecasts current cases). |
gt_samples |
Numeric, the number of samples to take from the generation times supplied. |
rt_samples |
Numeric, the number of samples to take from the estimated R distribution for each time point. |
min_est_date |
Date to begin estimation. |
forecast_model |
An uninitialised bsts model passed to |
horizon |
Numeric, defaults to 0. The horizon over which to forecast Rts and cases. |
A tibble containing the date and summarised R estimate.
## Nowcast Rts
estimates <- estimate_R0(cases = EpiSoon::example_obs_cases,
                         generation_times = as.matrix(EpiNow::covid_generation_times[,2]),
                         rt_prior = list(mean_prior = 2.6, std_prior = 2),
                         windows = c(1, 3, 7), rt_samples = 10, gt_samples = 1,
                         min_est_date = as.Date("2020-02-18"))
estimates$rts
## Nowcast Rts, forecast Rts and the forecast cases
estimates <- estimate_R0(cases = EpiSoon::example_obs_cases,
                         generation_times = as.matrix(EpiNow::covid_generation_times[,1]),
                         rt_prior = list(mean_prior = 2.6, std_prior = 2),
                         windows = c(1, 3, 7), rt_samples = 10, gt_samples = 20,
                         min_est_date = as.Date("2020-02-18"),
                         forecast_model = function(...){EpiSoon::fable_model(model = fable::ETS(y ~ trend("A")), ...)},
                         horizon = 14)
## Rt estimates and forecasts
estimates$rts
## Case forecasts
estimates$cases
Estimate time varying r
estimate_time_varying_r(onsets, window = 7)
onsets |
A list of sample datasets nested within the dataset sampled from. |
window |
integer value for window size in days (default = 7) |
A dataframe of r estimates over time summarised across samples.
Generates a distribution definition when only parameter estimates are available for gamma distributed parameters. See rgamma for distribution information.
gamma_dist_def(shape, shape_sd, scale, scale_sd, max_value, samples)
shape |
Numeric, shape parameter of the gamma distribution. |
shape_sd |
Numeric, standard deviation of the shape parameter. |
scale |
Numeric, scale parameter of the gamma distribution. |
scale_sd |
Numeric, standard deviation of the scale parameter. |
max_value |
Numeric, the maximum value to allow. Defaults to 120. Samples outside of this range are resampled. |
samples |
Numeric, number of sample distributions to generate. |
A data.table defining the distribution as used by dist_skel
def <- gamma_dist_def(shape = 5.807, shape_sd = 0.2,
                      scale = 0.9, scale_sd = 0.05,
                      max_value = 20, samples = 10)
print(def)
def$params[[1]]
Generate a sample linelist from the observed linelist and sampled linelists
generate_pseudo_linelist( count_linelist = NULL, observed_linelist = NULL, merge_actual_onsets = TRUE )
count_linelist |
Dataframe with two variables: date_report and daily_linelist. As generated by |
observed_linelist |
Dataframe with two variables: date_report and daily_observed_linelist. As generated by split_linelist_by_day. |
merge_actual_onsets |
Logical, defaults to |
earliest_allowed_onset |
A character string in the form of a date ("2020-01-01") indicating the earliest allowed onset. |
Dataframe with two variables: date_report and date_onset
Get Parameters that Define a Discrete Distribution
get_dist_def( values, verbose = FALSE, samples = 1, bootstraps = 1, bootstrap_samples = 250 )
values |
Numeric vector of integer values. |
verbose |
Logical, defaults to |
bootstraps |
Numeric, defaults to 1. The number of bootstrap samples (with replacement) of the delay distribution to take. |
bootstrap_samples |
Numeric, defaults to 250. The number of samples to take in each bootstrap. When the sample size of the supplied delay distribution is less than 100 this is used instead. |
A data.table of distributions and the parameters that define them.
Sebastian Funk [email protected]
## Example with exponential and a small sample
delays <- rexp(20, 1)
get_dist_def(delays, samples = 10, verbose = TRUE)
## Example with gamma and a larger sample
delays <- rgamma(100, 4, 1)
out <- get_dist_def(delays, samples = 2, bootstraps = 2)
## Inspect
out
## Inspect one parameter
out$params[[1]]
## Load into skeleton and sample with truncation
EpiNow::dist_skel(10, model = out$model[[1]], params = out$params[[1]], max_value = out$max_value[[1]])
Combine total and imported case counts
get_local_import_case_counts(total_cases, linelist = NULL, cases_from = NULL)
total_cases |
Dataframe with following variables: |
linelist |
Dataframe with at least the following variables: |
cases_from |
A character string containing a date in the format |
A tibble containing cases by date locally and imported
Get Folders with Nowcast Results
get_regions(results_dir)
results_dir |
A character string giving the directory in which results
are stored (as produced by |
A named character vector containing the results to plot.
## Code
get_regions
Get Timeseries from EpiNow
get_timeseries(results_dir = NULL, date = NULL, summarised = FALSE)
results_dir |
A character string indicating the folder containing the |
date |
A character string (in the format "yyyy-mm-dd") indicating the date to extract data for. Defaults to "latest" which finds the latest results available. |
summarised |
Logical, defaults to |
## Not run:
## Assuming epiforecasts/covid is one repo higher
## Summary results
get_timeseries("../covid/_posts/global/nowcast/results/", summarised = TRUE)
## Simulations
get_timeseries("../covid/_posts/global/nowcast/results/")
## End(Not run)
## Code
get_timeseries
This general purpose function can be used to generate a global map for a single variable. It has few defaults but the data supplied must contain a country variable for linking to mapping data. This function requires the installation of the rnaturalearth package.
global_map( data = NULL, variable = NULL, variable_label = NULL, trans = "identity", fill_labels = NULL, scale_fill = NULL, show_caption = TRUE, ... )
data |
Dataframe containing variables to be mapped. Must contain a |
variable |
A character string indicating the variable to map data for. This must be supplied. |
trans |
A character string specifying the transform to use on the specified metric. Defaults to no transform ("identity"). Other options include log scaling ("log") and log base 10 scaling ("log10"). For a complete list of options see |
fill_labels |
A function to use to allocate legend labels. An example (used below) is |
scale_fill |
Function to use for scaling the fill. Defaults to a custom |
A ggplot2 object containing a global map.
df <- data.table::data.table(variable = "Increasing", country = "France")
global_map(df, variable = "variable")
Sample a linelist from case counts and a reporting delay distribution
linelist_from_case_counts(cases = NULL)
cases |
Dataframe with two variables: confirm (numeric) and date_report (date). |
A linelist grouped by day as a tibble with two variables: date_report, and daily_observed_linelist
Load nowcast results
load_nowcast_result( file = NULL, region = NULL, date = target_date, result_dir = results_dir )
file |
Character string giving the result files name. |
region |
Character string giving the region of interest. |
date |
Target date (in the format |
result_dir |
Character string giving the location of the target directory |
Generates a distribution definition when only parameter estimates are available for log normal distributed parameters. See rlnorm for distribution information.
lognorm_dist_def(mean, mean_sd, sd, sd_sd, max_value, samples)
mean |
Numeric, log mean parameter of the log normal distribution. |
mean_sd |
Numeric, standard deviation of the log mean parameter. |
sd |
Numeric, log sd parameter of the log normal distribution. |
sd_sd |
Numeric, standard deviation of the log sd parameter. |
max_value |
Numeric, the maximum value to allow. Defaults to 120. Samples outside of this range are resampled. |
samples |
Numeric, number of sample distributions to generate. |
A data.table defining the distribution as used by dist_skel
def <- lognorm_dist_def(mean = 1.621, mean_sd = 0.0640,
                        sd = 0.418, sd_sd = 0.0691,
                        max_value = 20, samples = 10)
print(def)
def$params[[1]]
Format Credible Intervals
make_conf(value, round_type = NULL, digits = 0)
value |
List of values to map into a string. Requires,
|
round_type |
Function, type of rounding to apply. Defaults to |
digits |
Numeric, defaults to 0. Amount of rounding to apply |
A character vector formatted for reporting
value <- list(list(point = 1, lower = 0, upper = 3))
make_conf(value, round_type = round, digits = 0)
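As an illustration of turning a point estimate and its interval into a reporting string, the sketch below formats each list entry as "point (lower -- upper)". `format_interval` is a hypothetical helper, and the exact output format is an assumption; the string produced by make_conf may be laid out differently.

```r
# Hypothetical helper (output format is an assumption, not make_conf's):
# format each list(point, lower, upper) as "point (lower -- upper)".
format_interval <- function(value, round_type = round, digits = 0) {
  vapply(value, function(v) {
    paste0(round_type(v$point, digits), " (",
           round_type(v$lower, digits), " -- ",
           round_type(v$upper, digits), ")")
  }, character(1))
}

value <- list(list(point = 1.2, lower = 0.4, upper = 3.1))
formatted <- format_interval(value, round_type = round, digits = 1)
formatted
```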
Categorises a numeric variable into "Increasing" (< 0.05), "Likely increasing" (<0.2), "Unsure" (< 0.8), "Likely decreasing" (< 0.95), "Decreasing" (<= 1)
map_prob_change(var)
var |
Numeric variable to be categorised |
A character variable.
var <- seq(0.01, 1, 0.01)
var
map_prob_change(var)
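The thresholds listed in the description can be reproduced with base R's cut(). The sketch below is a hypothetical re-implementation for illustration (`categorise_prob` is not part of the package). It treats each upper bound as exclusive apart from the last, as the description states, and would also label values above 1 as "Decreasing".

```r
# Hypothetical re-implementation of the documented thresholds using cut().
categorise_prob <- function(var) {
  as.character(cut(
    var,
    breaks = c(-Inf, 0.05, 0.2, 0.8, 0.95, Inf),
    labels = c("Increasing", "Likely increasing", "Unsure",
               "Likely decreasing", "Decreasing"),
    right = FALSE   # intervals are [lower, upper), so 0.05 is "Likely increasing"
  ))
}

categories <- categorise_prob(c(0.01, 0.1, 0.5, 0.9, 1))
categories
```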
Impute Cases Date of Infection
nowcast_pipeline( reported_cases = NULL, linelist = NULL, target_date = NULL, earliest_allowed_onset = NULL, merge_actual_onsets = FALSE, approx_delay = FALSE, max_delay = 120, verbose = FALSE, samples = 1, delay_defs = NULL, incubation_defs = NULL, nowcast_lag = 8, onset_modifier = NULL )
reported_cases |
A dataframe of reported cases |
linelist |
A linelist of report dates and onset dates |
earliest_allowed_onset |
A character string in the form of a date ("2020-01-01") indicating the earliest allowed onset. |
merge_actual_onsets |
Logical, defaults to |
approx_delay |
Logical, defaults to |
verbose |
Logical, defaults to |
delay_defs |
A data.table that defines the delay distributions (model, parameters and maximum delay for each model).
See |
incubation_defs |
A data.table that defines the incubation distributions (model, parameters and maximum delay for each model).
See |
nowcast_lag |
Numeric, defaults to 8. The number of days by which to lag nowcasts. Helps reduce bias due to case upscaling. |
onset_modifier |
data.frame containing a |
## Construct example distributions
## reporting delay dist
delay_dist <- suppressWarnings(
  EpiNow::get_dist_def(rexp(25, 1 / 10), samples = 1, bootstraps = 1))
## incubation delay dist
incubation_dist <- delay_dist
## Uses example case vector from EpiSoon
cases <- data.table::setDT(EpiSoon::example_obs_cases)
cases <- cases[, `:=`(confirm = as.integer(cases), import_status = "local")]
## Basic nowcast
nowcast <- nowcast_pipeline(reported_cases = cases,
                            target_date = max(cases$date),
                            delay_defs = delay_dist,
                            incubation_defs = incubation_dist)
nowcast
Plot a Time Series with Confidence.
plot_confidence( data, outer_alpha = 0.1, inner_alpha = 0.2, plot_median = TRUE, legend = "none" )
data |
Dataframe containing the following variables: |
outer_alpha |
Numeric, outer alpha level. |
inner_alpha |
Numeric, inner alpha level. |
plot_median |
Logical, defaults to |
legend |
Character string defaults to "none". Should a legend be displayed. |
A ggplot2 object.
Add a Forecast to a Plot
plot_forecast(plot = NULL, forecast = NULL)
plot |
|
forecast |
Dataframe containing a forecast with the following variables: |
A ggplot2 plot
Plot a Grid of Plots
plot_grid( regions = NULL, plot_object = "bigr_eff_plot.rds", results_dir = "results", target_date = NULL, ... )
regions |
A character string containing the list of regions to extract results for (must all have results for the same target date). |
plot_object |
A character string indicating the plot object to use as the base for the grid. |
results_dir |
A character string indicating the location of the results directory to extract results from. |
target_date |
A character string indicating the target date to extract results for. All regions must have results for this date. |
... |
Additional arguments to pass to |
A ggplot2 object combining multiple plots
## Code
plot_grid
Plot Pipeline Results
plot_pipeline( target_date = NULL, target_folder = NULL, min_plot_date = NULL, report_forecast = FALSE )
target_date |
Character string, in the form "2020-01-01". Date to cast. |
target_folder |
Character string, name of the folder in which to save the results. |
min_plot_date |
Character string, in the form "2020-01-01". Minimum date at which to start plotting estimates. |
report_forecast |
Logical, defaults to |
Plot a Summary of the Latest Results
plot_summary(summary_results, x_lab = "Region", log_cases = FALSE)
summary_results |
A datatable as returned by |
x_lab |
A character string giving the label for the x axis, defaults to region. |
log_cases |
Logical, should cases be shown on a logged scale. Defaults to |
A ggplot2 object
Extract the Maximum Value of a Variable Based on a Filter
pull_max_var(df, max_var = NULL, sel_var = NULL, type_selected = NULL)
df |
Datatable with the following variables: |
type_selected |
The nowcast type to extract. |
var |
Unquoted variable name to pull out the maximum R estimate for. |
A character string containing the maximum variable
df <- data.table::data.table(type = c("nowcast", "other"), var = c(1:10), sel = "test")
pull_max_var(df, max_var = "var", sel_var = "var", type_selected = "nowcast")
Samples the size (the number of trials) of a binomial distribution. Copied from https://github.com/sbfnk/bpmodels/blob/master/R/utils.r
rbinom_size(n, x, prob)
n |
Numeric, number of samples to draw |
x |
Numeric, offset. |
prob |
Numeric, probability of successful trial |
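One way to sample the unknown number of trials given x observed successes and success probability prob is to add a negative binomial draw of failures to x. Under the assumption of a flat prior on the number of trials, the number of failures follows a negative binomial distribution with size x + 1. The sketch below illustrates that approach. `sample_trials` is a hypothetical name, and whether rbinom_size uses exactly this construction is an assumption rather than something stated on this page.

```r
# Hypothetical sketch: sample the number of binomial trials given x successes.
# Assumption: flat prior on the number of trials, so the number of failures
# is negative binomial with size = x + 1 and probability = prob.
sample_trials <- function(n, x, prob) {
  x + stats::rnbinom(n, size = x + 1, prob = prob)
}

set.seed(42)
draws <- sample_trials(1000, x = 5, prob = 0.5)
summary(draws)
```

Every draw is at least x, since the number of trials cannot be smaller than the number of observed successes.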
Runs a pipeline by region.
regional_rt_pipeline( cases = NULL, linelist = NULL, delay_defs = NULL, incubation_defs = NULL, target_folder = "results", target_date = NULL, merge_onsets = FALSE, case_limit = 40, onset_modifier = NULL, dt_threads = 1, verbose = FALSE, ... )
cases |
A dataframe of cases ( |
linelist |
A dataframe of cases (by row) containing the following variables:
|
delay_defs |
A data.table that defines the delay distributions (model, parameters and maximum delay for each model).
See |
incubation_defs |
A data.table that defines the incubation distributions (model, parameters and maximum delay for each model).
See |
target_folder |
Character string, name of the folder in which to save the results. |
target_date |
Character string, in the form "2020-01-01". Date to cast. |
merge_onsets |
Logical, defaults to |
case_limit |
Numeric, the minimum number of cases in a region required for that region to be evaluated. Defaults to 40.
set to |
onset_modifier |
data.frame containing a |
dt_threads |
Numeric, the number of data.table threads to use. Set internally to avoid issue when running in parallel. Defaults to 1 thread. |
verbose |
Logical, defaults to |
... |
## Save everything to a temporary directory
## Change this to inspect locally
target_dir <- tempdir()
## Construct example distributions
## reporting delay dist
delay_dist <- suppressWarnings(
  EpiNow::get_dist_def(rexp(25, 1/10), samples = 10, bootstraps = 1))
## Uses example case vector from EpiSoon
cases <- data.table::setDT(EpiSoon::example_obs_cases)
cases <- cases[, `:=`(confirm = as.integer(cases), import_status = "local")][, cases := NULL]
cases <- data.table::rbindlist(list(
  data.table::copy(cases)[, region := "testland"],
  cases[, region := "realland"]))
## Run basic nowcasting pipeline
regional_rt_pipeline(cases = cases, delay_defs = delay_dist, target_folder = target_dir)
Generate Regional Summary Output
regional_summary( results_dir = NULL, summary_dir = NULL, target_date = NULL, region_scale = "Region", csv_region_label = "region", log_cases = FALSE )
results_dir |
A character string indicating the location of the results directory to extract results from. |
summary_dir |
A character string giving the directory in which to store summary of results. |
target_date |
A character string giving the target date for which to extract results (in the format "yyyy-mm-dd"). |
region_scale |
A character string indicating the name to give the regions being summarised. |
log_cases |
Logical, should cases be shown on a logged scale. Defaults to |
## Not run:
## Example assumes that CovidGlobalNow (github.com/epiforecasts/covid-global) is
## in the directory above the root.
regional_summary(results_dir = "../covid-global/national",
                 summary_dir = "../covid-global/national-summary",
                 target_date = "2020-03-19",
                 region_scale = "Country")
## End(Not run)
Report Rate of Growth Estimates
report_littler(target_folder)
Returns a summarised nowcast as well as saving key information to the results folder.
report_nowcast(nowcast, cases, target, target_folder)
nowcast |
A dataframe as produced by |
cases |
A dataframe of cases (in date order) with the following variables:
|
target |
Character string indicating the data type to use as the "nowcast". |
target_folder |
Character string indicating the folder into which to save results. Also used to extract previously generated results. |
Report Effective Reproduction Number Estimates
report_reff(target_folder)
Provide Summary Statistics on an Rt Pipeline
report_summary(target_folder)
Combines fitting a delay distribution, constructing a set of complete sampled linelists, nowcasting cases by onset date, and estimating the time-varying effective reproduction number and rate of spread.
rt_pipeline( cases = NULL, linelist = NULL, delay_defs = NULL, incubation_defs = NULL, delay_cutoff_date = NULL, rt_samples = 5, rt_windows = 1:7, rate_window = 7, earliest_allowed_onset = NULL, merge_actual_onsets = TRUE, approx_delay = FALSE, approx_threshold = 10000, max_delay = 120, generation_times = NULL, rt_prior = NULL, nowcast_lag = 8, forecast_model = NULL, horizon = 0, report_forecast = FALSE, onset_modifier = NULL, min_forecast_cases = 200, target_folder = NULL, target_date = NULL, dt_threads = 1, verbose = FALSE )
cases |
A dataframe of cases (in date order) with the following variables:
|
linelist |
A dataframe of cases (by row) containing the following variables:
|
delay_defs |
A data.table that defines the delay distributions (model, parameters and maximum delay for each model).
See |
incubation_defs |
A data.table that defines the incubation distributions (model, parameters and maximum delay for each model).
See |
delay_cutoff_date |
Character string, in the form "2020-01-01". Cutoff date to use to estimate the delay distribution. |
rt_samples |
Numeric, the number of samples to take from the estimated R distribution for each time point. |
rt_windows |
Numeric vector, windows over which to estimate time-varying R. The best performing window will be selected per serial interval sample by default (based on which window best forecasts current cases). |
rate_window |
Numeric, the window to use to estimate the rate of spread. |
earliest_allowed_onset |
A character string in the form of a date ("2020-01-01") indicating the earliest allowed onset. |
merge_actual_onsets |
Logical, defaults to TRUE. |
approx_delay |
Logical, defaults to FALSE. |
generation_times |
A matrix with columns representing samples and rows representing the probability of the serial interval being on
that day. Defaults to |
rt_prior |
A list defining the reproduction number prior containing the mean ( |
nowcast_lag |
Numeric, defaults to 8. The number of days by which to lag nowcasts. Helps reduce bias due to case upscaling. |
forecast_model |
An uninitialised bsts model passed to |
horizon |
Numeric, defaults to 0. The horizon over which to forecast Rts and cases. |
report_forecast |
Logical, defaults to FALSE. |
onset_modifier |
data.frame containing a |
min_forecast_cases |
Numeric, defaults to 200. The minimum number of cases required in the last 7 days of data in order for a forecast to be run. This prevents spurious forecasts based on highly uncertain Rt estimates. |
target_folder |
Character string, name of the folder in which to save the results. |
target_date |
Character string, in the form "2020-01-01". Date to cast. |
dt_threads |
Numeric, the number of data.table threads to use. Set internally to avoid issues when running in parallel. Defaults to 1 thread. |
verbose |
Logical, defaults to FALSE. |
approx_threshold |
Numeric, defaults to 10,000. Threshold of cases below which explicit sampling of onsets always occurs. |
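The generation_times argument is described as a matrix with one column per sample and one row per day. The following is a minimal base R sketch of a matrix with that shape, assuming a discretised gamma distribution with made-up means. The distribution, its parameters, and the day indexing are illustrative assumptions and are not taken from EpiNow.

```r
## Illustrative only: build a matrix with one column per sample and one
## row per day, where each column is a discretised probability distribution.
set.seed(123)
n_samples <- 5
max_days <- 30

## Made-up mean generation times (in days), one per sample
sample_means <- runif(n_samples, min = 4, max = 6)
shape <- 2

generation_times <- sapply(sample_means, function(m) {
  ## Probability mass in each one-day interval of a gamma distribution
  cdf <- pgamma(0:max_days, shape = shape, rate = shape / m)
  pmf <- diff(cdf)
  ## Renormalise so that each column sums to one after truncation
  pmf / sum(pmf)
})

dim(generation_times)
colSums(generation_times)
```

Whether rt_pipeline expects the first row to represent day 0 or day 1 is not stated here, so check the package source before passing in a matrix built this way.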
## Save everything to a temporary directory
## Change this to inspect locally
target_dir <- tempdir()

## Construct example distributions
## reporting delay dist
delay_dist <- suppressWarnings(
  EpiNow::get_dist_def(rexp(25, 1/10), samples = 10, bootstraps = 1))

## Uses example case vector from EpiSoon
cases <- data.table::setDT(EpiSoon::example_obs_cases)
cases <- cases[, `:=`(confirm = as.integer(cases), import_status = "local")][, cases := NULL]

## Run basic nowcasting pipeline
rt_pipeline(cases = cases,
            delay_defs = delay_dist,
            target_date = max(cases$date),
            target_folder = target_dir)
Approximate Sampling a Distribution using Counts
sample_approx_dist( cases = NULL, dist_fn = NULL, max_value = 120, earliest_allowed_mapped = NULL, direction = "backwards" )
cases |
A dataframe of cases (in date order) with the following variables:
|
max_value |
Numeric, maximum value to allow. Defaults to 120 days |
earliest_allowed_mapped |
A character string representing a date ("2020-01-01"). Indicates the earliest allowed mapped value. |
direction |
Character string, defaults to "backwards". Direction in which to map cases. Supports either "backwards" or "forwards". |
A data.table of cases by date of onset
cases <- data.table::as.data.table(EpiSoon::example_obs_cases)
cases <- cases[, cases := as.integer(cases)]

## Reported case distribution
print(cases)

## Total cases
sum(cases$cases)

delay_fn <- function(n, dist, cum) {
  pgamma(n + 0.9999, 2, 1) - pgamma(n - 1e-5, 2, 1)
}

onsets <- sample_approx_dist(cases = cases, dist_fn = delay_fn)

## Estimated onset distribution
print(onsets)

## Check that sum is equal to reported cases
total_onsets <- median(
  purrr::map_dbl(1:1000, ~ sum(sample_approx_dist(cases = cases, dist_fn = delay_fn)$cases)))
total_onsets

## Map from onset cases to reported
reports <- sample_approx_dist(cases = cases, dist_fn = delay_fn, direction = "forwards")
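To make the idea of mapping counts through a delay distribution concrete, here is a minimal base R sketch that does not depend on EpiNow. It discretises the same gamma delay used in the example above and redistributes made-up daily report counts backwards with a multinomial draw. The helper `map_backwards`, the counts, and the multinomial approach are all assumptions for illustration and should not be read as the algorithm implemented by sample_approx_dist.

```r
## Illustrative sketch only: not the EpiNow implementation.
set.seed(1)

## Discretised delay distribution over delays of 0..20 days, using the
## same gamma(2, 1) form as the delay_fn in the example above
max_delay <- 20
delays <- 0:max_delay
delay_pmf <- pgamma(delays + 0.9999, 2, 1) - pgamma(delays - 1e-5, 2, 1)
delay_pmf <- delay_pmf / sum(delay_pmf)

## Made-up reported case counts for six consecutive report days
reported <- c(0, 2, 5, 9, 14, 20)

## Hypothetical helper: sample a delay for every reported case and shift
## the case back to its implied onset day.
map_backwards <- function(counts, pmf) {
  n_days <- length(counts)
  max_delay <- length(pmf) - 1
  ## Element i of `onsets` corresponds to day i - max_delay
  onsets <- integer(n_days + max_delay)
  for (day in seq_len(n_days)) {
    ## Number of that day's cases assigned to each delay 0..max_delay
    draws <- as.vector(rmultinom(1, counts[day], pmf))
    idx <- day + max_delay - (0:max_delay)
    onsets[idx] <- onsets[idx] + draws
  }
  onsets
}

onsets <- map_backwards(reported, delay_pmf)
sum(onsets) == sum(reported)
```

Because every reported case is assigned exactly one delay, this sketch conserves the total count. The package example instead compares the median total over repeated draws with the reported total.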
Sample Onset Dates for Cases missing them
sample_delay(linelist = NULL, delay_fn = NULL, earliest_allowed_onset = NULL)
linelist |
Dataframe with two variables: date_report and date_onset. As generated by |
delay_fn |
A sampling function that takes a single numeric argument and returns a numeric vector of samples of that length. |
earliest_allowed_onset |
A character string in the form of a date ("2020-01-01") indicating the earliest allowed onset. |
Dataframe with no missing data and two variables: date_report and date_onset
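The delay_fn contract described above (one numeric argument n, returning n numeric samples) can be sketched in base R as follows. The exponential delay, the example linelist, and the fill-in step are assumptions made for illustration and do not reproduce what sample_delay does internally.

```r
## Illustrative sketch only: not the EpiNow implementation.
set.seed(42)

## A sampling function with the described contract: takes n, returns n
## delays in whole days (assumed exponential with a mean of 10 days)
delay_fn <- function(n) {
  as.integer(round(rexp(n, rate = 1 / 10)))
}

## Made-up linelist with some missing onset dates
linelist <- data.frame(
  date_report = as.Date("2020-03-01") + c(0, 1, 1, 3, 4),
  date_onset  = as.Date(c("2020-02-25", NA, "2020-02-27", NA, NA))
)

## Fill each missing onset with the report date minus a sampled delay
missing <- is.na(linelist$date_onset)
linelist$date_onset[missing] <-
  linelist$date_report[missing] - delay_fn(sum(missing))

linelist
```

A real pipeline would also need to respect earliest_allowed_onset, for example by resampling delays that place an onset before that date. That step is omitted here.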
Convert a linelist into a nested tibble of linelists by day
split_linelist_by_day(linelist = NULL)
linelist |
Dataframe with the following variables: date_onset_symptoms and date_confirmation |
A nested tibble with a date_report variable and a daily_observed_linelist variable holding the linelist for each day (containing date_onset and date_report)
Summarise a nowcast
summarise_cast(nowcast)
nowcast |
A dataframe as produced by |
A summarised dataframe
Summarise Realtime Results
summarise_results( regions = NULL, results_dir = "results", target_date = NULL, region_scale = "Region" )
regions |
A character string containing the list of regions to extract results for (must all have results for the same target date). |
results_dir |
A character string indicating the location of the results directory to extract results from. |
target_date |
A character string indicating the target date to extract results for. All regions must have results for this date. |
region_scale |
A character string indicating the name to give the regions being summarised. |
## Code summarise_results
Summarise rt and cases as a csv
summarise_to_csv( results_dir = NULL, summary_dir = NULL, type = "country", date = NULL )
results_dir |
Character string indicating the directory from which to extract results |
summary_dir |
Character string indicating the directory into which to save results |
type |
Character string, the region identifier to apply |
date |
A character string (in the format "yyyy-mm-dd") indicating the date to extract data for. Defaults to "latest" which finds the latest results available. |
Nothing is returned
Custom Map Theme
theme_map( map = NULL, continuous = FALSE, variable_label = NULL, trans = "identity", fill_labels = NULL, scale_fill = NULL, breaks = NULL, ... )
map |
|
continuous |
Logical, defaults to FALSE. |
trans |
A character string specifying the transform to use on the specified metric. Defaults to no
transform ("identity"). Other options include log scaling ("log") and log base 10 scaling
("log10"). For a complete list of options see |
fill_labels |
A function to use to allocate legend labels. An example (used below) is |
scale_fill |
Function to use for scaling the fill. Defaults to a custom |
breaks |
Breaks to use in legend. Defaults to |
... |
Additional arguments passed to |
A ggplot2 object
## Code theme_map