Package 'EpiNow'

Title: Estimate Realtime Case Counts and Time-varying Epidemiological Parameters
Description: To identify changes in the reproduction number, rate of spread, and doubling time during the course of outbreaks whilst accounting for potential biases due to delays in case reporting.
Authors: Sam Abbott [aut, cre] , Joel Hellewell [aut] , James Munday [aut], Robin Thompson [aut], Sebastian Funk [aut]
Maintainer: Sam Abbott <[email protected]>
License: MIT + file LICENSE
Version: 0.2.0
Built: 2024-12-27 05:15:57 UTC
Source: https://github.com/epiforecasts/EpiNow

Help Index


Add Dates from Data

Description

Pulls the last n dates from a vector

Usage

add_dates(dates, n)

Arguments

dates

Character vector of dates to pull from.

n

Number of dates required

Value

Character vector of dates of length n.

Examples

dates <- rep(1:10)

add_dates(dates, 3)
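
The behaviour described above (pulling the last n entries from a vector) can be sketched in base R with tail(). This is an illustration only and not the package's implementation; last_n_dates is a hypothetical helper name.

```r
# Hypothetical helper: return the last n elements of a date vector
# as a character vector.
last_n_dates <- function(dates, n) {
  as.character(utils::tail(dates, n))
}

dates <- as.character(seq(as.Date("2020-03-01"), by = "day", length.out = 10))

last_n_dates(dates, 3)
```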

Adjust Case Counts for Truncation

Description

Adjust Case Counts for Truncation

Usage

adjust_for_truncation(
  cases,
  cum_freq,
  dates,
  confidence_adjustment = NULL,
  samples
)

Arguments

cases

Numeric vector of cases

cum_freq

Numeric vector of cumulative frequencies

dates

Character vector of dates

confidence_adjustment

Numeric vector of frequencies used to adjust confidence

samples

Numeric, number of samples to take
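
A common way to correct right-truncated counts is to divide the most recent counts by the proportion of cases expected to have been reported so far. The sketch below shows that idea in base R. It assumes cum_freq lists the cumulative proportion reported for the most recent day first, and it ignores the sampling and confidence adjustment that the arguments above suggest the package performs. adjust_truncation_sketch is a hypothetical name.

```r
# Assumed inputs: daily counts (oldest first) and the cumulative proportion of
# cases reported for the most recent day, the day before, and so on.
cases <- c(20, 24, 30, 18, 9)
cum_freq <- c(0.3, 0.6, 0.9)

adjust_truncation_sketch <- function(cases, cum_freq) {
  n <- length(cases)
  k <- min(n, length(cum_freq))
  # Pad with 1 for days old enough to be treated as fully reported, then
  # reverse so the smallest proportion lines up with the most recent day.
  prop_reported <- c(rep(1, n - k), rev(cum_freq[seq_len(k)]))
  cases / prop_reported
}

adjusted <- adjust_truncation_sketch(cases, cum_freq)
adjusted
```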


Clean Nowcasts for a Supplied Date

Description

This function removes nowcasts in the format produced by EpiNow from a target directory for the date supplied.

Usage

clean_nowcasts(date = NULL, nowcast_dir = NULL)

Arguments

date

Date object. Defaults to today's date.

nowcast_dir

Character string giving the filepath to the nowcast results directory.


Generate a country map for a single variable.

Description

This general purpose function can be used to generate a country map for a single variable. It has few defaults but the data supplied must contain a region_code variable for linking to mapping data. This function requires the installation of the rnaturalearth package.

Usage

country_map(
  data = NULL,
  country = NULL,
  variable = NULL,
  variable_label = NULL,
  trans = "identity",
  fill_labels = NULL,
  scale_fill = NULL,
  show_caption = TRUE,
  ...
)

Arguments

data

Dataframe containing variables to be mapped. Must contain a region_code variable.

variable

A character string indicating the variable to map data for. This must be supplied.

trans

A character string specifying the transform to use on the specified metric. Defaults to no transform ("identity"). Other options include log scaling ("log") and log base 10 scaling ("log10"). For a complete list of options see ggplot2::continuous_scale.

fill_labels

A function to use to allocate legend labels. An example (used below) is scales::percent, which can be used for percentage data.

scale_fill

Function to use for scaling the fill. Defaults to a custom ggplot2::scale_fill_manual

Value

A ggplot2 object containing a country map.


Fit an integer adjusted exponential or gamma distribution

Description

Fit an integer adjusted exponential or gamma distribution

Usage

dist_fit(values = NULL, samples = NULL, dist = "exp")

Arguments

values

Numeric vector of values

samples

Numeric, number of samples to take

dist

Character string, which distribution to fit. Defaults to exponential ("exp") but gamma is also supported ("gamma").
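
One reading of "integer adjusted" is that delays are observed only as whole days, so the likelihood of an observed value k is the probability mass of the interval [k, k + 1). The sketch below fits an exponential rate on that basis by maximum likelihood in base R. It illustrates the idea under that assumption and is not the fitting routine used by dist_fit. nll_discrete_exp is a hypothetical name.

```r
# Negative log-likelihood of an exponential delay observed in whole days:
# P(X = k) = F(k + 1) - F(k) = exp(-rate * k) * (1 - exp(-rate)).
nll_discrete_exp <- function(rate, values) {
  log_p <- -rate * values + log1p(-exp(-rate))
  -sum(log_p)
}

set.seed(1)
values <- floor(stats::rexp(500, rate = 0.2))

fit <- stats::optimize(nll_discrete_exp, interval = c(1e-4, 5), values = values)
fit$minimum  # maximum likelihood estimate of the rate
```

For this likelihood the estimate also has a closed form, log(1 + 1 / mean(values)), which gives a quick check on the optimiser.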


Distribution Skeleton

Description

This function acts as a skeleton for a truncated distribution defined by model type, maximum value and model parameters. It is designed to be used with the output from get_dist.

Usage

dist_skel(n, dist = FALSE, cum = TRUE, model, params, max_value = 120)

Arguments

n

Numeric vector, number of samples to take (or days for the probability density).

dist

Logical, defaults to FALSE. Should the probability density be returned rather than a number of samples.

cum

Logical, defaults to TRUE. If dist = TRUE should the returned distribution be cumulative.

model

Character string, defining the model to be used. Supported options are exponential ("exp"), gamma ("gamma"), and log normal ("lognorm")

params

A list of parameter values (by name) required for each model. For the exponential model this is a rate parameter, for the gamma model this is alpha and beta, and for the log normal model this is mean and sd.

max_value

Numeric, the maximum value to allow. Defaults to 120. Samples outside of this range are resampled.

Value

A vector of samples or a probability distribution.

Examples

## Exponential model

## Sample
dist_skel(10, model = "exp", params = list(rate = 1))

## Cumulative prob density
dist_skel(1:10, model = "exp", dist = TRUE, params = list(rate = 1))

## Probability density
dist_skel(1:10, model = "exp", dist = TRUE, 
          cum = FALSE, params = list(rate = 1))

## Gamma model

dist_skel(10, model = "gamma", params = list(alpha = 1, beta = 2))

## Cumulative prob density
dist_skel(0:10, model = "gamma", dist = TRUE,
          params = list(alpha = 1, beta = 2))

## Probability density
dist_skel(0:10, model = "gamma", dist = TRUE, 
          cum = FALSE, params = list(alpha = 2, beta = 2))

## Log normal model

dist_skel(10, model = "lognorm", params = list(mean = log(5), sd = log(2)))

## Cumulative prob density
dist_skel(0:10, model = "lognorm", dist = TRUE,
          params = list(mean = log(5), sd = log(2)))

## Probability density
dist_skel(0:10, model = "lognorm", dist = TRUE, cum = FALSE,
          params = list(mean = log(5), sd = log(2)))
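
The max_value argument states that samples outside the allowed range are resampled. A minimal base R sketch of that resampling step follows. It is an illustration under that description and not the code used by dist_skel; sample_truncated is a hypothetical helper.

```r
# Draw n samples from `sampler`, redrawing any value above max_value until
# every sample lies within range.
sample_truncated <- function(n, sampler, max_value = 120) {
  out <- sampler(n)
  too_big <- out > max_value
  while (any(too_big)) {
    out[too_big] <- sampler(sum(too_big))
    too_big <- out > max_value
  }
  out
}

set.seed(42)
draws <- sample_truncated(1000, function(n) stats::rexp(n, rate = 0.1),
                          max_value = 20)
max(draws)
```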

Estimate time-varying measures and forecast

Description

Estimate time-varying measures and forecast

Usage

epi_measures_pipeline(
  nowcast = NULL,
  generation_times = NULL,
  min_est_date = NULL,
  gt_samples = 1,
  rt_samples = 5,
  rt_windows = 7,
  rate_window = 7,
  rt_prior = NULL,
  forecast_model = NULL,
  horizon = NULL,
  verbose = TRUE
)

Arguments

nowcast

A nowcast as produced by nowcast_pipeline

generation_times

A matrix with columns representing samples and rows representing the probability of the generation time being on that day.

min_est_date

Date to begin estimation.

gt_samples

Numeric, the number of samples to take from the generation times supplied.

rt_samples

Numeric, the number of samples to take from the estimated R distribution for each time point.

rt_windows

Numeric vector, windows over which to estimate time-varying R. The best performing window will be selected per serial interval sample by default (based on which window best forecasts current cases).

rate_window

Numeric, the window to use to estimate the rate of spread.

rt_prior

A list defining the reproduction number prior containing the mean (mean_prior) and standard deviation (std_prior)

forecast_model

An uninitialised bsts model passed to EpiSoon::forecast_rt to be used for forecasting future Rt values. An example of the required structure is: function(ss, y){bsts::AddSemilocalLinearTrend(ss, y = y)}.

horizon

Numeric, defaults to 0. The horizon over which to forecast Rts and cases.

verbose

Logical, defaults to TRUE. Should progress messages be shown.
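
The generation_times argument is described as a matrix with one column per sample and one row per day. The sketch below builds such a matrix in base R by discretising gamma distributions. The shape and rate values, the 30 day cut-off, and the choice to start rows at day 1 are illustrative assumptions, not package defaults.

```r
# Discretise a gamma distribution into daily probabilities for days 1 to 30.
days <- 1:30
discretise <- function(shape, rate) {
  p <- stats::pgamma(days, shape, rate) - stats::pgamma(days - 1, shape, rate)
  p / sum(p)  # renormalise after truncating at 30 days
}

# Two samples of the generation time distribution, one per column.
generation_times <- cbind(discretise(2.5, 0.7), discretise(3, 0.8))

dim(generation_times)
colSums(generation_times)
```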


Estimate the doubling time

Description

Estimate the doubling time

Usage

estimate_doubling_time(r)

Arguments

r

An estimate of the rate of change (r)

Value

A vector of numeric values
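
For exponential growth at rate r, the standard relationship is doubling time = log(2) / r. The sketch below applies that formula in base R. It is assumed, not confirmed by this manual, that estimate_doubling_time uses the same relationship; doubling_time is a hypothetical name.

```r
# Doubling time implied by a constant exponential growth rate r (per day).
doubling_time <- function(r) log(2) / r

r <- c(0.05, 0.1, 0.2)
doubling_time(r)
```

A negative r gives a negative value, which can be read as a halving time.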


Estimate r

Description

Estimate r

Usage

estimate_little_r(sample, min_time = NULL, max_time = NULL)

Arguments

sample

A datatable containing a numeric cases variable.

min_time

Numeric, minimum time to use to fit the model.

max_time

Numeric, maximum time to use to fit the model.

Value

A datatable containing an estimate of r, its standard deviation and a measure of the goodness of fit.

Examples

cases <- data.table::setDT(EpiSoon::example_obs_cases)[, 
                          cases := as.integer(cases)]

estimate_little_r(cases)
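
One way to estimate a growth rate and its standard deviation from a case series is a Poisson regression of cases on time, where the slope is r. The base R sketch below uses simulated data with a known growth rate of 0.15. It illustrates the general technique only; the model used by estimate_little_r may differ.

```r
set.seed(7)
time <- 0:20
# Simulated counts growing exponentially at rate 0.15 per day.
cases <- stats::rpois(length(time), lambda = 10 * exp(0.15 * time))

# Log-linear (Poisson) model: log(E[cases]) = intercept + r * time.
fit <- stats::glm(cases ~ time, family = stats::poisson())

r <- unname(stats::coef(fit)["time"])
r_sd <- unname(sqrt(diag(stats::vcov(fit)))["time"])
c(r = r, sd = r_sd)
```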

Estimate r in a set time window

Description

Estimate r in a set time window

Usage

estimate_r_in_window(
  onsets = NULL,
  min_time = NULL,
  max_time = NULL,
  bootstrap_samples = 1000
)

Arguments

onsets

A list of sampled datasets, nested within the dataset they were sampled from.

min_time

Numeric, the minimum time to fit the model to.

max_time

Numeric, the maximum time to fit the model to.

bootstrap_samples

Numeric, defaults to 1000. The number of samples to take when bootstrapping little r to account for model uncertainty.

Value

A list of 3 dataframes containing estimates for little r, doubling time and model goodness of fit.


Estimate the time varying R0 - using EpiEstim

Description

Estimate the time varying R0 - using EpiEstim

Usage

estimate_R0(
  cases = NULL,
  generation_times = NULL,
  rt_prior = NULL,
  windows = NULL,
  gt_samples = 100,
  rt_samples = 100,
  min_est_date = NULL,
  forecast_model = NULL,
  horizon = 0
)

Arguments

cases

A dataframe containing a list of local cases with the following variables: date, cases, and import_status

generation_times

A matrix with columns representing samples and rows representing the probability of the generation time being on that day.

rt_prior

A list defining the reproduction number prior containing the mean (mean_prior) and standard deviation (std_prior)

windows

Numeric vector, windows over which to estimate time-varying R. The best performing window will be selected per serial interval sample by default (based on which window best forecasts current cases).

gt_samples

Numeric, the number of samples to take from the generation times supplied.

rt_samples

Numeric, the number of samples to take from the estimated R distribution for each time point.

min_est_date

Date to begin estimation.

forecast_model

An uninitialised bsts model passed to EpiSoon::forecast_rt to be used for forecasting future Rt values. An example of the required structure is: function(ss, y){bsts::AddSemilocalLinearTrend(ss, y = y)}.

horizon

Numeric, defaults to 0. The horizon over which to forecast Rts and cases.

Value

A tibble containing the date and summarised R estimate.

Examples

## Nowcast Rts                  
estimates <- estimate_R0(cases = EpiSoon::example_obs_cases, 
                         generation_times = as.matrix(EpiNow::covid_generation_times[,2]), 
                         rt_prior = list(mean_prior = 2.6, std_prior = 2),
                         windows = c(1, 3, 7), rt_samples = 10, gt_samples = 1,
                         min_est_date =  as.Date("2020-02-18"))
                         
                         
estimates$rts
  
## Nowcast Rts, forecast Rts and the forecast cases
estimates <- estimate_R0(cases = EpiSoon::example_obs_cases, 
                         generation_times = as.matrix(EpiNow::covid_generation_times[,1]), 
                         rt_prior = list(mean_prior = 2.6, std_prior = 2),
                         windows = c(1, 3, 7), rt_samples = 10, gt_samples = 20,
                         min_est_date =  as.Date("2020-02-18"),
                         forecast_model = function(...){EpiSoon::fable_model(model = fable::ETS(y ~ trend("A")), ...)},
                         horizon = 14)
                                           
## Rt estimates and forecasts
estimates$rts



## Case forecasts
estimates$cases

Estimate time varying r

Description

Estimate time varying r

Usage

estimate_time_varying_r(onsets, window = 7)

Arguments

onsets

A list of sampled datasets, nested within the dataset they were sampled from.

window

integer value for window size in days (default = 7)

Value

A dataframe of r estimates over time summarised across samples.


Generate a Gamma Distribution Definition Based on Parameter Estimates

Description

Generates a distribution definition when only parameter estimates are available for gamma distributed parameters. See rgamma for distribution information.

Usage

gamma_dist_def(shape, shape_sd, scale, scale_sd, max_value, samples)

Arguments

shape

Numeric, shape parameter of the gamma distribution.

shape_sd

Numeric, standard deviation of the shape parameter.

scale

Numeric, scale parameter of the gamma distribution.

scale_sd

Numeric, standard deviation of the scale parameter.

max_value

Numeric, the maximum value to allow. Defaults to 120. Samples outside of this range are resampled.

samples

Numeric, number of sample distributions to generate.

Value

A data.table defining the distribution as used by dist_skel.

Examples

def <- gamma_dist_def(shape = 5.807, shape_sd = 0.2,
               scale = 0.9, scale_sd = 0.05,
               max_value = 20, samples = 10)
               
print(def)

def$params[[1]]
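
To interpret the shape and scale values used in the example above, recall that a gamma distribution has mean shape * scale and standard deviation sqrt(shape) * scale. The base R sketch below computes both for the central parameter values and cross-checks them against simulated draws. It does not use gamma_dist_def and ignores the uncertainty given by shape_sd and scale_sd.

```r
shape <- 5.807
scale <- 0.9

# Analytical mean and standard deviation of the gamma distribution.
gamma_summary <- c(mean = shape * scale, sd = sqrt(shape) * scale)
gamma_summary

# Cross-check against simulated draws (agreement is only approximate).
set.seed(11)
draws <- stats::rgamma(1e5, shape = shape, scale = scale)
c(mean = mean(draws), sd = stats::sd(draws))
```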

Generate a sample linelist from the observed linelist and sampled linelists

Description

Generate a sample linelist from the observed linelist and sampled linelists

Usage

generate_pseudo_linelist(
  count_linelist = NULL,
  observed_linelist = NULL,
  merge_actual_onsets = TRUE
)

Arguments

count_linelist

Dataframe with two variables: date_report and daily_linelist. As generated by linelist_from_case_counts.

observed_linelist

Dataframe with two variables: date_report and daily_observed_linelist. As generated by split_linelist_by_day.

merge_actual_onsets

Logical, defaults to TRUE. Should linelist onset dates be used where available?

earliest_allowed_onset

A character string in the form of a date ("2020-01-01") indicating the earliest allowed onset.

Value

Dataframe with two variables: date_report and date_onset


Get the Parameters that Define a Discrete Distribution

Description

Get the Parameters that Define a Discrete Distribution

Usage

get_dist_def(
  values,
  verbose = FALSE,
  samples = 1,
  bootstraps = 1,
  bootstrap_samples = 250
)

Arguments

values

Numeric vector of integer values.

verbose

Logical, defaults to FALSE. Should progress messages be printed

bootstraps

Numeric, defaults to 1. The number of bootstrap samples (with replacement) of the delay distribution to take.

bootstrap_samples

Numeric, defaults to 250. The number of samples to take in each bootstrap. When the sample size of the supplied delay distribution is less than 100 this is used instead.

Value

A data.table of distributions and the parameters that define them.

Author(s)

Sebastian Funk [email protected]

Examples

## Example with exponential and a small sample
delays <- rexp(20, 1)

get_dist_def(delays, samples = 10, verbose = TRUE)


## Example with gamma and a larger sample
delays <- rgamma(100, 4, 1)

out <- get_dist_def(delays, samples = 2, bootstraps = 2)

## Inspect
out

## Inspect one parameter
out$params[[1]]


## Load into skeleton and sample with truncation
EpiNow::dist_skel(10, model = out$model[[1]],
                  params = out$params[[1]],
                  max_value = out$max_value[[1]])

Combine total and imported case counts

Description

Combine total and imported case counts

Usage

get_local_import_case_counts(total_cases, linelist = NULL, cases_from = NULL)

Arguments

total_cases

Dataframe with following variables: date and cases.

linelist

Dataframe with at least the following variables: date_confirm, import_status

cases_from

A character string containing a date in the format "yyyy-mm-dd". Applies a filter to returned cases.

Value

A tibble containing cases by date locally and imported


Get Folders with Nowcast Results

Description

Get Folders with Nowcast Results

Usage

get_regions(results_dir)

Arguments

results_dir

A character string giving the directory in which results are stored (as produced by regional_rt_pipeline).

Value

A named character vector containing the results to plot.

Examples

## Code 
get_regions

Get Timeseries from EpiNow

Description

Get Timeseries from EpiNow

Usage

get_timeseries(results_dir = NULL, date = NULL, summarised = FALSE)

Arguments

results_dir

A character string indicating the folder containing the EpiNow results to extract.

date

A character string (in the format "yyyy-mm-dd") indicating the date to extract data for. Defaults to "latest" which finds the latest results available.

summarised

Logical, defaults to FALSE. Should full or summarised results be returned.

Examples

## Not run: 
## Assuming epiforecasts/covid is one repo higher
## Summary results
get_timeseries("../covid/_posts/global/nowcast/results/", 
               summarised = TRUE)

## Simulations
get_timeseries("../covid/_posts/global/nowcast/results/")

## End(Not run)
## Code
get_timeseries

Generate a global map for a single variable.

Description

This general purpose function can be used to generate a global map for a single variable. It has few defaults but the data supplied must contain a country variable for linking to mapping data. This function requires the installation of the rnaturalearth package.

Usage

global_map(
  data = NULL,
  variable = NULL,
  variable_label = NULL,
  trans = "identity",
  fill_labels = NULL,
  scale_fill = NULL,
  show_caption = TRUE,
  ...
)

Arguments

data

Dataframe containing variables to be mapped. Must contain a country variable.

variable

A character string indicating the variable to map data for. This must be supplied.

trans

A character string specifying the transform to use on the specified metric. Defaults to no transform ("identity"). Other options include log scaling ("log") and log base 10 scaling ("log10"). For a complete list of options see ggplot2::continuous_scale.

fill_labels

A function to use to allocate legend labels. An example (used below) is scales::percent, which can be used for percentage data.

scale_fill

Function to use for scaling the fill. Defaults to a custom ggplot2::scale_fill_manual

Value

A ggplot2 object containing a global map.

Examples

df <- data.table::data.table(variable = "Increasing", country = "France") 

global_map(df, variable = "variable")

Sample a linelist from case counts and a reporting delay distribution

Description

Sample a linelist from case counts and a reporting delay distribution

Usage

linelist_from_case_counts(cases = NULL)

Arguments

cases

Dataframe with two variables: confirm (numeric) and date_report (date).

Value

A linelist grouped by day as a tibble with two variables: date_report, and daily_observed_linelist


Load nowcast results

Description

Load nowcast results

Usage

load_nowcast_result(
  file = NULL,
  region = NULL,
  date = target_date,
  result_dir = results_dir
)

Arguments

file

Character string giving the result file's name.

region

Character string giving the region of interest.

date

Target date (in the format "yyyy-mm-dd").

result_dir

Character string giving the location of the target directory


Generate a Log Normal Distribution Definition Based on Parameter Estimates

Description

Generates a distribution definition when only parameter estimates are available for log normal distributed parameters. See rlnorm for distribution information.

Usage

lognorm_dist_def(mean, mean_sd, sd, sd_sd, max_value, samples)

Arguments

mean

Numeric, log mean parameter of the log normal distribution.

mean_sd

Numeric, standard deviation of the log mean parameter.

sd

Numeric, log sd parameter of the log normal distribution.

sd_sd

Numeric, standard deviation of the log sd parameter.

max_value

Numeric, the maximum value to allow. Defaults to 120. Samples outside of this range are resampled.

samples

Numeric, number of sample distributions to generate.

Value

A data.table defining the distribution as used by dist_skel.

Examples

def <- lognorm_dist_def(mean = 1.621, mean_sd = 0.0640,
                        sd = 0.418, sd_sd = 0.0691,
                        max_value = 20, samples = 10)
               
print(def)

def$params[[1]]
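
The mean and sd arguments above are on the log scale, so it can help to convert them to the natural scale. For a log normal distribution the median is exp(mean) and the mean is exp(mean + sd^2 / 2). The base R sketch below applies these formulas to the central values from the example. It does not use lognorm_dist_def and ignores the uncertainty given by mean_sd and sd_sd.

```r
meanlog <- 1.621
sdlog <- 0.418

# Natural-scale summaries implied by the log-scale parameters.
lognorm_summary <- c(
  median = exp(meanlog),
  mean = exp(meanlog + sdlog^2 / 2)
)
lognorm_summary
```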

Format Credible Intervals

Description

Format Credible Intervals

Usage

make_conf(value, round_type = NULL, digits = 0)

Arguments

value

List of values to map into a string. Requires point, lower, and upper.

round_type

Function, type of rounding to apply. Defaults to round.

digits

Numeric, defaults to 0. Amount of rounding to apply

Value

A character vector formatted for reporting

Examples

value <- list(list(point = 1, lower = 0, upper = 3))

make_conf(value, round_type = round, digits = 0)
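
The base R sketch below shows one way a list of point, lower, and upper values could be rounded and collapsed into a reporting string. The output format ("point (lower to upper)") is an assumption made for illustration and may not match what make_conf returns. format_interval is a hypothetical helper that only supports rounding functions accepting a digits argument, such as round or signif.

```r
# Hypothetical helper: round each element, then paste into a single string.
format_interval <- function(value, round_type = round, digits = 0) {
  vapply(value, function(v) {
    v <- lapply(v, round_type, digits)
    paste0(v$point, " (", v$lower, " to ", v$upper, ")")
  }, character(1))
}

value <- list(list(point = 1.24, lower = 0.36, upper = 3.1))

formatted <- format_interval(value, digits = 1)
formatted
```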

Categorise the Probability of Change for Rt

Description

Categorises a numeric variable into "Increasing" (< 0.05), "Likely increasing" (< 0.2), "Unsure" (< 0.8), "Likely decreasing" (< 0.95), "Decreasing" (<= 1).

Usage

map_prob_change(var)

Arguments

var

Numeric variable to be categorised

Value

A character variable.

Examples

var <- seq(0.01, 1, 0.01)

var
 
map_prob_change(var)
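
The thresholds listed in the description can be expressed with cut() in base R. The sketch below follows those thresholds but is not the package's implementation; categorise_prob is a hypothetical name.

```r
# Map probabilities to categories using the thresholds from the description.
categorise_prob <- function(var) {
  as.character(cut(
    var,
    breaks = c(-Inf, 0.05, 0.2, 0.8, 0.95, 1),
    labels = c("Increasing", "Likely increasing", "Unsure",
               "Likely decreasing", "Decreasing"),
    right = FALSE,          # intervals are [lower, upper)
    include.lowest = TRUE   # so that exactly 1 falls in the last interval
  ))
}

categories <- categorise_prob(c(0.01, 0.1, 0.5, 0.9, 1))
categories
```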

Impute Cases Date of Infection

Description

Impute Cases Date of Infection

Usage

nowcast_pipeline(
  reported_cases = NULL,
  linelist = NULL,
  target_date = NULL,
  earliest_allowed_onset = NULL,
  merge_actual_onsets = FALSE,
  approx_delay = FALSE,
  max_delay = 120,
  verbose = FALSE,
  samples = 1,
  delay_defs = NULL,
  incubation_defs = NULL,
  nowcast_lag = 8,
  onset_modifier = NULL
)

Arguments

reported_cases

A dataframe of reported cases

linelist

A linelist of report dates and onset dates

earliest_allowed_onset

A character string in the form of a date ("2020-01-01") indicating the earliest allowed onset.

merge_actual_onsets

Logical, defaults to FALSE. Should linelist onset dates be used where available?

approx_delay

Logical, defaults to FALSE. Should delay sampling be approximated using case counts? Not appropriate when case numbers are low. Useful for high case counts, as it decouples run time and resource usage from the case count.

verbose

Logical, defaults to FALSE. Should internal nowcasting progress messages be returned.

delay_defs

A data.table that defines the delay distributions (model, parameters and maximum delay for each model). See get_delay_dist for an example of the structure.

incubation_defs

A data.table that defines the incubation distributions (model, parameters and maximum delay for each model). See get_delay_dist for an example of the structure.

nowcast_lag

Numeric, defaults to 8. The number of days by which to lag nowcasts. Helps reduce bias due to case upscaling.

onset_modifier

data.frame containing a date variable and a function modifier variable. This is used to modify estimated cases by onset date. modifier must be a function that returns a proportion when called (enables inclusion of uncertainty) and takes the following arguments: n (samples to return) and status ("local" or "import").

Examples

## Construct example distributions
## reporting delay dist
delay_dist <- suppressWarnings(
               EpiNow::get_dist_def(rexp(25, 1 / 10), 
                                    samples = 1, bootstraps = 1))
## incubation delay dist
incubation_dist <- delay_dist

## Uses example case vector from EpiSoon
cases <- data.table::setDT(EpiSoon::example_obs_cases)
cases <- cases[, `:=`(confirm = as.integer(cases), import_status = "local")]

## Basic nowcast
nowcast <- nowcast_pipeline(reported_cases = cases, 
                            target_date = max(cases$date),
                            delay_defs = delay_dist,
                            incubation_defs = incubation_dist)
                            
nowcast
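
The onset_modifier argument is described as a data.frame with a date variable and a modifier variable holding functions that take n and status and return proportions. The base R sketch below builds an object of that shape. The beta distribution, the 0.8 scaling, and the use of a list column are illustrative assumptions; the structure the package expects in practice may differ.

```r
# A modifier function: returns n proportions. Local cases are assumed to be
# scaled by roughly 0.8 with some uncertainty; other cases are left unchanged.
modifier <- function(n, status) {
  if (status == "local") stats::rbeta(n, 80, 20) else rep(1, n)
}

# One row per date, with the function stored in a list column.
onset_modifier <- data.frame(
  date = seq(as.Date("2020-03-01"), by = "day", length.out = 3)
)
onset_modifier$modifier <- rep(list(modifier), nrow(onset_modifier))

# Call the modifier stored for the first date.
set.seed(5)
local_props <- onset_modifier$modifier[[1]](5, "local")
import_props <- onset_modifier$modifier[[1]](5, "import")
```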

Plot a Time Series with Confidence.

Description

Plot a Time Series with Confidence.

Usage

plot_confidence(
  data,
  outer_alpha = 0.1,
  inner_alpha = 0.2,
  plot_median = TRUE,
  legend = "none"
)

Arguments

data

Dataframe containing the following variables: date, median, type, bottom, top, lower, upper, and confidence

outer_alpha

Numeric, outer alpha level.

inner_alpha

Numeric, inner alpha level.

plot_median

Logical, defaults to TRUE. Should the median be plotted.

legend

Character string defaults to "none". Should a legend be displayed.

Value

A ggplot2 object.


Add a Forecast to a Plot

Description

Add a Forecast to a Plot

Usage

plot_forecast(plot = NULL, forecast = NULL)

Arguments

plot

ggplot2 plot.

forecast

Dataframe containing a forecast with the following variables: bottom, top, lower, and upper.

Value

A ggplot2 plot


Plot a Grid of Plots

Description

Plot a Grid of Plots

Usage

plot_grid(
  regions = NULL,
  plot_object = "bigr_eff_plot.rds",
  results_dir = "results",
  target_date = NULL,
  ...
)

Arguments

regions

A character string containing the list of regions to extract results for (must all have results for the same target date).

plot_object

A character string indicating the plot object to use as the base for the grid.

results_dir

A character string indicating the location of the results directory to extract results from.

target_date

A character string indicating the target date to extract results for. All regions must have results for this date.

...

Additional arguments to pass to patchwork::plot_layout

Value

A ggplot2 object combining multiple plots

Examples

## Code 
plot_grid

Plot Pipeline Results

Description

Plot Pipeline Results

Usage

plot_pipeline(
  target_date = NULL,
  target_folder = NULL,
  min_plot_date = NULL,
  report_forecast = FALSE
)

Arguments

target_date

Character string, in the form "2020-01-01". Date to cast.

target_folder

Character string, name of the folder in which to save the results.

min_plot_date

Character string, in the form "2020-01-01". Minimum date at which to start plotting estimates.

report_forecast

Logical, defaults to FALSE. Should the forecast be reported.


Plot a Summary of the Latest Results

Description

Plot a Summary of the Latest Results

Usage

plot_summary(summary_results, x_lab = "Region", log_cases = FALSE)

Arguments

summary_results

A datatable as returned by summarise_results (the data object).

x_lab

A character string giving the label for the x axis, defaults to "Region".

log_cases

Logical, should cases be shown on a logged scale. Defaults to FALSE

Value

A ggplot2 object


Extract the Maximum Value of a Variable Based on a Filter

Description

Extract the Maximum Value of a Variable Based on a Filter

Usage

pull_max_var(df, max_var = NULL, sel_var = NULL, type_selected = NULL)

Arguments

df

Datatable with the following variables: type and si_dist

type_selected

The nowcast type to extract.

var

Unquoted variable name to pull out the maximum R estimate for.

Value

A character string containing the maximum variable

Examples

df <- data.table::data.table(type = c("nowcast", "other"),
                             var = c(1:10),
                             sel = "test")
                             
pull_max_var(df, max_var = "var", sel_var = "var", type_selected = "nowcast")

Draw with an offset from a negative binomial distribution

Description

Samples the size (the number of trials) of a binomial distribution. Copied from https://github.com/sbfnk/bpmodels/blob/master/R/utils.r

Usage

rbinom_size(n, x, prob)

Arguments

n

Numeric, number of samples to draw

x

Numeric, offset.

prob

Numeric, probability of successful trial
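
The title and description suggest that the number of trials is obtained by adding the offset x (the observed successes) to a negative binomial draw of the number of failures. The base R sketch below implements that reading. It is an assumption based on the wording above, not a copy of the package source; rbinom_size_sketch is a hypothetical name.

```r
# Number of trials = observed successes (x) + failures before the x-th success.
rbinom_size_sketch <- function(n, x, prob) {
  x + stats::rnbinom(n, size = x, prob = prob)
}

set.seed(3)
trials <- rbinom_size_sketch(1000, x = 5, prob = 0.25)

range(trials)
mean(trials)  # expected value is x / prob = 20
```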


Regional Realtime Pipeline

Description

Runs a pipeline by region.

Usage

regional_rt_pipeline(
  cases = NULL,
  linelist = NULL,
  delay_defs = NULL,
  incubation_defs = NULL,
  target_folder = "results",
  target_date = NULL,
  merge_onsets = FALSE,
  case_limit = 40,
  onset_modifier = NULL,
  dt_threads = 1,
  verbose = FALSE,
  ...
)

Arguments

cases

A dataframe of cases (confirm) by date of confirmation (date), import status (import_status; ("imp)), and region (region).

linelist

A dataframe of cases (by row) containing the following variables: import_status (values "local" and "imported"), date_onset, date_confirm, report_delay, and region. If a national linelist is not available a proxy linelist may be used but in this case merge_onsets should be set to FALSE.

delay_defs

A data.table that defines the delay distributions (model, parameters and maximum delay for each model). See get_delay_dist for an example of the structure.

incubation_defs

A data.table that defines the incubation distributions (model, parameters and maximum delay for each model). See get_delay_dist for an example of the structure.

target_folder

Character string, name of the folder in which to save the results.

target_date

Character string, in the form "2020-01-01". Date to cast.

merge_onsets

Logical, defaults to FALSE. Should available onset data be used. Typically if regional_delay is set to FALSE this should also be FALSE.

case_limit

Numeric, the minimum number of cases in a region required for that region to be evaluated. Defaults to 40.

onset_modifier

data.frame containing a date variable and a function modifier variable. This is used to modify estimated cases by onset date. modifier must be a function that returns a proportion when called (enables inclusion of uncertainty) and takes the following arguments: n (samples to return) and status ("local" or "import").

dt_threads

Numeric, the number of data.table threads to use. Set internally to avoid issue when running in parallel. Defaults to 1 thread.

verbose

Logical, defaults to FALSE. Should progress messages be shown for each region?

...

Examples

## Save everything to a temporary directory 
## Change this to inspect locally
target_dir <- tempdir() 

## Construct example distributions
## reporting delay dist
delay_dist <- suppressWarnings(
               EpiNow::get_dist_def(rexp(25, 1/10), 
                                    samples = 10, bootstraps = 1))

## Uses example case vector from EpiSoon
cases <- data.table::setDT(EpiSoon::example_obs_cases)
cases <- cases[, `:=`(confirm = as.integer(cases), import_status = "local")][,
                  cases := NULL]

cases <- data.table::rbindlist(list(
  data.table::copy(cases)[, region := "testland"],
  cases[, region := "realland"]))
  
## Run basic nowcasting pipeline
regional_rt_pipeline(cases = cases,
            delay_defs = delay_dist,
            target_folder = target_dir)

Generate Regional Summary Output

Description

Generate Regional Summary Output

Usage

regional_summary(
  results_dir = NULL,
  summary_dir = NULL,
  target_date = NULL,
  region_scale = "Region",
  csv_region_label = "region",
  log_cases = FALSE
)

Arguments

results_dir

A character string indicating the location of the results directory to extract results from.

summary_dir

A character string giving the directory in which to store summary of results.

target_date

A character string giving the target date for which to extract results (in the format "yyyy-mm-dd").

region_scale

A character string indicating the name to give the regions being summarised.

log_cases

Logical, should cases be shown on a logged scale. Defaults to FALSE

Examples

## Not run: 

## Example assumes that CovidGlobalNow (github.com/epiforecasts/covid-global) is  
## in the directory above the root.
regional_summary(results_dir = "../covid-global/national",
                 summary_dir = "../covid-global/national-summary",
                 target_date = "2020-03-19",
                 region_scale = "Country")


## End(Not run)

Report Rate of Growth Estimates

Description

Report Rate of Growth Estimates

Usage

report_littler(target_folder)

Report Case Nowcast Estimates

Description

Returns a summarised nowcast as well as saving key information to the results folder.

Usage

report_nowcast(nowcast, cases, target, target_folder)

Arguments

nowcast

A dataframe as produced by nowcast_pipeline

cases

A dataframe of cases (in date order) with the following variables: date and cases.

target

Character string indicating the data type to use as the "nowcast".

target_folder

Character string indicating the folder into which to save results. Also used to extract previously generated results.


Report Effective Reproduction Number Estimates

Description

Report Effective Reproduction Number Estimates

Usage

report_reff(target_folder)

Provide Summary Statistics on an Rt Pipeline

Description

Provide Summary Statistics on an Rt Pipeline

Usage

report_summary(target_folder)

Real-time Pipeline

Description

Combines fitting a delay distribution, constructing a set of complete sampled linelists, nowcasting cases by onset date, and estimating the time-varying effective reproduction number and rate of spread.

Usage

rt_pipeline(
  cases = NULL,
  linelist = NULL,
  delay_defs = NULL,
  incubation_defs = NULL,
  delay_cutoff_date = NULL,
  rt_samples = 5,
  rt_windows = 1:7,
  rate_window = 7,
  earliest_allowed_onset = NULL,
  merge_actual_onsets = TRUE,
  approx_delay = FALSE,
  approx_threshold = 10000,
  max_delay = 120,
  generation_times = NULL,
  rt_prior = NULL,
  nowcast_lag = 8,
  forecast_model = NULL,
  horizon = 0,
  report_forecast = FALSE,
  onset_modifier = NULL,
  min_forecast_cases = 200,
  target_folder = NULL,
  target_date = NULL,
  dt_threads = 1,
  verbose = FALSE
)

Arguments

cases

A dataframe of cases (in date order) with the following variables: date and cases.

linelist

A dataframe of cases (by row) containing the following variables: import_status (values "local" and "imported"), date_onset, date_confirm, report_delay.

delay_defs

A data.table that defines the delay distributions (model, parameters and maximum delay for each model). See get_delay_dist for an example of the structure.

incubation_defs

A data.table that defines the incubation distributions (model, parameters and maximum delay for each model). See get_delay_dist for an example of the structure.

delay_cutoff_date

Character string, in the form "2020-01-01". Cutoff date to use to estimate the delay distribution.

rt_samples

Numeric, the number of samples to take from the estimated R distribution for each time point.

rt_windows

Numeric vector, windows over which to estimate time-varying R. The best performing window will be selected per serial interval sample by default (based on which window best forecasts current cases).

rate_window

Numeric, the window to use to estimate the rate of spread.

earliest_allowed_onset

A character string in the form of a date ("2020-01-01") indicating the earliest allowed onset.

merge_actual_onsets

Logical, defaults to TRUE. Should linelist onset dates be used where available?

approx_delay

Logical, defaults to FALSE. Should delay sampling be approximated using case counts? Not appropriate when case numbers are low. Useful for high case counts as it decouples run time and resource usage from the case count.

generation_times

A matrix with columns representing samples and rows representing the probability of the serial interval being on that day. Defaults to EpiNow::covid_generation_times.

rt_prior

A list defining the reproduction number prior containing the mean (mean_prior) and standard deviation (std_prior)

nowcast_lag

Numeric, defaults to 8. The number of days by which to lag nowcasts. Helps reduce bias due to case upscaling.

forecast_model

An uninitialised bsts model passed to EpiSoon::forecast_rt to be used for forecasting future Rt values. An example of the required structure is: function(ss, y){bsts::AddSemilocalLinearTrend(ss, y = y)}.

horizon

Numeric, defaults to 0. The horizon over which to forecast Rts and cases.

report_forecast

Logical, defaults to FALSE. Should the forecast be reported.

onset_modifier

data.frame containing a date variable and a function modifier variable. This is used to modify estimated cases by onset date. modifier must be a function that returns a proportion when called (enables inclusion of uncertainty) and takes the following arguments: n (samples to return) and status ("local" or "import").

min_forecast_cases

Numeric, defaults to 200. The minimum number of cases required in the last 7 days of data in order for a forecast to be run. This prevents spurious forecasts based on highly uncertain Rt estimates.

target_folder

Character string, name of the folder in which to save the results.

target_date

Character string, in the form "2020-01-01". Date to cast.

dt_threads

Numeric, the number of data.table threads to use. Set internally to avoid issues when running in parallel. Defaults to 1 thread.

verbose

Logical, defaults to FALSE. Should internal nowcasting progress messages be returned.

approx_threshold

Numeric, defaults to 10,000. Threshold of cases below which explicit sampling of onsets always occurs.
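
The structured arguments above (rt_prior, generation_times and onset_modifier) could be put together roughly as follows. This is a minimal sketch in base R and is not taken from the package: the prior values, the gamma parameters, the dates and the helper names make_sample and example_modifier are illustrative assumptions, as is the use of a base data.frame with a list column for onset_modifier.

```r
## Reproduction number prior: a list with the documented element names.
## The values are illustrative assumptions, not package defaults.
rt_prior <- list(mean_prior = 2.6, std_prior = 2)

## Generation time matrix: rows are days, columns are samples.
## Each column is a discretised gamma distribution (assumed shape and
## rate values) normalised so that its probabilities sum to 1.
days <- 0:29
make_sample <- function(shape, rate) {
  p <- pgamma(days + 1, shape, rate) - pgamma(days, shape, rate)
  p / sum(p)
}
generation_times <- cbind(make_sample(2.0, 0.4),
                          make_sample(2.5, 0.5),
                          make_sample(3.0, 0.6))

## Onset modifier: one function per date. Each function takes n and
## status and returns n proportions (hypothetical example function).
example_modifier <- function(n, status) {
  upper <- if (status == "local") 1 else 0.9
  runif(n, min = 0.8, max = upper)
}
onset_modifier <- data.frame(date = as.Date(c("2020-03-01", "2020-03-02")))
onset_modifier$modifier <- list(example_modifier, example_modifier)
```

Returning a draw rather than a fixed proportion from the modifier is what lets uncertainty in the adjustment carry through to the estimated cases by onset date.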

Examples

## Save everything to a temporary directory 
## Change this to inspect locally
target_dir <- tempdir() 

## Construct example distributions
## reporting delay dist
delay_dist <- suppressWarnings(
               EpiNow::get_dist_def(rexp(25, 1/10), 
                                    samples = 10, bootstraps = 1))

## Uses example case vector from EpiSoon
cases <- data.table::setDT(EpiSoon::example_obs_cases)
cases <- cases[, `:=`(confirm = as.integer(cases), import_status = "local")][,
                  cases := NULL]

## Run basic nowcasting pipeline
rt_pipeline(cases = cases,
            delay_defs = delay_dist,
            target_date = max(cases$date),
            target_folder = target_dir)

Approximate Sampling a Distribution using Counts

Description

Approximate Sampling a Distribution using Counts

Usage

sample_approx_dist(
  cases = NULL,
  dist_fn = NULL,
  max_value = 120,
  earliest_allowed_mapped = NULL,
  direction = "backwards"
)

Arguments

cases

A dataframe of cases (in date order) with the following variables: date and cases.

max_value

Numeric, maximum value to allow. Defaults to 120 days

earliest_allowed_mapped

A character string representing a date ("2020-01-01"). Indicates the earliest allowed mapped value.

direction

Character string, defaults to "backwards". Direction in which to map cases. Supports either "backwards" or "forwards".

Value

A data.table of cases by date of onset
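
The general idea of mapping counts backwards through a delay distribution can be illustrated without the package. The sketch below is not the implementation used by sample_approx_dist: the helper map_counts_backwards, the multinomial sampling step and the gamma(2, 1) delay truncated at 20 days are all assumptions chosen for illustration. By construction this sketch conserves the total number of cases, which is a property of the sketch and not a documented guarantee of the package function.

```r
## Hypothetical helper (not part of EpiNow): map daily counts backwards
## by a discretised delay distribution using multinomial sampling.
map_counts_backwards <- function(counts, delay_pmf) {
  max_delay <- length(delay_pmf) - 1
  ## Element 1 of the result is max_delay days before the first report day
  onsets <- numeric(length(counts) + max_delay)
  for (t in seq_along(counts)) {
    ## Split the count for report day t across delays of 0..max_delay days
    by_delay <- rmultinom(1, size = counts[t], prob = delay_pmf)[, 1]
    ## A delay of d days places the onset d days before report day t
    idx <- t + max_delay - (0:max_delay)
    onsets[idx] <- onsets[idx] + by_delay
  }
  onsets
}

## Discretised gamma(2, 1) delay over 0 to 20 days, normalised to sum to 1
delay_pmf <- pgamma(1:21, 2, 1) - pgamma(0:20, 2, 1)
delay_pmf <- delay_pmf / sum(delay_pmf)

set.seed(1)
reported <- c(5, 10, 20, 40, 80)
onsets <- map_counts_backwards(reported, delay_pmf)

## Total onsets match total reported cases in this sketch
sum(onsets) == sum(reported)
```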

Examples

cases <- data.table::as.data.table(EpiSoon::example_obs_cases) 

cases <- cases[, cases := as.integer(cases)] 

## Reported case distribution
print(cases)

## Total cases
sum(cases$cases)

delay_fn <- function(n, dist, cum) {
   pgamma(n + 0.9999, 2, 1) - pgamma(n - 1e-5, 2, 1)}

onsets <- sample_approx_dist(cases = cases,
                             dist_fn = delay_fn)
   
## Estimated onset distribution
print(onsets)
  
## Check that sum is equal to reported cases
total_onsets <- median(
   purrr::map_dbl(1:1000, 
                  ~ sum(sample_approx_dist(cases = cases,
                  dist_fn = delay_fn)$cases))) 
                   
total_onsets
 
                   
## Map from onset cases to reported                  
reports <- sample_approx_dist(cases = cases,
                              dist_fn = delay_fn,
                              direction = "forwards")

Sample Onset Dates for Cases missing them

Description

Sample Onset Dates for Cases missing them

Usage

sample_delay(linelist = NULL, delay_fn = NULL, earliest_allowed_onset = NULL)

Arguments

linelist

Dataframe with two variables: date_report and date_onset. As generated by generate_pseudo_linelist.

delay_fn

A sampling function that takes a single numeric argument and returns a numeric vector of samples of that length.

earliest_allowed_onset

A character string in the form of a date ("2020-01-01") indicating the earliest allowed onset.

Value

Dataframe with no missing data and two variables: date_report and date_onset
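
A delay_fn with the interface described above, and the kind of imputation it supports, might look like the following. This is a hedged base R sketch and not the package code: the exponential delay with a mean of 10 days, the toy linelist and the helper fill_missing_onsets are assumptions, and the handling of earliest_allowed_onset is left out.

```r
## A delay sampler with the documented interface: one numeric argument n,
## returning n samples. The exponential with mean 10 is an assumption.
delay_fn <- function(n) rexp(n, rate = 1 / 10)

## Toy linelist with some missing onset dates
linelist <- data.frame(
  date_report = as.Date("2020-03-10") + 0:4,
  date_onset = as.Date(c("2020-03-05", NA, "2020-03-08", NA, NA))
)

## Hypothetical helper (not part of EpiNow): fill each missing onset with
## the report date minus a sampled delay rounded to whole days.
fill_missing_onsets <- function(linelist, delay_fn) {
  missing <- is.na(linelist$date_onset)
  delays <- round(delay_fn(sum(missing)))
  linelist$date_onset[missing] <- linelist$date_report[missing] - delays
  linelist
}

set.seed(42)
completed <- fill_missing_onsets(linelist, delay_fn)
```

Observed onset dates are left untouched, so only the rows with missing onsets depend on the sampled delays.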


Convert a linelist into a nested tibble of linelists by day

Description

Convert a linelist into a nested tibble of linelists by day

Usage

split_linelist_by_day(linelist = NULL)

Arguments

linelist

Dataframe with the following variables: date_onset_symptoms and date_confirmation.

Value

A nested tibble with a date_report variable and a daily_observed_linelist variable holding the linelist for each day (containing date_onset and date_report).


Summarise a nowcast

Description

Summarise a nowcast

Usage

summarise_cast(nowcast)

Arguments

nowcast

A dataframe as produced by nowcast_pipeline

Value

A summarised dataframe


Summarise Realtime Results

Description

Summarise Realtime Results

Usage

summarise_results(
  regions = NULL,
  results_dir = "results",
  target_date = NULL,
  region_scale = "Region"
)

Arguments

regions

A character string containing the list of regions to extract results for (must all have results for the same target date).

results_dir

A character string indicating the location of the results directory to extract results from.

target_date

A character string indicating the target date to extract results for. All regions must have results for this date.

region_scale

A character string indicating the name to give the regions being summarised.

Examples

## Code

summarise_results

Summarise rt and cases as a csv

Description

Summarise rt and cases as a csv

Usage

summarise_to_csv(
  results_dir = NULL,
  summary_dir = NULL,
  type = "country",
  date = NULL
)

Arguments

results_dir

Character string indicating the directory from which to extract results

summary_dir

Character string indicating the directory into which to save results

type

Character string, the region identifier to apply

date

A Character string (in the format "yyyy-mm-dd") indicating the date to extract data for. Defaults to "latest" which finds the latest results available.

Value

Nothing is returned


Custom Map Theme

Description

Custom Map Theme

Usage

theme_map(
  map = NULL,
  continuous = FALSE,
  variable_label = NULL,
  trans = "identity",
  fill_labels = NULL,
  scale_fill = NULL,
  breaks = NULL,
  ...
)

Arguments

map

ggplot2 map object

continuous

Logical, defaults to FALSE. Is the fill variable continuous?

variable_label

A character string indicating the variable label to use. If not supplied then the underlying variable name is used.

trans

A character string specifying the transform to use on the specified metric. Defaults to no transform ("identity"). Other options include log scaling ("log") and log base 10 scaling ("log10"). For a complete list of options see ggplot2::continuous_scale.

fill_labels

A function to use to allocate legend labels. An example (used below) is scales::percent, which can be used for percentage data.

scale_fill

Function to use for scaling the fill. Defaults to a custom ggplot2::scale_fill_manual

breaks

Breaks to use in legend. Defaults to ggplot2::waiver.

...

Additional arguments passed to scale_fill.

Value

A ggplot2 object

Examples

## Code 
theme_map