---
title: "Model features"
output:
  rmarkdown::html_vignette:
    toc: true
    number_sections: true
bibliography: library.bib
csl: https://raw.githubusercontent.com/citation-style-language/styles/master/apa-numeric-superscript-brackets.csl
vignette: >
  %\VignetteIndexEntry{Model features}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

This vignette provides a quick reference to the modelling features available in EpiNow2.
For an overview of how these models connect see the [model overview](model_overview.html).
For mathematical details see the model definition vignettes ([infection model](estimate_infections.html), [secondary model](estimate_secondary.html), [truncation model](estimate_truncation.html), [distribution model](estimate_dist.html)).
For applied examples see the [options vignette](estimate_infections_options.html).
For prior guidance see the [prior choice guide](prior_choice_guide.html).

# Component overview

The package's modelling features fall into four groups.

**Estimation models** fit a Stan model to data:

- [Infection model](#infection-model): `estimate_infections()`
- [Secondary model](#secondary-model): `estimate_secondary()`
- [Truncation / nowcasting](#truncation-nowcasting): `estimate_truncation()`
- [Delay distribution fitting](#delay-distribution-fitting): `estimate_dist()`

**Model configuration** controls how the estimation models behave:

- [Reproduction number](#reproduction-number): `rt_opts()`
- [Gaussian process](#gaussian-process): `gp_opts()`
- [Delay distributions](#delay-distributions): `gt_opts()`, `delay_opts()`, `trunc_opts()`
- [Observation model](#observation-model): `obs_opts()`
- [Options summary](#options-summary): defaults and which models use each

**Forward simulation and forecasting** generate observations:

- [Simulation](#simulation): `simulate_infections()`, `simulate_secondary()`
- [Forecasting](#forecasting): `forecast_infections()`, `forecast_secondary()`

**Supporting utilities**:

- [Data preprocessing](#data-preprocessing): `fill_missing()`
- [Workflow wrappers](#workflow-wrappers): `epinow()`, `regional_epinow()`
- [Stan backend](#stan-backend): `stan_opts()`

# Estimation models

## Infection model

`estimate_infections()` reconstructs infections from a count time series (e.g. reported cases) using either a generative renewal model or non-parametric back-calculation.

| Feature | Argument | Description | See also |
|---|---|---|---|
| Renewal equation (default) | `rt = rt_opts(...)` | Generative model using the reproduction number and generation time | [Model definition](estimate_infections.html) |
| Back-calculation | `rt = NULL`, `backcalc = backcalc_opts(...)` | Non-parametric deconvolution of reported cases | [Model definition](estimate_infections.html) |

See also: [estimate_infections workflow](estimate_infections_workflow.html)

## Secondary model

`estimate_secondary()` estimates the relationship between a primary and secondary observation (e.g.
cases and deaths, admissions and bed occupancy).

| Feature | Argument | Description | See also |
|---|---|---|---|
| Incidence model | `secondary_opts(type = "incidence")` | Secondary reports as a convolution of primary cases | [Secondary model definition](estimate_secondary.html) |
| Prevalence model | `secondary_opts(type = "prevalence")` | Cumulative secondary reports (e.g. bed occupancy) | [Secondary model definition](estimate_secondary.html) |

See also: [estimate_secondary model definition](estimate_secondary.html)

## Truncation / nowcasting {#truncation-nowcasting}

Recent observations are typically incomplete due to reporting delays (right truncation).
`estimate_truncation()` estimates a truncation distribution from multiple snapshots of the same data source over time and produces a nowcast (predicted complete observations).
The reconstructed observations can be obtained from the fit with `get_predictions()`.
The estimated distribution can be passed to `estimate_infections()` via `trunc_opts()` for truncation-adjusted inference.

| Feature | Argument | Description | See also |
|---|---|---|---|
| Estimate truncation and nowcast | `estimate_truncation(data = list_of_snapshots)` | Fit a truncation distribution and produce a nowcast from data snapshots | [Truncation model definition](estimate_truncation.html) |
| Truncation-adjusted inference | `estimate_infections(truncation = trunc_opts(dist = ...))` | Adjust for right-truncation of recent data in the observation model | [Truncation model definition](estimate_truncation.html) |

See also: [estimate_truncation model definition](estimate_truncation.html)

## Delay distribution fitting

`estimate_dist()` fits delay distributions from linelist data using likelihood functions vendored from [primarycensored](https://primarycensored.epinowcast.org/).

| Feature | Argument | Description | See also |
|---|---|---|---|
| Double interval censoring | Date columns in linelist | Accounts for censoring of both primary and secondary events | [Distribution model](estimate_dist.html) |
| Variable censoring windows | `pdate_upr`, `sdate_upr` columns | Per-observation primary and secondary censoring window widths | [Distribution model](estimate_dist.html) |
| Right truncation | `obs_date` column | Per-observation truncation times from reporting delays | [Distribution model](estimate_dist.html) |
| Primary event distribution | `estimate_dist(primary = "expgrowth")` | Account for exponential growth in primary events | [Distribution model](estimate_dist.html) |
| Distribution families | `estimate_dist(dist = ...)` | `"lognormal"` (default), `"gamma"`, `"normal"`, `"exp"`, `"weibull"` | [Distribution model](estimate_dist.html) |
| Untruncated approximation | `estimate_dist(obs_time_threshold = ...)` | Skip the right-truncation renormalisation when observation times are far beyond the largest observed delay | [Distribution model](estimate_dist.html) |
| Observation aggregation | Automatic | Identical delay-censoring-truncation strata are aggregated to speed up fitting | [Distribution model](estimate_dist.html) |

See also: [distribution model](estimate_dist.html), [worked example](estimate_dist_workflow.html)

# Model configuration

## Reproduction number

When using the renewal equation, several options control how the time-varying reproduction number is modelled.
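
As a minimal sketch of how these options are selected, the calls below build three reproduction-number configurations using only `rt_opts()` arguments named in this vignette. The object names are illustrative, and nothing here fits a model.

```r
library(EpiNow2)

# Default: Gaussian process on differences between successive Rt values
rt_default <- rt_opts()

# Gaussian process applied as deviations from a global mean
rt_global <- rt_opts(gp_on = "R0")

# Weekly random-walk steps; pass gp = NULL to estimate_infections()
# alongside this to obtain a piecewise-constant Rt
rt_weekly <- rt_opts(rw = 7)
```

Each object would then be supplied as the `rt` argument of `estimate_infections()`.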

| Feature | Argument | Description | See also |
|---|---|---|---|
| GP on differences of log Rt (default) | `rt_opts(gp_on = "R_t-1")` | Gaussian process applied to successive Rt values | [Options examples](estimate_infections_options.html) |
| GP on deviations from R0 | `rt_opts(gp_on = "R0")` | Gaussian process applied as deviations from a global mean | [Options examples](estimate_infections_options.html) |
| Random walk | `rt_opts(rw = 7)` with `gp = NULL` | Piecewise-constant Rt with specified step size | [Options examples](estimate_infections_options.html) |
| Breakpoints | Add a `breakpoint` column to the input data | Step changes in Rt at user-specified dates; on by default in `rt_opts()`, set `use_breakpoints = FALSE` to disable | [Options examples](estimate_infections_options.html) |
| Fixed Rt | `gp = NULL` (and default `rt_opts()`) | Constant reproduction number sampled from the prior; no GP, no random walk | [Options examples](estimate_infections_options.html) |
| Population adjustment | `rt_opts(pop = Fixed(N))` | Adjust Rt for susceptible depletion; `pop` accepts any distribution specification | [Options examples](estimate_infections_options.html) |
| Growth rate method | `rt_opts(growth_method = "infectiousness")` | Alternative growth rate calculation via infectiousness | [Model definition](estimate_infections.html) |

See also: [estimate_infections model definition](estimate_infections.html), [options examples](estimate_infections_options.html)

## Gaussian process

The Gaussian process controls the flexibility of the time-varying reproduction number or infection trajectory.
Configured via `gp_opts()`.
For kernel mathematics and the Hilbert space spectral approximation see the [GP implementation details](gaussian_process_implementation_details.html) vignette.
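
A short sketch of two Gaussian process configurations, assuming only the `kernel` and `basis_prop` arguments described in this section. The value 0.3 is an arbitrary illustration, not a recommendation.

```r
library(EpiNow2)

# Default settings: Matern kernel with basis_prop = 0.2
gp_default <- gp_opts()

# Squared exponential kernel with a larger basis_prop, giving up some
# speed in exchange for a more accurate approximation
gp_smooth <- gp_opts(kernel = "se", basis_prop = 0.3)
```

Either object can be passed as the `gp` argument of `estimate_infections()`.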

| Feature | Argument | Description |
|---|---|---|
| Kernel choice | `gp_opts(kernel = ...)` | `"matern"` (default), `"se"`, `"ou"`, or `"periodic"` |
| Disable GP | `gp = NULL` in `estimate_infections()` | Remove the Gaussian process entirely |
| Accuracy/speed tradeoff | `gp_opts(basis_prop = ...)` | Higher values increase accuracy; lower values are faster (default 0.2) |

See also: [GP implementation details](gaussian_process_implementation_details.html), [options examples](estimate_infections_options.html), [prior choice guide](prior_choice_guide.html)

## Delay distributions

Delay distributions map latent infections to observed quantities.
See the [workflow vignette](estimate_infections_workflow.html) for a practical guide and the [prior choice guide](prior_choice_guide.html) for default priors.

| Feature | Argument | Description |
|---|---|---|
| Generation time | `gt_opts(dist = ...)` | Time between successive infections |
| Reporting delay | `delay_opts(dist = ...)` | Delay from infection to report |
| Composite delays | `delay_opts(dist = dist1 + dist2)` | Sum of independent delay distributions |
| Non-parametric delays | `dist = NonParametric(...)` | Fixed PMF rather than a parametric family |
| Uncertain parameters | e.g. `LogNormal(meanlog = Normal(...), ...)` | Parameters drawn from prior distributions |
| Truncation correction | `trunc_opts(dist = ...)` | Adjust for right-truncation of recent data |

See also: [workflow vignette](estimate_infections_workflow.html), [prior choice guide](prior_choice_guide.html), [fitting delay distributions](estimate_dist_workflow.html)

## Observation model

The observation model links latent expected cases to reported data.
Configured via `obs_opts()`.
For default prior values see the [prior choice guide](prior_choice_guide.html).
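
The sketch below shows two observation model configurations built with `obs_opts()`. The mean and standard deviation given to `Normal()` are placeholder values chosen for illustration and are not package defaults.

```r
library(EpiNow2)

# Poisson observations without a day-of-week effect
obs_simple <- obs_opts(family = "poisson", week_effect = FALSE)

# Negative binomial observations with an uncertain ascertainment fraction;
# the mean and sd here are placeholders, not recommendations
obs_scaled <- obs_opts(
  family = "negbin",
  scale = Normal(mean = 0.4, sd = 0.05)
)
```

Either object would be passed as the `obs` argument of `estimate_infections()` or `estimate_secondary()`.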

| Feature | Argument | Description |
|---|---|---|
| Negative binomial (default) | `obs_opts(family = "negbin")` | Overdispersed count model |
| Poisson | `obs_opts(family = "poisson")` | No overdispersion |
| Day-of-week effect | `obs_opts(week_effect = TRUE)` | Separate reporting rate per day of the week (default on) |
| Scaling / ascertainment | `obs_opts(scale = Normal(...))` | Fraction of infections observed; accepts any distribution specification |
| Likelihood weighting | `obs_opts(weight = ...)` | Re-weight observations in the log density |
| Aggregated data | `fill_missing(missing_dates = "accumulate")` | Latent daily expectations are accumulated in the model before likelihood evaluation |
| Missing observations | NA values in data | Time points with NA observations are excluded from the likelihood |

Aggregation and missing data are handled at the model level in Stan, not just in preprocessing.
`fill_missing()` constructs the flags that the Stan model uses; the model itself accumulates latent expected reports and drops missing observations from the likelihood.
Both `estimate_infections()` and `estimate_secondary()` support these features.

See also: [workflow vignette](estimate_infections_workflow.html), [prior choice guide](prior_choice_guide.html)

## Options summary

The table below summarises the options functions, what they configure, their defaults, and which estimation functions use them.

| Function | Configures | Default | Used by |
|----------|-----------|---------|---------|
| `rt_opts()` | Reproduction number model | GP on log Rt differences | `estimate_infections()` |
| `gp_opts()` | Gaussian process prior | Matern 3/2 kernel | `estimate_infections()` |
| `gt_opts()` | Generation time distribution | `Fixed(1)` (degenerate 1-day generation interval; users normally override) | `estimate_infections()` |
| `delay_opts()` | Reporting delay distributions | `Fixed(0)` (no delay) | `estimate_infections()`, `estimate_secondary()` |
| `obs_opts()` | Observation model | Negative binomial with day-of-week effect | `estimate_infections()`, `estimate_secondary()` |
| `trunc_opts()` | Truncation distribution | `Fixed(0)` (no truncation) | `estimate_infections()` |
| `backcalc_opts()` | Back-calculation settings | Smoothed reports as prior | `estimate_infections()` (when `rt = NULL`) |
| `secondary_opts()` | Secondary model type | Incidence | `estimate_secondary()` |

See the [prior choice guide](prior_choice_guide.html) for default values and guidance.

# Forward simulation and forecasting

The estimation models already produce forecasts as part of their fit, projecting forward over a horizon set by the `horizon` argument.
The functions below are for the cases where you want to simulate or forecast separately from a fit: generating synthetic observations from known parameters, or extending a fitted model with new inputs (e.g. a new Rt trajectory or new primary data).

## Simulation

Simulation functions generate observations from known or fixed parameters, useful for model checking and scenario analysis.
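
To make the idea of forward simulation from known parameters concrete, the base R sketch below iterates a discrete renewal equation for a fixed Rt trajectory. It is a simplified, deterministic illustration and not the package's Stan implementation; the function name `renewal_sim`, the generation-time PMF `gt_pmf`, and all numeric values are assumptions made for this example.

```r
# Deterministic renewal equation: I_t = R_t * sum_s g_s * I_{t - s}
renewal_sim <- function(R, gt_pmf, initial_infections) {
  seed_len <- length(initial_infections)
  infections <- c(initial_infections, numeric(length(R)))
  for (t in seq_along(R)) {
    idx <- seed_len + t
    # Look back over at most length(gt_pmf) earlier days
    lags <- seq_len(min(length(gt_pmf), idx - 1))
    infections[idx] <- R[t] * sum(gt_pmf[lags] * infections[idx - lags])
  }
  # Drop the seeding period and return only the simulated days
  infections[-seq_len(seed_len)]
}

gt_pmf <- c(0.2, 0.5, 0.3) # hypothetical generation-time PMF for lags 1 to 3
R <- c(rep(1.5, 10), rep(0.8, 10)) # hypothetical Rt trajectory
sim <- renewal_sim(R, gt_pmf, initial_infections = rep(10, 3))
round(sim, 1)
```

The package functions additionally apply delays and an observation model, which this sketch leaves out.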

| Feature | Argument | Description | See also |
|---|---|---|---|
| Simulate infections | `simulate_infections(R, initial_infections)` | Forward-simulate from a given Rt trajectory via the renewal equation | [Workflow](estimate_infections_workflow.html) |
| Simulate secondary | `simulate_secondary(primary, ...)` | Simulate secondary observations from primary data | [Secondary model definition](estimate_secondary.html) |
| Convolve and scale | `convolve_and_scale(data, ...)` | R-based convolution with time-varying parameters | [Secondary model definition](estimate_secondary.html) |

See also: [workflow vignette](estimate_infections_workflow.html)

## Forecasting

Forecasts can be generated from fitted models by projecting forward with specified or estimated parameters.

| Feature | Argument | Description | See also |
|---|---|---|---|
| Infection forecast | `forecast_infections(estimates, R)` | Simulate future infections from a fitted model with a new Rt trajectory | [Workflow](estimate_infections_workflow.html) |
| Secondary forecast | `forecast_secondary(estimate, primary)` | Forecast secondary observations from new primary data | [Secondary model definition](estimate_secondary.html) |

See also: [workflow vignette](estimate_infections_workflow.html)

# Supporting utilities

## Data preprocessing

`fill_missing()` is an R-side utility that prepares data for the estimation functions by constructing the accumulation and missingness flags that the Stan models require.
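
A hedged sketch of preparing weekly data follows. The counts are made up, and the input is assumed to have a `date` column and a `confirm` column holding the observations.

```r
library(EpiNow2)

# Hypothetical weekly counts, one row per reporting date
weekly <- data.table::data.table(
  date = as.Date("2024-01-07") + 7 * 0:3,
  confirm = c(70, 84, 91, 77)
)

# Mark the dates between reports for model-level accumulation
filled <- fill_missing(
  weekly,
  missing_dates = "accumulate",
  initial_accumulate = 7
)
head(filled, 10)
```

The result can then be passed as the `data` argument of an estimation function.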

| Feature | Argument | Description |
|---|---|---|
| Accumulate missing dates | `fill_missing(missing_dates = "accumulate")` | Mark missing dates for model-level accumulation |
| Zero-fill missing dates | `fill_missing(missing_dates = "zero")` | Insert zero observations for missing dates |
| Ignore missing dates (default) | `fill_missing(missing_dates = "ignore")` | Skip missing dates in the likelihood |
| Handle NA observations | `fill_missing(missing_obs = ...)` | Accumulate or zero-fill NA values |
| Weekly reporting | `fill_missing(initial_accumulate = 7)` | Set accumulation window for weekly data |

See also: [workflow vignette](estimate_infections_workflow.html)

## Workflow wrappers

Convenience functions wrap the core estimation and reporting steps.

| Feature | Argument | Description | See also |
|---|---|---|---|
| Production wrapper | `epinow()` | Wraps `estimate_infections()` with logging and formatted output | [epinow vignette](epinow.html) |
| Multi-region | `regional_epinow()` | Runs `epinow()` across regions in parallel | [Getting started](EpiNow2.html) |
| Region-specific options | `opts_list()` | Generate per-region configuration lists | [Getting started](EpiNow2.html) |

See also: [epinow vignette](epinow.html), [getting started](EpiNow2.html)

## Stan backend

Models are implemented in Stan.
Users can switch between MCMC sampling (default, via `cmdstanr` or `rstan`), variational inference, the Laplace approximation, or pathfinder using `stan_opts(method = ...)`.
See the [workflow vignette](estimate_infections_workflow.html) for details on configuring the backend.
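
As a brief sketch, the calls below select two of the inference methods. The method strings `"sampling"` and `"vb"` are assumed to correspond to MCMC and variational inference respectively. The Laplace and pathfinder methods are left out because they may depend on which backend is installed.

```r
library(EpiNow2)

# MCMC sampling (the default method)
stan_mcmc <- stan_opts(method = "sampling")

# Variational inference: a faster, approximate alternative
stan_vb <- stan_opts(method = "vb")
```

Either object would be passed as the `stan` argument of an estimation function such as `estimate_infections()`.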