Discretised distributions

This vignette describes the parametric delay distributions that are currently available in epinowcast and explains how they are internally discretised.

Available distributions

The currently available parametric delay distributions are continuous probability distributions with (up to) two parameters, $\mu_{g,t}$ and $\upsilon_{g,t}$. The table below provides a link to the definition of each distribution, specifies how the parameters $\mu_{g,t}$ and $\upsilon_{g,t}$ are mapped to the parameters of the distribution (according to the referenced definition), and states the resulting mean of the distribution (before discretisation and adjustment for the assumed maximum delay).

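| Distribution | Parametrization | Mean |
|---|---|---|
| Log-normal | $\mu = \mu_{g,t}$, $\sigma = \upsilon_{g,t}$ | $\exp(\mu_{g,t}+\frac{\upsilon_{g,t}^2}{2})$ |
| Exponential | $\beta = \exp(-\mu_{g,t})$ | $\exp(\mu_{g,t})$ |
| Gamma | $\alpha = \exp(\mu_{g,t})$, $\beta = \upsilon_{g,t}$ | $\exp(\mu_{g,t})/\upsilon_{g,t}$ |
| Log-logistic | $\alpha = \exp(\mu_{g,t})$, $\beta = \upsilon_{g,t}$ | $\frac{\exp(\mu_{g,t})\,\pi/\upsilon_{g,t}}{\sin(\pi/\upsilon_{g,t})}$ |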
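
As a quick check of the mean formulas in the table, the following is a minimal Python sketch that evaluates them for given values of the two parameters. The function name `delay_mean` and its arguments are hypothetical and are not part of epinowcast. The sketch assumes that β is a rate for the exponential and gamma distributions (which is what the stated means imply), and it uses the standard fact that the log-logistic mean is only finite when the shape parameter exceeds 1.

```python
import math


def delay_mean(distribution, mu, upsilon=None):
    """Mean of the continuous delay distribution, before discretisation.

    `mu` and `upsilon` play the role of mu_{g,t} and upsilon_{g,t}.
    Hypothetical helper for illustration only (not an epinowcast function).
    """
    if distribution == "lognormal":
        # mu = mu_{g,t}, sigma = upsilon_{g,t}
        return math.exp(mu + upsilon ** 2 / 2)
    if distribution == "exponential":
        # rate beta = exp(-mu_{g,t}), so the mean is 1 / beta
        return math.exp(mu)
    if distribution == "gamma":
        # shape alpha = exp(mu_{g,t}), rate beta = upsilon_{g,t} (assumed rate)
        return math.exp(mu) / upsilon
    if distribution == "loglogistic":
        # scale alpha = exp(mu_{g,t}), shape beta = upsilon_{g,t}
        if upsilon <= 1:
            raise ValueError("log-logistic mean is only finite for upsilon > 1")
        x = math.pi / upsilon
        return math.exp(mu) * x / math.sin(x)
    raise ValueError(f"unknown distribution: {distribution}")


lognormal_mean = delay_mean("lognormal", mu=0.0, upsilon=1.0)
gamma_mean = delay_mean("gamma", mu=math.log(4.0), upsilon=2.0)
```

Note that these are the means of the continuous distributions. The mean of the discretised, truncated distribution will generally differ.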

Discretisation and adjustment for maximum delay

In epinowcast, delays are modelled in discrete time and with an assumed maximum delay (specified via the max_delay argument). Therefore, the continuous delay distributions must be discretised and adjusted for the maximum delay.

The exact form of this discretisation is complex due to the interaction between primary and secondary events. Rather than modelling this explicitly, we approximate it by assuming a uniform censoring interval of 2 days for each delay. This comes from assuming daily censoring of both the primary and secondary events, which together define the delay distribution, and ignoring potential interactions between primary and secondary events. As a result, the probability of reporting a delay of d days equals the probability of reporting a delay of d + 1 days or less, minus the probability of reporting a delay of d − 1 days or less. This is then normalised by the overall probability of reporting any delay up to some maximum observed delay, D.

More formally, we define this in terms of the cumulative distribution function of the delay distribution. Let $F^{\mu_{g,t}, \upsilon_{g,t}}$ be the cumulative distribution function of a continuous probability distribution of delays with parameters $\mu_{g,t}$ and $\upsilon_{g,t}$. Then, the probability of reporting a delay of d days is

$$p_{g,t,d} = \frac{F^{\mu_{g,t}, \upsilon_{g,t}}(d+1) - F^{\mu_{g,t}, \upsilon_{g,t}}(d-1)}{F^{\mu_{g,t}, \upsilon_{g,t}}(D + 1 ) + F^{\mu_{g,t}, \upsilon_{g,t}}(D)}.$$

For d = 0, we instead have

$$p_{g,t,0} = \frac{F^{\mu_{g,t}, \upsilon_{g,t}}(1)}{F^{\mu_{g,t}, \upsilon_{g,t}}(D + 1 ) + F^{\mu_{g,t}, \upsilon_{g,t}}(D)}.$$
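
The two formulas above can be sketched in a few lines of Python. This is an illustrative sketch only, not the epinowcast implementation: the names `lognormal_cdf` and `discretise` are hypothetical, the log-normal is used purely as an example CDF, and D is treated as the largest delay with non-zero probability, so that the result has D + 1 entries.

```python
import math


def lognormal_cdf(x, mu, sigma):
    """CDF of a log-normal distribution; zero for non-positive x."""
    if x <= 0:
        return 0.0
    return 0.5 * (1 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2))))


def discretise(cdf, max_delay):
    """Return [p_0, ..., p_D] using the two-day censoring approximation.

    `cdf` is any function returning F(x) with F(0) = 0.
    Hypothetical helper for illustration (not an epinowcast function).
    """
    D = max_delay
    norm = cdf(D + 1) + cdf(D)  # normalising constant
    pmf = [cdf(1) / norm]       # special case d = 0
    # General case d = 1, ..., D: F(d + 1) - F(d - 1)
    pmf += [(cdf(d + 1) - cdf(d - 1)) / norm for d in range(1, D + 1)]
    return pmf


pmf = discretise(lambda x: lognormal_cdf(x, mu=1.0, sigma=0.5), max_delay=10)
total = sum(pmf)  # equals 1 up to floating-point error
```

Passing the CDF as a function keeps the discretisation step independent of the distribution, so the same sketch would apply to any of the distributions in the table given a suitable CDF.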

Normalising by $F^{\mu_{g,t}, \upsilon_{g,t}}(D + 1) + F^{\mu_{g,t}, \upsilon_{g,t}}(D)$ ensures that the $p_{g,t,d}$ sum to 1 over d = 0, …, D. Since $F^{\mu_{g,t}, \upsilon_{g,t}}(D)$ is the probability of reporting before the maximum delay, this can also be interpreted as conditioning our distribution on the maximum delay.
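
To see where this normalising constant comes from, it may help to sum the numerators directly. Writing $F$ for $F^{\mu_{g,t}, \upsilon_{g,t}}$ and assuming $F(0) = 0$ (which holds for delay distributions supported on positive values, such as those in the table above), the sum telescopes:

$$
\begin{aligned}
F(1) + \sum_{d=1}^{D}\left[F(d+1) - F(d-1)\right]
&= F(1) + \sum_{k=2}^{D+1} F(k) - \sum_{k=0}^{D-1} F(k) \\
&= F(D+1) + F(D) - F(0) \\
&= F(D+1) + F(D).
\end{aligned}
$$

Dividing each numerator by this total therefore gives probabilities that sum to 1.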

Note that because of the discretisation and normalisation, the discrete delay distribution we obtain only approximates the original continuous distribution, and the approximation is worse for shorter delays.