Estimates the mean and standard deviation of the serial interval distribution from outbreak data using the Expectation-Maximization (EM) algorithm developed by Vink et al. (2014). The serial interval is defined as the time between symptom onset in a primary case and symptom onset in a secondary case infected by that primary case.

Usage

si_estim(
  dat,
  n = 50,
  dist = "normal",
  init = NULL,
  tol = 1e-06,
  n_starts = 1,
  n_routes = 4L,
  wind = 1
)

Arguments

dat

numeric vector; index case-to-case (ICC) intervals in days.

n

integer; number of EM algorithm iterations to perform. Defaults to 50.

dist

character; the assumed parametric family for the serial interval distribution. Must be either "normal" (default) or "gamma".

init

numeric vector of length 2; initial values for the mean and standard deviation. If NULL (default), uses the sample mean and sample standard deviation.

tol

numeric; convergence tolerance for the EM algorithm. Defaults to 1e-6.

n_starts

integer; number of random restarts for the EM algorithm. Defaults to 1.

n_routes

integer; number of transmission routes to model. Must be >= 2. Defaults to 4 (Co-Primary, Primary-Secondary, Primary-Tertiary, Primary-Quaternary). Increasing this allows modelling longer transmission chains.

wind

numeric; the censoring window interval. Defaults to 1.
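As a rough illustration of how the `dat` and `init` arguments might be prepared, the sketch below derives ICC intervals from symptom onset dates and computes the starting values that the documented default (`init = NULL`) describes. The onset dates are made up, and two assumptions are not taken from this page: that the index case is the case with the earliest onset, and that its own zero-day interval is left out of `dat`.

```r
# Hypothetical symptom onset dates for a single outbreak (illustrative only)
onset <- as.Date(c("2024-01-01", "2024-01-03", "2024-01-04",
                   "2024-01-04", "2024-01-07", "2024-01-09"))

# Assumption: the index case is the case with the earliest symptom onset
index_onset <- min(onset)

# ICC intervals in days: onset of each case minus onset of the index case
icc <- as.numeric(difftime(onset, index_onset, units = "days"))

# Assumption: drop the index case's own interval (always 0)
icc <- icc[-which.min(onset)]

# Starting values as described for init = NULL:
# sample mean and sample standard deviation of the intervals
init_default <- c(mean(icc), sd(icc))
```

Passing `init_default` explicitly, as in `si_estim(icc, init = init_default)`, should then be equivalent to leaving `init` at its default, provided the function computes its starting values exactly as documented.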

Value

A named list containing:

  • mean: Estimated mean of the serial interval distribution (days)

  • sd: Estimated standard deviation of the serial interval distribution (days)

  • wts: Numeric vector of estimated component weights. Length is 2*n_routes - 1 when dist = "normal" and n_routes when dist = "gamma".

  • converged: Logical indicating whether the algorithm converged.

  • iterations: Integer indicating the number of iterations performed.

  • loglik: Log-likelihood of the fitted model.

  • n_restarts: Number of restarts performed.

  • n_routes: Number of transmission routes used.
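The length rule for `wts` can be checked with a small helper. `expected_n_weights()` is a hypothetical function written for this illustration, not part of the package; it only encodes the rule stated in the list above.

```r
# Hypothetical helper (not part of the package): expected length of `wts`
# for a given number of routes and distribution family.
expected_n_weights <- function(n_routes, dist = c("normal", "gamma")) {
  dist <- match.arg(dist)
  stopifnot(n_routes >= 2)  # documented constraint on n_routes
  if (dist == "normal") 2 * n_routes - 1 else n_routes
}

expected_n_weights(4, "normal")  # 7 components with the default 4 routes
expected_n_weights(3, "gamma")   # 3 components
```

A check such as `length(result$wts) == expected_n_weights(result$n_routes, "normal")` could serve as a sanity test on a fitted object, assuming the returned list is structured as described.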

References

Vink MA, Bootsma MCJ, Wallinga J (2014). Serial intervals of respiratory infectious diseases: A systematic review and analysis. American Journal of Epidemiology, 180(9), 865-875. doi:10.1093/aje/kwu209

Examples

# Example 1: Basic usage with simulated data, default 4 routes
set.seed(123)
simulated_icc <- c(
  rep(1, 20), rep(2, 25), rep(3, 15), rep(4, 8)
)
result <- si_estim(simulated_icc)

# \donttest{
# Example 2: Using 5 routes
result_5routes <- si_estim(simulated_icc, n_routes = 5)

# Example 3: Using gamma distribution with 3 routes
result_gamma <- si_estim(simulated_icc, dist = "gamma", n_routes = 3)
# }