This function combines two different incidence time series (see Details). The two time series can represent events that are delayed differently relative to the original infection events. The two data sources must not overlap in the events they record. The function can account for one of the two types of events requiring the future observation of the other type. For instance, one type can be symptom onset events and the other case confirmations. Typically, recording a symptom onset event requires a future case confirmation. If so, the partial_observation_requires_full_observation flag should be set to TRUE.

estimate_from_combined_observations(
  partially_delayed_incidence,
  fully_delayed_incidence,
  smoothing_method = "LOESS",
  deconvolution_method = "Richardson-Lucy delay distribution",
  estimation_method = "EpiEstim sliding window",
  delay_until_partial,
  delay_until_final_report,
  partial_observation_requires_full_observation = TRUE,
  ref_date = NULL,
  time_step = "day",
  output_Re_only = TRUE,
  ...
)

Arguments

partially_delayed_incidence

An object containing incidence data through time. It can be:

  • A list with two elements:

    1. A numeric vector named values: the incidence recorded on consecutive time steps.

    2. An integer named index_offset: the offset, counted in number of time steps, by which the first value in values is shifted compared to a reference time step. This parameter allows one to keep track of the date of the first value in values without needing to carry a date column around. A positive offset means values are shifted towards the future relative to the reference time step. A negative offset means the opposite.

  • A numeric vector. The vector corresponds to the values element described above, and index_offset is implicitly zero. This means that the first value in the vector is associated with the reference time step (no shift towards the future or past).
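As an illustration of the two accepted input forms, the sketch below builds a list with values and index_offset and recovers the calendar dates from a reference date. The counts and the reference date are made up for the example; only the element names values and index_offset come from the description above.

```r
# Hypothetical data: five daily counts whose first value is observed
# three time steps after the reference time step.
incidence <- list(values = c(2, 4, 7, 5, 9), index_offset = 3L)

# Assumed reference date, chosen only for this illustration.
ref_date <- as.Date("2020-03-01")

# The date of the first value is the reference date shifted by the offset.
dates <- seq.Date(
  from = ref_date + incidence$index_offset,
  by = "day",
  length.out = length(incidence$values)
)
incidence_table <- data.frame(date = dates, incidence = incidence$values)

# A plain numeric vector is equivalent to the list form with a zero offset.
plain_vector <- c(2, 4, 7, 5, 9)
equivalent_list <- list(values = plain_vector, index_offset = 0L)
```

Keeping the offset next to the values avoids carrying a date column through every processing step.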

fully_delayed_incidence

An object containing incidence data through time. It can be:

  • A list with two elements:

    1. A numeric vector named values: the incidence recorded on consecutive time steps.

    2. An integer named index_offset: the offset, counted in number of time steps, by which the first value in values is shifted compared to a reference time step. This parameter allows one to keep track of the date of the first value in values without needing to carry a date column around. A positive offset means values are shifted towards the future relative to the reference time step. A negative offset means the opposite.

  • A numeric vector. The vector corresponds to the values element described above, and index_offset is implicitly zero. This means that the first value in the vector is associated with the reference time step (no shift towards the future or past).

smoothing_method

string. Method used to smooth the original incidence data. The default is "LOESS".

deconvolution_method

string. Method used to infer timings of infection events from the original incidence data (the deconvolution step). The default is "Richardson-Lucy delay distribution".

estimation_method

string. Method used to estimate reproductive number values through time from the reconstructed infection timings. The default is "EpiEstim sliding window"; "EpiEstim piecewise constant" is also used in the Examples.

delay_until_partial

Single delay or list of delays. Each delay can be one of:

  • a list representing a distribution object

  • a discretized delay distribution vector

  • a discretized delay distribution matrix

  • a dataframe containing empirical delay data

delay_until_final_report

Single delay or list of delays. Each delay can be one of:

  • a list representing a distribution object

  • a discretized delay distribution vector

  • a discretized delay distribution matrix

  • a dataframe containing empirical delay data
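To make the first two delay formats concrete, here is a minimal sketch. The list form mirrors the one used in the Examples. The discretization shown (element i holds the probability of a delay of i - 1 time steps, truncated at max_delay and renormalized) is an assumption made for illustration; the package's own discretization may differ.

```r
# Distribution object, in the same form as in the Examples section.
delay_incubation <- list(name = "gamma", shape = 3.2, scale = 1.3)

# Assumed discretization: probability mass of the gamma distribution on
# [k, k + 1) is assigned to a delay of k time steps, for k = 0..max_delay.
max_delay <- 30  # hypothetical truncation point
upper <- pgamma(1:(max_delay + 1),
                shape = delay_incubation$shape,
                scale = delay_incubation$scale)
lower <- pgamma(0:max_delay,
                shape = delay_incubation$shape,
                scale = delay_incubation$scale)
delay_vector <- upper - lower

# Renormalize so that the truncated vector sums to one.
delay_vector <- delay_vector / sum(delay_vector)
```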

partial_observation_requires_full_observation

boolean. Set to TRUE if partially_delayed_incidence represents delayed observations of infection events that themselves rely on further-delayed observations. See Details.

ref_date

Date. Optional. Date of the first data entry in the incidence data.

time_step

string. Time between two consecutive incidence datapoints. "day", "2 days", "week", "year"... (see seq.Date for details)
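The time_step strings follow the by argument of seq.Date. The short sketch below shows how a reference date and a few time_step values translate into dates; the reference date is arbitrary and chosen only for the example.

```r
# Arbitrary reference date, for illustration only.
ref_date <- as.Date("2020-03-02")

# The same four time steps expressed with different time_step values.
daily   <- seq.Date(from = ref_date, by = "day",    length.out = 4)
two_day <- seq.Date(from = ref_date, by = "2 days", length.out = 4)
weekly  <- seq.Date(from = ref_date, by = "week",   length.out = 4)
```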

output_Re_only

boolean. Should the output only contain Re estimates? (as opposed to containing results for each intermediate step)

...

Arguments passed on to .smooth_LOESS, .deconvolve_incidence_Richardson_Lucy, .estimate_Re_EpiEstim_sliding_window, .estimate_Re_EpiEstim_piecewise_constant, merge_outputs, nowcast

data_points_incl

integer. Size of the window used in the LOESS algorithm. The span parameter passed to loess is computed as the ratio of data_points_incl and the number of time steps in the input data.

degree

integer. LOESS degree. Must be 0, 1 or 2.
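The relation between data_points_incl and the loess span can be sketched with stats::loess directly. This is not the package's internal smoothing routine, only an illustration of the ratio described above; the incidence values are synthetic.

```r
# Synthetic incidence over 60 time steps (illustration only).
n <- 60
time <- seq_len(n)
incidence <- round(50 + 30 * sin(time / 8))

# span is the ratio of the window size to the number of time steps.
data_points_incl <- 21
span <- data_points_incl / n

# Fit a local regression of degree 1 with that span.
fit <- stats::loess(
  incidence ~ time,
  data = data.frame(time = time, incidence = incidence),
  span = span,
  degree = 1
)
smoothed <- stats::predict(fit)
```

With a fixed data_points_incl, a longer input series gives a smaller span, so the smoothing window covers the same number of time steps regardless of series length.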

initial_Re_estimate_window

integer. In order to help with the smoothing, the function extends the data back in time, padding with values obtained by assuming a constant Re. This parameter represents the number of timesteps in the beginning of incidence_input to take into account when computing the average initial Re.

delay_distribution

numeric square matrix or vector.

threshold_chi_squared

numeric scalar. Threshold for chi-squared values under which the R-L algorithm stops.

max_iterations

integer. Maximum number of iterations of the R-L algorithm.
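To show how threshold_chi_squared and max_iterations interact, here is a minimal sketch of a Richardson-Lucy style iteration in base R. It is not the package's implementation: the helper names, the flat initial guess and the exact form of the chi-squared statistic are assumptions made for illustration. It also assumes the delay vector assigns a positive probability to a delay of zero, so that no division by zero occurs.

```r
# Hypothetical helper: matrix D with D[t, s] = P(delay = t - s),
# so that expected observations are D %*% infections.
build_delay_matrix <- function(delay_vector, n) {
  D <- matrix(0, nrow = n, ncol = n)
  for (s in seq_len(n)) {
    for (t in s:n) {
      k <- t - s + 1
      if (k <= length(delay_vector)) D[t, s] <- delay_vector[k]
    }
  }
  D
}

# Hypothetical sketch of the iteration (not the package's function).
richardson_lucy_sketch <- function(observed, delay_vector,
                                   threshold_chi_squared = 1,
                                   max_iterations = 100) {
  n <- length(observed)
  D <- build_delay_matrix(delay_vector, n)
  # Probability that an infection at step s is observed within the window.
  q <- colSums(D)
  # Flat initial guess (assumption; other initializations are possible).
  estimate <- rep(mean(observed), n)
  for (i in seq_len(max_iterations)) {
    expected <- as.vector(D %*% estimate)
    # Assumed form of the stopping statistic.
    chi_squared <- mean((expected - observed)^2 / expected)
    if (chi_squared < threshold_chi_squared) break
    # Multiplicative Richardson-Lucy update.
    estimate <- estimate / q * as.vector(t(D) %*% (observed / expected))
  }
  estimate
}

# Synthetic check: delay a known infection curve, then deconvolve it.
delay_vector <- c(0.1, 0.3, 0.4, 0.2)
true_infections <- c(5, 8, 12, 20, 30, 42, 50, 45, 35, 22, 12, 6)
D <- build_delay_matrix(delay_vector, length(true_infections))
observed <- as.vector(D %*% true_infections)
deconvolved <- richardson_lucy_sketch(observed, delay_vector,
                                      threshold_chi_squared = 0.01,
                                      max_iterations = 200)
```

The loop stops at whichever comes first: the chi-squared statistic dropping below the threshold, or the iteration cap being reached.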

verbose

Boolean. Print verbose output?

estimation_window

Use with estimation_method = "EpiEstim sliding window". Positive integer value. Number of data points over which to assume Re to be constant.

import_incidence_input

NULL or module input object. List with two elements:

  1. A numeric vector named values: the incidence recorded on consecutive time steps.

  2. An integer named index_offset: the offset, counted in number of time steps, by which the first value in values is shifted compared to a reference time step. This parameter allows one to keep track of the date of the first value in values without needing to carry a date column around. A positive offset means values are shifted towards the future relative to the reference time step. A negative offset means the opposite.

If not NULL, this data represents recorded imported cases, and incidence_input then represents only local cases.

minimum_cumul_incidence

Numeric value. Minimum cumulative number of infections required before the Re estimation starts. Default is 12, as recommended in Cori et al., 2013.
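The effect of this threshold can be sketched as follows. The incidence values are invented, and whether estimation starts at or just after the first qualifying time step is an assumption of this illustration, not something stated above.

```r
# Invented infection counts, for illustration only.
incidence <- c(0, 1, 0, 2, 3, 1, 4, 6, 8, 10)
minimum_cumul_incidence <- 12

# First time step at which the cumulative count reaches the threshold.
first_step <- which(cumsum(incidence) >= minimum_cumul_incidence)[1]
```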

mean_serial_interval

Numeric positive value. mean_si for estimate_R

std_serial_interval

Numeric positive value. std_si for estimate_R

mean_Re_prior

Numeric positive value. mean prior for estimate_R

output_HPD

Boolean. If TRUE, return the highest posterior density interval with the output.

interval_ends

Use with estimation_method = "EpiEstim piecewise constant". Integer vector. Optional argument. If provided, interval_ends overrides the interval_length argument. Each element of interval_ends specifies the right boundary of an interval over which Re is assumed to be constant for the calculation. Values in interval_ends must be integers that follow the same numbering of time steps as incidence_input. In other words, interval_ends and incidence_input use the same time step as the zero-th time step.

interval_length

Use with estimation_method = "EpiEstim piecewise constant". Positive integer value. Re is assumed constant over steps of size interval_length.
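The relationship between interval_length and interval_ends can be sketched as below. This is an illustration only: indices here count from 1, and how the package numbers time steps or handles a final, shorter interval is an assumption.

```r
# Hypothetical series length and interval length.
n_steps <- 30
interval_length <- 7

# Right boundaries of consecutive intervals of equal length.
interval_ends <- seq(from = interval_length, to = n_steps, by = interval_length)

# Assumed handling of the remainder: close a last, shorter interval.
if (tail(interval_ends, 1) < n_steps) {
  interval_ends <- c(interval_ends, n_steps)
}

# Matching left boundaries, for inspection.
interval_starts <- c(1, head(interval_ends, -1) + 1)
```

Passing interval_ends explicitly is useful when the intervals should follow known change points (for instance, dates of interventions) rather than a regular grid.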

cutoff_observation_probability

Value between 0 and 1. Only data points for time steps at which the probability of observing an event is higher than cutoff_observation_probability are kept. The few data points with a lower probability of being observed are trimmed off the tail of the time series.
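One way to picture this trimming is sketched below. It assumes that the probability of having observed an event from a given time step is the cumulative delay distribution evaluated at the number of steps remaining until the end of the series. The delay vector and cutoff are hypothetical, and the package's exact rule may differ.

```r
# Hypothetical delay distribution: P(delay = 0, 1, 2, 3, 4 time steps).
delay_vector <- c(0.05, 0.15, 0.3, 0.3, 0.2)
cumulative <- cumsum(delay_vector)

# Series of 10 time steps; the last step has 0 steps left to be observed.
n <- 10
steps_until_end <- n - seq_len(n)

# Probability that an event from each time step has been observed by now.
prob_observed <- cumulative[pmin(steps_until_end + 1, length(cumulative))]

# Keep only time steps whose observation probability exceeds the cutoff.
cutoff_observation_probability <- 0.33
keep <- prob_observed > cutoff_observation_probability
last_kept_step <- max(which(keep))
```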

gap_to_present

Integer. Default value: 0. Number of time steps truncated off from the right tail of the raw incidence data. See Details.

Value

Effective reproductive number estimates through time. If output_Re_only is FALSE, the transformations applied to the input observations during the calculations are output as well.

Details

With this function, one can specify two types of delayed observations of infection events (in the same epidemic). The two incidence records are passed with the partially_delayed_incidence and fully_delayed_incidence arguments. These two types of delayed observations must not overlap with one another: a particular infection event should not be recorded in both time series.

If the two sets of observations are completely independent from one another, meaning that they represent two different ways infection events can be observed, with two different delays, then set partial_observation_requires_full_observation to FALSE. Note that a particular infection event should NOT be recorded twice: it cannot be recorded both in partially_delayed_incidence and in fully_delayed_incidence.

An alternative use case is when the two sets of observations are not independent from one another, for instance if recording a "partially-delayed" event requires first waiting for the corresponding "fully-delayed" event to occur. A typical example occurs when recording symptom onset events: in most cases, you must wait until a case is confirmed via a positive test result to learn about the symptom onset event (assuming the case was symptomatic in the first place). But you typically do not have the date of symptom onset for all confirmed cases (even assuming they were all symptomatic). In such a case, set the partial_observation_requires_full_observation flag to TRUE, pass the incidence constructed from symptom onset events as partially_delayed_incidence, and pass the incidence constructed from case confirmation events as fully_delayed_incidence. The delay from infection to symptom onset is specified with the delay_until_partial argument. The delay from symptom onset to positive test in this example is specified with the delay_until_final_report argument. Note that, for a particular patient, if the date of symptom onset is known, the patient must not be counted again in the incidence of case confirmation. Otherwise, the infection event would be counted twice.
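The no-double-counting rule can be illustrated with a small line list. The data frame, its column names and the helper count_by_date are hypothetical; the sketch only shows how each case contributes to exactly one of the two incidence series.

```r
# Hypothetical line list: every case has a confirmation date,
# but the onset date is known for only some of them.
line_list <- data.frame(
  confirmation_date = as.Date(c("2020-03-05", "2020-03-05", "2020-03-06",
                                "2020-03-07", "2020-03-07")),
  onset_date = as.Date(c("2020-03-02", NA, "2020-03-03", NA, "2020-03-03"))
)

all_dates <- seq.Date(as.Date("2020-03-01"), as.Date("2020-03-07"), by = "day")

# Hypothetical helper: daily counts over a fixed range of dates.
count_by_date <- function(dates, all_dates) {
  as.vector(table(factor(as.character(dates),
                         levels = as.character(all_dates))))
}

has_onset <- !is.na(line_list$onset_date)

# Cases with a known onset date go into the partially-delayed series only.
onset_incidence <- count_by_date(line_list$onset_date[has_onset], all_dates)

# Cases without an onset date go into the fully-delayed series only.
report_incidence <- count_by_date(line_list$confirmation_date[!has_onset],
                                  all_dates)
```

Each case then appears once across the two series, so the total of both vectors equals the number of rows in the line list.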

Examples

shape_incubation = 3.2
scale_incubation = 1.3
delay_incubation <- list(name="gamma", shape = shape_incubation, scale = scale_incubation)

shape_onset_to_report = 2.7
scale_onset_to_report = 1.6
delay_onset_to_report <- list(name="gamma", shape = shape_onset_to_report, scale = scale_onset_to_report)

## Basic usage of estimate_from_combined_observations

Re_estimate_1 <- estimate_from_combined_observations(
  partially_delayed_incidence = HK_incidence_data$onset_incidence,
  fully_delayed_incidence = HK_incidence_data$report_incidence,
  partial_observation_requires_full_observation = TRUE,
  delay_until_partial = delay_incubation,
  delay_until_final_report = delay_onset_to_report
)

## Advanced usage of estimate_from_combined_observations

# Getting a more verbose result. Adding a date column and returning intermediate
# results as well as the Re estimate.
Re_estimate_2 <- estimate_from_combined_observations(
  partially_delayed_incidence = HK_incidence_data$onset_incidence,
  fully_delayed_incidence = HK_incidence_data$report_incidence,
  partial_observation_requires_full_observation = TRUE,
  delay_until_partial = delay_incubation,
  delay_until_final_report = delay_onset_to_report,
  ref_date = HK_incidence_data$date[1],
  output_Re_only = FALSE
)

# Incorporating prior knowledge over Re. Here, Re is assumed constant over a time
# frame of one week, with a prior mean of 1.25.
Re_estimate_3 <- estimate_from_combined_observations(
  partially_delayed_incidence = HK_incidence_data$onset_incidence,
  fully_delayed_incidence = HK_incidence_data$report_incidence,
  partial_observation_requires_full_observation = TRUE,
  delay_until_partial = delay_incubation,
  delay_until_final_report = delay_onset_to_report,
  estimation_method = 'EpiEstim piecewise constant',
  interval_length = 7,
  mean_Re_prior = 1.25
)

# Incorporating prior knowledge over the disease. Here, the mean of the serial
# interval is assumed to be 5 days, and the standard deviation is assumed to be
# 2.5 days.
Re_estimate_4 <- estimate_from_combined_observations(
  partially_delayed_incidence = HK_incidence_data$onset_incidence,
  fully_delayed_incidence = HK_incidence_data$report_incidence,
  partial_observation_requires_full_observation = TRUE,
  delay_until_partial = delay_incubation,
  delay_until_final_report = delay_onset_to_report,
  mean_serial_interval = 5,
  std_serial_interval = 2.5
)