Sampling Contacts

Overview

Agents' interactions in epiworld are implemented in various ways. Depending on the model, contacts are drawn either directly from agents' connections (network models) or via binomial sampling over groups defined by a contact matrix (mixing models). This section explains the optimization used in the mixing models to reduce computation time when modeling potentially infectious contacts.

Drawing contacts

In network models, we assume that all contacts are drawn at each step of the simulation. The randomness comes from the probability of transmission, which depends on the virus, the infectious agent, and the susceptible agent.

In the mixing models, each agent belongs to a group, and interactions are governed by a contact matrix \(C\). Entry \(C(g, j)\) is the expected number of contacts that an agent in group \(g\) has with agents in group \(j\) per day. Internally, the matrix is stored in column-major order, so entry \((g, j)\) is stored at contact_matrix[j * n_groups + g].
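The column-major lookup can be illustrated with a small Python sketch. The helper name `contact_matrix_index` and the 2 x 2 matrix values are hypothetical, chosen only to show how the index formula maps \((g, j)\) to a flat position.

```python
def contact_matrix_index(g: int, j: int, n_groups: int) -> int:
    """Flat position of entry (g, j) in a column-major n_groups x n_groups matrix."""
    return j * n_groups + g


# Hypothetical 2 x 2 contact matrix, written row by row for readability:
#   C = [[2.0, 1.0],
#        [0.5, 3.0]]
# Column-major storage lists the first column, then the second.
contact_matrix = [2.0, 0.5, 1.0, 3.0]
n_groups = 2

# Expected daily contacts of a group-0 agent with group 1, i.e. C(0, 1).
c_01 = contact_matrix[contact_matrix_index(0, 1, n_groups)]
# Expected daily contacts of a group-1 agent with group 0, i.e. C(1, 0).
c_10 = contact_matrix[contact_matrix_index(1, 0, n_groups)]
```

Reading row \(g\) of the matrix therefore means stepping through the flat array with a stride of `n_groups`.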

Let \(n_{\text{avail}}(j)\) denote the number of agents in group \(j\) that are currently available for mixing, and let \(n_{\text{inf}}(j)\) denote the number of infectious agents in that group. Then the total number of contacts that an agent in group \(g\) has with group \(j\) can be written as

\[ Y(g,j) \sim \text{Binomial}\left(n_{\text{avail}}(j), \frac{C(g, j)}{n_{\text{avail}}(j)}\right). \]

Because only infectious contacts can change the state of a susceptible agent, the implementation samples directly from the infectious subset instead:

\[ X(g,j) \sim \text{Binomial}\left(n_{\text{inf}}(j), \frac{C(g, j)}{n_{\text{avail}}(j)}\right). \]

This is exactly what the code implements through

rbinom(
    n_inf_in_group_j,
    adjusted_contact_rate[j] * contact_matrix[j * n_groups + g]
)

where g is the group of the focal agent, and adjusted_contact_rate[j] equals 1 / n_avail(j) whenever group j has available agents and 0 otherwise.
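The draw described above can be sketched in Python. This is an illustration, not the epiworld source: `rbinom` is a stand-in built from Bernoulli trials, and the function names, the argument layout, and the clamping of the probability to \([0, 1]\) are assumptions made for this example.

```python
import random


def adjusted_contact_rates(n_avail):
    """Reciprocal of the available agents per group; 0 when a group is empty."""
    return [1.0 / n if n > 0 else 0.0 for n in n_avail]


def rbinom(n, p, rng):
    """Binomial(n, p) draw as a sum of n Bernoulli trials (stand-in sampler)."""
    # Assumption: clamp p to [0, 1] in case C(g, j) exceeds n_avail(j).
    p = min(max(p, 0.0), 1.0)
    return sum(1 for _ in range(n) if rng.random() < p)


def sample_infectious_contacts(g, n_inf, n_avail, contact_matrix, n_groups, rng):
    """Number of infectious contacts a group-g agent draws from each group j."""
    rates = adjusted_contact_rates(n_avail)
    return [
        rbinom(n_inf[j], rates[j] * contact_matrix[j * n_groups + g], rng)
        for j in range(n_groups)
    ]


# Hypothetical two-group population; the matrix is stored column-major.
rng = random.Random(42)
draws = sample_infectious_contacts(
    g=0,
    n_inf=[3, 1],
    n_avail=[10, 20],
    contact_matrix=[2.0, 0.5, 1.0, 3.0],
    n_groups=2,
    rng=rng,
)
```

Each entry of `draws` is bounded by the number of infectious agents in the corresponding group, so the cost of a step scales with the infectious counts rather than with group sizes.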

The expected number of infectious contacts coming from group \(j\) is therefore

\[ \mathbb{E}\left[X(g,j)\right] = n_{\text{inf}}(j) \times \frac{C(g, j)}{n_{\text{avail}}(j)}. \]

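Summing the unrestricted draws \(Y(g,j)\) over all groups, and provided every group has available agents, the expected total number of contacts (infectious or not) for an agent in group \(g\) is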

\[ \mathbb{E}\left[\sum_j Y(g,j)\right] = \sum_j C(g, j). \]
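As a worked illustration with made-up numbers (not taken from any epiworld model), suppose the focal agent's row of the contact matrix is \(C(g,1) = 3\) and \(C(g,2) = 1.5\), and group 1 has \(n_{\text{avail}}(1) = 60\) available agents of whom \(n_{\text{inf}}(1) = 4\) are infectious. Then

\[ \mathbb{E}\left[X(g,1)\right] = 4 \times \frac{3}{60} = 0.2, \qquad \mathbb{E}\left[\sum_j Y(g,j)\right] = 3 + 1.5 = 4.5. \]

In this example the sampler draws, on average, 0.2 contacts from group 1 rather than the 3 contacts that a full draw of \(Y(g,1)\) would produce.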

This is the key difference from the old parameterization based on a global contact rate and a row-stochastic mixing matrix. In the current implementation, the contact matrix already stores the expected number of contacts directly, so there is no separate contact-rate parameter to distribute across groups. The single-group case is recovered by using a \(1 \times 1\) matrix whose only entry is the expected number of contacts per agent per day.

Since the algorithm only draws infectious contacts, it avoids spending time drawing non-infectious contacts that cannot affect transmission. Once the number of infectious contacts from each group has been sampled, infectious agents are selected uniformly at random from that group's current infectious list.
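The selection step can be sketched as follows. The text does not say whether agents are picked with or without replacement; this sketch assumes without replacement, and the function name `pick_infectious_contacts` and the agent ids are hypothetical.

```python
import random


def pick_infectious_contacts(infectious_ids, n_contacts, rng):
    """Choose n_contacts distinct agents uniformly at random from a group's infectious list."""
    # The binomial draw never exceeds the list length; min() is only a safeguard.
    k = min(n_contacts, len(infectious_ids))
    return rng.sample(infectious_ids, k)


# Hypothetical infectious list for one group and a sampled count of 2 contacts.
rng = random.Random(0)
picked = pick_infectious_contacts([10, 11, 12, 13], 2, rng)
```

If the sampled count is 0, the function returns an empty list and no agents from that group are visited.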

Mixing models with quarantine

When the model features quarantine, isolation, hospitalization, or any other mechanism that removes agents from the mixing pool, the quantity \(n_{\text{avail}}(j)\) changes over time. The contact matrix \(C\) does not change, but the binomial sampling probability does:

\[ X(g,j) \sim \text{Binomial}\left(n_{\text{inf}}(j), \frac{C(g, j)}{n_{\text{avail}}(j)}\right). \]

As \(n_{\text{avail}}(j)\) decreases, the probability of drawing an infectious contact from group \(j\) increases for a fixed value of \(C(g, j)\). This is why quarantine-oriented models recompute the reciprocal of the number of available agents in each group at every step. The exact definition of an "available" agent depends on the model, but the common idea is the same: agents who are temporarily removed from mixing are excluded from the denominator used to sample contacts.

In the Measles and SEIR quarantine models, this means that the effective exposure pressure from a group can increase when a large share of that group is removed from circulation while infectious individuals remain available to mix.
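The effect of a shrinking mixing pool can be checked numerically. The sketch below uses hypothetical numbers and helper names; it only evaluates the expectation \(n_{\text{inf}}(j) \times C(g,j) / n_{\text{avail}}(j)\) before and after half of a group is quarantined while its infectious agents keep mixing.

```python
def infectious_contact_probability(c_gj, n_avail_j):
    """Per-infectious-agent sampling probability C(g, j) / n_avail(j); 0 for an empty group."""
    return c_gj / n_avail_j if n_avail_j > 0 else 0.0


def expected_infectious_contacts(c_gj, n_inf_j, n_avail_j):
    """Mean of Binomial(n_inf(j), C(g, j) / n_avail(j))."""
    return n_inf_j * infectious_contact_probability(c_gj, n_avail_j)


# Hypothetical group: C(g, j) = 4 contacts per day, 5 infectious agents.
before = expected_infectious_contacts(4.0, 5, 100)  # all 100 agents available
after = expected_infectious_contacts(4.0, 5, 50)    # 50 non-infectious agents quarantined
```

Halving the available pool while keeping the infectious count fixed doubles the expected number of infectious contacts in this example.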

Probability of Infection

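Generally, epiworld assumes that at each step of the simulation, a susceptible agent can acquire the disease from at most one infectious agent. Conditional on at most one transmission occurring, the probability that infectious agent \(i\) infects susceptible agent \(j\), with \(k\) and \(l\) indexing the infectious agents in contact with \(j\), is given by the following formula: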

\[ P(i\to j| \text{at most one}) = \frac{p_{ij} \times \prod_{k\neq i}\left(1 - p_{kj}\right)}{\prod_k\left(1 - p_{kj}\right) + \sum_k p_{kj} \times \prod_{l\neq k}\left(1 - p_{lj}\right)} \]

The adjusted probabilities \(p_{ij}\) are computed as a function of \(i\), \(j\), and the virus. The following section describes how these probabilities are computed.
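The conditional probability above can be evaluated with a short Python sketch. The function name and the guard against a zero denominator are assumptions made for illustration; the input is the list of adjusted probabilities \(p_{kj}\) for every infectious contact \(k\) of agent \(j\).

```python
from math import prod


def infection_probabilities(p):
    """P(i -> j | at most one transmission) for each infectious contact i of agent j."""
    # Probability that no contact transmits.
    p_none = prod(1.0 - pk for pk in p)
    # Probability that contact k transmits and every other contact does not.
    exactly_one = [
        pk * prod(1.0 - pl for l, pl in enumerate(p) if l != k)
        for k, pk in enumerate(p)
    ]
    denom = p_none + sum(exactly_one)
    if denom == 0.0:
        # Happens when two or more contacts transmit with probability 1.
        raise ValueError("the event 'at most one transmission' has probability zero")
    return [e / denom for e in exactly_one]


# Two hypothetical contacts, each transmitting with probability 0.5.
probs = infection_probabilities([0.5, 0.5])
```

With two contacts at 0.5 each, the three outcomes (no infection, infection by the first, infection by the second) are equally likely, so each entry of `probs` is one third.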

Adjusted probabilities

Viruses and tools provide a way to adjust how agents move between states. Viruses in epiworld contain various baseline probabilities used across models, including transmission, recovery, and death. Tools, on the other hand, alter these probabilities by reducing or increasing them, affecting agents' susceptibility, infectiousness, recovery, and death probabilities. Currently, tools alter these probabilities by a constant factor,

\[ p_{ij} = p_{v} \times \left(1 - factor_{host}\right) \times \left(1 - factor_{target}\right) \]

where \(p_{v}\) is the raw transmission probability of virus \(v\), and \(factor_{host}\) and \(factor_{target}\) are the factors by which tools reduce (or increase) the probability for the host and the target. For example, if \(p_{v} = 0.9\), the host wears a mask with \(factor_{\text{mask host}} = 0.3\), and the target is vaccinated with \(factor_{\text{vaccinated target}} = 0.5\), then the adjusted probability is \(p_{ij} = 0.9 \times (1 - 0.3) \times (1 - 0.5) = 0.315\).
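The formula translates directly into a one-line function. In this minimal sketch the function and argument names are illustrative rather than part of the epiworld API; the call evaluates the mask-and-vaccine example from the text.

```python
def adjusted_transmission_probability(p_v, factor_host, factor_target):
    """p_ij = p_v * (1 - factor_host) * (1 - factor_target)."""
    return p_v * (1.0 - factor_host) * (1.0 - factor_target)


# Masked host (factor 0.3) and vaccinated target (factor 0.5), raw probability 0.9.
p_ij = adjusted_transmission_probability(0.9, 0.3, 0.5)
```

A factor of 0 leaves the raw probability untouched, and a factor of 1 on either side blocks transmission entirely.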

When agents have more than one tool, factors are combined as follows:

\[ factor_{agent} = 1 - \prod_{t\in tools_{agent}}\left(1 - factor_{t}\right) \]

For example, a vaccinated agent wearing a mask would have a combined factor of \(1 - (1 - 0.3) \times (1 - 0.5) = 0.65\). The same adjustment principle also applies to recovery rates in the SIR and SEIR models.
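The combination rule for several tools can be sketched as below. The helper name `combined_factor` is hypothetical, and returning 0 for an agent with no tools follows from the empty product rather than from anything stated about the implementation.

```python
from math import prod


def combined_factor(tool_factors):
    """1 - prod(1 - factor_t) over an agent's tools; 0 for an agent with no tools."""
    return 1.0 - prod(1.0 - f for f in tool_factors)


# Agent with a mask (0.3) and a vaccine (0.5), as in the example above.
mask_and_vaccine = combined_factor([0.3, 0.5])
# Agent with no tools: the empty product is 1, so the factor is 0.
no_tools = combined_factor([])
```

Because the factors multiply on the complement scale, adding a tool with a factor between 0 and 1 never lowers the combined factor.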