Average age at infection in the SIR model¶

Continuing our investigation of the Susceptible-Infected-Recovered model, we will add demographics and investigate the behavior of the model in and around the endemic equilibrium.

$$ \dot{S} = -\frac{\beta*S*I}{N} + \mu N - \mu S $$

$$ \dot{I} = \frac{\beta*S*I}{N} - \gamma I - \mu I $$

$$ \dot{R} = \gamma I - \mu R $$

Analysis of this system can be found in other sources (ref. Keeling/Rohani). Setting the derivatives to zero and doing a bit of algebra gets you to the endemic equilbrium

$$ (S^*, \: I^*, \: R^*) = (\frac{1}{R_0}, \:\: \frac{\mu (R_0-1)}{\beta}, \: \: 1-\frac{1}{R_0} - \frac{\mu (R_0-1)}{\beta}) $$

$$ \text{where} \:\: R_0 = \frac{\beta}{\gamma + \mu} $$

An individual susceptible's mean time to infection will then be the inverse of the total force of infection, $\beta I^*$.

$$ \tau_{S \rightarrow I} = \frac{1}{\mu (R_0 -1)} $$

The equation above is often also described as the average age at infection. However, it is key to remember that the measured age at infection will be censored by non-disease deaths occurring at a rate $\mu$. You can work through the impact of this, but in the simple case of constant mortality, it turns out to exactly balance out the $-1$ in the above equation, and so our observed average age at infection will be $\frac{1}{\mu R_0}$. Since the hazard is constant at equilibrium, this implies an exponential distribution:

$$P(a | S->I) \sim \mu R_0 e^{-\mu R_0 a}$$

Alternatively, we can correct for the influence of mortality by observing the fraction of children who are susceptible at age A, which necessarily conditions on survival to age A. This will look like the cumulative distribution of an exponential distributed according to the mean time to infection above:

$$P(R | a) \sim 1 - e^{(-a \mu (R_0-1))}$$

Construct the model¶

In this notebook, we will reproduce this dependence, and to do so, we will demonstrate how a LASER user can construct new model components on the fly for use with one of the packaged models. First, we note that the SIR model as built here does not actually output the needed information on the age at infection, only the dates of birth and death. The infection event happens in the Transmission step, so we will construct a new derived class (TransmissionWithDOI) to reproduce transmission behavior but also add and record a date of infection property for agents. Second, because this distribution is an equilibrium property of this system, we will build a model component to periodically import new infections, to ensure that infections don't burn out after the initial outbreak, and we are able to establish an endemic equilibrium (we could, instead, opt to initialize the model as near the endemic equilibrium as possible, but it is nicer to show that the model naturally goes there). Finally, we also need to model a large enough population to sustain the infection endemically (this population is known as the Critical Community Size and will be treated in notebook 7). The primary driver of the critical community size is actually not $R_0$ but the duration of infection, so setting the duration of infection to relatively long values ($\gamma = \frac {1} {60 \text {days}}$ ) prevents us from needing huge agent populations. Finally, we construct a single-patch LASER model with five components: VitalDynamics, Susceptibility, TransmissionWithDOI, Infection, and Importation.

Sanity check¶

The first test, as always, ensures that certain basic constraints are being obeyed by the model.

Scientific test¶

The scientific test will sample a set of $(\mu, \gamma, R_0)$ tuples and confirm that both the age at infection and the fraction of susceptibles at a given age are well-described by exponential distributions and that the associated rate constant is as expected.

In [ ]:

Copied!





import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from laser.core.propertyset import PropertySet
from scipy.optimize import curve_fit
from scipy.stats import expon
from scipy.stats import kstest
from sklearn.metrics import mean_squared_error
from functools import partial

import laser.core.distributions as dists
from laser.core.demographics import AliasedDistribution
from laser.core.demographics import KaplanMeierEstimator
from laser.generic import SIR
from laser.generic import Model
from laser.generic.utils import ValuesMap
from laser.core.utils import grid

import laser.core
import laser.generic

print(f"{np.__version__=}")
print(f"{laser.core.__version__=}")
print(f"{laser.generic.__version__=}")
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from laser.core.propertyset import PropertySet
from scipy.optimize import curve_fit
from scipy.stats import expon
from scipy.stats import kstest
from sklearn.metrics import mean_squared_error
from functools import partial

import laser.core.distributions as dists
from laser.core.demographics import AliasedDistribution
from laser.core.demographics import KaplanMeierEstimator
from laser.generic import SIR
from laser.generic import Model
from laser.generic.utils import ValuesMap
from laser.core.utils import grid

import laser.core
import laser.generic

print(f"{np.__version__=}")
print(f"{laser.core.__version__=}")
print(f"{laser.generic.__version__=}")

done
np.__version__='2.2.6'
laser.core.__version__='0.9.1'
laser.generic.__version__='0.0.0'

Here we build the modified Transmission class, to record date of infection.

In [2]:

Copied!





import numba as nb
from laser.generic import State

class TransmissionWithDOI(SIR.Transmission):
    def __init__(self, model, infdurdist, infdurmin: int =1):
        super().__init__(model, infdurdist, infdurmin)
        #After calling the superclass constructor, add an agent property "doi" to record date of infection
        self.model.people.add_scalar_property("doi", dtype=np.int32)  # Date of Infection
        self.model.people.doi[:] = -1
        
    #Next, define the Numba-accelerated transmission function - this looks like exactly like the one from the transmission class, 
    #but takes in the additional "doi" parameter and fills it with the current timestep upon infection
    @staticmethod
    @nb.njit(
        nogil=True,
        parallel=True,
    )
    def nb_transmission_doi(states, nodeids, ft, inf_by_node, itimers, infdurdist, infdurmin, tick, dois):
        for i in nb.prange(len(states)):
            if states[i] == State.SUSCEPTIBLE.value:
                # Check for infection
                draw = np.random.rand()
                nid = nodeids[i]
                if draw < ft[nid]:
                    states[i] = State.INFECTIOUS.value
                    dois[i] = tick  # Set date of infection
                    itimers[i] = np.maximum(np.round(infdurdist(tick, nid)), infdurmin)  # Set the infection timer
                    inf_by_node[nb.get_thread_id(), nid] += 1

        return

    #Finally, we define the step function, which again mirrors that of the Transmission class but calls the new nb_transmission_doi function and passes 
    #doi to it appropriately
    def step(self, tick: int) -> None:
        ft = self.model.nodes.forces[tick]

        N = self.model.nodes.S[tick] + (I := self.model.nodes.I[tick])  # noqa: E741
        # Shouldn't be any exposed (E), because this is an S->I model
        # Might have R.  
        if hasattr(self.model.nodes, "R"):
            N += self.model.nodes.R[tick]

        ft[:] = self.model.params.beta * I / N
        transfer = ft[:, None] * self.model.network
        ft += transfer.sum(axis=0)
        ft -= transfer.sum(axis=1)
        ft = -np.expm1(-ft)  # Convert to probability of infection

        inf_by_node = np.zeros((nb.get_num_threads(), self.model.nodes.count), dtype=np.uint32)
        self.nb_transmission_doi(
            self.model.people.state, self.model.people.nodeid, ft, inf_by_node, self.model.people.itimer, self.infdurdist, self.infdurmin,
            tick, self.model.people.doi # DOI specific parameters
        )
        inf_by_node = inf_by_node.sum(axis=0).astype(self.model.nodes.S.dtype)  # Sum over threads

        # state(t+1) = state(t) + ∆state(t)
        self.model.nodes.S[tick + 1] -= inf_by_node
        self.model.nodes.I[tick + 1] += inf_by_node
        # Record today's ∆
        self.model.nodes.newly_infected[tick] = inf_by_node

        return
import numba as nb
from laser.generic import State

class TransmissionWithDOI(SIR.Transmission):
    def __init__(self, model, infdurdist, infdurmin: int =1):
        super().__init__(model, infdurdist, infdurmin)
        #After calling the superclass constructor, add an agent property "doi" to record date of infection
        self.model.people.add_scalar_property("doi", dtype=np.int32)  # Date of Infection
        self.model.people.doi[:] = -1
        
    #Next, define the Numba-accelerated transmission function - this looks like exactly like the one from the transmission class, 
    #but takes in the additional "doi" parameter and fills it with the current timestep upon infection
    @staticmethod
    @nb.njit(
        nogil=True,
        parallel=True,
    )
    def nb_transmission_doi(states, nodeids, ft, inf_by_node, itimers, infdurdist, infdurmin, tick, dois):
        for i in nb.prange(len(states)):
            if states[i] == State.SUSCEPTIBLE.value:
                # Check for infection
                draw = np.random.rand()
                nid = nodeids[i]
                if draw < ft[nid]:
                    states[i] = State.INFECTIOUS.value
                    dois[i] = tick  # Set date of infection
                    itimers[i] = np.maximum(np.round(infdurdist(tick, nid)), infdurmin)  # Set the infection timer
                    inf_by_node[nb.get_thread_id(), nid] += 1

        return

    #Finally, we define the step function, which again mirrors that of the Transmission class but calls the new nb_transmission_doi function and passes 
    #doi to it appropriately
    def step(self, tick: int) -> None:
        ft = self.model.nodes.forces[tick]

        N = self.model.nodes.S[tick] + (I := self.model.nodes.I[tick])  # noqa: E741
        # Shouldn't be any exposed (E), because this is an S->I model
        # Might have R.  
        if hasattr(self.model.nodes, "R"):
            N += self.model.nodes.R[tick]

        ft[:] = self.model.params.beta * I / N
        transfer = ft[:, None] * self.model.network
        ft += transfer.sum(axis=0)
        ft -= transfer.sum(axis=1)
        ft = -np.expm1(-ft)  # Convert to probability of infection

        inf_by_node = np.zeros((nb.get_num_threads(), self.model.nodes.count), dtype=np.uint32)
        self.nb_transmission_doi(
            self.model.people.state, self.model.people.nodeid, ft, inf_by_node, self.model.people.itimer, self.infdurdist, self.infdurmin,
            tick, self.model.people.doi # DOI specific parameters
        )
        inf_by_node = inf_by_node.sum(axis=0).astype(self.model.nodes.S.dtype)  # Sum over threads

        # state(t+1) = state(t) + ∆state(t)
        self.model.nodes.S[tick + 1] -= inf_by_node
        self.model.nodes.I[tick + 1] += inf_by_node
        # Record today's ∆
        self.model.nodes.newly_infected[tick] = inf_by_node

        return

Now we define the Importation class, which will periodically infect a few susceptibles to prevent die-out.

In [3]:

Copied!





class Importation:
    def __init__(self, model, infdurdist, infdurmin: int =1, period: int = 180, count: int = 3):
        self.model = model
        self.infdurdist = infdurdist
        self.infdurmin = infdurmin
        self.period = period
        self.count = count

        self.model.nodes.add_vector_property("imports", model.params.nticks + 1, dtype=np.uint32, default=0)

        return
    
    def step(self, tick: int) -> None:
        if tick > 0 and tick % self.period == 0:
            i_susceptible = np.nonzero(self.model.people.state == SIR.State.SUSCEPTIBLE.value)[0]
            if len(i_susceptible) > 0:
                count = min(self.count, len(i_susceptible))
                i_infect = np.random.choice(i_susceptible, size=count, replace=False)
                self.model.people.state[i_infect] = SIR.State.INFECTIOUS.value
                samples = dists.sample_floats(self.infdurdist, np.zeros(count, np.float32))
                samples = np.round(samples)
                samples = np.maximum(samples, self.infdurmin).astype(self.model.people.itimer.dtype)
                self.model.people.itimer[i_infect] = samples
                inf_by_node = np.bincount(self.model.people.nodeid[i_infect], minlength=len(self.model.nodes)).astype(self.model.nodes.S.dtype)
                self.model.nodes.S[tick + 1] -= inf_by_node
                self.model.nodes.I[tick + 1] += inf_by_node
                self.model.nodes.imports[tick] = inf_by_node
            # else:
            #     print(f"No susceptibles to infect at tick {tick}")

        return
class Importation:
    def __init__(self, model, infdurdist, infdurmin: int =1, period: int = 180, count: int = 3):
        self.model = model
        self.infdurdist = infdurdist
        self.infdurmin = infdurmin
        self.period = period
        self.count = count

        self.model.nodes.add_vector_property("imports", model.params.nticks + 1, dtype=np.uint32, default=0)

        return
    
    def step(self, tick: int) -> None:
        if tick > 0 and tick % self.period == 0:
            i_susceptible = np.nonzero(self.model.people.state == SIR.State.SUSCEPTIBLE.value)[0]
            if len(i_susceptible) > 0:
                count = min(self.count, len(i_susceptible))
                i_infect = np.random.choice(i_susceptible, size=count, replace=False)
                self.model.people.state[i_infect] = SIR.State.INFECTIOUS.value
                samples = dists.sample_floats(self.infdurdist, np.zeros(count, np.float32))
                samples = np.round(samples)
                samples = np.maximum(samples, self.infdurmin).astype(self.model.people.itimer.dtype)
                self.model.people.itimer[i_infect] = samples
                inf_by_node = np.bincount(self.model.people.nodeid[i_infect], minlength=len(self.model.nodes)).astype(self.model.nodes.S.dtype)
                self.model.nodes.S[tick + 1] -= inf_by_node
                self.model.nodes.I[tick + 1] += inf_by_node
                self.model.nodes.imports[tick] = inf_by_node
            # else:
            #     print(f"No susceptibles to infect at tick {tick}")

        return

In [ ]:

Copied!





from laser.generic.vitaldynamics import BirthsByCBR, MortalityByEstimator

pop=3e5
scenario = grid(M=1, N=1, population_fn=lambda x,y: pop, origin_x=-(122+(19+(59/60))/60), origin_y=47+(36+(35/60))/60)
initial_infected = 1
scenario["S"] = scenario.population - initial_infected
scenario["I"] = initial_infected
scenario["R"] = 0
parameters = PropertySet(
    # {"seed": 4, "nticks": 18250, "verbose": True, "beta": 0.1, "inf_mean": 60, "cbr": 90, "importation_period": 180, "importation_count": 3}
    {"seed": 4, "nticks": 30*365, "verbose": True, "beta": 12/60, "inf_mean": 60, "cbr": 90, "importation_period": 180, "importation_count": 3}
)

birthrate_map = ValuesMap.from_scalar(parameters.cbr, nticks=parameters.nticks, nnodes=len(scenario))
model = Model(scenario, parameters, birthrates=birthrate_map)
infdurdist = dists.exponential(scale=parameters.inf_mean)
rate_const = 365 * ((1 + parameters.cbr / 1000) ** (1/365) - 1) #to get stable population, small correction factor for the annual age-dist/KM estimators.
stable_age_dist = np.array(1000*np.exp(-rate_const*np.arange(89)))
pyramid = AliasedDistribution(stable_age_dist)
survival = KaplanMeierEstimator(stable_age_dist.cumsum())
model.components = [
    SIR.Susceptible(model),
    SIR.Recovered(model),
    SIR.Infectious(model, infdurdist),
    Importation(model, infdurdist),
    # SIR.Transmission(model, infdurdist),
    TransmissionWithDOI(model, infdurdist),
    BirthsByCBR(model, birthrate_map, pyramid=pyramid),
    MortalityByEstimator(model, estimator=survival),
]

model.run()
plt.plot(model.nodes.S, color="blue")
plt.plot(model.nodes.I, color="red")
plt.plot(model.nodes.R, color="green")
plt.plot(model.nodes.S+model.nodes.I+model.nodes.R, color="black")
plt.legend(["S", "I", "R", "N"])
plt.figure()
cut = (model.people.doi >= 365*28) 
plt.hist(model.people.doi[cut] - model.people.dob[cut], 40)
plt.show()
from laser.generic.vitaldynamics import BirthsByCBR, MortalityByEstimator

pop=3e5
scenario = grid(M=1, N=1, population_fn=lambda x,y: pop, origin_x=-(122+(19+(59/60))/60), origin_y=47+(36+(35/60))/60)
initial_infected = 1
scenario["S"] = scenario.population - initial_infected
scenario["I"] = initial_infected
scenario["R"] = 0
parameters = PropertySet(
    # {"seed": 4, "nticks": 18250, "verbose": True, "beta": 0.1, "inf_mean": 60, "cbr": 90, "importation_period": 180, "importation_count": 3}
    {"seed": 4, "nticks": 30*365, "verbose": True, "beta": 12/60, "inf_mean": 60, "cbr": 90, "importation_period": 180, "importation_count": 3}
)

birthrate_map = ValuesMap.from_scalar(parameters.cbr, nticks=parameters.nticks, nnodes=len(scenario))
model = Model(scenario, parameters, birthrates=birthrate_map)
infdurdist = dists.exponential(scale=parameters.inf_mean)
rate_const = 365 * ((1 + parameters.cbr / 1000) ** (1/365) - 1) #to get stable population, small correction factor for the annual age-dist/KM estimators.
stable_age_dist = np.array(1000*np.exp(-rate_const*np.arange(89)))
pyramid = AliasedDistribution(stable_age_dist)
survival = KaplanMeierEstimator(stable_age_dist.cumsum())
model.components = [
    SIR.Susceptible(model),
    SIR.Recovered(model),
    SIR.Infectious(model, infdurdist),
    Importation(model, infdurdist),
    # SIR.Transmission(model, infdurdist),
    TransmissionWithDOI(model, infdurdist),
    BirthsByCBR(model, birthrate_map, pyramid=pyramid),
    MortalityByEstimator(model, estimator=survival),
]

model.run()
plt.plot(model.nodes.S, color="blue")
plt.plot(model.nodes.I, color="red")
plt.plot(model.nodes.R, color="green")
plt.plot(model.nodes.S+model.nodes.I+model.nodes.R, color="black")
plt.legend(["S", "I", "R", "N"])
plt.figure()
cut = (model.people.doi >= 365*28) 
plt.hist(model.people.doi[cut] - model.people.dob[cut], 40)
plt.show()

300,000 agents in 1 node(s): 100%|██████████| 10950/10950 [00:10<00:00, 1073.57it/s]

Out[ ]:

(array([1.0895e+04, 8.4550e+03, 6.2160e+03, 4.7340e+03, 3.6110e+03,
        2.5050e+03, 2.0470e+03, 1.5200e+03, 1.1860e+03, 9.0300e+02,
        6.8400e+02, 4.8800e+02, 3.3100e+02, 2.7900e+02, 2.3000e+02,
        1.5300e+02, 1.2700e+02, 9.0000e+01, 7.9000e+01, 4.4000e+01,
        3.1000e+01, 3.7000e+01, 2.9000e+01, 1.4000e+01, 9.0000e+00,
        6.0000e+00, 4.0000e+00, 5.0000e+00, 2.0000e+00, 3.0000e+00,
        4.0000e+00, 0.0000e+00, 3.0000e+00, 1.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 1.0000e+00, 0.0000e+00, 2.0000e+00]),
 array([1.000000e+00, 1.018250e+02, 2.026500e+02, 3.034750e+02,
        4.043000e+02, 5.051250e+02, 6.059500e+02, 7.067750e+02,
        8.076000e+02, 9.084250e+02, 1.009250e+03, 1.110075e+03,
        1.210900e+03, 1.311725e+03, 1.412550e+03, 1.513375e+03,
        1.614200e+03, 1.715025e+03, 1.815850e+03, 1.916675e+03,
        2.017500e+03, 2.118325e+03, 2.219150e+03, 2.319975e+03,
        2.420800e+03, 2.521625e+03, 2.622450e+03, 2.723275e+03,
        2.824100e+03, 2.924925e+03, 3.025750e+03, 3.126575e+03,
        3.227400e+03, 3.328225e+03, 3.429050e+03, 3.529875e+03,
        3.630700e+03, 3.731525e+03, 3.832350e+03, 3.933175e+03,
        4.034000e+03]),
 <BarContainer object of 40 artists>)

No description has been provided for this image

Sanity checks¶

As always, check that we haven't broken anything - S+I+R = N at all times.

First scientific test check¶

As usual, we will test this on a single simulation instance and demonstrate the analysis, before moving on to testing over a range of input values.
Here, we are interested in equilibrium behavior, so I first place a cut to exclude all infections that occur before day 10000. We then plot the data, an exponential distribution with the expected value of $\frac{1}{R_0 \mu}$, a best-fit exponential distribution to the data, and perform a KS test.

In [5]:

Copied!





mu = (1 + model.params.cbr / 1000) ** (1 / 365) - 1
R0 = model.params.beta / (1 / model.params.inf_mean + mu)
A = 1 / (R0 * mu) / 365
a = np.arange(0, 15, 1/12)

def fit_age_at_infection(model, cutpos=5000):
    cut = model.people.doi > cutpos
    data = (model.people.doi[cut] - model.people.dob[cut]) / 365
    expfit = expon.fit(data)
    fitqual = kstest(data, expon.cdf, expfit)
    return data, expfit, fitqual

data, expfit, fitqual = fit_age_at_infection(model)

plt.hist(data, bins=a, density=True)
plt.plot(a, 1 / A * np.exp(-a / A), "-", lw=4)
plt.plot(a, expon.pdf(a, scale=expfit[1]), "k--")
plt.xlabel("Age at infection (years)")
plt.ylabel("Density")
plt.legend(
    [
        f"Expected exponential distribution - A = {A:.2f} years",
        f"Best fit age of infection, A = {expfit[1]:.2f} years",
        "Ages from simulation",
    ]
)
plt.show()
fitqual
mu = (1 + model.params.cbr / 1000) ** (1 / 365) - 1
R0 = model.params.beta / (1 / model.params.inf_mean + mu)
A = 1 / (R0 * mu) / 365
a = np.arange(0, 15, 1/12)

def fit_age_at_infection(model, cutpos=5000):
    cut = model.people.doi > cutpos
    data = (model.people.doi[cut] - model.people.dob[cut]) / 365
    expfit = expon.fit(data)
    fitqual = kstest(data, expon.cdf, expfit)
    return data, expfit, fitqual

data, expfit, fitqual = fit_age_at_infection(model)

plt.hist(data, bins=a, density=True)
plt.plot(a, 1 / A * np.exp(-a / A), "-", lw=4)
plt.plot(a, expon.pdf(a, scale=expfit[1]), "k--")
plt.xlabel("Age at infection (years)")
plt.ylabel("Density")
plt.legend(
    [
        f"Expected exponential distribution - A = {A:.2f} years",
        f"Best fit age of infection, A = {expfit[1]:.2f} years",
        "Ages from simulation",
    ]
)
plt.show()
fitqual

Out[5]:

KstestResult(statistic=np.float64(0.003316425350068257), pvalue=np.float64(0.000590975775692332), statistic_location=np.float64(0.0410958904109589), statistic_sign=np.int8(1))

Second scientific test check¶

As noted above, because of the censoring that mortality induces, the distributions of age at infection vs. fraction susceptible at a given age will both follow exponential behavior but with a slight difference in the mean of the distribution - for the fraction susceptible, that value will be $\frac{1}{\mu (R_0-1)}$. We develop this second test below.

In [6]:

Copied!





A2 = 1 / ((R0 - 1) * mu) / 365


def Scurve(x, lam):
    return np.exp(-x / lam)


def fit_susceptibility_vs_age(model, cutpos=5000):
    cut = (model.people.state == State.SUSCEPTIBLE.value) & (model.people.dob > cutpos)
    bins = np.linspace(0, 15, 180)
    age_years = (np.max(model.people.dob) - model.people.dob) / 365
    n1 = np.histogram(age_years, bins=bins)[0]
    n2 = np.histogram(age_years[cut], bins=bins)[0]
    n1[n1 == 0] = 1
    y1 = n2 / n1
    popt, pcov = curve_fit(Scurve, bins[1:], y1, p0=2.5)
    mse = mean_squared_error(y1, Scurve(bins[1:], *popt))
    return y1, bins, popt, pcov, mse

data, bins, popt, pcov, mse = fit_susceptibility_vs_age(model)

# Plotting is a little more complicated here, want to plot at the bin centers and force y=1 at x=0
plt.plot([0, *bins[1:]], [1, *data], ".")
plt.plot([0, *bins[1:]], [1, *(Scurve(bins[1:], A2))], lw=4)
plt.plot([0, *bins[1:]], [1, *(Scurve(bins[1:], *popt))], "k--")
plt.legend(
    ["Fraction susceptible at age", f"Expected exponential distribution - A = {A2:.3f} years", f"Best fit mean age of susceptibility, A = {popt[0]:.3f} years"]
)
plt.xlabel("Age (years)")
plt.xlim(0, 15)
plt.ylabel("Fraction of individuals susceptible at age")
plt.show()
print("RMSE = ", np.sqrt(mse))
A2 = 1 / ((R0 - 1) * mu) / 365


def Scurve(x, lam):
    return np.exp(-x / lam)


def fit_susceptibility_vs_age(model, cutpos=5000):
    cut = (model.people.state == State.SUSCEPTIBLE.value) & (model.people.dob > cutpos)
    bins = np.linspace(0, 15, 180)
    age_years = (np.max(model.people.dob) - model.people.dob) / 365
    n1 = np.histogram(age_years, bins=bins)[0]
    n2 = np.histogram(age_years[cut], bins=bins)[0]
    n1[n1 == 0] = 1
    y1 = n2 / n1
    popt, pcov = curve_fit(Scurve, bins[1:], y1, p0=2.5)
    mse = mean_squared_error(y1, Scurve(bins[1:], *popt))
    return y1, bins, popt, pcov, mse

data, bins, popt, pcov, mse = fit_susceptibility_vs_age(model)

# Plotting is a little more complicated here, want to plot at the bin centers and force y=1 at x=0
plt.plot([0, *bins[1:]], [1, *data], ".")
plt.plot([0, *bins[1:]], [1, *(Scurve(bins[1:], A2))], lw=4)
plt.plot([0, *bins[1:]], [1, *(Scurve(bins[1:], *popt))], "k--")
plt.legend(
    ["Fraction susceptible at age", f"Expected exponential distribution - A = {A2:.3f} years", f"Best fit mean age of susceptibility, A = {popt[0]:.3f} years"]
)
plt.xlabel("Age (years)")
plt.xlim(0, 15)
plt.ylabel("Fraction of individuals susceptible at age")
plt.show()
print("RMSE = ", np.sqrt(mse))

RMSE =  0.00653116128761858

Third scientific test¶

As long as we are here, let's test that the age distribution of the population is correct. With constant birth rate, equal to death rate, the population should have exponentially distributed ages with parameter $\frac{1}{\mu}$

In [7]:

Copied!





# Test that population age distribution is correct
def fit_population_age_distribution(model, max_age_years=100):
    tick = model.params.nticks
    cut = model.people.dod > tick
    age_years = (tick - model.people.dob[cut]) / 365
    expfit = expon.fit(age_years[age_years <= max_age_years])
    fitqual = kstest(age_years[age_years <= max_age_years], expon.cdf, expfit)
    return age_years, expfit, fitqual


age_years, expfit, fitqual = fit_population_age_distribution(model)
mu = (1 + model.params.cbr / 1000) ** (1 / 365) - 1
A = 1 / mu / 365
a = np.arange(0, 60, 0.1)

plt.hist(age_years, bins=a, density=True)
plt.plot(a, 1 / A * np.exp(-a / A), "-", lw=4)
plt.plot(a, expon.pdf(a, scale=expfit[1]), "k--")
plt.xlabel("Population age distribution (years)")
plt.ylabel("Density")
plt.legend(
    [f"Expected exponential distribution - A = {A:.2f} years", f"Best fit age distribution, A = {expfit[1]:.2f} years", "Simulation output"]
)
plt.show()
fitqual
# Test that population age distribution is correct
def fit_population_age_distribution(model, max_age_years=100):
    tick = model.params.nticks
    cut = model.people.dod > tick
    age_years = (tick - model.people.dob[cut]) / 365
    expfit = expon.fit(age_years[age_years <= max_age_years])
    fitqual = kstest(age_years[age_years <= max_age_years], expon.cdf, expfit)
    return age_years, expfit, fitqual


age_years, expfit, fitqual = fit_population_age_distribution(model)
mu = (1 + model.params.cbr / 1000) ** (1 / 365) - 1
A = 1 / mu / 365
a = np.arange(0, 60, 0.1)

plt.hist(age_years, bins=a, density=True)
plt.plot(a, 1 / A * np.exp(-a / A), "-", lw=4)
plt.plot(a, expon.pdf(a, scale=expfit[1]), "k--")
plt.xlabel("Population age distribution (years)")
plt.ylabel("Density")
plt.legend(
    [f"Expected exponential distribution - A = {A:.2f} years", f"Best fit age distribution, A = {expfit[1]:.2f} years", "Simulation output"]
)
plt.show()
fitqual

Out[7]:

KstestResult(statistic=np.float64(0.004452976731299074), pvalue=np.float64(2.3525039838407847e-05), statistic_location=np.float64(6.16986301369863), statistic_sign=np.int8(-1))

Larger test suite¶

OK, so now we are going to replicate the above for many values of $R_0$ and cbr, as a scientific validity test.

We choose extremely high birth rates (7-10% per year) and $R_0$ (5-15), to give us systems that should equilibrate relatively quickly and prevent us from having to run for very long times.

In [8]:

Copied!





cbrs = 70 + 30 * np.random.rand(25)
R0s = 5 + 10 * np.random.rand(25)
pop = 1e5
initial_infected = 1
inf_mean = 90
output = pd.DataFrame(data={"cbr": cbrs, "R0": R0s})
output["Average_Iage_observed"] = np.nan
output["Average_Iage_expected"] = np.nan
output["Average_Sage_observed"] = np.nan
output["Average_Sage_expected"] = np.nan
output["Average_age_expected"] = np.nan
output["Average_age_expected"] = np.nan

for index, row in output.iterrows():
    parameters = PropertySet(
        {
            "seed": 2,
            "nticks": 365*50,
            "verbose": True,
            "beta": row["R0"] * (mu + 1 / inf_mean),
            "inf_mean": inf_mean,
            "cbr": row["cbr"],
            "importation_period": 180,
            "importation_count": 3,
        }
    )
    scenario = grid(M=1, N=1, population_fn=lambda x,y: pop, origin_x=0, origin_y=0)
    scenario["I"] = initial_infected
    scenario["R"] = np.round(pop * (1 / row["R0"] - 0.05)).astype(np.int32)
    scenario["S"] = scenario.population - scenario["I"] - scenario["R"]

    parameters = PropertySet(
        {"seed": index, "nticks": 365*50, "verbose": True, "beta": parameters.beta, "inf_mean": parameters.inf_mean, "cbr": parameters.cbr, "importation_period": 180, "importation_count": 3}
    )
    birthrate_map = ValuesMap.from_scalar(parameters.cbr, nticks=parameters.nticks, nnodes=len(scenario))
    model = Model(scenario, parameters, birthrates=birthrate_map)

    infdurdist = dists.exponential(scale=parameters.inf_mean)
    rate_const = 365 * ((1 + parameters.cbr / 1000) ** (1/365) - 1) #to get stable population, small correction factor for the annual age-dist/KM estimators.
    pyramid = AliasedDistribution(stable_age_dist:=np.array(1000*np.exp(-rate_const*np.arange(89))))
    survival = KaplanMeierEstimator(stable_age_dist.cumsum())

    model.components = [
        SIR.Susceptible(model),
        SIR.Recovered(model),
        SIR.Infectious(model, infdurdist),
        Importation(model, infdurdist),
        TransmissionWithDOI(model, infdurdist),
        BirthsByCBR(model, birthrate_map, pyramid=pyramid),
        MortalityByEstimator(model, estimator=survival),
    ]

    model.run(f"Running model {index+1} of {len(output)} with R0={row['R0']:.2f}, cbr={row['cbr']:.2f}")
    mu = (1 + model.params.cbr / 1000) ** (1 / 365) - 1
    _, expfit, _ = fit_age_at_infection(model, 365*40)

    output.loc[index, "Average_Iage_expected"] = 1 / (row["R0"] * mu) / 365
    output.loc[index, "Average_Iage_observed"] = expfit[1]

    # _, _, popt, _, _ = fit_susceptibility_vs_age(model, 365*20)
    # output.loc[index, "Average_Sage_expected"] = 1 / ((row["R0"] - 1) * mu) / 365
    # output.loc[index, "Average_Sage_observed"] = popt[0]

    # _, expfit, _ = fit_population_age_distribution(model)
    # output.loc[index, "Average_age_expected"] = 1 / mu / 365
    # output.loc[index, "Average_age_observed"] = expfit[1]
cbrs = 70 + 30 * np.random.rand(25)
R0s = 5 + 10 * np.random.rand(25)
pop = 1e5
initial_infected = 1
inf_mean = 90
output = pd.DataFrame(data={"cbr": cbrs, "R0": R0s})
output["Average_Iage_observed"] = np.nan
output["Average_Iage_expected"] = np.nan
output["Average_Sage_observed"] = np.nan
output["Average_Sage_expected"] = np.nan
output["Average_age_expected"] = np.nan
output["Average_age_expected"] = np.nan

for index, row in output.iterrows():
    parameters = PropertySet(
        {
            "seed": 2,
            "nticks": 365*50,
            "verbose": True,
            "beta": row["R0"] * (mu + 1 / inf_mean),
            "inf_mean": inf_mean,
            "cbr": row["cbr"],
            "importation_period": 180,
            "importation_count": 3,
        }
    )
    scenario = grid(M=1, N=1, population_fn=lambda x,y: pop, origin_x=0, origin_y=0)
    scenario["I"] = initial_infected
    scenario["R"] = np.round(pop * (1 / row["R0"] - 0.05)).astype(np.int32)
    scenario["S"] = scenario.population - scenario["I"] - scenario["R"]

    parameters = PropertySet(
        {"seed": index, "nticks": 365*50, "verbose": True, "beta": parameters.beta, "inf_mean": parameters.inf_mean, "cbr": parameters.cbr, "importation_period": 180, "importation_count": 3}
    )
    birthrate_map = ValuesMap.from_scalar(parameters.cbr, nticks=parameters.nticks, nnodes=len(scenario))
    model = Model(scenario, parameters, birthrates=birthrate_map)

    infdurdist = dists.exponential(scale=parameters.inf_mean)
    rate_const = 365 * ((1 + parameters.cbr / 1000) ** (1/365) - 1) #to get stable population, small correction factor for the annual age-dist/KM estimators.
    pyramid = AliasedDistribution(stable_age_dist:=np.array(1000*np.exp(-rate_const*np.arange(89))))
    survival = KaplanMeierEstimator(stable_age_dist.cumsum())

    model.components = [
        SIR.Susceptible(model),
        SIR.Recovered(model),
        SIR.Infectious(model, infdurdist),
        Importation(model, infdurdist),
        TransmissionWithDOI(model, infdurdist),
        BirthsByCBR(model, birthrate_map, pyramid=pyramid),
        MortalityByEstimator(model, estimator=survival),
    ]

    model.run(f"Running model {index+1} of {len(output)} with R0={row['R0']:.2f}, cbr={row['cbr']:.2f}")
    mu = (1 + model.params.cbr / 1000) ** (1 / 365) - 1
    _, expfit, _ = fit_age_at_infection(model, 365*40)

    output.loc[index, "Average_Iage_expected"] = 1 / (row["R0"] * mu) / 365
    output.loc[index, "Average_Iage_observed"] = expfit[1]

    # _, _, popt, _, _ = fit_susceptibility_vs_age(model, 365*20)
    # output.loc[index, "Average_Sage_expected"] = 1 / ((row["R0"] - 1) * mu) / 365
    # output.loc[index, "Average_Sage_observed"] = popt[0]

    # _, expfit, _ = fit_population_age_distribution(model)
    # output.loc[index, "Average_age_expected"] = 1 / mu / 365
    # output.loc[index, "Average_age_observed"] = expfit[1]

Running model 1 of 25 with R0=11.54, cbr=85.98: 100%|██████████| 18250/18250 [00:14<00:00, 1232.25it/s]
Running model 2 of 25 with R0=11.84, cbr=78.35: 100%|██████████| 18250/18250 [00:14<00:00, 1269.88it/s]
Running model 3 of 25 with R0=14.15, cbr=94.57: 100%|██████████| 18250/18250 [00:14<00:00, 1283.06it/s]
Running model 4 of 25 with R0=11.99, cbr=95.82: 100%|██████████| 18250/18250 [00:14<00:00, 1267.16it/s]
Running model 5 of 25 with R0=5.49, cbr=81.61: 100%|██████████| 18250/18250 [00:14<00:00, 1236.26it/s]
Running model 6 of 25 with R0=13.30, cbr=78.38: 100%|██████████| 18250/18250 [00:14<00:00, 1281.16it/s]
Running model 7 of 25 with R0=12.02, cbr=80.48: 100%|██████████| 18250/18250 [00:14<00:00, 1268.92it/s]
Running model 8 of 25 with R0=6.81, cbr=72.19: 100%|██████████| 18250/18250 [00:14<00:00, 1233.07it/s]
Running model 9 of 25 with R0=10.93, cbr=70.94: 100%|██████████| 18250/18250 [00:14<00:00, 1263.84it/s]
Running model 10 of 25 with R0=9.33, cbr=80.18: 100%|██████████| 18250/18250 [00:14<00:00, 1248.86it/s]
Running model 11 of 25 with R0=8.04, cbr=86.15: 100%|██████████| 18250/18250 [00:14<00:00, 1250.44it/s]
Running model 12 of 25 with R0=12.84, cbr=72.16: 100%|██████████| 18250/18250 [00:14<00:00, 1256.66it/s]
Running model 13 of 25 with R0=12.86, cbr=84.02: 100%|██████████| 18250/18250 [00:14<00:00, 1270.37it/s]
Running model 14 of 25 with R0=14.67, cbr=97.83: 100%|██████████| 18250/18250 [00:14<00:00, 1268.64it/s]
Running model 15 of 25 with R0=12.67, cbr=97.89: 100%|██████████| 18250/18250 [00:14<00:00, 1237.17it/s]
Running model 16 of 25 with R0=9.16, cbr=92.04: 100%|██████████| 18250/18250 [00:14<00:00, 1224.49it/s]
Running model 17 of 25 with R0=14.68, cbr=97.68: 100%|██████████| 18250/18250 [00:14<00:00, 1254.20it/s]
Running model 18 of 25 with R0=13.51, cbr=88.97: 100%|██████████| 18250/18250 [00:14<00:00, 1253.70it/s]
Running model 19 of 25 with R0=8.24, cbr=77.20: 100%|██████████| 18250/18250 [00:14<00:00, 1234.85it/s]
Running model 20 of 25 with R0=5.45, cbr=83.32: 100%|██████████| 18250/18250 [00:14<00:00, 1217.16it/s]
Running model 21 of 25 with R0=13.46, cbr=84.72: 100%|██████████| 18250/18250 [00:14<00:00, 1273.55it/s]
Running model 22 of 25 with R0=11.51, cbr=89.34: 100%|██████████| 18250/18250 [00:14<00:00, 1258.22it/s]
Running model 23 of 25 with R0=12.37, cbr=74.80: 100%|██████████| 18250/18250 [00:14<00:00, 1239.25it/s]
Running model 24 of 25 with R0=6.05, cbr=94.90: 100%|██████████| 18250/18250 [00:14<00:00, 1222.81it/s]
Running model 25 of 25 with R0=14.69, cbr=97.27: 100%|██████████| 18250/18250 [00:14<00:00, 1274.46it/s]

In [9]:

Copied!





# Plotting the expected and observed values
fig, axes = plt.subplots(3, 1, figsize=(10, 15))

# Plot for Average_Iage
axes[0].scatter(output["Average_Iage_expected"], output["Average_Iage_observed"], c="blue")
axes[0].plot(
    [output["Average_Iage_expected"].min(), output["Average_Iage_expected"].max()],
    [output["Average_Iage_expected"].min(), output["Average_Iage_expected"].max()],
    "r--",
)
axes[0].set_xlabel("Average_Iage_expected")
axes[0].set_ylabel("Average_Iage_observed")
axes[0].set_title("Average_Iage: Expected vs Observed")

# # Plot for Average_Sage
# axes[1].scatter(output["Average_Sage_expected"], output["Average_Sage_observed"], c="green")
# axes[1].plot(
#     [output["Average_Sage_expected"].min(), output["Average_Sage_expected"].max()],
#     [output["Average_Sage_expected"].min(), output["Average_Sage_expected"].max()],
#     "r--",
# )
# axes[1].set_xlabel("Average_Sage_expected")
# axes[1].set_ylabel("Average_Sage_observed")
# axes[1].set_title("Average_Sage: Expected vs Observed")

# # Plot for Average_age
# axes[2].scatter(output["Average_age_expected"], output["Average_age_observed"], c="purple")
# axes[2].plot(
#     [output["Average_age_expected"].min(), output["Average_age_expected"].max()],
#     [output["Average_age_expected"].min(), output["Average_age_expected"].max()],
#     "r--",
# )
# axes[2].set_xlabel("Average_age_expected")
# axes[2].set_ylabel("Average_age_observed")
# axes[2].set_title("Average_age: Expected vs Observed")

plt.tight_layout()
plt.show()

# Testing whether the expected and observed values are within 10% of each other
within_10_percent_Iage = np.abs(output["Average_Iage_expected"] - output["Average_Iage_observed"]) / output["Average_Iage_expected"] <= 0.1
#within_10_percent_Sage = np.abs(output["Average_Sage_expected"] - output["Average_Sage_observed"]) / output["Average_Sage_expected"] <= 0.1
#within_10_percent_age = np.abs(output["Average_age_expected"] - output["Average_age_observed"]) / output["Average_age_expected"] <= 0.1
# Calculate deviations
deviation_Iage = np.abs(output["Average_Iage_expected"] - output["Average_Iage_observed"]) / output["Average_Iage_expected"]
#deviation_Sage = np.abs(output["Average_Sage_expected"] - output["Average_Sage_observed"]) / output["Average_Sage_expected"]
#deviation_age = np.abs(output["Average_age_expected"] - output["Average_age_observed"]) / output["Average_age_expected"]

# Print average fractional deviation
print(f"Average fractional deviation for Iage: {deviation_Iage.mean()}")
#print(f"Average fractional deviation for Sage: {deviation_Sage.mean()}")
#print(f"Average fractional deviation for age: {deviation_age.mean()}")

# Print max deviation
print(f"Max deviation for Iage: {deviation_Iage.max()}")
#print(f"Max deviation for Sage: {deviation_Sage.max()}")
#print(f"Max deviation for age: {deviation_age.max()}")

# Print number of sims >5% and 10% away from expectation
print(f"Number of sims >5% away for Iage: {(deviation_Iage > 0.05).sum()}")
print(f"Number of sims >10% away for Iage: {(deviation_Iage > 0.1).sum()}")
#print(f"Number of sims >5% away for Sage: {(deviation_Sage > 0.05).sum()}")
#print(f"Number of sims >10% away for Sage: {(deviation_Sage > 0.1).sum()}")
#print(f"Number of sims >5% away for age: {(deviation_age > 0.05).sum()}")
#print(f"Number of sims >10% away for age: {(deviation_age > 0.1).sum()}")
# Plotting the expected and observed values
fig, axes = plt.subplots(3, 1, figsize=(10, 15))

# Plot for Average_Iage
axes[0].scatter(output["Average_Iage_expected"], output["Average_Iage_observed"], c="blue")
axes[0].plot(
    [output["Average_Iage_expected"].min(), output["Average_Iage_expected"].max()],
    [output["Average_Iage_expected"].min(), output["Average_Iage_expected"].max()],
    "r--",
)
axes[0].set_xlabel("Average_Iage_expected")
axes[0].set_ylabel("Average_Iage_observed")
axes[0].set_title("Average_Iage: Expected vs Observed")

# # Plot for Average_Sage
# axes[1].scatter(output["Average_Sage_expected"], output["Average_Sage_observed"], c="green")
# axes[1].plot(
#     [output["Average_Sage_expected"].min(), output["Average_Sage_expected"].max()],
#     [output["Average_Sage_expected"].min(), output["Average_Sage_expected"].max()],
#     "r--",
# )
# axes[1].set_xlabel("Average_Sage_expected")
# axes[1].set_ylabel("Average_Sage_observed")
# axes[1].set_title("Average_Sage: Expected vs Observed")

# # Plot for Average_age
# axes[2].scatter(output["Average_age_expected"], output["Average_age_observed"], c="purple")
# axes[2].plot(
#     [output["Average_age_expected"].min(), output["Average_age_expected"].max()],
#     [output["Average_age_expected"].min(), output["Average_age_expected"].max()],
#     "r--",
# )
# axes[2].set_xlabel("Average_age_expected")
# axes[2].set_ylabel("Average_age_observed")
# axes[2].set_title("Average_age: Expected vs Observed")

plt.tight_layout()
plt.show()

# Testing whether the expected and observed values are within 10% of each other
within_10_percent_Iage = np.abs(output["Average_Iage_expected"] - output["Average_Iage_observed"]) / output["Average_Iage_expected"] <= 0.1
#within_10_percent_Sage = np.abs(output["Average_Sage_expected"] - output["Average_Sage_observed"]) / output["Average_Sage_expected"] <= 0.1
#within_10_percent_age = np.abs(output["Average_age_expected"] - output["Average_age_observed"]) / output["Average_age_expected"] <= 0.1
# Calculate deviations
deviation_Iage = np.abs(output["Average_Iage_expected"] - output["Average_Iage_observed"]) / output["Average_Iage_expected"]
#deviation_Sage = np.abs(output["Average_Sage_expected"] - output["Average_Sage_observed"]) / output["Average_Sage_expected"]
#deviation_age = np.abs(output["Average_age_expected"] - output["Average_age_observed"]) / output["Average_age_expected"]

# Print average fractional deviation
print(f"Average fractional deviation for Iage: {deviation_Iage.mean()}")
#print(f"Average fractional deviation for Sage: {deviation_Sage.mean()}")
#print(f"Average fractional deviation for age: {deviation_age.mean()}")

# Print max deviation
print(f"Max deviation for Iage: {deviation_Iage.max()}")
#print(f"Max deviation for Sage: {deviation_Sage.max()}")
#print(f"Max deviation for age: {deviation_age.max()}")

# Print number of sims >5% and 10% away from expectation
print(f"Number of sims >5% away for Iage: {(deviation_Iage > 0.05).sum()}")
print(f"Number of sims >10% away for Iage: {(deviation_Iage > 0.1).sum()}")
#print(f"Number of sims >5% away for Sage: {(deviation_Sage > 0.05).sum()}")
#print(f"Number of sims >10% away for Sage: {(deviation_Sage > 0.1).sum()}")
#print(f"Number of sims >5% away for age: {(deviation_age > 0.05).sum()}")
#print(f"Number of sims >10% away for age: {(deviation_age > 0.1).sum()}")

Average fractional deviation for Iage: 0.005952376697369108
Max deviation for Iage: 0.013101594653640043
Number of sims >5% away for Iage: 0
Number of sims >10% away for Iage: 0

In [10]:

Copied!

model.params.beta*90
model.params.beta*90

Out[10]:

np.float64(15.019753725016873)

In [11]:

Copied!

output
output

Out[11]:

	cbr	R0	Average_Iage_observed	Average_Iage_expected	Average_Sage_observed	Average_Sage_expected	Average_age_expected
0	85.982348	11.541424	1.053437	1.050308	NaN	NaN	NaN
1	78.352124	11.835540	1.113214	1.119952	NaN	NaN	NaN
2	94.574587	14.153259	0.791026	0.781782	NaN	NaN	NaN
3	95.824595	11.987251	0.899588	0.911530	NaN	NaN	NaN
4	81.605498	5.485394	2.335996	2.323656	NaN	NaN	NaN
5	78.376489	13.295939	0.989437	0.996641	NaN	NaN	NaN
6	80.480542	12.017599	1.069197	1.074886	NaN	NaN	NaN
7	72.191892	6.809706	2.101361	2.106522	NaN	NaN	NaN
8	70.937054	10.929013	1.331440	1.334972	NaN	NaN	NaN
9	80.181433	9.330705	1.384899	1.389383	NaN	NaN	NaN
10	86.150877	8.040972	1.497542	1.504704	NaN	NaN	NaN
11	72.160409	12.836432	1.109104	1.117977	NaN	NaN	NaN
12	84.019793	12.855215	0.972663	0.964112	NaN	NaN	NaN
13	97.825868	14.665762	0.729589	0.730484	NaN	NaN	NaN
14	97.892352	12.669930	0.836221	0.845005	NaN	NaN	NaN
15	92.044757	9.155130	1.225405	1.240351	NaN	NaN	NaN
16	97.682221	14.679613	0.726888	0.730819	NaN	NaN	NaN
17	88.973760	13.513917	0.869642	0.868053	NaN	NaN	NaN
18	77.201035	8.243124	1.621558	1.631132	NaN	NaN	NaN
19	83.323379	5.454713	2.274522	2.290385	NaN	NaN	NaN
20	84.717467	13.458195	0.912957	0.913629	NaN	NaN	NaN
21	89.337468	11.513124	1.023889	1.014930	NaN	NaN	NaN
22	74.799681	12.366528	1.120772	1.120902	NaN	NaN	NaN
23	94.904526	6.048482	1.806037	1.823264	NaN	NaN	NaN
24	97.274629	14.691271	0.729959	0.733161	NaN	NaN	NaN