# Herd immunity: how vaccines work for all of us

’Tis the season to think about getting your flu shot, so let’s talk a little bit about herd immunity. Vaccines are amazing because they work at two levels. They protect the individual – by stimulating your immune system, they make you more likely to resist the virus if and when you encounter it. But vaccines also protect the entire community, through the “herd immunity” effect. To give you the short version, the more vaccinated people in a group, the less likely we are to see an epidemic happen.

Want the long version? Buckle up, we’re about to write equations!

Pathogens are only really **dangerous at a large scale when they can spread
through a population**, which happens when an infected individual can create
new infection cases faster than they are “removed” from the pool of contagious
people. Removal from the pool can happen through a variety of mechanisms,
including the natural end of the contagious period, development of immunity
against the pathogen, and mortality.

A pathogen that would kill its host instantly, for example, would not be very good at starting an outbreak, because infected individuals would be removed before they get the chance to create new cases. On the other hand, a pathogen that is very good at evading the immune system could stay active within its host for a long time, creating more cases and triggering an epidemic.

Newly infected cases are created from a pool of susceptible individuals, *i.e.*
those who have no immune response to the pathogen, either because they have
never encountered it or because they encountered it a long time ago. For common
infections like the common cold (which is a family of about 200 strains of
pathogens with similar symptoms), the overwhelming majority of the population is
susceptible at all times. We mostly recover within a week, during which we create
new cases, and this is the story of why we get on average 2 to 6 common colds
(from different strains) every year.

So let’s write this up, and we will make the important simplifying assumption
that there is no natality, and no mortality. The population of infectious
individuals is $I$, and it can vary over time because of two mechanisms: the
infection (which *creates* new infected individuals), and the recovery (which
*removes* them). So we can write this as:

$$\dot I = I\left(a\times S - b\right)$$

$S$ is the number of susceptible individuals. The rates $a$ and $b$ represent
characteristics of the pathogen, namely the rate at which new infections happen
(for every contact between an individual of $I$ and one of $S$), and the rate at
which individuals recover. The $\dot I$ quantity is the *absolute change* in
the number of infectious individuals over a given time period. If we assume that
our time period is a day, then we can guesstimate the value of the parameters.
For example, if the infectious period lasts for four days, we can write this as
a daily *chance* of no longer being infectious of $b \approx 0.25$.

Reasoning about the infectious population this way makes it somewhat easier to
think about what happens in the others. The susceptible individuals can *lose*
part of their population to the infectious group, at rate $a$ (on every
contact):

$$\dot S = - a\times S \times I$$

This is an interesting formulation, because we can verify two things. First, if
we have no infectious individuals ($I = 0$), this whole term collapses to 0, and
the susceptible population remains constant – **outbreaks do not happen ex
nihilo, and they require some number of infectious individuals to start** (keep
this in mind, this is a crucial piece of information). The second thing we can
see from this expression is that $\dot S$ is always going to be negative, or
strictly equal to 0. Either the population of susceptible individuals remains
stable ($S = 0$, there are no susceptible individuals left; $I = 0$, there is no
infectious individual), or it is decreasing. Susceptible individuals will end up
*somewhere else*.

This *somewhere else* is the pool of recovered individuals (after going through
the infectious step). The other term we have not yet used from $\dot I$ is
$-b\times I$, which is the flux of individuals that are removed from the infectious
pool, into the recovered pool:

$$\dot R = b\times I$$
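To see the three equations work together, here is a minimal sketch in Python that steps them forward with a simple Euler scheme. The function names (`sir_step`, `simulate`), the time step, and the parameter values ($a = 0.0005$, $b = 0.25$, a population of 1000) are assumptions made for illustration, not measurements of any real pathogen.

```python
def sir_step(S, I, R, a, b, dt=1.0):
    """Advance the S, I, R counts by one Euler step of length dt (in days)."""
    new_infections = a * S * I * dt   # flux from S to I
    new_recoveries = b * I * dt       # flux from I to R
    return (S - new_infections,
            I + new_infections - new_recoveries,
            R + new_recoveries)


def simulate(S0, I0, R0_count, a, b, days, dt=0.1):
    """Return the list of (S, I, R) states, one entry per Euler step."""
    states = [(S0, I0, R0_count)]
    for _ in range(int(round(days / dt))):
        states.append(sir_step(*states[-1], a, b, dt))
    return states


# Illustrative values only: 999 susceptible, 1 infectious,
# and a four-day infectious period (b = 0.25 per day).
trajectory = simulate(S0=999.0, I0=1.0, R0_count=0.0,
                      a=0.0005, b=0.25, days=100)
S_end, I_end, R_end = trajectory[-1]
print(f"after 100 days: S={S_end:.1f}, I={I_end:.1f}, R={R_end:.1f}")
```

Because every individual leaving one pool enters another, $S + I + R$ stays constant throughout the run, which makes for a handy sanity check.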

Why all of this mathematical nonsense? In short, because it is one way to
determine whether an outbreak will persist. The way we represent the change in
the number of susceptible individuals ($\dot S = -a\times S\times I$), it is
clear that any number of infectious individuals larger than 0 will result, at
first, in a loss of susceptible individuals. Yet this does not guarantee that
the outbreak will occur, because the newly infectious individuals must *remain*
infectious long enough to keep this cycle going. In short, **the infection will
spread if infectious cases are produced faster than they recover**.

This question can be very slightly reframed as, how many new infectious cases can a single infectious individual create? If the answer to this question is “some quantity larger than one”, the disease will spread. If not, it won’t. The answer to this question is an important quantity in epidemiology, called $\mathcal{R}_0$, the basic reproduction number, pronounced “R nought”.

So of course, **no one agrees on how to measure it**. There are several methods
that do not always agree on the result, and that require different mathematical
tools or assumptions to apply. There is one way that works remarkably well for
our $SIR$ model, and it goes as follows:

$$\mathcal{R}_0 = \text{transmissibility}\times\text{rate of contact}\times\text{duration of infectiousness}$$

Luckily, we already know the duration of infectiousness, which is $b^{-1}$.
Transmissibility is the chance to see a successful infection of a susceptible
individual *per contact* (which is $a$), and the rate of contact is the number
of encounters between individuals from $I$ and $S$, *i.e.* $I\times S$. In
practice, this is done on a model describing the *proportion* of individuals
($i=I/N$, $s=S/N$, $r=R/N$, $N=S+I+R$), with the parameters scaled
appropriately. Taking this together, we are interested in the sign of
$i\times\left(a\times s - b\right)$, which is positive when

$$\frac{a}{b}\times s\times i > i$$

And of course, we should rewrite this as

$$\frac{a}{b}\times s> 1$$

But because we are really only concerned about the *very beginning* of the
outbreak, we can apply a neat little trick, and say that $N$ is somewhat large,
and $R = 0$, and $I$ is small enough that $S\approx N$ (that is, $s \approx 1$) is a good approximation.
This matters immensely because it removes the population size from the equation
– our pathogen can spread if, *and only if*,

$$\frac{a}{b} > 1$$

Thanks to our model, we have a criterion that will help us predict when the pathogen will spread, and this is $\mathcal{R}_0 = a/b$. If this quantity is greater than one (or “greater than unity”, to be all fancy about it), we have a problem.
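As a quick check of this criterion, the sketch below computes $a/b$ and the sign of $i\times(a\times s - b)$ at the very start of an outbreak, using the proportion-based model. The function names and the two parameter pairs are hypothetical, picked only so that one case falls on each side of 1.

```python
def basic_reproduction_number(a, b):
    """R0 = a / b for the proportion-based model, where s is close to 1 at the start."""
    return a / b


def initial_growth_rate(a, b, i0=0.001):
    """Value of di/dt = i * (a * s - b) at the very start, with s = 1 - i0 and r = 0."""
    s0 = 1.0 - i0
    return i0 * (a * s0 - b)


# Two made-up pathogens with the same recovery rate but different infection rates.
for a, b in [(0.5, 0.25), (0.2, 0.25)]:
    r0 = basic_reproduction_number(a, b)
    growth = initial_growth_rate(a, b)
    verdict = "spreads" if growth > 0 else "dies out"
    print(f"a={a}, b={b}: R0={r0:.2f}, the infection {verdict}")
```

The first pair gives a ratio above one and a positive growth rate; the second gives a ratio below one and a negative growth rate, so the number of infectious individuals shrinks from day one.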

But, you may ask, **what about vaccination**?

Good question. Vaccination is a *shortcut* that brings us from the susceptible to
the recovered step, *without* having to go through the infectious one. We can
tweak two of our equations (remove individuals from $S$, put them in $R$). But
we don’t need to!

Remember that we used a neat little trick to simplify the expression of $\mathcal{R}_0$? This is because in practice, what we derived was the expression $\mathcal{R}_0\times s$, where $s$ is the proportion of individuals that are susceptible. But what if we vaccinated a proportion, say $p$, of this population? Well, everyone who is not vaccinated is still susceptible at the very beginning of the outbreak, so $s \approx 1 - p$, and we can say that our epidemic will happen when

$$\mathcal{R}_0 \times (1 - p) > 1$$

In short, there is a value of $p$ (larger than 0, smaller than 1) for which we
can get this entire value to drop below 1, and the epidemic to stop spreading.
To find this value (the **herd immunity threshold**), we re-order things a
little bit, and we end up with

$$p^\star = 1 - \frac{1}{\mathcal{R}_0}$$
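For the record, the re-ordering starts from the boundary case, where the vaccinated proportion is just large enough to bring the left-hand side down to exactly 1:

$$\mathcal{R}_0 \times (1 - p^\star) = 1 \quad\Rightarrow\quad 1 - p^\star = \frac{1}{\mathcal{R}_0} \quad\Rightarrow\quad p^\star = 1 - \frac{1}{\mathcal{R}_0}$$

Any $p$ larger than $p^\star$ pushes $\mathcal{R}_0 \times (1 - p)$ below 1. This assumes $\mathcal{R}_0 > 1$; otherwise the pathogen does not spread in the first place and no vaccination is needed to stop it.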

And the beauty of it is that this works regardless of the expression of $\mathcal{R}_0$. Influenza viruses have an estimated $\mathcal{R}_0$ value between 2 and 3. To control an epidemic, we would need to vaccinate between one half and two thirds of the population.
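The arithmetic behind these two fractions is short enough to sketch in Python. The function names are made up for this example, and returning 0 when $\mathcal{R}_0 \le 1$ is an assumption added here to cover pathogens that would not spread anyway.

```python
def herd_immunity_threshold(r0):
    """Smallest vaccinated proportion p such that r0 * (1 - p) <= 1."""
    if r0 <= 1:
        return 0.0  # assumption: no vaccination needed if the pathogen cannot spread
    return 1.0 - 1.0 / r0


def effective_reproduction_number(r0, p):
    """r0 scaled by the proportion (1 - p) of the population left unvaccinated."""
    return r0 * (1.0 - p)


# The two ends of the influenza range quoted in the text.
for r0 in (2.0, 3.0):
    p_star = herd_immunity_threshold(r0)
    print(f"R0={r0}: vaccinate at least {p_star:.0%} of the population")
```

Plugging a coverage above the threshold into `effective_reproduction_number` gives a value below 1, and a coverage below the threshold gives a value above 1, which is the condition for the epidemic to spread.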

One of the greatest benefits of vaccination is not that *one, as an individual*
decreases the chance of becoming sick (though that is, admittedly, pretty cool),
but rather that **individual actions protect the entire group**, or the entire
herd. In fact, willingly unvaccinated people (as opposed to those who cannot be
vaccinated) benefit from the actions of others (while still representing a risk
for public health, since they can contaminate people around them). It’s a really
cool situation to see a solution that works, effectively, at two different
scales.