- Underlying maths and probability libraries (e.g. TensorFlow) should not be exposed
- Interaction with the GEM API is via NumPy data structures, strings, or Python constants
- High-level API is object-oriented, e.g.

```
gem_model = """
x: Vector()
alpha ~ Normal(0, 1)
beta ~ Normal(0, 1)
y ~ Normal(alpha + beta*x, 0.1)
"""
model = GEM(gem_model, const_data={'x': np.random.uniform(size=10),
                                   'y': np.random.uniform(size=10)})
mcmc = gem.mcmc.MCMC(model)
mcmc.add_step(["alpha"], algorithm="HMC", step_size=0.1, num_leapfrog_steps=20)
mcmc.add_step(["beta"], algorithm="Metropolis", adaptive=True, initial_scaling_constant=0.44)
samples, results = mcmc.sample(num_samples=1000,
                               start={'alpha': 1.0, 'beta': 0.0},
                               burnin=300,
                               chain=4,
                               num_steps_between_samples=9)
```

- Linear discrete-time meta-population epidemic model (chain binomial/multinomial)
- Hierarchical parameters
- Implement Metropolis-Hastings-based inference -- data-augmentation MCMC
- Forward simulation, with ability to condition on parameter values
- Ability to fit UK COVID-19 Lancaster model.
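One discrete-time step of the chain binomial dynamics listed above can be sketched as follows. This is a minimal single-population SIR illustration in NumPy; the function and parameter names are hypothetical and not part of the GEM API:

```python
import numpy as np

def chain_binomial_sir_step(S, I, R, beta, gamma, N, rng):
    """One discrete-time chain binomial step for a single-population SIR model.
    Per-susceptible infection probability: 1 - exp(-beta * I / N);
    per-infective removal probability: 1 - exp(-gamma)."""
    p_inf = 1.0 - np.exp(-beta * I / N)
    p_rem = 1.0 - np.exp(-gamma)
    new_infections = rng.binomial(S, p_inf)
    new_removals = rng.binomial(I, p_rem)
    return S - new_infections, I + new_infections - new_removals, R + new_removals

rng = np.random.default_rng(42)
S, I, R = 199, 1, 0
for _ in range(50):
    S, I, R = chain_binomial_sir_step(S, I, R, beta=0.4, gamma=0.14, N=200, rng=rng)
```

The meta-population version replaces the scalar states with per-population vectors and the binomial draws with multinomial draws over destination states.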

- Continuous time
- Automatic MCMC algorithm construction. User specifies update steps.
- No facility for adding user-specified functions in `gemlang`

- Branching state transition models
- Cyclic state transition models

We wish to create a Metropolis Hastings proposal mechanism that adapts the covariance of a multivariate Normal random variable to the emerging posterior covariance structure. We propose to achieve this by creating a subclass of `tfp.mcmc.TransitionKernel` that implements adaptive multi-site Metropolis Hastings as described in Roberts and Rosenthal (2008).

Let `theta` be a vector of parameters of length `d`. In a multisite Metropolis Hastings algorithm, we wish to draw (correlated) variates from the conditional log posterior `logp(theta | data)` by proposing from a multivariate Normal distribution, i.e.

- Propose `proposed_theta = MultivariateNormal(mean=theta, var=Sigma)`
- Calculate the log acceptance probability `alpha = min(logp(proposed_theta) - logp(current_theta), 0)`
- With probability `exp(alpha)` set `theta = proposed_theta`, else set `theta = current_theta`
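The three steps above can be sketched in NumPy as follows (a toy illustration with hypothetical names, not the GEM implementation; the proposal is symmetric, so the Hastings ratio reduces to the posterior ratio):

```python
import numpy as np

def mh_step(current_theta, logp, chol_sigma, rng):
    """One multisite Metropolis Hastings step with a multivariate Normal
    random-walk proposal. `chol_sigma` is the Cholesky factor of Sigma."""
    d = current_theta.shape[0]
    # Propose theta' ~ MultivariateNormal(mean=current_theta, cov=Sigma)
    proposed_theta = current_theta + chol_sigma @ rng.standard_normal(d)
    # Log acceptance probability
    alpha = min(logp(proposed_theta) - logp(current_theta), 0.0)
    # Accept with probability exp(alpha)
    if np.log(rng.uniform()) < alpha:
        return proposed_theta, True
    return current_theta, False

# Toy target: standard bivariate Normal log-density (up to a constant)
logp = lambda th: -0.5 * th @ th
rng = np.random.default_rng(1)
theta = np.zeros(2)
chol = np.linalg.cholesky(0.5 * np.eye(2))
for _ in range(100):
    theta, accepted = mh_step(theta, logp, chol, rng)
```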

Here, `Sigma` is the covariance matrix of the proposal, which according to Roberts and Rosenthal (2008) should be a scaled version of the evolving empirical posterior covariance matrix, i.e. `cov(theta_v)`, where `theta_v` is a `k x d` matrix of `k` draws from `theta | data`. Since `cov(theta_v)` converges as `k -> inf`, this scheme obeys "Diminishing Adaptation". Moreover, we should use a mixed proposal which occasionally (e.g. 5% of the time) draws from a multivariate Normal with fixed (i.e. non-adaptive) covariance so as to obey "Bounded Convergence" (i.e. to prevent the chain from collapsing onto a singularity, and thereby preserve ergodicity).

A suitable proposal distribution written in `tensorflow_probability` might be the `random_walk_mvnorm_fn` closure, which returns a function `fn(state_parts, seed)` that perturbs each state part with noise drawn from `MultivariateNormal(0, covariance)`:

```
import tensorflow as tf
import tensorflow_probability as tfp

DTYPE = tf.float64  # the model's floating point dtype


def random_walk_mvnorm_fn(covariance, p_u=0.95, name=None):
    """Returns a callable that adds multivariate Normal noise to the input.

    :param covariance: the covariance matrix of the mvnorm proposal
    :param p_u: the bounded convergence parameter. If equal to 1, all
        proposals are drawn from the MVN(0, covariance) distribution; if
        less than 1, proposals are drawn from MVN(0, covariance) with
        probability p_u, and MVN(0, 0.1^2 I_d / d) otherwise.
    :returns: a function implementing the proposal.
    """
    # Stabilise covariance matrix to ensure positive definiteness
    covariance = covariance + tf.eye(covariance.shape[0], dtype=DTYPE) * 1.0e-9
    scale_tril = tf.linalg.cholesky(covariance)
    rv_adaptive = tfp.distributions.MultivariateNormalTriL(
        loc=tf.zeros(covariance.shape[0], dtype=DTYPE), scale_tril=scale_tril
    )
    rv_fixed = tfp.distributions.Normal(
        loc=tf.zeros(covariance.shape[0], dtype=DTYPE),
        scale=0.01 / covariance.shape[0],
    )
    u = tfp.distributions.Bernoulli(probs=p_u)

    def _fn(state_parts, seed):
        with tf.name_scope(name or "random_walk_mvnorm_fn"):

            def proposal():
                # For parallel computation it turns out to be quicker to sample
                # both distributions and simply select the result we need
                rv = tf.stack([rv_fixed.sample(), rv_adaptive.sample()])
                uv = u.sample(seed=seed)
                return tf.gather(rv, uv)

            return [proposal() + state_part for state_part in state_parts]

    return _fn
```

A subclass of `tfp.mcmc.TransitionKernel` could draw from the above proposal mechanism, using a covariance matrix kept up to date with the latest sample from `theta`, stored in the `results` `NamedTuple` returned by the `tfp.mcmc.TransitionKernel.bootstrap_results` and `tfp.mcmc.TransitionKernel.one_step` methods, e.g.

```
class AdaptiveMultisiteMetropolisHastingsKernel(tfp.mcmc.TransitionKernel):
    def __init__(self, target_log_prob, seed=None, name=None):
        # ...
        pass

    def one_step(self, current_state, previous_results):
        # `previous_results` and `new_results` carry the empirical
        # variance-covariance matrix
        return new_state, new_results

    def bootstrap_results(self, current_state):
        # Return an initial covariance matrix
        # (e.g. tf.eye(current_state.shape[0]) * 0.1) in the results `NamedTuple`
        return results
```
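The empirical variance-covariance matrix carried in `results` need not be recomputed from all past draws at each step; `one_step` could apply a Welford-style recursive update instead. A NumPy sketch (names hypothetical):

```python
import numpy as np

def update_running_cov(mean, cov, k, new_theta):
    """Recursively update the running mean and the accumulated sum of
    outer products of deviations after observing draw k + 1. Dividing
    `cov` by (k - 1) yields the empirical covariance matrix."""
    k_new = k + 1
    delta = new_theta - mean
    mean_new = mean + delta / k_new
    cov_new = cov + np.outer(delta, new_theta - mean_new)
    return mean_new, cov_new, k_new

# Check against the batch estimate on simulated draws
rng = np.random.default_rng(3)
draws = rng.standard_normal((500, 2))
mean, cov, k = np.zeros(2), np.zeros((2, 2)), 0
for th in draws:
    mean, cov, k = update_running_cov(mean, cov, k, th)
empirical = cov / (k - 1)
```

This keeps the per-step cost at `O(d^2)` and means `results` only needs to hold the running mean, the accumulator, and the draw count.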

Currently, transition rates in GEM are expected to be expressed as per-individual rates. In the model below, we wish to model individuals possibly transitioning to a vaccinated state, instead of an infected state, within an SVIR model:

```
beta ~ Gamma(2, 10)
gamma ~ Gamma(1, 10)
rho = 0.25
delta ~ Gamma(1, 10)

Epidemic MyEpidemic() {
    S = State(init=199)
    V = State(init=0)
    I = State(init=1)
    R = State(init=0)
    [S -> I] = beta * I / 200
    [I -> R] = gamma
    [S -> V] = rho
    [V -> I] = delta * I / 200
}

epi ~ MyEpidemic()
```

@millerp points out that a more natural parameterisation of the `[S->V]` transition is for a constant number of individuals to be vaccinated per day, determined by the availability of vaccination resource. We *could* parameterise this as `[S->V] = rho / S`, which achieves a constant vaccination rate irrespective of the size of `S`. However, this is unstable as $S \rightarrow 0$. Instead, we could implement a guard condition which sets the rate to 0 if `S = 0`, for example:

`[S->V] = rho / S * (S > 0)`
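One caveat, assuming the guard expression is evaluated with ordinary IEEE floating point semantics (whether `gemlang` short-circuits the multiplication is an implementation detail): `rho / 0` gives `inf`, and `inf * 0` gives `nan`, not the intended 0. A select-style guard that never performs the division at `S = 0` avoids this; a NumPy sketch with a hypothetical function name:

```python
import numpy as np

# Naive form: at S = 0, rho / S = inf and inf * (S > 0) = inf * 0 = nan
# (warnings suppressed here to show the value that results):
with np.errstate(divide="ignore", invalid="ignore"):
    naive = 0.25 / np.float64(0.0) * (np.float64(0.0) > 0.0)

# Select-style guard: the division by zero is never evaluated
def vaccination_rate(rho, S):
    S = np.asarray(S, dtype=np.float64)
    return np.where(S > 0, rho / np.maximum(S, 1.0), 0.0)
```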

A re-write of the backend is in progress.

Currently, simulations stop automatically if $\sum_k \lambda_k(t) = 0$. However, we might like to stop a simulation early via an `end` parameter supplied to `GEM.sample()`.

Enable the user to supply a random seed value at the top level to ensure reproducibility.

This produces an error:

```
# The line below has a trailing space
alpha ~ Gamma(1, 1)
```

Fixed in 7354905a

**Chris Jewell** (9374e29a) at 19 Dec 14:48

**Chris Jewell** (7354905a) at 19 Dec 14:48

Merge branch 'trailing_ws_hotfix' into 'develop'

... and 2 more commits

**Chris Jewell** (9374e29a) at 19 Dec 14:44

Added test for trailing whitespace bug.

**Chris Jewell** (4841bdb4) at 19 Dec 14:38

Fixed bug in grammar where trailing whitespace on a gemlang stateme...

... and 12 more commits


**Chris Jewell** (c25297a1) at 12 Dec 07:36