Solution_sir_jhelam.Rmd

---
title: "Solution: SIR model"
author: "Jhelam N. Deshpande"
date: "02/04/2023"
output: html_document
---
# Import and visualise the data set
```{r}
rm(list=ls()) # clear workspace
# import required libraries
library(rstan)
library(coda)
library(deSolve)
```
```{r}
#read data set
data<-read.csv("Data/epidemio.csv")
#view data set
head(data)
```
```{r}
#plot time series
plot(data$time,data$S,col="blue",xlim=c(0,200),ylim=c(0,20000),ylab="S or I or R",xlab="Time",pch=16)
points(data$time,data$I,col="red",pch=16)
points(data$time,data$R,col="green",pch=16)

plot(data$time,data$S+data$I+data$R,col="black",xlim=c(0,200),ylim=c(0,20000),ylab="S+I+R",xlab="Time",pch=16)
```


# Formulate the model

We fit the following system of ODE to the data. The $S(t)$ represent the dynamics of susceptibles, $I(t)$ of infected and $R(t)$ of recovered. The total N(t) is then $N(t)=S(t)+I(t)+R(t)$, notice from above plot that the remains constant. The following equation represents susceptible dynamics:

$$\frac{dS}{dt}=-\beta S\frac{I}{N}$$
And infected dynamics are given by:
$$\frac{dI}{dt}=\beta S\frac{I}{N} - \gamma R$$
Recovered dynamics are:
And infected dynamics are given by:
$$\frac{dR}{dt}= \gamma R$$
$\beta$ is transmission rate and $\gamma$ is the recovery rate. The solution to this ODE is denoted as $S(t)$,  $I(t)$ and $R(t)$. Notice that total density is also constant in this equation $N(t)=S(t)+I(t)+R(r)=N_O$ where $N_0$ is the initial number of individuals. The observed values as $S_{obs}(t)$, $I_{obs}(t)$ and $N_{0,obs}$. We do not need to specify recovered dynamics because they are obtained by $R(t)=N_0-S(t)-I(t)$. We write likelihoods as follows $S_{obs}(t)\sim Normal(S(t),\sigma)$, $I_{obs}(t)\sim Normal(I(t),\sigma)$ and $N_{0} \sim Normal(N_{total},\sigma)$. So we assume that the only source of error is observational error.

# Formatting data for model fitting
Here we will explore fitting for multiple replicates with this example.
```{r}
# keep only one replicate
#N and P are now 2d matrices rather than arrays
m=length(unique(data$replicate))
S=matrix(data$S,m,byrow=T)# time series of for spp1
I=matrix(data$I,m,byrow=T) #time series of spp2
Ntot=matrix(data$S+data$I+data$R,m,byrow=T)[,1]
t=matrix(data$time,m,byrow=T)[1,] # times
n=length(t) #size of data set

#rstan reads data as a named list
data_rstan=list(n=n,m=m,t=t,S=S,I=I,Ntot=Ntot)
```

# Translate to rstan
```{r}
model_competition_str='
  //write function for ode
  functions{
    real[] odemodel(real t, real[] N, real[] p, real[] x_r, int[] x_i){
      real dNdt[2];
      //p[1]=beta, p[2]=gamma, p[3]=Ntot
      dNdt[1]=-p[1]*N[1]*N[2]/p[3];
      dNdt[2]=p[1]*N[1]*N[2]/p[3]-p[2]*N[2];
      return dNdt;
    }
  }
  //data 
  data
  {
    //make sure the names are the same as the list in R
    int n;
    int m;
    real t[n];
    real S[m,n];
    real I[m,n];
    real Ntot[m];
  }
  //parameters that have to be estimated go here
  parameters
  {
    real<lower=0> beta;
    real<lower=0> gamma;
    real<lower=0> Sinit[m];
    real<lower=0> Iinit[m];
    real<lower=0> sigma;
  }
  //model
  model
  {
    real p[3]; //store parameters to pass to ode 
    real N_sim[n-1,2];  //store simulated values
    
    //write priors
    beta~lognormal(-0.5,1);
    gamma~lognormal(-0.5,1);
    sigma~gamma(2,0.1);

    for(j in 1:m)
    {
      Sinit[j]~normal(S[j,1],100);
      Iinit[j]~normal(I[j,1],1);
       //parameters for integrator
      p[1]=beta;
      p[2]=gamma;
      p[3]=Ntot[j];
      N_sim=integrate_ode_rk45(odemodel,{Sinit[j],Iinit[j]},t[1],t[2:n],p,rep_array(0.0,0),rep_array(0,0));
      S[j,1]~normal(Sinit[j],sigma);
      I[j,1]~normal(Iinit[j],sigma);
      for(i in 2:n)
      {
        S[j,i]~normal(N_sim[i-1,1],sigma);
        I[j,i]~normal(N_sim[i-1,2],sigma);
      }
    }
    }
  generated quantities{
  }
  '
```
Compile the model
```{r}
model=stan_model(model_code=model_competition_str,auto_write = TRUE)
```

# Fit model to data using MCMC

```{r}
#stan options
chains=3
#rstan_options(auto_write=TRUE)
options(mc.cores=chains)
iter=6000
warmup=4000

#initial value for sampling
init=rep(list(list(beta=0.1,gamma=0.1,Sinit=data_rstan$S[,1],Iinit=data_rstan$I[,1],sigma=10)),chains)
fit=sampling(model,data=data_rstan,iter=iter,warmup=warmup,chains=chains,init=init)
```


# Model diagnostics
```{r}
print(fit,digits=3)
```
```{r}
params=c("beta","gamma")
samples=As.mcmc.list(fit)
plot(samples[,params])
```
```{r}
pairs(fit,pars=params)
```

#Posterior predictions

We now solve the ODE for 1000 samples of parameter estimates.
```{r}
ode.model=function(t,N,p)
{
  beta=p$beta
  gamma=p$gamma
  Ntot=p$Ntot
  dS=-beta*N[1]*N[2]/Ntot
  dI=beta*N[1]*N[2]/Ntot-gamma*N[2]
  return(list(c(dS,dI)))
}

posteriors=as.matrix(fit) #posterior predictions


n_post=1000 # number of samples drawn
times=seq(min(data_rstan$t),max(data_rstan$t),length.out=50)
predictions=data.frame()

for(j in 1:m)
{  
for(k in 1:n_post)
{
    par=posteriors[sample(1:nrow(posteriors),1),]
    sim=ode(c(par[paste("S","init","[",j,"]",sep="")],par[paste("I","init","[",j,"]",sep="")]),times,ode.model,list(beta=par["beta"],gamma=par["gamma"],Ntot=data_rstan$Ntot[j]))
    temp=data.frame(sample_no=k,repl=j,time=sim[,1],S=sim[,2],I=sim[,3],R=data_rstan$Ntot[j]-sim[,2]-sim[,3])
    predictions=rbind(predictions,temp)
}
}
```

```{r}
#plot raw data
for(j in 1:m)
{
plot(data_rstan$t,data_rstan$S[j,],col="blue",pch=16,ylim=c(0,20000))
points(data_rstan$t,data_rstan$I[j,],col="red",pch=16,ylim=c(0,50))
points(data_rstan$t,data_rstan$Ntot[j]-data_rstan$I[j,]-data_rstan$S[j,],col="green",pch=16,ylim=c(0,50))
#plot posterior predictions
for(k in 1:n_post)
{
  lines(predictions$time[predictions$sample_no==k & predictions$repl==j],predictions$S[predictions$sample_no==k & predictions$repl==j],col=rgb(0,0,1,0.1))
  lines(predictions$time[predictions$sample_no==k & predictions$repl==j],predictions$I[predictions$sample_no==k & predictions$repl==j],col=rgb(1,0,0,0.1))
    lines(predictions$time[predictions$sample_no==k & predictions$repl==j],predictions$R[predictions$sample_no==k & predictions$repl==j],col=rgb(0,1,0,0.1))
}
}
```