Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial support for saving and restoring simulation state. #280

Merged
merged 12 commits into from
Feb 27, 2024

Conversation

plietar
Copy link
Member

@plietar plietar commented Jan 22, 2024

The model can now be run for a given number of steps, have its state saved, and then restore it and run for some more time steps.

Parameters of the model may be modified when resuming, allowing a simulation to be forked with a single historical run and many futures, modelling different intervention scenarios.

There are some limitations as to which parameters may be modified. In general, the structure of the simulation must remain the same, with the same variables and same events. This means interventions cannot be enabled or disabled when resuming, only their parameterization can change. Additionally, the timing of intervention events should not be modified. These limitations may be relaxed in the future, if there is a need for it.

To avoid changing the existing API, this feature is available as a new run_resumable_simulation function. The function returns a pair of values, the existing dataframe with the simulation data and an object representing the internal simulation state. The function can be called a second time, passing the saved simulation state as an extra argument. See the test-resume.R file for usage of the new function.

The implementation builds on top of individual's new support for this. Individual already saves the state of its native objects, ie. variables and events. The malaria model keeps quite a bit of state outside of individual, which we need to save and restore explicitly. This is done by creating a set of "stateful objects", ie. R6 objects with a pair save_state and restore_state methods. This interface is implemented by every bit of the model that has state to capture:

  • LaggedValue objects store the short term EIR and FOIM history.
  • Solver objects represent the current state of ODE solvers.
  • Adult mosquito ODEs keep a history of incubating values which need to be restored.
  • CorrelationParameters stores a randomly sampled value. This needs to be saved to ensure the simulation is resumed with that same value.
  • In addition to R's native RNG (whose state is already saved by individual), the model uses the dqrng library, whose state needs saving.

Fixes #196

@plietar
Copy link
Member Author

plietar commented Jan 22, 2024

This depends on unreleased versions of both individual and of dqrng (including daqana/dqrng#75). We'll need to make/ask for releases of both of these.

@plietar plietar changed the base branch from master to dev January 22, 2024 15:24
@plietar plietar force-pushed the save-restore branch 2 times, most recently from 7cb5d4d to df84506 Compare January 22, 2024 15:30
@plietar
Copy link
Member Author

plietar commented Jan 23, 2024

Here is an example use case, from @pwinskill's use case in #196.

p_historical <- get_parameters(overrides = list(
  human_population = 1000, individual_mosquitoes = TRUE
)) |>
  set_drugs(
    drugs = list(SP_AQ_params)
  ) |>
  set_mda(
    drug = 1,
    # Because of how events are scheduled, we must set an MDA event now at
    # 365+100 otherwise it wouldn't happen in the second part.
    timesteps = c(100, 365+100),
    coverages =  c(0.5, 0.0),
    min_ages = c(0, 0),
    max_ages = c(80 * 365, 80*365)
  ) |>
  set_equilibrium(
    init_EIR = 5
  )

# Run the historical scenario up to the 1-Year mark
set.seed(123)
s_historical <- run_resumable_simulation(
  timesteps = 365,
  parameters = p_historical
)

plot(
  (s_historical$data$n_detect_730_3650 / s_historical$data$n_730_3650) ~ s_historical$data$timestep,
  t = "l",
  ylab = "Prevalence", xlab = "Time",
  ylim = c(0, 1), xlim = c(0, 365 * 2)
)
abline(v = 365, lty = 2)

p_future1 <- p_historical |> set_mda(
  drug = 1,
  timesteps = c(100, 365 + 100),
  coverages =  c(0.5, 0.1),
  min_ages = c(0, 0),
  max_ages = c(80 * 365, 80 * 365)
)

p_future2 <- p_historical |> set_mda(
  drug = 1,
  timesteps = c(100, 365 + 100),
  coverages =  c(0.5, 0.8),
  min_ages = c(0, 0),
  max_ages = c(80 * 365, 80 * 365)
)

# Run two future scenarios (t = 366:730).
# Restore the state from the end of the historical data.
s_future1 <- run_resumable_simulation(
  timesteps = 2*365,
  parameters = p_future1,
  initial_state = s_historical$state
)
lines(
  (s_future1$data$n_detect_730_3650 / s_future1$data$n_730_3650) ~ s_future1$data$timestep,
  col = adjustcolor("deeppink", alpha.f = 0.5))

s_future2 <- run_resumable_simulation(
  timesteps = 2*365,
  parameters = p_future2,
  initial_state = s_historical$state
)
lines(
  (s_future2$data$n_detect_730_3650 / s_future2$data$n_730_3650) ~ s_future1$data$timestep,
  col = adjustcolor("dodgerblue", alpha.f = 0.5))

# Sanity check, the resumed runs are equivalent to full ones
set.seed(123)
s_expected1 <- run_simulation(timesteps = 2*365,  parameters = p_future1)
set.seed(123)
s_expected2 <- run_simulation(timesteps = 2*365,  parameters = p_future2)
stopifnot(all.equal(s_future1$data, s_expected1[366:(2*365),]))
stopifnot(all.equal(s_future2$data, s_expected2[366:(2*365),]))

Copy link
Member

@giovannic giovannic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonderful. A few minor comments.

I'll make a minor release of individual as soon as I get a moment.

If you're happy to wait for daqana, then that's fine. If not, we could depend on an mrc-ide fork in the meantime. As long as you promise to point back to daqana when they update to CRAN.

}

//[[Rcpp::export]]
void timeseries_restore_state(Rcpp::XPtr<Timeseries> timeseries, Rcpp::List state) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking std::vector<double> timesteps, std::vector<double> values as arguments will give better Rcpp errors if anything goes wrong.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like doing that as it splits the layout of the state object across two different files. We'd be constructing it in the C++ file, but de-structuring it in R.

The errors I get from Rcpp as is seem fine:

timeseries_restore_state(ts, list(timestep=1, value=NULL))
#> Error: Not compatible with requested type: [type=NULL; target=double].
timeseries_restore_state(ts, list(timestep=1, value="xx"))
#> Error: Not compatible with requested type: [type=character; target=double].
timeseries_restore_state(ts, list(x=1))
#> Error: Index out of bounds: [index='timestep'].

The indexing and implicit conversion into an std::vector does the same checks Rcpp does when calling the function.

src/timeseries.cpp Outdated Show resolved Hide resolved
src/timeseries.cpp Outdated Show resolved Hide resolved
plietar and others added 9 commits February 22, 2024 15:16
The model can now be run for a given number of steps, have its state
saved, and then restore it and run for some more time steps.

Parameters of the model may be modified when resuming, allowing a
simulation to be forked with a single historical run and many futures,
modelling different intervention scenarios.

There are some limitations as to which parameters may be modified. In
general, the structure of the simulation must remain the same, with the
same variables and same events. This means interventions cannot be
enabled or disabled when resuming, only their parameterization can
change. Additionally, the timing of intervention events should not be
modified. These limitations may be relaxed in the future, if there is a
need for it.

To avoid changing the existing API, this feature is available as a new
`run_resumable_simulation` function. The function returns a pair of
values, the existing dataframe with the simulation data and an object
representing the internal simulation state. The function can be called a
second time, passing the saved simulation state as an extra argument.
See the `test-resume.R` file for usage of the new function.

The implementation builds on top of individual's new support for this.
Individual already saves the state of its native objects, ie. variables
and events. The malaria model keeps quite a bit of state outside of
individual, which we need to save and restore explicitly. This is done
by creating a set of "stateful objects", ie. R6 objects with a pair
`save_state` and `restore_state` methods. This interface is implemented
by every bit of the model that has state to capture:

- `LaggedValue` objects store the short term EIR and FOIM history.
- `Solver` objects represent the current state of ODE solvers.
- Adult mosquito ODEs keep a history of incubating values which need to
  be restored.
- `CorrelationParameters` stores a randomly sampled value. This needs to
  be saved to ensure the simulation is resumed with that same value.
- In addition to R's native RNG (whose state is already saved by
  individual), the model uses the dqrng library, whose state needs
  saving.
@plietar plietar force-pushed the save-restore branch 3 times, most recently from 3bfc68d to 3c5c736 Compare February 27, 2024 17:05
@plietar plietar merged commit c42e2c5 into dev Feb 27, 2024
4 checks passed
plietar added a commit that referenced this pull request May 22, 2024
The recent dqrng 0.4.0 release has a few small incompatible changes that
makes the build fail.

The changes necessary have been applied to the dev branch already as
part of larger pull requests #280 and #305. This commit cherry-picks
just the fixes for the master branch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ability to save and load state and simulation history
2 participants