diff --git a/moptipyapps/dynamic_control/objective.py b/moptipyapps/dynamic_control/objective.py index 61a3a2f3..5866815c 100644 --- a/moptipyapps/dynamic_control/objective.py +++ b/moptipyapps/dynamic_control/objective.py @@ -8,7 +8,7 @@ We offer two different approaches for this: -- :class:`FigureOfMerit` computes the arithmetic mean `z ` over the separate +- :class:`FigureOfMerit` computes the arithmetic mean `z` over the separate figures of merit of the training cases. - :class:`FigureOfMeritLE` tries to smooth out the impact of bad starting states by computing `exp(mean[log(z + 1)]) - 1`. diff --git a/moptipyapps/dynamic_control/ode.py b/moptipyapps/dynamic_control/ode.py index 0f74f6a1..b03ab590 100644 --- a/moptipyapps/dynamic_control/ode.py +++ b/moptipyapps/dynamic_control/ode.py @@ -550,7 +550,28 @@ def diff_from_ode(ode: np.ndarray, state_dim: int) \ works reasonably well, then we could essentially plug this model into :func:`run_ode` instead of the original `equations` parameter. - :param ode: the result of :func:`ode_run` + To compute the differential, this function basically "inverts" the + dynamic weighting done by :func:`run_ode`. :func:`run_ode` starts in a + given starting state `s`. It then computes the control vector + `c` as a function of `s`, i.e., `c(s)`. Then, the equations of the dynamic + system (see module :mod:`~moptipyapps.dynamic_control.system`) are used to + compute the state differential `D=ds/dt` as a function of `c(s)` and `s`, + i.e., as something like `D(s, c(s))`. The next step would be to update the + state, i.e., to set `s=s+D(s, c(s))`. Unfortunately, this can make `s` go + to infinity. So :func:`run_ode` computes a dynamic weight `w` and does + `s=s+w*D(s, c(s))` instead, where `w` is chosen such that the state vector + `s` does not grow unboundedly. While `s`, `c(s)`, and `w` are stored in one + row of the result matrix of :func:`run_ode`, `s+w*D(s,c(s))` is stored as state `s` in the next row.
This function thus + subtracts the old state from the next state and divides the result by `w` + to obtain `D(s, c(s))`. `s` and `c(s)` are already available directly in + the ODE result, and `w` is no longer needed. + + We thus get the rows `(s, c(s))` and `D(s, c(s))` in the first and second + result matrix, respectively. These can then be used to train a system model + as proposed in module :mod:`~moptipyapps.dynamic_control.system_model`. + + :param ode: the result of :func:`run_ode` :param state_dim: the state dimensions :returns: a tuple of the state+control vectors and the resulting state differential vectors diff --git a/moptipyapps/dynamic_control/system_model.py b/moptipyapps/dynamic_control/system_model.py index 01b4bd78..d2307469 100644 --- a/moptipyapps/dynamic_control/system_model.py +++ b/moptipyapps/dynamic_control/system_model.py @@ -6,8 +6,11 @@ controller output `control` for the state vector `state`. It will return the differential of the system state, i.e., `dstate/dT`. In other words, the constructed model can replace the `equations` parameter in -:func:`~moptipyapps.dynamic_control.ode.run_ode`. The idea is here to -re-use the same function models as used in controllers +:func:`~moptipyapps.dynamic_control.ode.run_ode`. The input used for training is provided by +:func:`~moptipyapps.dynamic_control.ode.diff_from_ode`. + +The idea here is to re-use the same function models as used in controllers (:mod:`~moptipyapps.dynamic_control.controller`), learn their parameterizations from the observed data, and wrap everything together into a callable.
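
Note on the `diff_from_ode` docstring above: the inversion it describes (`s_next = s + w*D(s, c(s))`, hence `D = (s_next - s)/w`) can be illustrated with a small sketch. This is NOT the actual implementation in `moptipyapps.dynamic_control.ode`; it assumes, purely for illustration, that each row of the ODE result matrix is laid out as `[state..., control..., weight]`, which may differ from the real row layout.

```python
import numpy as np


def diff_from_ode_sketch(ode: np.ndarray, state_dim: int,
                         control_dim: int) -> tuple[np.ndarray, np.ndarray]:
    """Recover state differentials from an ODE result matrix (sketch).

    Assumed (hypothetical) row layout: [state..., control..., weight].
    """
    s = ode[:, :state_dim]  # the states s, one per row
    c = ode[:, state_dim:state_dim + control_dim]  # the controls c(s)
    w = ode[:, state_dim + control_dim]  # the dynamic weights w
    # run_ode computed s[i+1] = s[i] + w[i] * D(s[i], c(s[i])),
    # so we invert this to get D = (s[i+1] - s[i]) / w[i]:
    d = (s[1:] - s[:-1]) / w[:-1, None]
    # pair each (s, c(s)) row with its differential D(s, c(s))
    return np.hstack((s[:-1], c[:-1])), d
```

The two returned matrices correspond to the `(s, c(s))` and `D(s, c(s))` rows described in the docstring and could serve as inputs and targets when fitting a system model.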