Update to AIBECS v0.11 API (#12)
* Update to AIBECS v0.11 API

* Update documenter version

* Add ref to AIBECS in ReadMe
briochemc authored Dec 8, 2021
1 parent 6be396a commit 3c41182
Showing 8 changed files with 75 additions and 80 deletions.
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,7 +1,7 @@
name = "F1Method"
uuid = "d5605bae-6d76-11e9-2fd7-71bcf42edbaa"
authors = ["Benoit Pasquier <[email protected]>"]
version = "0.4.1"
version = "0.5.0"

[deps]
DiffEqBase = "2b5f629d-d688-5b77-993f-72d75c75574e"
21 changes: 14 additions & 7 deletions README.md
@@ -57,15 +57,15 @@ Optimizing the model is then simply done by minimizing `objective(p)`.
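
For instance, a minimal sketch of such an optimization with [Optim.jl](https://github.com/JuliaNLSolvers/Optim.jl) (not a dependency of this package; `p₀` is a hypothetical vector of starting parameter values, and `objective`, `gradient`, and `hessian` are the wrapped functions defined further down) could look like

```julia
using Optim

# In-place wrappers around the F-1 gradient (a 1×m row vector) and Hessian
g!(G, p) = G .= vec(gradient(p))
h!(H, p) = H .= hessian(p)

# Newton-type optimization that reuses the F-1 derivatives
results = optimize(objective, g!, h!, p₀, NewtonTrustRegion())
```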

The F-1 algorithm is **easy** to use, gives **accurate** results, and is computationally **fast**:

- **Easy** — The F-1 algorithm basically just needs the user to provide a solver (for finding the steady-state), the mismatch function, `f`, the state function, `F`, and their derivatives, `∇ₓf` and `∇ₓF` w.r.t. the state `x`.
(Note these derivatives can be computed numerically, via the [ForwardDiff](https://github.com/JuliaDiff/ForwardDiff.jl) package for example.)
- **Easy** — The F-1 algorithm essentially only needs the user to provide a solver (for finding the steady-state), the mismatch function `f`, an `ODEFunction` `F` with its Jacobian, and the gradient of `f` w.r.t. the state, `∇ₓf` (see the sketch just after this list).
(Note these derivatives can be computed numerically, via the [ForwardDiff](https://github.com/JuliaDiff/ForwardDiff.jl) package for example.)
- **Accurate** — Thanks to ForwardDiff's nested dual-numbers implementation, the accuracy of the gradient and Hessian, as computed by the F-1 algorithm, is close to machine precision.
- **Fast** — The F-1 algorithm is as fast as if you had derived analytical formulas for every first and second derivative *and* used them in the most efficient way.
This is because the bottleneck of such computations is the number of matrix factorizations, and the F-1 algorithm requires only a single one. In comparison, standard autodifferentiation methods that treat the steady-state solver as a black box would require on the order of `m` or `m^2` factorizations, where `m` is the number of parameters.
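
For example, the pieces mentioned in the **Easy** bullet above could be supplied as in the following sketch (the two-box model and quadratic mismatch are made up for illustration; `ODEFunction` comes from the SciML/DiffEqBase ecosystem):

```julia
using DiffEqBase, ForwardDiff

# A made-up two-box model: the state function and its Jacobian w.r.t. the state x
statefun(x, p) = [-p[1] * x[1] + x[2], x[1] - p[2] * x[2]]
F = ODEFunction(statefun, jac = (x, p) -> ForwardDiff.jacobian(x -> statefun(x, p), x))

# A made-up quadratic mismatch and its derivative w.r.t. the state x (a row vector)
f(x, p) = sum(abs2, x) / 2
∇ₓf(x, p) = ForwardDiff.jacobian(x -> [f(x, p)], x)
```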

## What's needed?

A requirement of the F-1 algorithm is that the Jacobian matrix `A = ∇ₓf` can be created, stored, and factorized.
A requirement of the F-1 algorithm is that the Jacobian matrix `A = ∇ₓF` can be created, stored, and factorized.
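
In practice, this simply means that code like the following sketch (where `x` and `p` stand for hypothetical state and parameter vectors) must run and fit in memory:

```julia
using LinearAlgebra

A = F.jac(x, p)          # the Jacobian ∇ₓF evaluated at (x, p)
Afactors = factorize(A)  # the single factorization that the F-1 method reuses
```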

To use the F-1 algorithm, the user must:

@@ -81,16 +81,16 @@ Once initial values for the state, `x`, and parameters, `p`, are chosen, simply

```julia
# Initialize the cache for storing reusable objects
mem = initialize_mem(F, ∇ₓf, ∇ₓF, x, p, alg; options...)
mem = initialize_mem(F, ∇ₓf, x, p, alg; options...)
```

wrap the functions into functions of `p` only via

```julia
# Wrap the objective, gradient, and Hessian functions
objective(p) = F1Method.objective(f, F, ∇ₓF, mem, p, alg; options...)
gradient(p) = F1Method.gradient(f, F, ∇ₓf, ∇ₓF, mem, p, alg; options...)
hessian(p) = F1Method.hessian(f, F, ∇ₓf, ∇ₓF, mem, p, alg; options...)
objective(p) = F1Method.objective(f, F, mem, p, alg; options...)
gradient(p) = F1Method.gradient(f, F, ∇ₓf, mem, p, alg; options...)
hessian(p) = F1Method.hessian(f, F, ∇ₓf, mem, p, alg; options...)
```

and compute the objective, gradient, or Hessian via either of
@@ -112,3 +112,10 @@ Now you can test how fast and accurate it is!
If you use this package, or implement your own package based on the F-1 algorithm, please cite us.
If you use the F-1 algorithm, please cite *[Pasquier and Primeau](https://www.bpasquier.com/publication/pasquier_primeau_sisc_2019/)* (in prep.).
If you also use this package directly, please cite it! (Use [the Zenodo link](https://doi.org/10.5281/zenodo.2667835) or the [CITATION.bib file](./CITATION.bib), which contains a bibtex entry.)

# Future

This package is developed mainly for use with [AIBECS.jl](https://github.com/JuliaOcean/AIBECS.jl) and is likely not in its final form.
The API was just changed in v0.5 (to match the API changes in AIBECS.jl v0.11).
That being said, ultimately, it would make sense for the shortcuts used here to be integrated into a package like ChainRules.jl.
For the time being, AIBECS users can use F1Method.jl to speed up their optimizations.
2 changes: 1 addition & 1 deletion docs/Project.toml
@@ -4,4 +4,4 @@ Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"

[compat]
Documenter = "~0.25"
Documenter = "~0.27"
38 changes: 17 additions & 21 deletions docs/src/index.md
@@ -7,7 +7,7 @@ This package provides an efficient tool to compute gradient and Hessian matrix o
When using Newton-type algorithms for optimization, computing the gradient and Hessian can be computationally expensive.
A typical scientific application is to optimize the parameters of a model which solves for a root through another iterative Newton-like algorithm.
In this case, there are a number of shortcuts that can be leveraged.
This was the motivation for the work of [Pasquier et al. (2019, in preparation)](), and for this package.
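
The main shortcut can be summarized by the following formula for the gradient (a sketch based on the implicit-function theorem; the Hessian is obtained with a similar trick), where ``\hat{f}(\boldsymbol{p}) = f(\boldsymbol{s}(\boldsymbol{p}), \boldsymbol{p})`` and ``\boldsymbol{s}(\boldsymbol{p})`` denotes the steady-state solution:

```math
\nabla \hat{f}(\boldsymbol{p}) = \nabla_{\boldsymbol{x}} f \, \nabla \boldsymbol{s} + \nabla_{\boldsymbol{p}} f,
\qquad
\nabla \boldsymbol{s} = -\left[\nabla_{\boldsymbol{x}} \boldsymbol{F}(\boldsymbol{s},\boldsymbol{p})\right]^{-1} \nabla_{\boldsymbol{p}} \boldsymbol{F}(\boldsymbol{s},\boldsymbol{p}),
```

so that a single factorization of the steady-state Jacobian ``\nabla_{\boldsymbol{x}} \boldsymbol{F}`` suffices to assemble the whole gradient.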


## Usage

@@ -29,17 +29,15 @@ As an example, we use a simple model with only two state variables and two parameters

```jldoctest usage
# State function F
F(x,p) = [
statefun(x,p) = [
-2 * (p[1] - x[1]) - 4 * p[2] * (x[2] - x[1]^2) * x[1]
p[2] * (x[2] - x[1]^2)
]
# Jacobian of F wrt p
∇ₓF(x,p) = ForwardDiff.jacobian(x -> F(x,p), x)
F = ODEFunction(statefun, jac = (x,p) -> ForwardDiff.jacobian(x -> statefun(x, p), x))
# output
∇ₓF (generic function with 1 method)
(::ODEFunction{false, typeof(statefun), UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}) (generic function with 7 methods)
```

We also define a cost function `f(x,p)` (that we wish to minimize under the constraint that ``\boldsymbol{F}(\boldsymbol{x},\boldsymbol{p}) = 0``).
@@ -65,7 +63,7 @@ f(x,p) = state_mismatch(x) + parameter_mismatch(p)
```

Once these are set up, we need to let the F-1 method know how to solve for the steady-state.
We do this by using the [DiffEqBase](https://github.com/JuliaDiffEq/DiffEqBase.jl) API.
We do this by using the [DiffEqBase](https://github.com/SciML/DiffEqBase.jl) API.
For that, we first write a small Newton solver algorithm and overload the `solve` function from DiffEqBase.

```jldoctest usage
@@ -85,25 +83,17 @@ function DiffEqBase.solve(prob::DiffEqBase.AbstractSteadyStateProblem,
Ftol=1e-10)
# Define the functions according to DiffEqBase.SteadyStateProblem type
p = prob.p
t = 0
x0 = copy(prob.u0)
dx, df = copy(x0), copy(x0)
F(x) = prob.f(dx, x, p, t)
∇ₓF(x) = prob.f(df, dx, x, p, t)
F(x) = prob.f(x, p)
∇ₓF(x) = prob.f.jac(x, p)
# Compute `u_steady` and `resid` as per DiffEqBase using my algorithm
x_steady = newton_solve(F, ∇ₓF, x0, Ftol=Ftol)
resid = F(x_steady)
# Return the common DiffEqBase solution type
DiffEqBase.build_solution(prob, alg, x_steady, resid; retcode=:Success)
end
# Overload DiffEqBase's SteadyStateProblem constructor
function DiffEqBase.SteadyStateProblem(F, ∇ₓF, x, p)
f(dx, x, p, t) = F(x, p)
f(df, dx, x, p, t) = ∇ₓF(x, p)
return DiffEqBase.SteadyStateProblem(f, x, p)
end
# output
@@ -123,11 +113,11 @@ Finally, we wrap the objective, gradient, and Hessian functions defined by the F

```jldoctest usage
# Initialize the cache for storing reusable objects
mem = F1Method.initialize_mem(F, ∇ₓf, ∇ₓF, x₀, p₀, MyAlg())
mem = F1Method.initialize_mem(F, ∇ₓf, x₀, p₀, MyAlg())
# Define the functions via the F1 method
F1_objective(p) = F1Method.objective(f, F, ∇ₓF, mem, p, MyAlg())
F1_gradient(p) = F1Method.gradient(f, F, ∇ₓf, ∇ₓF, mem, p, MyAlg())
F1_Hessian(p) = F1Method.hessian(f, F, ∇ₓf, ∇ₓF, mem, p, MyAlg())
F1_objective(p) = F1Method.objective(f, F, mem, p, MyAlg())
F1_gradient(p) = F1Method.gradient(f, F, ∇ₓf, mem, p, MyAlg())
F1_Hessian(p) = F1Method.hessian(f, F, ∇ₓf, mem, p, MyAlg())
# output
@@ -164,3 +154,9 @@ F1_Hessian(p₀)
0.0 -0.0241434
```

# Future

This package is likely not in its final form.
The API was just changed in v0.5 (to match the API changes in AIBECS.jl v0.11).
That being said, ultimately, it would make sense for the shortcuts used here to be integrated into a package like ChainRules.jl.
For the time being, AIBECS users can use F1Method.jl to speed up their optimizations.
38 changes: 19 additions & 19 deletions src/F1Method.jl
@@ -31,60 +31,60 @@ mutable struct Mem{Ts, TA, T∇s, T∇ₓf, Tp}
end


function update_mem!(f, F, ∇ₓf, ∇ₓF, mem, p, alg; options...)
function update_mem!(f, F, ∇ₓf, mem, p, alg; options...)
if p ≠ mem.p # only update mem if 𝒑 has changed
update_solution!(F, ∇ₓF, mem, p, alg; options...)
update_solution!(F, mem, p, alg; options...)
∇ₚF = ForwardDiff.jacobian(p -> F(mem.s, p), p)
mem.A = factorize(∇ₓF(mem.s, p)) # update factors of ∇ₓ𝑭(𝒔,𝒑)
mem.A = factorize(F.jac(mem.s, p)) # update factors of ∇ₓ𝑭(𝒔,𝒑)
mem.∇s .= mem.A \ -∇ₚF # update ∇𝒔
mem.∇ₓf .= ∇ₓf(mem.s, p) # update ∇ₓ𝑓(𝒔,𝒑)
mem.p .= p # update 𝒑 for the variables above
end
end

function update_solution!(F, ∇ₓF, mem, p, alg; options...)
function update_solution!(F, mem, p, alg; options...)
if p ≠ mem.psol
prob = SteadyStateProblem(F, ∇ₓF, mem.s, p) # define problem
prob = SteadyStateProblem(F, mem.s, p) # define problem
mem.s .= solve(prob, alg; options...).u # update 𝒔
mem.psol .= p # update 𝒑 for 𝒔
end
end

"""
objective(f, F, ∇ₓF, mem, p, alg; options...)
objective(f, F, mem, p, alg; options...)
Returns `f(x,p)` such that `F(x,p)=0` using the F-1 method.
Specifically, `objective(f, F, ∇ₓF, mem, p, alg; options...)`
Specifically, `objective(f, F, mem, p, alg; options...)`
evaluates the objective function defined by `f̂(p) = f(s(p),p)`, where
`s(p)`, which is the steady-state solution (i.e., such that `F(s(p),p)=0`)
is computed by the iterative Newton-type solver `alg`.
The Jacobian, `∇ₓF`, and the memory cache `mem` must be supplied.
The memory cache `mem` must be supplied.
"""
function objective(f, F, ∇ₓF, mem, p, alg; options...)
update_solution!(F, ∇ₓF, mem, p, alg; options...)
function objective(f, F, mem, p, alg; options...)
update_solution!(F, mem, p, alg; options...)
return f(mem.s, p)
end

"""
gradient(f, F, ∇ₓf, ∇ₓF, mem, p, alg; options...)
gradient(f, F, ∇ₓf, mem, p, alg; options...)
Returns the gradient of the `objective` function using the F-1 method.
"""
function gradient(f, F, ∇ₓf, ∇ₓF, mem, p, alg; options...)
update_mem!(f, F, ∇ₓf, ∇ₓF, mem, p, alg; options...)
function gradient(f, F, ∇ₓf, mem, p, alg; options...)
update_mem!(f, F, ∇ₓf, mem, p, alg; options...)
s, ∇s, m = mem.s, mem.∇s, length(p)
∇ₚf = ForwardDiff.jacobian(p -> [f(s,p)], p)
return mem.∇ₓf * ∇s + ∇ₚf
end

"""
hessian(f, F, ∇ₓf, ∇ₓF, mem, p, alg; options...)
hessian(f, F, ∇ₓf, mem, p, alg; options...)
Returns the Hessian of the `objective` function using the F-1 method.
"""
function hessian(f, F, ∇ₓf, ∇ₓF, mem, p, alg; options...)
update_mem!(f, F, ∇ₓf, ∇ₓF, mem, p, alg; options...)
function hessian(f, F, ∇ₓf, mem, p, alg; options...)
update_mem!(f, F, ∇ₓf, mem, p, alg; options...)
s, A, ∇s, m = mem.s, mem.A, mem.∇s, length(p)
A⁻ᵀ∇ₓfᵀ = vec(A' \ mem.∇ₓf') # independent of (𝑗,𝑘)
H(λ) = f(s+∇s*λ, p+λ) - F(s+∇s*λ, p+λ)' * A⁻ᵀ∇ₓfᵀ
@@ -96,13 +96,13 @@ end
Initializes the memory cache for the F-1 method.
"""
function initialize_mem(F, ∇ₓf, ∇ₓF, x, p, alg; options...)
function initialize_mem(F, ∇ₓf, x, p, alg; options...)
x = copy(x)
p = copy(p)
psol = copy(p)
prob = SteadyStateProblem(F, ∇ₓF, x, p)
prob = SteadyStateProblem(F, x, p)
s = solve(prob, alg; options...).u
A = factorize(∇ₓF(s, p))
A = factorize(F.jac(s, p))
∇ₚF = ForwardDiff.jacobian(p -> F(s, p), p)
return Mem(s, A, A \ -∇ₚF, ∇ₓf(s, p), p, psol)
end
12 changes: 6 additions & 6 deletions test/AIBECS_test.jl
@@ -78,10 +78,10 @@ prior(::T) where {T<:AbstractParameters} = prior(T)
p = PmodelParameters()
λ = p2λ(p)
nb = sum(iswet(grd))
F, ∇ₓF = F_and_∇ₓF((T_DIP, T_POP), (G_DIP, G_POP), nb, PmodelParameters)
F = AIBECSFunction((T_DIP, T_POP), (G_DIP, G_POP), nb, PmodelParameters)
@unpack DIP_geo = p
x = DIP_geo * ones(2nb) # initial guess
prob = SteadyStateProblem(F, ∇ₓF, x, p)
prob = SteadyStateProblem(F, x, p)
sol = solve(prob, CTKAlg())

# Observations from World Ocean Atlas used to evaluate the
@@ -100,10 +100,10 @@ f, ∇ₓf = f_and_∇ₓf(ωs, ωp, grd, modify, obs, PmodelParameters)

# Now we apply the F1 method
τ = ustrip(u"s", 1e3u"Myr")
mem = F1Method.initialize_mem(F, ∇ₓf, ∇ₓF, x, λ, CTKAlg(), τstop=τ)
objective(λ) = F1Method.objective(f, F, ∇ₓF, mem, λ, CTKAlg(), τstop=τ)
gradient(λ) = F1Method.gradient(f, F, ∇ₓf, ∇ₓF, mem, λ, CTKAlg(), τstop=τ)
hessian(λ) = F1Method.hessian(f, F, ∇ₓf, ∇ₓF, mem, λ, CTKAlg(), τstop=τ)
mem = F1Method.initialize_mem(F, ∇ₓf, x, λ, CTKAlg(), τstop=τ)
objective(λ) = F1Method.objective(f, F, mem, λ, CTKAlg(), τstop=τ)
gradient(λ) = F1Method.gradient(f, F, ∇ₓf, mem, λ, CTKAlg(), τstop=τ)
hessian(λ) = F1Method.hessian(f, F, ∇ₓf, mem, λ, CTKAlg(), τstop=τ)

# Finally we test the result with the "reliable" FiniteDiff :)
@test FiniteDiff.finite_difference_gradient(objective, 2λ) ≈ gradient(2λ)' rtol=1e-3
16 changes: 8 additions & 8 deletions test/diffusion_problem.jl
@@ -52,11 +52,11 @@ using Test, FormulaOneMethod
∇sink(x,p) = p[4] * I / n

# Define state function F(x,p) and Jacobian ∇ₓF(x,p)
F(x,p) = -T(p) * x + source(x,p) - sink(x,p)
∇ₓF(x,p) = -T(p) + ∇source(x,p) - ∇sink(x,p)
F = ODEFunction((x,p) -> -T(p) * x + source(x,p) - sink(x,p),
jac = (x,p) -> -T(p) + ∇source(x,p) - ∇sink(x,p))

# Basic Newton solver
function newton_solve(F, ∇ₓF, x; Ftol=1e-10)
function newton_solve(F, x; Ftol=1e-10)
while norm(F(x)) ≥ Ftol
x .-= ∇ₓF(x) \ F(x)
end
@@ -78,14 +78,14 @@ using Test, FormulaOneMethod
F(x) = prob.f(dx, x, p, t)
∇ₓF(x) = prob.f(df, dx, x, p, t)
# Compute `u_steady` and `resid` as per DiffEqBase using my algorithm
x_steady = newton_solve(F, ∇ₓF, x0, Ftol=Ftol)
x_steady = newton_solve(F, x0, Ftol=Ftol)
resid = F(x_steady)
# Return the common DiffEqBase solution type
DiffEqBase.build_solution(prob, alg, x_steady, resid; retcode=:Success)
end

# Overload DiffEqBase's SteadyStateProblem constructor
function DiffEqBase.SteadyStateProblem(F, ∇ₓF, x, p)
function DiffEqBase.SteadyStateProblem(F, x, p)
f(dx, x, p, t) = F(x, p)
f(df, dx, x, p, t) = ∇ₓF(x, p)
return DiffEqBase.SteadyStateProblem(f, x, p)
@@ -114,15 +114,15 @@ using Test, FormulaOneMethod
mem = F1.initialize_mem(x₀, p₀)

# Compute the objective function, 𝑓̂(𝒑)
objective(p) = F1.f̂(f, F, ∇ₓF, mem, p, MyAlg())
objective(p) = F1.f̂(f, F, mem, p, MyAlg())
objective(p₀)

# Compute the gradient, ∇𝑓̂(𝒑)
gradient(p) = F1.∇f̂(f, F, ∇ₓf, ∇ₓF, mem, p, MyAlg())
gradient(p) = F1.∇f̂(f, F, ∇ₓf, mem, p, MyAlg())
gradient(p₀)

# Compute the Hessian matrix, ∇²𝑓̂(𝒑)
Hessian(p) = F1.∇²f̂(f, F, ∇ₓf, ∇ₓF, mem, p, MyAlg())
Hessian(p) = F1.∇²f̂(f, F, ∇ₓf, mem, p, MyAlg())
Hessian(p₀)

#end
26 changes: 9 additions & 17 deletions test/rosenbrock.jl
@@ -30,24 +30,18 @@ function DiffEqBase.solve(prob::DiffEqBase.AbstractSteadyStateProblem,
Ftol=1e-10)
# Define the functions according to DiffEqBase.SteadyStateProblem type
p = prob.p
t = 0
x0 = copy(prob.u0)
dx, df = copy(x0), copy(x0)
F(x) = prob.f(dx, x, p, t)
∇ₓF(x) = prob.f(df, dx, x, p, t)
F(x) = prob.f(x, p)
∇ₓF(x) = prob.f.jac(x, p)
# Compute `u_steady` and `resid` as per DiffEqBase using my algorithm
x_steady = newton_solve(F, ∇ₓF, x0, Ftol=Ftol)
resid = F(x_steady)
# Return the common DiffEqBase solution type
DiffEqBase.build_solution(prob, alg, x_steady, resid; retcode=:Success)
end

# Overload DiffEqBase's SteadyStateProblem constructor
function DiffEqBase.SteadyStateProblem(F, ∇ₓF, x, p)
f(dx, x, p, t) = F(x, p)
f(df, dx, x, p, t) = ∇ₓF(x, p)
return DiffEqBase.SteadyStateProblem(f, x, p)
end



# This is quasi derived from the Rosenbrock function
@@ -62,13 +56,11 @@ end
# Specifically, if r(x,y) denotes the rosenbrock function,
# F₁([x, y], p) = ∂r/∂x
# F₂([x, y], p) = 0.5 * ∂r/∂y
F(x,p) = [
statefun(x,p) = [
-2 * (p[1] - x[1]) - 4 * p[2] * (x[2] - x[1]^2) * x[1]
p[2] * (x[2] - x[1]^2)
]

# Jacobian function of F w.r.t. p
∇ₓF(x,p) = ForwardDiff.jacobian(x -> F(x,p), x)
F = ODEFunction(statefun, jac = (x,p) -> ForwardDiff.jacobian(x -> statefun(x, p), x))

# Define mismatch function f(x,p) and its derivative ∇ₓf(x,p)
# (Note ∇ₓF and ∇ₓf are required by the F1 method)
@@ -87,12 +79,12 @@ f(x,p) = state_mismatch(x) + parameter_mismatch(p)
x₀ = rand(2)
p₀ = rand(2)
# Initialize the memory cache for storing reusable objects
mem = F1Method.initialize_mem(F, ∇ₓf, ∇ₓF, x₀, p₀, MyAlg())
mem = F1Method.initialize_mem(F, ∇ₓf, x₀, p₀, MyAlg())

# Define the functions via the F1 method
F1_objective(p) = F1Method.objective(f, F, ∇ₓF, mem, p, MyAlg())
F1_gradient(p) = F1Method.gradient(f, F, ∇ₓf, ∇ₓF, mem, p, MyAlg())
F1_Hessian(p) = F1Method.hessian(f, F, ∇ₓf, ∇ₓF, mem, p, MyAlg())
F1_objective(p) = F1Method.objective(f, F, mem, p, MyAlg())
F1_gradient(p) = F1Method.gradient(f, F, ∇ₓf, mem, p, MyAlg())
F1_Hessian(p) = F1Method.hessian(f, F, ∇ₓf, mem, p, MyAlg())

# Define the exact solution and objective for comparison
exact_solution(p) = [p[1], p[1]^2]

2 comments on commit 3c41182

@briochemc

@JuliaRegistrator

Registration pull request created: JuliaRegistries/General/50143

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v0.5.0 -m "<description of version>" 3c411823ffb8fa6d0878170eb2685bc01cfb18b3
git push origin v0.5.0
