Scan, Shared Variables, and Root Finding #959

jessegrabowski · 2022-05-14T16:07:26Z

Hi everyone,

I am trying to implement Newton's method to find the roots of a nonlinear system of equations. I load the system from SymPy expressions, so I am trying to avoid picking those apart completely if possible, but we'll see. aesara_ss_system is a vector of nonlinear equations built with sympy.printing.aesaracode.aesara_code. It contains model parameters to be estimated (aesara_params) and variables to solve the system for, given those parameters (aesara_vars_ss).

I need the solutions x* (I called it steady_state_values in the code) such that f(x*) = 0, in order to get a first-order linear approximation of the system in the next step, so I want to update the equation matrices in place. Given that, I thought using shared variables would be the way to go. I wrote the algorithm like this:

steady_state_values = aesara.shared(np.full(13, 0.8), name='ss_values')

shared_ss_system = aesara.clone_replace(aesara_ss_system, replace=dict(zip(aesara_vars_ss, 
                                                                           steady_state_values)))

jacobian = at.stack([[at.grad(eq, x) for x in aesara_vars_ss] for eq in aesara_ss_system])
jacobian_shared = aesara.clone_replace(jacobian, replace=dict(zip(aesara_vars_ss, steady_state_values)))

ss_updated = (steady_state_values - 0.8 * at.linalg.solve(jacobian_shared, shared_ss_system))
updates = [(steady_state_values, ss_updated)]

f_ss = aesara.function(aesara_params, shared_ss_system, updates=updates)

f_ss works exactly as I want, and I can iterate on a set of fixed parameter values (aesara_params, which will eventually be estimated with PyMC) to get my x*. The problem is that I have to compile this function, which means I can't get end-to-end gradients for the NUTS sampler down the line. I thought it might be possible to keep the same idea with a shared variable but do the iteration with a scan, but I can't seem to get that to work. I tried:

from aesara.scan.utils import until as scan_until

step_size = at.dscalar('step_size')
max_iter_ss = at.iscalar('max_iter_ss')
tol_ss  = at.dscalar('tol_ss')

def newton_step(X, Fx, Jx, step_size, tol):
    new_X = X - step_size * at.linalg.solve(Jx, Fx)
    
    return new_X, scan_until(at.sqr(Fx).sum() < tol)

result, updates = aesara.scan(newton_step,
                              non_sequences=[shared_ss_system, jacobian_shared, step_size, tol_ss],
                              outputs_info=[steady_state_values],
                              n_steps=max_iter_ss,
                              strict=True)

This is similar to what I came up with for another optimization algorithm, but it doesn't work because steady_state_values never updates, so shared_ss_system and jacobian_shared also stay put. I tried passing steady_state_values as a non_sequence and updating shared_ss_system and jacobian_shared inside the newton_step function using clone_replace, but nothing seems to propagate the updates. In the optimizer I got working, I was able to explicitly write down an update rule for each component of the system inside the step function, so I could be sure everything was updating. Here I am just crossing my fingers and praying the information goes where it needs to. It's working about as well as you'd expect.

I'm also surprised that the updates dictionary comes back empty. I thought when a shared variable is passed into a Scan you are meant to get back an update. Any ideas how to get something like this to work? Am I on the right track with a SharedVariable, or is that unnecessary?

brandonwillard · 2022-05-14T23:49:03Z

Maintainer

The updates that are supposed to occur within a Scan's inner-function (e.g. your newton_step) need to be returned (as a dict) by said inner-function. See the example here.

N.B. If you're wondering why a dict isn't returned when RandomStream is used inside of a Scan inner-function, it's because RandomStream attaches the update information to the Variables it produces, and those are picked up by Scans automatically.

jessegrabowski · 2022-05-15T03:30:41Z

Author

I saw that example, and it left me with more questions than answers because I was indeed wondering why the RandomStream wasn't returned as a dict, so thank you for including that note. I also saw #890, which made me wary of following that specific example too closely.

I changed the returns in the inner function to a dict, and this gets my shared variable updating. Those updates still don't propagate into jacobian_shared and shared_ss_system, though, causing the algorithm to diverge.

Here is my new inner function:

def newton_step(X, Fx, Jx, step_size, tol):
    new_X = X - step_size * at.linalg.solve(Jx, Fx)
    return {X:new_X}, scan_until(at.sqr(Fx).sum() < tol)

The updates object returned by the scan is OrderedUpdates([(variables, Subtensor{int64}.0)]). Why is new_X turning into an int64 tensor, or is that just semantics I'm not understanding?

I wrote up a minimal example to reproduce all the steps I'm taking, in case something about how I am parsing and setting up the matrices is wrong.

from sympy.abc import x, y, a, b, c, d, e, f
from sympy.printing.aesaracode import aesara_function, aesara_code

import aesara
import aesara.tensor as at
from aesara.scan.utils import until as scan_until

import numpy as np

# Sympy equations

eq_1 = a * x ** 2 - b * y - c
eq_2 = d * x - e * y ** 2 + f
system = [eq_1, eq_2]

# Print sympy to aesara
cache_dict = {}
params = [aesara_code(param, cache=cache_dict) for param in [a, b, c, d, e, f]]
variables = [aesara_code(var, cache=cache_dict) for var in [x, y]]

vars_shared = aesara.shared(np.ones(2), name='variables')
shared_replace_dict = dict(zip(variables, vars_shared))

# Inject shared variables into generated code
aesara_system = at.stack([aesara_code(eq, cache=cache_dict) for eq in system])
shared_system = aesara.clone_replace(aesara_system, replace=shared_replace_dict)
jacobian = at.stack([[at.grad(eq, x) for x in variables] for eq in aesara_system])
jacobian_shared = aesara.clone_replace(jacobian, replace=shared_replace_dict)


# "by hand" approach, this works fine by looping over f_newton_1 and using get_value/set_value
# vars_updated = vars_shared - at.linalg.solve(jacobian_shared, shared_system)
# updates = [(vars_shared, vars_updated)]
# f_newton_1 = aesara.function(params, shared_system, updates=updates)
# param_values = np.ones(6)
# for _ in range(50):
#     resid = f_newton_1(*param_values)
#     if np.linalg.norm(resid, 1) < 1e-16:
#         break

## Scan approach, doesn't work
step_size = at.dscalar('step_size')
max_iter = at.iscalar('max_iter')
tol  = at.dscalar('tol')

def newton_step(X, Fx, Jx, step_size, tol):
    new_X = X - step_size * at.linalg.solve(Jx, Fx)
    return {X:new_X}, scan_until(at.sqr(Fx).sum() < tol)

result, updates = aesara.scan(newton_step,
                              non_sequences=[vars_shared, shared_system, jacobian_shared,
                                             step_size, tol],
                              n_steps=max_iter,
                              strict=True)

# f_newton_2 diverges to infinity
f_newton_2 = aesara.function(params + [step_size, max_iter, tol], shared_system, updates=updates)
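
For comparison, here is a dependency-free sketch (plain Python, no Aesara) of the damped Newton iteration on the two example equations above. The Jacobian is derived by hand, and the names residual, jacobian_at, solve_2x2, newton, and the refresh flag are hypothetical helpers introduced only for illustration. With refresh=False the residual and Jacobian stay frozen at their starting values, which is one plausible reading of why the scan version runs away: every step subtracts the same increment.

```python
import math


def residual(X, p):
    """F(X) for eq_1 = a*x**2 - b*y - c and eq_2 = d*x - e*y**2 + f."""
    x, y = X
    a, b, c, d, e, f = p
    return [a * x**2 - b * y - c, d * x - e * y**2 + f]


def jacobian_at(X, p):
    """Hand-derived Jacobian of the two equations with respect to (x, y)."""
    x, y = X
    a, b, c, d, e, f = p
    return [[2 * a * x, -b], [d, -2 * e * y]]


def solve_2x2(J, F):
    """Solve J @ delta = F by Cramer's rule (2x2 only)."""
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    return [
        (F[0] * J[1][1] - J[0][1] * F[1]) / det,
        (J[0][0] * F[1] - J[1][0] * F[0]) / det,
    ]


def newton(X, p, step_size=1.0, tol=1e-12, max_iter=50, refresh=True):
    """Damped Newton iteration; refresh=False keeps F and J at the start point."""
    F, J = residual(X, p), jacobian_at(X, p)
    for _ in range(max_iter):
        if sum(abs(r) for r in F) < tol:
            break
        delta = solve_2x2(J, F)
        X = [xi - step_size * di for xi, di in zip(X, delta)]
        if refresh:
            # Re-evaluate the system at the new point (the step the scan was missing)
            F, J = residual(X, p), jacobian_at(X, p)
    return X


params_all_ones = [1.0] * 6  # assumption: same values as np.ones(6) above
root = newton([1.0, 1.0], params_all_ones)
stale = newton([1.0, 1.0], params_all_ones, refresh=False)
print(root, stale)
```

With all six parameters set to 1 and a start at (1, 1), the refreshed iteration settles at x = y = (1 + sqrt(5)) / 2, while the frozen variant simply walks to (51, 51) after 50 steps.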

8 replies

jessegrabowski May 16, 2022
Author

Awesome, thank you. I did expect the shared variable to somehow update the graph automatically. I'll make the simplification you suggested, look into OpFromGraph, and get back at it. Thanks again for taking the time!

jessegrabowski May 16, 2022
Author

I got it working. Here's the Newton Fractal from the example equations as a thank you. Seems to go faster than the shared variable/looping by hand method (~5 minutes for a 1000 x 1000 grid), but I can't be sure. I bet it would be better on GPU.
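
A Newton fractal labels each starting point by the root that Newton's method reaches from it. The sketch below does this in plain Python for the example system, assuming all six parameters equal 1 (the values used for the actual plot are not given in this thread). Under that assumption the system has four real roots, which can be checked by substitution. ROOTS, newton_basin, and render are hypothetical illustration names, not the Aesara implementation.

```python
import math

PHI = (1 + math.sqrt(5)) / 2
PSI = (1 - math.sqrt(5)) / 2
# Roots of x**2 - y - 1 = 0 and x - y**2 + 1 = 0 (all parameters assumed to be 1)
ROOTS = [(0.0, -1.0), (-1.0, 0.0), (PHI, PHI), (PSI, PSI)]


def newton_basin(x, y, max_iter=50, tol=1e-10):
    """Return the index in ROOTS that Newton reaches from (x, y), or -1."""
    for _ in range(max_iter):
        f1 = x * x - y - 1.0
        f2 = x - y * y + 1.0
        if abs(f1) + abs(f2) < tol:
            break
        det = 1.0 - 4.0 * x * y  # determinant of [[2x, -1], [1, -2y]]
        if det == 0.0:
            return -1  # singular Jacobian, no Newton step available
        dx = (-2.0 * y * f1 + f2) / det
        dy = (2.0 * x * f2 - f1) / det
        x, y = x - dx, y - dy
    else:
        return -1  # did not converge within max_iter
    dists = [math.hypot(x - rx, y - ry) for rx, ry in ROOTS]
    return dists.index(min(dists))


def render(n=21, lo=-2.0, hi=2.0):
    """Coarse text rendering: one letter per root, '.' for no convergence."""
    symbols = "abcd"
    rows = []
    for i in range(n):
        y = hi - i * (hi - lo) / (n - 1)
        row = ""
        for j in range(n):
            x = lo + j * (hi - lo) / (n - 1)
            k = newton_basin(x, y)
            row += symbols[k] if k >= 0 else "."
        rows.append(row)
    return rows


if __name__ == "__main__":
    print("\n".join(render()))
```

Each grid point is solved independently, so the same loop maps naturally onto a batched or vectorized evaluation for larger grids.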

In the end I needed to use at.stack on aesara_system and jacobian because OpFromGraph complained about getting lists of variables as outputs; it only wanted tensor variables. I guess it's a shame because you mentioned it makes it slower. The newton_step function also ended up a bit janky:

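from aesara.compile.builders import OpFromGraph

# Wrap the stacked system and its Jacobian as Ops taking (params, variables) as scalar inputs
F_x = OpFromGraph(params + variables, [aesara_system], name='F_x')
J_x = OpFromGraph(params + variables, [jacobian], name='J_x')

def newton_step(X, old_F, old_J, step_count, params, step_size, tol):
    # Damped Newton update, then re-evaluate F and J at the new point
    new_X = X - step_size * at.linalg.solve(old_J, old_F)
    new_F = F_x(*params, *[new_X[i] for i in range(len(variables))])
    new_J = J_x(*params, *[new_X[i] for i in range(len(variables))])
    step_count += 1

    # Stop early once the L1 norm of the residual drops below tol
    return (new_X, new_F, new_J, step_count), scan_until(at.linalg.norm(new_F, ord=1) < tol)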

Since the Ops generated by OpFromGraph expect some number of scalar inputs, and since tensor variables don't support iteration, I ended up having to do that list comprehension and unpacking to get the arguments passed in. If you know a better way (and have a moment to think about it) I would be grateful.

Otherwise thanks again. I'm enjoying working with Aesara!

brandonwillard May 16, 2022
Maintainer

"I guess it's a shame because you mentioned it makes it slower."

Those superfluous Subtensors should get optimized out (more specifically, constant folded), so the extra latency should only be at compile time.

brandonwillard May 16, 2022
Maintainer

"Since the Ops generated by OpFromGraph expect some number of scalar inputs, and since tensor variables don't support iteration, I ended up having to do that list comprehension and unpacking to get the arguments passed in."

There's nothing wrong with that. It's really just a result of the mixed parameterization chosen for this problem. If you can reformulate everything so that it has a single parameterization, that would be a little nicer.

brandonwillard May 16, 2022
Maintainer

"I'm enjoying working with Aesara!"

Glad to hear! At this point, we're really trying to fix some core design issues and long-standing Theano problems, and we want that work to result in a more hackable Aesara. See here for more background on the project and its goals.

That said, we encourage folks to look into the codebase as they use it. There's not too much user-interface dressing to prevent one from seeing the core mechanics, and we're very interested in making those mechanics even more transparent. Feel free to ask questions, request documentation, etc.
