Hi, I'm pretty new here and recently started learning about MLX by reimplementing existing PyTorch code. For example, I have the following PyTorch function that computes a Jacobian:

```python
import torch
import torch.autograd as autograd


def jacobian(f, x):
    """Computes the Jacobian of f w.r.t. x.

    :param f: function R^N -> R^N
    :param x: torch.Tensor of shape [B, N]
    :return: Jacobian matrix (torch.Tensor) of shape [B, N, N]
    """
    B, N = x.shape
    y = f(x)
    jacobian = list()
    for i in range(N):
        # One-hot vector selecting the i-th output component for every batch element
        v = torch.zeros_like(y)
        v[:, i] = 1.0
        # Vector-Jacobian product: gradient of y[:, i] w.r.t. x, shape [B, N]
        dy_i_dx = autograd.grad(y, x, grad_outputs=v, retain_graph=True,
                                create_graph=True, allow_unused=True)[0]
        jacobian.append(dy_i_dx)
    jacobian = torch.stack(jacobian, dim=2).requires_grad_()
    return jacobian
```

As far as I understand, the relevant function for this is mlx.core.grad. However, I could not figure out how to compute partial derivatives with it. I would very much appreciate any feedback and help. Thanks!
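For context (not from the original thread): `mx.grad` differentiates scalar-valued functions, so one naive way to get partial derivatives is to differentiate a single output component at a time, one `grad` call per Jacobian row. A minimal sketch under that assumption, using an illustrative function `f` and helper name `jacobian_row`, for a single 1-D input:

```python
import mlx.core as mx


def f(x):
    return x / x.sum()


# mx.grad requires a scalar output, so differentiating the i-th output
# component f(x)[i] gives the i-th row of the Jacobian (one call per row).
def jacobian_row(f, x, i):
    return mx.grad(lambda x: f(x)[i])(x)


x = mx.array([1.0, 2.0, 3.0])
print(jacobian_row(f, x, 0))  # d f_0 / d x_j for j = 0..2
```

This loops over output components in Python; the reply below shows how to avoid that by combining `mx.vjp` with `mx.vmap`.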
You can do this very nicely in MLX with mx.vjp (or mx.jvp) and mx.vmap. Jax has some really nice documentation on combining vmap and autograd to get Jacobians, Hessians, etc. The ideas mostly translate to MLX with a few slight API changes. So in MLX you could do:

```python
import mlx.core as mx


def fun(x):
    return x / x.sum()


def jacobian(f, x):
    B, N = x.shape
    # Batch of identity matrices: N one-hot cotangent vectors per batch element
    I = mx.broadcast_to(mx.eye(N), (B, N, N))

    def vjpfn(y):
        return mx.vjp(f, (x,), (y,))

    # vmap the vector-Jacobian product over the N one-hot vectors
    return mx.vmap(vjpfn, in_axes=1, out_axes=1)(I)


# B = 2, N = 3
x = mx.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(jacobian(fun, x))
```
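Since the reply also mentions mx.jvp, a forward-mode variant along the same lines might look like the following. This is an untested sketch, not from the thread: it assumes `mx.jvp` is called as `mx.jvp(fun, primals, tangents)` as documented, and `jacobian_fwd` is just an illustrative name.

```python
import mlx.core as mx


def fun(x):
    return x / x.sum()


# Forward-mode sketch: mx.jvp computes J @ v, so pushing each basis vector
# through f yields one Jacobian column; vmap batches over the N basis vectors.
def jacobian_fwd(f, x):
    B, N = x.shape
    I = mx.broadcast_to(mx.eye(N), (B, N, N))

    def jvpfn(v):
        return mx.jvp(f, (x,), (v,))

    # out_axes=2 places the column index last, matching the [B, N, N] layout
    return mx.vmap(jvpfn, in_axes=1, out_axes=2)(I)


x = mx.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(jacobian_fwd(fun, x))
```

Forward mode does one pass per input dimension rather than per output dimension, so for square Jacobians like this one the two approaches cost about the same; reverse mode (vjp) wins when there are fewer outputs than inputs, and jvp wins in the opposite case.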