
Autograd add grad error #68

Open
swfsql opened this issue Jun 20, 2024 · 1 comment
Comments

swfsql commented Jun 20, 2024

Hi, I've found a runtime error and made this minimized example:

#[test]
fn test_add_grad_decreasing_idx() {
    let mut cx = Graph::new();
    let a: GraphTensor<R1<2>> = cx.tensor();
    let a: GraphTensor<R3<1, 1, 2>> = a.expand::<_, LAxes2<0, 1>>();
    let a: GraphTensor<R3<2, 1, 1>> = a.permute::<_, LAxes3<2, 1, 0>>();
    // a.shape.fake = [false, true, true]
    // a.shape.indexes = [0, 2, 1] // note that the idx isn't necessarily increasing (0,1,2)
    let b: GraphTensor<R3<2, 1, 1>> = cx.tensor();
    let weights = vec![a.id, b.id];

    let m: GraphTensor<R3<2, 1, 1>> = a * b;
    let loss: GraphTensor<R0> = m.sum_reduce();
    let _grads = cx.compile(Autograd::new(weights, loss), ());
}

The error:

thread 'autograd::tests::test_add_grad_decreasing_idx' panicked at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tinyvec-1.6.0/src/arrayvec.rs:681:26:
index out of bounds: the len is 0 but the index is 0
stack backtrace:
   ..
   3: tinyvec::arrayvec::ArrayVec<A>::remove
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tinyvec-1.6.0/src/arrayvec.rs:681:26
   4: luminal::shape::tracker::ShapeTracker::remove_dim
             at /workspaces/coursera-deep-learning-specialization/luminal_original/src/shape/tracker.rs:58:21
   5: luminal_training::autograd::add_grad
             at ./src/autograd.rs:208:13
   6: <luminal_training::autograd::Autograd as luminal::compiler_utils::Compiler>::compile
             at ./src/autograd.rs:113:21
   ..

From what I've noticed, the add_grad function seems to assume that the forward shape.indexes are always increasing. In this case, because of how the a.permute axes were defined (LAxes3<2, 1, 0>), the indexes end up as [0, 2, 1], which is not increasing; if the axes had been defined as LAxes3<2, 0, 1>, the indexes would be increasing and there would be no error. Since this shape also has two fake axes (fake is [false, true, true]), removing the first fake axis shortens the axes list, so the removal on the last iteration goes out of bounds.
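The index-shift problem can be reproduced with plain Vecs, independent of luminal's ShapeTracker (a minimal sketch: remove_fake_dims is a hypothetical stand-in for the relevant part of add_grad, not the actual code):

```rust
use std::panic;

// Mimics the add_grad pattern: walk the shape's indexes in reverse
// and remove every fake dim by its index.
fn remove_fake_dims(mut dims: Vec<usize>, indexes: &[usize], fake: &[bool]) -> Vec<usize> {
    for &i in indexes.iter().rev() {
        if fake[i] {
            // Panics once a prior removal shrank the Vec below i + 1.
            dims.remove(i);
        }
    }
    dims
}

fn main() {
    // indexes [0, 2, 1] reversed is [1, 2, 0]; removing at index 1 first
    // leaves len == 2, so the later removal at index 2 is out of bounds.
    let result = panic::catch_unwind(|| {
        remove_fake_dims(vec![2, 1, 1], &[0, 2, 1], &[false, true, true])
    });
    assert!(result.is_err());
    println!("panicked as described: {}", result.is_err());
}
```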

Relevant code section:

for i in fwd.shape.indexes.into_iter().rev() {
    if fwd.shape.fake[i] {
        grad.id = graph
            .add_op(SumReduce(i))
            .input(grad.id, 0, grad.shape)
            .finish();
        grad.shape.remove_dim(i);
        grad.shape = grad.shape.contiguous();
    }
}
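One generic way to avoid this kind of shift (a sketch with plain Vecs rather than ShapeTracker, and not necessarily what the linked PR does) is to collect the fake axis positions first and remove them from the highest index to the lowest, so each removal leaves the smaller indices untouched:

```rust
// Hypothetical helper illustrating the descending-order removal technique;
// names and signature are for illustration only.
fn remove_fake_dims_sorted(mut dims: Vec<usize>, indexes: &[usize], fake: &[bool]) -> Vec<usize> {
    // Collect the fake axis positions, then sort them in descending order.
    let mut to_remove: Vec<usize> =
        indexes.iter().copied().filter(|&i| fake[i]).collect();
    to_remove.sort_unstable_by(|a, b| b.cmp(a));
    for i in to_remove {
        // Removing the largest index first never shifts the remaining ones.
        dims.remove(i);
    }
    dims
}

fn main() {
    // Same inputs as the failing test: indexes [0, 2, 1], fake [false, true, true].
    let dims = remove_fake_dims_sorted(vec![2, 1, 1], &[0, 2, 1], &[false, true, true]);
    assert_eq!(dims, vec![2]);
    println!("{:?}", dims); // [2]
}
```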

I'm not sure how to solve this since I'm still learning how the shapes work overall, but I'll keep experimenting.
I've also linked a PR which shows the example and error.

swfsql commented Jun 20, 2024

On the PR I've simplified the test, added another test, and added a draft fix.
