An auto-differentiation engine for arbitrary tensors written from scratch in C++.
The engine can compute the derivative of any tensor with respect to any other tensor. In particular, it can serve as the backpropagation step for training neural networks via gradient descent, which requires the derivatives of a scalar loss with respect to the various weight tensors.
An example of usage can be found in example.cpp, where main has a runtime of ~45 ms.
TODO:
- Minor refactors and optimizations
- Tensor slicing operation
The general idea is that we have some target tensor $x$ and another tensor $y$ computed from $x$ through a chain of operations. We wish to find the derivative of $y$ with respect to $x$,

$$\frac{dy}{dx},$$

which has shape $\mathrm{shape}(y) \times \mathrm{shape}(x)$ (some of these dimensions may be singleton). In the case of neural networks, this target variable $x$ may be a weight/bias matrix, where the tensor we wish to find the derivative of, $y$, is the scalar loss.

The total derivative of $y$ with respect to $x$ is given by the chain rule:

$$\frac{dy}{dx} = \sum_{u \in \mathrm{inputs}(y)} \frac{\partial y}{\partial u}\,\frac{du}{dx},$$

where the sum runs over the tensors $u$ that are direct inputs of $y$ in the computation graph, and each product is the appropriate tensor contraction. Since each $\frac{du}{dx}$ has the same form, the total derivative can be computed recursively, walking the graph from $y$ back towards $x$. Computing $\frac{\partial y}{\partial u}$, the local derivative of an operation's output with respect to one of its inputs, only requires the derivative rule of that single operation.

Let $u$ be a tensor reached by the recursion. The base case is $u = x$, for which

$$\frac{du}{dx} = I,$$

where $I$ is the identity tensor (ones where the indices of $u$ and $x$ coincide, zeros elsewhere). And otherwise, if $u$ does not depend on $x$ at all, $\frac{du}{dx} = 0$.
Implementations of the core operations can be found in tensor/operations. Since functions can be written as combinations of a handful of core operations, only the derivative information of those core operations needs to be implemented.
An example of derivative information for tensor multiplication:
For two arbitrary tensors $A$ and $B$ of the same shape, let $C = A \odot B$ be their element-wise product, i.e. $C_i = A_i B_i$ for every multi-index $i$. It then follows that

$$\frac{\partial C_i}{\partial A_j} = \delta_{ij} B_i,$$

i.e. the local derivative of $C$ with respect to $A$ is $B$ spread along the diagonal. And similarly for $B$:

$$\frac{\partial C_i}{\partial B_j} = \delta_{ij} A_i.$$