Element-wise BLAS APIs & new Tensor for Python: ⬆️ 450 kernels #220
Open
ashvardanian wants to merge 68 commits into main from main-elementwise
SimSIMD becomes more similar to BLAS with every commit! New operations are:

- Element-wise Sum: `a[i] + b[i]`
- Scale & Shift: `alpha * a[i] + beta`

Those are similar to `axpy` and `scal` in BLAS.
ashvardanian changed the title from "Element-wise BLAS-like APIs" to "Element-wise BLASAPIs & new Tensor for Python" on Nov 1, 2024
ashvardanian changed the title from "Element-wise BLASAPIs & new Tensor for Python" to "Element-wise BLAS APIs & new Tensor for Python" on Nov 1, 2024
Tests & CI
Without defining the executables as tests, automatic tools like "ctest" will not find them.
New classes are added to the Python SDK: the public `NDArray` and `NDIndex`, and the internal `BufferOrScalarArgument`. Those can be used for high-dimensional tensor processing with up to 64 dimensions, as opposed to 32 in NumPy. The new interface handles mixed precision much better, allowing the type spec of every tensor to be overridden individually. That feature is in preview until the actual implementation lands in subsequent commits.

No LibC is needed anymore for rounding floats to 8-bit integers in element-wise ops.

New 16-, 32-, and 64-bit integer element-wise kernels are added for compatibility with NumPy. They are serial for now. The SimSIMD implementation is supposed to be more efficient when at least a few contiguous dimensions are present.
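To make the remark about continuous (contiguous) dimensions concrete, here is a minimal pure-Python sketch that counts how many trailing dimensions of a strided tensor form one dense run of memory. The function name `contiguous_tail_dims` and its signature are hypothetical and are not taken from SimSIMD; it only illustrates the general idea that a kernel can treat dense trailing dimensions as a single long vector.

```python
def contiguous_tail_dims(shape, strides, itemsize):
    """Count trailing dimensions that form one dense run of memory.

    Hypothetical helper for illustration; not part of SimSIMD.
    `strides` are in bytes, as in the CPython Buffer Protocol.
    """
    expected = itemsize  # stride a dense innermost dimension would have
    count = 0
    for extent, stride in zip(reversed(shape), reversed(strides)):
        if stride != expected:
            break  # a gap in memory ends the dense run
        count += 1
        expected *= extent  # dense stride of the next outer dimension
    return count


# A C-ordered 4x3x2 tensor of 4-byte elements: all three dimensions are dense.
dense = contiguous_tail_dims((4, 3, 2), (24, 8, 4), 4)

# The same buffer with every other row skipped: only the last dimension is dense.
sliced = contiguous_tail_dims((4, 2, 2), (24, 16, 4), 4)
```

The larger the count, the longer the flat run a vectorized inner loop can process before it has to fall back to per-dimension index arithmetic.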
…mSIMD into main-elementwise
ashvardanian force-pushed the main-elementwise branch from 598c5fe to d54f567 on November 4, 2024 18:05
ashvardanian force-pushed the main-elementwise branch from 14fd5d3 to 18c41fd on November 5, 2024 22:32
This entry is largely unnecessary, and its computation in the linearization procedure depends on the value at the previous dimension, making it hard to parallelize with SIMD.
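The commit message does not say which entry is dropped, so the following is only a generic sketch of stride-based linearization, with hypothetical helper names. It contrasts two ways of locating an element: a dot product of coordinates and strides, where every dimension contributes independently, and an odometer-style increment, where a carry into one dimension depends on what happened in the dimension after it. The second pattern is the kind of serial dependency that resists SIMD.

```python
def linearize(index, strides):
    """Byte offset of one element: the dot product of coordinates and strides.

    Each term is independent of the others, so the products can be
    computed in parallel and then reduced.
    """
    return sum(i * s for i, s in zip(index, strides))


def advance(index, shape):
    """Odometer-style increment of a multi-dimensional index.

    Whether a dimension changes depends on whether the dimension
    after it overflowed, so the loop is inherently sequential.
    """
    index = list(index)
    for dim in reversed(range(len(shape))):
        index[dim] += 1
        if index[dim] < shape[dim]:
            return index
        index[dim] = 0  # overflow: carry into the previous dimension
    return None  # walked past the last element


offset = linearize((1, 2, 1), (24, 8, 4))
following = advance((0, 2, 1), (4, 3, 2))
```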
ashvardanian force-pushed the main-elementwise branch from 56f1a5d to 38df49c on November 8, 2024 15:33
ashvardanian force-pushed the main-elementwise branch from 16a54dc to 79c4552 on November 10, 2024 22:51
ashvardanian force-pushed the main-elementwise branch from ecb475a to e568e6c on November 11, 2024 17:58
…mSIMD into main-elementwise
The reason for the rename is the addition of new trigonometric APIs, which would otherwise cause confusion among users.
It started as a straightforward optimization request from the @albumentations-team: to improve the special case of the `wsum` (Weighted Sum) operation for the "non-weighted" scenario and to add APIs for scalar multiplication and addition. This update introduces new public APIs in both C and Python:

- `scale`: Implements `alpha * a[i] + beta`
- `sum`: Computes `a[i] + b[i]`

Recognizing the value of consistency with widely-used libraries, we’ve also added "aliases" aligned with names familiar to developers using NumPy and OpenCV for element-wise addition and multiplication across vectors and scalars:

| NumPy | OpenCV | SimSIMD |
| --- | --- | --- |
| `np.add` | `cv.add` | `simd.add` |
| `np.multiply` | `cv.multiply` | `simd.multiply` |

The real excitement came when we realized that larger projects would take time to adopt emerging numeric types like `bfloat16` and `float8`, which are well-known in AI circles. To bridge this gap, SimSIMD now introduces an `AnyTensor` type designed for maximum interoperability via CPython's Buffer Protocol and beyond, setting it apart from similar types in NumPy, PyTorch, TensorFlow, and JAX.

Tensor Class for C, Python, and Rust 🦀
Element-wise Operations 🧮
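As a reference for the semantics only, here is a minimal pure-Python sketch of the element-wise operations discussed in this pull request. The function names and signatures are hypothetical and do not reproduce the SimSIMD API. The `wsum` formula `alpha * a[i] + beta * b[i]` is an assumption about what "Weighted Sum" means; the Sum and Scale & Shift formulas come from the commit message in this thread.

```python
def scale(a, alpha, beta):
    """Scale & Shift: alpha * a[i] + beta for every element."""
    return [alpha * x + beta for x in a]


def wsum(a, b, alpha, beta):
    """Weighted Sum, assumed here to be alpha * a[i] + beta * b[i]."""
    if len(a) != len(b):
        raise ValueError("vectors must have equal length")
    return [alpha * x + beta * y for x, y in zip(a, b)]


def elementwise_sum(a, b):
    """Element-wise Sum: the non-weighted special case of wsum."""
    return wsum(a, b, 1, 1)


shifted = scale([1, 2, 3], alpha=2, beta=1)
total = elementwise_sum([1, 2, 3], [10, 20, 30])
weighted = wsum([1, 2], [3, 4], alpha=0.5, beta=2)
```

Written this way, the "non-weighted" scenario is simply `wsum` with both weights equal to one, which is why a dedicated kernel can skip the two multiplications.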
Geospatial Operations 🛰️
If you have any feedback regarding the limitations of current array-processing software in single- or multi-node AI training settings, I am all ears 👂