Missing broadcast capability in optimized kernels. #8051
Labels
module: kernels
Issues related to kernel libraries and utilities, and code under kernels/
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Missing broadcast causes abort.
I ran into a bug where op_add, op_sub, and op_div in the optimized kernel library are missing broadcast support. I hit it when trying to add a 2x8x12x12 tensor to a 2x1x12x12 tensor. It only showed up with a batch size greater than 1, which I assume is why it has not been an issue before.
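To make the failing case concrete, here is a minimal reference sketch of what broadcasting a 2x1x12x12 tensor against a 2x8x12x12 tensor should compute. It is not ExecuTorch code: `Shape4`, `numel`, `broadcast_strides`, and `broadcast_add` are hypothetical names made up for illustration, and the sketch assumes contiguous row-major float data and exactly four dimensions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 4-D shape type, used for illustration only.
using Shape4 = std::array<std::size_t, 4>;

// Number of elements in a 4-D shape.
inline std::size_t numel(const Shape4& s) {
  return s[0] * s[1] * s[2] * s[3];
}

// Row-major strides, with stride 0 on size-1 dims so the single element
// along that dim is reused for every output index (i.e. broadcast).
inline Shape4 broadcast_strides(const Shape4& s) {
  Shape4 strides{};
  std::size_t running = 1;
  for (int d = 3; d >= 0; --d) {
    strides[d] = (s[d] == 1) ? 0 : running;
    running *= s[d];
  }
  return strides;
}

// Reference (unoptimized) broadcasting add: out = a + b.
inline std::vector<float> broadcast_add(
    const std::vector<float>& a, const Shape4& a_shape,
    const std::vector<float>& b, const Shape4& b_shape) {
  assert(a.size() == numel(a_shape) && b.size() == numel(b_shape));
  Shape4 out_shape{};
  for (int d = 0; d < 4; ++d) {
    // Each pair of dims must match, or one of them must be 1.
    assert(a_shape[d] == b_shape[d] || a_shape[d] == 1 || b_shape[d] == 1);
    out_shape[d] = (a_shape[d] == 1) ? b_shape[d] : a_shape[d];
  }
  const Shape4 sa = broadcast_strides(a_shape);
  const Shape4 sb = broadcast_strides(b_shape);
  std::vector<float> out(numel(out_shape));
  std::size_t o = 0;
  for (std::size_t i0 = 0; i0 < out_shape[0]; ++i0)
    for (std::size_t i1 = 0; i1 < out_shape[1]; ++i1)
      for (std::size_t i2 = 0; i2 < out_shape[2]; ++i2)
        for (std::size_t i3 = 0; i3 < out_shape[3]; ++i3)
          out[o++] = a[i0 * sa[0] + i1 * sa[1] + i2 * sa[2] + i3 * sa[3]] +
                     b[i0 * sb[0] + i1 * sb[1] + i2 * sb[2] + i3 * sb[3]];
  return out;
}
```

With shapes {2, 8, 12, 12} and {2, 1, 12, 12}, the second operand gets stride 0 on dim 1, so each of its 12x12 planes is reused across all 8 channels of the matching batch entry.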
The cause of this seems to be an update a while back that presumably added new optimization paths for use with op_mul:
However, op_add, op_sub, and op_div only know about the first 4, and their logic falls through to an error when it encounters an unexpected optimization path.
Playing with it, a quick and dirty fix is to check whether one of the unimplemented broadcast paths was selected and, if so, fall back to kNone. That is obviously non-ideal, but it was my temporary fix for myself.
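The fallback described above could look roughly like the sketch below. Only kNone comes from this issue; `OptPath`, its other enumerators, `op_supports`, and `effective_path` are hypothetical placeholders and do not reflect the actual ExecuTorch enum or function names.

```cpp
#include <cassert>

// Hypothetical stand-in for the optimized-path enum. Only kNone is named in
// the issue; every other enumerator here is a placeholder.
enum class OptPath {
  kNone,          // no fast path; use the generic broadcasting kernel
  kFastPathA,     // placeholder: a path the op already handles
  kFastPathB,     // placeholder: a path the op already handles
  kNewBroadcast,  // placeholder: a newer path the op does not implement
};

// Returns true if this (hypothetical) op has an implementation for `p`.
inline bool op_supports(OptPath p) {
  switch (p) {
    case OptPath::kNone:
    case OptPath::kFastPathA:
    case OptPath::kFastPathB:
      return true;
    default:
      return false;
  }
}

// Quick-and-dirty fix: if the selector picked a path this op does not
// implement, downgrade to kNone instead of hitting the error branch.
inline OptPath effective_path(OptPath selected) {
  return op_supports(selected) ? selected : OptPath::kNone;
}
```

The op would then dispatch on `effective_path(selected)` rather than on the raw selection, trading the fast path for the slower generic kernel on those shapes.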
Versions
ExecuTorch main branch.
cc @larryliu0820 @manuelcandales