This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

Commit

fix batch bug
luoyu-intel committed May 30, 2024
1 parent 41f146b commit 4e99482
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion neural_speed/core/ne_layers.c
@@ -5444,7 +5444,8 @@ static void ne_compute_forward_mul_f32(const struct ne_compute_params* params, c
   const size_t nb3 = dst->nb[3];
   if ((ne_nrows(src1) == 1 || ne_nrows(src1) == ne_nrows(src0)) && ne10 == ne00) {
     if (nb10 == sizeof(float)) {
-      bestla_mul(nr, ne00, (const float*)src0->data, (const float*)src1->data, ne11 == 1 ? 0 : ne11, (float*)dst->data);
+      int step1 = ne11 == 1 ? 0 : ne10;
+      bestla_mul(nr, ne00, (const float*)src0->data, (const float*)src1->data, step1, (float*)dst->data);
       return;
     }
   }
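
The change above swaps the `src1` step from `ne11` (the number of rows) to `ne10` (the number of elements per row). The sketch below illustrates why that matters for a row-wise multiply over contiguous float data. It assumes the step argument is measured in elements and that a step of 0 means "reuse the same `src1` row for every row of `src0`". `mul_rows` and `src1_step` are hypothetical stand-ins written for this illustration; they are not the real `bestla_mul` and do not claim to match its signature or behavior.

```c
#include <stddef.h>

/* Hypothetical stand-in for a row-wise multiply kernel (NOT the real bestla_mul).
 * Assumption: src1_step is the distance, in elements, between consecutive
 * rows of src1; a step of 0 broadcasts a single src1 row over all rows. */
static void mul_rows(int nrows, int ncols, const float* src0, const float* src1,
                     int src1_step, float* dst) {
  for (int r = 0; r < nrows; ++r) {
    const float* a = src0 + (size_t)r * (size_t)ncols;
    const float* b = src1 + (size_t)r * (size_t)src1_step;
    float* d = dst + (size_t)r * (size_t)ncols;
    for (int c = 0; c < ncols; ++c) {
      d[c] = a[c] * b[c];
    }
  }
}

/* Mirrors the step selection in the patched code: broadcast when src1 has a
 * single row, otherwise advance by the row length ne10 (not the row count ne11). */
static int src1_step(int ne10, int ne11) {
  return ne11 == 1 ? 0 : ne10;
}
```

Under these assumptions, with 2 rows of 3 columns the old expression would yield a step of 2, so the second row of `src1` would be read starting at element 2 instead of element 3. The two expressions only agree when `ne10 == ne11`, which may be why the problem surfaced with batched inputs.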