No performance difference between stock and Intel NumPy is observed at the default buffer size, and Intel NumPy is only marginally faster when numpy.setbufsize() is set to 16*10^5.
This behavior is not observed on the SPR node in Intel DevCloud:
https://gist.github.com/samaid/bb680421ee29926cc7b8e536ee9a931c
The test was run on a TGL node in Intel DevCloud in two setups
It looks like multithreading is not exercised on the TGL system. In addition, the default buffer size is too small to benefit from multithreading. According to this chart, multithreading becomes beneficial at buffer sizes greater than 10K, and the performance difference is material at sizes of 100K-1M:
https://www.intel.com/content/www/us/en/develop/documentation/onemkl-vmperfdata/top/real-functions/trigonometric/sin.html
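For anyone trying to reproduce the comparison, here is a minimal timing sketch. It is not the script from the gist above. The helper name `time_sin`, the array sizes, and the choice of a float32 input computed with a float64 loop are assumptions made for illustration. The cast is there because NumPy only uses the ufunc buffer when the data needs conversion (casting, misalignment, or byte-swapping), so a plain contiguous float64 input may not exercise the buffer at all. Run it once per environment (stock and Intel) and compare the numbers.

```python
import timeit

import numpy as np


def time_sin(n, bufsize=None, repeats=5):
    """Return the best wall-clock time in seconds of np.sin over n elements.

    Hypothetical helper for illustration; not taken from the original gist.
    """
    # Assumption: float32 input evaluated with a float64 loop forces a cast,
    # which is one of the cases where the ufunc buffer is actually used.
    x = np.linspace(0.0, 1.0, n, dtype=np.float32)
    out = np.empty(n, dtype=np.float64)

    old = np.getbufsize()
    if bufsize is not None:
        np.setbufsize(bufsize)
    try:
        timings = timeit.repeat(
            lambda: np.sin(x, out=out, dtype=np.float64),
            number=1,
            repeat=repeats,
        )
        return min(timings)
    finally:
        # Restore the previous buffer size so later code is not affected.
        np.setbufsize(old)


if __name__ == "__main__":
    big_buffer = 16 * 10**5  # the value reported in this issue
    for n in (10_000, 100_000, 1_000_000):
        t_default = time_sin(n)
        t_big = time_sin(n, bufsize=big_buffer)
        print(f"n={n:>9}: default buffer {t_default:.6f} s, "
              f"buffer {big_buffer} {t_big:.6f} s")
```

If the two environments show the same timings at every size, that would be consistent with multithreading not being used on the TGL node, but the sketch alone does not prove it.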