[Performance] 40% slowdown in ONNX Resize Operator on CPU #23391
Comments
I could not reproduce the issue on my machine.
The latency outputs for ORT 1.18.0 and 1.20.1 are also very close on my machine. Could you test with ORT 1.19.0 and 1.20.1? You can also use these two adjacent commits:
ORT 1.18 vs 1.20 (1.20 is faster than 1.18)
ORT 1.19 vs 1.20 (1.19 is faster than 1.20)
ORT 5fa4505 vs 6cc06ad (5fa4505 is faster)
@SuhwanSong, neither commit (5fa4505, 6cc06ad) changed the Resize operator, so it is strange that they would cause a performance difference. The results on my test machine using your test script:
Note that the model has three nodes after optimization: Resize, Resize and Ceil. So the above latency difference between 1.19.0 and 1.20.1 is likely caused by the Ceil node.
Thank you for your efforts! In my experiments on a 48-core server, I observed a performance difference. However, when I tested on a 16-core server, I obtained results similar to yours (i.e., no difference). I identified a perf regression test case from the 16-core setup.
code was executed (while REGISTER_KERNEL_TYPED(float) was executed in both commits). Although I’m not sure about the root cause, I’ve noticed similar behavior with ArgMin and ArgMax as well.
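Since the regression showed up on the 48-core server but not on the 16-core one, it may help to time the same model under several intra-op thread counts. Below is a minimal sketch of such a sweep, not the script used in this thread. The helper names `median_latency_ms` and `sweep_threads` are made up for illustration, and the ONNX Runtime usage in the trailing comment assumes a placeholder model path and input feed.

```python
import statistics
import time
from typing import Callable, Dict, Iterable


def median_latency_ms(run: Callable[[], object], warmup: int = 5, repeats: int = 50) -> float:
    """Return the median wall-clock latency of run() in milliseconds."""
    # Warm-up iterations are executed but not timed.
    for _ in range(warmup):
        run()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def sweep_threads(
    make_run: Callable[[int], Callable[[], object]],
    thread_counts: Iterable[int],
    **timing_kwargs,
) -> Dict[int, float]:
    """Time one workload per thread count; make_run(n) builds the workload."""
    return {n: median_latency_ms(make_run(n), **timing_kwargs) for n in thread_counts}


# Hypothetical usage with ONNX Runtime ("model.onnx" and `feed` are placeholders):
#
#   import onnxruntime as ort
#
#   def make_run(num_threads):
#       opts = ort.SessionOptions()
#       opts.intra_op_num_threads = num_threads
#       sess = ort.InferenceSession("model.onnx", opts,
#                                   providers=["CPUExecutionProvider"])
#       return lambda: sess.run(None, feed)
#
#   print(sweep_threads(make_run, [1, 4, 16, 48]))
```

Running the sweep once per ORT build (for example 5fa4505 and 6cc06ad) and comparing the resulting tables would show at which thread count the two builds start to diverge.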
@SuhwanSong, could you try building from source with PR #23433 to see whether it resolves the regression? The reason is that Ceil/Clip/ArgMax are all implemented using Eigen.
I attempted to build this PR: #23433, but it failed due to the use of Eigen::PropagateNaN, which was introduced in Eigen 3.4. Here’s the error message I encountered:
It seems this feature is only available in Eigen 3.4 (reference: Eigen 3.4 Release Notes#Improvement_to_NaN_propagation). The root cause of this issue (#23337) might also be related to the Eigen library.
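To catch this kind of build failure early, the Eigen version of a checkout can be read from its version macros before building. The sketch below assumes the `EIGEN_WORLD_VERSION`, `EIGEN_MAJOR_VERSION` and `EIGEN_MINOR_VERSION` macros as defined in `Eigen/src/Core/util/Macros.h` of Eigen 3.3.x and 3.4.0; other Eigen versions may keep them in a different header. The function names are hypothetical, and the 3.4 threshold is taken from the comment above rather than verified independently.

```python
import re
from typing import Tuple

_VERSION_RE = re.compile(r"#define\s+EIGEN_(WORLD|MAJOR|MINOR)_VERSION\s+(\d+)")


def parse_eigen_version(header_text: str) -> Tuple[int, int, int]:
    """Extract (world, major, minor) from the text of an Eigen header."""
    found = {name: int(value) for name, value in _VERSION_RE.findall(header_text)}
    missing = {"WORLD", "MAJOR", "MINOR"} - found.keys()
    if missing:
        raise ValueError(f"version macros not found: {sorted(missing)}")
    return found["WORLD"], found["MAJOR"], found["MINOR"]


def has_nan_propagation(version: Tuple[int, int, int]) -> bool:
    """Assumption from this thread: Eigen::PropagateNaN requires Eigen >= 3.4."""
    return version >= (3, 4, 0)


# Hypothetical usage (the path depends on where Eigen is checked out):
#
#   from pathlib import Path
#   text = Path("eigen/Eigen/src/Core/util/Macros.h").read_text()
#   print(parse_eigen_version(text), has_nan_propagation(parse_eigen_version(text)))
```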
I tested commit 6cc06ad with both Eigen 3.4 and the Eigen nightly version to compare performance. Here are the results (3.4 vs nightly):
Overall, the performance overhead decreased by approximately 10% with the Eigen nightly version.
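When comparing figures like these, it helps to state the slowdown relative to the same baseline each time. A small sketch of that calculation follows; the function name is hypothetical and the latencies in the usage comment are illustrative numbers, not measurements from this thread.

```python
def slowdown_percent(baseline_ms: float, candidate_ms: float) -> float:
    """Percentage by which candidate latency exceeds baseline latency.

    Positive values mean the candidate is slower; negative values mean faster.
    """
    if baseline_ms <= 0:
        raise ValueError("baseline latency must be positive")
    return (candidate_ms - baseline_ms) * 100.0 / baseline_ms


# Illustrative only: a baseline of 10 ms and a candidate of 14 ms
# corresponds to a 40% slowdown.
#
#   slowdown_percent(10.0, 14.0)
```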
Describe the issue
We observed a significant performance regression (~40% slowdown) in the Resize operator when using Float32 and Int64 data types on the CPU.
This slowdown impacts workloads that rely heavily on the Resize operator, particularly in image processing tasks.
After bisecting, we found that commit 6cc06ad introduced the slowdown.
range: 5fa4505..6cc06ad
model:
analysis:
To reproduce
Urgency
No response
Platform
Linux
OS Version
6.8.0
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.20.1
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
Model File
model.zip
Is this a quantized model?
Yes