OpenVINO Version
2024.4.0
Operating System
Ubuntu 20.04 (LTS)
Device used for inference
CPU
OpenVINO installation
Build from source
Programming Language
C++
Hardware Architecture
x86 (64 bits)
Model used
ps model
Model quantization
No
Target Platform
No response
Performance issue description
I am developing a gRPC project in C++ and integrating OpenVINO (ov) into it. The project uses multiple thread pools for preprocessing. I have observed that inference performance is significantly lower than the numbers measured by benchmark_app, and I suspect this is due to thread contention between ov and the project's preprocessing threads. I ran the following tests:

- With infer_thread=24, the utilization of all 24 CPUs fluctuates around 50%.
- With infer_thread=16, the utilization of the first 16 CPUs is around 80%, while the utilization of the last 8 CPUs is 0%.

Since my project loads two models simultaneously, I want to dedicate CPUs 0-11 to Model A, CPUs 12-19 to Model B, and CPUs 20-23 to the other operations in the project. However, I haven't found an interface in ov for binding CPUs when loading a model. Are there any other suggestions? Thank you.
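For context, here is roughly how I compile the two models today (a minimal sketch; the model paths and thread counts are placeholders). ov::inference_num_threads caps how many threads each compiled model may use, which limits oversubscription, but as far as I can tell it does not bind those threads to a specific CPU range:

```cpp
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;

    // Cap the number of inference threads per model so the two models plus
    // the preprocessing pools do not oversubscribe the 24 CPUs.
    auto model_a = core.compile_model(
        "model_a.xml", "CPU",
        ov::inference_num_threads(12),         // intended share for Model A
        ov::hint::enable_cpu_pinning(true));   // pin ov's worker threads

    auto model_b = core.compile_model(
        "model_b.xml", "CPU",
        ov::inference_num_threads(8),          // intended share for Model B
        ov::hint::enable_cpu_pinning(true));

    // ...create infer requests and serve as usual...
    return 0;
}
```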
Step-by-step reproduction
No response
Issue submission checklist

- I'm reporting a performance issue. It's not a question.
- I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
- There is reproducer code and related data files such as images, videos, models, etc.
@LinGeLin Reserving specific CPU resources for a specific model in CPU inference is planned but not yet enabled. Ticket CVS-154222 has been created to track this issue. We will update you when the feature is enabled in the master branch.
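Until that feature lands, a possible interim workaround is OS-level pinning rather than an OpenVINO API: keep the project's own preprocessing threads on the CPUs reserved for other operations, so they at least stop contending with the inference threads. A minimal Linux-only sketch (the pool structure, thread count, and CPU IDs are illustrative):

```cpp
#include <pthread.h>   // pthread_setaffinity_np (glibc extension)
#include <sched.h>     // cpu_set_t, CPU_ZERO, CPU_SET
#include <thread>
#include <vector>

// Pin the calling thread to the given CPUs. Linux/glibc-specific.
static void pin_current_thread(const std::vector<int>& cpus) {
    cpu_set_t set;
    CPU_ZERO(&set);
    for (int cpu : cpus)
        CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

int main() {
    // Hypothetical preprocessing pool confined to CPUs 20-23,
    // leaving CPUs 0-19 to the two models' inference threads.
    std::vector<std::thread> pool;
    for (int i = 0; i < 4; ++i) {
        pool.emplace_back([] {
            pin_current_thread({20, 21, 22, 23});
            // ... preprocessing work loop ...
        });
    }
    for (auto& t : pool)
        t.join();
    return 0;
}
```

Alternatively, if the two models can be served from separate processes, each process can be confined at launch with, e.g., `taskset -c 0-11` and `taskset -c 12-19`, since a process's affinity mask is inherited by the threads it creates.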