OpenVINO Version
2024.4.0
Operating System
Ubuntu 20.04 (LTS)
Device used for inference
CPU
OpenVINO installation
Build from source
Programming Language
C++
Hardware Architecture
x86 (64 bits)
Model used
ps model
Model quantization
No
Target Platform
No response
Performance issue description
I am developing a gRPC project in C++ and integrating OpenVINO (ov) into it. The project uses multiple thread pools for preprocessing. I have observed that inference performance is significantly lower than the numbers measured by benchmark_app, and I suspect this is due to thread contention between ov and the project's preprocessing threads. I ran the following tests:

- With infer_thread=24, the utilization of all 24 CPUs fluctuates around 50%.
- With infer_thread=16, the utilization of the first 16 CPUs is around 80%, while the utilization of the last 8 CPUs is 0%.

Since my project loads two models simultaneously, I want to dedicate CPUs 0-11 to Model A, CPUs 12-19 to Model B, and CPUs 20-23 to the other operations in the project. However, I haven't found an interface in ov for binding CPUs when loading a model. Are there any other suggestions? Thank you.
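For context, here is roughly how I compile the two models today (a minimal sketch; the model paths and thread counts are placeholders). ov::inference_num_threads caps how many threads each compiled model may use, which limits oversubscription, but as far as I can tell it does not bind those threads to a specific CPU range:

```cpp
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;

    // Cap the number of inference threads per model so the two models plus
    // the preprocessing pools do not oversubscribe the 24 CPUs.
    auto model_a = core.compile_model(
        "model_a.xml", "CPU",
        ov::inference_num_threads(12),         // intended share for Model A
        ov::hint::enable_cpu_pinning(true));   // pin ov's worker threads

    auto model_b = core.compile_model(
        "model_b.xml", "CPU",
        ov::inference_num_threads(8),          // intended share for Model B
        ov::hint::enable_cpu_pinning(true));

    // ...create infer requests and serve as usual...
    return 0;
}
```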
Step-by-step reproduction
No response
Issue submission checklist

- I'm reporting a performance issue. It's not a question.
- I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
- There is reproducer code and related data files such as images, videos, models, etc.
@LinGeLin Reserving specific CPU resources for a specific model in CPU inference is planned but not yet enabled. Ticket CVS-154222 has been created to track this issue. We will update you when the feature is enabled in the master branch.
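Until that feature lands, a possible interim workaround is OS-level pinning rather than an OpenVINO API: keep the project's own preprocessing threads on the CPUs reserved for other operations, so they at least stop contending with the inference threads. A minimal Linux-only sketch (the pool structure, thread count, and CPU IDs are illustrative):

```cpp
#include <pthread.h>   // pthread_setaffinity_np (glibc extension)
#include <sched.h>     // cpu_set_t, CPU_ZERO, CPU_SET
#include <thread>
#include <vector>

// Pin the calling thread to the given CPUs. Linux/glibc-specific.
static void pin_current_thread(const std::vector<int>& cpus) {
    cpu_set_t set;
    CPU_ZERO(&set);
    for (int cpu : cpus)
        CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

int main() {
    // Hypothetical preprocessing pool confined to CPUs 20-23,
    // leaving CPUs 0-19 to the two models' inference threads.
    std::vector<std::thread> pool;
    for (int i = 0; i < 4; ++i) {
        pool.emplace_back([] {
            pin_current_thread({20, 21, 22, 23});
            // ... preprocessing work loop ...
        });
    }
    for (auto& t : pool)
        t.join();
    return 0;
}
```

Alternatively, if the two models can be served from separate processes, each process can be confined at launch with, e.g., `taskset -c 0-11` and `taskset -c 12-19`, since a process's affinity mask is inherited by the threads it creates.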