xe: conv_v2: enable Stream-K kernels #2345
  oss << std::uppercase << std::hex << std::setw(2) << std::setfill('0')
-         << (int)d;
+         << (int)v;
nit: into<int>(v)
This conversion is perfectly safe; the suggestion is mainly to avoid extra noise when searching for conversion issues. On the other hand, all unsafe conversions should go through into<T> to enable runtime validation in debug builds.
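For readers unfamiliar with the idea, here is a minimal sketch of a checked integral conversion that validates the range only in debug builds. It is a hypothetical stand-in, not the project's actual into<T>: the helper names (is_negative, fits_in), the use of NDEBUG as the debug switch, and the choice of std::out_of_range are all assumptions made for illustration.

```cpp
#include <limits>
#include <stdexcept>
#include <type_traits>

// Hypothetical helpers: tag dispatch avoids comparing an unsigned value with 0.
template <typename T>
bool is_negative(T v, std::true_type /*signed*/) { return v < 0; }
template <typename T>
bool is_negative(T, std::false_type /*unsigned*/) { return false; }

// Returns true if the integral value v is representable in type To.
template <typename To, typename From>
bool fits_in(From v) {
    static_assert(std::is_integral<To>::value && std::is_integral<From>::value,
            "sketch covers integral types only");
    using lim = std::numeric_limits<To>;
    if (is_negative(v, std::is_signed<From>())) {
        // A negative value never fits in an unsigned destination.
        if (!std::is_signed<To>::value) return false;
        return static_cast<long long>(v) >= static_cast<long long>(lim::min());
    }
    // v is non-negative here, so an unsigned comparison is safe.
    return static_cast<unsigned long long>(v)
            <= static_cast<unsigned long long>(lim::max());
}

// Hypothetical into<T>: range-checked in debug builds, plain cast otherwise.
template <typename To, typename From>
To into(From v) {
#ifndef NDEBUG
    if (!fits_in<To>(v))
        throw std::out_of_range("into<T>: value does not fit destination type");
#endif
    return static_cast<To>(v);
}
```

With a helper of this shape, into<int>(v) on a byte-sized value behaves exactly like (int)v, while every conversion in the code base can be found by searching for a single name.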
@@ -239,6 +239,8 @@ struct deserializer_t {
 }
 }
+
+bool empty() const { return idx >= s.get_data().size(); }
Suggested change:
- bool empty() const { return idx >= s.get_data().size(); }
+ bool empty() const { return s.get_data().empty(); }
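One point that may be worth checking before applying this: the two forms of empty() are only equivalent if idx never advances. The sketch below uses a hypothetical cursor type (cursor_t, pop, consumed and buffer_empty are illustrative names, not the project's deserializer_t) to show where the two predicates diverge once data has been read.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical read cursor over a byte buffer; illustration only.
struct cursor_t {
    explicit cursor_t(std::vector<uint8_t> d) : data(std::move(d)), idx(0) {}

    // Reads one byte and advances the cursor (no bounds check in this sketch).
    uint8_t pop() { return data[idx++]; }

    // Mirrors `idx >= s.get_data().size()`: true once everything has been read.
    bool consumed() const { return idx >= data.size(); }

    // Mirrors `s.get_data().empty()`: true only if the buffer never held data.
    bool buffer_empty() const { return data.empty(); }

    std::vector<uint8_t> data;
    size_t idx;
};
```

After both bytes of a two-byte buffer are popped, consumed() reports true while buffer_empty() still reports false, so the choice depends on whether empty() is meant to describe the remaining input or the underlying storage.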
@@ -614,6 +614,7 @@ bench_data_t bench(const bench_manager_t &bench_mger,
 
 bool try_create(
         const bench_manager_t &bench_mger, const kernel_desc_t &kernel_desc) {
+    clear_primitive_cache();
Would it be reasonable to just set the primitive cache capacity to 0?
Thanks for the data comparing to |
Jira: MFDNN-11721
This PR updates the performance modeling and benchmarking logic to handle Stream-K, and updates the registry to use the new Stream-K kernels. More details will be added later.
ResNet-50 performance data on PVC is below. A few comments: