Deprecate cub::FpLimits in favor of cuda::std::numeric_limits #3635
Conversation
🟨 CI finished in 1h 58m: Pass: 94%/89 | Total: 2d 15h | Avg: 42m 54s | Max: 1h 12m | Hits: 145%/10896

| +/- | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| | Thrust |
| | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| +/- | Catch2Helper |

Modifications in project or dependencies?

| +/- | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |

🏃 Runner counts (total jobs: 89)

| # | Runner |
|---|---|
| 65 | linux-amd64-cpu16 |
| 8 | windows-amd64-cpu16 |
| 6 | linux-amd64-gpu-rtxa6000-latest-1 |
| 4 | linux-arm64-cpu16 |
| 3 | linux-amd64-gpu-rtx4090-latest-1 |
| 2 | linux-amd64-gpu-rtx2080-latest-1 |
| 1 | linux-amd64-gpu-h100-latest-1 |
Force-pushed from cf55ae5 to 8e44ac4.
🟩 CI finished in 1h 50m: Pass: 100%/90 | Total: 2d 17h | Avg: 43m 29s | Max: 1h 17m | Hits: 177%/12730

| +/- | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| | Thrust |
| | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| +/- | Catch2Helper |

Modifications in project or dependencies?

| +/- | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |

🏃 Runner counts (total jobs: 90)

| # | Runner |
|---|---|
| 65 | linux-amd64-cpu16 |
| 9 | windows-amd64-cpu16 |
| 6 | linux-amd64-gpu-rtxa6000-latest-1 |
| 4 | linux-arm64-cpu16 |
| 3 | linux-amd64-gpu-rtx4090-latest-1 |
| 2 | linux-amd64-gpu-rtx2080-latest-1 |
| 1 | linux-amd64-gpu-h100-latest-1 |
```cpp
template <>
struct CUB_NS_QUALIFIER::FpLimits<bfloat16_t>
struct __is_extended_floating_point<bfloat16_t> : true_type
```
question: Can you help me understand where this is used? My mental model of our custom `bfloat16_t` and `half_t` is that we use them to "emulate" the native extended fp types in the absence of those. The reason I am asking is that I would like to make sure we're not promoting these "emulated" wrapper types to be a "real extended fp type" in places where they aren't.
It's used by `cuda::is_floating_point` and `cuda::std::numeric_limits`. Adding `bfloat16_t` here allows `__numeric_limits_impl` to be chosen internally by `cuda::std::numeric_limits`. It also ensures that, since `cuda::is_floating_point<__nv_bfloat16>` is true, `cuda::is_floating_point<bfloat16_t>` is also true. In short, it aligns the traits and limits for `__nv_bfloat16` and its wrapper type. The same goes for `__half` and `half_t`.
Force-pushed from 8e44ac4 to 3a81d5f.
Force-pushed from 3a81d5f to d2562b3.
🟩 CI finished in 1h 52m: Pass: 100%/90 | Total: 2d 17h | Avg: 43m 27s | Max: 1h 21m | Hits: 177%/12730

| +/- | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| | Thrust |
| | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| +/- | Catch2Helper |

Modifications in project or dependencies?

| +/- | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| | CUDA Experimental |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |

🏃 Runner counts (total jobs: 90)

| # | Runner |
|---|---|
| 65 | linux-amd64-cpu16 |
| 9 | windows-amd64-cpu16 |
| 6 | linux-amd64-gpu-rtxa6000-latest-1 |
| 4 | linux-arm64-cpu16 |
| 3 | linux-amd64-gpu-rtx4090-latest-1 |
| 2 | linux-amd64-gpu-rtx2080-latest-1 |
| 1 | linux-amd64-gpu-h100-latest-1 |
(cherry picked from commit d85c66a)
Successfully created backport PR.
Pulled out of #3384.