The CUDA `Makefile`s throughout the many, many branches of this repository only embed `SM_35` PTX; they do not compile for any "real" architectures.

This means they will always perform PTX JIT, even when running on an `SM_35` device, resulting in added wait time and potentially less useful error messages (i.e. if too much constant cache is requested, the error is just `a ptx jit compilation failed`).

Instead, IMO it should always request a real and virtual arch as a minimum:
e.g.
or
are some of the many ways this could be achieved.
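
As a minimal sketch of what requesting both a real and a virtual architecture could look like, the fragment below shows two equivalent sets of `nvcc` flags. The variable names `GENCODE_FLAGS` and `NVCCFLAGS` are assumptions for illustration and may not match the names used in this repository's `Makefile`; only the `-gencode` and `-arch` options themselves are standard `nvcc` options.

```makefile
# Hypothetical variable names; adapt to whatever the Makefile already uses.

# Option 1: explicit -gencode pairs.
#   code=sm_35      embeds SASS for real SM_35 devices (no JIT needed there)
#   code=compute_35 embeds PTX so newer devices can still JIT
GENCODE_FLAGS := -gencode arch=compute_35,code=sm_35 \
                 -gencode arch=compute_35,code=compute_35

# Option 2: the -arch=sm_XY shorthand, which nvcc documents as equivalent to
#   -arch=compute_35 -code=sm_35,compute_35
# GENCODE_FLAGS := -arch=sm_35

NVCCFLAGS += $(GENCODE_FLAGS)
```

With either option the binary carries both SM_35 machine code and SM_35 PTX, so an SM_35 device loads the precompiled code directly and only newer architectures fall back to JIT.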
NVCC docs