Makefile `-gencode` only embed ptx #9

ptheywood · 2023-03-14T10:34:28Z

The CUDA Makefiles across the many, many branches of this repository only embed SM_35 PTX; they do not compile for any "real" architectures.

NVCC_FLAGS= -gencode arch=compute_35,code=compute_35

This means they will always perform PTX JIT, even when running on an SM_35 device, resulting in wait time and potentially less useful error messages (e.g. if too much constant cache is requested, the error is just that PTX JIT compilation failed).
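One way to confirm what a built binary actually embeds is to list its device code with `cuobjdump`. The fragment below is only a sketch: `TARGET` and the `check-gencode` target name are hypothetical and not taken from this repository, and recipe lines must be indented with a tab.

```makefile
# Hypothetical: path to the executable produced by the existing build rules.
TARGET ?= ./a.out

.PHONY: check-gencode
check-gencode: $(TARGET)
	@echo "Embedded SASS (ELF) images:"
	-cuobjdump --list-elf $(TARGET)
	@echo "Embedded PTX images:"
	-cuobjdump --list-ptx $(TARGET)
```

With the current `code=compute_35` only setting, the PTX listing would be expected to show a compute_35 entry and the ELF listing to show no sm_35 entry. The leading `-` lets make carry on if `cuobjdump` returns a non-zero status.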

Instead, IMO it should always request a real and a virtual arch as a minimum:

e.g.

NVCC_FLAGS= -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35

or

NVCC_FLAGS= -gencode arch=compute_35,code=[sm_35,compute_35]

are some of the many ways this could be achieved.
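If more than one architecture ever needs to be targeted, the flags could also be generated from a list. This is a minimal sketch for GNU make, not what the repository currently does: `CUDA_ARCHS`, `CUDA_ARCH_LAST`, `GENCODE_SASS` and `GENCODE_PTX` are hypothetical names, and the list is assumed to be written in ascending order.

```makefile
# Hypothetical variable: compute capabilities to build real (SASS) code for,
# in ascending order. Override on the command line: make CUDA_ARCHS="35 52 61"
CUDA_ARCHS ?= 35

# A literal comma, so it is not parsed as a function argument separator.
comma := ,

# One -gencode per architecture, embedding real code (sm_XX).
GENCODE_SASS := $(foreach a,$(CUDA_ARCHS),-gencode arch=compute_$(a)$(comma)code=sm_$(a))

# Also embed PTX for the last (assumed newest) listed architecture,
# so devices newer than anything listed can still JIT.
CUDA_ARCH_LAST := $(lastword $(CUDA_ARCHS))
GENCODE_PTX := -gencode arch=compute_$(CUDA_ARCH_LAST)$(comma)code=compute_$(CUDA_ARCH_LAST)

NVCC_FLAGS = $(GENCODE_SASS) $(GENCODE_PTX)
```

With the default `CUDA_ARCHS = 35` this expands to the same two `-gencode` options as the first example above.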

NVCC docs


Robadob · 2023-03-14T11:12:13Z

The same change is probably useful for the Visual Studio project too, where sm_52, sm_52 is not working on sm_61.

Robadob added the bug label Mar 14, 2023
