When using version 0.10.2 of decode with the same parameter.yaml file, sampling training data on Linux with an A800 takes significantly longer than on Windows with an RTX 3080 Ti. Both runs use the simulation parameters below.
On Windows with the RTX 3080 Ti, sampling training data takes about 8 seconds per epoch during training. On Linux with the A800, however, it takes tens of minutes (I did not wait for the sampling in a single epoch to finish because it took too long). I investigated the code, added print statements at key points, and found that execution was very slow at one particular line in the sampling code.
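To separate host-side slowness from GPU work, it may be worth timing the suspect call with explicit CUDA synchronization, since asynchronous kernel launches make plain Python timing misleading. A minimal sketch, assuming PyTorch with CUDA; `simulation.sample` below is a hypothetical stand-in for whatever call wraps the slow line, not decode's actual API:

```python
import time
import torch

def timed(label, fn, *args, **kwargs):
    """Time a call, including any GPU work it queues."""
    torch.cuda.synchronize()   # drain previously queued kernels
    t0 = time.perf_counter()
    out = fn(*args, **kwargs)
    torch.cuda.synchronize()   # wait for this call's kernels to finish
    print(f"{label}: {time.perf_counter() - t0:.3f} s")
    return out

# Hypothetical usage at the slow spot:
# frames, emitters = timed("sample training data", simulation.sample)
```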
However, nvidia-smi showed GPU utilization consistently at 100%, which is very strange. I specifically checked the spline library and found that it was compiled for sm_37. Could this be the cause of the performance problem? Yet the same sm_37-compiled code does not hurt performance on Windows with the RTX 3080 Ti. Recompiling to test this is quite difficult for me, so I am asking for your help.
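One way to check whether an architecture mismatch is plausible without recompiling: compare the device's compute capability with the architectures embedded in the compiled spline extension. A sketch, assuming a CUDA-enabled PyTorch install and that `cuobjdump` (shipped with the CUDA toolkit) is on PATH; the library path is hypothetical and should point at the actual compiled file:

```python
import subprocess
import torch

# Compute capability of the active GPU: an A800 reports (8, 0),
# an RTX 3080 Ti reports (8, 6).
major, minor = torch.cuda.get_device_capability(0)
print(f"device compute capability: sm_{major}{minor}")

# List the architectures baked into the compiled spline library.
# Hypothetical path -- replace with the extension's actual .so file.
lib = "/path/to/spline_psf_cuda.so"
result = subprocess.run(["cuobjdump", "--list-elf", lib],
                        capture_output=True, text=True)
print(result.stdout)
```

If the output shows only sm_37 machine code, neither Ampere card can run it natively; the driver would have to JIT-compile from embedded PTX, which is slow on first use but normally cached afterwards, so this alone may not fully explain a persistent slowdown.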