
VISTA-3D:About TensorRT speedup Error #703

Open
xiaocaimmm opened this issue Oct 24, 2024 · 1 comment
Comments

@xiaocaimmm

When using VISTA-3D, I encountered the following problem while running this command:

python -m monai.bundle run --config_file "['configs/inference.json', 'configs/inference_trt.json']"

Environment
TensorRT: 10.1.0
Torch-TensorRT version: 2.4.0
Python version: 3.10.15
CUDA version: 12.4
Torch version: 2.4.0+cu121
GPU: NVIDIA GeForce RTX 4090

Error information
2024-10-24 10:30:17,210 - root - INFO - Restored all variables from .//models/model.pt
2024-10-24 10:30:17,211 - ignite.engine.engine.Vista3dEvaluator - INFO - Engine run resuming from iteration 0, epoch 0 until 1 epochs
2024-10-24 10:30:18,220 - INFO - Loading TensorRT engine: .//models/model.pt.image_encoder.encoder.plan
[I] Loading bytes from .//models/model.pt.image_encoder.encoder.plan
[E] IExecutionContext::enqueueV3: Error Code 1: Cask (Cask convolution execution)
2024-10-24 10:30:19,129 - INFO - Exception: CUDA ERROR: 700
Falling back to Pytorch ...
2024-10-24 10:30:19,131 - ignite.engine.engine.Vista3dEvaluator - ERROR - Current run is terminating due to exception: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

It looks like an environment problem, but I can't tell what went wrong.

@yiheng-wang-nv
Collaborator

Hi @xiaocaimmm, the error message does not reveal the root cause. Could you set CUDA_LAUNCH_BLOCKING=1 and rerun the script? It may produce a more accurate error trace.
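
For example, a minimal sketch (assuming the same working directory and config files as in the command above):

CUDA_LAUNCH_BLOCKING=1 python -m monai.bundle run --config_file "['configs/inference.json', 'configs/inference_trt.json']"

Setting CUDA_LAUNCH_BLOCKING=1 makes CUDA kernel launches synchronous, so the reported stack trace should point at the call that actually triggered the illegal memory access instead of a later API call.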

In addition, did you modify inference_trt.json, for example by enabling head_trt_enabled? That option requires more GPU memory.
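
If you did enable it, one way to turn it back off without editing the file is to override the entry on the command line. This is a sketch only, assuming head_trt_enabled is a top-level entry in inference_trt.json and relying on the generic --<key> <value> override mechanism of the MONAI bundle CLI:

python -m monai.bundle run --config_file "['configs/inference.json', 'configs/inference_trt.json']" --head_trt_enabled False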
