When using Vista3D, I encountered the following problem while running the command `python -m monai.bundle run --config_file "['configs/inference.json', 'configs/inference_trt.json']"`.

Environment:
TensorRT: 10.1.0
Torch-TensorRT version: 2.4.0
Python version: 3.10.15
CUDA version: 12.4
Torch version: 2.4.0+cu121
GPU: NVIDIA GeForce RTX 4090

Error information:
2024-10-24 10:30:17,210 - root - INFO - Restored all variables from .//models/model.pt
2024-10-24 10:30:17,211 - ignite.engine.engine.Vista3dEvaluator - INFO - Engine run resuming from iteration 0, epoch 0 until 1 epochs
2024-10-24 10:30:18,220 - INFO - Loading TensorRT engine: .//models/model.pt.image_encoder.encoder.plan
[I] Loading bytes from .//models/model.pt.image_encoder.encoder.plan
[E] IExecutionContext::enqueueV3: Error Code 1: Cask (Cask convolution execution)
2024-10-24 10:30:19,129 - INFO - Exception: CUDA ERROR: 700
Falling back to Pytorch ...
2024-10-24 10:30:19,131 - ignite.engine.engine.Vista3dEvaluator - ERROR - Current run is terminating due to exception: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
It looks like an environment problem, but I don't know what went wrong.
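For reference, the versions listed above can be confirmed at runtime like this (assuming a bash-like shell; this check is a suggestion, not part of the original report):

```bash
# Confirm the runtime toolchain matches the versions listed above
nvidia-smi                                                        # driver and visible GPUs
python -c "import torch; print(torch.__version__, torch.version.cuda)"
python -c "import tensorrt; print(tensorrt.__version__)"
```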
Hi @xiaocaimmm, the error message doesn't reveal the root cause. Can you set CUDA_LAUNCH_BLOCKING=1 and rerun the script? That may produce a more accurate error trace.
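Something like the following (assuming a bash-like shell):

```bash
# Synchronous kernel launches make the reported stack trace point at
# the kernel that actually faulted, at the cost of slower execution.
CUDA_LAUNCH_BLOCKING=1 python -m monai.bundle run \
    --config_file "['configs/inference.json', 'configs/inference_trt.json']"
```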
In addition, did you modify inference_trt.json, for example by enabling head_trt_enabled? That option requires more GPU memory.
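A quick way to check that flag in your local copy (assuming jq is available and the flag sits at the top level of the config; adjust the path if it is nested):

```bash
# Prints the current value, or null if the key is absent / left at default
jq '.head_trt_enabled' configs/inference_trt.json
```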