Could you explain your motivation for setting it to 0 instead of 1 << 30? 1 << 30 should work equivalently to 0, and it is friendlier to our kernel implementation.
System Info
- GPU: A100
- TensorRT-LLM version: tensorrt_llm-0.13.0.dev2024090300
- OS: Ubuntu
Who can help?
Hi @ncomly-nvidia, @byshiue,
I want to set no_repeat_ngram_size=0 for a Mistral model, but I get the following assertion error:
RuntimeError: [TensorRT-LLM][ERROR] Assertion failed: noRepeatNgramSize.value() > 0 (/home/jenkins/agent/workspace/LLM/main/L0_PostMerge/llm/cpp/tensorrt_llm/executor/samplingConfig.cpp:332)
As per the documentation, the default value is 1 << 30. Is there a way to set the value to 0? If not, can this feature be added?
Reproduction
Setting no_repeat_ngram_size=0 under SamplingParams for the Mistral model.
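A minimal sketch of the failing call, assuming TensorRT-LLM's high-level LLM API (tensorrt_llm.LLM and tensorrt_llm.SamplingParams); the checkpoint path and surrounding setup are placeholders and may differ from the actual script:

```python
# Sketch of the reproduction, not the exact script used.
# Assumes the high-level LLM API of TensorRT-LLM 0.13.0.dev;
# the model path below is a placeholder.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")  # hypothetical model path

# Passing 0 here raises:
#   RuntimeError: [TensorRT-LLM][ERROR] Assertion failed: noRepeatNgramSize.value() > 0
sampling_params = SamplingParams(no_repeat_ngram_size=0)

outputs = llm.generate(["Hello"], sampling_params)
```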
Expected behavior
The user should be allowed to set this value to 0.
Actual behavior
An assertion error is raised (see above).
Additional notes
We want to set it to 0, as we do when running inference with PyTorch in eager mode; see the sketch below.
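A possible interim workaround, based on the comment at the top of this thread that 1 << 30 behaves equivalently to 0 (i.e. the n-gram repetition penalty is effectively disabled); this is a sketch under the same API assumptions as the reproduction snippet above:

```python
from tensorrt_llm import SamplingParams

# 1 << 30 is the documented default for no_repeat_ngram_size; per the
# maintainer comment above it is treated as "no n-gram repetition
# penalty", so it can stand in for 0 until 0 is accepted directly.
sampling_params = SamplingParams(no_repeat_ngram_size=1 << 30)
```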