ValueError: Cannot use chat template functions because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating #160
Comments
Running python -W ignore server.py produces this error. May I ask how I can solve this? Thanks!
Having the same issue
Having the same issue using vila-infer
How can I solve this?
Getting the same error with TensorRT-LLM when running python3 build_visual_engine.py --model_path tmp/hf_models/${MODEL_NAME} --model_type vila --vila_path ${VILA_PATH} # for VILA (it also prints a FutureWarning from /usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:128).
You can add a chat_template entry to the LLM's tokenizer_config.json file. The old models don't have that entry, and I think the HuggingFace library used to apply a default chat template for unknown models. However, since they changed that behavior because of logic errors caused by the default template, you now need to add a chat template yourself. I'm not sure whether the chat_template entry from the NVILA model would work, but you can get the idea: https://huggingface.co/Efficient-Large-Model/NVILA-8B/blob/049743cc51bdd8872cdf696a1a58b03fef2c367e/llm/tokenizer_config.json#L50 For comparison, here is the tokenizer_config.json from the old model; there is no entry for chat_template: https://huggingface.co/Efficient-Large-Model/VILA1.5-3b/blob/main/llm/tokenizer_config.json
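For illustration, here is a minimal sketch of setting the template programmatically and writing it back into tokenizer_config.json instead of hand-editing the file. The model path and the ChatML-style template string are placeholders for illustration, not the template VILA was trained with; copy the actual template from the NVILA tokenizer_config.json linked above if you want matching behavior.

```python
# Sketch only: set tokenizer.chat_template in code and persist it, instead of
# hand-editing tokenizer_config.json. The path and the ChatML-style template
# below are placeholders, not the template VILA was actually trained with.
from transformers import AutoTokenizer

llm_dir = "path/to/VILA1.5-3b/llm"  # hypothetical path to the LLM subfolder
tokenizer = AutoTokenizer.from_pretrained(llm_dir)

tokenizer.chat_template = (
    "{% for message in messages %}"
    "{{ '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>\\n' }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}{{ '<|im_start|>assistant\\n' }}{% endif %}"
)

# save_pretrained() writes the chat_template entry back into tokenizer_config.json.
tokenizer.save_pretrained(llm_dir)
```

Whether generations stay coherent depends on the template matching the conversation format the checkpoint was trained with, so copying the real NVILA template is the safer route.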
I have added the same content to the tokenizer_config.json of VILA1.5-3b, but it still reports the same error.
You only need to add this to the tokenizer_config.json of VILA1.5-3b; it works for me with TensorRT-LLM.
Okay, it seems this problem has been solved. Taking this opportunity, I would like to ask another similar question. Thank you! When I start the server, a similar error is reported when the client initiates a request, whether with NVILA-15B or VILA1.5-3B: Traceback (most recent call last):
Thanks |
Is your VILA code version the newest?
I'm having this issue right now with the server code too. Did anyone solve this?
It seems we haven't found a solution to this problem yet, but I have rebuilt a server and client using the project's inference code, and that works for inference. You can refer to the code in VILA/llava/cli/infer.py.
@HAOYON-666 thanks for the suggestion! Yes, that's what I ended up doing too, but it would still be cool to have an official server implementation compatible with streaming and OpenAPI.
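For anyone taking the same route, here is a rough, hypothetical sketch of what such a minimal wrapper can look like: a FastAPI endpoint around whatever inference call you lift from llava/cli/infer.py. The run_vila_inference helper is a placeholder for that code, not a function that exists in the repo.

```python
# Hypothetical minimal wrapper: expose inference lifted from llava/cli/infer.py over HTTP.
# run_vila_inference is a placeholder name, not part of the VILA repo.
from fastapi import FastAPI
from pydantic import BaseModel
import uvicorn

app = FastAPI()

class InferRequest(BaseModel):
    prompt: str
    image_path: str | None = None  # local path or URL, depending on your inference code

def run_vila_inference(prompt: str, image_path: str | None) -> str:
    # Placeholder: call the model exactly the way llava/cli/infer.py does
    # (load the model once at startup and reuse it here).
    raise NotImplementedError

@app.post("/generate")
def generate(req: InferRequest) -> dict:
    text = run_vila_inference(req.prompt, req.image_path)
    return {"response": text}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
```

Loading the model once at startup (e.g. in a lifespan handler, as server.py already does) and reusing it per request is the main design point; the rest is plumbing.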
(VILA) (base) user@ubuntu(125):/data/workspace/zhaoyong/model/VILA$ sh 1.sh
[2024-12-18 09:46:02,468] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
INFO: Started server process [2989388]
INFO: Waiting for application startup.
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00, 1.08s/it]
We've detected an older driver with an RTX 4000 series GPU. These drivers have issues with P2P. This can affect the multi-gpu inference when using accelerate device_map.Please make sure to update your driver to the latest version which resolves this.
ERROR: Traceback (most recent call last):
File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/starlette/routing.py", line 693, in lifespan
async with self.lifespan_context(app) as maybe_state:
File "/home/user/miniconda3/envs/VILA/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/data/workspace/zhaoyong/model/VILA/server.py", line 118, in lifespan
tokenizer, model, image_processor, context_len = load_pretrained_model(model_path, model_name, None)
File "/data/workspace/zhaoyong/model/VILA/llava/model/builder.py", line 115, in load_pretrained_model
model = LlavaLlamaModel(config=config, low_cpu_mem_usage=True, **kwargs)
File "/data/workspace/zhaoyong/model/VILA/llava/model/language_model/llava_llama.py", line 49, in __init__
self.init_vlm(config=config, *args, **kwargs)
File "/data/workspace/zhaoyong/model/VILA/llava/model/llava_arch.py", line 74, in init_vlm
self.llm, self.tokenizer = build_llm_and_tokenizer(llm_cfg, config, *args, **kwargs)
File "/data/workspace/zhaoyong/model/VILA/llava/model/language_model/builder.py", line 203, in build_llm_and_tokenizer
tokenizer.stop_tokens = infer_stop_tokens(tokenizer)
File "/data/workspace/zhaoyong/model/VILA/llava/utils/tokenizer.py", line 174, in infer_stop_tokens
template = tokenize_conversation(DUMMY_CONVERSATION, tokenizer, overrides={"gpt": SENTINEL_TOKEN})
File "/data/workspace/zhaoyong/model/VILA/llava/utils/tokenizer.py", line 110, in tokenize_conversation
text = tokenizer.apply_chat_template(conversation, add_generation_prompt=add_generation_prompt, tokenize=False)
File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1803, in apply_chat_template
chat_template = self.get_chat_template(chat_template, tools)
File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1967, in get_chat_template
raise ValueError(
ValueError: Cannot use chat template functions because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating
ERROR: Application startup failed. Exiting.