
ValueError: Cannot use chat template functions because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating #160

Open
HAOYON-666 opened this issue Dec 16, 2024 · 14 comments


@HAOYON-666

HAOYON-666 commented Dec 16, 2024

(VILA) (base) user@ubuntu(125):/data/workspace/zhaoyong/model/VILA$ sh 1.sh
[2024-12-18 09:46:02,468] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
INFO: Started server process [2989388]
INFO: Waiting for application startup.
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00, 1.08s/it]
We've detected an older driver with an RTX 4000 series GPU. These drivers have issues with P2P. This can affect the multi-gpu inference when using accelerate device_map.Please make sure to update your driver to the latest version which resolves this.
ERROR: Traceback (most recent call last):
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/starlette/routing.py", line 693, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/contextlib.py", line 199, in __aenter__
    return await anext(self.gen)
  File "/data/workspace/zhaoyong/model/VILA/server.py", line 118, in lifespan
    tokenizer, model, image_processor, context_len = load_pretrained_model(model_path, model_name, None)
  File "/data/workspace/zhaoyong/model/VILA/llava/model/builder.py", line 115, in load_pretrained_model
    model = LlavaLlamaModel(config=config, low_cpu_mem_usage=True, **kwargs)
  File "/data/workspace/zhaoyong/model/VILA/llava/model/language_model/llava_llama.py", line 49, in __init__
    self.init_vlm(config=config, *args, **kwargs)
  File "/data/workspace/zhaoyong/model/VILA/llava/model/llava_arch.py", line 74, in init_vlm
    self.llm, self.tokenizer = build_llm_and_tokenizer(llm_cfg, config, *args, **kwargs)
  File "/data/workspace/zhaoyong/model/VILA/llava/model/language_model/builder.py", line 203, in build_llm_and_tokenizer
    tokenizer.stop_tokens = infer_stop_tokens(tokenizer)
  File "/data/workspace/zhaoyong/model/VILA/llava/utils/tokenizer.py", line 174, in infer_stop_tokens
    template = tokenize_conversation(DUMMY_CONVERSATION, tokenizer, overrides={"gpt": SENTINEL_TOKEN})
  File "/data/workspace/zhaoyong/model/VILA/llava/utils/tokenizer.py", line 110, in tokenize_conversation
    text = tokenizer.apply_chat_template(conversation, add_generation_prompt=add_generation_prompt, tokenize=False)
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1803, in apply_chat_template
    chat_template = self.get_chat_template(chat_template, tools)
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1967, in get_chat_template
    raise ValueError(
ValueError: Cannot use chat template functions because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating

ERROR: Application startup failed. Exiting.
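
For reference, a minimal check (the path is illustrative, pointing at the llm/ subfolder inside the VILA1.5-3B checkpoint) confirms that the shipped tokenizer simply has no template set:

```python
from transformers import AutoTokenizer

# Illustrative path: the llm/ subfolder of the downloaded VILA1.5-3B checkpoint,
# which is where tokenizer_config.json lives.
tok = AutoTokenizer.from_pretrained(
    "/data/workspace/zhaoyong/model/weight_files/VILA1.5-3B/llm"
)

# Prints None when tokenizer_config.json has no chat_template entry,
# which is exactly the condition that triggers the ValueError above.
print(tok.chat_template)
```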

gheinrich pushed a commit to gheinrich/VILA that referenced this issue Dec 16, 2024
@HAOYON-666
Author

python -W ignore server.py \
    --port 8000 \
    --model-path /data/workspace/zhaoyong/model/weight_files/VILA1.5-3B \
    --conv-mode vicuna_v1

@HAOYON-666
Author

May I ask how I can solve this? Thanks!

@doruksonmez

Having the same issue

@Arhosseini77

Arhosseini77 commented Dec 31, 2024

Having the same issue using vila-infer

Traceback (most recent call last):
  File "/root/miniconda3/envs/vila/bin/vila-infer", line 8, in <module>
    sys.exit(main())
  File "/code/Datasets/ARH/NVIDIA_VILA/VILA/llava/cli/infer.py", line 39, in main
    response = model.generate_content(prompt)
  File "/root/miniconda3/envs/vila/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/code/Datasets/ARH/NVIDIA_VILA/VILA/llava/model/llava_arch.py", line 825, in generate_content
    input_ids = tokenize_conversation(conversation, self.tokenizer, add_generation_prompt=True).cuda().unsqueeze(0)
  File "/code/Datasets/ARH/NVIDIA_VILA/VILA/llava/utils/tokenizer.py", line 110, in tokenize_conversation
    text = tokenizer.apply_chat_template(
  File "/root/miniconda3/envs/vila/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1803, in apply_chat_template
    chat_template = self.get_chat_template(chat_template, tools)
  File "/root/miniconda3/envs/vila/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1967, in get_chat_template
    raise ValueError(
ValueError: Cannot use chat template functions because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating

How can I solve this?

@fangbaolei

python3 build_visual_engine.py --model_path tmp/hf_models/${MODEL_NAME} --model_type vila --vila_path ${VILA_PATH} # for VILA

/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:128: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
[TensorRT-LLM] TensorRT-LLM version: 0.12.0
[2025-01-09 13:56:09,944] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2025-01-09 13:56:10,123] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/home/alpha/work/multimodal/VILA/llava/model/qlinear_te.py:95: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
@amp.custom_fwd(cast_inputs=torch.bfloat16)
/home/alpha/work/multimodal/VILA/llava/model/qlinear_te.py:147: FutureWarning: torch.cuda.amp.custom_bwd(args...) is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda') instead.
def backward(ctx, grad_output):
/home/alpha/work/multimodal/VILA/llava/model/llava_arch.py:113: UserWarning: model_dtype not found in config, defaulting to torch.float16.
warnings.warn("model_dtype not found in config, defaulting to torch.float16.")
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████| 2/2 [00:04<00:00, 2.30s/it]
Traceback (most recent call last):
  File "/home/alpha/work/multimodal/TensorRT-LLM/examples/multimodal/build_visual_engine.py", line 12, in <module>
    builder.build()
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/tools/multimodal_builder.py", line 85, in build
    build_vila_engine(args)
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/tools/multimodal_builder.py", line 391, in build_vila_engine
    model = AutoModel.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
  File "/home/alpha/work/multimodal/VILA/llava/model/language_model/llava_llama.py", line 67, in from_pretrained
    return cls.load_pretrained(
  File "/home/alpha/work/multimodal/VILA/llava/model/llava_arch.py", line 132, in load_pretrained
    vlm = cls(config, *args, **kwargs)
  File "/home/alpha/work/multimodal/VILA/llava/model/language_model/llava_llama.py", line 49, in __init__
    self.init_vlm(config=config, *args, **kwargs)
  File "/home/alpha/work/multimodal/VILA/llava/model/llava_arch.py", line 74, in init_vlm
    self.llm, self.tokenizer = build_llm_and_tokenizer(llm_cfg, config, *args, **kwargs)
  File "/home/alpha/work/multimodal/VILA/llava/model/language_model/builder.py", line 203, in build_llm_and_tokenizer
    tokenizer.stop_tokens = infer_stop_tokens(tokenizer)
  File "/home/alpha/work/multimodal/VILA/llava/utils/tokenizer.py", line 176, in infer_stop_tokens
    template = tokenize_conversation(DUMMY_CONVERSATION, tokenizer, overrides={"gpt": SENTINEL_TOKEN})
  File "/home/alpha/work/multimodal/VILA/llava/utils/tokenizer.py", line 110, in tokenize_conversation
    text = tokenizer.apply_chat_template(
  File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 1803, in apply_chat_template
    chat_template = self.get_chat_template(chat_template, tools)
  File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 1967, in get_chat_template
    raise ValueError(
ValueError: Cannot use chat template functions because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating

The same error occurs with TensorRT-LLM.

@doruksonmez

You can add a chat_template entry to the LLM's tokenizer_config.json file. The old models don't have that entry, and I think the Hugging Face library used to append a default chat template for models that didn't define one. Since they changed that behavior (the default template caused logical errors), you now need to add a chat template yourself. I'm not sure if the chat_template entry from the NVILA model would work as-is, but it gives you the idea: https://huggingface.co/Efficient-Large-Model/NVILA-8B/blob/049743cc51bdd8872cdf696a1a58b03fef2c367e/llm/tokenizer_config.json#L50

For comparison, here is the tokenizer_config.json from the old model. There is no entry for the chat_template: https://huggingface.co/Efficient-Large-Model/VILA1.5-3b/blob/main/llm/tokenizer_config.json
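
If you prefer not to hand-edit the JSON, a sketch of the same fix done from Python (the path and the template string are placeholders; tokenizer.chat_template and save_pretrained are standard transformers APIs, and re-saving writes the template back into tokenizer_config.json):

```python
from transformers import AutoTokenizer

# Illustrative path to the llm/ subfolder of the local VILA checkpoint.
llm_path = "/path/to/VILA1.5-3b/llm"

tok = AutoTokenizer.from_pretrained(llm_path)

# Placeholder Jinja template; replace it with one that matches the
# conversation format the model was trained with.
tok.chat_template = (
    "{% for message in messages %}"
    "{{ message['role'] + ': ' + message['content'] + '\n' }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}{{ 'assistant: ' }}{% endif %}"
)

# Re-saving the tokenizer writes the chat_template entry into tokenizer_config.json.
tok.save_pretrained(llm_path)
```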

@HAOYON-666
Author

> You can add a chat_template entry to the LLM's tokenizer_config.json file. […]

I have added the same content to the tokenizer_config.json of VILA1.5-3b, but it still reports the same error.

@fangbaolei

fangbaolei commented Jan 10, 2025

> You can add a chat_template entry to the LLM's tokenizer_config.json file. […]
>
> I have added the same content to the tokenizer_config.json of VILA1.5-3b, but it still reports the same error.

You only need to add this to the tokenizer_config.json of VILA1.5-3b; it works for me with TensorRT-LLM:
"chat_template": "{% if messages[0]['role'] != 'system' %}{{ '<|im_start|>system\nYou are a helpful assistant<|im_end|>\n' }}{% endif %}{% for message in messages if message['content'] is not none %}{{ '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",

@HAOYON-666
Author

HAOYON-666 commented Jan 10, 2025

> You only need to add this to the tokenizer_config.json of VILA1.5-3b; it works for me with TensorRT-LLM […]

Okay, it seems that problem has been solved. Taking this opportunity, I would like to ask another, similar question, thank you! When I start this server, a similar error is also reported when the client sends a request, with both NVILA-15B and VILA1.5-3B:


Traceback (most recent call last):
  File "/data/workspace/zhaoyong/model/VILA/infer.py", line 13, in <module>
    response = client.chat.completions.create(
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/openai/_utils/_utils.py", line 271, in wrapper
    return func(*args, **kwargs)
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 648, in create
    return self._post(
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/openai/_base_client.py", line 1167, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/openai/_base_client.py", line 856, in request
    return self._request(
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/openai/_base_client.py", line 932, in _request
    return self._retry_request(
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/openai/_base_client.py", line 980, in _retry_request
    return self._request(
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/openai/_base_client.py", line 932, in _request
    return self._retry_request(
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/openai/_base_client.py", line 980, in _retry_request
    return self._request(
  File "/home/user/miniconda3/envs/VILA/lib/python3.10/site-packages/openai/_base_client.py", line 947, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.InternalServerError: Error code: 500 - {'error': 'Invalid style: SeparatorStyle.AUTO'}

@Arhosseini77

> You only need to add this to the tokenizer_config.json of VILA1.5-3b; it works for me with TensorRT-LLM […]

Thanks

@fangbaolei

> Thanks

Is your VILA code the newest version?

@tolgaouz

> Okay, it seems that problem has been solved. […] When I start this server, a similar error is also reported when the client sends a request, with both NVILA-15B and VILA1.5-3B:
>
> openai.InternalServerError: Error code: 500 - {'error': 'Invalid style: SeparatorStyle.AUTO'}

I'm having this issue right now with the server code too. Did anyone solve this?

@HAOYON-666
Author

HAOYON-666 commented Jan 16, 2025

> I'm having this issue right now with the server code too. Did anyone solve this?

It seems we haven't found a solution to that problem yet, but I rebuilt a server and client using the inference code from the project, and that can run inference. You can refer to the code in VILA/llava/cli/infer.py.
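
For anyone taking the same route, a rough sketch of such a wrapper, assuming the load_pretrained_model and model.generate_content calls that appear in the tracebacks earlier in this thread; the endpoint name, request schema, and model path are illustrative and not part of the official server:

```python
from fastapi import FastAPI
from pydantic import BaseModel

# Same loader that server.py uses in the traceback at the top of this issue.
from llava.model.builder import load_pretrained_model

MODEL_PATH = "/path/to/VILA1.5-3B"  # illustrative
MODEL_NAME = "VILA1.5-3B"           # illustrative

app = FastAPI()
model = None


class GenerateRequest(BaseModel):
    prompt: str


@app.on_event("startup")
def load_model():
    global model
    # load_pretrained_model returns (tokenizer, model, image_processor, context_len).
    _tokenizer, model, _image_processor, _context_len = load_pretrained_model(
        MODEL_PATH, MODEL_NAME, None
    )


@app.post("/generate")
def generate(req: GenerateRequest):
    # generate_content is the method llava/cli/infer.py calls; depending on the
    # model, the prompt may need to be a list that also includes media.
    return {"response": model.generate_content(req.prompt)}
```

Run it with uvicorn (e.g. `uvicorn my_server:app --port 8000`) and POST a JSON body with a prompt field.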

@tolgaouz

@HAOYON-666 thanks for the suggestion! Yes, that's what I ended up doing too, but it would still be nice to have an official server implementation compatible with streaming and OpenAPI.
