There is a constellation of issues around using smaller models (e.g., with Ollama) when agents rely on client token-counting features. For example, remaining_tokens and get_token_limit fail with a KeyError:
Traceback (most recent call last):
File "/home/afourney/repos/autogen/python/packages/autogen-ext/src/autogen_ext/agents/web_surfer/_multimodal_web_surfer.py", line 412, in on_messages_stream
content = await self._generate_reply(cancellation_token=cancellation_token)
File "/home/afourney/repos/autogen/python/packages/autogen-ext/src/autogen_ext/agents/web_surfer/_multimodal_web_surfer.py", line 585, in _generate_reply
return await self._execute_tool(message, rects, tool_names, cancellation_token=cancellation_token)
File "/home/afourney/repos/autogen/python/packages/autogen-ext/src/autogen_ext/agents/web_surfer/_multimodal_web_surfer.py", line 715, in _execute_tool
return await self._summarize_page(cancellation_token=cancellation_token)
File "/home/afourney/repos/autogen/python/packages/autogen-ext/src/autogen_ext/agents/web_surfer/_multimodal_web_surfer.py", line 876, in _summarize_page
remaining = self._model_client.remaining_tokens(messages + [message])
File "/home/afourney/repos/autogen/python/packages/autogen-ext/src/autogen_ext/models/openai/_openai_client.py", line 953, in remaining_tokens
token_limit = _model_info.get_token_limit(self._create_args["model"])
File "/home/afourney/repos/autogen/python/packages/autogen-ext/src/autogen_ext/models/openai/_model_info.py", line 175, in get_token_limit
return _MODEL_TOKEN_LIMITS[resolved_model]
KeyError: 'llama3.3:latest'
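As a stopgap, callers can guard against the missing token-limit entry. The sketch below is a hypothetical caller-side workaround (not a library fix), assuming the client continues to raise KeyError for models absent from _MODEL_TOKEN_LIMITS:

# Hypothetical guard for models missing from _MODEL_TOKEN_LIMITS
# (e.g., Ollama models such as "llama3.3:latest"). Sketch only.
def safe_remaining_tokens(model_client, messages, default=None):
    try:
        return model_client.remaining_tokens(messages)
    except KeyError:
        # Unknown model: no registered token limit, so fall back to a
        # caller-supplied default (e.g., None to skip token budgeting).
        return default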
Likewise, when relying on count_tokens, the following gets printed to the console, outside of our logging mechanism:
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
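For context, the message looks like the output of a tokenizer fallback along these lines (a sketch of the assumed pattern, based on the printed text and on tiktoken's behavior for unknown model names):

import tiktoken

# Sketch of the assumed fallback: tiktoken does not recognize the model
# name, so the client drops to the cl100k_base encoding and reports it
# with a bare print() instead of going through the logging framework.
def get_encoding_for(model: str) -> tiktoken.Encoding:
    try:
        return tiktoken.encoding_for_model(model)
    except KeyError:
        print(f"Model {model} not found. Using cl100k_base encoding.")
        return tiktoken.get_encoding("cl100k_base")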