There is a constellation of issues around using smaller models (e.g., with Ollama) when agents rely on client token-counting features. For example, remaining_tokens and get_token_limit fail with a KeyError:
Traceback (most recent call last):
File "/home/afourney/repos/autogen/python/packages/autogen-ext/src/autogen_ext/agents/web_surfer/_multimodal_web_surfer.py", line 412, in on_messages_stream
content = await self._generate_reply(cancellation_token=cancellation_token)
File "/home/afourney/repos/autogen/python/packages/autogen-ext/src/autogen_ext/agents/web_surfer/_multimodal_web_surfer.py", line 585, in _generate_reply
return await self._execute_tool(message, rects, tool_names, cancellation_token=cancellation_token)
File "/home/afourney/repos/autogen/python/packages/autogen-ext/src/autogen_ext/agents/web_surfer/_multimodal_web_surfer.py", line 715, in _execute_tool
return await self._summarize_page(cancellation_token=cancellation_token)
File "/home/afourney/repos/autogen/python/packages/autogen-ext/src/autogen_ext/agents/web_surfer/_multimodal_web_surfer.py", line 876, in _summarize_page
remaining = self._model_client.remaining_tokens(messages + [message])
File "/home/afourney/repos/autogen/python/packages/autogen-ext/src/autogen_ext/models/openai/_openai_client.py", line 953, in remaining_tokens
token_limit = _model_info.get_token_limit(self._create_args["model"])
File "/home/afourney/repos/autogen/python/packages/autogen-ext/src/autogen_ext/models/openai/_model_info.py", line 175, in get_token_limit
return _MODEL_TOKEN_LIMITS[resolved_model]
KeyError: 'llama3.3:latest'
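As a stopgap, callers can guard against the missing token-limit entry. The sketch below is a hypothetical caller-side workaround (not a library fix), assuming the client continues to raise KeyError for models absent from _MODEL_TOKEN_LIMITS:

# Hypothetical guard for models missing from _MODEL_TOKEN_LIMITS
# (e.g., Ollama models such as "llama3.3:latest"). Sketch only.
def safe_remaining_tokens(model_client, messages, default=None):
    try:
        return model_client.remaining_tokens(messages)
    except KeyError:
        # Unknown model: no registered token limit, so fall back to a
        # caller-supplied default (e.g., None to skip token budgeting).
        return default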
Likewise, when relying on count_tokens, the following gets printed to the console, outside of our logging mechanism:
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
Model llama3.3:latest not found. Using cl100k_base encoding.
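For context, the message looks like the output of a tokenizer fallback along these lines (a sketch of the assumed pattern, based on the printed text and on tiktoken's behavior for unknown model names):

import tiktoken

# Sketch of the assumed fallback: tiktoken does not recognize the model
# name, so the client drops to the cl100k_base encoding and reports it
# with a bare print() instead of going through the logging framework.
def get_encoding_for(model: str) -> tiktoken.Encoding:
    try:
        return tiktoken.encoding_for_model(model)
    except KeyError:
        print(f"Model {model} not found. Using cl100k_base encoding.")
        return tiktoken.get_encoding("cl100k_base")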