Use ollama's structured outputs feature #242

Closed
daniel-j-h opened this issue Dec 13, 2024 · 18 comments
Labels
enhancement New feature or request

Comments

@daniel-j-h

I'm looking into pydantic-ai with small and locally running ollama models as backbones.

I'm noticing that sometimes even for simple models it's possible to run into unexpected ValidationErrors.

Here's what I mean: With a pydantic model as simple as

class Answer(BaseModel):
    value: str = ""

I can see pydantic-ai sometimes retrying and then failing validation.

Having experience with llama.cpp's grammars, this was unexpected to me. I was under the assumption that pydantic-ai would transform the pydantic model into a grammar or JSON schema to hard-restrict the LLM's output accordingly. Then validation could never fail by design, since the LLM's output is constrained to that grammar.
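
For reference, here is roughly the JSON schema Pydantic produces for that model (a sketch assuming Pydantic v2's model_json_schema(); this is the shape I would expect to be used for constraining the output):

from pydantic import BaseModel

class Answer(BaseModel):
    value: str = ""

print(Answer.model_json_schema())
# {'properties': {'value': {'default': '', 'title': 'Value', 'type': 'string'}},
#  'title': 'Answer', 'type': 'object'}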

Instead, when I debug the request pydantic-ai sends to the locally running ollama with

nc -l -p 11434

I can see pydantic-ai turning the pydantic model into a tool use invocation.

With ollama v0.5.0, structured output via JSON schema is now supported:

https://github.com/ollama/ollama/releases/tag/v0.5.0

I was wondering if that would solve the issue of small locally running models sometimes running into validation errors, since we hard-restrict the output to the shape of our pydantic model.
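
For illustration, here is a rough sketch of what using that feature could look like with the ollama Python client, assuming its chat() call accepts a JSON schema via the format argument as described in the release notes (model name and prompt are just examples):

from ollama import chat
from pydantic import BaseModel

class Answer(BaseModel):
    value: str = ""

# Constrain generation to the model's JSON schema (ollama v0.5.0 structured outputs).
response = chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    format=Answer.model_json_schema(),
)

# The content should then always be valid JSON for that schema.
answer = Answer.model_validate_json(response["message"]["content"])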

Any thoughts on this, or ideas why validation can fail with tool usage as implemented right now? Any pointers on which model providers validation might fail for, and why? Thanks!

@daniel-j-h
Author

I have looked into this a bit more, and here is what is happening:

  1. The user specifies a return type for a pydantic-ai agent in the form of a pydantic model
  2. The pydantic model's JSON schema gets passed to the LLM in the form of a tool and its arguments
  3. Because tool usage is optional, the smaller ollama models often never use the tool
  4. pydantic-ai then fails validation and retries, and more often than not the second try doesn't use the tool either

The underlying issue here is that tool usage is optional. This makes the pydantic-ai approach of validating and parsing types unreliable and non-deterministic.

There exists a tool_choice=required parameter in the OpenAI API but it's not supported in ollama as of today. From the ollama tool blog post:

> Future improvements [..] Tool choice: force a model to use a tool
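
For reference, here is a rough sketch of what that parameter looks like on the OpenAI side; the tool name, schema, and prompt are made up for illustration and are not pydantic-ai's actual internals:

from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "final_result",  # hypothetical tool name
                "description": "Return the structured answer",
                "parameters": {
                    "type": "object",
                    "properties": {"value": {"type": "string"}},
                    "required": ["value"],
                },
            },
        }
    ],
    tool_choice="required",  # force the model to call a tool; ollama does not support this today
)

# The arguments are a JSON string matching the tool's parameter schema.
print(completion.choices[0].message.tool_calls[0].function.arguments)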

I see the following ways forward

  1. Change the documentation to make it clear that validation and parsing may or may not happen, and that especially with smaller models it often fails due to lack of tool usage. This undermines the main selling point of pydantic-ai
  2. Look into and implement the structured output approach which, I believe, should make validation and parsing deterministic

Note: I have only looked at the ollama model as implemented in pydantic-ai. I do not know if other models are affected by this, too, or if their output is deterministically constrained.

@samuelcolvin
Member

> Look into and implement the structured output approach which, I believe, should make validation and parsing deterministic

Is this part of the OpenAI API, or would we need a dedicated model for it?

Happy to consider both, just trying to understand.

@samuelcolvin
Member

Looks like we should switch from the OpenAI compatibility API to using the Ollama python library.

@renkehohl

As I was also facing several validation errors when working with Ollama, I made my own implementation of OllamaModel utilizing Ollama's new structured outputs feature. If you are interested in it, I can contribute it here.

@gt732

gt732 commented Dec 18, 2024

@renkehohl do you mind sharing? I'm running into the same issue with local models; it sometimes works, but it's random. I'm mostly testing with qwen models.

@renkehohl

@gt732 here is the gist: https://gist.github.com/renkehohl/407cdcdd3bfc8d0baee3783782b31e3d

Usage is the same as PydanticAI's OllamaModel:

from pydantic_ai import Agent
from typing import List
from ollama_model import OllamaModel

agent = Agent(
    model=OllamaModel(model_name="llama3.1"),
    result_type=List[str]
)

Please keep in mind that I haven't implemented streamed responses yet.

@christopherfowers

I don't see Message, ModelAnyResponse, ModelStructuredResponse, ToolCall, or ModelTextResponse on pydantic_ai.messages at all. Nor do I see any of these types noted in the messages documentation. What am I missing to be able to replicate your gist @renkehohl?

@renkehohl

@christopherfowers the message format was changed in a commit from December 15th. I was using PydanticAI at version 0.0.12, so my implementation needs to be updated for later versions.

@gabrielgrant

gabrielgrant commented Dec 19, 2024

@samuelcolvin to answer your question from earlier about whether this is something OpenAI also supports: yes, they have structured outputs (as of the end of Aug 2024, iirc): https://platform.openai.com/docs/guides/structured-outputs

This avoids the whole roundabout method of having to ask for a function call when you know that you just want a response to always conform to a specific schema. Would recommend looking into switching to this for OpenAI calls too (this is their official recommendation).

Their docs on it are a bit lacking. That page shows using the client.beta.chat.completions.parse() call, which accepts a pydantic BaseModel directly in the response_format arg, and returns a populated model instance (iirc from looking at the code a while ago it does a bit more work behind the scenes).

Not documented on that page, but you can also just use openai.chat.completions.create() and pass a JSON schema into response_format directly:

response_format: {
        // See /docs/guides/structured-outputs
        type: "json_schema",
        json_schema: {
            name: "email_schema",
            schema: {
                type: "object",
                properties: {
                    email: {
                        description: "The email address that appears in the input",
                        type: "string"
                    }
                },
                additionalProperties: false
            }
        }
    }
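
That snippet is in JavaScript-style notation; a rough Python equivalent with the openai client could look like the following (the prompt is illustrative, and the strict flag plus the required list are additions taken from OpenAI's structured outputs docs to get guaranteed schema adherence):

from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "You can reach me at alice@example.com"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "email_schema",
            "strict": True,  # enforce the schema exactly
            "schema": {
                "type": "object",
                "properties": {
                    "email": {
                        "description": "The email address that appears in the input",
                        "type": "string",
                    }
                },
                "required": ["email"],
                "additionalProperties": False,
            },
        },
    },
)

# The message content is a JSON string conforming to the schema.
print(completion.choices[0].message.content)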

@samuelcolvin
Member

> Looks like we should switch from the OpenAI compatibility API to using the Ollama python library.

This is blocked on ollama/ollama-python#380.

@andrewdmalone
Contributor

andrewdmalone commented Jan 9, 2025

FWIW, to add to the discussion: the OpenAI library's structured response feature works just fine with Ollama. Here's a modification of the example from the OpenAI Python docs. The client.beta.chat.completions.parse form's BaseModel integration is very nice compared to the "JSON mode" provided via client.chat.completions.create.

from pydantic import BaseModel
from openai import OpenAI

client = OpenAI(api_key="ollama", base_url='http://localhost:11434/v1/')

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="llama3.2",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

event = completion.choices[0].message.parsed

Value of "event" using Ollama 3.2:

CalendarEvent(name='Event Name', date='Friday', participants=['Alice', 'Bob'])

Seems fine with nested models too:

from pydantic import BaseModel
from openai import OpenAI

client = OpenAI(api_key="ollama", base_url='http://localhost:11434/v1/')

class Country(BaseModel):
    name: str
    abbreviation: str

class State(BaseModel):
    name: str
    abbreviation: str

class City(BaseModel):
    name: str
    nickname: str

class Location(BaseModel):
    city: City
    state: State
    country: Country

completion = client.beta.chat.completions.parse(
    model="llama3.2",
    messages=[
        {"role": "system", "content": "Extract the location information."},
        {"role": "user", "content": "I am in the windy city - Chicago, Illinois - in the United States of America"},
    ],
    response_format=Location,
)

event = completion.choices[0].message.parsed

Value of "event" using Ollama 3.2:

Location(city=City(name='Chicago', nickname='The Windy City'), state=State(name='Illinois', abbreviation='IL'), country=Country(name='United States of America', abbreviation='USA'))

@Finndersen

BTW, both Anthropic and Gemini also support structured outputs; it seems like the industry-standard approach that should be used instead of tool calling?

@gabrielgrant

gabrielgrant commented Jan 13, 2025

@Finndersen I don't think Anthropic currently supports forcing structured outputs in the same way as OpenAI and Gemini -- the page you've linked gives some examples of ways to encourage use of a given format (system prompt and response prefilling being the most applicable), but afaict the only way to truly ensure adherence to a specific JSON schema with Claude is to force use of a tool. This is their recommended approach:

> use tools anytime you want the model to return JSON output that follows a provided schema

Expanded example here: https://docs.anthropic.com/en/docs/build-with-claude/tool-use#json-mode
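
For context, here is a rough sketch of that forced-tool approach with the Anthropic Python SDK (the tool name, schema, and model string are illustrative):

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-3-5-sonnet-latest",  # illustrative model name
    max_tokens=1024,
    tools=[
        {
            "name": "record_answer",  # hypothetical tool name
            "description": "Record the structured answer",
            "input_schema": {
                "type": "object",
                "properties": {"value": {"type": "string"}},
                "required": ["value"],
            },
        }
    ],
    # Forcing this specific tool means the reply always carries schema-shaped JSON.
    tool_choice={"type": "tool", "name": "record_answer"},
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)

tool_use = next(block for block in message.content if block.type == "tool_use")
print(tool_use.input)  # e.g. {'value': 'Paris'}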

@Finndersen

Finndersen commented Jan 14, 2025

@gabrielgrant right, apologies, I didn't read it properly. However, I'd imagine that OpenAI and Gemini's implementations are probably just an extra system prompt anyway (like the example in the Anthropic docs), so it could probably be achieved that way with Claude instead of tool calling?

Because from that page you linked:

> When using tools in this way:
>
>   • You usually want to provide a single tool

I guess it seems like it works fine with multiple tools, but maybe it's not ideal?

@YanSte
Contributor

YanSte commented Jan 16, 2025

Will my bug be solved by your new feature?

#667

@YanSte
Contributor

YanSte commented Jan 17, 2025

Hi,
I encountered a couple of issues with Ollama.

It seems like Ollama is not very reliable, so I created this pull request to switch to LMStudio: Pull Request #705.

For those who want a local LLM.

@sydney-runkle
Member

Going to close this in favor of #582, which covers the broader request 👍

@daniel-j-h
Author

daniel-j-h commented Jan 25, 2025 via email
