
🐛 Bug Report: All OpenAI Assistant messages are registered as completions, even if they are user messages #2553

Open
dinmukhamedm opened this issue Jan 25, 2025 · 1 comment
Labels
bug Something isn't working

Comments


dinmukhamedm commented Jan 25, 2025

Which component is this bug for?

OpenAI Instrumentation

📜 Description

An OpenAI assistant is initialized with one or more system messages, and the rest of the chat history may be added later during the run. These follow-up messages are always recorded as gen_ai.completions.N.content, even when the message was created by the user.

Yes, I know that:

  • There is room for discussion about whether completions.role=user makes sense, and whether the role must be the source of truth.
  • Work on migrating to events instead of attributes is ongoing, and the conventions will be slightly different.

But given that this should be a fairly small change, I'm suggesting we fix this now.

👟 Reproduction steps

  1. Run the code below with some observability configured
  2. Have a look at the attributes of the resulting span with name openai.assistant.run
import time

from openai import NOT_GIVEN, OpenAI


class MyAssistant:
    def __init__(self):
        self.client = OpenAI()
        self.assistant = self.client.beta.assistants.create(
            name="Math Tutor",
            instructions="You are a personal math tutor. Write and run code to answer math questions.",
            tools=[{"type": "code_interpreter"}],
            model="gpt-4o-mini",
        )

    def execute(self):
        self.thread = self.client.beta.threads.create()
        self._create_thread_message(
            "I need to solve the equation `3x + 11 = 14`. Can you help me?"
        )
        run = self._create_run(
            instructions="Please address the user as Jane Doe. The user has a premium account."
        )
        self._process_run(run)

    def _create_thread_message(self, prompt: str):
        return self.client.beta.threads.messages.create(
            thread_id=self.thread.id,
            role="user",
            content=prompt,
        )

    def _create_run(
        self,
        instructions: str | None = None,
    ):
        return self.client.beta.threads.runs.create(
            thread_id=self.thread.id,
            assistant_id=str(self.assistant.id),
            model=self.assistant.model if self.assistant.model else NOT_GIVEN,
            instructions=instructions,
            temperature=(
                self.assistant.temperature if self.assistant.temperature else NOT_GIVEN
            ),
            top_p=self.assistant.top_p if self.assistant.top_p else NOT_GIVEN,
            tools=self.assistant.tools if self.assistant.tools else NOT_GIVEN,
            tool_choice=NOT_GIVEN,
        )

    def _process_run(self, run):
        while run.status in ["queued", "in_progress", "cancelling"]:
            time.sleep(1)  # poll the run status once per second
            run = self.client.beta.threads.runs.retrieve(
                thread_id=self.thread.id, run_id=run.id
            )

        if run.status == "completed":
            messages = self.client.beta.threads.messages.list(
                thread_id=self.thread.id, order="asc"
            )
            for data in messages.data:
                print(f"{data.role}: {data.content[0].text.value}")
        else:
            print(run.status)


a = MyAssistant()
a.execute()

👍 Expected behavior

gen_ai.prompt.0.role=system
gen_ai.prompt.0.content=You are a personal math tutor. Write and run code to answer math questions.
gen_ai.prompt.1.role=system
gen_ai.prompt.1.content=Please address the user as Jane Doe. The user has a premium account.
gen_ai.prompt.2.role=user
gen_ai.prompt.2.content=I need to solve the equation `3x + 11 = 14`. Can you help me?

gen_ai.completions.0.role=assistant
gen_ai.completions.0.content=<whatever the LLM outputs>

👎 Actual Behavior with Screenshots

gen_ai.prompt.0.role=system
gen_ai.prompt.0.content=You are a personal math tutor. Write and run code to answer math questions.
gen_ai.prompt.1.role=system
gen_ai.prompt.1.content=Please address the user as Jane Doe. The user has a premium account.

gen_ai.completions.0.role=user
gen_ai.completions.0.content=I need to solve the equation `3x + 11 = 14`. Can you help me?
gen_ai.completions.1.role=assistant
gen_ai.completions.1.content=<whatever the LLM outputs>

🤖 Python Version

3.12

📃 Provide any additional context for the Bug.

It's this for loop that does it.

I suggest we record each message as either the next prompt or the next (first?) completion, based on the role in the message. What do you think?
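A minimal sketch of what that role-based routing could look like. This assumes a simplified message shape (plain dicts) and writes into a plain attribute dict; the real instrumentation sets these attributes on an OpenTelemetry span, and the function name here is hypothetical:

```python
# Hypothetical sketch of the proposed fix: route each thread message to the
# prompt or completion attribute namespace based on its role, instead of
# recording every non-initial message as a completion.

def record_messages(span_attrs: dict, messages: list[dict]) -> None:
    prompt_idx = 0
    completion_idx = 0
    for msg in messages:
        if msg["role"] == "assistant":
            prefix = f"gen_ai.completions.{completion_idx}"
            completion_idx += 1
        else:
            # user and system messages belong in the prompt namespace
            prefix = f"gen_ai.prompt.{prompt_idx}"
            prompt_idx += 1
        span_attrs[f"{prefix}.role"] = msg["role"]
        span_attrs[f"{prefix}.content"] = msg["content"]
```

With the messages from the reproduction above, this would yield the attribute layout shown under "Expected behavior": the user message lands in gen_ai.prompt.2.* and only the assistant reply in gen_ai.completions.0.*.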

👀 Have you spent some time to check if this bug has been raised before?

  • I checked and didn't find a similar issue

Are you willing to submit PR?

Yes I am willing to submit a PR!


dosubot bot commented Jan 25, 2025

The issue you're encountering relates to how messages are categorized in the OpenAI instrumentation package. Currently, all thread messages, including user messages, are recorded as completions. The expected behavior is for user messages to be recorded as prompts, with only the assistant's responses recorded as completions.

The suggestion to update the logic in the loop to differentiate between user and assistant messages based on their roles is a valid approach. By checking the role of each message, you can decide whether it should be categorized as a prompt or a completion. This change would align the behavior with the expected output, where user messages are recorded as prompts and assistant messages as completions.

Implementing this change should help in accurately capturing the roles and contents of messages, improving the observability and traceability of interactions with the OpenAI API.


@dosubot dosubot bot added the bug Something isn't working label Jan 25, 2025