Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HttpX instrumentation does not work on classes that extend httpx client #2364

Closed
jeremydvoss opened this issue Mar 20, 2024 · 12 comments · Fixed by #2909
Closed

HttpX instrumentation does not work on classes that extend httpx client #2364

jeremydvoss opened this issue Mar 20, 2024 · 12 comments · Fixed by #2909
Labels
bug Something isn't working

Comments

@jeremydvoss
Copy link
Contributor

Becuase the HttpX instrumentation changes the httpx.client class, it does not work on classes that are defined on import (even if the class is only instantiated after instrumentation). This means that as soon as OpenAI created an extension of the HttpX client, the httpx instrumentation stopped working. This could be fixed in OpenAI by defining the class at runtime:

class SyncAPIClient(BaseClient[httpx.Client, Stream[Any]]):
    ...
    def __init__(
        ...
    ) -> None:
        ...
        # Define at runtime
        class SyncHttpxClientWrapper(httpx.Client):
            def __del__(self) -> None:
                try:
                    self.close()
                except Exception:
                    pass
        self._client = http_client or SyncHttpxClientWrapper(
            base_url=base_url,
            # cast to a valid type because mypy doesn't understand our type narrowing
            timeout=cast(Timeout, timeout),
            proxies=proxies,
            transport=transport,
            limits=limits,
            follow_redirects=True,
        )

This could also be solved by instrumenting httpx even before importing any library that uses httpx. However, I think these restrictions mean that the HttpX instrumentation is too fragile. We need to improve it so that it works intuitively for all such scenarios.

Describe your environment
Windows
opentelemetry-api 1.23.0
opentelemetry-instrumentation 0.44b0
opentelemetry-instrumentation-httpx 0.44b0
opentelemetry-instrumentation-openai 0.14.1
opentelemetry-sdk 1.23.0
opentelemetry-semantic-conventions 0.44b0
opentelemetry-semantic-conventions-ai 0.0.23
opentelemetry-util-http 0.44b0

Steps to reproduce

from openai import OpenAI # 1.x
from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor
import httpx
from opentelemetry.instrumentation.openai import OpenAIInstrumentor

HTTPXClientInstrumentor().instrument()

url = "https://www.example.org/"
with httpx.Client() as client:
     response = client.get(url)

OpenAIInstrumentor().instrument()

client = OpenAI() # 1.x
completion = client.chat.completions.create( # 1.x
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."},
    {"role": "user", "content": "Compose a poem that explains the concept of recursion in programming."}
  ]
)
print(completion.choices[0].message)

input()

What is the expected behavior?
api.openai.com POST should be captured. Note that this is separate from the openai.chat span captured by the openai instrumentation.

What is the actual behavior?
Only the httpx example span and openai.chat spans are collected.

Additional context
Add any other context about the problem here.

@hbibel
Copy link

hbibel commented Mar 28, 2024

I believe one solution could be to monkeypatch the httpx.BaseClient.event_hooks property instead of overwriting the httpx.Client class. That would be a pretty fundamental change in the httpx instrumentation however. I'd like to try this out.

@jeremydvoss
Copy link
Contributor Author

Aaron noted that the httpx instrumentation isn't using wrapt. I'll experiment with that change and see if that improves this.

@jeremydvoss
Copy link
Contributor Author

Would be good to update the sphinx docs and/or readme

@aabmass
Copy link
Member

aabmass commented Mar 28, 2024

@jeremydvoss somewhat related: https://docs.python.org/3/library/unittest.mock.html#where-to-patch describes monkey patches not taking effect for tests. Looking at the httpx instrumentation code, we are very naiively monkey patching the httpx module

httpx.Client = _InstrumentedClient
httpx.AsyncClient = _InstrumentedAsyncClient

Which means like you mentioned, anyone who uses a from httpx import ... import before it is instrumented will still have a reference to the unpatched version. I think many of our other instrumentations use wrapt to patch the actual implementation, i.e. monkey patching the class's methods instead of the python module's properties.

As a general principle for instrumentation taking effect, that seems better (maybe we can jot this down somewhere). I'm definitely open to re-implementing the pathcing to make this more robust.

@hbibel
Copy link

hbibel commented Mar 29, 2024

Using wrapt seems like a better approach than what I propsed. Please let me know if you need any help.

@WillDaSilva
Copy link

This is also an issue for the HTTPX clients defined by Authlib: https://github.com/lepture/authlib/blob/master/authlib/integrations/httpx_client/oauth2_client.py

@nabheet
Copy link

nabheet commented Jul 12, 2024

OMG! I just ran into this issue!!! Luckily in our QA environment!

here is a simple code sample, in case it helps:

import httpx
from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor
import asyncio
from authlib.integrations.httpx_client import AsyncOAuth2Client


async def test():
    hci = HTTPXClientInstrumentor()
    hci.instrument()

    scope = "openid email"
    AsyncOAuth2Client(
        client_id="ABC",
        client_secret="DEF",
        scope=scope,
        redirect_uri="",
    )


r = asyncio.run(test())

print(r)

Here is the exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/runpy.py", line 198, in _run_module_as_main
    return _run_code(code, main_globals, None,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/runpy.py", line 88, in _run_code
    exec(code, run_globals)
  File "/home/nabheet/.vscode-server/extensions/ms-python.debugpy-2024.8.0-linux-arm64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "/home/nabheet/.vscode-server/extensions/ms-python.debugpy-2024.8.0-linux-arm64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/home/nabheet/.vscode-server/extensions/ms-python.debugpy-2024.8.0-linux-arm64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/home/nabheet/.vscode-server/extensions/ms-python.debugpy-2024.8.0-linux-arm64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nabheet/.vscode-server/extensions/ms-python.debugpy-2024.8.0-linux-arm64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/nabheet/.vscode-server/extensions/ms-python.debugpy-2024.8.0-linux-arm64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "/workdir/src/test.py", line 20, in <module>
    r = asyncio.run(test())
        ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/workdir/src/test.py", line 12, in test
    AsyncOAuth2Client(
  File "/workdir/.venv/lib/python3.11/site-packages/authlib/integrations/httpx_client/oauth2_client.py", line 65, in __init__
    httpx.AsyncClient.__init__(self, **client_kwargs)
  File "/workdir/.venv/lib/python3.11/site-packages/opentelemetry/instrumentation/httpx/__init__.py", line 514, in __init__
    super().__init__(*args, **kwargs)
    ^^^^^^^
TypeError: super(type, obj): obj must be an instance or subtype of type

@michaelgmiller1
Copy link

We are also encountering this. It seems broken that a library would depend on import order. Is there any way to fix this elegantly?

@alexmojaki
Copy link
Contributor

Another issue that would be fixed by avoiding subclassing: #2609

@xrmx
Copy link
Contributor

xrmx commented Oct 22, 2024

Anyone want to give #2909 a try?

@lzchen
Copy link
Contributor

lzchen commented Oct 28, 2024

This issue probably exists for flask and fastapi as well.

@michaelgmiller1
Copy link

Would be great if a release could be cut for this. It would clean up our code a ton.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants