feat(langchain): add support for langchain==0.1 (#8563)
Fixes #8212.

### For Reviewers

**NOTE: Please disregard the LOC count in this PR; 90% of the lines of
code changed are cassette test data / moved files / snapshot files /
requirements lockfiles.**
The important files to review are:
- `ddtrace/contrib/langchain/patch.py`: version gating which methods to
patch
- `releasenotes/notes/feat-support-langchain-0-1-0d8e0ddd6248c4ed.yaml`
for release note
- `tests/contrib/langchain/test_langchain_community.py` for testing
langchain >= 0.1
- `tests/contrib/langchain/test_langchain_patch.py` for testing which
methods are traced

## Change Summary

This PR adds support for `langchain>=0.1`, which has deprecated multiple
traced methods from `langchain<=0.0.354` for removal in the upcoming
`langchain==0.2.0` release. Whereas older versions of langchain were
contained entirely in one library, `langchain`, the new version has
split the library into separate subpackages:
- `langchain_core`: Core base classes for LLMs and chat models
- `langchain_community`: Community subpackage that contains integrations
for different libraries (e.g. `cohere`, `anthropic`)
- `langchain_<partner_name>`: Individual subpackages for each partner
integration, such as `langchain_openai` and `langchain_pinecone`. It
appears that the ultimate goal is for all of the code in
`langchain_community` to be extracted into these partner subpackages.
- `langchain`: The old library, which still contains many base
methods such as Chains.

TL;DR: this PR adds version gating to the langchain integration so that
it patches the correct methods/import paths with the correct traced
functions. No new functionality has been added.
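
For reviewers unfamiliar with the idea, version gating can be sketched
roughly as below. This is an illustrative sketch only, not the code in
`ddtrace/contrib/langchain/patch.py`: the helper names `parse_version`
and `select_patch_targets` and the exact module paths chosen per version
are assumptions made for the example.

```python
from itertools import takewhile
from typing import Dict, Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse "0.1.9" or "0.2.0rc1" into a comparable (major, minor, patch) tuple."""
    parts = []
    for piece in version.split(".")[:3]:
        # Keep only the leading digits so suffixes like "rc1" are ignored.
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return (parts[0], parts[1], parts[2])


def select_patch_targets(langchain_version: str) -> Dict[str, str]:
    """Return the import paths to patch for the given langchain version.

    The paths are illustrative assumptions, not the integration's real list.
    """
    if parse_version(langchain_version) >= (0, 1, 0):
        # Split-package layout introduced with langchain 0.1.
        return {
            "base_llm": "langchain_core.language_models.llms",
            "embeddings": "langchain_community.embeddings",
            "vectorstores": "langchain_community.vectorstores",
        }
    # Single-package layout used by langchain<=0.0.354.
    return {
        "base_llm": "langchain.llms.base",
        "embeddings": "langchain.embeddings",
        "vectorstores": "langchain.vectorstores",
    }


old_targets = select_patch_targets("0.0.192")
new_targets = select_patch_targets("0.1.9")
```

The same traced wrapper can then be attached to whichever path the gate
selects, which is why no new tracing behavior is needed.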

### Risks
A current design flaw of the langchain integration is that it keeps a
static list of embeddings/vectorstore class names and then patches
`langchain_community.embeddings/vectorstores.<CLASS_NAME>`. However, with
langchain's apparent push to move integrations into individual partner
subpackages, this will break our patching for embeddings/vectorstores,
since we will have to patch each subpackage individually. Currently only
pinecone/openai (which our integration specifically patches) and
elasticsearch/vertexAI (not yet patched) have their own subpackages, so
the risk is low since there will be no breaking changes (and [Langchain
promises no more breaking changes on minor
releases](https://blog.langchain.dev/week-of-1-22-24-langchain-release-notes/#:~:text=No%20more%20breaking%20changes%20on%20a%20minor%20version%20release.)).

Our proposed approach to resolving this issue is to offer limited
tracing support for embeddings/vectorstores with individual subpackages
(openai/pinecone for now) at first, then gradually expand our support
as the others become available. We will likely need to
redesign/refactor the integration in that case.
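
To make the risk concrete, here is a minimal sketch of the static-list
patching pattern and why each extra subpackage has to be listed
explicitly. It is not the integration's actual code: `patch_embeddings`,
`traced`, the class list, and the `hypothetical_*` module names are all
made up for illustration, and a stand-in module replaces the real
langchain packages so the sketch runs on its own.

```python
import functools
import importlib
import sys
import types

# Hypothetical static list of class names, mirroring the design described above.
EMBEDDING_CLASS_NAMES = ["StandInEmbeddings", "MissingEmbeddings"]


def traced(func, calls):
    """Wrap func so that every call is recorded (a stand-in for real tracing)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper


def patch_embeddings(module_names, class_names, calls):
    """Patch embed_query on every listed class found in every listed module."""
    patched = []
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # subpackage not installed, nothing to patch
        for class_name in class_names:
            cls = getattr(module, class_name, None)
            method = getattr(cls, "embed_query", None) if cls else None
            if method is None or getattr(cls, "_sketch_patched", False):
                continue
            cls.embed_query = traced(method, calls)
            cls._sketch_patched = True
            patched.append(f"{module_name}.{class_name}")
    return patched


class StandInEmbeddings:
    def embed_query(self, text):
        return [float(len(text))]


# Register a fake partner subpackage so the example needs no real dependency.
fake_module = types.ModuleType("hypothetical_partner_pkg")
fake_module.StandInEmbeddings = StandInEmbeddings
sys.modules[fake_module.__name__] = fake_module

calls = []
patched = patch_embeddings(
    ["hypothetical_missing_pkg", "hypothetical_partner_pkg"],
    EMBEDDING_CLASS_NAMES,
    calls,
)
result = StandInEmbeddings().embed_query("abc")
```

A class that moves to a module missing from `module_names` is silently
left unpatched, which is the breakage described above.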

## Testing
This PR adds a separate venv for
`langchain-community/langchain-core/langchain-openai/langchain-pinecone`,
and only tests with Python 3.10 and above. Python 3.9 and below would
require a separate set of test cassette files, which would add an
unnecessary number of files/tests.

## Checklist

- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.
- [x] If change touches code that signs or publishes builds or packages,
or handles credentials of any kind, I've requested a review from
`@DataDog/security-design-and-guidance`.

## Reviewer Checklist

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [x] Testing strategy adequately addresses listed risks
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: Yun Kim <[email protected]>
sabrenner and Yun-Kim authored Mar 14, 2024
1 parent 18a7e49 commit 9e22fd1
Showing 115 changed files with 10,575 additions and 848 deletions.
55 changes: 31 additions & 24 deletions .riot/requirements/4631ae3.txt → .riot/requirements/13688bb.txt
@@ -2,68 +2,75 @@
 # This file is autogenerated by pip-compile with Python 3.9
 # by the following command:
 #
-# pip-compile --no-annotate .riot/requirements/4631ae3.in
+# pip-compile --no-annotate .riot/requirements/13688bb.in
 #
-ai21==1.3.3
-aiohttp==3.9.1
+ai21==1.3.4
+aiohttp==3.9.3
 aiosignal==1.3.1
+anyio==4.3.0
 async-timeout==4.0.3
 attrs==23.2.0
 backoff==2.2.1
-certifi==2023.11.17
+certifi==2024.2.2
 charset-normalizer==3.3.2
-cohere==4.40
-coverage[toml]==7.4.0
+cohere==4.53
+coverage[toml]==7.4.3
 dataclasses-json==0.5.14
-dnspython==2.4.2
+dnspython==2.6.1
 exceptiongroup==1.2.0
-fastavro==1.9.2
+fastavro==1.9.4
 filelock==3.13.1
 frozenlist==1.4.1
-fsspec==2023.12.2
+fsspec==2024.2.0
 greenlet==3.0.3
-huggingface-hub==0.20.1
+huggingface-hub==0.21.4
 hypothesis==6.45.0
 idna==3.6
 importlib-metadata==6.11.0
 iniconfig==2.0.0
+jsonpatch==1.33
+jsonpointer==2.4
 langchain==0.0.192
+langchain-community==0.0.14
+langchain-core==0.1.23
 langchainplus-sdk==0.0.4
+langsmith==0.0.87
 loguru==0.7.2
-marshmallow==3.20.1
+marshmallow==3.21.1
 mock==5.1.0
-multidict==6.0.4
+multidict==6.0.5
 mypy-extensions==1.0.0
-numexpr==2.8.8
-numpy==1.26.3
+numexpr==2.9.0
+numpy==1.26.4
 openai==0.27.8
 openapi-schema-pydantic==1.2.4
 opentracing==2.4.0
 packaging==23.2
 pinecone-client==2.2.4
-pluggy==1.3.0
-psutil==5.9.7
-pydantic==1.10.13
-pytest==7.4.4
+pluggy==1.4.0
+psutil==5.9.8
+pydantic==1.10.14
+pytest==8.1.1
 pytest-asyncio==0.21.1
 pytest-cov==4.1.0
 pytest-mock==3.12.0
 pytest-randomly==3.15.0
-python-dateutil==2.8.2
+python-dateutil==2.9.0.post0
 pyyaml==6.0.1
 regex==2023.12.25
 requests==2.31.0
 six==1.16.0
+sniffio==1.3.1
 sortedcontainers==2.4.0
-sqlalchemy==2.0.25
+sqlalchemy==2.0.28
 tenacity==8.2.3
-tiktoken==0.5.2
+tiktoken==0.6.0
 tomli==2.0.1
-tqdm==4.66.1
-typing-extensions==4.9.0
+tqdm==4.66.2
+typing-extensions==4.10.0
 typing-inspect==0.9.0
 urllib3==1.26.18
-vcrpy==5.1.0
+vcrpy==6.0.1
 wrapt==1.16.0
 yarl==1.9.4
 zipp==3.17.0
79 changes: 79 additions & 0 deletions .riot/requirements/15757fd.txt
@@ -0,0 +1,79 @@
#
# This file is autogenerated by pip-compile with Python 3.11
# by the following command:
#
# pip-compile --no-annotate .riot/requirements/15757fd.in
#
ai21==2.1.2
ai21-tokenizer==0.3.11
aiohttp==3.9.3
aiosignal==1.3.1
annotated-types==0.6.0
anyio==4.3.0
attrs==23.2.0
backoff==2.2.1
certifi==2024.2.2
charset-normalizer==3.3.2
cohere==4.53
coverage[toml]==7.4.3
dataclasses-json==0.6.4
distro==1.9.0
exceptiongroup==1.2.0
fastavro==1.9.4
filelock==3.13.1
frozenlist==1.4.1
fsspec==2024.2.0
greenlet==3.0.3
h11==0.14.0
httpcore==1.0.4
httpx==0.27.0
huggingface-hub==0.21.4
hypothesis==6.45.0
idna==3.6
importlib-metadata==6.11.0
iniconfig==2.0.0
jsonpatch==1.33
jsonpointer==2.4
langchain==0.1.9
langchain-community==0.0.24
langchain-core==0.1.27
langchain-openai==0.0.8
langchain-pinecone==0.0.3
langsmith==0.1.9
marshmallow==3.21.1
mock==5.1.0
multidict==6.0.5
mypy-extensions==1.0.0
numexpr==2.9.0
numpy==1.26.4
openai==1.12.0
opentracing==2.4.0
orjson==3.9.15
packaging==23.2
pinecone-client==3.1.0
pluggy==1.4.0
psutil==5.9.8
pydantic==2.6.3
pydantic-core==2.16.3
pytest==8.1.1
pytest-asyncio==0.21.1
pytest-cov==4.1.0
pytest-mock==3.12.0
pytest-randomly==3.15.0
pyyaml==6.0.1
regex==2023.12.25
requests==2.31.0
sentencepiece==0.1.99
sniffio==1.3.1
sortedcontainers==2.4.0
sqlalchemy==2.0.28
tenacity==8.2.3
tiktoken==0.6.0
tqdm==4.66.2
typing-extensions==4.10.0
typing-inspect==0.9.0
urllib3==2.2.1
vcrpy==6.0.1
wrapt==1.16.0
yarl==1.9.4
zipp==3.17.0
69 changes: 0 additions & 69 deletions .riot/requirements/17cfc25.txt

This file was deleted.

81 changes: 81 additions & 0 deletions .riot/requirements/1bd8794.txt
@@ -0,0 +1,81 @@
#
# This file is autogenerated by pip-compile with Python 3.10
# by the following command:
#
# pip-compile --no-annotate .riot/requirements/1bd8794.in
#
ai21==2.1.2
ai21-tokenizer==0.3.11
aiohttp==3.9.3
aiosignal==1.3.1
annotated-types==0.6.0
anyio==4.3.0
async-timeout==4.0.3
attrs==23.2.0
backoff==2.2.1
certifi==2024.2.2
charset-normalizer==3.3.2
cohere==4.53
coverage[toml]==7.4.3
dataclasses-json==0.6.4
distro==1.9.0
exceptiongroup==1.2.0
fastavro==1.9.4
filelock==3.13.1
frozenlist==1.4.1
fsspec==2024.2.0
greenlet==3.0.3
h11==0.14.0
httpcore==1.0.4
httpx==0.27.0
huggingface-hub==0.21.4
hypothesis==6.45.0
idna==3.6
importlib-metadata==6.11.0
iniconfig==2.0.0
jsonpatch==1.33
jsonpointer==2.4
langchain==0.1.9
langchain-community==0.0.24
langchain-core==0.1.27
langchain-openai==0.0.8
langchain-pinecone==0.0.3
langsmith==0.1.9
marshmallow==3.21.1
mock==5.1.0
multidict==6.0.5
mypy-extensions==1.0.0
numexpr==2.9.0
numpy==1.26.4
openai==1.12.0
opentracing==2.4.0
orjson==3.9.15
packaging==23.2
pinecone-client==3.1.0
pluggy==1.4.0
psutil==5.9.8
pydantic==2.6.3
pydantic-core==2.16.3
pytest==8.1.1
pytest-asyncio==0.21.1
pytest-cov==4.1.0
pytest-mock==3.12.0
pytest-randomly==3.15.0
pyyaml==6.0.1
regex==2023.12.25
requests==2.31.0
sentencepiece==0.1.99
sniffio==1.3.1
sortedcontainers==2.4.0
sqlalchemy==2.0.28
tenacity==8.2.3
tiktoken==0.6.0
tomli==2.0.1
tqdm==4.66.2
typing-extensions==4.10.0
typing-inspect==0.9.0
urllib3==2.2.1
vcrpy==6.0.1
wrapt==1.16.0
yarl==1.9.4
zipp==3.17.0