Integrate cachetools for in-memory LM caching, including unhashable types & pydantic #1896
Conversation
    key=lambda request, *args, **kwargs: cache_key(request),
    # Use a lock to ensure thread safety for the cache when DSPy LMs are queried
    # concurrently, e.g. during optimization and evaluation
    lock=threading.Lock(),
cachetools provides thread safety natively. Alternatively, we could implement our own cache with the required thread-safety functionality, but I suspect there might be bugs (best to reuse something that is known to work).
This is not a blocker for merge, but I'm slightly uneasy about Python-level locking (compared to whatever functools normally does?). Maybe it's required for thread safety, but since it's happening for every single LM call it's a bit worrisome.
Thanks @okhat! functools uses a Python lock as well (RLock). I'll follow up with a small PR to use RLock instead of Lock.
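For context, a minimal sketch of what that follow-up could look like, assuming the decorator arguments shown in this PR (the key function below is a simplified stand-in for DSPy's cache_key helper):

```python
import threading

from cachetools import LRUCache, cached


@cached(
    cache=LRUCache(maxsize=float("inf")),
    key=lambda request, *args, **kwargs: str(sorted(request.items())),  # simplified stand-in for cache_key
    # RLock is reentrant: the thread already holding it can re-acquire it,
    # so re-entrant calls on the same thread won't deadlock.
    lock=threading.RLock(),
)
def cached_completion(request: dict, num_retries: int = 0):
    ...
```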
@cached(
    # NB: cachetools doesn't support maxsize=None; it recommends using float("inf") instead
    cache=LRUCache(maxsize=maxsize or float("inf")),
    key=lambda request, *args, **kwargs: cache_key(request),
This is the key advantage of cachetools. Unlike lru_cache, it allows us to define a cache key by applying a custom function to one or more arguments, rather than forcing all arguments to be hashed / JSON-encoded, passed to the function, and then decoded afterwards. Encoding / decoding is infeasible for callables.
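A rough sketch of that pattern, with a hypothetical cache_key that simply skips any value it can't JSON-encode (the helper in this PR is more involved):

```python
import json

from cachetools import LRUCache, cached


def cache_key(request: dict) -> str:
    # Keep only the JSON-serializable parts of the request, so callables,
    # pydantic model classes, etc. don't break key computation.
    def is_serializable(value) -> bool:
        try:
            json.dumps(value)
            return True
        except TypeError:
            return False

    return json.dumps(
        {k: v for k, v in request.items() if is_serializable(v)}, sort_keys=True
    )


@cached(
    cache=LRUCache(maxsize=float("inf")),
    # Only `request` contributes to the key; other arguments like num_retries are ignored.
    key=lambda request, *args, **kwargs: cache_key(request),
)
def completion(request: dict, num_retries: int = 0):
    ...  # issue the underlying LM call here
```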
Can you do a global dspy.settings.request_cache default = LRUCache(maxsize=10_000_000) and then have this function pull from that?
With this naming, it could be confused with the disk cache though, right? It seems like we'd want some unified way to refer to both caches, or more distinctive naming. Thoughts?
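Purely for illustration, the suggested default might look roughly like this (hypothetical names; whether it belongs on dspy.settings, and how to distinguish it from the disk cache, is exactly the open question):

```python
from cachetools import LRUCache, cached


class _Settings:
    # Hypothetical stand-in for a dspy.settings-style object holding the default cache
    request_cache = LRUCache(maxsize=10_000_000)


settings = _Settings()


def request_cache():
    # Pull the shared cache from global settings instead of constructing a new
    # LRUCache for each decorated function.
    return cached(
        cache=settings.request_cache,
        key=lambda request, *args, **kwargs: str(sorted(request.items())),  # stand-in key
    )
```

Both the chat and text completion paths could then share one in-memory cache, while the on-disk LiteLLM cache stays separate.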
    return litellm_completion(
        request,
        cache={"no-cache": False, "no-store": False},
        num_retries=num_retries,
    )


def litellm_completion(request, num_retries: int, cache={"no-cache": True, "no-store": True}):
    kwargs = ujson.loads(request)
We no longer have to serialize / deserialize request within the litellm_completion and litellm_text_completion calls.
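To make the change concrete, a simplified before/after (function bodies trimmed; not the exact code in this PR):

```python
import ujson


# Before: the caller JSON-encoded the request so functools.lru_cache could hash it,
# and it had to be decoded again here; that round trip fails for callables and
# pydantic models.
def litellm_completion_before(request: str, num_retries: int):
    kwargs = ujson.loads(request)
    ...


# After: the request dict is passed straight through; hashing is handled by the
# cachetools key function, so no encode/decode round trip is needed.
def litellm_completion_after(request: dict, num_retries: int):
    kwargs = dict(request)
    ...
```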
tests/caching/test_caching.py
Outdated
def test_lm_calls_support_unhashable_types(litellm_test_server, temporary_blank_cache_dir):
    api_base, server_log_file_path = litellm_test_server

    lm_with_unhashable_callable = dspy.LM(
        model="openai/dspy-test-model",
        api_base=api_base,
        api_key="fakekey",
        # Define a callable kwarg for the LM to use during inference
        azure_ad_token_provider=lambda *args, **kwargs: None,
    )
    lm_with_unhashable_callable("Query")
Fails on main with:

E TypeError: <function test_lm_calls_support_unhashable_types.<locals>.<lambda> at 0x31204d5a0> is not JSON serializable
tests/caching/test_caching.py
Outdated
def test_lm_calls_support_pydantic_models(litellm_test_server, temporary_blank_cache_dir):
    api_base, server_log_file_path = litellm_test_server

    class ResponseFormat(pydantic.BaseModel):
        response: str

    lm = dspy.LM(
        model="openai/dspy-test-model",
        api_base=api_base,
        api_key="fakekey",
        response_format=ResponseFormat,
    )
    lm("Query")
Fails on main with:

TypeError: <class 'tests.caching.test_caching.test_lm_calls_support_pydantic_models.<locals>.ResponseFormat'> is not JSON serializable
@@ -212,47 +219,82 @@ def copy(self, **kwargs):
     return new_instance


-@functools.lru_cache(maxsize=None)
-def cached_litellm_completion(request, num_retries: int):
+def request_cache(maxsize: Optional[int] = None):
@okhat @bahtman @CyrusNuevoDia Thoughts on this approach? See inline comments discussing advantages below
Looks cool! Could set default maxsize = float("inf") here.
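Putting the pieces from this thread together, the decorator factory might end up looking roughly like this with the suggested default (the cache_key here is again a stand-in for the real helper; not necessarily the final implementation):

```python
import threading
from typing import Callable

from cachetools import LRUCache, cached


def cache_key(request: dict) -> str:
    # Stand-in for the real helper that hashes only the serializable parts of the request
    return str(sorted(request.items()))


def request_cache(maxsize: float = float("inf")):
    # NB: cachetools doesn't support maxsize=None; float("inf") means effectively unbounded
    def decorator(func: Callable) -> Callable:
        return cached(
            cache=LRUCache(maxsize=maxsize),
            key=lambda request, *args, **kwargs: cache_key(request),
            # Guard against concurrent access when LMs are queried from multiple threads
            lock=threading.Lock(),
        )(func)

    return decorator


@request_cache()
def cached_litellm_completion(request: dict, num_retries: int = 0):
    ...
```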
    assert azure_openai_lm("azure openai query") == expected_response


def test_text_lms_can_be_queried(litellm_test_server):
Since we're making changes to litellm_text_completion as well, we should have some coverage for LM queries with model_type="text".
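For example, something along these lines would exercise the litellm_text_completion path (a sketch only; the expected response depends on how the local test server is configured):

```python
def test_text_lms_can_be_queried(litellm_test_server):
    api_base, server_log_file_path = litellm_test_server
    expected_response = ["Hi!"]  # assumed canned response from the local test server

    # model_type="text" routes the request through litellm_text_completion
    # instead of litellm_completion
    openai_lm = dspy.LM(
        model="openai/dspy-test-model",
        api_base=api_base,
        api_key="fakekey",
        model_type="text",
    )
    assert openai_lm("openai query") == expected_response
```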
def test_lm_calls_support_unhashable_types(litellm_test_server):
    api_base, server_log_file_path = litellm_test_server

    lm_with_unhashable_callable = dspy.LM(
        model="openai/dspy-test-model",
        api_base=api_base,
        api_key="fakekey",
        # Define a callable kwarg for the LM to use during inference
        azure_ad_token_provider=lambda *args, **kwargs: None,
    )
    lm_with_unhashable_callable("Query")
Fails on main with:

E TypeError: <function test_lm_calls_support_unhashable_types.<locals>.<lambda> at 0x31204d5a0> is not JSON serializable
def test_lm_calls_support_pydantic_models(litellm_test_server):
    api_base, server_log_file_path = litellm_test_server

    class ResponseFormat(pydantic.BaseModel):
        response: str

    lm = dspy.LM(
        model="openai/dspy-test-model",
        api_base=api_base,
        api_key="fakekey",
        response_format=ResponseFormat,
    )
    lm("Query")
Fails on main with:

TypeError: <class 'tests.caching.test_caching.test_lm_calls_support_pydantic_models.<locals>.ResponseFormat'> is not JSON serializable
Looks awesome! Is there a way to have a global cache that we can dump/load?
Totally! We can add that if / when we need it by leveraging cachetools.
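If it's ever needed, a rough sketch of dump/load on top of cachetools (assuming a hypothetical module-level LRUCache; not part of this PR):

```python
import pickle

from cachetools import LRUCache

# Hypothetical module-level cache, e.g. the one wrapped by request_cache()
REQUEST_CACHE = LRUCache(maxsize=float("inf"))


def dump_cache(path: str) -> None:
    # cachetools caches are MutableMappings, so their contents copy out as a plain
    # dict; this assumes cached keys and values are picklable.
    with open(path, "wb") as f:
        pickle.dump(dict(REQUEST_CACHE), f)


def load_cache(path: str) -> None:
    with open(path, "rb") as f:
        REQUEST_CACHE.update(pickle.load(f))
```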
Awesome, lgtm! Appreciate you 🙏
One quick fix and lgtm
Integrate cachetools for in-memory LM caching, including unhashable types & pydantic (#1896)

* Impl
* Cachetools add
* Inline
* tweak
* fix
* fix
* Update lm.py

Signed-off-by: dbczumar <[email protected]>
Integrate cachetools for in-memory LM caching, including unhashable types & pydantic
Fixes #1759