Streaming #1874

CyrusNuevoDia · 2024-11-29T02:21:34Z

dspy.streamify can be used to convert the dspy program to a streaming mode. This is useful when you want to stream
the intermediate outputs (i.e. O1-style reasoning) to the client before the final prediction is ready. This uses
asyncify under the hood and inherits the execution semantics.

The deltas of every module in the program are streamed directly with no processing and then once the final prediction is ready it is yielded.

Here's how it works for deployment

from fastapi.responses import StreamingResponse

streaming_dspy_program = dspy.streamify(dspy.ChainOfThought("question -> answer"))

@app.post("/predict/stream")
async def stream(question: Question):
    async def generate():
        async for value in streaming_dspy_program(question=question.text):
            if isinstance(value, dspy.Prediction):
                data = {"prediction": value.labels().toDict()}
            elif isinstance(value, litellm.ModelResponse):
                data = {"chunk": value.json()}
            yield f"data: {ujson.dumps(data)}\n\n"
        yield "data: [DONE]\n\n"

    return StreamingResponse(generate(), media_type="text/event-stream")

# Since you're often going to want to stream the result of a DSPy program as server-sent events,
# we've included a helper function for that, which is equivalent to the code above.

from dspy.utils.streaming import streaming_response

@app.post("/predict/stream")
async def stream(question: Question):
    stream = streaming_dspy_program(question=question.text)
    return StreamingResponse(streaming_response(stream), media_type="text/event-stream")

Changes

New in-memory LMRequestLRUCache with a default max size of 10_000_000.

Notes

No intermediate details are streamed with a cache hit on the in-memory LRU cache because we have the final result instantly. Streaming should work with in-memory cache turned off, which enables the LiteLLM cache

okhat · 2024-12-01T16:31:31Z

This looks AMAZING, thanks @CyrusNuevoDia ! I'm just wrapping my head around the caching improvements here (which I quite like so far) and then will merge

CyrusNuevoDia · 2024-12-01T16:38:52Z

@okhat

Before: Infinite LRU caches (unbounded memory growth in prod with cache=True)
After: Bounded LRU caches

Before: Had to serialize/deserialize JSON to cache properly
After: Just need to serialize (to compute the hash key)

Anything else I can help clarify?

bahtman · 2024-12-02T17:57:42Z

Hi @CyrusNuevoDia. Sick feature, have been waiting for this! :)
Do you know whether this fixes the issue with serializing callable kwargs?
If it does not, you might be able to implement the fix built in #1862

CyrusNuevoDia · 2024-12-02T22:22:30Z

Hey @bahtman just added your proposed fix, and @okhat added dump/load so you can share request caches :)

dspy/utils/caching.py

docs/docs/tutorials/deployment/index.md

dspy/utils/caching.py

Signed-off-by: dbczumar <[email protected]>

dspy/utils/streaming.py

dspy/utils/caching.py

bahtman · 2024-12-05T20:21:41Z

@dbczumar
Do you see my comments? It says pending when I reply.

rohitgarud · 2024-12-07T07:43:05Z

Hi @dbczumar, Will it be possible to apply assertions/suggestions on the partially streamed output so that we can stop streaming if the assertion fails and reduce the token usage by trying again with prompt added due to assertion failure? This might be particularly useful if we are applying multiple assertions to assess the output according to multiple dimensions

CyrusNuevoDia · 2024-12-08T21:00:24Z

@rohitgarud great idea! out of scope for this current version but would be awesome

CyrusNuevoDia · 2024-12-10T16:27:02Z

@dbczumar merged in your request cache logic, getting this error on tests — any idea how to fix it?

ImportError while importing test module '/Users/knrz/Git/stanfordnlp/dspy/tests/caching/test_caching.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.rye/py/[email protected]/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/caching/test_caching.py:10: in <module>
    from tests.test_utils.server import litellm_test_server, read_litellm_test_server_request_logs
E   ModuleNotFoundError: No module named 'tests.test_utils'
____________________________________________________________________________________ ERROR collecting tests/clients/test_lm.py ____________________________________________________________________________________
ImportError while importing test module '/Users/knrz/Git/stanfordnlp/dspy/tests/clients/test_lm.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.rye/py/[email protected]/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/clients/test_lm.py:8: in <module>
    from tests.test_utils.server import litellm_test_server
E   ModuleNotFoundError: No module named 'tests.test_utils'
_______________________________________________________________________________________ ERROR collecting tests/reliability ________________________________________________________________________________________
../../../.rye/py/[email protected]/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1030: in _gcd_import
    ???
<frozen importlib._bootstrap>:1007: in _find_and_load
    ???
<frozen importlib._bootstrap>:986: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:680: in _load_unlocked
    ???
.venv/lib/python3.9/site-packages/_pytest/assertion/rewrite.py:184: in exec_module
    exec(co, module.__dict__)
tests/reliability/conftest.py:6: in <module>
    from tests.conftest import clear_settings
E   ImportError: cannot import name 'clear_settings' from 'tests.conftest' (/Users/knrz/Git/stanfordnlp/dspy/.venv/lib/python3.9/site-packages/tests/conftest.py)

Signed-off-by: dbczumar <[email protected]>

dbczumar · 2024-12-17T19:33:46Z

@dbczumar merged in your request cache logic, getting this error on tests — any idea how to fix it?

ImportError while importing test module '/Users/knrz/Git/stanfordnlp/dspy/tests/caching/test_caching.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.rye/py/[email protected]/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/caching/test_caching.py:10: in <module>
    from tests.test_utils.server import litellm_test_server, read_litellm_test_server_request_logs
E   ModuleNotFoundError: No module named 'tests.test_utils'
____________________________________________________________________________________ ERROR collecting tests/clients/test_lm.py ____________________________________________________________________________________
ImportError while importing test module '/Users/knrz/Git/stanfordnlp/dspy/tests/clients/test_lm.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.rye/py/[email protected]/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/clients/test_lm.py:8: in <module>
    from tests.test_utils.server import litellm_test_server
E   ModuleNotFoundError: No module named 'tests.test_utils'
_______________________________________________________________________________________ ERROR collecting tests/reliability ________________________________________________________________________________________
../../../.rye/py/[email protected]/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1030: in _gcd_import
    ???
<frozen importlib._bootstrap>:1007: in _find_and_load
    ???
<frozen importlib._bootstrap>:986: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:680: in _load_unlocked
    ???
.venv/lib/python3.9/site-packages/_pytest/assertion/rewrite.py:184: in exec_module
    exec(co, module.__dict__)
tests/reliability/conftest.py:6: in <module>
    from tests.conftest import clear_settings
E   ImportError: cannot import name 'clear_settings' from 'tests.conftest' (/Users/knrz/Git/stanfordnlp/dspy/.venv/lib/python3.9/site-packages/tests/conftest.py)

Thanks @CyrusNuevoDia ! I'll push some updates to remove caching changes from this PR. My reasoning is that streaming and caching aren't directly related (I added some test coverage to verify that caching works properly with streaming, though). We can make adjustments to caching in future PRs.

Signed-off-by: dbczumar <[email protected]>

dbczumar · 2024-12-17T19:38:35Z

dspy/dsp/utils/settings.py

@@ -17,6 +18,7 @@
    backoff_time=10,
    callbacks=[],
    async_max_workers=8,
+    send_stream=None,


This is the only meaningful change in the file - everything else is a linter adjustment

Signed-off-by: dbczumar <[email protected]>

MohammedAlhajji · 2024-12-23T10:10:35Z

Thanks @CyrusNuevoDia This looks great. Appreciate it.

CyrusNuevoDia added 2 commits November 28, 2024 20:14

dspy.streamify

7816ba3

Update docs

d8bc33c

CyrusNuevoDia requested a review from okhat November 29, 2024 02:21

CyrusNuevoDia added 6 commits November 28, 2024 20:22

Merge branch 'main' into streaming

48a2376

Fix ruff lint error

ab8b28e

Bring back send_stream to settings

7c251a9

Improve doc

00c6e76

Bring back request_cache setting

c02a7fc

sse => streaming_response

707e5a3

CyrusNuevoDia mentioned this pull request Nov 29, 2024

Adding stream to DSPy LMs #338

Open

Simplify dsp.utils.settings diff

e852b4c

CyrusNuevoDia mentioned this pull request Nov 29, 2024

streaming after LiteLLM integration #1715

Open

Add load/dump to LRUCache + drop callable request params

d4064e1

ujson => pickle for dump/load

92f137f

dbczumar reviewed Dec 3, 2024

View reviewed changes

dspy/utils/caching.py Outdated Show resolved Hide resolved

dbczumar reviewed Dec 3, 2024

View reviewed changes

docs/docs/tutorials/deployment/index.md Show resolved Hide resolved

dbczumar reviewed Dec 3, 2024

View reviewed changes

dspy/utils/caching.py Outdated Show resolved Hide resolved

Stream fix

070101f

Signed-off-by: dbczumar <[email protected]>

dbczumar reviewed Dec 5, 2024

View reviewed changes

dspy/utils/streaming.py Show resolved Hide resolved

dbczumar reviewed Dec 5, 2024

View reviewed changes

dspy/utils/streaming.py Show resolved Hide resolved

dbczumar reviewed Dec 5, 2024

View reviewed changes

dspy/utils/caching.py Outdated Show resolved Hide resolved

okhat added the Behavior 2.5 label Dec 9, 2024

dbczumar added 7 commits December 16, 2024 20:02

Merge conflict

a573ed0

Signed-off-by: dbczumar <[email protected]>

test streaming

10c7bbd

Signed-off-by: dbczumar <[email protected]>

fix

dfa0e3b

Signed-off-by: dbczumar <[email protected]>

fix

68aca74

Signed-off-by: dbczumar <[email protected]>

Streaming works

ca928fd

Signed-off-by: dbczumar <[email protected]>

Fix

962283d

Signed-off-by: dbczumar <[email protected]>

fix

93fe33b

Signed-off-by: dbczumar <[email protected]>

dbczumar force-pushed the streaming branch from c4ce17d to 93fe33b Compare December 17, 2024 19:34

no ignore change

a234c7b

Signed-off-by: dbczumar <[email protected]>

dbczumar reviewed Dec 17, 2024

View reviewed changes

dbczumar added 3 commits December 17, 2024 12:10

fix

eb7e792

Signed-off-by: dbczumar <[email protected]>

Simple init

7772a0d

Simple init

4232dc4

okhat merged commit 027312b into main Dec 17, 2024
6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Streaming #1874

Streaming #1874

CyrusNuevoDia commented Nov 29, 2024 •

edited

Loading

okhat commented Dec 1, 2024

CyrusNuevoDia commented Dec 1, 2024

bahtman commented Dec 2, 2024 •

edited

Loading

CyrusNuevoDia commented Dec 2, 2024

bahtman commented Dec 5, 2024 •

edited

Loading

rohitgarud commented Dec 7, 2024

CyrusNuevoDia commented Dec 8, 2024

CyrusNuevoDia commented Dec 10, 2024

dbczumar commented Dec 17, 2024

dbczumar Dec 17, 2024

MohammedAlhajji commented Dec 23, 2024

Streaming #1874

Streaming #1874

Conversation

CyrusNuevoDia commented Nov 29, 2024 • edited Loading

Here's how it works for deployment

Changes

Notes

okhat commented Dec 1, 2024

CyrusNuevoDia commented Dec 1, 2024

bahtman commented Dec 2, 2024 • edited Loading

CyrusNuevoDia commented Dec 2, 2024

bahtman commented Dec 5, 2024 • edited Loading

rohitgarud commented Dec 7, 2024

CyrusNuevoDia commented Dec 8, 2024

CyrusNuevoDia commented Dec 10, 2024

dbczumar commented Dec 17, 2024

dbczumar Dec 17, 2024

Choose a reason for hiding this comment

MohammedAlhajji commented Dec 23, 2024

CyrusNuevoDia commented Nov 29, 2024 •

edited

Loading

bahtman commented Dec 2, 2024 •

edited

Loading

bahtman commented Dec 5, 2024 •

edited

Loading