fix(vertex_ai/gemini): improve chunk parsing for streaming responses #8401

Open
wants to merge 3 commits into base: main

Conversation

miraclebakelaser
Contributor

This PR fixes bug #8143 by updating the vertex_ai streaming chunk parsing logic to remove only a leading data: prefix.

Relevant issues

#8143

Type

🐛 Bug Fix

Changes

    def _common_chunk_parsing_logic(self, chunk: str) -> GenericStreamingChunk:
        try:
-            chunk = chunk.replace("data:", "")
+            chunk = chunk.strip()
+            if chunk.startswith("data:"):
+                chunk = chunk[len("data:"):].strip()
             if len(chunk) > 0:
                 """
                 Check if initial chunk valid json
                 - if partial json -> enter accumulated json logic
                 - if valid - continue
                 """
                 if self.chunk_type == "valid_json":
                     return self.handle_valid_json_chunk(chunk=chunk)
                 elif self.chunk_type == "accumulated_json":
                     return self.handle_accumulated_json_chunk(chunk=chunk)

Fix incorrect removal of "data:" strings within response content.
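
For illustration only (not part of the PR diff), a minimal sketch of the behavioral difference on a made-up raw SSE chunk whose model text itself contains the string "data:":

# Old behavior: replace() strips every occurrence of "data:",
# including occurrences inside the model's own text.
raw = 'data: {"candidates": [{"content": {"parts": [{"text": "data: line 1"}]}}]}'
old = raw.replace("data:", "")
# -> ' {"candidates": [{"content": {"parts": [{"text": " line 1"}]}}]}'

# New behavior: remove only the leading SSE "data:" prefix.
new = raw.strip()
if new.startswith("data:"):
    new = new[len("data:"):].strip()
# -> '{"candidates": [{"content": {"parts": [{"text": "data: line 1"}]}}]}'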

Contributor

@ishaan-jaff left a comment


please add a test and / or unit test

-            chunk = chunk.replace("data:", "")
+            chunk = chunk.strip()
+            if chunk.startswith("data:"):
+                chunk = chunk[len("data:"):].strip()
Contributor


Is the .strip() needed? Wouldn't this remove a leading or trailing space that actually came from the LLM?

Contributor Author


In the case where self.chunk_type == "valid_json", the strip wouldn't have any effect on the LLM message content itself (added testing for it). I updated the handling to be more precise.

Enhance test coverage for ModelResponseIterator's chunk parsing logic, focusing on preserving 'data:' prefixes and surrounding spaces in Vertex AI and Google Studio Gemini streaming responses
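
For reference, a minimal sketch of such a preservation check (the helper below is hypothetical; the PR's actual tests exercise ModelResponseIterator directly, whose setup is not shown here):

import json

def strip_sse_data_prefix(chunk: str) -> str:
    # Hypothetical standalone helper mirroring the parsing change in this PR:
    # drop only a leading "data:" prefix, never occurrences inside the payload.
    chunk = chunk.strip()
    if chunk.startswith("data:"):
        chunk = chunk[len("data:"):].strip()
    return chunk

def test_data_prefix_inside_text_is_preserved():
    raw = 'data: {"candidates": [{"content": {"role": "model", "parts": [{"text": "data: line 1\\ndata: line 2"}]}}]}'
    parsed = json.loads(strip_sse_data_prefix(raw))
    text = parsed["candidates"][0]["content"]["parts"][0]["text"]
    assert text == "data: line 1\ndata: line 2"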
@miraclebakelaser
Contributor Author

please add a test and / or unit test

Added tests that cover cases where self.chunk_type == "valid_json". How are chunks formatted when self.chunk_type == "accumulated_json"?

I understand that in the case of self.chunk_type == "valid_json", the messages arrive in event stream format and are chunked by message:

chunk 1:
data: {"candidates": [{"content": {"role": "model","parts": [{"text": "data"}]}}],"usageMetadata": {},"modelVersion": "gemini-2.0-flash-exp","createTime": "2025-01-01T00:00:00.000000Z","responseId": "12345"}

chunk 2:
data: {"candidates": [{"content": {"role": "model","parts": [{"text": ": line "}]}}],"modelVersion": "gemini-2.0-flash-exp","createTime": "2025-01-01T00:00:00.000000Z","responseId": "12345"}

chunk 3:
data: {"candidates": [{"content": {"role": "model","parts": [{"text": "1\\ndata: line 2\\ndata: line 3\\ndata:"}]}}],"modelVersion": "gemini-2.0-flash-exp","createTime": "2025-01-01T00:00:00.000000Z","responseId": "12345"}

chunk 4:
data: {"candidates": [{"content": {"role": "model","parts": [{"text": " line 4\\n"}]},"finishReason": "STOP"}],"usageMetadata": {"promptTokenCount": 42,"candidatesTokenCount": 24,"totalTokenCount": 66,"promptTokensDetails": [{"modality": "TEXT","tokenCount": 42}],"candidatesTokensDetails": [{"modality": "TEXT","tokenCount": 24}]},"modelVersion": "gemini-2.0-flash-exp","createTime": "2025-01-01T00:00:00.000000Z","responseId": "12345"}

When self.chunk_type == "accumulated_json", do chunks have the same format as above (i.e. data: ...), or are they something like a JSON string split into pieces, like the following:

chunk 1:
{"candidates": [{"content": {"role": "model","parts

chunk 2:
": [{"text": "```"}]}}],"usageMetadata": {},"modelVersion": "gemini-1.5-flash-001","createTime":

I haven't encountered a situation where the code goes down this path, so your input would help me add a test for that case too.
