
[Feature] Combine Batched Inference and Chat Conversation in VLMs Deployment #2628

Open
Yusepp opened this issue Oct 21, 2024 · 0 comments

Yusepp commented Oct 21, 2024

Motivation

In the Vision-Language Models (VLMs) Deployment section, particularly under the Offline Inference Pipeline, the examples demonstrate two separate functionalities (sketched after this list):

  • running batched inference over a list of prompts;
  • running a multi-turn chat conversation, which processes multiple inputs one session at a time.
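
Roughly, and paraphrasing the documented examples (the model name and image paths here are placeholders, not the exact values from the docs), the two patterns look like this:

```python
from lmdeploy import pipeline
from lmdeploy.vl import load_image

pipe = pipeline('OpenGVLab/InternVL2-8B')  # placeholder: any supported VLM

# (1) Batched inference: a list of (text, image) prompts in one call.
images = [load_image(f'photo_{i}.jpg') for i in range(8)]
prompts = [('describe this image', img) for img in images]
responses = pipe(prompts)

# (2) Chat conversation: one session, advanced turn by turn.
sess = pipe.chat(('describe this image', load_image('photo_0.jpg')))
sess = pipe.chat('What stands out the most?', session=sess)
```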

I’m wondering if there's a way to combine both approaches. Specifically, I’d like to provide a list of lists, where each inner list represents one conversation (composed of (text, image) tuples, or just text), and run batched inference over, say, 8 conversations at a time.

This would streamline scenarios involving multiple conversations with mixed image and text inputs. Is there any existing support for this, or could it be considered as a new feature? A sketch of the desired call pattern follows.
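
To make the request concrete, here is a hypothetical sketch; the nested-list input format and the batched-chat call are the proposal itself, not an existing LMDeploy API:

```python
# Hypothetical usage: nothing below is a documented LMDeploy API yet.
from lmdeploy import pipeline
from lmdeploy.vl import load_image

pipe = pipeline('OpenGVLab/InternVL2-8B')  # placeholder model

# Each inner list is one conversation; each turn is either plain text
# or a (text, image) tuple, matching the existing single-chat format.
conversations = [
    [('Describe this image.', load_image('photo_0.jpg')),
     'Now summarize your description in one sentence.'],
    [('What objects are visible here?', load_image('photo_1.jpg'))],
    # ... up to 8 conversations per batch
]

# Desired behavior: advance all conversations in one batched call.
responses = pipe.chat_batched(conversations)  # hypothetical method name
```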

Thanks in advance!

Related resources

No response

Additional context

No response
