[Model] Refactoring of MiniCPM-V and add MiniCPM-o-2.6 support for vLLM #12069
base: main
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
Really appreciate the effort planned for this PR!
It would be great if you could share the design decisions for these two items as an RFC (or two separate RFCs) before we proceed with implementation. We (the vLLM team) are also thinking about how we want to support multimodal output and a streaming/realtime API in vLLM, so it's probably the best time for us to discuss these items!
Thank you for the suggestion! I'll start these two RFCs tomorrow.
@DarkLight1337 I think I might need some help with verifying LoRA support. Should I make any changes for it?
@jeejeelee can help with this. Please keep in mind, though, that LoRA is currently only supported for the language part of multi-modal models.
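For reference, a minimal sketch of how LoRA could be exercised against the language backbone once support is verified; the checkpoint name, adapter path, and prompt below are placeholders rather than anything this PR prescribes:

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Placeholder checkpoint and adapter paths -- substitute a real MiniCPM-V
# checkpoint and a LoRA adapter trained on its language backbone
# (LoRA currently applies only to the language part of multi-modal models).
llm = LLM(
    model="openbmb/MiniCPM-V-2_6",
    trust_remote_code=True,
    enable_lora=True,
)

outputs = llm.generate(
    "Describe the weather in one sentence.",
    SamplingParams(max_tokens=64),
    lora_request=LoRARequest("demo_adapter", 1, "/path/to/lora_adapter"),
)
print(outputs[0].outputs[0].text)
```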
Signed-off-by: hzh <[email protected]>
Signed-off-by: hzh <[email protected]>
…tended design (vllm-project#11672) Signed-off-by: Sungjae Lee <[email protected]> Signed-off-by: hzh <[email protected]>
…ect#11921) Signed-off-by: shaochangxu.scx <[email protected]> Co-authored-by: shaochangxu.scx <[email protected]> Signed-off-by: hzh <[email protected]>
…ject#11934) Signed-off-by: DarkLight1337 <[email protected]> Signed-off-by: hzh <[email protected]>
…#11951) Signed-off-by: DarkLight1337 <[email protected]> Signed-off-by: hzh <[email protected]>
Signed-off-by: NickLucche <[email protected]> Signed-off-by: hzh <[email protected]>
Signed-off-by: Isotr0py <[email protected]> Co-authored-by: Isotr0py <[email protected]> Signed-off-by: hzh <[email protected]>
Signed-off-by: Roger Wang <[email protected]> Signed-off-by: hzh <[email protected]>
Signed-off-by: Rafael Vasquez <[email protected]> Signed-off-by: hzh <[email protected]>
Signed-off-by: Isotr0py <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Signed-off-by: hzh <[email protected]>
…roject#11100) Signed-off-by: Akshat Tripathi <[email protected]> Signed-off-by: Oleg Mosalov <[email protected]> Signed-off-by: Jee Jee Li <[email protected]> Co-authored-by: Oleg Mosalov <[email protected]> Co-authored-by: Jee Jee Li <[email protected]> Co-authored-by: Isotr0py <[email protected]> Signed-off-by: hzh <[email protected]>
Signed-off-by: hzh <[email protected]>
Signed-off-by: [email protected] <[email protected]> Signed-off-by: hzh <[email protected]>
…m-project#9685) Signed-off-by: Isotr0py <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Signed-off-by: hzh <[email protected]>
…project#11973) Signed-off-by: [email protected] <[email protected]> Signed-off-by: hzh <[email protected]>
Signed-off-by: hzh <[email protected]>
…project#11979) Signed-off-by: hzh <[email protected]>
Signed-off-by: Sungjae Lee <[email protected]> Signed-off-by: hzh <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
if "image" in result["mm_placeholders"] and \ | ||
self.info.get_model_version() in [(2, 6), (2, "6O")]: | ||
result["mm_placeholders"]["image"] = [ | ||
PlaceholderRange(offset=p["offset"] + 3 + idx // 10, | ||
length=p["length"] - 3 - idx // 10) | ||
for idx, p in enumerate(result["mm_placeholders"]["image"]) | ||
] |
You can use PromptReplacementDetails (introduced by #12269) to simplify this code.
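A rough sketch of how that might look, assuming the PromptReplacementDetails API from #12269 exposes `full`/`features` fields so that only part of the inserted text counts as the image placeholder; the helper name, prefix format, and arguments here are illustrative, not the code actually adopted in this PR:

```python
from vllm.multimodal.processing import (PromptReplacement,
                                        PromptReplacementDetails)

def _image_prompt_replacements(image_token: str, image_repl_texts: list[str]):
    # With `full`/`features`, the framework derives placeholder offsets and
    # lengths itself, so the manual "+ 3 + idx // 10" arithmetic above goes away.
    def get_replacement(item_idx: int):
        prefix = f"<image_id>{item_idx}</image_id>"  # illustrative per-image prefix
        return PromptReplacementDetails(
            full=prefix + image_repl_texts[item_idx],
            features=image_repl_texts[item_idx],
        )

    return [
        PromptReplacement(
            modality="image",
            target=image_token,
            replacement=get_replacement,
        )
    ]
```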
Co-authored-by: Cyrus Leung <[email protected]>
Please take a look at the earlier CI failures: https://buildkite.com/vllm/fastcheck/builds/12337#0194961e-b0c4-46a6-8033-93d6ca567c9f Can you fix the tests locally? Afterwards I'll unblock the tests on CI.
When will this feature be available if only non-streaming image input is used?
For non-streaming image input support of
Co-authored-by: Cyrus Leung <[email protected]>
I am very eager to test the MiniCPM-o-2.6 model with the vLLM engine, but this PR hasn't been merged yet. For vLLM inference of MiniCPM-o-2.6 with audio input and output on a 4090/5090 GPU, roughly what concurrency and latency would you estimate?
Will the TTS feature be merged within the next couple of days? (I ask because TTS isn't checked off on your roadmap.) If so, that would be great.
This PR aims to adapt and support all the features of MiniCPM-V and MiniCPM-o. It is designed to be compatible with various modalities (image, video, audio), different model versions (2.0, 2.5, 2.6, o), and diverse input types (raw, embeddings), while maintaining support for LoRA, which might require significant effort.
Below is the roadmap for this PR:
MultiModalInputsV2 of vLLM.
This PR is still in development. Once I complete the support for audio, I will request to merge. I'll get this work done ASAP.
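For readers who want to try the model once this lands, a minimal offline-inference sketch; the checkpoint name, chat formatting, and image placeholder below are assumptions based on existing MiniCPM-V examples, not something this PR guarantees:

```python
from PIL import Image
from vllm import LLM, SamplingParams

# Assumed HF repo name; adjust to the actual MiniCPM-o-2.6 checkpoint.
llm = LLM(
    model="openbmb/MiniCPM-o-2_6",
    trust_remote_code=True,
    max_model_len=4096,
    limit_mm_per_prompt={"image": 1},
)

image = Image.open("example.jpg").convert("RGB")
# "(<image>./</image>)" is the image placeholder used in existing MiniCPM-V
# examples; the exact prompt template may differ for MiniCPM-o.
prompt = ("<|im_start|>user\n(<image>./</image>)\n"
          "What is in this image?<|im_end|>\n<|im_start|>assistant\n")

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(temperature=0.0, max_tokens=64),
)
print(outputs[0].outputs[0].text)
```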
FIX #12162