
Add speculative decoding to OPEA #617

Open

wants to merge 5 commits into base: main

Conversation

ClarkChin08

Description

Add draft code for a speculative decoding micro-service.

Issues

Speculative decoding support on CPU and GPU.
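For context, speculative decoding follows a draft-then-verify loop: a small draft model proposes a short run of tokens, and the large target model checks them in a single pass, keeping the longest correct prefix. A minimal greedy-acceptance sketch (toy next-token functions stand in for real models; none of these names come from the PR):

```python
def speculative_step(seq, draft_next, target_next, k=4):
    """One round of greedy speculative decoding.

    draft_next/target_next map a token list to the next token (greedy
    decoding); real systems compare model logits instead.
    """
    # 1. Draft model proposes k tokens autoregressively.
    proposal = []
    cur = list(seq)
    for _ in range(k):
        tok = draft_next(cur)
        proposal.append(tok)
        cur.append(tok)

    # 2. Target model verifies the proposal: keep the longest matching
    #    prefix and patch the first mismatch with the target's own token.
    accepted = []
    cur = list(seq)
    for tok in proposal:
        want = target_next(cur)
        if tok != want:
            accepted.append(want)
            return accepted
        accepted.append(tok)
        cur.append(tok)

    # 3. All draft tokens matched: the verification pass also yields
    #    one free "bonus" token from the target.
    accepted.append(target_next(cur))
    return accepted


# Toy models: the target predicts (len + 1) % 10; the draft agrees
# except when the context length is 3.
target = lambda s: (len(s) + 1) % 10
draft = lambda s: 0 if len(s) == 3 else target(s)

print(speculative_step([0], draft, target, k=4))   # → [2, 3, 4]
print(speculative_step([0], target, target, k=2))  # → [2, 3, 4]
```

The second call shows the best case: when draft and target agree, one target pass emits k + 1 tokens, which is where the speed-up comes from.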

Type of change

Select the type of change from the options below and delete any that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)
  • Others (enhancement, documentation, validation, etc.)

Dependencies

Uses a forked vLLM from https://github.com/jiqing-feng/vllm.git
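For reference, upstream vLLM (circa its 2024 releases) exposes speculative decoding through its OpenAI-compatible server. A launch along these lines is probably close to what the fork expects, but the flags and model names here follow upstream vLLM, not this PR, and the fork may differ:

```shell
# Hypothetical server launch with speculative decoding enabled.
# Flag names follow upstream vLLM (~v0.5); verify against the fork.
python -m vllm.entrypoints.openai.api_server \
  --model meta-llama/Llama-2-7b-chat-hf \
  --speculative-model JackFram/llama-68m \
  --num-speculative-tokens 5 \
  --use-v2-block-manager
```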

Tests

Tested with test/test_spec_decode_text-generation_vllm.sh.

@@ -0,0 +1,226 @@
# Copyright (C) 2024 Intel Corporation
Collaborator

This needs refining to follow the new code structure: there should be no docker folder, so this file should be named Dockerfile.nvidia_gpu.

@@ -0,0 +1,53 @@
# This vLLM Dockerfile is used to construct image that can build and run vLLM on x86 CPU platform.
Collaborator

Rename Dockerfile.cpu to Dockerfile.

@letonghan
Collaborator

Hi @ClarkChin08, thanks for contributing.
The code structure of this PR should align with llm/text-generation/vllm/langchain.
Note:

  • Create a langchain folder inside spec_decode/text-generation/vllm.
  • Reorganize all files into the spec_decode/text-generation/vllm/langchain folder.
  • Create a dependency folder at the deepest level and move the vLLM dependency Dockerfile and xx.sh into it.
  • Rename all Dockerfiles as described in the link above.
  • Add a spec_decode folder in tests and move the test script into it.

Thank you : )
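The reorganization above could be scripted roughly as follows. The paths come from the comment, but the working directory and the exact dependency files are assumptions:

```shell
# Sketch of the requested layout (run from the service root,
# e.g. comps/llms; the actual starting location is an assumption).
set -e
base=spec_decode/text-generation/vllm
mkdir -p "$base/langchain/dependency"
mkdir -p tests/spec_decode

# Move the vLLM dependency Dockerfile and helper scripts into dependency/
# (uncomment once the files exist at these paths):
# mv "$base"/Dockerfile.* "$base"/*.sh "$base/langchain/dependency/"
```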

lkk12014402 pushed a commit that referenced this pull request Sep 19, 2024

codecov bot commented Sep 25, 2024

Codecov Report

Attention: Patch coverage is 77.14286% with 8 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| comps/cores/mega/gateway.py | 27.27% | 8 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| comps/cores/mega/constants.py | 98.38% <100.00%> (+0.08%) ⬆️ |
| comps/cores/proto/docarray.py | 99.43% <100.00%> (+0.07%) ⬆️ |
| comps/cores/mega/gateway.py | 35.10% <27.27%> (-0.22%) ⬇️ |

... and 1 file with indirect coverage changes

Labels: none yet
Projects: none yet
3 participants