Commit e9ddc0e
update doc
Signed-off-by: jiang1.li <[email protected]>
bigPYJ1151 committed Nov 15, 2024
1 parent d301325 commit e9ddc0e
Showing 2 changed files with 6 additions and 7 deletions.
docs/source/getting_started/cpu-installation.rst (4 additions, 5 deletions)
@@ -5,11 +5,10 @@ Installation with CPU

 vLLM initially supports basic model inferencing and serving on x86 CPU platform, with data types FP32, FP16 and BF16. vLLM CPU backend supports the following vLLM features:

-- Tensor Parallel (``-tp = N``)
-- Quantization (``INT8 W8A8, AWQ``)
-
-.. note::
-    More advanced features on `chunked-prefill`, `prefix-caching` and `FP8 KV cache` are under development and will be available soon.
+- Tensor Parallel
+- Model Quantization (``INT8 W8A8, AWQ``)
+- Chunked-prefill
+- Prefix-caching

 Table of contents:

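The feature list added above maps directly onto vLLM's offline inference API. As a rough sketch (the model name and tensor-parallel degree are illustrative placeholders, not part of this commit), enabling these features on the CPU backend looks like this::

    # Minimal sketch: offline inference on the vLLM CPU backend.
    # Assumes a vLLM build with CPU support; model and sizes are placeholders.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="meta-llama/Llama-3.2-1B-Instruct",  # placeholder model
        dtype="bfloat16",             # CPU backend data types: FP32/FP16/BF16
        tensor_parallel_size=2,       # Tensor Parallel
        enable_chunked_prefill=True,  # Chunked-prefill
        enable_prefix_caching=True,   # Prefix-caching
    )

    outputs = llm.generate(
        ["Hello, my name is"],
        SamplingParams(temperature=0.8, max_tokens=32),
    )
    print(outputs[0].outputs[0].text)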
docs/source/serving/compatibility_matrix.rst (2 additions, 2 deletions)
@@ -311,15 +311,15 @@ Feature x Hardware
      - ✅
      - ✅
      - ✅
      -
-     -
+     - ✅
    * - :ref:`APC <apc>`
      - `<https://github.com/vllm-project/vllm/issues/3687>`__
      - ✅
      - ✅
      - ✅
      - ✅
      -
-     -
+     - ✅
    * - :ref:`LoRA <lora>`
      - ✅
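The two cells flipped to ✅ above mark chunked-prefill (CP) and automatic prefix caching (APC) as supported on the CPU backend, matching the new feature list in cpu-installation.rst. As an illustrative sketch of what APC provides (placeholder model name; not part of this commit), a second request that shares a prompt prefix can reuse the cached KV blocks of the first::

    # Illustrative sketch: automatic prefix caching (APC) with vLLM.
    # The shared system prompt is prefilled once; later requests starting
    # with the same prefix reuse its cached KV blocks.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="meta-llama/Llama-3.2-1B-Instruct",  # placeholder model
        enable_prefix_caching=True,                # APC
    )

    system = "You are a terse assistant. Answer in one sentence.\n\n"
    params = SamplingParams(temperature=0.0, max_tokens=32)

    llm.generate([system + "What is vLLM?"], params)  # populates the prefix cache
    out = llm.generate([system + "What is chunked prefill?"], params)  # reuses prefix
    print(out[0].outputs[0].text)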
