From 1ac843a37263f6a1611f3416cf545ace14e03b22 Mon Sep 17 00:00:00 2001
From: Michael Wyatt
Date: Fri, 19 Jan 2024 15:00:46 -0800
Subject: [PATCH] Update README.md

---
 blogs/deepspeed-fastgen/2024-01-19/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/blogs/deepspeed-fastgen/2024-01-19/README.md b/blogs/deepspeed-fastgen/2024-01-19/README.md
index 98a9346441a4..9a5c8a83df46 100644
--- a/blogs/deepspeed-fastgen/2024-01-19/README.md
+++ b/blogs/deepspeed-fastgen/2024-01-19/README.md
@@ -29,7 +29,7 @@ Today, we are happy to share that we are improving DeepSpeed-FastGen along three
 - **Performance Optimizations**
 
-  We drastically reduced the scheduling overhead of Dynamic SplitFuse and increased the efficiency of token sampling. As a result, we see higher throughput and lower latency, particularly when handling concurrent requests from many clients. We demonstrate the performance optimizations with benchmarks and evaluation of DeepSpeed-FastGen against vLLM for the newly added model families. The benchmark results can be seen in [Performance Evaluation](#performance-evaluation) and the benchmark code is available at [DeepSpeedExamples](https://github.com/microsoft/DeepSpeedExamples/tree/master/benchmarks/inference/mii).
+  We drastically reduced the scheduling overhead of Dynamic SplitFuse and increased the efficiency of token sampling. As a result, we see higher throughput and lower latency, particularly when handling concurrent requests from many clients. We demonstrate the performance optimizations with benchmarks and evaluation of DeepSpeed-FastGen against vLLM for the newly added model families. The benchmark results can be seen in [Performance Evaluation](#performance-optimizations) and the benchmark code is available at [DeepSpeedExamples](https://github.com/microsoft/DeepSpeedExamples/tree/master/benchmarks/inference/mii).
 
 - **Feature Enhancements**