
Commit 9d04a0e: Small fix
Oscilloscope98 committed Jan 23, 2025
1 parent ad4c611
Showing 3 changed files with 15 additions and 15 deletions.
@@ -1,7 +1,7 @@
# MiniCPM-o-2_6
In this directory, you will find examples of how to apply IPEX-LLM INT4 optimizations on the MiniCPM-o-2_6 model on [Intel GPUs](../../../README.md). For illustration purposes, we utilize [openbmb/MiniCPM-o-2_6](https://huggingface.co/openbmb/MiniCPM-o-2_6) as a reference MiniCPM-o-2_6 model.

-In the following examples, we will guide you to apply IPEX-LLM optimizations on MiniCPM-o-2_6 model for text/image/audio/video inputs.
+In the following examples, we will guide you through applying IPEX-LLM optimizations on the MiniCPM-o-2_6 model for text/audio/image/video inputs.

## 0. Requirements & Installation

@@ -18,7 +18,7 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
pip install --pre --upgrade ipex-llm[xpu_lnl] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/lnl/us/
pip install torchaudio==2.3.1.post0 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/lnl/us/
```
-- For **Intel Arc B-Series GPU (code name Battlemage)** on Linux
+- For **Intel Arc B-Series GPU (code name Battlemage)** on Linux:
```bash
conda create -n llm python=3.11
conda activate llm
@@ -49,19 +49,19 @@ pip install moviepy
```cmd
set SYCL_CACHE_PERSISTENT=1
```
-- For **Intel Arc B-Series GPU (code name Battlemage)** on Linux
+- For **Intel Arc B-Series GPU (code name Battlemage)** on Linux:
```bash
unset OCL_ICD_VENDOR
export SYCL_CACHE_PERSISTENT=1
```

> [!NOTE]
-> We will update for runtime Configuration on more Intel GPU platforms.
+> We will update runtime configuration guidance for more Intel GPU platforms.
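
The shell configuration above can also be prepared from Python when launching the examples programmatically. A minimal sketch follows; the `gpu_runtime_env` helper is an assumption for illustration, not part of these examples:

```python
import os

def gpu_runtime_env(base=None):
    """Build an environment dict with the recommended runtime configuration.

    Mirrors the shell snippets above: enables the persistent SYCL cache and
    drops OCL_ICD_VENDOR (relevant on Linux with Arc B-Series GPUs).
    """
    env = dict(os.environ if base is None else base)
    env["SYCL_CACHE_PERSISTENT"] = "1"   # equivalent to set/export above
    env.pop("OCL_ICD_VENDOR", None)      # equivalent to `unset OCL_ICD_VENDOR`
    return env
```

The returned dict can then be passed as `env=` to `subprocess.run` when starting `omni.py` or `chat.py`.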
### 1. Example: Chat in Omni Mode
In [omni.py](./omni.py), we show a use case for a MiniCPM-o-2_6 model to chat in omni mode with IPEX-LLM INT4 optimizations on Intel GPUs. In this example, the model takes a video as input and conducts inference based on the images and audio of this video.

-For example, the video input shows a clip of an athlete swimming, with background audio asking "What the athlete is doing?". Then the model in omni mode should inference based on the images and the question in audio.
+For example, the video input shows a clip of an athlete swimming, with background audio asking "What the athlete is doing?". The model in omni mode should then run inference based on the images of the video and the question in the audio.

#### 1.1 Running example

@@ -70,7 +70,7 @@ python omni.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --video-path VIDEO_
```

Arguments info:
-- `--repo-id-or-model-path REPO_ID_OR_MODEL_PATH`: argument defining the huggingface repo id for the openbmb/MiniCPM-o-2_6 (e.g. `openbmb/MiniCPM-o-2_6`) to be downloaded, or the path to the huggingface checkpoint folder. It is default to be `'openbmb/MiniCPM-o-2_6'`.
+- `--repo-id-or-model-path REPO_ID_OR_MODEL_PATH`: argument defining the Hugging Face repo id for the MiniCPM-o-2_6 model (e.g. `openbmb/MiniCPM-o-2_6`) to be downloaded, or the path to the Hugging Face checkpoint folder. It defaults to `'openbmb/MiniCPM-o-2_6'`.
- `--video-path VIDEO_PATH`: argument defining the video input.
- `--n-predict N_PREDICT`: argument defining the max number of tokens to predict. It defaults to `32`.
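
The argument list above can be sketched with the standard-library `argparse` module. This is an illustrative reconstruction using the documented defaults, not the exact code of `omni.py`:

```python
import argparse

# Reconstruction of the command-line interface described above
# (defaults as documented; --video-path is required).
parser = argparse.ArgumentParser(description='Chat with MiniCPM-o-2_6 in Omni mode')
parser.add_argument('--repo-id-or-model-path', type=str,
                    default='openbmb/MiniCPM-o-2_6',
                    help='Hugging Face repo id or path to the checkpoint folder')
parser.add_argument('--video-path', type=str, required=True,
                    help='Path to the input video')
parser.add_argument('--n-predict', type=int, default=32,
                    help='Max number of tokens to predict')

# Example invocation with only the required argument
# (the video filename here is a placeholder):
args = parser.parse_args(['--video-path', 'swimming.mp4'])
```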

@@ -87,7 +87,7 @@ In [chat.py](./chat.py), we show a use case for a MiniCPM-V-2_6 model to chat ba

- Chat with text input
```bash
-python chat.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --prompt Prompt
+python chat.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --prompt PROMPT
```

- Chat with audio input
@@ -102,12 +102,12 @@ In [chat.py](./chat.py), we show a use case for a MiniCPM-V-2_6 model to chat ba

- Chat with text + audio inputs
```bash
-python chat.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --prompt Prompt --audio-path AUDIO_PATH
+python chat.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --prompt PROMPT --audio-path AUDIO_PATH
```

- Chat with text + image inputs
```bash
-python chat.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --prompt Prompt --image-path IMAGE_PATH
+python chat.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --prompt PROMPT --image-path IMAGE_PATH
```

- Chat with audio + image inputs
@@ -117,10 +117,10 @@ In [chat.py](./chat.py), we show a use case for a MiniCPM-V-2_6 model to chat ba


Arguments info:
-- `--repo-id-or-model-path REPO_ID_OR_MODEL_PATH`: argument defining the huggingface repo id for the openbmb/MiniCPM-o-2_6 (e.g. `openbmb/MiniCPM-o-2_6`) to be downloaded, or the path to the huggingface checkpoint folder. It is default to be `'openbmb/MiniCPM-o-2_6'`.
-- `--image-path IMAGE_PATH`: argument defining the image input.
-- `--audio-path AUDIO_PATH`: argument defining the audio input.
+- `--repo-id-or-model-path REPO_ID_OR_MODEL_PATH`: argument defining the Hugging Face repo id for the MiniCPM-o-2_6 model (e.g. `openbmb/MiniCPM-o-2_6`) to be downloaded, or the path to the Hugging Face checkpoint folder. It defaults to `'openbmb/MiniCPM-o-2_6'`.
+- `--prompt PROMPT`: argument defining the text input.
+- `--audio-path AUDIO_PATH`: argument defining the audio input.
+- `--image-path IMAGE_PATH`: argument defining the image input.
- `--n-predict N_PREDICT`: argument defining the max number of tokens to predict. It defaults to `32`.
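
Since `chat.py` accepts any combination of the three modalities, the input handling can be sketched as below. `collect_inputs` is a hypothetical helper for illustration, not a function from `chat.py`:

```python
def collect_inputs(prompt=None, audio_path=None, image_path=None):
    """Gather the provided modalities into an ordered content list.

    At least one of the three inputs must be given, matching the
    text/audio/image combinations described above.
    """
    contents = []
    if image_path is not None:
        contents.append(('image', image_path))
    if audio_path is not None:
        contents.append(('audio', audio_path))
    if prompt is not None:
        contents.append(('text', prompt))
    if not contents:
        raise ValueError('Provide at least one of prompt, audio_path, image_path')
    return contents
```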

> [!TIP]
@@ -28,8 +28,8 @@


if __name__ == '__main__':
-    parser = argparse.ArgumentParser(description='Chat with MiniCPM-o-2_6 in Omni mode')
-    parser.add_argument('--repo-id-or-model-path', type=str,
+    parser = argparse.ArgumentParser(description='Chat with MiniCPM-o-2_6 with text/audio/image')
+    parser.add_argument('--repo-id-or-model-path', type=str, default="openbmb/MiniCPM-o-2_6",
                        help='The Hugging Face or ModelScope repo id for the MiniCPM-o-2_6 model to be downloaded'
                             ', or the path to the checkpoint folder')
    parser.add_argument('--image-path', type=str,
@@ -54,7 +54,7 @@ def get_video_chunk_content(video_path, temp_audio_name, flatten=True):

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Chat with MiniCPM-o-2_6 in Omni mode')
-    parser.add_argument('--repo-id-or-model-path', type=str,
+    parser.add_argument('--repo-id-or-model-path', type=str, default="openbmb/MiniCPM-o-2_6",
                        help='The Hugging Face or ModelScope repo id for the MiniCPM-o-2_6 model to be downloaded'
                             ', or the path to the checkpoint folder')
    parser.add_argument('--video-path', type=str, required=True,