From 48381ddefce33535cdd2a61596ca97546e4deea5 Mon Sep 17 00:00:00 2001
From: Dustin Franklin
Date: Sat, 9 Mar 2024 20:56:35 -0500
Subject: [PATCH] updated commands

---
 docs/tutorial_live-llava.md | 38 +++++++++++++++++++-------------------
 docs/tutorial_nano-vlm.md   |  2 +-
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/docs/tutorial_live-llava.md b/docs/tutorial_live-llava.md
index eae16ad3..2a5aeaac 100644
--- a/docs/tutorial_live-llava.md
+++ b/docs/tutorial_live-llava.md
@@ -47,12 +47,12 @@ The [VideoQuery](https://github.com/dusty-nv/jetson-containers/blob/master/packa
 
 ```bash
 ./run.sh $(./autotag local_llm) \
-  python3 -m local_llm.agents.video_query --api=mlc --verbose \
-    --model Efficient-Large-Model/VILA-2.7b \
-    --max-context-len 768 \
-    --max-new-tokens 32 \
-    --video-input /dev/video0 \
-    --video-output webrtc://@:8554/output
+  python3 -m local_llm.agents.video_query --api=mlc \
+    --model Efficient-Large-Model/VILA-2.7b \
+    --max-context-len 768 \
+    --max-new-tokens 32 \
+    --video-input /dev/video0 \
+    --video-output webrtc://@:8554/output
 ```
 
@@ -67,12 +67,12 @@ The example above was running on a live camera, but you can also read and write
 ```bash
 ./run.sh \
   -v /path/to/your/videos:/mount $(./autotag local_llm) \
-  python3 -m local_llm.agents.video_query --api=mlc --verbose \
-    --model Efficient-Large-Model/VILA-2.7b \
-    --max-new-tokens 32 \
-    --video-input /mount/my_video.mp4 \
-    --video-output /mount/output.mp4 \
-    --prompt "What does the weather look like?"
+  python3 -m local_llm.agents.video_query --api=mlc \
+    --model Efficient-Large-Model/VILA-2.7b \
+    --max-new-tokens 32 \
+    --video-input /mount/my_video.mp4 \
+    --video-output /mount/output.mp4 \
+    --prompt "What does the weather look like?"
 ```
 
 This example processes and pre-recorded video (in MP4, MKV, AVI, FLV formats with H.264/H.265 encoding), but it also can input/output live network streams like [RTP](https://github.com/dusty-nv/jetson-inference/blob/master/docs/aux-streaming.md#rtp), [RTSP](https://github.com/dusty-nv/jetson-inference/blob/master/docs/aux-streaming.md#rtsp), and [WebRTC](https://github.com/dusty-nv/jetson-inference/blob/master/docs/aux-streaming.md#webrtc) using Jetson's hardware-accelerated video codecs.
@@ -85,13 +85,13 @@ To enable this mode, first follow the [**NanoDB tutorial**](tutorial_nanodb.md)
 
 ```bash
 ./run.sh $(./autotag local_llm) \
-  python3 -m local_llm.agents.video_query --api=mlc --verbose \
-    --model Efficient-Large-Model/VILA-2.7b \
-    --max-context-len 768 \
-    --max-new-tokens 32 \
-    --video-input /dev/video0 \
-    --video-output webrtc://@:8554/output \
-    --nanodb /data/nanodb/coco/2017
+  python3 -m local_llm.agents.video_query --api=mlc \
+    --model Efficient-Large-Model/VILA-2.7b \
+    --max-context-len 768 \
+    --max-new-tokens 32 \
+    --video-input /dev/video0 \
+    --video-output webrtc://@:8554/output \
+    --nanodb /data/nanodb/coco/2017
 ```
 
 You can also tag incoming images and add them to the database using the panel in the web UI.
diff --git a/docs/tutorial_nano-vlm.md b/docs/tutorial_nano-vlm.md
index 090c0a57..798fee84 100644
--- a/docs/tutorial_nano-vlm.md
+++ b/docs/tutorial_nano-vlm.md
@@ -116,7 +116,7 @@ These models can also be used with the [Live Llava](tutorial_live-llava.md) agen
 ``` bash
 ./run.sh $(./autotag local_llm) \
   python3 -m local_llm.agents.video_query --api=mlc \
-    --model NousResearch/Obsidian-3B-V0.5 \
+    --model Efficient-Large-Model/VILA-2.7b \
     --max-context-len 768 \
     --max-new-tokens 32 \
     --video-input /dev/video0 \
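
For reference, a minimal sketch of applying a mailbox-format patch like the one above to a local checkout of the docs repository. The `0001-updated-commands.patch` filename is an assumption (the default name `git format-patch` would derive from the subject line):

```bash
# assumed filename; save the patch text above to this file first
git apply --stat 0001-updated-commands.patch    # preview the diffstat without applying
git apply --check 0001-updated-commands.patch   # verify the patch applies cleanly
git am 0001-updated-commands.patch              # apply it, preserving the original author and date
```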