Merge pull request #43 from tokk-nv/dev-nanoowl
Add NanoOWL tutorial
tokk-nv authored Nov 7, 2023
2 parents 296fbf1 + 1291aac commit 24249e2
Showing 3 changed files with 99 additions and 9 deletions.
20 changes: 17 additions & 3 deletions docs/overrides/home.html
@@ -21,7 +21,7 @@
 }
 
 .tx-hero {
-    margin: 32px 2.8rem;
+    margin: 32px 1.8rem;
     color: var(--md-default-fg-color);
     justify-content: center;
 }
@@ -38,12 +38,26 @@
 }
 
 .tx-hero__image{
-    width:17rem;
-    height:17rem;
+    width:32rem;
+    height:18rem;
     order:1;
     padding-right: 2.5rem;
 }
+
+.tx-hero__ytvideo{
+    position: relative;
+    width: 100%;
+    height: 100%;
+}
+
+.video {
+    position: absolute;
+    top: 0;
+    left: 0;
+    width: 100%;
+    height: 100%;
+}
 
 .tx-hero .md-button {
     margin-top: .5rem;
     margin-right: .5rem;
80 changes: 78 additions & 2 deletions docs/tutorial_nanoowl.md
@@ -1,5 +1,81 @@
# Tutorial - NanoOWL

-Check out the GitHub repo, [https://github.com/NVIDIA-AI-IOT/nanoowl](https://github.com/NVIDIA-AI-IOT/nanoowl).
+Let's run [NanoOWL](https://github.com/NVIDIA-AI-IOT/nanoowl), a project that optimizes [OWL-ViT](https://huggingface.co/docs/transformers/model_doc/owlvit) to run in real-time on Jetson with [NVIDIA TensorRT](https://developer.nvidia.com/tensorrt).

![](https://raw.githubusercontent.com/NVIDIA-AI-IOT/nanoowl/main/assets/tree_predict_out.jpg)
![](https://github.com/NVIDIA-AI-IOT/nanoowl/raw/main/assets/jetson_person_2x.gif)

!!! abstract "What you need"

1. One of the following Jetson devices:

<span class="blobDarkGreen4">Jetson AGX Orin (64GB)</span>
<span class="blobDarkGreen5">Jetson AGX Orin (32GB)</span>
<span class="blobLightGreen4">Jetson Orin Nano (8GB)</span>

2. Running one of the following versions of [JetPack](https://developer.nvidia.com/embedded/jetpack):

<span class="blobPink1">JetPack 5.1.2 (L4T r35.4.1)</span>
<span class="blobPink2">JetPack 5.1.1 (L4T r35.3.1)</span>
<span class="blobPink3">JetPack 5.1 (L4T r35.2.1)</span>

3. Sufficient storage space (preferably with NVMe SSD).

- `7.2 GB` for container image
- Space for models

## Clone and set up `jetson-containers`

```bash
git clone https://github.com/dusty-nv/jetson-containers
cd jetson-containers
sudo apt update; sudo apt install -y python3-pip
pip3 install -r requirements.txt
```
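The storage requirement listed above can be sanity-checked before pulling anything. A small Python sketch (illustrative only, not part of `jetson-containers`; the `7.2 GB` threshold is the container-image size quoted above):

```python
import shutil

REQUIRED_GB = 7.2  # container image size quoted above; models need extra room

def has_space(path="/", required_gb=REQUIRED_GB):
    """Return (enough, free_gb) for the filesystem holding `path`."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb, free_gb

ok, free = has_space()
print(f"{free:.1f} GB free -> {'OK' if ok else 'not enough for the container image'}")
```

Run it on the NVMe mount you plan to use (e.g. `has_space("/mnt/nvme")`) rather than the rootfs if they differ.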

## How to start

Use the `run.sh` and `autotag` scripts to automatically pull or build a compatible container image.

```bash
cd jetson-containers
./run.sh $(./autotag nanoowl)
```
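`autotag` selects an image that matches the L4T release on your device, which Jetson records in `/etc/nv_tegra_release`. As a rough illustration of that version check — this is a hypothetical sketch, not `jetson-containers`' actual code — the version string can be recovered like this:

```python
import re

def parse_l4t_release(text):
    """Extract an L4T version such as 'r35.4.1' from /etc/nv_tegra_release.

    A typical first line looks like:
    '# R35 (release), REVISION: 4.1, GCID: 33958178, BOARD: t186ref, ...'
    """
    m = re.search(r"R(\d+)\s*\(release\),\s*REVISION:\s*([\d.]+)", text)
    return f"r{m.group(1)}.{m.group(2)}" if m else None

print(parse_l4t_release("# R35 (release), REVISION: 4.1, GCID: 33958178"))  # r35.4.1
```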

## How to run the tree prediction (live camera) example

1. Ensure you have a camera device connected

```bash
ls /dev/video*
```
> If no video device is found, exit the container and check whether a video device is visible on the host side.
2. Launch the demo
```bash
cd examples/tree_demo
python3 tree_demo.py ../../data/owl_image_encoder_patch32.engine
```
!!! info
If it fails to find or load the TensorRT engine file, build the TensorRT engine for the OWL-ViT vision encoder on your Jetson device.
```bash
python3 -m nanoowl.build_image_encoder_engine \
data/owl_image_encoder_patch32.engine
```
3. Open your browser and access `http://<ip address>:7860`
4. Type whatever prompt you like to see what works!
Here are some examples:
- Example: `[a face [a nose, an eye, a mouth]]`
- Example: `[a face (interested, yawning / bored)]`
- Example: `(indoors, outdoors)`
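In these prompts, square brackets ask the model to detect each item and parentheses ask it to classify among the items, and groups nest. To make that grammar concrete, here is a toy parser for the bracket syntax — an illustrative sketch only, not NanoOWL's actual prompt-parsing code:

```python
def parse_tree_prompt(text):
    """Parse a nested prompt such as '[a face [a nose, an eye]]'.

    '[...]' becomes a 'detect' node and '(...)' a 'classify' node;
    plain comma-separated entries stay as strings.
    """
    ops = {"[": ("detect", "]"), "(": ("classify", ")")}

    def parse(i):
        op, close = ops[text[i]]
        items, label, i = [], "", i + 1
        while i < len(text):
            ch = text[i]
            if ch == close:                 # end of this group
                if label.strip():
                    items.append(label.strip())
                return {"op": op, "items": items}, i + 1
            if ch == ",":                   # next sibling entry
                if label.strip():
                    items.append(label.strip())
                label, i = "", i + 1
            elif ch in ops:                 # nested group
                if label.strip():
                    items.append(label.strip())
                label = ""
                node, i = parse(i)
                items.append(node)
            else:
                label, i = label + ch, i + 1
        raise ValueError("unbalanced brackets in prompt")

    node, _ = parse(0)
    return node

print(parse_tree_prompt("[a face [a nose, an eye, a mouth]]"))
```

For example, `(indoors, outdoors)` parses to a single `classify` node with two string items, while the face prompt above yields a `detect` node whose items include a nested `detect` group.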
### Result
![](https://github.com/NVIDIA-AI-IOT/nanoowl/raw/main/assets/jetson_person_2x.gif)
8 changes: 4 additions & 4 deletions mkdocs.yml
@@ -78,17 +78,17 @@ nav:
     - Stable Diffusion: tutorial_stable-diffusion.md
     - Stable Diffusion XL: tutorial_stable-diffusion-xl.md
     - Vision Transformers (ViT):
-      - EfficientViT: tutorial_efficientvit.md
+      - EfficientViT 🆕: tutorial_efficientvit.md
       - NanoSAM: tutorial_nanosam.md
-      - NanoOWL: tutorial_nanoowl.md
+      - NanoOWL 🆕: tutorial_nanoowl.md
       - SAM: tutorial_sam.md
       - TAM: tutorial_tam.md
       # - NanoOWL: tutorial_nanoowl.md
     - Vector Database:
       - NanoDB: tutorial_nanodb.md
     - Audio:
-      - Audiocraft: tutorial_audiocraft.md
-      - Whisper: tutorial_whisper.md
+      - Audiocraft 🆕: tutorial_audiocraft.md
+      - Whisper 🆕: tutorial_whisper.md
     # - Tools:
     #   - LangChain: tutorial_distillation.md
     - Tips:
