Added a Gradio UI for multi-modal inferencing using Llama 3.2 Vision/ #718

himanshushukla12 · 2024-10-08T17:48:50Z

What does this PR do?

This PR introduces multi-modal inference through a Gradio UI for Llama 3.2 Vision models. The UI lets users upload an image and generate descriptive text from a prompt in a chatbox-like interface, with adjustable max-tokens, temperature, top-k and top-p parameters for controlling text generation.
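As a rough illustration of how the adjustable parameters could reach the model, here is a minimal sketch that maps slider values onto the keyword arguments accepted by the transformers `generate` method (`max_new_tokens`, `do_sample`, `temperature`, `top_k`, `top_p`). The helper names `build_generation_kwargs` and `build_ui`, the slider ranges and defaults, and the plain textbox output are assumptions for illustration only; they are not taken from this PR's code, which uses a chatbox-style layout.

```python
from typing import Any, Callable, Dict


def build_generation_kwargs(max_tokens: int, temperature: float,
                            top_k: int, top_p: float) -> Dict[str, Any]:
    """Translate UI slider values into keyword arguments for `generate` (hypothetical helper)."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    if temperature < 0.0:
        raise ValueError("temperature must be non-negative")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in the interval (0, 1]")

    # A temperature of 0 is treated as greedy decoding, so sampling is disabled
    # and the sampling-only parameters are left out.
    do_sample = temperature > 0.0
    kwargs: Dict[str, Any] = {"max_new_tokens": int(max_tokens), "do_sample": do_sample}
    if do_sample:
        kwargs.update(temperature=float(temperature), top_k=int(top_k), top_p=float(top_p))
    return kwargs


def build_ui(describe_fn: Callable[..., str]):
    """Wire an image, a prompt and four sliders to `describe_fn` (assumed layout)."""
    import gradio as gr  # imported lazily so the helper above works without gradio installed

    with gr.Blocks() as demo:
        image = gr.Image(type="pil", label="Image")
        prompt = gr.Textbox(label="Prompt")
        # Ranges and defaults below are illustrative assumptions.
        max_tokens = gr.Slider(minimum=1, maximum=2048, value=512, step=1, label="max_tokens")
        temperature = gr.Slider(minimum=0.0, maximum=2.0, value=0.7, step=0.1, label="temperature")
        top_k = gr.Slider(minimum=1, maximum=100, value=50, step=1, label="top_k")
        top_p = gr.Slider(minimum=0.05, maximum=1.0, value=0.9, step=0.05, label="top_p")
        output = gr.Textbox(label="Response")
        gr.Button("Generate").click(
            describe_fn,
            inputs=[image, prompt, max_tokens, temperature, top_k, top_p],
            outputs=output,
        )
    return demo
```

A caller would pass its own inference function as `describe_fn`, unpack `build_generation_kwargs(...)` into `model.generate(**inputs, **kwargs)`, and start the app with `build_ui(describe_fn).launch()`.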

Additionally, this PR:

Integrates the transformers and accelerate libraries for efficient model loading and inference.
Implements memory management for releasing GPU resources after inference.
Adds support for Hugging Face tokens to authenticate and access Llama models.
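The token handling and memory cleanup listed above could look roughly like the following sketch. The flag name `--hf_token`, the `HF_TOKEN` environment-variable fallback, and the helper names are assumptions for illustration, not the PR's confirmed implementation. The cleanup uses `gc.collect()` and `torch.cuda.empty_cache()`, which only return memory once the caller has dropped its own references to the model and tensors.

```python
import argparse
import gc
import os
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Read the Hugging Face token from the command line (flag name is an assumption)."""
    parser = argparse.ArgumentParser(description="Multi-modal inference with a Gradio UI")
    parser.add_argument(
        "--hf_token",
        default=os.environ.get("HF_TOKEN"),  # assumed fallback, not part of the PR description
        help="Hugging Face access token used to download gated Llama models",
    )
    return parser.parse_args(argv)


def release_gpu_memory() -> bool:
    """Free cached GPU memory after inference; returns True if a CUDA cache was cleared."""
    # Collect unreachable Python objects first so their tensors can be freed.
    gc.collect()
    try:
        import torch  # imported lazily so this sketch also runs on machines without torch
    except ImportError:
        return False
    if torch.cuda.is_available():
        # Hand cached, unused blocks back to the CUDA driver.
        torch.cuda.empty_cache()
        return True
    return False


if __name__ == "__main__":
    args = parse_args()
    # The token would typically be passed on when loading the model, e.g.
    # AutoProcessor.from_pretrained(model_id, token=args.hf_token).
    print("token provided:", args.hf_token is not None)
    release_gpu_memory()
```

Under these assumptions the script would be started with something like `python multi_modal_infer_Gradio_UI.py --hf_token <your token>`, and `release_gpu_memory()` would be called after each generation once the output text has been decoded and the intermediate tensors deleted.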

init27

Thanks for the super fast PR! I left some requests

recipes/quickstart/inference/local_inference/README.md

recipes/quickstart/inference/local_inference/multi_modal_infer_Gradio_UI.py

himanshushukla12 · 2024-10-08T19:58:11Z

@init27 I made the changes you asked for, please check and let me know... I'll be happy to make it better

himanshushukla12 · 2024-10-09T14:49:31Z

@init27 I added the changes you asked for, please check...

himanshushukla12 · 2024-10-14T07:18:49Z

@init27 please let me know if anything else is required...
I'm waiting for your response 😄

himanshushukla12 and others added 6 commits October 8, 2024 17:27

added a file to start with Inferencing on llama3.2 vision using gradio UI

0c985d8

Added basic LlamaInference class structure, model loading functionality, image processing and text generation

4053712

Implemented memory management to release GPU resources after inference

22be586

Modified requirements.txt by adding the gradio dependency

b2f9655

Added instructions in README.md for using the Gradio UI

19938dd

Merge branch 'meta-llama:main' into main

b94a340

facebook-github-bot added the cla signed label Oct 8, 2024

init27 requested changes Oct 8, 2024

recipes/quickstart/inference/local_inference/README.md

recipes/quickstart/inference/local_inference/multi_modal_infer_Gradio_UI.py

himanshushukla12 added 5 commits October 8, 2024 18:55

Change Gradio -> gradio

c609a44

Added passing of Hugging-face token from the arguments

750b499

Changed readme for usage of multimodal inferencing of gradio UI by passing hugging-face token from the arguments

3170c27

Changes the UI from textbox to chatbox with max_tokens, top_k, temperature and top_p sliders there.

c0405b6

added the passing of hugging-face token from the argument

6f7c028

himanshushukla12 added 2 commits October 9, 2024 01:41

Update README.md

a261aea

Modified readme for new code for passing token via argument

Update README.md

597e44e

Used small case "g" in gradio

himanshushukla12 requested a review from init27 October 10, 2024 13:05
