InsightVision is an AI-powered application that detects objects in images, summarizes the scene with per-object counts, and generates a scene description based on the detected objects. It is intended to help users interpret complex images.

It is built with the following technologies:
- YOLOv5: An object detection model used to identify objects in images.
- Gradio: For creating a user-friendly web interface.
- OpenCV: For image processing and manipulation.
- PyTorch (`torch`): For loading and running the deep learning model.
- Python: The primary programming language used to implement the application.
- Pillow (the maintained fork of PIL, the Python Imaging Library): For image processing and displaying images.

InsightVision offers the following features:
- Object Detection:
  - Automatically detects objects in uploaded images using YOLOv5.
  - Supports common objects such as people, vehicles, and animals.
- Scene Summarization:
  - Provides a textual summary of the detected objects along with their counts.
  - Offers insights into the scene based on the object counts and types.
- Scene Description:
  - Automatically generates a scene description based on the detected objects, helping users understand the context of the image.
- Image Download Option:
  - After processing, users can download the annotated image with all detected objects marked.
- Share Feature:
  - Users can generate a link to share their results with others.
- Interactive Web Interface:
  - Easy-to-use web interface built with Gradio.
  - Upload images, capture from a webcam, or paste from the clipboard to get results.
- Object Count:
  - Displays the count of detected objects, making it easier to analyze object distribution within the image.
- Animated Gradient Background:
  - Uses an animated gradient background for the page.
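
The summary and description features can be illustrated with a minimal sketch. It assumes the detector has already produced a list of class labels (for example `"person"` or `"car"`). The function names `summarize_detections` and `describe_scene`, and the description rules themselves, are hypothetical and are not taken from the InsightVision source.

```python
from collections import Counter


def summarize_detections(labels):
    """Return (counts, summary_text) for a list of detected class labels."""
    counts = Counter(labels)
    if not counts:
        return counts, "No objects detected."
    # most_common() orders by count, highest first.
    parts = [f"{n} x {name}" for name, n in counts.most_common()]
    return counts, "Detected " + ", ".join(parts) + "."


def describe_scene(counts):
    """Hypothetical rule-based description; the real app may use other logic."""
    if not counts:
        return "The image appears to contain no recognizable objects."
    if counts.get("person", 0) >= 3:
        return "A busy scene with several people."
    if any(name in counts for name in ("car", "bus", "truck")):
        return "A street or traffic scene."
    return "A scene featuring " + ", ".join(sorted(counts)) + "."


if __name__ == "__main__":
    counts, summary = summarize_detections(["person", "car", "person"])
    print(summary)
    print(describe_scene(counts))
```

Keeping the counting step separate from the description step means the rules can be replaced without touching the detection code.
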
InsightVision accepts images in three ways:
- Upload from File: You can upload an image directly from your device.
- Capture from Webcam: Capture a live image using your device's webcam.
- Paste from Clipboard: Paste an image directly from your clipboard into the app.
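
As a rough illustration, these three input methods correspond to the `sources` option of Gradio's `Image` component. The sketch below assumes Gradio 4.x, where that option takes a list of source names. The `analyze` callback is a hypothetical placeholder, not the app's actual detection function.

```python
# Source names accepted by gr.Image in Gradio 4.x (assumed version).
INPUT_SOURCES = ["upload", "webcam", "clipboard"]


def analyze(image):
    """Placeholder callback: the real app would run detection here."""
    if image is None:
        return "No image provided."
    width, height = image.size  # PIL images expose (width, height)
    return f"Received a {width}x{height} image."


def build_demo():
    """Build the interface; gradio is imported lazily so this file loads without it."""
    import gradio as gr

    return gr.Interface(
        fn=analyze,
        inputs=gr.Image(sources=INPUT_SOURCES, type="pil"),
        outputs=gr.Textbox(label="Result"),
    )


if __name__ == "__main__":
    build_demo().launch()
```

When running locally, passing `share=True` to `launch()` asks Gradio to create a temporary public link.
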
Follow these steps to deploy the InsightVision app:
- Create a Space on Hugging Face:
  - Log in to Hugging Face and create a new Space for the application.
- Upload Files:
  - Upload the following files to the Hugging Face Space:
    - `app.py` (main application file)
    - `requirements.txt` (dependencies file)
- Set Up Requirements:
  - Add the following dependencies to `requirements.txt`, one per line: `gradio`, `torch`, `opencv-python-headless`, `numpy`, `pillow`
- Store API Keys (Optional):
  - If your app requires API keys for additional functionality, store them securely in the Secrets section of Hugging Face.
- Launch the Space:
  - Once the files are uploaded and dependencies are set up, launch your Space, and the application will be live.
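
For the optional API key step, Hugging Face Spaces exposes each stored secret to the running app as an environment variable. A minimal sketch of reading one follows. The secret name `INSIGHTVISION_API_KEY` and the helper `load_api_key` are hypothetical, so substitute whatever name you chose in the Secrets section.

```python
import os


def load_api_key(name="INSIGHTVISION_API_KEY"):
    """Return the secret's value, or None if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    return value or None


if __name__ == "__main__":
    if load_api_key() is None:
        print("No API key configured; optional features stay disabled.")
    else:
        print("API key found.")
```

Returning `None` instead of raising lets the app start without the key and enable the extra functionality only when it is present.
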
We welcome contributions to enhance the InsightVision project! Here are some ideas to get started:
- Improve Object Detection: Add support for additional object classes or detection models.
- Refine Scene Descriptions: Enhance the scene description generation for more diverse contexts.
- UI Enhancements: Add more interactive elements or improve the UI/UX.
Feel free to fork the repository, submit pull requests, or open issues for suggestions!
This project is powered by a community of developers and AI enthusiasts. Let’s collaborate and make this tool even more powerful!
🎉 Start your journey with InsightVision now!