The "SAVI project - Where's my coffee mug?" implements an advanced perception system that processes information collected from 3D sensors and conventional cameras. The goal is to extract objects from a generated point cloud and use them to train a neural network classifier. The classifier can then identify what each object is.
This was the second assignment of SAVI (Advanced Industrial Vision Systems), a curricular unit of the Master's degree in Mechanical Engineering at the University of Aveiro. The project aimed to teach the basics of 3D point cloud understanding and processing, as well as the use of classifiers and their integration into a complete system. The main objective was to recognize objects identified in the point cloud using the "Washington RGB-D Dataset".
This project uses Open3D to process the dataset's point clouds, OpenCV for image processing and feature extraction, and PyTorch to train a deep neural network classifier that recognizes objects.
The following software must be installed before use:
- Open3D
- OpenCV
- PyTorch
- Pickle
- Matplotlib
- gTTS
A network connection is also required.
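As a quick sanity check before running anything, a short script can report which of the packages listed above are not importable. This is only a sketch: the import names (`open3d`, `cv2`, `torch`, `gtts`) are the usual ones for these packages and are assumed here, since the repository does not state them.

```python
import importlib.util

# Assumed import names for the packages listed above; the project itself
# may rely on specific versions that this check does not verify.
REQUIRED_MODULES = {
    "Open3D": "open3d",
    "OpenCV": "cv2",
    "PyTorch": "torch",
    "Pickle": "pickle",  # part of the Python standard library
    "Matplotlib": "matplotlib",
    "gTTS": "gtts",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the display names of packages whose module cannot be located."""
    missing = []
    for name, module in modules.items():
        if importlib.util.find_spec(module) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```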
For point cloud and image generation, this program uses the Washington RGB-D Dataset.
You can use the following command to download the program:
git clone https://github.com/joaodmatias/SaviProject2.git
To run the program, first change to the directory where you cloned the repository. From there, run:
./main.py -h
to list the available options, including the one that sets the path to a different scenario. You can then use:
./main.py -p DATASET_PATH
replacing "DATASET_PATH" with the path to the scenario you want to run. If no scenario is given, a preset scenario runs.
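The behaviour described above (a `-p` option with a fallback scenario) can be sketched with `argparse`. This is an illustration, not the code in `main.py`: the long option name `--path` and the default value `DEFAULT_SCENARIO` are hypothetical.

```python
import argparse

# Hypothetical placeholder; the real preset scenario is defined in main.py.
DEFAULT_SCENARIO = "path/to/preset_scenario"


def parse_args(argv=None):
    """Parse the command line, falling back to the preset scenario."""
    parser = argparse.ArgumentParser(
        description="Where's my coffee mug? (argument parsing sketch)"
    )
    parser.add_argument(
        "-p",
        "--path",  # assumed long name; only -p appears in the README
        default=DEFAULT_SCENARIO,
        help="path to the scenario to process",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print("Scenario:", args.path)
```

Calling `parse_args([])` returns the default path, while `parse_args(["-p", "my_scene"])` returns the path supplied by the user.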
- Classification of different objects
  - using ICP
  - using volume
  - using dimensions
  - using shape
- 3D dataset processing
  - find table
  - processed items on table
  - automation for all types of table
  - for items on ground (2 planes of comparison)
- Extracting information from the point cloud such as:
  - color
  - dimensions
  - volume
  - orientation
- Audio processing
- Classifier:
  - trained
  - tested
  - implementation
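To give an idea of what the "find table" step in the list above involves, here is a minimal sketch of RANSAC plane fitting on a synthetic scene. It is written in plain Python for illustration and is not the project's implementation, which relies on Open3D. The threshold, iteration count and synthetic points are all assumptions.

```python
import random


def plane_from_points(p1, p2, p3):
    """Return (a, b, c, d) with a unit normal, or None if the points are collinear."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    # Normal vector = cross product of the two edge vectors.
    a = uy * vz - uz * vy
    b = uz * vx - ux * vz
    c = ux * vy - uy * vx
    norm = (a * a + b * b + c * c) ** 0.5
    if norm < 1e-9:
        return None
    a, b, c = a / norm, b / norm, c / norm
    d = -(a * p1[0] + b * p1[1] + c * p1[2])
    return a, b, c, d


def ransac_plane(points, threshold=0.01, iterations=200, seed=0):
    """Return the plane with the most inliers and the indices of those inliers."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue  # degenerate sample, try again
        a, b, c, d = plane
        inliers = [
            i for i, (x, y, z) in enumerate(points)
            if abs(a * x + b * y + c * z + d) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# Synthetic scene (assumed data): a flat 10 x 10 "table" at z = 0
# with a small column of points standing on it.
table = [(x * 0.1, y * 0.1, 0.0) for x in range(10) for y in range(10)]
mug = [(0.5, 0.5, 0.02 + 0.02 * k) for k in range(5)]
points = table + mug

plane, inliers = ransac_plane(points)
inlier_set = set(inliers)
# Everything that is not part of the dominant plane is a candidate object.
objects = [p for i, p in enumerate(points) if i not in inlier_set]
print("Table points:", len(inliers), "Object points:", len(objects))
```

Once the dominant plane is removed, the remaining points are the candidates for the "processed items on table" step.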
The color information is printed in the terminal where you run the program, both as the closest name from the CSS21 color list and as the actual RGB value.
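One simple way to approximate an RGB value with a CSS21 color name is to pick the named color at the smallest distance in RGB space. The sketch below assumes a squared Euclidean distance. The project may use a different metric or a helper library, so treat this as an illustration only.

```python
# The 17 color keywords defined by CSS 2.1, as (R, G, B) tuples.
CSS21_COLORS = {
    "aqua": (0, 255, 255),
    "black": (0, 0, 0),
    "blue": (0, 0, 255),
    "fuchsia": (255, 0, 255),
    "gray": (128, 128, 128),
    "green": (0, 128, 0),
    "lime": (0, 255, 0),
    "maroon": (128, 0, 0),
    "navy": (0, 0, 128),
    "olive": (128, 128, 0),
    "orange": (255, 165, 0),
    "purple": (128, 0, 128),
    "red": (255, 0, 0),
    "silver": (192, 192, 192),
    "teal": (0, 128, 128),
    "white": (255, 255, 255),
    "yellow": (255, 255, 0),
}


def closest_css21_name(rgb):
    """Return the CSS21 color name nearest to rgb (squared Euclidean distance)."""
    def distance(name):
        return sum((c - t) ** 2 for c, t in zip(rgb, CSS21_COLORS[name]))
    return min(CSS21_COLORS, key=distance)


if __name__ == "__main__":
    sample = (250, 10, 5)  # hypothetical mean color of an object
    print(closest_css21_name(sample), sample)
```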
The dimensions will appear as a tuple such as (width, height) in meters.
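A (width, height) tuple like the one described can be obtained from the extents of an object's points. The sketch below uses an axis-aligned bounding box and assumes that width lies along the x axis and height along the z axis. The project's actual axis convention and method are not stated here, so both are assumptions.

```python
def bounding_dimensions(points, width_axis=0, height_axis=2):
    """Return (width, height) in the units of the input points.

    width_axis and height_axis are assumed conventions (x and z by default).
    """
    widths = [p[width_axis] for p in points]
    heights = [p[height_axis] for p in points]
    return (max(widths) - min(widths), max(heights) - min(heights))


# Hypothetical object: corner points of a box 8 cm wide and 10 cm tall (meters).
mug_points = [
    (0.0, 0.0, 0.0),
    (0.08, 0.0, 0.0),
    (0.08, 0.08, 0.1),
    (0.0, 0.08, 0.1),
]
dims = bounding_dimensions(mug_points)
print(dims)
```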
[Figure: extraction of images of objects used to train the classifier]
- @jotadateta - 93366
- @joaodmatias - 93098
- @joaodrc - 93439