Thanks to machine learning, deep learning, and TensorFlow, you no longer need humans to classify images; given the right set of images and the right training procedure, the available algorithms can even outperform humans. With this repo/code I want to show that it is possible to use TensorFlow and PHP to create an automation pipeline that uses reinforcement-style learning to retrain itself, improve on its own, and get better and better at classifying images. I'm happy to help Albert Einstein and the whole team as well.
This is a Gravity Spy image classifier using TensorFlow for https://www.zooniverse.org/projects/zooniverse/gravity-spy/classify. The Zooniverse is the world’s largest and most popular platform for people-powered research. This repo is about the Gravity Spy project, which is helping to confirm Albert Einstein's prediction that accelerating masses create ripples that propagate through the fabric of spacetime, known as gravitational waves. You can read more about it here: https://www.zooniverse.org/projects/zooniverse/gravity-spy/about/research and also read these research papers: https://arxiv.org/pdf/1611.04596.pdf and https://arxiv.org/pdf/1705.00034.pdf
There are videos of how I built this project. You can watch them here: https://www.liveedu.tv/bulgaria_mitko/ZeLkP-learning-tensorflow-for-begginers/xWLzE-learning-tensorflow-for-begginers/
- Docker
- Python
- Jupyter Notebook
- TensorFlow
- Imagemagick
- Python: https://www.python.org/downloads/
- Imagemagick: https://www.imagemagick.org/script/download.php
- Docker: https://www.docker.com/community-edition. Check that Docker is installed correctly with:
docker run hello-world
and you should get
Hello from Docker!
This message shows that your installation appears to be working correctly.
...
- Install/Run image of TensorFlow inside Docker:
docker run -it tensorflow/tensorflow:1.1.0 bash
To test that TensorFlow is installed and running correctly, run this code in the container's Python interpreter:
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session() # It will print some warnings here.
print(sess.run(hello))
You should get: Hello, TensorFlow!
- Run Docker. First create a new folder called tf_files,
and then run this command to start the container:
docker run -it \
--publish 6006:6006 \
--volume ${HOME}/tf_files:/tf_files \
--workdir /tf_files \
tensorflow/tensorflow:1.1.0 bash
Your prompt should change to this: root@xxxxxxxxx:/tf_files#
=> xxxxxxxxx is the containerId
We will use images from https://www.zooniverse.org/projects/zooniverse/gravity-spy/classify. When you first load this page without registering, you will have 3 classes: Blip, Whistle, and None of the above.
- Put at least 30 images from each category into a folder. I called my folder Trainset, but you can call it anything you want. The structure will be something like /tf_files/Trainset/blip/[images].png
- Convert the images from png to jpg using this command (you need to be inside the image folder to run it):
mogrify -format jpg *.png
- Remove the png images, as we don't need them: rm *.png
- Copy /Trainset into the Docker container: docker cp /tf_files/Trainset containerId:/tf_files/Trainset [I had some issues with this step, but it should work]
- Download retrain.py in order to train the algorithm:
curl -O https://raw.githubusercontent.com/tensorflow/tensorflow/r1.1/tensorflow/examples/image_retraining/retrain.py
- OPTIONAL STEP: to watch how the algorithm is being trained, run:
tensorboard --logdir training_summaries &
- Train the algorithm:
python retrain.py \
--bottleneck_dir=bottlenecks \
--how_many_training_steps=500 \
--model_dir=inception \
--summaries_dir=training_summaries/basic \
--output_graph=retrained_graph.pb \
--output_labels=retrained_labels.txt \
--image_dir=Trainset
- Download tf_files from the Docker container to your local machine: docker cp containerId:/tf_files /tf_files [I had some issues with this step, but it should work]
- Download the Python code for labeling new images:
curl -L https://goo.gl/3lTKZs > label_image.py
- Locate /tf_files on your local machine and run: python label_image.py Trainset/blip/[someimage].jpg
As a result you should get that this image is a blip.
- When we are sure the images are of a given class, that is, when all 4 images in a group agree on one class with an average confidence of more than 50%, OR one image is more than 90% sure to be of a certain class, add them to Trainset/[the class the algorithm decided those 4 images belong to]
- After that, train the algorithm again with the newly added images in the Trainset
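The decision rule above can be sketched in Python. Here `predictions` is a hypothetical list of (class, confidence) pairs, one top prediction per image; the exact output format of label_image.py may differ:

```python
def decide_class(predictions):
    """Decide whether a group of 4 images can be auto-labeled.

    predictions: list of (class_name, confidence) tuples, one per image,
    each holding the top prediction for that image (confidence in 0..1).
    Returns the winning class name, or None if the group is not trusted.
    """
    classes = {cls for cls, _ in predictions}
    # Rule 1: all images agree on one class and the average confidence > 50%
    if len(classes) == 1:
        avg = sum(conf for _, conf in predictions) / len(predictions)
        if avg > 0.5:
            return predictions[0][0]
    # Rule 2: any single image is more than 90% sure of its class
    for cls, conf in predictions:
        if conf > 0.9:
            return cls
    return None  # not confident enough; leave for human classification
```

For example, `decide_class([("blip", 0.6), ("blip", 0.7), ("blip", 0.55), ("blip", 0.8)])` returns `"blip"` via rule 1, while a group with one 95%-confident image is accepted via rule 2.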
The purpose of this file is to collect images from the https://www.zooniverse.org/projects/zooniverse/gravity-spy/ website and put them into folders named after the subject. The subject is used to group 4 images of the same type together. Using this subject number, we create 10 folders, each containing 4 images that should all belong to the same class. The only difference between those images is that they are rendered at different measurement scales, but they all belong to one class. Code flow:
- Using Guzzle to fetch all 40 images, grouping them by their subject number
- Deleting all images and folders inside the 'testImages' folder
- Saving all 40 images into 10 folders named after their subject, each folder containing 4 images
This file is the second stage of the API I'm building for automatically providing the correct classification using deep learning with TensorFlow. Code flow:
- Converting all images from png to jpg, as all images have to be in jpg format
- Deleting all png images, as we don't need them anymore
- Running the algorithm on every image to calculate which class it belongs to
- After every 4 images (a whole folder), deciding whether those 4 images belong to a class or not. The decision is based on whether all images belong to one class and the average confidence across all 4 images is more than 50%, OR the algorithm decides that one image is more than 90% likely to be of a certain class. If so, all 4 images are copied to tf_files/Trainset/[the class the algorithm decided they belong to]
- Storing all processed subjects in one json file as an array, so that when the same subject appears again the code does not retrain on the same images. The file is located at results/results.json
This file is used for sending the data to the server. This is a very important page: once you have researched some subjects and assigned them to their correct class, you can use this file to send the data to the server. The file uses curl, but it would be good to switch to Guzzle; I was not able to work out how to send raw json data with a Guzzle request. Code flow:
- Checking which subjects have NOT been sent to the server yet
- Sending the data to the server using curl, with the current date and time, subject ID, and class name
- When the data is successfully sent, storing the subject and class name in the json file results/sendResults.json
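As a sketch of the sending step: the repo uses PHP with curl, so this Python version is only an illustration. The endpoint URL and the json field names are hypothetical; only the date/time, subject ID, and class name fields come from the description above:

```python
import json
import urllib.request
from datetime import datetime

def build_payload(subject_id, class_name, now=None):
    """Build the raw json body with date/time, subject ID and class name."""
    now = now or datetime.now()
    return json.dumps({
        "date": now.strftime("%Y-%m-%d %H:%M:%S"),
        "subject_id": subject_id,   # hypothetical field name
        "class_name": class_name,   # hypothetical field name
    })

def send_result(url, subject_id, class_name):
    """POST the raw json payload (the step the author wanted to port to Guzzle)."""
    req = urllib.request.Request(
        url,
        data=build_payload(subject_id, class_name).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # A 2xx status means the result can be recorded in sendResults.json
        return resp.status
```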