A Spanish version is available here.
A better version of this algorithm is available at https://github.com/javirk/Person-remover-partial-convolutions.
Would you like to travel to a tourist spot and still appear alone in the photos?
Person remover is a project that combines the Pix2Pix and YOLO architectures in order to remove people or other objects from photos. For Pix2Pix, the code from Tensorflow has been adapted, whereas for YOLO, the code has been adapted from https://github.com/zzh8829/yolov3-tf2.
This project is capable of removing objects in images and video.
Python 3.7 and Tensorflow 2.0-beta have been used in this project.
Try it in Google Colab.
YOLO has been combined with Pix2Pix. A pre-trained YOLO network is used for object detection (generating a bounding box around each object), and its output is fed to a Pix2Pix generator that has learned how to fill holes in the center of images, using the images without holes as a reference:
- YOLO detects the objects
- A subimage of every object is taken, including the pixels around it
- From every subimage, the center pixels are removed (replaced by ones) and the result is sent to the generator, whose task is to fill the hole using the surrounding pixels.
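The subimage step can be sketched as follows. This is a minimal illustration, not the repository's code: the function name `expand_box`, the `margin` value and the example box are all hypothetical, and the box is assumed to be given in pixel coordinates as `(x1, y1, x2, y2)`.

```python
def expand_box(box, image_size, margin=0.5):
    """Grow an (x1, y1, x2, y2) box by `margin` times its size on each side.

    The result is clamped to the image so the crop never leaves the frame.
    `margin` is an illustrative assumption, not a value taken from the project.
    """
    x1, y1, x2, y2 = box
    width, height = image_size
    pad_x = int((x2 - x1) * margin)
    pad_y = int((y2 - y1) * margin)
    return (
        max(0, x1 - pad_x),
        max(0, y1 - pad_y),
        min(width, x2 + pad_x),
        min(height, y2 + pad_y),
    )


# Hypothetical detection in a 640x480 image.
detection = (100, 50, 200, 250)
crop = expand_box(detection, image_size=(640, 480))
print(crop)  # (50, 0, 250, 350)
```

The extra context around the detection is what gives the generator surrounding pixels to work from when it fills the removed center.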
The following images illustrate the training process of Pix2Pix: a hole is cut in each image and the generator learns how to fill it.
These instructions will help you train a model on your local machine. However, the training dataset that has been used for Pix2Pix is not publicly available. This dataset consists of 14900 images of size 256x256x3. The code handles the creation of a hole in the center of the images and learns how to fill it with the surrounding data.
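As a rough sketch of how a training pair could be built, the snippet below cuts a centered hole in an image and fills it with ones, as described earlier. The function name `punch_center_hole` and the `hole_fraction` parameter are hypothetical; the hole size actually used by the project is not stated here.

```python
import numpy as np


def punch_center_hole(image, hole_fraction=0.5):
    """Return a copy of `image` (H, W, C) whose central region is set to 1.

    `hole_fraction` is the side of the hole relative to the image side;
    the value 0.5 is an assumption made for illustration.
    """
    height, width = image.shape[:2]
    hole_h = int(height * hole_fraction)
    hole_w = int(width * hole_fraction)
    top = (height - hole_h) // 2
    left = (width - hole_w) // 2
    holed = image.copy()  # keep the original as the training target
    holed[top:top + hole_h, left:left + hole_w, :] = 1.0
    return holed


# A blank 256x256x3 image stands in for a real training sample.
target = np.zeros((256, 256, 3), dtype=np.float32)
model_input = punch_center_hole(target)
```

Here `model_input` plays the role of the generator input and `target` the reference image without the hole.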
In order to use the program, Python 3.7 and the libraries specified in requirements.txt should be installed.
Clone the repository
git clone https://github.com/javirk/Person_remover.git
Download and save the YOLO weights in the folder ./yolo, convert them and move them to ./yolo/data
wget https://pjreddie.com/media/files/yolov3.weights -O data/yolov3.weights
python convert.py
Download the weights for Pix2Pix from Google Drive and put them in ./pix2pix/checkpoint/.
To process images, run person_remover.py:
python person_remover.py -i /dir/of/input/images
To process a video instead:
python person_remover.py -v /dir/of/video
It is also possible to specify the type of object to remove (people, bags and handbags are chosen by default):
python person_remover.py -i /dir/to/input/images -ob 1 2 3
This will remove the objects listed at positions 1, 2 and 3 (counting from 0) in the file yolo/data/coco.names. In this case: bikes, cars and motorbikes.
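To check which classes a set of indices refers to, the names file can be read as one label per line and indexed from 0. The sketch below uses an inline sample instead of the real file; the four labels shown are assumed to match the start of yolo/data/coco.names, and `load_class_names` is a hypothetical helper.

```python
def load_class_names(text):
    """Split a names file (one label per line) into a list, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Assumed first lines of coco.names, inlined so the example is self-contained.
sample = "person\nbicycle\ncar\nmotorbike\n"
names = load_class_names(sample)

selected = [names[i] for i in (1, 2, 3)]
print(selected)  # ['bicycle', 'car', 'motorbike']
```

With the real file, `open("yolo/data/coco.names").read()` could be passed in place of `sample`.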
The YOLO network is used pretrained. For the Pix2Pix network, training has spanned 23 epochs on a dataset of 14900 training and 100 test images using the default parameters. It is worth noting that the training process is extremely sensitive, so the best results might not come in the first run.
Training with the default parameters is performed as follows:
python image_inpainting.py -train /dir/of/training/images -test /dir/of/test/images -mode /train
A video of a walking tour of Paris has been used.
Results can be improved by replacing the object detection network (YOLO) with a semantic segmentation network. In this way, the generator would have to fill only the region covered by the person, not the whole bounding box. Due to limited time and processing capacity, this improvement has not been developed yet.
Replacing Pix2Pix with a more advanced architecture, such as Pix2PixHD.
This project is released under the Apache license. See LICENSE.md for more details.
- zzh8829 for YOLO's code
- Tensorflow for Pix2Pix's code