- To close the domain gap between synthetic traffic cones and real ones, a perceptual loss is first added to CycleGAN. Next, a distillation loss is used to force the CycleGAN to mimic real traffic cones via a pretrained YOLO detector.
- An example result: transferring a traffic cone scene from night to day.
This repo is inspired by Clarifying Rainy Traffic Photos. You can check the presentation of the first results here.
- Linux or OSX.
- Python 2 or Python 3.
- CPU or NVIDIA GPU + CUDA CuDNN.
- Install PyTorch and dependencies from http://pytorch.org/
- Install torchvision from source:
git clone https://github.com/pytorch/vision
cd vision
python setup.py install
pip install visdom
pip install dominate
- Clone this repo:
git clone https://github.com/EliasVansteenkiste/CycleGANwithPerceptionLoss
cd CycleGANwithPerceptionLoss
- Download a CycleGAN dataset (e.g. maps):
bash ./datasets/download_cyclegan_dataset.sh maps
- Train a model:
#!./scripts/train_cyclegan.sh
python train.py --dataroot ./datasets/maps --name maps_cyclegan --model cycle_gan --no_dropout
- To view training results and loss plots, run
python -m visdom.server
and click the URL http://localhost:8097. To see more intermediate results, check out ./checkpoints/maps_cyclegan/web/index.html
- Test the model:
#!./scripts/test_cyclegan.sh
python test.py --dataroot ./datasets/maps --name maps_cyclegan --model cycle_gan --phase test --no_dropout
The test results will be saved to an HTML file here: ./results/maps_cyclegan/latest_test/index.html.
First download the dataset and rename the two sets, e.g. rainy -> A, sunny -> B. I have four folders in ./datasets/rainy_sunny/: testA, testB, trainA and trainB.
- Training with the following parameters led to the best results:
python3 train.py --dataroot ./datasets/rainy_sunny --name rainy_sunny_cyclegan --model cycle_gan --no_dropout --batchSize 3 --display_id 0 --niter 200 --niter_decay 200 --lambda_A 10.0 --lambda_B 10.0 --lambda_feat 1.0
- Test the model:
python3 test.py --dataroot ./datasets/rainy_sunny --name rainy_sunny_cyclegan --model cycle_gan --phase test --no_dropout --display_id 0 --how_many 600
Tiling the result images into a single overview image is done with the img_concat.py script.
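The repo's img_concat.py is not reproduced here; as a rough sketch of the tiling step, a grid concatenation with Pillow could look like this (function name, grid shape, and tile size are assumptions):

```python
from PIL import Image

def tile_images(paths, cols, tile_size=(256, 256)):
    """Paste the images at `paths` into a grid, left-to-right, top-to-bottom."""
    rows = (len(paths) + cols - 1) // cols  # enough rows for all images
    w, h = tile_size
    canvas = Image.new("RGB", (cols * w, rows * h))
    for i, path in enumerate(paths):
        img = Image.open(path).resize(tile_size)
        canvas.paste(img, ((i % cols) * w, (i // cols) * h))
    return canvas
```

For example, tiling the 600 test outputs with `cols=25` would produce one 24-row contact sheet.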
- Download a pix2pix dataset (e.g. facades):
bash ./datasets/download_pix2pix_dataset.sh facades
- Train a model:
#!./scripts/train_pix2pix.sh
python train.py --dataroot ./datasets/facades --name facades_pix2pix --model pix2pix --which_model_netG unet_256 --which_direction BtoA --lambda_A 100 --dataset_mode aligned --no_lsgan --norm batch
- To view training results and loss plots, run
python -m visdom.server
and click the URL http://localhost:8097. To see more intermediate results, check out ./checkpoints/facades_pix2pix/web/index.html
- Test the model (bash ./scripts/test_pix2pix.sh):
#!./scripts/test_pix2pix.sh
python test.py --dataroot ./datasets/facades --name facades_pix2pix --model pix2pix --which_model_netG unet_256 --which_direction BtoA --dataset_mode aligned --norm batch
The test results will be saved to an HTML file here: ./results/facades_pix2pix/latest_val/index.html.
More example scripts can be found in the scripts directory.
If you would like to apply a pre-trained model to a collection of input photos (without image pairs), please use the --dataset_mode single and --model test options. Here is a script to apply a pix2pix model to facade label maps (stored in the directory facades/testB):
#!./scripts/test_single.sh
python test.py --dataroot ./datasets/facades/testB/ --name facades_pix2pix --model test --which_model_netG unet_256 --which_direction BtoA --dataset_mode single
- See options/train_options.py and options/base_options.py for training flags; see options/test_options.py and options/base_options.py for test flags.
- CPU/GPU (default --gpu_ids 0): set --gpu_ids -1 to use CPU mode; set --gpu_ids 0,1,2 for multi-GPU mode. You need a large batch size (e.g. --batchSize 32) to benefit from multiple GPUs.
- During training, the current results can be viewed in two ways. First, if you set --display_id > 0, the results and loss plot will be shown on a local graphics web server launched by visdom. To do this, you should have visdom installed and a server running via python -m visdom.server. The default server URL is http://localhost:8097. display_id corresponds to the window ID that is displayed on the visdom server. The visdom display functionality is turned on by default; to avoid the extra overhead of communicating with visdom, set --display_id 0. Second, the intermediate results are saved to [opt.checkpoints_dir]/[opt.name]/web/ as an HTML file. To avoid this, set --no_html.
- Images can be resized and cropped in different ways using the --resize_or_crop option. The default option 'resize_and_crop' resizes the image to (opt.loadSize, opt.loadSize) and takes a random crop of size (opt.fineSize, opt.fineSize). 'crop' skips the resizing step and only performs random cropping. 'scale_width' resizes the image to width opt.fineSize while keeping the aspect ratio. 'scale_width_and_crop' first resizes the image to width opt.loadSize and then takes a random crop of size (opt.fineSize, opt.fineSize).
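The default 'resize_and_crop' preprocessing can be sketched in a few lines of Pillow (the 286/256 default values are assumed from the repo's option defaults; this is an illustration, not the repo's transform code):

```python
import random
from PIL import Image

def resize_and_crop(img, load_size=286, fine_size=256):
    """Resize to (load_size, load_size), then take a random fine_size crop."""
    img = img.resize((load_size, load_size), Image.BICUBIC)
    x = random.randint(0, load_size - fine_size)
    y = random.randint(0, load_size - fine_size)
    return img.crop((x, y, x + fine_size, y + fine_size))
```

The random crop acts as light data augmentation: each epoch sees a slightly different fine_size window of every training image.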
Download the CycleGAN datasets using the following script. Some of the datasets are collected by other researchers. Please cite their papers if you use the data.
bash ./datasets/download_cyclegan_dataset.sh dataset_name
- facades: 400 images from the CMP Facades dataset. [Citation]
- cityscapes: 2975 images from the Cityscapes training set. [Citation]
- maps: 1096 training images scraped from Google Maps.
- horse2zebra: 939 horse images and 1177 zebra images downloaded from ImageNet using the keywords wild horse and zebra.
- apple2orange: 996 apple images and 1020 orange images downloaded from ImageNet using the keywords apple and navel orange.
- summer2winter_yosemite: 1273 summer Yosemite images and 854 winter Yosemite images downloaded using the Flickr API. See more details in our paper.
- monet2photo, vangogh2photo, ukiyoe2photo, cezanne2photo: the art images were downloaded from Wikiart. The real photos were downloaded from Flickr using the combination of the tags landscape and landscapephotography. The training set size of each class is Monet: 1074, Cezanne: 584, Van Gogh: 401, Ukiyo-e: 1433, Photographs: 6853.
- iphone2dslr_flower: both classes of images were downloaded from Flickr. The training set size of each class is iPhone: 1813, DSLR: 3316. See more details in our paper.
To train a model on your own datasets, you need to create a data folder with two subdirectories, trainA and trainB, that contain images from domain A and domain B. You can test your model on your training set by setting phase='train' in test.lua. You can also create subdirectories testA and testB if you have test data.
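Concretely, a custom dataset could be laid out like this (the dataset name my_data is an example; the directory names follow the convention above):

```shell
# create the expected layout for a custom dataset
mkdir -p ./datasets/my_data/trainA ./datasets/my_data/trainB
mkdir -p ./datasets/my_data/testA ./datasets/my_data/testB
# copy domain-A images into trainA/ and testA/, domain-B images into trainB/ and testB/,
# then train as usual:
# python train.py --dataroot ./datasets/my_data --name my_data_cyclegan --model cycle_gan --no_dropout
```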
You should not expect our method to work on just any random combination of input and output datasets (e.g. cats<->keyboards). From our experiments, we find it works better if the two datasets share similar visual content. For example, landscape painting<->landscape photographs works much better than portrait painting<->landscape photographs. zebras<->horses achieves compelling results while cats<->dogs completely fails.
Download the pix2pix datasets using the following script. Some of the datasets are collected by other researchers. Please cite their papers if you use the data.
bash ./datasets/download_pix2pix_dataset.sh dataset_name
- facades: 400 images from the CMP Facades dataset. [Citation]
- cityscapes: 2975 images from the Cityscapes training set. [Citation]
- maps: 1096 training images scraped from Google Maps.
- edges2shoes: 50k training images from the UT Zappos50K dataset. Edges are computed by the HED edge detector + post-processing. [Citation]
- edges2handbags: 137K Amazon handbag images from the iGAN project. Edges are computed by the HED edge detector + post-processing. [Citation]
We provide a python script to generate pix2pix training data in the form of pairs of images {A,B}, where A and B are two different depictions of the same underlying scene. For example, these might be pairs {label map, photo} or {bw image, color image}. Then we can learn to translate A to B or B to A:
Create a folder /path/to/data with subfolders A and B. A and B should each have their own subfolders train, val, test, etc. In /path/to/data/A/train, put training images in style A. In /path/to/data/B/train, put the corresponding images in style B. Repeat the same for the other data splits (val, test, etc.).
Corresponding images in a pair {A,B} must be the same size and have the same filename, e.g. /path/to/data/A/train/1.jpg is considered to correspond to /path/to/data/B/train/1.jpg.
Once the data is formatted this way, call:
python datasets/combine_A_and_B.py --fold_A /path/to/data/A --fold_B /path/to/data/B --fold_AB /path/to/data
This will combine each pair of images (A,B) into a single image file, ready for training.
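Conceptually, the script pastes each {A,B} pair side by side into one image. A simplified sketch of that single step (the real combine_A_and_B.py also walks the train/val/test splits; the function name here is an assumption):

```python
from PIL import Image

def combine_pair(path_a, path_b, path_ab):
    """Place image A and image B side by side in one file, as pix2pix expects."""
    a = Image.open(path_a)
    b = Image.open(path_b).resize(a.size)  # pairs must be the same size
    ab = Image.new("RGB", (a.width * 2, a.height))
    ab.paste(a, (0, 0))
    ab.paste(b, (a.width, 0))  # B goes on the right half
    ab.save(path_ab)
```

At load time, the aligned dataset mode then splits each combined image back into its left (A) and right (B) halves.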
Code is inspired by pytorch-CycleGAN-and-pix2pix.