Down-Sampling Based Video Coding with Degradation-aware Restoration-Reconstruction Deep Neural Network (TIP'2021 - MMM'2020)

Man M. Ho, Gang He, Zheng Wang, and Jinjia Zhou.

News

| Date | News |
|---|---|
| 2021/01/10 | Our extended work has been accepted to and published in TIP'2021 (Open Access) |
| 2020/07/05 | We have updated the evaluation code for RR-DnCNN and RR-DnCNN v2.0 |
| 2020/01/12 | Our RR-DnCNN will be published together with its version 2.0 |
| 2020/01/07 | We won the Best Paper Runner-up Award at MMM'2020 |
| 2019/09/24 | Our work has been accepted to MMM'2020 (Oral) |

Papers:

Down-Sampling Based Video Coding with Degradation-aware Restoration-Reconstruction Deep Neural Network
Minh-Man Ho, Gang He, Zheng Wang, and Jinjia Zhou
In International Conference on Multimedia Modeling (MMM), 2020.

RR-DnCNN v2.0: Enhanced Restoration-Reconstruction Deep Neural Network for Down-Sampling Based Video Coding
(Open Access)
Man M. Ho, Jinjia Zhou, and Gang He
In IEEE Transactions on Image Processing (TIP), 2021.

Prerequisites

  • Ubuntu 16.04
  • OpenCV
  • PyTorch >= 1.1.0
  • Numpy

Getting Started

Follow the steps below to reproduce our results.

1. Clone this repo:

git clone https://github.com/minhmanho/rrdncnn.git
cd rrdncnn

2. Prepare data:

Because the files are large, opening the URLs below in a browser requires you to confirm the download; alternatively, run the commands as given, with wget's recursive option '-r'.

+ Johnny 1280x720 in Class E (65 MB) compressed using Low Delay P (recommended).

wget --no-check-certificate -r 'https://docs.google.com/uc?export=download&id=1PZWTdTkMxLpaSTJ4A7Xxw2QK6tmYy-d9' -O ./data/class_e.zip
unzip -o ./data/class_e.zip -d ./data/

or

./data/download_class_e.sh

+ BasketballDrill 832x480 in Class C (384 MB) compressed using All Intra (optional).

wget --no-check-certificate -r 'https://docs.google.com/uc?export=download&id=1GsnzTPGEVS8v-aMfQyrKExqKx-jZP7uP' -O ./data/class_c.zip
unzip -o ./data/class_c.zip -d ./data/

or

./data/download_class_c.sh

The expected data structure (for Johnny 1280x720 compressed by HEVC 16.20 with Low Delay P - LDP) is as follows:

./data/class_e
│   DHR_LDP.csv (HEVC on HR as <video_name>, <bit-rate kbps>, <Y-PSNR>, <U-PSNR>, <V-PSNR>, <YUV-PSNR>)
│   DLR_LDP.csv (HEVC on LR as <video_name>, <bit-rate kbps>, <Y-PSNR>, <U-PSNR>, <V-PSNR>, <YUV-PSNR>)
│   DHR_LDP_QP[32-47].txt (HEVC running logs)
│   DLR_LDP_QP[32-47].txt (HEVC running logs)
│
├───DLR_LDP_QP[32-47] (Decoded Low-Resolution)
│   │   Johnny_1280x720_50.yuv
│
├───HR (Uncompressed High-Resolution)
│   │   Johnny_1280x720_50.yuv
│
└───LR (Uncompressed Low-Resolution down-sampled x2 by bicubic)
    │   Johnny_1280x720_50.yuv
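To inspect the `.yuv` files in this structure, a minimal reader such as the one below may help. It is only a sketch and assumes the videos are raw planar YUV 4:2:0 with 8 bits per sample, which this README does not state explicitly; the function name `read_y_frames` is hypothetical and not part of this repository.

```python
import numpy as np


def read_y_frames(path, height, width, max_frames=None):
    """Yield the luma (Y) plane of each frame in a raw YUV file.

    Assumption: planar YUV 4:2:0, 8 bits per sample.
    """
    y_size = height * width
    # In 4:2:0, the U and V planes each hold a quarter of the luma samples.
    frame_size = y_size + 2 * (height // 2) * (width // 2)
    with open(path, "rb") as f:
        count = 0
        while max_frames is None or count < max_frames:
            raw = f.read(frame_size)
            if len(raw) < frame_size:
                break  # end of file, or a truncated trailing frame
            # Only the first y_size bytes of the frame are luma samples.
            y = np.frombuffer(raw, dtype=np.uint8, count=y_size)
            yield y.reshape(height, width)
            count += 1
```

Under that assumption, the decoded low-resolution Johnny video would be read with `height=360, width=640`, matching the `--size 360x640` (HxW) argument used in the evaluation commands.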

3. Run models:

CUDA_VISIBLE_DEVICES=0 python eval.py --vin <path/to/class/folder> --ckpt <path/to/model> --size <video_size HxW> --decoder_folder <folder/contains/videos> --log_dir ./logs/

For example:

+ Run with RR-DnCNN.

CUDA_VISIBLE_DEVICES=0 python eval.py --vin ./data/class_e/ --ckpt ./models/rrdncnn.pth.tar --size 360x640 --decoder_folder DLR_LDP_QP32

+ Run with RR-DnCNN v2.0.

CUDA_VISIBLE_DEVICES=0 python eval.py --vin ./data/class_e/ --ckpt ./models/rrdncnn_v2.pth.tar --size 360x640 --decoder_folder DLR_LDP_QP32

+ Run with our prepared script.

./run.sh
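To evaluate several QPs in one go without the prepared script, the command line above can be generated in a loop. The sketch below only builds and prints the commands; it is not the content of `run.sh`. It assumes that a `DLR_LDP_QP<qp>` folder exists for each of the four QPs reported in the Results section (32, 37, 42, 47), and the helper name `build_eval_commands` is hypothetical.

```python
def build_eval_commands(vin, ckpt, size, qps=(32, 37, 42, 47), log_dir="./logs/"):
    """Return one eval.py argument list per QP (Low Delay P folders)."""
    commands = []
    for qp in qps:
        commands.append([
            "python", "eval.py",
            "--vin", vin,
            "--ckpt", ckpt,
            "--size", size,
            # Folder naming follows the DLR_LDP_QP[32-47] pattern shown above.
            "--decoder_folder", f"DLR_LDP_QP{qp}",
            "--log_dir", log_dir,
        ])
    return commands


if __name__ == "__main__":
    for cmd in build_eval_commands(
        "./data/class_e/", "./models/rrdncnn_v2.pth.tar", "360x640"
    ):
        # Print rather than execute, so the commands can be reviewed first.
        print("CUDA_VISIBLE_DEVICES=0 " + " ".join(cmd))
```

Each returned list could also be passed to `subprocess.run` once the printed commands look right.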

Results

After running, you will get the results as follows:

+ <path/to/class/folder> or <args.out>/<model>_SDLR_<configuration>_<QP> (folder containing the reconstructed videos)
+ <args.log_dir>/<configuration>_<model>.csv (CSV file containing the quantitative results)

Besides printing the results, we also export them as CSV to log_dir for easy copying and pasting. The columns (left to right) in the CSV file are:

video_name_QP, HR-Y-PSNR, HR-Y-SSIM, LR-Y-PSNR, LR-Y-SSIM

E.g.,

rrdncnn_Johnny_1280x720_50_DLR_LDP_QP32,35.0897,0.9118,38.2423,0.9594
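The exported rows can be loaded for further analysis with the standard `csv` module. This is a minimal sketch: it assumes every data row has exactly the five columns listed above, and since the README does not say whether the file has a header row, it skips any row whose values are not numeric. The name `load_results` is hypothetical.

```python
import csv
import io

COLUMNS = ["video_name_QP", "HR-Y-PSNR", "HR-Y-SSIM", "LR-Y-PSNR", "LR-Y-SSIM"]


def load_results(fileobj):
    """Parse result rows into dictionaries keyed by the column names above."""
    rows = []
    for record in csv.reader(fileobj):
        if len(record) != len(COLUMNS):
            continue  # skip blank or malformed lines
        name, *values = record
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            continue  # skip a header row, if one is present
        rows.append(dict(zip(COLUMNS, [name] + numbers)))
    return rows


# Example with the sample row shown above; a real file would be opened with
# open(path, newline="") instead of io.StringIO.
sample = "rrdncnn_Johnny_1280x720_50_DLR_LDP_QP32,35.0897,0.9118,38.2423,0.9594\n"
results = load_results(io.StringIO(sample))
print(results[0]["video_name_QP"], results[0]["HR-Y-PSNR"])
```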

Finally, we calculate the BD-rates based on the results of HEVC (attached in the class package) and our methods. The expected output on Johnny (LDP) and BasketballDrill (AI) should be:

| Sequence_QP | HEVC 16.20 Bit-rate (kbps) | HEVC 16.20 PSNR (dB) | Low Bit-rate (kbps) | RR-DnCNN PSNR (dB) | RR-DnCNN BD-BR (%) | RR-DnCNN v2.0 PSNR (dB) | RR-DnCNN v2.0 BD-BR (%) |
|---|---|---|---|---|---|---|---|
| Johnny_1280x720_50.yuv_32 | 87.7059 | 38.9999 | 34.4353 | 35.0897 | -12.9307 | 35.4468 | -15.8121 |
| Johnny_1280x720_50.yuv_37 | 46.9529 | 36.7226 | 18.9812 | 33.4981 | | 33.6468 | |
| Johnny_1280x720_50.yuv_42 | 25.5059 | 34.1018 | 11.4706 | 31.2232 | | 31.3090 | |
| Johnny_1280x720_50.yuv_47 | 14.9129 | 31.5612 | 7.0847 | 28.7739 | | 28.8354 | |
| BasketballDrill_832x480_50.yuv_32 | 1833.6642 | 35.4191 | 685.0295 | 31.0428 | -10.9197 | 31.0627 | -12.4616 |
| BasketballDrill_832x480_50.yuv_37 | 1018.0177 | 32.7758 | 392.1459 | 29.2932 | | 29.365 | |
| BasketballDrill_832x480_50.yuv_42 | 559.657 | 30.1624 | 208.8544 | 26.9593 | | 27.0212 | |
| BasketballDrill_832x480_50.yuv_47 | 277.0484 | 27.5161 | 97.052 | 24.7342 | | 24.7411 | |
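For readers unfamiliar with BD-BR, the sketch below shows the commonly used cubic-fit form of the Bjontegaard delta rate: log-rate is fitted as a cubic polynomial in PSNR for each curve, and the average gap between the two fits over their shared PSNR range is converted to a percentage. This is an illustration only, not the ETRO implementation credited in the Acknowledgement, so the value it prints need not match the table exactly. The function name `bd_rate` is hypothetical.

```python
import numpy as np


def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bit-rate change (%) of the test curve relative to the anchor."""
    log_ra = np.log10(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log10(np.asarray(rate_test, dtype=float))
    pa = np.asarray(psnr_anchor, dtype=float)
    pt = np.asarray(psnr_test, dtype=float)

    # Fit log10(rate) as a cubic polynomial in PSNR for each RD curve.
    poly_a = np.polyfit(pa, log_ra, 3)
    poly_t = np.polyfit(pt, log_rt, 3)

    # Integrate both fits over the PSNR interval covered by both curves.
    lo = max(pa.min(), pt.min())
    hi = min(pa.max(), pt.max())
    if hi <= lo:
        raise ValueError("The two RD curves have no overlapping PSNR range.")
    int_a = np.polyval(np.polyint(poly_a), hi) - np.polyval(np.polyint(poly_a), lo)
    int_t = np.polyval(np.polyint(poly_t), hi) - np.polyval(np.polyint(poly_t), lo)

    # Mean difference in log-rate, converted back to a percentage.
    avg_diff = (int_t - int_a) / (hi - lo)
    return (10.0 ** avg_diff - 1.0) * 100.0


# Johnny (LDP): HEVC 16.20 as anchor, RR-DnCNN as test (values from the table).
hevc_rate = [87.7059, 46.9529, 25.5059, 14.9129]
hevc_psnr = [38.9999, 36.7226, 34.1018, 31.5612]
rr_rate = [34.4353, 18.9812, 11.4706, 7.0847]
rr_psnr = [35.0897, 33.4981, 31.2232, 28.7739]
print(bd_rate(hevc_rate, hevc_psnr, rr_rate, rr_psnr))
```

A negative result means the test method needs less bit-rate than the anchor for the same PSNR.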

For more details, see our pre-run results (26 KB).

wget 'https://docs.google.com/uc?export=download&id=1VkiMalZp_UxejxA3ScoE7fzDDZoWACFc' -O ./logs/result_sample.zip
unzip -o ./logs/result_sample.zip -d ./logs/

or

./logs/download_result_sample.sh

Citations

Please cite this work if you find it useful.

Version 1.0:

@inproceedings{ho2020down,
  title={Down-sampling based video coding with degradation-aware restoration-reconstruction deep neural network},
  author={Ho, Minh-Man and He, Gang and Wang, Zheng and Zhou, Jinjia},
  booktitle={International Conference on Multimedia Modeling},
  pages={99--110},
  year={2020},
  organization={Springer}
}

Version 2.0:

@article{ho2021rr,
  title={RR-DnCNN v2.0: Enhanced Restoration-Reconstruction Deep Neural Network for Down-Sampling-Based Video Coding},
  author={Ho, Man M and Zhou, Jinjia and He, Gang},
  journal={IEEE Transactions on Image Processing},
  volume={30},
  pages={1702--1715},
  year={2021},
  publisher={IEEE}
}

License

This repository (as well as its materials) is for non-commercial uses and research purposes only.

Acknowledgement

We thank G. Bjontegaard and S. Pateux for ETRO's Bjontegaard Metric Implementation.

G. Bjontegaard, Calculation of average PSNR differences between RD-curves (VCEG-M33)
S. Pateux, J. Jung, An excel add-in for computing Bjontegaard metric and its evolution

This work is supported by JST, PRESTO Grant Number JPMJPR1757 Japan.

Contact

If you have any questions, or if the use of any material violates your copyright or license, please contact me at [email protected].