Binarization of Manuscripts using DE-GAN and PCA

Description

This is the implementation for the bachelor thesis [Binarization of Manuscripts using DE-GAN and PCA] by Philipp Leeb. It binarizes multispectral images (MSIs) using PCA, DE-GAN, or a combination of both.

Requirements

  • tensorflow
  • matplotlib
  • pillow
  • scipy
  • imageio
  • tqdm
  • time (Python standard library)
  • pathlib (Python standard library)
  • numpy
  • cv2 (provided by the opencv-python package)
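Some entries in the list above are import names rather than installable package names, so a quick check before running the scripts can save time. The snippet below is a minimal sketch, not part of this repository: the helper name `missing_packages` and the mapping are illustrative. It relies on the facts that pillow is imported as `PIL` and that `cv2` is installed through the opencv-python package.

```python
import importlib.util

# Import name -> pip package name. Only PIL and cv2 differ from their pip names.
# `time` and `pathlib` are omitted because they ship with Python.
REQUIRED = {
    "tensorflow": "tensorflow",
    "matplotlib": "matplotlib",
    "PIL": "pillow",
    "scipy": "scipy",
    "imageio": "imageio",
    "tqdm": "tqdm",
    "numpy": "numpy",
    "cv2": "opencv-python",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of all modules that cannot be found."""
    return [
        pip_name
        for module_name, pip_name in required.items()
        # find_spec returns None for a top-level module that is not installed.
        if importlib.util.find_spec(module_name) is None
    ]


# Example use:
#   missing = missing_packages()
#   if missing:
#       print("pip install " + " ".join(missing))
```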

Usage

To enhance an MSI, use the enhance_msi.py script, which documents the available modes and configurations. The other scripts are either individual components of the pipeline or useful tools for processing images. A brief description can be found at the top of each script.
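For orientation, the idea behind the PCA step can be sketched as follows. This is not the code from enhance_msi.py. It is a generic illustration that assumes the MSI is already loaded as a NumPy array of shape (height, width, bands); the function names `pca_components` and `to_uint8` are hypothetical.

```python
import numpy as np


def pca_components(msi: np.ndarray, n_components: int = 3) -> np.ndarray:
    """Project an MSI stack (H, W, bands) onto its first principal components."""
    height, width, bands = msi.shape
    # Treat every pixel as one sample with `bands` features and center the data.
    pixels = msi.reshape(-1, bands).astype(np.float64)
    pixels = pixels - pixels.mean(axis=0)
    # Eigen-decomposition of the (bands x bands) covariance matrix.
    covariance = pixels.T @ pixels / (pixels.shape[0] - 1)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    # eigh returns eigenvalues in ascending order; keep the largest ones first.
    order = np.argsort(eigenvalues)[::-1][:n_components]
    projected = pixels @ eigenvectors[:, order]
    return projected.reshape(height, width, n_components)


def to_uint8(channel: np.ndarray) -> np.ndarray:
    """Rescale one component to the 0..255 range so it can be saved as an image."""
    low, high = channel.min(), channel.max()
    if high == low:
        return np.zeros(channel.shape, dtype=np.uint8)
    return ((channel - low) / (high - low) * 255).astype(np.uint8)
```

Each principal component can then be saved as a grayscale image and inspected, or passed on to a binarization step.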

DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement (Stand-alone DE-GAN)

Description

This is an implementation of the paper DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement.
DE-GAN is a conditional generative adversarial network designed to enhance document quality before the recognition process. It can be used for document cleaning, binarization, deblurring and watermark removal. The weights are available to test the enhancement.

License

This work is only allowed for academic research use. For commercial use, please contact the author.

Requirements

  • tensorflow
  • matplotlib
  • pillow
  • scipy
  • imageio
  • tqdm

Installation

  • Clone this repo:
git clone https://github.com/dali92002/DE-GAN
cd DE-GAN

Using DE-GAN

Document binarization

  • To binarize an image, use the following command:
python enhance.py binarize ./image_to_binarize ./directory_to_binarized_image

image: (not shown)

binarized image: (not shown)

Document deblurring

  • To deblur an image, use the following command:
python enhance.py deblur ./image_to_deblur ./directory_to_deblurred_image

blurred image: (not shown)

enhanced image: (not shown)

Watermark removal

  • To remove a watermark from an image, use the following command:
python enhance.py unwatermark ./image_to_unwatermark ./directory_to_unwatermarked_image

watermarked image: (not shown)

clean image: (not shown)
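The three commands above each process a single image. To process a whole folder, a small wrapper can call enhance.py once per file. The following is a minimal sketch under stated assumptions: it is run from the repository root, the second path argument of enhance.py is the output image path, and results are written as PNG. The helper names `build_commands` and `run_all` are hypothetical and not part of DE-GAN.

```python
import subprocess
import sys
from pathlib import Path

# Task names taken from the commands documented in this README.
TASKS = ("binarize", "deblur", "unwatermark")
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def build_commands(task, input_dir, output_dir):
    """Build one `python enhance.py <task> <input> <output>` command per image."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task}")
    commands = []
    for image in sorted(Path(input_dir).iterdir()):
        if image.suffix.lower() in IMAGE_SUFFIXES:
            # Assumption: the last argument is the path of the output image.
            target = Path(output_dir) / f"{image.stem}.png"
            commands.append(
                [sys.executable, "enhance.py", task, str(image), str(target)]
            )
    return commands


def run_all(task, input_dir, output_dir):
    """Run enhance.py for every image in input_dir, stopping on the first error."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for command in build_commands(task, input_dir, output_dir):
        subprocess.run(command, check=True)


# Example use:
#   run_all("binarize", "./scans", "./scans_binarized")
```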

Document cleaning

  • Will be added.

    degraded image: (not shown)

    cleaned image: (not shown)

Training with your own data

  • To train with your own data, place your degraded images in the folder "images/A/" and the corresponding ground-truth images in the folder "images/B/". Each degraded image and its ground truth must have the same file name (the extensions may differ), and both folders must contain the same number of images.
  • Command to train:
python train.py 
  • The batch size and the number of epochs can be set inside the code.
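Before starting a training run, it can help to verify the naming rule described above. The following is a minimal sketch, not part of the repository: the function name `unmatched_images` is hypothetical, and it compares file names without their extensions in the two folders.

```python
from pathlib import Path


def unmatched_images(degraded_dir="images/A", gt_dir="images/B"):
    """Return (names only in degraded_dir, names only in gt_dir), without extensions."""
    degraded = {p.stem for p in Path(degraded_dir).iterdir() if p.is_file()}
    ground_truth = {p.stem for p in Path(gt_dir).iterdir() if p.is_file()}
    return sorted(degraded - ground_truth), sorted(ground_truth - degraded)


# Example use:
#   only_a, only_b = unmatched_images()
#   if only_a or only_b:
#       print("missing ground truth for:", only_a)
#       print("missing degraded image for:", only_b)
```

If both returned lists are empty, every degraded image has a ground-truth image with the same name. Two files in one folder that differ only in their extension are counted once by this check.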

Citation

  • If this work was useful for you, please cite it as:
@ARTICLE{Souibgui2020,
  author={Mohamed Ali Souibgui  and Yousri Kessentini},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title={DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement}, 
  year={2020},
  doi={10.1109/TPAMI.2020.3022406}}