This repository contains the code for the following two papers:
- [1] Sezer Karaoglu, Ran Tao, Theo Gevers, Arnold W. M. Smeulders, Words Matter: Scene Text for Image Classification and Retrieval, in IEEE Transactions on Multimedia, 2017
- [2] Sezer Karaoglu, Ran Tao, Jan van Gemert, Theo Gevers, Con-Text: Text Detection for Fine-grained Object Classification, in IEEE Transactions on Image Processing, 2017
[1] introduces a fully unsupervised word proposal method for detecting words in images and shows that the detected words are useful for image classification and retrieval. [2] proposes a text (character) detection method based on text saliency. If you use the word proposal method or the textual representation of the detected words in your research, please consider citing [1]. If you use the saliency-based text detection method, please consider citing [2].
Contact: [email protected], [email protected]
[Dataset]: The Con-Text dataset is available at https://staff.fnwi.uva.nl/s.karaoglu/datasetWeb/Dataset.html
'Finegrained_ImageNames.mat' lists the images in the Con-Text dataset.
[Text detection]: The code in the folder 'text_detection/' generates word bounding box proposals. See 'text_detection/demo.m'.
[Generate textual representation]: See 'EncodeTextualConTextScript.m' for how to generate representations of the word-level textual content in images. Both a CPU version ('EncodeTextual.m') and a GPU version ('EncodeTextualGPU.m') are provided. This step requires the word recognition model provided by Jaderberg et al. (http://www.robots.ox.ac.uk/~vgg/research/text/). To download the model, go to the folder 'NIPS2014DLW-Jaderberg/' and run 'download.sh'.
[Generate visual representation]: See 'deep_visual_features/extract_googlenet_feat.py' for how to extract GoogLeNet features. Caffe (https://github.com/BVLC/caffe) is required.
[Fine-tune GoogLeNet on the Con-Text dataset]: See the folder 'finetune_googlenet/'.
[Classification]: See 'run_classification.m'. LIBSVM (https://www.csie.ntu.edu.tw/~cjlin/libsvm/) is required.
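[Checking the layout]: The steps above refer to several files and folders by name. The following is a minimal sketch, not part of the repository, of a check that reports which of those paths are missing before you start. The helper name `missing_paths` and the list `REQUIRED_PATHS` are hypothetical, and the list assumes that files named without a folder in this README sit at the repository root, which may not match the actual layout.

```python
import os

# Paths mentioned in this README. ASSUMPTION: entries given without a folder
# are located at the repository root; adjust the list if the layout differs.
REQUIRED_PATHS = [
    "Finegrained_ImageNames.mat",
    "text_detection/demo.m",
    "EncodeTextualConTextScript.m",
    "EncodeTextual.m",
    "EncodeTextualGPU.m",
    "NIPS2014DLW-Jaderberg/download.sh",
    "deep_visual_features/extract_googlenet_feat.py",
    "finetune_googlenet",
    "run_classification.m",
]


def missing_paths(repo_root, required=REQUIRED_PATHS):
    """Return the entries of `required` that do not exist under `repo_root`."""
    return [
        path
        for path in required
        if not os.path.exists(os.path.join(repo_root, path))
    ]
```

Calling `missing_paths(".")` from the repository root returns an empty list when every path above is present. The check only covers files in the repository; Caffe, libsvm, and the downloaded word recognition model still have to be set up separately.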