Languages | Num of Images | Num of Text | Baidu Drive | Google Drive |
---|---|---|---|---|
English/Latin | 728K | ~20M | Link password: 2h8d | Link |
Multilingual | 674K | ~18M | Link password: tddl | Link |
The multilingual version consists of the following 10 languages: Arabic, English, French, Chinese, German, Korean, Japanese, Italian, Bangla, Hindi
Both datasets are very large (~150GB). Therefore, I split them into "several" files (~130). They are organized as follows:
./
+---sub_0
|   +---imgs
|   |       0.jpg
|   |       1.jpg
|   |       ...
|   |
|   +---labels
|   |       0.json
|   |       1.json
|   |       ...
|
+---sub_1
+---sub_2
+---sub_3
...
+---sub_100
...
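The layout above can be walked with a few lines of Python. This is a minimal sketch, not part of the released code: the function names `iter_samples` and `natural_key` are hypothetical, and it assumes every image is a `.jpg` whose file name matches its label (`imgs/0.jpg` with `labels/0.json`), as the tree suggests.

```python
from pathlib import Path
from typing import Iterator, Tuple


def natural_key(name: str) -> Tuple[int, str]:
    # Shorter names first, then lexicographic, so "sub_2" sorts before "sub_10".
    return (len(name), name)


def iter_samples(root: str) -> Iterator[Tuple[Path, Path]]:
    """Yield (image_path, label_path) pairs from every sub_* folder under root."""
    subs = [p for p in Path(root).glob("sub_*") if p.is_dir()]
    for sub in sorted(subs, key=lambda p: natural_key(p.name)):
        labels = sorted((sub / "labels").glob("*.json"),
                        key=lambda p: natural_key(p.name))
        for label in labels:
            # Assumption: the image shares the label's stem and ends in .jpg.
            image = sub / "imgs" / (label.stem + ".jpg")
            if image.is_file():  # skip labels whose image is missing
                yield image, label
```

Because the function is a generator, it can be used on a partially downloaded dataset: only the `sub_*` folders that are present are visited.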
The labels are stored in the following format:
{
  "imgfile": str path to the corresponding image file, e.g. "imgs/0.jpg",
  "bbox": List[
    word_i (8 float): [x0, y0, x1, y1, x2, y2, x3, y3]
    (from upper left corner, clockwise),
  ],
  "cbox": List[
    char_i (8 float): [x0, y0, x1, y1, x2, y2, x3, y3]
    (from upper left corner, clockwise),
  ],
  "text": List[str]
}
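As an illustration of the label format, the sketch below reads one label file and turns each flat list of 8 floats into four (x, y) corner points. It is a hedged example rather than the project's own loader: `load_label` and `to_quad` are hypothetical names, and it assumes each entry of `bbox` and `cbox` is stored as a flat list of 8 numbers, as described above. Check a real label file before relying on that shape.

```python
import json
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def to_quad(flat: Sequence[float]) -> List[Point]:
    """Convert 8 floats into 4 corner points, clockwise from upper left."""
    if len(flat) != 8:
        raise ValueError(f"expected 8 coordinates, got {len(flat)}")
    return [(float(flat[i]), float(flat[i + 1])) for i in range(0, 8, 2)]


def load_label(path: str) -> Dict[str, object]:
    """Load one label file and group its coordinates into corner points."""
    with open(path, encoding="utf-8") as f:
        label = json.load(f)
    return {
        "imgfile": label["imgfile"],                       # e.g. "imgs/0.jpg"
        "words": [to_quad(b) for b in label["bbox"]],      # word-level boxes
        "chars": [to_quad(c) for c in label["cbox"]],      # character-level boxes
        "text": label["text"],                             # list of strings
    }
```

The example path `imgs/0.jpg` suggests that `imgfile` is relative to the enclosing `sub_*` folder, so joining it with that folder's path should locate the image; this is an inference from the example, not a documented guarantee.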
Scene Name | Baidu Drive | Google Drive |
---|---|---|
Realistic Rendering | Link password: wgja | Link |
How-to:
- download and uncompress the project
- in UE4.22, load the following file: `Demo/Demo.uproject`
Resources | Baidu Drive | Google Drive |
---|---|---|
background images | Link password: 3x3r | Link |
fonts & corpus | Link password: ip8w | Link |
Scenes | Baidu Drive | Google Drive |
---|---|---|
All 30 scene executables | Link password: z3af | Link |
How-to:
- download and uncompress the project
- go to `$Name/$Name/Binaries/Linux/` and double-click the executable `./Demo`
- alternatively, you can launch it from a terminal: `./$Name/$Name/Binaries/Linux/Demo`