Official Implementation of SynthTIGER (Synthetic Text Image Generator), ICDAR 2021
clovaai/synthtiger
Synthetic Text Image Generator for OCR Model | Paper | Documentation | Datasets
The documentation is available at https://clovaai.github.io/synthtiger/.
You can check the API reference in this documentation.
SynthTIGER requires python>=3.6 and libraqm.
To install SynthTIGER from PyPI:
$ pip install synthtiger
If you see a dependency error when you install or run SynthTIGER, install the dependencies.
    # Set environment variable (for macOS)
    $ export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES

    usage: synthtiger [-h] [-o DIR] [-c NUM] [-w NUM] [-s NUM] [-v] SCRIPT NAME [CONFIG]

    positional arguments:
      SCRIPT                Script file path.
      NAME                  Template class name.
      CONFIG                Config file path.

    optional arguments:
      -h, --help            show this help message and exit
      -o DIR, --output DIR  Directory path to save data.
      -c NUM, --count NUM   Number of output data. [default: 100]
      -w NUM, --worker NUM  Number of workers. If 0, It generates data in the main process. [default: 0]
      -s NUM, --seed NUM    Random seed. [default: None]
      -v, --verbose         Print error messages while generating data.

    # horizontal
    synthtiger -o results -w 4 -v examples/synthtiger/template.py SynthTiger examples/synthtiger/config_horizontal.yaml

    # vertical
    synthtiger -o results -w 4 -v examples/synthtiger/template.py SynthTiger examples/synthtiger/config_vertical.yaml
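When generation is driven from a script, the documented options can be assembled into an argument list. The sketch below is only an illustration: `build_synthtiger_command` is a hypothetical helper, not part of SynthTIGER, and it uses nothing beyond the flags listed in the usage text above. It builds the command without running it.

```python
from typing import List, Optional


def build_synthtiger_command(
    script: str,
    name: str,
    config: Optional[str] = None,
    output: Optional[str] = None,
    count: Optional[int] = None,
    worker: Optional[int] = None,
    seed: Optional[int] = None,
    verbose: bool = False,
) -> List[str]:
    """Hypothetical helper: build an argv list from the documented CLI flags."""
    cmd = ["synthtiger"]
    if output is not None:
        cmd += ["-o", output]
    if count is not None:
        cmd += ["-c", str(count)]
    if worker is not None:
        cmd += ["-w", str(worker)]
    if seed is not None:
        cmd += ["-s", str(seed)]
    if verbose:
        cmd.append("-v")
    # Positional arguments come last: SCRIPT NAME [CONFIG].
    cmd += [script, name]
    if config is not None:
        cmd.append(config)
    return cmd


# Rebuild the "horizontal" example from the usage text.
horizontal = build_synthtiger_command(
    "examples/synthtiger/template.py",
    "SynthTiger",
    config="examples/synthtiger/config_horizontal.yaml",
    output="results",
    worker=4,
    verbose=True,
)
print(" ".join(horizontal))
```

Once SynthTIGER is installed, the resulting list could be passed to `subprocess.run(horizontal, check=True)`.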
- images: a directory containing images.
- gt.txt: a file containing text labels.
- coords.txt: a file containing bounding boxes of characters with text effect.
- glyph_coords.txt: a file containing bounding boxes of characters without text effect.
- masks: a directory containing mask images with text effect.
- glyph_masks: a directory containing mask images without text effect.
synthtiger -o results -w 4 -v examples/multiline/template.py Multiline examples/multiline/config.yaml
- images: a directory containing images.
- gt.txt: a file containing text labels.
Prepare corpus
txt format, line by line (example).
Prepare fonts
See font customization for more details.
Edit corpus path and font path in config file (example)
Run synthtiger
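The corpus step above expects a plain text file with one text per line. As a minimal sketch, such a file could be written as follows. The helper name `write_corpus`, the UTF-8 encoding, and the choice to collapse whitespace and skip empty entries are all assumptions made for illustration, not requirements stated by SynthTIGER.

```python
import os
import tempfile
from typing import Iterable


def write_corpus(texts: Iterable[str], path: str) -> int:
    """Hypothetical helper: write one text per line and return the line count."""
    written = 0
    with open(path, "w", encoding="utf-8") as f:  # UTF-8 is an assumption
        for text in texts:
            # Collapse internal whitespace (including newlines) so that
            # every entry occupies exactly one line.
            line = " ".join(text.split())
            if not line:
                continue  # skip empty entries
            f.write(line + "\n")
            written += 1
    return written


corpus_path = os.path.join(tempfile.mkdtemp(), "corpus.txt")
count = write_corpus(["안녕하세요", "  こんにちは  ", "", "multi\nline text"], corpus_path)
print(count)
```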
Prepare fonts
ttf/otf format (example).
Extract renderable charsets
python tools/extract_font_charset.py -w 4 fonts/
This script extracts renderable charsets for all font files (example).
Text files are generated in the input path with the same names as the fonts.
Edit font path in config file (example)
Run synthtiger
Prepare images
jpg/jpeg/png/bmp format.
Create colormaps
python tools/create_colormap.py --max_k 3 -w 4 images/ colormap.txt
This script creates colormaps for all image files (example).
Edit colormap path in config file (example)
Run synthtiger
You can implement custom templates by inheriting the base template.
    from synthtiger import templates


    class MyTemplate(templates.Template):
        def __init__(self, config=None):
            # initialize template.
            pass

        def generate(self):
            # generate data.
            pass

        def init_save(self, root):
            # initialize something before save.
            pass

        def save(self, root, data, idx):
            # save data to specific path.
            pass

        def end_save(self, root):
            # finalize something after save.
            pass
SynthTIGER is available for download at Google Drive.
The dataset is split into several smaller files. Please download all files and run the following command.
    # for Linux, macOS
    cat synthtiger_v1.0.zip.* > synthtiger_v1.0.zip

    # for Windows
    copy /b synthtiger_v1.0.zip.* synthtiger_v1.0.zip
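The same concatenation can be done in a platform-independent way. The sketch below is an illustration only: `join_parts` is a hypothetical helper, and it assumes that sorting the part names lexicographically gives the intended order. The demo uses made-up part suffixes, since the real suffixes are not listed here.

```python
import glob
import os
import shutil
import tempfile


def join_parts(target: str) -> int:
    """Concatenate files named '<target>.*' into <target>; return the part count."""
    # Assumption: lexicographic order of the part names is the correct order.
    parts = sorted(glob.glob(glob.escape(target) + ".*"))
    with open(target, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # stream in chunks, not all in memory
    return len(parts)


# Demo with two tiny stand-in parts (the suffixes are invented for the demo).
workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "demo.zip")
for suffix, payload in [("00", b"hello "), ("01", b"world")]:
    with open(f"{target}.{suffix}", "wb") as f:
        f.write(payload)

n_parts = join_parts(target)
with open(target, "rb") as f:
    joined = f.read()
print(n_parts, joined)
```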
synthtiger_v1.0.zip (36G) (md5: 5b5365f4fe15de24e403a9256079be70)
- Original paper version.
- Used MJ and ST label.
synthtiger_v1.1.zip (38G) (md5: b2757a7e2b5040b14ed64c473533b592)
- Used MJ and ST lexicon instead of MJ and ST label.
- Fixed a bug that applies transformation twice on curved text.
- Fixed a bug that incorrectly converts grayscale to RGB.
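After joining the parts, the archive can be checked against the md5 values listed above. This is a minimal sketch using only the standard library. The names `md5_of_file` and `verify` are hypothetical, and the demo at the end runs on a small temporary file rather than on the real archives.

```python
import hashlib
import os
import tempfile


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Checksums copied from the version list above.
EXPECTED_MD5 = {
    "synthtiger_v1.0.zip": "5b5365f4fe15de24e403a9256079be70",
    "synthtiger_v1.1.zip": "b2757a7e2b5040b14ed64c473533b592",
}


def verify(path: str) -> bool:
    """Compare a downloaded archive with its expected checksum (by file name)."""
    return md5_of_file(path) == EXPECTED_MD5[os.path.basename(path)]


# Demo on a small temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(demo_path, "wb") as f:
    f.write(b"abc")
demo_md5 = md5_of_file(demo_path)
print(demo_md5)
```

For a real download, `verify("synthtiger_v1.1.zip")` would return `True` only if the file matches the published checksum.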
| Version | IIIT5k | SVT | IC03 | IC13 | IC15 | SVTP | CUTE80 | Total |
|---|---|---|---|---|---|---|---|---|
| 1.0 | 93.2 | 87.3 | 90.5 | 92.9 | 72.1 | 77.7 | 80.6 | 85.9 |
| 1.1 | 93.4 | 87.6 | 91.4 | 93.2 | 73.9 | 77.8 | 80.6 | 86.6 |
The structure of the dataset is as follows. The dataset contains 10M images.
    gt.txt
    images/
        0/
            0.jpg
            1.jpg
            ...
            9998.jpg
            9999.jpg
        1/
        ...
        998/
        999/

In gt.txt, each line contains an image path and its label, separated by a tab (<image_path>\t<label>).
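The tab-separated layout of gt.txt can be read with a few lines of Python. The sketch below is illustrative: `read_gt` is a hypothetical helper, and splitting on the first tab only is an assumption (it keeps any later tab as part of the label).

```python
import io
from typing import Iterator, TextIO, Tuple


def read_gt(lines: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (image_path, label) pairs from tab-separated gt.txt lines."""
    for line in lines:
        line = line.rstrip("\n")  # keep other whitespace, it may belong to the label
        if not line:
            continue
        # Assumption: the path never contains a tab, so split on the first one.
        image_path, label = line.split("\t", 1)
        yield image_path, label


# Demo on an in-memory sample instead of the real file.
sample = io.StringIO("images/0/0.jpg\t10\nimages/0/1.jpg\tdate:\n")
pairs = list(read_gt(sample))
print(pairs)
```

For the real dataset, the same function could be fed an open file, for example `open("gt.txt", encoding="utf-8")`, where the UTF-8 encoding is again an assumption.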
Sample lines from gt.txt:

    images/0/0.jpg	10
    images/0/1.jpg	date:
    ...
    images/999/9999998.jpg	STUFFIER
    images/999/9999999.jpg	Re:

Citation:

    @inproceedings{yim2021synthtiger,
      title={SynthTIGER: Synthetic Text Image GEneratoR Towards Better Text Recognition Models},
      author={Yim, Moonbin and Kim, Yoonsik and Cho, Han-Cheol and Park, Sungrae},
      booktitle={International Conference on Document Analysis and Recognition},
      pages={109--124},
      year={2021},
      organization={Springer}
    }
SynthTIGER
Copyright (c) 2021-present NAVER Corp.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

The following directories and their subdirectories are licensed the same as their origins. Please refer to NOTICE.

docs/resources/font/




