Sanster/text_renderer


Generate text images for training deep learning ocr model

License

MIT license

Text Renderer

Generate text images for training deep learning OCR models (e.g. CRNN). Supports both Latin and non-Latin text.

Setup

Ubuntu 16.04
python 3.5+

Install dependencies:

pip3 install -r requirements.txt

Demo

By default, simply running python3 main.py will generate 20 text images and a labels.txt file in output/default/.
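The layout of labels.txt is not described here. As a minimal sketch, assuming each line holds an image name followed by a space and the rendered text (an assumption to check against your own output), the labels could be loaded like this. The function names are illustrative and not part of text_renderer.

```python
from pathlib import Path


def parse_label_line(line):
    """Split one label line into (image_name, text).

    Assumes the hypothetical layout "<image_name> <text>"; inspect a real
    labels.txt before relying on it.
    """
    name, _, text = line.rstrip("\n").partition(" ")
    return name, text


def load_labels(labels_path):
    """Return a dict mapping image name to its text label."""
    labels = {}
    for line in Path(labels_path).read_text(encoding="utf-8").splitlines():
        if line.strip():  # skip blank lines
            name, text = parse_label_line(line)
            labels[name] = text
    return labels
```

Splitting on the first space only keeps labels that themselves contain spaces intact.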

Use your own data to generate image

Run python3 main.py --help to see all optional arguments and their meanings, and put your own data in the corresponding folder.
Configure text effects and their fraction in the configs/default.yaml file (or create a new config file and use it with the --config_file option). Here are some examples:

Effect name	Image
Origin(Font size 25)
Perspective Transform
Random Crop
Curve
Light border
Dark border
Random char space big
Random char space small
Middle line
Table line
Under line
Emboss
Reverse color
Blur
Text color
Line color

Run the main.py file.
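One way to read the "fraction" setting is as the probability that an effect is applied to a given image. The sketch below illustrates that interpretation only; the effect names and the enable/fraction keys are hypothetical and may not match configs/default.yaml.

```python
import random

# Hypothetical effect settings; the real keys live in configs/default.yaml
# and may be named differently.
EFFECTS = {
    "blur": {"enable": True, "fraction": 0.3},
    "emboss": {"enable": False, "fraction": 0.5},
    "reverse_color": {"enable": True, "fraction": 1.0},
}


def choose_effects(effects, rng):
    """Return the names of the effects to apply to one image.

    An enabled effect is picked with probability equal to its fraction;
    a disabled effect is never picked.
    """
    chosen = []
    for name, cfg in effects.items():
        if cfg["enable"] and rng.random() < cfg["fraction"]:
            chosen.append(name)
    return chosen
```

Calling choose_effects(EFFECTS, random.Random()) once per image gives each image its own random mix of effects.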

Strict mode

For non-Latin languages (e.g. Chinese), it is very common for a font to support only a limited set of chars. In this case, you will get bad results in the generated images.

Selecting fonts that support all chars in --chars_file is tedious. Run main.py with the --strict option, and the renderer will retry getting text from the corpus during generation until all chars are supported by a font.
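The retry behaviour described above can be sketched as follows. This is an illustration of the idea, not the renderer's actual code: the function names, the max_retries cap, and the substring sampling are all assumptions.

```python
import random


def font_supports(text, supported_chars):
    """True if every character of text is in the font's supported set."""
    return all(ch in supported_chars for ch in text)


def sample_supported_text(corpus, supported_chars, length, rng, max_retries=100):
    """Draw random substrings of corpus until the font covers one fully.

    Returns None if no fully supported substring is found within
    max_retries attempts (the cap is a hypothetical safeguard).
    """
    for _ in range(max_retries):
        start = rng.randrange(0, max(1, len(corpus) - length + 1))
        text = corpus[start:start + length]
        if font_supports(text, supported_chars):
            return text
    return None
```

For example, with corpus "ab第cd" and a font that only covers "abcd", a two-character sample can only come back as "ab" or "cd".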

Tools

You can use the check_font.py script to check how many chars in --chars_file your font does not support:

python3 tools/check_font.py

checking font ./data/fonts/eng/Hack-Regular.ttf
chars not supported(4971):
['第','朱','广','沪','联','自','治','县','驼','身','进','行','纳','税','防','火','墙','掏','心','内','容','万','警','钟','上','了','解'...]
0 fonts support all chars(5071) in ./data/chars/chn.txt:
[]
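A check of this kind can be approximated by comparing the chars against the code points a font maps. The sketch below is not the source of check_font.py, which may work differently. It assumes the third-party fontTools package is installed for load_font_codepoints, and both function names are illustrative.

```python
def unsupported_chars(chars, font_codepoints):
    """Return the characters whose code points the font does not map."""
    return [ch for ch in chars if ord(ch) not in font_codepoints]


def load_font_codepoints(font_path):
    """Read the Unicode code points a font maps, using fontTools.

    fontTools is a third-party dependency (pip install fonttools) and is
    imported lazily so the pure function above works without it.
    """
    from fontTools.ttLib import TTFont

    font = TTFont(font_path)
    try:
        cmap = font.getBestCmap() or {}  # None if the font has no Unicode cmap
        return set(cmap.keys())
    finally:
        font.close()
```

A font supports all chars exactly when unsupported_chars returns an empty list for it.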

Generate image using GPU

If you want to use the GPU to generate images faster, first compile OpenCV with CUDA support (see: Compiling OpenCV with CUDA support).

Then build the Cython part, and add the --gpu option when running main.py:

cd libs/gpu
python3 setup.py build_ext --inplace

Debug mode

Running python3 main.py --debug will save images with extra information. You can see how perspectiveTransform works and all bounding/rotated boxes.

Todo

See https://github.com/Sanster/text_renderer/projects/1

Citing text_renderer

If you use text_renderer in your research, please consider using the following BibTeX entry.

@misc{text_renderer,
  author = {weiqing.chu},
  title = {text_renderer},
  howpublished = {\url{https://github.com/Sanster/text_renderer}},
  year = {2021}
}


New version release: https://github.com/oh-my-ocr/text_renderer
