TextBoxes: A Fast Text Detector with a Single Deep Neural Network
Recommend: TextBoxes++ is an extended work of TextBoxes, which supports oriented scene text detection. The recognition part is also included in TextBoxes++.
This paper presents an end-to-end trainable fast scene text detector, named TextBoxes, which detects scene text with both high accuracy and efficiency in a single network forward pass, involving no post-processing except for a standard non-maximum suppression. For more details, please refer to our paper.
Please cite TextBoxes in your publications if it helps your research:

    @inproceedings{LiaoSBWL17,
      author    = {Minghui Liao and Baoguang Shi and Xiang Bai and Xinggang Wang and Wenyu Liu},
      title     = {TextBoxes: {A} Fast Text Detector with a Single Deep Neural Network},
      booktitle = {AAAI},
      year      = {2017}
    }

- Get the code. We will call the directory that you cloned Caffe into $CAFFE_ROOT.

      git clone https://github.com/MhLiao/TextBoxes.git
      cd TextBoxes
      make -j8
      make py

- Models trained on ICDAR 2013: Dropbox link, BaiduYun link
- Fully convolutional reduced (atrous) VGGNet: Dropbox link, BaiduYun link
- Compiled mex file for evaluation (for multi-scale test evaluation: evaluation_nms.m): Dropbox link, BaiduYun link
- Run "python examples/demo.py".
- Set "use_multi_scale" in the "examples/demo.py" script to control whether multi-scale testing is used.
- The results are saved in "examples/results/".
- Train for about 50k iterations on the synthetic data referred to in the paper.
- Train for about 2k iterations on the corresponding training data, such as ICDAR 2013 and SVT.
- For more information, such as learning rate setting, please refer to the paper.
- Using the given test code, you can achieve an F-measure of about 80% on ICDAR 2013 with a single scale.
- Using the given multi-scale test code, you can achieve an F-measure of about 85% on ICDAR 2013 with non-maximum suppression.
- For more performance information, please refer to the paper and to Task1 and Task4 of Challenge2 on the ICDAR 2015 website: http://rrc.cvc.uab.es/?ch=2&com=evaluation
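
Non-maximum suppression is the only post-processing step the detector relies on, so a short illustration may help. The following is a minimal pure-Python sketch of greedy NMS on axis-aligned boxes. It is not the code shipped in this repository (the multi-scale evaluation uses the compiled mex file with evaluation_nms.m). The function names `iou` and `nms`, the `(xmin, ymin, xmax, ymax)` box layout, and the 0.5 threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first.

    The threshold value is an illustrative assumption, not the setting
    used by the repository's test code.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: the first two boxes cover the same word.
boxes = [(158, 128, 411, 181), (160, 130, 409, 180), (443, 128, 501, 169)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the second box overlaps the first almost entirely and has a lower score, so only the first and third boxes survive.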
The reference XML file is as follows:

    <?xml version="1.0" encoding="utf-8"?>
    <annotation>
        <object>
            <name>text</name>
            <bndbox>
                <xmin>158</xmin>
                <ymin>128</ymin>
                <xmax>411</xmax>
                <ymax>181</ymax>
            </bndbox>
        </object>
        <object>
            <name>text</name>
            <bndbox>
                <xmin>443</xmin>
                <ymin>128</ymin>
                <xmax>501</xmax>
                <ymax>169</ymax>
            </bndbox>
        </object>
        <folder></folder>
        <filename>100.jpg</filename>
        <size>
            <width>640</width>
            <height>480</height>
            <depth>3</depth>
        </size>
    </annotation>

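
To check that your own annotation files match this layout, you could read them back with Python's standard library. The snippet below is a small sketch, not part of the repository: the helper name `read_annotation` is hypothetical, and it assumes every `object` element carries a `bndbox` with integer `xmin`, `ymin`, `xmax`, and `ymax` children, as in the reference file.

```python
import xml.etree.ElementTree as ET


def read_annotation(xml_text):
    """Return (filename, (width, height), boxes) from one annotation string.

    Each box is a tuple (xmin, ymin, xmax, ymax) of integers.
    """
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        box = tuple(
            int(bndbox.find(tag).text)
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append(box)
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    return root.find("filename").text, (width, height), boxes


# A shortened copy of the reference annotation, used only for illustration.
sample = (
    "<annotation>"
    "<object><name>text</name><bndbox>"
    "<xmin>158</xmin><ymin>128</ymin><xmax>411</xmax><ymax>181</ymax>"
    "</bndbox></object>"
    "<folder></folder><filename>100.jpg</filename>"
    "<size><width>640</width><height>480</height><depth>3</depth></size>"
    "</annotation>"
)
filename, image_size, boxes = read_annotation(sample)
```

For files on disk, `ET.parse(path).getroot()` can replace `ET.fromstring` with the same traversal.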
Please let me know if you encounter any issues.