http://dx.doi.org/10.3837/tiis.2019.02.016

A Novel Text Sample Selection Model for Scene Text Detection via Bootstrap Learning  

Kong, Jun (Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University)
Sun, Jinhua (Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University)
Jiang, Min (Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University)
Hou, Jian (Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS), v.13, no.2, 2019, pp. 771-789
Abstract
Text detection has long been a popular research topic in computer vision. Prevalent text detection algorithms find it difficult to avoid dependence on datasets. To overcome this problem, we propose a novel unsupervised text detection algorithm inspired by bootstrap learning. First, we introduce text candidates in a novel superpixel form, obtained by image segmentation, to improve the text recall rate. Second, we propose a unique text sample selection model (TSSM) that extracts text samples from the current image and eliminates database dependency. Specifically, to improve the precision of the samples, we combine maximally stable extremal regions (MSERs) with the saliency map to generate sample reference maps using a double threshold scheme. Finally, a multiple kernel boosting method generates a strong text classifier by combining multiple single-kernel SVMs trained on the samples selected by the TSSM. Experimental results on standard datasets demonstrate that our text detection method is robust to complex backgrounds and multilingual text and performs stably across different standard datasets.
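The double threshold scheme mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the MSER output and the saliency map are fused, nor which threshold values are used, so the averaging rule, the function name `select_samples`, and the values `high=0.7` and `low=0.2` below are all assumptions made for illustration only, not the authors' method.

```python
import numpy as np

def select_samples(mser_mask, saliency, high=0.7, low=0.2):
    """Label each pixel as text sample (1), background sample (0), or undecided (-1).

    ASSUMPTION: the reference map is the plain average of a binary MSER mask
    and a saliency map in [0, 1]; the paper's actual fusion rule may differ.
    """
    mser_mask = np.asarray(mser_mask, dtype=float)
    saliency = np.asarray(saliency, dtype=float)
    reference = 0.5 * (mser_mask + saliency)   # hypothetical sample reference map

    labels = np.full(reference.shape, -1, dtype=int)
    labels[reference >= high] = 1   # confident positive (text) samples
    labels[reference <= low] = 0    # confident negative (background) samples
    return labels                   # values in between stay undecided

# Toy 2x3 "image": binary MSER mask and a saliency map in [0, 1].
mser_mask = np.array([[1, 1, 0], [0, 0, 0]])
saliency = np.array([[0.9, 0.3, 0.8], [0.1, 0.5, 0.0]])
labels = select_samples(mser_mask, saliency)
```

Only the confidently labeled pixels (1 or 0) would be passed on as training samples in this sketch; the undecided ones are left for the classifier to resolve.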
Keywords
Text detection; bootstrap learning; image segmentation; text sample selection model
Citations & Related Records
Times Cited By KSCI: 2