Tobacco Retail License Recognition Based on Dual Attention Mechanism

  • Shan, Yuxiang (Chinese Tobacco Zhejiang Industrial Company Limited) ;
  • Ren, Qin (Chinese Tobacco Zhejiang Industrial Company Limited) ;
  • Wang, Cheng (Chinese Tobacco Zhejiang Industrial Company Limited) ;
  • Wang, Xiuhui (Dept. of Computer, China Jiliang University)
  • Received : 2022.01.27
  • Accepted : 2022.06.15
  • Published : 2022.08.31

Abstract

Images of tobacco retail licenses have complex, unstructured characteristics, and recognizing them is an urgent technical problem in the robotic process automation of tobacco marketing. In this paper, a novel recognition approach using a dual attention mechanism is presented to realize automatic recognition and information extraction from such images. First, we utilized a DenseNet network to extract the license information from the input tobacco retail license data. Second, a bi-directional long short-term memory network was used for encoding and decoding, with a continuous decoder integrating dual attention, to recognize and extract information from tobacco retail license images without segmentation. Finally, several performance experiments were conducted using a large-scale dataset of tobacco retail licenses. The experimental results show that the proposed approach achieves an accuracy of 98.36% on the ZY-LQ dataset, outperforming most existing methods.
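The abstract does not specify how the two attention components are defined or combined, so the following is only a minimal sketch of a single generic dot-product attention step, of the kind an attention-based decoder could apply to encoder features at each output character. The function names `softmax` and `attend`, the toy feature vectors, and the choice of dot-product scoring are all assumptions made for illustration and are not taken from the paper.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, features):
    """One attention step (hypothetical helper, not the paper's decoder).

    query:    decoder state for the current output step
    features: one encoder feature vector per image position
    Returns the attention weights and the weighted context vector.
    """
    # Score each position by its dot product with the query.
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    # Context vector: attention-weighted sum of the feature vectors.
    context = [sum(w * feat[d] for w, feat in zip(weights, features))
               for d in range(dim)]
    return weights, context

# Toy input: three positions with two-dimensional features (made-up values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
weights, context = attend(query, features)
```

Because the weights are computed over all positions at every step, the decoder can focus on a different region of the image for each character, which is one way recognition can proceed without an explicit segmentation stage.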

Acknowledgement

This work was supported by the Research on Key Technology and Application of Marketing Robot Process Automation (RPA) Based on Intelligent Image Recognition in Zhejiang China Tobacco Industry Co. Ltd. (No. ZJZY2021E001).
