http://dx.doi.org/10.3837/tiis.2021.01.010

Location-Based Saliency Maps from a Fully Connected Layer using Multi-Shapes  

Kim, Hoseung (Visual Information Processing, Korea University)
Han, Seong-Soo (Division of Liberal Studies, Kangwon National University)
Jeong, Chang-Sung (Department of Electrical Engineering, Korea University)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.15, no.1, 2021, pp. 166-179
Abstract
Computer vision research modeled on the human visual system has been actively pursued in recent years. Saliency maps highlight visually interesting areas of an image, but their performance can degrade under external factors such as an indistinct background or a light source. In this study, existing color, brightness, and contrast feature maps are passed through multiple shape and orientation filters and then fed to a fully connected layer, which determines pixel intensities within the image using location-based weights. The proposed method separates the background from the area of interest more effectively in terms of color and brightness when external elements and noise are present. Location-based weight normalization is also effective at removing high-intensity pixels that lie outside of the image or in non-interest regions. The method further shows that multi-filter normalization can be processed faster using parallel processing.
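The idea of location-based weighting can be illustrated with a minimal sketch. The abstract does not specify the filters, the fully connected layer, or the form of the weights, so everything below is an assumption made for illustration: the Gaussian center prior in `location_weights`, the `sigma_frac` parameter, and the fixed weighted sum in `location_weighted_saliency` (a stand-in for the learned layer) are hypothetical and are not the authors' method.

```python
import numpy as np


def location_weights(height, width, sigma_frac=0.35):
    """Hypothetical location prior: Gaussian falloff from the image center.

    Assumption for illustration only; the paper's actual weights are not
    given in the abstract.
    """
    ys = np.arange(height) - (height - 1) / 2.0
    xs = np.arange(width) - (width - 1) / 2.0
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    sigma_y = sigma_frac * height
    sigma_x = sigma_frac * width
    return np.exp(-0.5 * ((yy / sigma_y) ** 2 + (xx / sigma_x) ** 2))


def normalize(m):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    lo, hi = m.min(), m.max()
    if hi == lo:
        return np.zeros_like(m, dtype=float)
    return (m - lo) / (hi - lo)


def location_weighted_saliency(feature_maps, map_weights=None):
    """Fuse feature maps with a fixed weighted sum, then apply the prior.

    The weighted sum stands in for a learned fully connected layer
    (assumption); map_weights defaults to a uniform average.
    """
    maps = [normalize(np.asarray(m, dtype=float)) for m in feature_maps]
    height, width = maps[0].shape
    if map_weights is None:
        map_weights = np.ones(len(maps)) / len(maps)
    fused = sum(w * m for w, m in zip(map_weights, maps))
    return normalize(fused * location_weights(height, width))


# Usage: a bright pixel in the corner is suppressed relative to a
# slightly dimmer pixel at the center.
fmap = np.zeros((21, 21))
fmap[0, 0] = 1.0    # high-intensity pixel in a non-interest region
fmap[10, 10] = 0.8  # candidate salient pixel near the center
sal = location_weighted_saliency([fmap])
```

Under these assumptions the corner pixel, although brighter in the input, ends up with a lower saliency value than the central pixel, which is the kind of suppression of high-intensity non-interest pixels that the abstract attributes to location-based weight normalization.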
Keywords
Saliency Map; Contrast; Fully Connected Layer; Multi Shape; Location-Based Normalization