Modeling the Visual Target Search in Natural Scenes

  • Park, Daecheol (School of Industrial Management Engineering, Korea University) ;
  • Myung, Rohae (School of Industrial Management Engineering, Korea University) ;
  • Kim, Sang-Hyeob (BT Convergence Technology Research Department, Electronics and Telecommunications Research Institute) ;
  • Jang, Eun-Hye (BT Convergence Technology Research Department, Electronics and Telecommunications Research Institute) ;
  • Park, Byoung-Jun (BT Convergence Technology Research Department, Electronics and Telecommunications Research Institute)
  • Received : 2012.05.11
  • Accepted : 2012.10.22
  • Published : 2012.12.31

Abstract

Objective: The aim of this study is to predict human visual target search in real scene images using the ACT-R cognitive architecture. Background: Humans search scenes using bottom-up and top-down processes at the same time, drawing on both the characteristics of the image itself and prior knowledge about such images. A model of human visual search therefore needs to include both processes. Method: Visual target search performance in real scene images was analyzed by comparing experimental data with the results of an ACT-R model. Ten students participated in the experiment, and the model was simulated ten times. The experiment was conducted under two conditions, indoor images and outdoor images. An ACT-R model was built that selects the first saccade region by computing a saliency map and using the spatial layout of the scene. The proposed model uses these cues to guide visual search and adopts search strategies according to that guidance. Results: The analysis found no significant difference in performance time between the model predictions and the empirical data. Conclusion: The proposed ACT-R model can predict the human visual search process in real scene images using a saliency map and spatial layout. Application: This study is useful for conducting model-based evaluation of visual search, particularly in real images. It can also be adopted in diverse image processing applications, such as aids for the visually impaired.
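The abstract describes the model only at a high level: a bottom-up saliency map is blended with top-down spatial-layout knowledge to choose the first saccade region. As a rough illustrative sketch of that idea (not the authors' implementation, which is built in ACT-R), the Python fragment below computes a simplified Itti-Koch-style intensity saliency map, combines it with a hypothetical row-band layout prior, and returns a predicted first-fixation location. The scale choices, the weight `w`, and the `target_rows` prior are all assumptions made for illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def saliency_map(image, scales=(2, 4, 8)):
    """Simplified bottom-up saliency: multi-scale center-surround
    differences on intensity (after Itti, Koch and Niebur, 1998)."""
    intensity = image.mean(axis=2)             # RGB -> grayscale
    sal = np.zeros_like(intensity)
    for c in scales:
        center = gaussian_filter(intensity, sigma=c)
        surround = gaussian_filter(intensity, sigma=4 * c)
        sal += np.abs(center - surround)       # conspicuity at this scale
    return sal / sal.max()                     # normalize to [0, 1]

def layout_prior(shape, target_rows):
    """Hypothetical top-down prior: up-weight the image rows where the
    target category is expected (e.g., lower rows for 'chair')."""
    prior = np.full(shape, 0.1)
    prior[target_rows, :] = 1.0
    return gaussian_filter(prior, sigma=shape[0] / 8)  # soften the band

def first_saccade(image, target_rows, w=0.5):
    """Blend saliency and the layout prior; the peak of the combined
    guide map is the predicted first-saccade location (row, col)."""
    guide = (w * saliency_map(image)
             + (1 - w) * layout_prior(image.shape[:2], target_rows))
    return np.unravel_index(np.argmax(guide), guide.shape)

# Toy usage on a random 'scene', with the target expected in the lower third.
rng = np.random.default_rng(0)
scene = rng.random((120, 160, 3))
print(first_saccade(scene, target_rows=slice(80, 120)))
```

In this sketch the combined guide map plays the role the abstract assigns to the "guide of visual search": candidate locations are ranked by it, and a search strategy would visit them in descending order until the target is recognized.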

Keywords

References

  1. Anderson, J. R., The Architecture of Cognition. Cambridge, MA: Harvard University Press, 1983.
  2. Anderson, J. R., Matessa, M. and Douglass, S., "The ACT-R theory and visual attention.", Proceedings of the Seventeenth Annual Conference of the Cognitive Science Society, (pp. 61-65), Hillsdale, NJ: Lawrence Erlbaum, 1995.
  3. Anderson, J. R., Matessa, M. and Lebiere, C., ACT-R: A theory of higher level cognition and its relation to visual attention, Human-Computer Interaction, 12(4), 439-462, 1997. https://doi.org/10.1207/s15327051hci1204_5
  4. Biederman, I., Mezzanotte, R. J. and Rabinowitz, J. C., Scene perception: detecting and judging objects undergoing relational violations, Cognitive Psychology, 14(2), 143-177, 1982. https://doi.org/10.1016/0010-0285(82)90007-X
  5. Choi, K. J. and Lee, Y. B., A Saliency Map Model for Color Images using Statistical Information and Local Competitive Relations of Extracted Features, Korean Brain Society, 2(1), 69-78, 2002.
  6. Divvala, S. K., Hoiem, D., Hays, J. H., Efros, A. A. and Hebert, M., An empirical study of context in object detection, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (pp.1271-1278), 2009.
  7. Elazary, L. and Itti, L., A Bayesian model for efficient visual search and recognition, Vision Research, 50(14), 1338-1352, 2010. https://doi.org/10.1016/j.visres.2010.01.002
  8. Fleetwood, M. D. and Byrne, M. D., Modeling icon search in ACT-R/PM, Cognitive Systems Research, 3(1), 25-33, 2002. https://doi.org/10.1016/S1389-0417(01)00041-9
  9. Fleetwood, M. D. and Byrne, M. D., Modeling the visual search of displays: a revised ACT-R model of icon search based on eye-tracking data, Human-Computer Interaction, 21(2), 153-197, 2006. https://doi.org/10.1207/s15327051hci2102_1
  10. http://www.saliencytoolbox.net/index.html (retrieved March 2, 2012)
  11. http://act-r.psy.cmu.edu (retrieved January 10, 2011)
  12. Itti, L., Koch, C. and Niebur, E., A model of saliency-based visual attention for rapid scene analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11), 1254-1259, 1998. https://doi.org/10.1109/34.730558
  13. Kieras, D. E. and Meyer, D. E., An overview of the EPIC architecture for cognition and performance with application to human-computer interaction, Human-Computer Interaction, 12, 391-438, 1997. https://doi.org/10.1207/s15327051hci1204_4
  14. Koch, C. and Ullman, S., Shifts in Selective Visual Attention: Towards the Underlying Neural Circuitry, Human Neurobiology, 4, 219-227, 1985.
  15. Laird, J. E., Newell, A. and Rosenbloom, P. S., SOAR: An architecture for general intelligence, Artificial Intelligence, 33, 1-64, 1987. https://doi.org/10.1016/0004-3702(87)90050-6
  16. Lee, M. H., Artificial vision system of selective attention, Journal of the Korean Institute of Electronics Engineers, 36(11), 52-65, 2009.
  17. Le Meur, O. and Le Callet, P., What we see is most likely to be what matters: visual attention and applications, Proceedings of the IEEE International Conference on Image Processing (ICIP 2009), (pp. 3085-3088), 2009.
  18. Nilsen, E. L., Perceptual-motor control in human-computer interaction, Technical Report, No. 37, Ann Arbor, University of Michigan, Cognitive Science and Machine Intelligence Laboratory, 1991.
  19. Russell, B. C., Torralba, A., Murphy, K. P. and Freeman, W. T., LabelMe: a database and web-based tool for image annotation, International Journal of Computer Vision, 77(1), 157-173, 2008. https://doi.org/10.1007/s11263-007-0090-8
  20. Salvucci, D. D., A model of eye movements and visual attention, Proceedings of the International Conference on Cognitive Modeling, (pp. 252-259), Veenendaal, The Netherlands: Universal Press, 2000.
  21. Sanocki, T. and Epstein, W., Priming Spatial Layout of Scenes, Psychological Science, 8(5), 374-378, 1997. https://doi.org/10.1111/j.1467-9280.1997.tb00428.x
  22. Torralba, A., Oliva, A., Castelhano, M. S. and Henderson, J. M., Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object search, Psychological Review, 113(4), 766-786, 2006. https://doi.org/10.1037/0033-295X.113.4.766
  23. Torralba, A., Choi, M. J., Lim, J. J. and Willsky, A. S., A Tree-Based Context Model for Object Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 34, 2012.
  24. Treisman, A. M. and Gelade, G., A Feature-integration Theory of Attention, Cognitive Psychology, 12(1), 97-136, 1980. https://doi.org/10.1016/0010-0285(80)90005-5
  25. Walther, D., Interactions of visual attention and object recognition: computational modeling, algorithms, and psychophysics, Ph.D. thesis, California Institute of Technology, Pasadena, CA, 2006.
  26. Walther, D. B. and Koch, C., Attention in Hierarchical Models of Object Recognition, Progress in Brain Research, 165, 57-78, 2007. https://doi.org/10.1016/S0079-6123(06)65005-X
  27. Silfverberg, M., MacKenzie, I. S. and Korhonen, P., Predicting Text Entry Speed on Mobile Phones, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, (pp. 9-16), The Hague, The Netherlands, 2000.
  28. St. Amant, R., Horton, T. E. and Ritter, F. E., Model-Based Evaluation of Cell Phone Menu Interaction, Proceedings of the Conference on Human Factors in Computing Systems, (pp. 343-350), 2004.
  29. St. Amant, R., Horton, T. E. and Ritter, F. E., Model-Based Evaluation of Expert Cell Phone Menu Interaction, ACM Transactions on Computer-Human Interaction, 14(1), Article 1, 2007.
  30. Textware Solutions, The FITALY one-finger keyboard, http://fitaly.com/fitaly, 1998.
  31. Zhai, S., Hunter, M. and Smith, B. A., The Metropolis Keyboard - An Exploration of Quantitative Techniques for Virtual Keyboard Design, Proceedings of UIST 2000, CHI Letters, 2(2), ACM Press, 119-128, 2000.