http://dx.doi.org/10.3743/KOSIM.2019.36.3.131

Towards the Generation of Language-based Sound Summaries Using Electroencephalogram Measurements  

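Kim, Hyun-Hee (Department of Library and Information Science, Myongji University)
Kim, Yong-Ho (Department of Mass Communication, Pukyong National University)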
Publication Information
Journal of the Korean Society for Information Management / v.36, no.3, 2019, pp. 131-148
Abstract
This study constructed a cognitive model of information processing to understand the topic of a sound material and its characteristics. It then proposed methods for generating sound summaries by incorporating the anterior-posterior N400/P600 components of the event-related potential (ERP) response into the language representation of that cognitive model. To this end, research hypotheses were established and verified through ERP experiments, which found that the P600 is crucial in distinguishing topic-relevant shots from topic-irrelevant shots. The results of this study can be applied to the design of a classification algorithm, which can then be used to generate content-based metadata such as generic or personalized sound summaries and video skims.
Keywords
cognitive model; event-related potential response; anterior-posterior N400; anterior-posterior P600; sound summaries; video skims; artificial neural networks; content-based metadata