http://dx.doi.org/10.7236/JIIBC.2019.19.6.161

Analysis and Comparison of Query focused Korean Document Summarization using Word Embedding  

Heu, Jee-Uk (Dept. of Computer Engineering, Hanyang University)
Publication Information
The Journal of the Institute of Internet, Broadcasting and Communication / v.19, no.6, 2019, pp. 161-167
Abstract
Recently, the amount of information being created has grown rapidly with the spread of state-of-the-art technology and the development of various ICT-based web services. As a result, users must spend considerable time and effort to find the information they need within this mass of information. Document summarization is a technique that efficiently produces a summary of a given document by analyzing it and extracting its key sentences and words. However, because of the characteristics of the Korean language, it is difficult to apply existing word embedding techniques to analyze the contents of documents written in Korean. In this paper, we propose a new query-focused Korean document summarization method that exploits word embedding techniques such as Word2Vec and FastText, and we compare the performance of the two.
Keywords
FastText; Korean; Query-Based-Document-Summarization; Word2Vec
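
The abstract does not spell out how sentences are scored against the query, so the following is only a minimal sketch of one common embedding-based approach, not the paper's method: average the word vectors of the query and of each sentence, rank sentences by cosine similarity to the query, and keep the top k. Everything here is an assumption made for illustration: the names `TOY_VECTORS`, `embed`, `cosine`, and `summarize`, the whitespace tokenization, and the hand-written three-dimensional vectors. In practice the vectors would come from a trained Word2Vec or FastText model, and Korean text would first pass through a morphological analyzer.

```python
import math

# Hypothetical toy embeddings, hand-written for illustration only.
# A real system would look these up in a trained Word2Vec or FastText model.
TOY_VECTORS = {
    "weather": [0.9, 0.1, 0.0],
    "rain":    [0.8, 0.2, 0.0],
    "stock":   [0.0, 0.9, 0.1],
    "market":  [0.1, 0.8, 0.1],
    "soccer":  [0.0, 0.1, 0.9],
}


def embed(tokens):
    """Average the vectors of known tokens; return None if no token is known."""
    known = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def summarize(sentences, query, k=1):
    """Return the k sentences most similar to the query, in document order."""
    query_vec = embed(query.split())
    if query_vec is None:
        return []  # no known query word, so nothing can be ranked
    scored = []
    for index, sentence in enumerate(sentences):
        sent_vec = embed(sentence.split())
        score = cosine(query_vec, sent_vec) if sent_vec is not None else 0.0
        scored.append((score, index))
    # Highest score first; ties are broken by earlier position in the document.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:k]
    return [sentences[index] for _, index in sorted(top, key=lambda pair: pair[1])]


if __name__ == "__main__":
    document = ["rain weather today", "stock market fell", "soccer match tonight"]
    print(summarize(document, "weather", k=1))
```

One design point relevant to the comparison in the abstract: in this sketch a word missing from the vocabulary is simply skipped. Word2Vec behaves that way for unseen words, whereas FastText can compose a vector from character n-grams, which is often cited as helpful for morphologically rich languages such as Korean.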
Citations & Related Records
Times Cited By KSCI: 3