Search | Korea Science

Chang, Jae-Young
- The Journal of the Institute of Internet, Broadcasting and Communication
- /
- v.13 no.5
- /
- pp.37-47
- /
- 2013
Text Mining is a research area of retrieving high quality hidden information such as patterns, trends, or distributions through analyzing unformatted text. Basically, since text mining assumes an unstructured text, it needs to be represented as a simple text model for analyzing it. So far, most frequently used model is VSM(Vector Space Model), in which a text is represented as a bag of words. However, recently much researches tried to apply a graph-based text model for representing semantic relationships between words. In this paper, we survey research trends of graph-based text representation models for text mining. Additionally, we also discuss about future models of graph-based text mining.
https://doi.org/10.7236/JIIBC.2013.13.5.37 인용 PDF KSCI

Lim, Pu-reum;Kim, Han-joon
- Database Research
- /
- v.34 no.3
- /
- pp.3-13
- /
- 2018
Text classification is one of the text mining technologies that classifies a given textual document into its appropriate categories and is used in various fields such as spam email detection, news classification, question answering, emotional analysis, and chat bot. In general, the text classification system utilizes machine learning algorithms, and among a number of algorithms, naïve Bayes and support vector machine, which are suitable for text data, are known to have reasonable performance. Recently, with the development of deep learning technology, several researches on applying deep neural networks such as recurrent neural networks (RNN) and convolutional neural networks (CNN) have been introduced to improve the performance of text classification system. However, the current text classification techniques have not yet reached the perfect level of text classification. This paper focuses on the fact that the text data is expressed as a vector only with the word dimensions, which impairs the semantic information inherent in the text, and proposes a neural network architecture based upon the semantic tensor space model.

Dong-hun, Lee;Chan, Hur;Hyeyoung, Park;Sang-hyo, Park
- IEMEK Journal of Embedded Systems and Applications
- /
- v.17 no.6
- /
- pp.347-353
- /
- 2022
In this paper, we propose a method that performs a text-video retrieval model by replacing video properties using captions. In general, the exisiting embedding-based models consist of both joint embedding space construction and the CNN-based video encoding process, which requires a lot of computation in the training as well as the inference process. To overcome this problem, we introduce a video-captioning module to replace the visual property of video with captions generated by the video-captioning module. To be specific, we adopt the caption generator that converts candidate videos into captions in the inference process, thereby enabling direct comparison between the text given as a query and candidate videos without joint embedding space. Through the experiment, the proposed model successfully reduces the amount of computation and inference time by skipping the visual processing process and joint embedding space construction on two benchmark dataset, MSR-VTT and VATEX.
https://doi.org/10.14372/IEMEK.2022.17.6.347 인용 PDF KSCI