• Title/Summary/Keyword: 문서감정

Sentiment Prediction using Emotion and Context Information in Unstructured Documents (비정형 문서에서 감정과 상황 정보를 이용한 감성 예측)

  • Kim, Jin-Su
    • Journal of Convergence for Information Technology / v.10 no.10 / pp.40-46 / 2020
  • With the development of the Internet, users share their experiences and opinions. Because related keywords are used without considering information such as the overall emotion or genre of an unstructured document such as a movie review, the accuracy of sentiment prediction for the relevant emotional context suffers. Therefore, we propose a system that predicts sentiment based on information such as the genre of the user-created unstructured document or its overall emotion. First, representative keywords related to emotion sets such as Joy, Anger, Fear, and Sadness are extracted from the unstructured document, and the normalized weights of the emotional feature words and the document's information are used as a training set for a system that combines CNN and LSTM. Finally, tests on refined words extracted using movie information, a morpheme analyzer, n-grams, emoticons, and emojis showed that both the accuracy and the F-measure of sentiment prediction improved. The proposed system can predict sentiment appropriately for the situation by avoiding the error of judging a review negative because sad words are used for sad movies or scary words for horror movies.

Sentiment Analysis System by Using BERT Language Model (BERT 언어 모델을 이용한 감정 분석 시스템)

  • Kim, Taek-Hyun;Cho, Dan-Bi;Lee, Hyun-Young;Won, Hye-Jin;Kang, Seung-Shik
    • Proceedings of the Korea Information Processing Society Conference / 2020.11a / pp.975-977 / 2020
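  • Sentiment analysis is a method for identifying the subjective emotions, opinions, and moods expressed in a document, and it is used in various fields such as social media and online reviews. A sentiment score is computed from the words and context of the text in a document to decide between positive and negative sentiment. We conducted sentiment analysis research on the Naver movie review data of 200,000 reviews built in 2015, extended with 120,000 additional reviews, and used BERT, which has recently shown high performance in natural language processing, as the language model. We compared the sentiment analysis performance of existing machine learning techniques such as LSTM (Long Short-Term Memory), Google's multilingual BERT model, and the KoBERT model; KoBERT showed the highest performance at 89.90%.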

Analysis on Sequence of Ball-pen and Pencil by using Digital Infrared Photography -with Emphasis on the Documents Authentication- (적외선 사진술을 이용한 볼펜과 연필의 선후 관계 분석 -문서감정을 중심으로-)

  • Kim, Yoo-Jin;Youn, Sung-Bin;Har, Dong-Hwan
    • The Journal of the Korea Contents Association / v.11 no.5 / pp.481-488 / 2011
  • Generally speaking, a document is a mutual promise between two parties and functions as a legally binding basis of trust for a transaction. A document should be produced by mutual agreement, and it gains credibility when the transparency of its production is ensured. Therefore, analyzing the sequence of steps in a document's production is very important for document appraisal. The purpose of this research is to determine the sequence relationship between the erased carbon components of a pencil and the components left by a ball-point pen, and thus to suggest a method for determining whether mutual agreement was applied in signing an insurance policy. The method uses infrared photography to analyze whether the carbon components of a pencil remain beneath the ball-point pen stroke. If pencil carbon remains beneath the pen stroke, the stroke absorbs infrared rays and appears dense. Because the method uses a relatively simple infrared photography system, it should be useful to private document appraisal offices.

A Korean Document Sentiment Classification System based on Semantic Properties of Sentiment Words (감정 단어의 의미적 특성을 반영한 한국어 문서 감정분류 시스템)

  • Hwang, Jae-Won;Ko, Young-Joong
    • Journal of KIISE:Software and Applications / v.37 no.4 / pp.317-322 / 2010
  • This paper proposes a way to improve the performance of a Korean document sentiment classification system using semantic properties of sentiment words. A sentiment word is a word that carries sentiment, and sentiment features are defined as a set of sentiment words, which are an important lexical resource for sentiment classification. A sentiment feature has a different sentiment intensity in the general field and in a specific domain. In the general field, the sentiment intensity can be estimated using snippets from a search engine, while in a specific domain, training data can be used for this estimation. The estimated sentiment intensity of a sentiment feature is called its semantic orientation and is used to estimate the sentiment intensity of the sentences in text documents. After estimating the sentiment intensity of the sentences, we apply it to the weights of the sentiment features. We evaluate our system with a support vector machine in three cases: general, domain-specific, and combined general/domain-specific semantic orientation. Experimental results show improved performance in all cases; in particular, with general/domain-specific semantic orientation, the proposed method performs 3.1% better than a baseline system indexed by content words only.

Sentiment Classification for Korean Tweets via Semi-Supervised Learning (준지도 학습을 이용한 트윗 감정 분류)

  • Seo, Hyeong-Won;Noh, Kyung-Mok;Cheon, Min-A;Kim, Jae-Hoon
    • Annual Conference on Human and Language Technology / 2012.10a / pp.123-125 / 2012
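  • This paper describes a method for efficiently expanding the training corpus needed for machine-learning-based sentiment classification. A training corpus generally needs an appropriate label for each item, but the volume is so large that people cannot label everything by hand. Many semi-supervised learning methods have already been studied as a solution, and we confirmed that the sentiment document classifier performs well when such methods are applied to sentiment classification of tweets, which are short documents.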

An emotional speech synthesis markup language processor for multi-speaker and emotional text-to-speech applications (다음색 감정 음성합성 응용을 위한 감정 SSML 처리기)

  • Ryu, Se-Hui;Cho, Hee;Lee, Ju-Hyun;Hong, Ki-Hyung
    • The Journal of the Acoustical Society of Korea / v.40 no.5 / pp.523-529 / 2021
  • In this paper, we designed and developed an Emotional Speech Synthesis Markup Language (SSML) processor. Multi-speaker emotional speech synthesis technology that can express multiple voice colors and emotional expressions has been developed, and we designed Emotional SSML by extending SSML for multiple voice colors and emotional expressions. The Emotional SSML processor has a graphical user interface and consists of the following four components. First, a multi-speaker emotional text editor that can easily mark specific voice colors and emotions at desired positions. Second, an Emotional SSML document generator that automatically creates an Emotional SSML document from the output of the multi-speaker emotional text editor. Third, an Emotional SSML parser that parses the Emotional SSML document. Last, a sequencer that controls a multi-speaker emotional Text-to-Speech (TTS) engine based on the output of the Emotional SSML parser. Because it is based on SSML, an open standard independent of programming language and platform, the Emotional SSML processor can easily integrate with various speech synthesis engines and facilitates the development of multi-speaker emotional text-to-speech applications.

An Attention Method-based Deep Learning Encoder for the Sentiment Classification of Documents (문서의 감정 분류를 위한 주목 방법 기반의 딥러닝 인코더)

  • Kwon, Sunjae;Kim, Juae;Kang, Sangwoo;Seo, Jungyun
    • KIISE Transactions on Computing Practices / v.23 no.4 / pp.268-273 / 2017
  • Recently, deep learning encoder-based approaches have been actively applied to sentiment classification. However, the Long Short-Term Memory network encoder, the commonly used architecture, produces lower-quality vector representations as documents get longer. In this study, for effective classification of sentiment documents, we suggest an attention method-based deep learning encoder that generates the document vector representation as a weighted sum of the Long Short-Term Memory network outputs, weighted by importance. In addition, we propose modifications that adapt the attention method-based encoder to sentiment classification, consisting of a window attention method part and an attention weight adjustment part. In the window attention method part, weights are obtained in window units to effectively recognize sentiment features that consist of more than one word. In the attention weight adjustment part, the learned weights are smoothed. Experimental results show that the proposed method outperformed the Long Short-Term Memory network encoder, achieving 89.67% accuracy.
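
The weighted-sum idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dot-product scoring, the `query` vector, and the averaging of scores over a centred window are assumptions chosen for illustration, and plain lists stand in for the Long Short-Term Memory outputs.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def window_scores(scores, window=3):
    """Average each score with its neighbours inside a centred window (assumed scheme)."""
    half = window // 2
    out = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        out.append(sum(scores[lo:hi]) / (hi - lo))
    return out

def attention_pool(hidden_states, query, window=1):
    """Build a document vector as the attention-weighted sum of per-token vectors."""
    # Hypothetical scoring: dot product between each token vector and a query vector.
    scores = [sum(h_j * q_j for h_j, q_j in zip(h, query)) for h in hidden_states]
    if window > 1:
        scores = window_scores(scores, window)
    weights = softmax(scores)
    dim = len(hidden_states[0])
    doc = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return doc, weights
```

Calling `attention_pool(states, query, window=3)` scores tokens in groups of up to three neighbours, which is one simple way to let multi-word expressions share attention; in a trained model the query and the encoder outputs would be learned rather than supplied by hand.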

Feature Weighting for Opinion Classification of Comments on News Articles (뉴스 댓글의 감정 분류를 위한 자질 가중치 설정)

  • Lee, Kong-Joo;Kim, Jae-Hoon;Seo, Hyung-Won;Rhyu, Keel-Soo
    • Journal of Advanced Marine Engineering and Technology / v.34 no.6 / pp.871-879 / 2010
  • In this paper, we present a system that classifies comments on a news article by user opinion, called polarity (positive or negative). The system is a kind of document classification system for comments and is based on machine learning techniques such as the support vector machine. Unlike normal documents, comments have a body that can influence how their opinions are classified into polarities. We propose a feature weighting scheme for opinion classification that uses such characteristics of comments together with several resources. In our experiments, the weighting scheme turned out to be useful for opinion classification of comments on Korean news articles. Korean character n-grams (bigrams or trigrams) also proved helpful for opinion classification of comments that include many Internet words or typos. In the future, we will apply this scheme to opinion analysis of comments on product reviews as well as news articles.
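
As a rough illustration of the character n-gram features mentioned in this abstract, the sketch below counts character bigrams and trigrams in a comment. The function name and the choice to drop whitespace before slicing are assumptions made for the example; the abstract does not describe the paper's exact feature extraction.

```python
from collections import Counter

def char_ngrams(text, n_values=(2, 3)):
    """Count character n-grams (bigrams and trigrams by default) in a comment."""
    # Assumption: whitespace is dropped so that spacing errors do not change the features.
    compact = "".join(text.split())
    counts = Counter()
    for n in n_values:
        for i in range(len(compact) - n + 1):
            counts[compact[i:i + n]] += 1
    return counts
```

Because the features are built from characters rather than dictionary words, a misspelled or newly coined word still shares most of its n-grams with related forms, which is a plausible reason such features help on noisy comments.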

A Sentiment Analysis of Internet Movie Reviews Using String Kernels (문자열 커널을 이용한 인터넷 영화평의 감정 분석)

  • Kim, Sang-Do;Yoon, Hee-Geun;Park, Seong-Bae;Park, Se-Young;Lee, Sang-Jo
    • Annual Conference on Human and Language Technology / 2009.10a / pp.56-60 / 2009
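  • Today the Internet is becoming a space where individuals can share their emotions and opinions. However, because a vast number of documents exist on the Internet, it is not easy to use other users' emotion and opinion information in personal decision making. Research on automatically extracting emotions and opinions has recently been active, and most existing work on sentiment analysis uses a sentiment dictionary that holds polarity information for phrases. Dictionary-based methods are limited, however, because new words appear on the Internet every day and nonstandard language use is frequent. This paper treats sentiment analysis as a binary classification problem that separates positive from negative. We use Support Vector Machines (SVM), which perform very well on binary classification problems, together with a string kernel that compares substrings of sentences to compute the similarity between documents. In experiments on real movie reviews, the proposed model showed more stable performance than the Bag of Words (BOW) model used for comparison.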
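
One simple member of the string-kernel family is the p-spectrum kernel, which counts substrings of length p shared by two texts. The sketch below uses it as a stand-in, since the abstract does not say which string kernel the paper uses; the function names and the default `p=2` are assumptions for illustration.

```python
import math
from collections import Counter

def spectrum(text, p):
    """Count every contiguous substring of length p in the text."""
    return Counter(text[i:i + p] for i in range(len(text) - p + 1))

def spectrum_kernel(a, b, p=2):
    """Similarity as the inner product of the two substring-count vectors."""
    sa, sb = spectrum(a, p), spectrum(b, p)
    return sum(count * sb[sub] for sub, count in sa.items())

def normalized_kernel(a, b, p=2):
    """Scale the kernel into [0, 1] so that long reviews do not dominate."""
    denom = math.sqrt(spectrum_kernel(a, a, p) * spectrum_kernel(b, b, p))
    return spectrum_kernel(a, b, p) / denom if denom else 0.0
```

A matrix of such pairwise similarities can be passed to an SVM that accepts a precomputed kernel, so no word dictionary or tokenizer is needed for the comparison.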

Mark-up for Representing Emotion (감정의 표현을 위한 마크업)

  • 박성은;이용규
    • Proceedings of the Korea Multimedia Society Conference / 2004.05a / pp.487-490 / 2004
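  • Text-based services such as e-mail are becoming increasingly popular, but with such services it is difficult for the recipient of a message to accurately grasp the writer's emotional state. Emoticons that indicate emotional state are sometimes used as a partial fix, but they are not universal and are therefore inconvenient to use. As a way to solve this problem, this paper defines EmoXML (Emotion XML), a new markup language that lets the writer's emotions be expressed by inserting emotion tags into plain text documents. We also design a system that recognizes the emotion vocabulary contained in sentences and automatically generates and processes the related emotion tags.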
