• Title/Abstract/Keywords: Sentiment Analysis

Search results: 660 items (processing time: 0.023 s)

A Survey of Arabic Thematic Sentiment Analysis Based on Topic Modeling

  • Basabain, Seham
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 21, No. 9
    • /
    • pp.155-162
    • /
    • 2021
  • The expansion of the World Wide Web has led to a huge amount of user-generated content across forums and social media platforms. These rich data resources offer the opportunity to reflect and track changing public sentiment and help decision and policy makers develop proactive response strategies. Analysis of public emotions and opinions towards events and sentiment trends can help to address unforeseen areas of public concern. The need for systems that analyze these sentiments and the topics behind them has grown tremendously. While most existing work reported in the literature has been carried out in English, this paper reviews recent research on thematic sentiment analysis in Arabic and the techniques used to accomplish this task. The findings show that the prevailing techniques in Arabic topic-based sentiment analysis are based on traditional approaches and machine learning methods. In addition, only a limited number of recent studies have used deep learning approaches to build high-performance models.

A Comparative Study on Using SentiWordNet for English Twitter Sentiment Analysis

  • 강인수
    • 한국지능시스템학회논문지
    • /
    • Vol. 23, No. 4
    • /
    • pp.317-324
    • /
    • 2013
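  • Twitter sentiment analysis is the task of classifying the sentiment of tweets as positive or negative. This study addresses tweet sentiment analysis based on the SentiWordNet (SWN) sentiment lexicon. SWN is a sentiment lexicon that stores, for the entire English vocabulary, positive and negative sentiment strengths for each sense of a word. Previous SWN-based studies determine the sentiment of each term in a document from SWN and then determine the sentiment of the whole document from those term sentiments, but the methods for doing so vary widely. For example, to determine a term's sentiment, some compute the average of the differences between the positive and negative strengths over the term's senses in SWN, while others take the average or the maximum of the positive and negative strengths separately; to determine the sentiment of the whole document, some take the average and others the maximum of the sentiment values of its terms. These procedures are also applied either to the full set of SWN parts of speech (adjectives, verbs, nouns, and adverbs) or to a particular subset. Although a variety of SWN-based sentiment feature extraction procedures have thus been tried, a performance comparison across these techniques is hard to find. This study introduces procedures that generalize the various ways of using SWN for Twitter sentiment analysis and presents a performance comparison and analysis of each method.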
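
The term-level and document-level aggregation choices described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's procedure: `TOY_SWN` is a hypothetical stand-in for SentiWordNet with made-up strengths, and all function names are assumptions introduced here.

```python
from statistics import mean

# Hypothetical sense-level (positive, negative) strengths in the style of
# SentiWordNet. The numbers are illustrative, not taken from the real lexicon.
TOY_SWN = {
    "good": [(0.75, 0.0), (0.5, 0.0), (0.625, 0.125)],
    "bad": [(0.0, 0.625), (0.0, 0.75)],
    "cold": [(0.0, 0.5), (0.125, 0.25), (0.0, 0.0)],
}

def term_score_mean_diff(senses):
    """Average of (positive - negative) over all senses of a term."""
    return mean(p - n for p, n in senses)

def term_score_max(senses):
    """Maximum positive strength minus maximum negative strength."""
    return max(p for p, _ in senses) - max(n for _, n in senses)

def document_score(tokens, term_score, aggregate=mean, lexicon=TOY_SWN):
    """Score each known term, then combine with `aggregate` (mean or max)."""
    scores = [term_score(lexicon[t]) for t in tokens if t in lexicon]
    return aggregate(scores) if scores else 0.0

def classify(tokens, term_score=term_score_mean_diff, aggregate=mean):
    # A score of exactly 0 defaults to "positive"; this tie rule is arbitrary.
    score = document_score(tokens, term_score, aggregate)
    return "positive" if score >= 0 else "negative"
```

Swapping `term_score` and `aggregate` gives the different combinations the abstract lists; restricting the lexicon to selected parts of speech would be a further filter on `tokens`.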

A novel classification approach based on Naïve Bayes for Twitter sentiment analysis

  • Song, Junseok;Kim, Kyung Tae;Lee, Byungjun;Kim, Sangyoung;Youn, Hee Yong
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 11, No. 6
    • /
    • pp.2996-3011
    • /
    • 2017
  • With the rapid growth of web technology and the spread of smart devices, social networking services (SNS) are widely used. As a result, a huge amount of data is generated from SNS such as Twitter, and sentiment analysis of SNS data is very important for various applications and services. In existing sentiment analysis based on the Naïve Bayes algorithm, the same number of attributes is usually employed to estimate the weight of each class. Moreover, uncountable and meaningless attributes are included. This decreases the accuracy of sentiment analysis. This paper proposes two methods to resolve these issues: one reflects the difference between the numbers of positive and negative words in calculating the weights, and the other eliminates insignificant words in the feature selection step using the Multinomial Naïve Bayes (MNB) algorithm. A performance comparison demonstrates that the proposed scheme significantly increases accuracy compared to the existing Multivariate Bernoulli Naïve Bayes (BNB) algorithm and the MNB scheme.
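
For orientation, the baseline MNB scheme that this abstract compares against can be sketched as below. This is the textbook multinomial Naïve Bayes with add-one (Laplace) smoothing, not the weighting or feature-selection method the paper proposes; the function names and the toy data in the usage note are assumptions.

```python
import math
from collections import Counter

def train_mnb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on tokenized documents."""
    classes = sorted(set(labels))
    vocab = {w for d in docs for w in d}
    word_counts = {c: Counter() for c in classes}
    doc_counts = Counter(labels)
    for d, y in zip(docs, labels):
        word_counts[y].update(d)
    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for c in classes:
        model["log_prior"][c] = math.log(doc_counts[c] / len(docs))
        # Smoothed denominator: class token count plus alpha per vocabulary word.
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        model["log_lik"][c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return model

def predict_mnb(model, doc):
    """Return the class with the highest log posterior; unseen words are skipped."""
    def score(c):
        return model["log_prior"][c] + sum(
            model["log_lik"][c][w] for w in doc if w in model["vocab"]
        )
    return max(model["log_prior"], key=score)
```

For example, after `model = train_mnb([["great", "movie"], ["terrible", "movie"]], ["pos", "neg"])`, calling `predict_mnb(model, ["great"])` selects the class whose smoothed word likelihoods best match the input.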

Multi-Topic Sentiment Analysis using LDA for Online Review - Focusing on the TripAdvisor Case -

  • 홍태호;니우한잉;임강;박지영
    • 한국정보시스템학회지:정보시스템연구
    • /
    • Vol. 27, No. 1
    • /
    • pp.89-110
    • /
    • 2018
  • Purpose: Customer reviews contain much information, but finding the key information in a large volume of text is not easy. Business decision makers need a model to solve this problem. In this study we propose a multi-topic sentiment analysis approach using Latent Dirichlet Allocation (LDA) for user-generated content (UGC). Design/methodology/approach: We collected a total of 104,039 hotel reviews for seven of the world's top tourist destinations from TripAdvisor (www.tripadvisor.com) and extracted 30 hotel-related topics from all customer reviews using the LDA model. Six major dimensions (value, cleanliness, rooms, service, location, and sleep quality) were selected from the 30 extracted topics. The data were analyzed in R. Findings: This study contributes a lexicon-based sentiment analysis approach for the keyword-embedded sentences related to the six dimensions within a review. The performance of the proposed model was evaluated by comparing the sentiment analysis results for each topic with the actual attribute ratings provided by the platform. The results show strong performance, with high accuracy and recall. The proposed model is expected to make it possible to analyze customers' sentiments over different topics for reviews that lack detailed attribute ratings.
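
The idea of scoring only the keyword-embedded sentences for each dimension can be sketched as follows. The abstract states that the authors worked in R with an LDA-derived keyword set; this Python sketch uses hypothetical keyword and sentiment word lists (`DIMENSION_KEYWORDS`, `POSITIVE`, `NEGATIVE`) and a simple count-based polarity, all of which are assumptions.

```python
import re

# Hypothetical keywords per dimension; in the paper these come from LDA topics.
DIMENSION_KEYWORDS = {
    "cleanliness": {"clean", "dirty", "spotless"},
    "service": {"staff", "service", "reception"},
    "location": {"location", "downtown", "walk"},
}
# Hypothetical sentiment lexicon, for illustration only.
POSITIVE = {"friendly", "helpful", "great", "clean", "spotless", "comfortable"}
NEGATIVE = {"dirty", "rude", "noisy", "terrible"}

def split_sentences(review):
    """Split on sentence-ending punctuation and drop empty pieces."""
    return [s.strip() for s in re.split(r"[.!?]+", review) if s.strip()]

def tokenize(sentence):
    return re.findall(r"[a-z']+", sentence.lower())

def dimension_sentiment(review):
    """Sum sentence polarity for each dimension whose keyword the sentence contains."""
    scores = {}
    for sentence in split_sentences(review):
        tokens = tokenize(sentence)
        polarity = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
        for dim, keywords in DIMENSION_KEYWORDS.items():
            if keywords & set(tokens):
                scores[dim] = scores.get(dim, 0) + polarity
    return scores
```

Dimensions that no sentence mentions are simply absent from the result, which mirrors the case of reviews without detailed attribute ratings.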

A Study on the Psychological Counseling AI Chatbot System based on Sentiment Analysis

  • 안세훈;정옥란
    • 한국IT서비스학회지
    • /
    • Vol. 20, No. 3
    • /
    • pp.75-86
    • /
    • 2021
  • As artificial intelligence is actively studied, chatbot systems are being applied to various fields. In particular, many chatbot systems for psychological counseling have been studied that can comfort modern people. However, while most psychological counseling chatbots are studied as rule-based or deep learning-based chatbots, each type has large limitations. To overcome the limitations of psychological counseling using such chatbots, we propose a novel psychological counseling AI chatbot system. The proposed system consists of a GPT-2 model that generates output sentences for Korean input sentences and an Electra model that performs sentiment analysis and anxiety cause classification, so that psychological tests and collective intelligence functions can be provided. While the deep learning-based chatbot converses with the user, sentiment analysis of the input sentences recognizes the user's emotions and presents psychological tests and collective intelligence solutions, addressing the limitations of counseling conducted by a chatbot alone. Since sentiment analysis and anxiety cause classification link the functions together and are important to the operation of the proposed system, we evaluate the performance of those parts. We verify the novelty and accuracy of the proposed system. The results also show that the AI chatbot system can perform counseling well.

Sentiment Analysis on Global Events under Pandemic of COVID-19

  • Junjun, Zhang;Noh, Giseop
    • International Journal of Advanced Culture Technology
    • /
    • Vol. 10, No. 3
    • /
    • pp.272-280
    • /
    • 2022
  • During the last few years, the COVID-19 pandemic has been a global issue. Under COVID-19, global events have been restricted or canceled to protect public hygiene and safety. Since the Olympic Games are one of the largest global events, we selected the recent Olympic Games as our case for analysis. The Tokyo Olympic Games (TOG) were held in 2021 but encountered a once-in-a-millennium disaster, the COVID-19 pandemic. In such a special period, it is of great significance to explore the emotional tendency of global views before and TOG via artificial intelligence. This paper collects a large volume of TOG comment data from mainstream websites in South Korea, China, and the United States by implementing a crawler program for sentiment analysis (SA). We use a variety of sentiment analysis models and compare the accuracy of the experimental results to obtain more reliable SA results. In addition, to reduce the distortion of opinion by a minority in the prediction results, we introduce an algorithm called "Removing Biased Minority Opinions (RBMO)" and show how to apply this method to the interpretation domain. Through our method, more authoritative SA results were obtained, which in turn provided a basis for predicting the sentiment tendency of countries around the world toward TOG during the COVID-19 epidemic.

Sentiment Analysis System Using Stanford Sentiment Treebank

  • 이성욱
    • Journal of Advanced Marine Engineering and Technology
    • /
    • Vol. 39, No. 3
    • /
    • pp.274-279
    • /
    • 2015
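  • This study implements a sentiment classification system using the Stanford Sentiment Treebank, with Support Vector Machines as the classifier, and classifies text into three sentiments: positive, neutral, and negative. The sentiment sentences were first tagged with parts of speech and then annotated with dependency structures. All nodes of the treebank and their sentiment tags were extracted automatically, and a sentence-level and a node-level support vector classification system were each implemented. The features were combinations of words, parts of speech, sentiment words, dependency relations, sibling relations, and others. In three-class classification on the evaluation corpus, accuracy was 74.2% at the node level and 67.0% at the sentence level; in two-class classification, however, performance was roughly comparable to the best currently known system.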

Analysis on the Relationship between Consumer Sentiment and Macro-economic Indices by Consumer's Characteristics

  • 김영준;신석하
    • 한국산학기술학회논문지
    • /
    • Vol. 17, No. 11
    • /
    • pp.474-482
    • /
    • 2016
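  • This paper notes that the economic conditions consumers perceive can diverge from conventional macroeconomic indicators such as the economic growth rate, and that they differ in particular by consumer characteristics such as income, age, and employment status; on that basis it examines which factors affect consumers' perceived economic conditions. As the variable representing perceived economic conditions, we used the consumer sentiment index (CSI) for current living conditions from the Bank of Korea's Consumer Survey, and as the indicator of overall real economic conditions, the quarter-on-quarter growth rate of seasonally adjusted gross domestic product. As further factors affecting individual consumers' perceived economic conditions, we considered wages, the job openings-to-applicants ratio, housing sale prices, the stock price index, the living-cost price index, and the household debt repayment burden; the analysis confirmed that these macroeconomic indicators have a considerable effect on perceived economic conditions. In particular, when consumers were grouped by characteristics such as income, age, and employment status, the degree to which these indicators affect perceived economic conditions was found to differ across consumer groups. Moreover, even for past periods in which perceived economic conditions diverged greatly from the real-economy indicator represented by the economic growth rate, additionally considering these indicators, such as wages and the job openings-to-applicants ratio, was found to explain a considerable part of the divergence.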

Sentiment Analysis of Korean Using Effective Linguistic Features and Adjustment of Word Senses

  • Jang, Ha-Yeon;Shin, Hyo-Pil
    • 한국언어정보학회지:언어와정보
    • /
    • Vol. 14, No. 2
    • /
    • pp.33-46
    • /
    • 2010
  • This paper introduces a new linguistic-focused approach for sentiment analysis (SA) of Korean. In order to overcome shortcomings of previous works that focused mainly on statistical methods, we made effective use of various linguistic features reflecting the nature of Korean. These features include contextual shifters, modal affixes, and the morphological dependency of chunk structures. Moreover, in order to eschew possible confusion caused by ambiguous words and to improve the results of SA, we also proposed simple adjustment methods of word senses using KOLON ontology mapping information. Through experiments we contend that effective use of linguistic features and ontological information can improve the results of sentiment analysis of Korean.

Comparison of Sentiment Analysis from Large Twitter Datasets by Naïve Bayes and Natural Language Processing Methods

  • Back, Bong-Hyun;Ha, Il-Kyu
    • Journal of information and communication convergence engineering
    • /
    • Vol. 17, No. 4
    • /
    • pp.239-245
    • /
    • 2019
  • Recently, efforts to obtain various information from the vast amount of social network service (SNS) big data generated in daily life have expanded. SNS big data comprise sentences classified as unstructured data, which complicates data processing. As the amount of processing increases, a rapid processing technique is required to extract valuable information from SNS big data. We herein propose a system that can extract human sentiment information from vast amounts of unstructured SNS big data using the naïve Bayes algorithm and natural language processing (NLP). Furthermore, we analyze the effectiveness of the proposed method through various experiments. In the sentiment accuracy analysis, the machine learning method using the naïve Bayes algorithm achieved 63.5% accuracy, which was lower than that of the NLP method. In the data processing speed analysis, however, the naïve Bayes method processed data approximately 5.4 times faster than the NLP method.