• Title/Summary/Keyword: Sentiment Evaluation


Evaluation of the Discordance between Sentence Polarities and Keyword Polarities by Using MUSE Sentiment-Annotated Corpora (MUSE 감성주석코퍼스를 활용한 문장 극성과 키워드 극성간의 불일치 현상에 대한 분석)

  • Cho, Donghee;Shin, Donghyok;Joo, Heejin;Chae, Byoungyeol;Cao, Wenkai;Nam, Jeesun
    • Annual Conference on Human and Language Technology / 2016.10a / pp.195-200 / 2016
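  • This study uses the MUSE sentiment-annotated corpus to analyze how often the polarity of a sentence agrees or disagrees with the polarity of its keywords, and argues for the need to study, in particular, the types in which the two are discordant. From the MUSE sentiment-annotated corpus built by DICORA, we extracted 1,257 positive and 1,935 negative sentences from the IT review domain, and 2,418 positive and 432 negative sentences from the restaurant review domain. After building an LGG with UNITEX and applying it to this corpus, we found that keywords of the opposite polarity appeared in positive or negative sentences at a rate of about 4~16% in the two domains, while cases in which polarity was expressed at the phrase or sentence level rather than by a single keyword occurred at a relatively high rate of about 25~40%. This shows that a reliable automatic opinion classification system becomes feasible only when, rather than relying on keyword polarity, meaningful research is carried out on the cases where sentence polarity and keyword polarity disagree, such as types in which a polarity shifting device (PSD) reverses the polarity of the whole sentence, or types in which no polarity word is present in the sentence but polarity is expressed at the phrase or sentence level.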
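The comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' UNITEX/LGG pipeline: the toy lexicon, the function names `classify` and `discordance_rates`, and the three category labels are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy lexicon; not taken from the MUSE corpus or DICORA resources.
LEXICON = {"good": "pos", "great": "pos", "slow": "neg", "bad": "neg"}


def classify(sentence_label, tokens, lexicon=LEXICON):
    """Compare a sentence's annotated polarity with its keyword polarities."""
    polarities = {lexicon[t] for t in tokens if t in lexicon}
    opposite = "neg" if sentence_label == "pos" else "pos"
    if opposite in polarities:
        return "opposite"      # a keyword contradicts the sentence label
    if polarities:
        return "concordant"    # every keyword found agrees with the label
    return "no_keyword"        # polarity carried by a phrase or the whole sentence


def discordance_rates(samples):
    """samples: iterable of (sentence_label, tokens); returns category shares."""
    counts = Counter(classify(label, tokens) for label, tokens in samples)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {k: counts[k] / total for k in ("concordant", "opposite", "no_keyword")}
```

In this sketch a sentence that contains both agreeing and opposite keywords is counted as "opposite"; a real study would need to decide that case explicitly.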


Evaluation of Sentimental Texts Automatically Generated by a Generative Adversarial Network (생성적 적대 네트워크로 자동 생성한 감성 텍스트의 성능 평가)

  • Park, Cheon-Young;Choi, Yong-Seok;Lee, Kong Joo
    • KIPS Transactions on Software and Data Engineering / v.8 no.6 / pp.257-264 / 2019
  • Recently, deep neural network based approaches have performed well across various fields of natural language processing. Building a deep neural network model requires a huge amount of training data, but collecting a large training set is costly and time-consuming. Data augmentation is one solution to this problem. Augmenting text data is more difficult than augmenting image data because texts consist of tokens with discrete values. Generative adversarial networks (GANs) are widely used for image generation. In this work, we generate sentimental texts with one such GAN, the CS-GAN model, which has a classifier as well as a discriminator. We evaluate the usefulness of the generated sentimental texts according to various measurements. The CS-GAN model can not only generate more diverse texts but also improve the performance of its classifier.

A Study on the Characteristic Analysis of Local Informatization in Chungcheongbuk-do: Focus on text mining (충청북도의 지역정보화 특성 분석에 관한 연구: 텍스트마이닝 중심)

  • Lee, Junghwan;Park, Soochang;Lee, Euisin
    • The Journal of the Korea Contents Association / v.21 no.10 / pp.67-77 / 2021
  • This study applied text mining, specifically topic modeling, association analysis, and sentiment analysis, in order to reflect regional characteristics in the process of establishing an informatization plan for Chungcheongbuk-do. The analysis confirmed that educational activities to bridge the information gap account for a relatively high proportion in Chungcheongbuk-do, and that the province is interested in improving infrastructure to provide non-face-to-face, contactless administrative services and to bridge the gap between urban and rural areas. In addition, it is worth noting the positive evaluation of the combination of bio and IT in the regional strategic industry and of examples of ICT innovation services. For smart cities, there are high expectations for the establishment of various cooperation systems with IT companies, but continuous crisis management is necessary so that they do not become entangled in political issues. It is hoped that the results of this study can be used as one method for concretely reflecting regional changes in the process of informatization.

Establishing the Strategy of Effective Construction VE for Construction Firms (건설기업 관점의 효과적인 시공 VE 수행을 위한 전략 도출 연구)

  • Park, Chan Young;Yun, Sungmin;Lee, Dong-Eun
    • Korean Journal of Construction Engineering and Management / v.22 no.2 / pp.80-87 / 2021
  • Shortage of SOC budget and inappropriate initial construction cost planning have worsened the economic sentiment of the construction firm. Construction VE can be one of the solutions for improving the profitability of construction projects. This study identifies the strong and weak points of construction firms for establishing the strategy of effective construction VE by using importance-performance analysis. As a result, construction firms have strong points on support, cooperation, and knowledge about construction VE, but have weak points on 'VE experience of VE leader', 'Detailed cost estimation', and 'Idea generation and evaluation'. This paper contributes to establishing the strategy of effective construction VE from the perspective of the construction firm, which is differentiated from previous studies that have focused on the institutional approach for construction VE.

Development of Deep Learning Ensemble Modeling for Cryptocurrency Price Prediction : Deep 4-LSTM Ensemble Model (암호화폐 가격 예측을 위한 딥러닝 앙상블 모델링 : Deep 4-LSTM Ensemble Model)

  • Choi, Soo-bin;Shin, Dong-hoon;Yoon, Sang-Hyeak;Kim, Hee-Woong
    • Journal of Information Technology Services / v.19 no.6 / pp.131-144 / 2020
  • As blockchain technology attracts attention, interest in the cryptocurrency received as a reward is also increasing. Investments and transactions continue on the expectation that the value of cryptocurrency will rise. Accordingly, cryptocurrency price prediction has been attempted through artificial intelligence technology and social sentiment analysis. The purpose of this paper is to develop a deep learning ensemble model for predicting the price fluctuations and the one-day lag price of cryptocurrency, based on the design science research method. The paper performs predictive modeling on Ethereum, aiming to make predictions more efficiently and accurately than existing models. To that end, it collects five years of data related to the Ethereum price and pre-processes it with customized functions. In the model development stage, four LSTM models, which are efficient for time series data processing, are combined into an ensemble model using the optimal combination of hyperparameters found in the experimental process. Then, based on the performance evaluation scale, the superiority of the model is evaluated through comparison with other deep learning models. As a practical contribution, the results provide a model that shows high performance and a high prediction rate for cryptocurrency prices and price fluctuations. As an academic contribution, the paper improves the quality of research by following design science research procedures, which solve scientific problems and create and evaluate new and innovative products in the field of information systems.

An Efficient Estimation of Place Brand Image Power Based on Text Mining Technology (텍스트마이닝 기반의 효율적인 장소 브랜드 이미지 강도 측정 방법)

  • Choi, Sukjae;Jeon, Jongshik;Subrata, Biswas;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.113-129 / 2015
  • Location branding is a very important income-generating activity: it gives special meaning to a specific location while producing identity and communal value grounded in an understanding of the place branding concept and methodology. Many other areas, such as marketing, architecture, and city construction, influence the creation of an impressive brand image. A place brand that is well recognized by both South Koreans and foreigners creates significant economic effects. There has been research on creating a strategic and detailed place brand image, and the representative study was carried out by Anholt, who surveyed two million people from 50 different countries. However, such investigations, including survey research, require a great deal of labor and significant expense. As a result, there is a need for more affordable, objective, and effective research methods. The purpose of this paper is to find a way to measure the intensity of a brand image objectively and at low cost through text mining. The proposed method extracts keywords and the factors that make up the location brand image from related web documents. In this way, we can measure the brand image intensity of a specific location. The performance of the proposed methodology was verified through comparison with Anholt's image consistency index ranking of 50 cities around the world. Four methods are applied in the test. First, the RANDOM method ranks the cities included in the experiment at random. The HUMAN method first prepares a questionnaire and selects 9 volunteers who are well acquainted with both brand management and the cities to be evaluated. They are then asked to rank the cities, and their rankings are compared with Anholt's evaluation results. The TM method applies the proposed method to evaluate the cities with all evaluation criteria. 
TM-LEARN, an extension of TM, selects significant evaluation items from the items in every criterion and then evaluates the cities with all selected evaluation criteria. RMSE is used as the metric to compare the evaluation results. The experimental results are as follows. First, the proposed method was more accurate than the evaluation method that targets ordinary people. Second, compared to the traditional survey method, the time and cost are much lower because the method is automated. Third, the proposed methodology is timely because the evaluation can be repeated at any time. Fourth, whereas Anholt's method evaluates only pre-specified cities, the proposed methodology is applicable to any location. Finally, the proposed methodology is relatively objective because the research was conducted on open source data. As a result, our text mining approach to city image evaluation proved valid in terms of accuracy, cost-effectiveness, timeliness, scalability, and reliability. The proposed method provides managers with clear guidelines regarding brand management in the public and private sectors. In the public sector, for example, local officers could use the proposed method to formulate strategies and enhance the image of their places efficiently. Rather than conducting heavy questionnaires, local officers could quickly monitor the current place image in advance, and then decide to proceed to the formal place image test only if the results from the proposed method are out of the ordinary, whether they indicate an opportunity or a threat to the place. 
Moreover, by combining the proposed method with morphological analysis, extraction of meaningful facets of the place brand from text, sentiment analysis, and more, marketing strategy planners or civil engineering professionals may obtain deeper and richer insights for better place brand images. In the future, a prototype system will be implemented to show the feasibility of the idea proposed in this paper.
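The abstract names RMSE as the metric for comparing each method's city ranking against the reference ranking. A minimal Python sketch of that comparison follows; the function name `rank_rmse`, the dictionary layout, and the choice to compute the error directly on rank positions are assumptions, since the abstract does not say how the metric was applied.

```python
import math


def rank_rmse(predicted, reference):
    """Root-mean-square error between two rankings of the same cities.

    Both arguments map a city name to its rank (1 = best). Only cities
    present in the reference ranking are compared.
    """
    squared_errors = [(predicted[city] - reference[city]) ** 2 for city in reference]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A lower value means the method's ranking sits closer to the reference ranking, so the four methods can be ordered by this single number.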

A Two-Stage Learning Method of CNN and K-means RGB Cluster for Sentiment Classification of Images (이미지 감성분류를 위한 CNN과 K-means RGB Cluster 이-단계 학습 방안)

  • Kim, Jeongtae;Park, Eunbi;Han, Kiwoong;Lee, Junghyun;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.139-156 / 2021
  • The main reason for using a deep learning model in image classification is that it can consider the relationships between regions by extracting each region's features from the overall information of the image. However, a CNN model may not be suitable for emotional image data that lack such regional features. To address the difficulty of classifying emotion images, researchers propose CNN-based architectures suited to emotion images every year. Studies on the relationship between color and human emotion have also been conducted, showing that different colors induce different emotions. Among studies using deep learning, some have applied color information to image emotion classification. Using the image's color information in addition to the image itself improves the accuracy of classifying image emotions compared with training the classification model on the image alone. This study proposes two ways to increase accuracy by adjusting the result value after the model classifies an image's emotion. Both methods improve accuracy by modifying the result value based on statistics of the colors in the picture. The two-color combinations most distributed across all training data are found in advance; at test time, the two-color combination most distributed in each test image is found, and the result values are corrected according to the color combination distribution. The method weights the result value obtained after the model classifies an image's emotion using expressions based on the log function and the exponential function. Emotion6, classified into six emotions, and Artphoto, classified into eight categories, were used as the image data. Densenet169, Mnasnet, Resnet101, Resnet152, and Vgg19 architectures were used for the CNN model, and performance was compared before and after applying the two-stage learning to the CNN model. 
Inspired by color psychology, which deals with the relationship between colors and emotions, we studied how to improve the accuracy of a model that classifies an image's sentiment by modifying the result values based on color. Sixteen colors were used: red, orange, yellow, green, blue, indigo, purple, turquoise, pink, magenta, brown, gray, silver, gold, white, and black, each of which carries a meaning. Using Scikit-learn's clustering, the seven colors that are primarily distributed in the image are identified. Then, the RGB coordinate values of those colors are compared with the RGB coordinate values of the 16 colors listed above, and each is converted to the closest of the 16 colors. If combinations of three or more colors are selected, too many combinations occur and the distribution becomes scattered, so each combination has less influence on the result value. Therefore, to solve this problem, two-color combinations were found and used to weight the model's output. Before training, the most distributed color combinations were found for all training data images. The distribution of color combinations for each class was stored in a Python dictionary to be used during testing. During the test, the two-color combination most distributed in each test image is found. We then checked how that color combination was distributed in the training data and corrected the result accordingly. We devised several equations to weight the result value from the model based on the extracted color as described above. The data set was randomly split 80:20, and the model was verified using the 20% as a test set. The remaining 80% was split into five folds for 5-fold cross-validation, so the model was trained five times using a different validation fold each time. Finally, performance was checked on the test dataset that had been set aside. 
Adam was used as the optimizer, and the learning rate was set to 0.01. Training ran for up to 20 epochs, and if the validation loss did not decrease for five epochs, training was stopped. Early stopping was set to load the model with the best validation loss. Classification accuracy was better when the information extracted from color properties was used together with the CNN than when only the CNN architecture was used.
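The color step in this abstract (map each dominant color to the closest named color, then keep the two most widely distributed colors) can be sketched in plain Python. This is an illustration under assumptions, not the authors' code: the clustering step is taken as already done, the palette is a seven-color subset with assumed RGB values rather than the paper's sixteen, and the names `nearest_color` and `top_two_combination` are hypothetical.

```python
from collections import Counter

# Assumed RGB values for a subset of the named colors; the paper's exact
# coordinates for its 16 colors are not given in the abstract.
PALETTE = {
    "red": (255, 0, 0), "yellow": (255, 255, 0), "green": (0, 128, 0),
    "blue": (0, 0, 255), "gray": (128, 128, 128),
    "white": (255, 255, 255), "black": (0, 0, 0),
}


def nearest_color(rgb, palette=PALETTE):
    """Return the palette name with the smallest squared RGB distance."""
    return min(
        palette,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, palette[name])),
    )


def top_two_combination(cluster_centers, cluster_sizes):
    """Map cluster centers to named colors and return the two dominant ones.

    cluster_centers: RGB triples (e.g. from a K-means run on the pixels).
    cluster_sizes: number of pixels assigned to each center.
    """
    weight = Counter()
    for center, size in zip(cluster_centers, cluster_sizes):
        weight[nearest_color(center)] += size
    top = [name for name, _ in weight.most_common(2)]
    return tuple(sorted(top))  # sorted so (a, b) and (b, a) are the same key
```

The sorted tuple can serve as a dictionary key for the per-class combination counts that the abstract says were stored for use at test time.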

KB-BERT: Training and Application of Korean Pre-trained Language Model in Financial Domain (KB-BERT: 금융 특화 한국어 사전학습 언어모델과 그 응용)

  • Kim, Donggyu;Lee, Dongwook;Park, Jangwon;Oh, Sungwoo;Kwon, Sungjun;Lee, Inyong;Choi, Dongwon
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.191-206 / 2022
  • Using a pre-trained language model (PLM) has recently become the de facto approach to achieving state-of-the-art performance on various natural language tasks (called downstream tasks) such as sentiment analysis and question answering. However, like any other machine learning method, a PLM tends to depend on the data distribution seen during the training phase and performs worse on unseen (Out-of-Distribution) domains. For this reason, there have been many efforts to develop domain-specific PLMs for various fields such as the medical and legal industries. In this paper, we discuss the training of a finance domain-specific PLM for the Korean language and its applications. Our finance domain-specific PLM, KB-BERT, is trained on a carefully curated financial corpus that includes domain-specific documents such as financial reports. We provide extensive performance evaluation results on three natural language tasks: topic classification, sentiment analysis, and question answering. Compared to state-of-the-art Korean PLMs such as KoELECTRA and KLUE-RoBERTa, KB-BERT shows comparable performance on general datasets based on common corpora like Wikipedia and news articles. Moreover, KB-BERT outperforms the compared models on finance domain datasets that require finance-specific knowledge to solve the given problems.

Monitoring Mood Trends of Twitter Users using Multi-modal Analysis method of Texts and Images (텍스트 및 영상의 멀티모달분석을 이용한 트위터 사용자의 감성 흐름 모니터링 기술)

  • Kim, Eun Yi;Ko, Eunjeong
    • Journal of the Korea Convergence Society / v.9 no.1 / pp.419-431 / 2018
  • In this paper, we propose a novel method for monitoring the mood trends of Twitter users by analyzing their daily tweets over a long period. To understand the tweets more accurately, we analyze all types of content in them, i.e., texts, emoticons, and images, and thus develop a multimodal sentiment analysis method. In the proposed method, two single-modal analyses are first performed to extract the users' moods hidden in texts and images: a lexicon-based and learning-based text classifier and a learning-based image classifier. Thereafter, the moods extracted by the respective analyses are combined into a tweet mood and aggregated into a daily mood. As a result, the proposed method generates a daily mood flow graph for each user, which allows us to monitor users' mood trends more intuitively. For evaluation, we perform two sets of experiments. First, we collect a data set of 40,447 items and evaluate our method by comparing it with state-of-the-art techniques. The experiments demonstrate that the proposed multimodal analysis method outperforms the other baselines as well as our own methods using only text-based tweets or only images. Furthermore, to evaluate the potential of the proposed method for monitoring users' mood trends, we tested it with 40 depressive users and 40 normal users. The results show that the proposed method can be used effectively to find depressed users.
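The two-level aggregation in this abstract (per-modality moods combined into a tweet mood, tweet moods aggregated into a daily mood) can be sketched as follows. The abstract does not state the fusion rule, so the weighted average, the score range of -1 to 1, the `text_weight` parameter, and both function names are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean


def tweet_mood(text_score, image_score=None, text_weight=0.5):
    """Fuse text and image mood scores (assumed range -1..1) for one tweet."""
    if image_score is None:          # tweet has no image: use the text alone
        return text_score
    return text_weight * text_score + (1 - text_weight) * image_score


def daily_mood(tweets):
    """tweets: iterable of (day, text_score, image_score or None).

    Returns {day: mean tweet mood}, ordered by day, which is the series a
    daily mood flow graph would plot.
    """
    by_day = defaultdict(list)
    for day, text_score, image_score in tweets:
        by_day[day].append(tweet_mood(text_score, image_score))
    return {day: mean(scores) for day, scores in sorted(by_day.items())}
```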

Comparative Assessment of Corporate Philanthropy by the IPA Method: Service and Manufacturing Industries (IPA기법을 활용한 기업의 사회공헌활동 비교 평가: 서비스업 및 제조업을 중심으로)

  • Ko, Jeong-Yong;Park, Hyeon-Suk
    • Journal of Distribution Science / v.13 no.4 / pp.89-98 / 2015
  • Purpose - In today's globalized and modern business environment, corporate social responsibility (CSR) activities are considered to be essential for the sustainable development of enterprises. In addition, the corporate philanthropy that is related to CSR practices, as well as their being capable of reducing the anti-corporate sentiment of people have facilitated a qualitative forward leap into the quantitative growth phase. This study aims to undertake a comparative evaluation of corporate philanthropy through the Importance-Performance Analysis (IPA) method focusing on service and manufacturing industries, and to eventually determine a differentiated approach that is needed for corporate philanthropy. Research design, data, and methodology - The survey responses were collected through online research on specialized companies from consumers nationwide who were aged from 20 to 60 and who are aware of corporate philanthropy. A total of 408 sheets of questionnaire survey were used. Frequency analysis was undertaken in this study. The interviewees had demographic characteristics of gender: 206 males (50.5%) and 202 females (49.5%). They also had demographic characteristics of age: 82 people were over 20 (20.1%), 96 over 30 (23.5%), 105 over 40 (25.7%), and 125 over 50 (30.7%) years of age. The distribution of interviewees' residences is as follows: 154 persons (37.7%) in the Special City, 102 persons (25.0%) in the Metropolitan City, and 152 persons (37.3%) in the Provincial Region. The interviewees have been working for the following companies: 34 persons (8.3%) in LG Display, 80 (19.6%) in KT&G, 49 (12.0%) in Amore Pacific, 42 (10.3%) in KIA Motors, 47 (11.5%) in SBS, 52 (12.8%) in Shinhan Bank, 86 (21.1%) in Asiana Airlines, and 18 (4.4%) in Hyundai Department Store. We applied the paired t-test for the IPA analysis. PASW Statistics 18 was used for statistical analysis. 
Results - The results of IPA analysis indicated that the importance and performance degrees in both manufacturing and service industries were significantly different. Major empirical results showed that, in consumer, social, economic, philanthropic, and environmental dimensions, in the sub-factors of philanthropy activities in both manufacturing and service industries, the importance degree was found to be higher than performance degree. Further, the average difference between importance degree and performance degree by the sub-factors of philanthropy activities. On the other hand, the average difference of environmental dimension was found to be highest in both service and manufacturing industries. Thus, while consumers consider the philanthropy activities of the environmental dimension as most important, actual companies treat performance of philanthropy activities of the environmental dimension insufficiently or negligibly to some degree. Conclusions - The differentiated approach method that is required for corporate philanthropy may be proposed to uplift corporate accomplishments by analyzing the IPA of the attributes of the sub-factors of corporate philanthropy. This is, to an extent, insufficient in the existing studies related to the use of the IPA technique, and it shows the items that are to be conducted intensively.
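The importance-performance comparison described in this abstract can be sketched with the standard IPA grid, in which the mean importance and mean performance serve as crosshairs. The sketch below is a generic illustration in Python, not the authors' PASW procedure: the function name `ipa_quadrants`, the use of grand means as cut points, and the quadrant labels are assumptions, and it omits the paired t-test.

```python
from statistics import mean


def ipa_quadrants(scores):
    """Classify attributes on an importance-performance grid.

    scores: {attribute: (importance, performance)}.
    Returns {attribute: {"gap": importance - performance, "quadrant": label}}.
    """
    imp_mean = mean(imp for imp, _ in scores.values())
    perf_mean = mean(perf for _, perf in scores.values())
    result = {}
    for name, (imp, perf) in scores.items():
        if imp >= imp_mean and perf >= perf_mean:
            quadrant = "keep up the good work"
        elif imp >= imp_mean:
            quadrant = "concentrate here"     # important but under-performing
        elif perf >= perf_mean:
            quadrant = "possible overkill"
        else:
            quadrant = "low priority"
        result[name] = {"gap": imp - perf, "quadrant": quadrant}
    return result
```

An attribute with a large positive gap that lands in "concentrate here" corresponds to the pattern the abstract reports for the environmental dimension: rated important by consumers but performed insufficiently.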