• Title/Summary/Keyword: 부정어휘 (negative vocabulary)

Influence analysis of Internet buzz to corporate performance : Individual stock price prediction using sentiment analysis of online news (온라인 언급이 기업 성과에 미치는 영향 분석 : 뉴스 감성분석을 통한 기업별 주가 예측)

  • Jeong, Ji Seon;Kim, Dong Sung;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.37-51 / 2015
  • With the development of internet technology and the rapid growth of internet data, many studies are examining how to use and analyze internet data for various purposes. In recent years in particular, a number of studies have applied text mining techniques to overcome the limitations of analyses based only on structured data. Among these, sentiment analysis scores opinions based on the distribution of polarity, such as the positivity or negativity of words or sentences in documents. As part of this line of work, this study tries to predict rises and falls in company stock prices by performing sentiment analysis on online news about particular companies. A variety of news about companies is produced online by different economic agents, and it spreads quickly and is easily accessed on the Internet. Based on the inefficient market hypothesis, we can therefore expect that news about an individual company can be used to predict fluctuations in its stock price if proper data analysis techniques are applied. However, because companies differ in their areas of management activity, machine-learning-based analysis of text data must consider the characteristics of each company. In addition, since news containing positive or negative information about certain companies has varying impacts on other companies or industry fields, stock price prediction needs to be carried out company by company. This study therefore attempted to predict changes in the stock prices of individual companies by applying sentiment analysis to online news data. It selected top companies in the KOSPI 200 as the subjects of analysis, and collected and analyzed two years of online news for each company from Naver, a representative domestic search portal.
In addition, considering that the meaning of a word can differ from one economic subject to another, the study aims to improve performance by building a lexicon for each individual company and applying it in the analysis. The results show that prediction accuracy differs by company and is 56% on average. Comparing accuracy across industry sectors, 'energy/chemical', 'consumer goods for living' and 'consumer discretionary' showed relatively higher stock price prediction accuracy than other industries, while sectors such as 'information technology' and 'shipbuilding/transportation' showed lower accuracy. Because only five representative companies were collected for each industry, it is somewhat difficult to generalize, but the results confirm that prediction accuracy differs by industry sector. At the individual company level, companies such as 'Kangwon Land', 'KT & G' and 'SK Innovation' showed relatively high prediction accuracy, while companies such as 'Young Poong', 'LG', 'Samsung Life Insurance', and 'Doosan' showed low prediction accuracy of less than 50%. In this paper, we analyzed stock price prediction performance for individual companies using pre-built company-specific vocabularies in order to take advantage of online news information, with the aim of improving prediction performance. Building on this, future work may increase stock price prediction accuracy by addressing the problem of unnecessary words being added to the sentiment dictionary.
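The lexicon-based prediction described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the scoring rule, so the sketch assumes a simple one: sum the polarity weights of the lexicon words found in a company's news and predict "up" when the total is positive. The lexicon entries, weights, and function names are hypothetical and are not taken from the paper.

```python
def score_article(tokens, lexicon):
    """Sum the polarity weights of the tokens that appear in the lexicon."""
    return sum(lexicon.get(tok, 0.0) for tok in tokens)


def predict_direction(articles, lexicon):
    """Predict 'up' when the summed news sentiment is positive, else 'down'."""
    total = sum(score_article(tokens, lexicon) for tokens in articles)
    return "up" if total > 0 else "down"


# Hypothetical company-specific lexicon: word -> polarity weight (illustrative values).
company_lexicon = {"record": 1.0, "growth": 0.8, "lawsuit": -1.0, "recall": -0.9}

# Hypothetical tokenized news articles about one company on one day.
articles = [["record", "growth", "quarter"], ["lawsuit", "filed"]]

print(predict_direction(articles, company_lexicon))
```

Keeping one lexicon per company, as the abstract proposes, only changes which dictionary is passed in; the scoring function stays the same.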

Intelligent VOC Analyzing System Using Opinion Mining (오피니언 마이닝을 이용한 지능형 VOC 분석시스템)

  • Kim, Yoosin;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.113-125 / 2013
  • Every company wants to know its customers' requirements and makes an effort to meet them. For this reason, communication between customer and company has become a core competitive factor in business, and its importance is increasing continuously. There are several strategies for finding customers' needs, but VOC (Voice of Customer) is one of the most powerful communication tools, and VOC gathered through channels such as telephone, post, e-mail and websites is highly meaningful. Almost every company therefore gathers VOC and operates a VOC system. VOC is important not only to business organizations but also to public organizations such as governments, educational institutes, and medical centers, which must raise public service quality and customer satisfaction. Accordingly, they build systems for gathering and analyzing VOC and use them to create and improve products and services. In recent years, innovations in the internet and ICT have created diverse channels for collecting VOC data, such as SNS, mobile, websites and call centers. Although a great deal of VOC data is collected through these channels, using it properly is still difficult, because VOC data consists of highly emotional content, in voice or informal text, and its volume is very large. Such unstructured big data is difficult for humans to store and analyze. Organizations therefore need a system that automatically collects, stores, classifies and analyzes unstructured big VOC data. This study proposes an intelligent VOC analyzing system based on opinion mining that classifies unstructured VOC data automatically and determines the polarity as well as the type of VOC. The basis of the VOC opinion analyzing system, a domain-oriented sentiment dictionary, is then created, and the corresponding stages are presented in detail.
The experiment was conducted with 4,300 VOC records collected from a medical website to measure the effectiveness of the proposed system; the data were used to develop the sentiment dictionary by determining the domain-specific sentiment vocabulary and its polarity values in the medical domain. The experiment shows that positive terms such as "칭찬 (praise), 친절함 (kindness), 감사 (thanks), 무사히 (safely), 잘해 (does well), 감동 (moved), 미소 (smile)" have high positive opinion values, and negative terms such as "퉁명 (curt), 뭡니까 (what is this?), 말하더군요 (so they said), 무시하는 (ignoring)" have strongly negative opinion values. These terms are in general use, and the results suggest that they indicate opinion polarity with high probability. Furthermore, the accuracy of the proposed VOC classification model was compared across thresholds, and the highest classification accuracy, 77.8%, was confirmed at an opinion classification threshold of -0.50. With the proposed intelligent VOC analyzing system, real-time opinion classification and the response priority of VOC can be predicted. Ultimately, the system is expected to catch customer complaints at an early stage and handle them quickly with fewer staff operating the VOC system, freeing up human resources and time in customer service. Above all, this study is a new attempt to automatically analyze unstructured VOC data using opinion mining, and it shows that the system could be used as a variable to classify the positive or negative polarity of VOC opinion. It is expected to offer a practical framework for VOC analysis for diverse uses, and the model could serve as a real VOC analyzing system if implemented. Despite these results and expectations, this study has several limitations. First of all, the sample data were collected only from one hospital website, which means that the sentiment dictionary built from the sample data may be biased toward that hospital and website.
Therefore, future research should cover several channels, such as call centers and SNS, and other domains, such as government, financial companies, and educational institutes.
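The threshold-based polarity classification mentioned in the abstract above can be sketched as follows. The abstract reports only that the best accuracy was obtained at a threshold of -0.50; how term polarities are aggregated into a document score is not stated, so this sketch assumes the mean polarity of the dictionary terms found in the text. The polarity values and function names are hypothetical.

```python
def opinion_score(tokens, dictionary):
    """Mean polarity of the tokens found in the sentiment dictionary (0.0 if none)."""
    hits = [dictionary[tok] for tok in tokens if tok in dictionary]
    return sum(hits) / len(hits) if hits else 0.0


def classify_voc(tokens, dictionary, threshold=-0.50):
    """Label a VOC item 'negative' when its score falls below the threshold."""
    return "negative" if opinion_score(tokens, dictionary) < threshold else "positive"


# Terms are quoted in the abstract; the polarity values are illustrative assumptions.
sentiment_dictionary = {"칭찬": 0.9, "감사": 0.8, "퉁명": -0.9, "무시하는": -0.8}

complaint = ["퉁명", "무시하는"]   # hypothetical tokenized complaint
mixed = ["감사", "퉁명"]           # hypothetical tokenized mixed feedback

print(classify_voc(complaint, sentiment_dictionary))
print(classify_voc(mixed, sentiment_dictionary))
```

A threshold below zero means mildly negative items are still treated as positive, which is one way to reserve the "negative" label, and the associated response priority, for clear complaints.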

Sentiment Classification considering Korean Features (한국어 특성을 고려한 감성 분류)

  • Kim, Jung-Ho;Kim, Myung-Kyu;Cha, Myung-Hoon;In, Joo-Ho;Chae, Soo-Hoan
    • Science of Emotion and Sensibility / v.13 no.3 / pp.449-458 / 2010
  • As the need grows to obtain information efficiently from the many documents and reviews on the Internet across many fields, automatic classification of opinions or thoughts is required. This automatic classification is called sentiment classification, and it can be divided into three steps: subjective expression classification, which extracts subjective sentences from documents; sentiment classification, which determines whether the polarity of a document is positive or negative; and strength classification, which determines whether a document has weak or strong polarity. Recent studies in opinion mining have used N-gram words, lexical phrase patterns, syntactic phrase patterns, and so on, rather than single words, as features for classification. Patterns in particular have been used frequently as features because they are more flexible than N-gram words and more deterministic than single words. These studies are mainly concerned with English; studies using patterns for Korean are still at an early stage. Although in Korean the meaning of a predicate differs slightly as the ending ('Eomi' in Korean) of a declinable word changes, earlier studies on Korean opinion classification removed endings from predicates and extracted only the stems. This study introduces the earlier studies and pattern-based methods for English, uses sentiment patterns extracted from Korean documents, and classifies the polarity of these documents. It also analyzes the influence of changes in endings on the performance of opinion classification.

Zur Valenz deutscher verbaler Somatismen mit der Komponente 「Hand」 (독일어의 신체부위 "손" 관련 관용구의 결합가 연구)

  • Kim Soo-Nam
    • Koreanische Zeitschrift für Deutsche Sprachwissenschaft / v.4 / pp.1-27 / 2001
  • The aim of this article is to identify the morphological and syntactic structures of the 108 idioms involving 「Hand」 that are recorded in 「Duden Band 11」, among German idioms related to body-part vocabulary, and to model them. We restricted the objects of study to idioms that function as predicates in a sentence as valency carriers. According to the number and form of their complements, we divided the 「Hand」 idioms into three broad classes: monovalent, bivalent, and trivalent idioms. The forms of complements were restricted to noun phrases (Sn, Sd, Sa) and prepositional phrases (pS); sentence-form complements, for example subordinate clauses (NS) and infinitive clauses (Inf), were not considered. Whether these can be regarded as complements still requires more research, so we left the question as a future task. Analyzing the 108 「Hand」 idioms first by external valency (äußere Valenz) and second by internal valency (innere Valenz), we obtained the following morphosyntactic sentence patterns. Monovalent verbal idioms: 1) PL - Sn: (1) PL[VPL - Sa] - Sn (2) PL[VPL - pS] - Sn (3) PL[VPL - Sa - pS] - Sn (4) PL[VPL - pS - pS] - Sn; Sondergruppen: PL[VPL - Sa - Inf] - Sn, PL[VPL - pS - Inf] - Sn. 2) PL - Sd: (1) PL[VPL - Sn] - Sd (2) PL[VPL - Sn(es) - pS] - Sd. Bivalent verbal idioms: 1) PL - Sn - Sd: (1) PL[VPL - Sa] - Sn - Sd (2) PL[VPL - pS] - Sn - Sd (3) PL[VPL - Sa - pS] - Sn - Sd. 2) PL - Sn - pS: (1) PL[VPL - Sa] - Sn - pS (2) PL[VPL - pS] - Sn - pS (3) PL[VPL - Sa - pS] - Sn - pS. 3) PL[VPL - pS] - Sn - Sa. Trivalent verbal idioms: (1) PL[VPL - pS] - Sn - Sd - Sa (2) PL[VPL - pS] - Sn - Sa - pS (3) PL[VPL - Sa] - Sn - Sd - pS. As this classification shows, German has monovalent, bivalent, and trivalent idioms, and even idioms with the same external syntactic valency differ in their internal constituent structure. We hope that this article will provide a methodological basis that helps learners of German as a foreign language understand German idioms more accurately, and that it will also help (idiom) dictionaries describe idioms in a way that is easy for foreign learners to understand.

Constructing an Evaluation Set for Korean Sentiment Analysis Systems Incorporating the Category and the Strength of Sentiment (감성 강도를 고려한 감성 분석 평가집합 구축)

  • Kim, Do-Yeon;Wu, Yong;Park, Hyuk-Ro
    • The Journal of the Korea Contents Association / v.12 no.11 / pp.30-38 / 2012
  • Sentiment analysis is concerned with extracting and analyzing the different kinds of user sentiment expressed in a variety of social media such as blogs and Twitter. Although sentiment analysis techniques are actively studied these days, no evaluation set has yet been developed for Korean sentiment analysis. In this paper, we construct such an evaluation set. To evaluate sentiment analysis systems more thoroughly, each sentence in our evaluation set is tagged with the polarity of the sentiment as well as its category and strength. We divide sentiment into 7 positive categories and 15 negative categories, and each category is assigned a strength from 1 to 3. Our evaluation set consists of 3,270 sentences extracted from various social media. For each sentence, 5 human taggers assigned the category and the strength of the sentiment expressed in the sentence. The inter-tagger agreement ratio was 93% for polarity, 70% for category, and 58% for strength. The inter-tagger agreement of our evaluation set is slightly higher than that of other evaluation sets developed for German and Spanish. This result shows that our evaluation set can be used as a reliable resource for evaluating sentiment analysis systems.
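The abstract above reports inter-tagger agreement ratios but does not define how they were computed. As one plausible reading, the sketch below computes average pairwise agreement: for every sentence, the share of tagger pairs that assigned the same label, pooled over all sentences. The measure, the function name, and the sample labels are assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations


def pairwise_agreement(labels_per_sentence):
    """Fraction of tagger pairs, pooled over all sentences, that chose the same label."""
    agree = 0
    total = 0
    for labels in labels_per_sentence:
        for a, b in combinations(labels, 2):
            agree += (a == b)
            total += 1
    return agree / total if total else 0.0


# Hypothetical polarity tags from 5 taggers for 2 sentences.
polarity_tags = [
    ["pos", "pos", "pos", "pos", "pos"],   # all 10 pairs agree
    ["pos", "pos", "pos", "pos", "neg"],   # 6 of 10 pairs agree
]

print(pairwise_agreement(polarity_tags))
```

The same function can be applied separately to polarity, category, and strength labels, which would produce one ratio per annotation layer as in the abstract.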

Joke-Related Aspects and their Significance in Traditional Korean Funny Performing Arts (한국 전통연희에서의 재담의 양상과 그 의의)

  • Son, Tae-do
    • Journal of Korean Classical Literature and Education / no.32 / pp.29-61 / 2016
  • A joke (才談, 재담) is "the most interesting and witty language unit" in our speech. However, research on jokes is still in its early stages. Although jokes are related to witty and interesting talk, stories, songs and plays, the actual object of a joke is only witty and interesting talk: a joke is witty talk that is interesting or laughter-inducing. Many jokes can be found in the traditional Korean funny performing arts (演戱, 연희). This is because these art forms are performed in open yards, which made it necessary to amuse the audience, and amusement in turn required jokes. Jokes in the traditional funny performing arts can generally be classified as follows: 1) Jokes related to a situation: these include right words at a given situation, exaggerating words, diminishing words, deviancy words, and cause-effect words. 2) Jokes related to discourse: these include enumerating words, amplificatory words, contrasting words, fluently lying words, undeniable words, purposely unknowing words, and deliberately incorrect words. 3) Jokes related to vocabulary: these include synonyms, similar words, changed word-ordering words, and incorrect words. 4) Jokes related to pronunciation: these include homonyms and anti-homonyms. Although there may be other jokes, those presented above are the typical ones. A joke is "the result that human being can achieve when he/she has overcome natural and social difficulties and is left with only a free and creative spirit." Jokes are necessary in all ages and everywhere. Today, more varied and higher-level jokes can be created by developing the diversity of jokes in the traditional funny performing arts. I also expect new sorts of jokes to appear, because a joke always demands a creative spirit.

Exploring user experience factors through generational online review analysis of AI speakers (인공지능 스피커의 세대별 온라인 리뷰 분석을 통한 사용자 경험 요인 탐색)

  • Park, Jeongeun;Yang, Dong-Uk;Kim, Ha-Young
    • Journal of the Korea Convergence Society / v.12 no.7 / pp.193-205 / 2021
  • The AI speaker market is growing steadily, yet the satisfaction of actual users is only 42%. In this paper, we therefore collected reviews of the Amazon Echo Dot 3rd and 4th generation models to analyze what hinders the user experience, by examining how topics and sentiment change across generations of AI speakers. Using topic modeling, we identified the topics that make up the reviews of each generation and how they changed, and we examined how user sentiment toward those topics changed between generations through deep learning-based sentiment analysis. Topic modeling yielded five topics for each generation. For the 3rd generation, the topic representing general features of the speaker acted as a positive factor for the product, while user convenience features acted as a negative factor. Conversely, for the 4th generation, general features emerged as a negative factor and convenience features as a positive one. Methodologically, this analysis is significant in that it presents results that take into account not only lexical features but also the contextual features of the entire sentence.

Sentiment analysis on movie review through building modified sentiment dictionary by movie genre (영역별 맞춤형 감성사전 구축을 통한 영화리뷰 감성분석)

  • Lee, Sang Hoon;Cui, Jing;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.97-113 / 2016
  • With the growth of internet data and the rapid development of internet technology, "big data" analysis is actively conducted to analyze enormous volumes of data for various purposes. In recent years in particular, a number of studies have applied text mining techniques to overcome the limitations of existing structured data analysis. Sentiment analysis, one branch of text mining, is actively studied as a way to score opinions based on the distribution of polarity of the words in documents. Sentiment analysis usually relies on a sentiment dictionary that records the positivity and negativity of words. As part of this line of work, this study tries to construct a sentiment dictionary customized to a specific data domain. Using a common sentiment dictionary without considering the characteristics of the data domain cannot reflect contextual expressions used only in that domain, so a dictionary customized to the data domain can be expected to improve sentiment analysis efficiency. This study therefore suggests a way to construct a customized dictionary that reflects the characteristics of the data domain. Specifically, movie review data are divided by genre, and genre-customized dictionaries are constructed. The performance of the customized dictionaries in sentiment analysis is compared with that of a common sentiment dictionary. IMDb data are chosen as the subject of analysis, and movie reviews are categorized by genre. Six IMDb genres are selected: 'action', 'animation', 'comedy', 'drama', 'horror', and 'sci-fi'. The five highest-ranking and five lowest-ranking movies per genre are selected as the training data set, and two years of movie data, from September 2012 to June 2014, are collected as the test data set.
Using the SO-PMI (Semantic Orientation from Point-wise Mutual Information) technique, we build a customized sentiment dictionary per genre and compare prediction accuracy on review ratings. The analysis shows that prediction with the customized dictionaries improves accuracy. The overall performance improvement is 2.82% and is statistically significant. In particular, the customized dictionary for 'sci-fi' yields the highest accuracy improvement among the six genres. Although this study shows the usefulness of customized dictionaries in sentiment analysis, further studies are required to generalize the results. In this study, we consider only adjectives as additional terms in the customized sentiment dictionary; other parts of speech, such as verbs and adverbs, could be considered to improve sentiment analysis performance. We also need to apply customized sentiment dictionaries to other domains, such as product reviews.
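The SO-PMI step named in the abstract above can be sketched as follows. The abstract does not give the exact formulation, so this sketch assumes one common variant: the semantic orientation of a word is its PMI with the positive review set minus its PMI with the negative review set, which reduces to a log ratio of document frequencies. The smoothing constant, the function name, and the toy reviews are assumptions made for illustration.

```python
import math


def so_pmi(word, pos_docs, neg_docs, smoothing=0.01):
    """Semantic orientation: PMI(word, positive) - PMI(word, negative).

    With document frequencies df_pos and df_neg, this equals
    log2((df_pos * N_neg) / (df_neg * N_pos)); smoothing avoids division by zero.
    """
    df_pos = sum(word in doc for doc in pos_docs) + smoothing
    df_neg = sum(word in doc for doc in neg_docs) + smoothing
    return math.log2((df_pos * len(neg_docs)) / (df_neg * len(pos_docs)))


# Hypothetical tokenized reviews for one genre, each stored as a set of words.
pos_docs = [{"thrilling", "plot"}, {"thrilling", "cast"}]
neg_docs = [{"dull", "plot"}, {"dull", "cast"}]

for candidate in ("thrilling", "dull", "plot"):
    print(candidate, round(so_pmi(candidate, pos_docs, neg_docs), 2))
```

Words with a clearly positive or negative score would be candidates for the genre dictionary, while words that appear equally often in both sets score near zero and can be left out.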