Search | Korea Science

Sentiment Analysis of Korean Reviews Using CNN: Focusing on Morpheme Embedding (CNN을 적용한 한국어 상품평 감성분석: 형태소 임베딩을 중심으로)

Park, Hyun-jung;Song, Min-chae;Shin, Kyung-shik
- Journal of Intelligence and Information Systems, v.24 no.2, pp.59-83, 2018
With the increasing importance of sentiment analysis for grasping the needs of customers and the public, various types of deep learning models have been actively applied to English texts. In deep learning-based sentiment analysis of English texts, the natural language sentences in the training and test datasets are usually converted into sequences of word vectors before being fed into the models. Here, word vectors generally refer to vector representations of words obtained by splitting a sentence at space characters. There are several ways to derive word vectors, one of which is Word2Vec, used to produce the 300-dimensional Google word vectors from about 100 billion words of Google News data. These vectors have been widely used in studies of sentiment analysis of reviews from various fields such as restaurants, movies, laptops, and cameras. Unlike in English, the morpheme plays an essential role in sentiment analysis and sentence structure analysis in Korean, a typical agglutinative language with developed postpositions and endings. A morpheme can be defined as the smallest meaningful unit of a language, and a word consists of one or more morphemes. For example, the word '예쁘고' consists of the morphemes '예쁘' (adjective) and '고' (connective ending). Given the significance of Korean morphemes, it seems reasonable to adopt the morpheme as the basic unit in Korean sentiment analysis. Therefore, in this study, we use the 'morpheme vector' as the input to a deep learning model rather than the 'word vector' that is mainly used for English text. A morpheme vector is a vector representation of a morpheme and can be derived by applying an existing word vector derivation mechanism to sentences divided into their constituent morphemes. This raises several questions. What is the desirable range of POS (Part-Of-Speech) tags when deriving morpheme vectors to improve the classification accuracy of a deep learning model? 
Is it appropriate to apply a typical word vector model, which relies primarily on the form of words, to Korean, which has a high ratio of homonyms? Will text preprocessing such as correcting spelling or spacing errors affect classification accuracy, especially when deriving morpheme vectors from Korean product reviews with many grammatical mistakes and variations? We seek empirical answers to these fundamental issues, which are likely to be encountered first when applying various deep learning models to Korean texts. As a starting point, we summarize these issues in three central research questions. First, which is more effective as the initial input of a deep learning model: morpheme vectors from grammatically correct texts of a domain other than the analysis target, or morpheme vectors from considerably ungrammatical texts of the same domain? Second, what is an appropriate morpheme vector derivation method for Korean with regard to the range of POS tags, homonyms, text preprocessing, and minimum frequency? Third, can we achieve a satisfactory level of classification accuracy when applying deep learning to Korean sentiment analysis? To address these research questions, we generate various types of morpheme vectors reflecting the questions and then compare classification accuracy using a non-static CNN (Convolutional Neural Network) model that takes the morpheme vectors as input. For the training and test datasets, 17,260 cosmetics product reviews from Naver Shopping are used. To derive morpheme vectors, we use data from the same domain as the target and data from another domain: about 2 million cosmetics product reviews from Naver Shopping and 520,000 Naver News data, arguably corresponding to Google's News data. The six primary sets of morpheme vectors constructed in this study differ in terms of the following three criteria. 
First, they come from two types of data source: Naver News, with high grammatical correctness, and Naver Shopping's cosmetics product reviews, with low grammatical correctness. Second, they differ in the degree of data preprocessing: sentence splitting only, or sentence splitting followed by additional spelling and spacing corrections. Third, they vary in the form of input fed into the word vector model: the morphemes alone, or the morphemes with their POS tags attached. The morpheme vectors further vary depending on the range of POS tags considered, the minimum frequency of morphemes included, and the random initialization range. All morpheme vectors are derived through the CBOW (Continuous Bag-Of-Words) model with a context window of 5 and a vector dimension of 300. The results suggest that using same-domain text even with a lower degree of grammatical correctness, performing spelling and spacing corrections in addition to sentence splitting, and incorporating morphemes of all POS tags, including the incomprehensible category, lead to better classification accuracy. The POS tag attachment, devised to handle the high proportion of homonyms in Korean, and the minimum frequency threshold for including a morpheme do not appear to have any definite influence on classification accuracy.
https://doi.org/10.13088/jiis.2018.24.2.059
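
The input preparation described in the abstract above can be illustrated with a minimal sketch. It assumes sentences have already been split into (morpheme, POS tag) pairs by some morphological analyzer; the tag names `VA`, `EC`, `EF`, the `morpheme/TAG` joining format, and the function names `to_tokens` and `cbow_pairs` are hypothetical and not taken from the paper. It only shows the two input forms (morphemes alone versus morphemes with POS tags attached) and the (context, target) pairs that a CBOW model with a context window of 5 would be trained on; it does not train any vectors.

```python
from typing import Iterator, List, Tuple

# Hypothetical pre-analyzed input: one sentence as (morpheme, POS tag) pairs.
# The tag names are illustrative assumptions, not the paper's tag set.
Sentence = List[Tuple[str, str]]


def to_tokens(sentence: Sentence, attach_pos: bool) -> List[str]:
    """Return morphemes alone, or morphemes with their POS tag attached."""
    if attach_pos:
        return [f"{morph}/{tag}" for morph, tag in sentence]
    return [morph for morph, _ in sentence]


def cbow_pairs(tokens: List[str], window: int = 5) -> Iterator[Tuple[List[str], str]]:
    """Yield (context, target) pairs; CBOW predicts the target from its context."""
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]      # up to `window` tokens before
        right = tokens[i + 1:i + 1 + window]     # up to `window` tokens after
        context = left + right
        if context:                              # skip one-token sentences
            yield context, target


# '예쁘고' split as in the abstract, followed by an assumed second word.
sentence = [("예쁘", "VA"), ("고", "EC"), ("좋", "VA"), ("아요", "EF")]
plain = to_tokens(sentence, attach_pos=False)
tagged = to_tokens(sentence, attach_pos=True)
pairs = list(cbow_pairs(tagged, window=5))
```

With POS tags attached, two homonymous morphemes that carry different tags become distinct tokens and would therefore receive separate vectors, which is the motivation the abstract gives for that variant.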

The Exploratory Study of the Dynamic Price Changing under the On-line Context (온라인 환경 하에서 제품가격의 동적인 변화에 대한 탐색적 연구)

Na, Kyung Soo;Son, Young Seok
- The Journal of the Korea Contents Association, v.20 no.2, pp.511-521, 2020
The purpose of this study is to examine the effects of the same brand item being sold online at different prices. To that end, an exploratory study focused on four factors that influence online price dispersion is proposed. First, the products are divided into utilitarian and hedonic products according to product concept: laptops and washing machines are classed as utilitarian products, and backpacks and sneakers as hedonic products. A total of 400 valid samples were selected and analyzed, 200 for each product concept. The analysis revealed that, for utilitarian products, price dispersion has a significant relationship with the average price, the highest price, the lowest price, the year of release, the number of retailers, and the number of reviews. In contrast, for hedonic products, the average price, highest price, lowest price, year of release, retailers, reviews, and ratings are found to be positively related to price dispersion. Based on these results, theoretical and practical implications of the influence of price dispersion on marketing are discussed.
https://doi.org/10.5392/JKCA.2020.20.02.511

Review of the Improved Moving Frame Acoustic Holography and Its Application to the Visualization of Moving Noise Sources (개선된 이동 프레임 음향 홀로그래피 방법과 이동 음원의 방사 소음의 가시화에 대한 응용)

박순홍;김양한
- Journal of KSNVE, v.10 no.4, pp.669-678, 2000
This paper reviews the improved moving frame acoustic holography (MFAH) method and its application. Moving frame acoustic holography was originally proposed to increase the aperture size and the spatial resolution of hologram by using a moving line array of microphones. The hologram of scanned plane can be obtained by assuming the sound field to be product of spatial and temporal information. Although conventional MFAH was only applied to sinusoidal signals, it allows us to visualize the noise generated by moving noise sources by employing a vertical line array of microphones affixed to the ground. However, the sound field generated by moving sources becomes different from that of stationary ones due to the movement of the sources. Firstly, this paper introduces the effect of moving noise sources on the obtained hologram by MFAH and the applicability of MFAH to the visualization of moving sources. Secondly, this paper also reviews improved MFAH that can visualize a coherent narrow band noise and a pass-by noise. The practical applicability of the improved MFAH was demonstrated by visualizing tire noise during a pass-by test.

The Impact of Senders' Identity to the Acceptance of Electronic Word-of-Mouth of Consumers in Vietnam

DINH, Hung;DOAN, Thanh Ha
- The Journal of Asian Finance, Economics and Business, v.7 no.2, pp.213-219, 2020
Studies related to Electronic Word-of-Mouth (eWOM) show that the acceptance of eWOM information is an important factor in customer purchase decisions. When consumers accept eWOM information, they tend to take that information into consideration before making purchase decisions. In Viet Nam, there are few studies about eWOM information, especially on the acceptance of eWOM information. This research tests the influence of consumers' perception of the senders' identity on the acceptance of online reviews (a kind of eWOM) in Viet Nam, through a case study in Ho Chi Minh City. The scales are adjusted and inspected, and a theoretical model represents the relationships among the influential factors. The research is based on a sample of 522 consumers who use the Internet to search for product reviews before buying, and uses Structural Equation Modeling (SEM) to test the relationships among the variables. The results show that the scales of the variables (Message Quality, Source Credibility, Perceived Message Usefulness, Perceived Senders' Identity, Perceived Message Credibility, and Message Acceptance) attain validity and reliability. The research contributes to the understanding of the determinants that influence the acceptance of eWOM information, namely informational factors and factors related to consumer skepticism.
https://doi.org/10.13106/jafeb.2020.vol7.no2.213

Comparison of Sentiment Classification Performance of RNN and Transformer-Based Models on Korean Reviews (RNN과 트랜스포머 기반 모델들의 한국어 리뷰 감성분류 비교)

Jae-Hong Lee
- The Journal of the Korea institute of electronic communication sciences, v.18 no.4, pp.693-700, 2023
Sentiment analysis, a branch of natural language processing that classifies and identifies subjective opinions and emotions in text documents as positive or negative, can be used for various promotions and services through customer preference analysis. To this end, recent research has been conducted utilizing various techniques in machine learning and deep learning. In this study, we propose an optimal language model by comparing the accuracy of sentiment analysis for movie, product, and game reviews using existing RNN-based models and recent Transformer-based language models. In our experiments, LMKorBERT and GPT3 showed relatively good accuracy among the models pre-trained on the Korean corpus.
https://doi.org/10.13067/JKIECS.2023.18.4.693

An Enhanced Text Mining Approach using Ensemble Algorithm for Detecting Cyber Bullying

Z.Sunitha Bai;Sreelatha Malempati
- International Journal of Computer Science & Network Security, v.23 no.5, pp.1-6, 2023
Text mining (TM) is widely used to process unstructured text documents and data from various domains. Another name for text mining is text classification. It is popular in many areas, such as movie reviews, product reviews on e-commerce websites, sentiment analysis, topic modeling, and cyber-bullying in social media messages. Cyber-bullying is a type of abuse that uses insulting language. Personal abuse, sexual harassment, and other types of abuse come under cyber-bullying. Several existing systems have been developed to detect bullying words based on their context in social networking sites (SNS). SNS have become a platform for bullying. In this paper, an enhanced text mining approach using an ensemble algorithm (ETMA) is developed to solve several problems in traditional algorithms and to improve accuracy, processing time, and the quality of the results. ETMA is used to analyze bullying text within SNS such as Facebook and Twitter. ETMA is applied to a synthetic dataset collected from various data sources, consisting of 5k messages labeled as bullying or non-bullying. The performance is analyzed in terms of Precision, Recall, F1-Score, and Accuracy.
https://doi.org/10.22937/IJCSNS.2023.23.5.1
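
The four measures named at the end of the abstract above have standard definitions for a two-class (bullying versus non-bullying) task. The sketch below computes them from confusion counts, assuming label 1 means bullying and 0 means non-bullying; the function name `binary_metrics` and the example labels are made up for illustration and are not the paper's data or results.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute precision, recall, F1 and accuracy for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # bullying, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-bullying, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # bullying, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-bullying, passed

    # Guard against empty denominators by falling back to 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Made-up labels for eight messages (1 = bullying, 0 = non-bullying).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Precision and recall are reported separately because a detector can score high accuracy on an imbalanced dataset while still missing most bullying messages.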

A Study on the Impact of the Organization Traits and New product Creativity on Development Performance (신제품개발 조직특성이 신제품 창조성과 개발성과에 미치는 영향에 관한 연구)

Jung, Duk-Hwa;Kim, Hyung-Jun
- Journal of Global Scholars of Marketing Science, v.16 no.2, pp.109-132, 2006
A major aim of this study is to test the hypothesis that there is an association between empowerment, organizational memory, and new product creativity. In addition to exploring these relationships, this study examines the effect of new product creativity on new product performance and identifies the moderating effects of market uncertainty on the relationship between new product creativity and performance. For this purpose, we developed a research model based on literature reviews of empowerment, organizational memory, market uncertainty, and new product creativity. A total of 121 usable survey responses from the food manufacturing industry were used in the empirical research. The findings indicate that (1) empowerment has a positive effect on new product creativity, (2) organizational memory has a positive effect on new product creativity, (3) new product novelty has a positive effect on new product performance, and (4) only competition uncertainty has a moderating effect on the relationship between new product meaningfulness and performance. The findings have implications for managers wishing to achieve new product creativity and to improve new product development performance.

Overview of Methodological Quality of Systematic Reviews about Gastric Cancer Risk and Protective Factors

Li, Lun;Ying, Xiang-Ji;Sun, Tian-Tian;Yi, Kang;Tian, Hong-Liang;Sun, Rao;Tian, Jin-Hui;Yang, Ke-Hu
- Asian Pacific Journal of Cancer Prevention, v.13 no.5, pp.2069-2079, 2012
Background and Objective: A comprehensive overall review of gastric cancer (GC) risk and protective factors is a high priority, so we conducted the present study. Methods: Systematic searches in common medical electronic databases along with reference tracking were conducted to include all kinds of systematic reviews (SRs) about GC risk and protective factors. Two authors independently selected studies, extracted data, and evaluated the methodological qualities and the quality of evidence using R-AMSTAR and GRADE approaches. Results: Beta-carotene below 20 mg/day, fruit, vegetables, non-fermented soy-foods, whole-grain, and dairy product were GC protective factors, while beta-carotene 20 mg/day or above, pickled vegetables, fermented soy-foods, processed meat 30g/d or above, or salty foods, exposure to alcohol or smoking, occupational exposure to Pb, overweight and obesity, helicobacter pylori infection were GC risk factors. So we suggested screening and treating H. pylori infection, limiting the amount of food containing risk factors (processed meat consumption, beta-carotene, pickled vegetables, fermented soy-foods, salty foods, alcohol), stopping smoking, avoiding excessive weight gain, avoidance of Pb, and increasing the quantity of food containing protective components (fresh fruit and vegetables, non-fermented soy-foods, whole-grain, dairy products). Conclusions: The conclusions and recommendations of our study were limited by including SRs with poor methodological bases and low quality of evidence, so that more research applying checklists about assessing the methodological qualities and reporting are needed for the future.
https://doi.org/10.7314/APJCP.2012.13.5.2069

Impact of Negative Review Type, Brand Reputation, and Opportunity Scarcity Perception on Preferences of Fashion Products in Social Commerce (소셜커머스에서 부정적 리뷰 유형, 브랜드 명성, 기회희소성지각이 패션제품 선호도에 미치는 영향)

Joo, Bora;Hwang, Sunjin
- Journal of Fashion Business, v.20 no.4, pp.207-225, 2016
This study aims to analyze the impact of negative review type, brand reputation, and opportunity scarcity perception on preferences for fashion products in social commerce. For this evaluation, we used a 2 (negative review type: objective/subjective) × 2 (brand reputation: high/low) × 2 (opportunity scarcity perception: high/low) three-factor mixed design. We enrolled 260 women in their 20s and 30s who live in Seoul and have used social commerce; a final total of 207 subjects were considered for analysis. The data were analyzed using the SPSS 18 program, and a reliability test, t-test, and three-way ANOVA were performed. The following observations were made. First, preferences were higher when the subjects read objective negative reviews than when they read subjective negative reviews, and when a fashion product was from a brand of high reputation than from a brand of low reputation. Second, the interaction effect between negative review type and brand reputation was greater among subjects with high opportunity scarcity perception than among those with low opportunity scarcity perception. Thus, we conclude that social commerce should encourage consumers to write more objective reviews, and fashion brands should manage their reputations well. Also, social commerce can use scarcity messages aggressively to increase preferences for global luxury fashion goods, which have been actively marketed in social commerce since 2015.
https://doi.org/10.12940/jfb.2016.20.4.207

Sentiment Analysis and Star Rating Prediction Based on Big Data Analysis of Online Reviews of Foreign Tourists Visiting Korea (방한 관광객의 온라인 리뷰에 대한 빅데이터 분석 기반의 감성분석 및 평점 예측모형)

Hong, Taeho
- Knowledge Management Research, v.23 no.1, pp.187-201, 2022
Online reviews written by tourists provide important information for the management and operation of the tourism industry. The star rating of an online review is a simple quantitative evaluation of a product or service, but it does not readily reflect the sincere attitude of tourists. There is also the issue that the star rating and the review content do not always match. In this study, a star rating prediction model based on online review content is proposed to solve this discrepancy problem. We compared the differences in star ratings and sentiment by continent through sentiment analysis of reviews of tourist attractions and hotels written by foreign tourists who visited Korea. Variables were selected through TF-IDF vectorization and the sentiment analysis results. Logit, artificial neural network, and SVM (Support Vector Machine) models were used for classification, and artificial neural network and SVR (Support Vector Regression) models were applied for rating prediction. The online review rating prediction model proposed in this study could resolve such inconsistencies and could also be applied when there is no star rating.
https://doi.org/10.15813/kmr.2022.23.1.010

Search Result 395, Processing Time 0.025 seconds
