• Title/Summary/Keyword: 리뷰데이터

Search Result 323, Processing Time 0.028 seconds

Improving Accuracy of Noise Review Filtering for Places with Insufficient Training Data

  • Hyeon Gyu Kim
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.7
    • /
    • pp.19-27
    • /
    • 2023
  • In the process of collecting social reviews, a number of noise reviews irrelevant to a given search keyword can be included in the search results. To filter out such reviews, machine learning can be used. However, if the number of reviews is insufficient for a target place to be analyzed, filtering accuracy can be degraded due to the lack of training data. To resolve this issue, we propose a supervised learning method to improve accuracy of the noise review filtering for the places with insufficient reviews. In the proposed method, training is not performed by an individual place, but by a group including several places with similar characteristics. The classifier obtained through the training can be used for the noise review filtering of an arbitrary place belonging to the group, so the problem of insufficient training data can be resolved. To verify the proposed method, a noise review filtering model was implemented using LSTM and BERT, and filtering accuracy was checked through experiments using real data collected online. The experimental results show that the accuracy of the proposed method was 92.4% on the average, and it provided 87.5% accuracy when targeting places with less than 100 reviews.

Explainable Artificial Intelligence Applied in Deep Learning for Review Helpfulness Prediction (XAI 기법을 이용한 리뷰 유용성 예측 결과 설명에 관한 연구)

  • Dongyeop Ryu;Xinzhe Li;Jaekyeong Kim
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.2
    • /
    • pp.35-56
    • /
    • 2023
  • With the development of information and communication technology, numerous reviews are continuously posted on websites, which causes information overload problems. Therefore, users face difficulty in exploring reviews for their decision-making. To solve such a problem, many studies on review helpfulness prediction have been actively conducted to provide users with helpful and reliable reviews. Existing studies predict review helpfulness mainly based on the features included in the review. However, such studies disable providing the reason why predicted reviews are helpful. Therefore, this study aims to propose a methodology for applying eXplainable Artificial Intelligence (XAI) techniques in review helpfulness prediction to address such a limitation. This study uses restaurant reviews collected from Yelp.com to compare the prediction performance of six models widely used in previous studies. Next, we propose an explainable review helpfulness prediction model by applying the XAI technique to the model with the best prediction performance. Therefore, the methodology proposed in this study can recommend helpful reviews in the user's purchasing decision-making process and provide the interpretation of why such predicted reviews are helpful.

Spatial analysis based on topic modeling using foreign tourist review data: Case of Daegu (외국인 관광객 리뷰데이터를 활용한 토픽모델링 기반의 공간분석: 대구광역시를 사례로)

  • Jung, Ji-Woo;Kim, Seo-Yun;Kim, Hyeon-Yu;Yoon, Ju-Hyeok;Jang, Won-Jun;Kim, Keun-Wook
    • Journal of Digital Convergence
    • /
    • v.19 no.8
    • /
    • pp.33-42
    • /
    • 2021
  • As smartphone-based tourism platforms have become active, policy establishment and service enhancement using review data are being made in various fields. In the case of the preceding studies using tourism review data, most of the studies centered on domestic tourists were conducted, and in the case of foreign tourist studies, studies were conducted only on data collected in some languages and text mining techniques. In this study, 3,515 review data written by foreigners were collected by designating the "Daegu attractions" keyword through the online review site. And LDA-based topic modeling was performed to derive tourism topics. The spatial approach through global and local spatial autocorrelation analysis for each topic can be said to be different from previous studies. As a result of the analysis, it was confirmed that there is a global spatial autocorrelation, and that tourist destinations mainly visited by foreigners are concentrated locally. In addition, hot spots have been drawn around Jung-gu in most of the topics. Based on the analysis results, it is expected to be used as a basic research for spatial analysis based on local government foreign tourism policy establishment and topic modeling. And The limitations of this study were also presented.

A multi-channel CNN based online review helpfulness prediction model (Multi-channel CNN 기반 온라인 리뷰 유용성 예측 모델 개발에 관한 연구)

  • Li, Xinzhe;Yun, Hyorim;Li, Qinglong;Kim, Jaekyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.2
    • /
    • pp.171-189
    • /
    • 2022
  • Online reviews play an essential role in the consumer's purchasing decision-making process, and thus, providing helpful and reliable reviews is essential to consumers. Previous online review helpfulness prediction studies mainly predicted review helpfulness based on the consistency of text and rating information of online reviews. However, there is a limitation in that representation capacity or review text and rating interaction. We propose a CNN-RHP model that effectively learns the interaction between review text and rating information to improve the limitations of previous studies. Multi-channel CNNs were applied to extract the semantic representation of the review text. We also converted rating into independent high-dimensional embedding vectors representing the same dimension as the text vector. The consistency between the review text and the rating information is learned based on element-wise operations between the review text and the star rating vector. To evaluate the performance of the proposed CNN-RHP model in this study, we used online reviews collected from Amazom.com. Experimental results show that the CNN-RHP model indicates excellent performance compared to several benchmark models. The results of this study can provide practical implications when providing services related to review helpfulness on online e-commerce platforms.

A Visualization of Movie Review based on a Semantic Network Analysis (의미연결망 분석을 활용한 영화 리뷰 시각화)

  • Kim, Seul-gi;Kim, Jang Hyun
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2018.10a
    • /
    • pp.197-200
    • /
    • 2018
  • The aim of current research is to suggest a interface for movie reviews at a glance through semantic network analysis. The implication of this study is to systematically investigate the structure of eWoM. Specifically, by visualizing semantic networks of movie reviews this study attempts to provide a prototype of a possible review system that can check the response of movie viewer at a glance.

  • PDF

Two-Stage Contrastive Learning for Representation Learning of Korean Review Opinion (두 단계 대조 학습 기반 한국어 리뷰 의견 표현벡터 학습)

  • Jisu Seo;Seung-Hoon Na
    • Annual Conference on Human and Language Technology
    • /
    • 2022.10a
    • /
    • pp.262-267
    • /
    • 2022
  • 이커머스 리뷰와 같은 특정 도메인의 경우, 텍스트 표현벡터 학습을 위한 양질의 오픈 학습 데이터를 구하기 어렵다. 또한 사람이 수동으로 검수하며 학습데이터를 만드는 경우, 많은 시간과 비용을 소모하게 된다. 따라서 본 논문에서는 수동으로 검수된 데이터없이 양질의 텍스트 표현벡터를 만들 수 있도록 두 단계의 대조 학습 시스템을 제안한다. 이 두 단계 대조 학습 시스템은 레이블링 된 학습데이터가 필요하지 않은 자기지도 학습 단계와 리뷰의 특성을 고려한 자동 레이블링 기반의 지도 학습 단계로 구성된다. 또한 노이즈에 강한 오류함수와 한국어에 유효한 데이터 증강 기법을 적용한다. 그 결과 스피어먼 상관 계수 기반의 성능 평가를 통해, 베이스 모델과 비교하여 성능을 14.03 향상하였다.

  • PDF

Semantic analysis via application of deep learning using Naver movie review data (네이버 영화 리뷰 데이터를 이용한 의미 분석(semantic analysis))

  • Kim, Sojin;Song, Jongwoo
    • The Korean Journal of Applied Statistics
    • /
    • v.35 no.1
    • /
    • pp.19-33
    • /
    • 2022
  • With the explosive growth of social media, its abundant text-based data generated by web users has become an important source for data analysis. For example, we often witness online movie reviews from the 'Naver Movie' affecting the general public to decide whether they should watch the movie or not. This study has conducted analysis on the Naver Movie's text-based review data to predict the actual ratings. After examining the distribution of movie ratings, we performed semantics analysis using Korean Natural Language Processing. This research sought to find the best review rating prediction model by comparing machine learning and deep learning models. We also compared various regression and classification models in 2-class and multi-class cases. Lastly we explained the causes of review misclassification related to movie review data characteristics.

The Effect of Text Consistency between the Review Title and Content on Review Helpfulness (온라인 리뷰의 제목과 내용의 일치성이 리뷰 유용성에 미치는 영향)

  • Li, Qinglong;Kim, Jaekyeong
    • Knowledge Management Research
    • /
    • v.23 no.3
    • /
    • pp.193-212
    • /
    • 2022
  • Many studies have proposed several factors that affect review helpfulness. Previous studies have investigated the effect of quantitative factors (e.g., star ratings) and affective factors (e.g., sentiment scores) on review helpfulness. Online reviews contain titles and contents, but existing studies focus on the review content. However, there is a limitation to investigating the factors that affect review helpfulness based on the review content without considering the review title. However, previous studies independently investigated the effect of review content and title on review helpfulness. However, it may ignore the potential impact of similarity between review titles and content on review helpfulness. This study used text consistency between review titles and content affect review helpfulness based on the mere exposure effect theory. We also considered the role of information clearness, review length, and source reliability. The results show that text consistency between the review title and the content negatively affects the review helpfulness. Furthermore, we found that information clearness and source reliability weaken the negative effects of text consistency on review helpfulness.

Keyword Extraction Technique for Attractions using Online Reviews - Topic Modeling and Markov Chain (온라인 리뷰를 활용한 관광지 키워드 추출 기법 - 토픽 모델링과 Markov Chain)

  • Kim, MyeongSeon;Lee, KangWoo;Lim, JiWon;Hong, Soon-Goo
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2021.11a
    • /
    • pp.521-523
    • /
    • 2021
  • 관광 분야에서 온라인 리뷰의 중요성이 커지고 있다. 온라인 리뷰의 텍스트 데이터는 파악이 어렵다. 이에 본 연구에서는 특정 관광지에 대한 온라인 리뷰 텍스트 데이터가 나타내는 전반적인 의견을 직관적으로 도출하는 방법에 대해 알아보고자, 토픽 모델링과 Markov Chain을 시행했다. '해운대'에 대한 온라인 리뷰를 수집한 후, LDA와 BTM을 활용하여 주제를 도출하고, Markov Chain을 시각화하여 키워드 간의 관계와 전체적인 평가 내용을 확인했다. 사용된 기법은 각자 특징적인 결과를 제시했기 때문에 다양한 기법을 상보적으로 이용하기를 제안하였다.

A Study of Factors Influencing Helpfulness of Game Reviews: Analyzing STEAM Game Review Data (게임 유용성 평가에 미치는 요인에 관한 연구: 스팀(STEAM) 게임 리뷰데이터 분석)

  • Kang, Ha-Na;Yong, Hye-Ryeon;Hwang, Hyun-Seok
    • Journal of Korea Game Society
    • /
    • v.17 no.3
    • /
    • pp.33-44
    • /
    • 2017
  • With the development of the Internet environment, various types of online reviews are being generated and exchanged among consumers to share their opinions. In line with this trend, companies are making efforts to analyze online reviews and use the results in various business activities such as marketing, sales, and product development. However, research on online review in industry related to 'Video Game' which is representative experience goods has not been performed enough. Therefore, this study analyzed STEAM community review data using machine learning techniques. We analyzed the factors affecting the opinion of other users' game review. We also propose managerial implications to incease user loyalty and usability.