• Title/Summary/Keyword: online comments

A Sentiment Classification Approach of Sentences Clustering in Webcast Barrages

  • Li, Jun;Huang, Guimin;Zhou, Ya
    • Journal of Information Processing Systems / v.16 no.3 / pp.718-732 / 2020
  • Sentiment analysis and opinion mining are challenging tasks in natural language processing. Many sentiment analysis and opinion mining applications focus on product reviews, social media reviews, forums, and microblogs, whose reviews are topic-similar and opinion-rich. In this paper, we analyze the sentiments of sentences from online webcast reviews that scroll across the screen, which we call live barrages. In contrast to social media comments or product reviews, the topics in live barrages are more fragmented, and there are many invalid comments that must be removed in the preprocessing phase. To extract evaluative sentiment sentences, we propose a novel approach that clusters the barrages from the same commenter to address the scattering of information across individual barrages. The method contains two subtasks: in the data preprocessing phase, we cluster the sentences from the same commenter and remove invalid sentences; we then use a semi-supervised machine learning approach, the naïve Bayes algorithm, to analyze the sentiment of the barrages. According to our experimental results, this method performs well in analyzing the sentiment of online webcast barrages.
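The two subtasks described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes whitespace tokenization, uses a plain supervised multinomial naive Bayes with add-one smoothing in place of the paper's semi-supervised variant, and the names `group_by_commenter` and `NaiveBayes` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def group_by_commenter(barrages):
    """Concatenate all comments from the same commenter into one document."""
    grouped = defaultdict(list)
    for commenter, text in barrages:
        grouped[commenter].append(text)
    return {c: " ".join(texts) for c, texts in grouped.items()}


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    Assumption: fully supervised, unlike the semi-supervised setup in the paper.
    """

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Start from the log prior, then add one log likelihood per token.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in doc.lower().split():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Grouping first means the classifier sees one longer document per commenter rather than many short fragments, which is the motivation the abstract gives for clustering.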

Bias & Hate Speech Detection Using Deep Learning: Multi-channel CNN Modeling with Attention (딥러닝 기술을 활용한 차별 및 혐오 표현 탐지 : 어텐션 기반 다중 채널 CNN 모델링)

  • Lee, Wonseok;Lee, Hyunsang
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.12 / pp.1595-1603 / 2020
  • Online defamation incidents involving Internet news comments on portal sites, SNS, and community sites have been increasing in recent years. Bias and hate expressions threaten online service users in various forms, such as invasion of privacy, personal attacks, and defamation. In the past few years, academia and industry have approached this problem in various ways. The purpose of this study is to build a dataset and experiment with deep learning classification models for detecting various bias expressions as well as hate expressions. The dataset was annotated with 7 labels, which 10 people cross-checked. In this study, each of the 7 classes in a dataset of about 137,111 Korean Internet news comments is classified as a binary task and analyzed with deep learning techniques. The proposed technique is a multi-channel CNN model with attention. In the experiment, the weighted average F1 score was 70.32%.
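The weighted average F1 score reported here weights each class's F1 by its share of the true labels. A minimal sketch of that metric follows; the function name `weighted_f1` and the toy labels in the usage note are hypothetical and not taken from the paper.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighted by each class's support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, n in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = n - tp
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is positive because n >= 1.
        f1 = 2 * tp / (2 * tp + fp + fn)
        score += f1 * n / total
    return score
```

For example, with true labels `[1, 1, 0, 0]` and predictions `[1, 0, 0, 0]`, class 1 has F1 = 2/3 and class 0 has F1 = 4/5, and both classes have equal support, so the weighted average is 11/15.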

Analysis of Topics Related to Population Aging Using Natural Language Processing Techniques (자연어 처리 기술을 활용한 인구 고령화 관련 토픽 분석)

  • Hyunjung Park;Taemin Lee;Heuiseok Lim
    • Journal of Information Technology Services / v.23 no.1 / pp.55-79 / 2024
  • Korea, which is expected to become a super-aged society in 2025, is facing the most worrisome crisis worldwide. Efforts are urgently required to examine problems and countermeasures from various angles and to address the shortcomings. In this regard, we intend to derive useful implications from a new viewpoint by applying recent natural language processing techniques to online articles. More specifically, we pose three research questions: First, what topics are being reported in the online media and what is the public's response to them? Second, what is the relationship between these aging-related topics and individual happiness factors? Third, what strategic directions and implications for benchmarking are discussed to solve the problem of population aging? To answer these questions, we collect Naver portal articles related to population aging along with their classification categories, comments, number of comments, and other numerical data. From the data, we first derive 33 topics with a semi-supervised BERTopic that reflects article classification information not used in previous studies, and we conduct sentiment analysis of the comments on them with a current open-source large language model. We also examine the relationship between the derived topics and personal happiness factors extended to Alderfer's ERG dimensions, and carry out additional 3- to 4-gram keyword frequency analysis, trend analysis, text network analysis based on 3- to 4-gram keywords, etc. Through this multifaceted approach, we present diverse fresh insights from practical and theoretical perspectives.
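The 3- to 4-gram keyword frequency analysis mentioned above can be sketched as a simple count over token windows. This is an illustration only: it assumes the text is already tokenized into keywords (the paper's tokenization is not described here), and `ngram_counts` is a hypothetical name.

```python
from collections import Counter


def ngram_counts(tokens, n_min=3, n_max=4):
    """Count every contiguous n-gram with n_min <= n <= n_max."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts
```

Calling `ngram_counts(tokens).most_common(10)` would then list the ten most frequent 3- and 4-grams in a token sequence.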

F_MixBERT: Sentiment Analysis Model using Focal Loss for Imbalanced E-commerce Reviews

  • Fengqian Pang;Xi Chen;Letong Li;Xin Xu;Zhiqiang Xing
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.2 / pp.263-283 / 2024
  • Users' comments after online shopping are critical to product reputation and business improvement. These comments, sometimes known as e-commerce reviews, influence other customers' purchasing decisions. To cope with the large volume of e-commerce reviews, automatic analysis based on machine learning and deep learning is drawing more and more attention. A core task therein is sentiment analysis. However, e-commerce reviews exhibit the following characteristics: (1) inconsistency between comment content and the star rating; (2) a large amount of unlabeled data, i.e., comments without a star rating; and (3) data imbalance caused by sparse negative comments. This paper employs Bidirectional Encoder Representations from Transformers (BERT), one of the best natural language processing models, as the base model. Based on the above data characteristics, we propose the F_MixBERT framework to use the inconsistent, low-quality data and the unlabeled data more effectively and to resolve the problem of data imbalance. In the framework, the proposed MixBERT incorporates the MixMatch approach into BERT's high-dimensional vectors to train on the unlabeled and low-quality data with generated pseudo labels. Meanwhile, data imbalance is resolved by Focal loss, which reduces the contribution of large-scale data and easily identifiable data to the total loss. Comparative experiments demonstrate that the proposed framework outperforms BERT and MixBERT for sentiment analysis of e-commerce comments.
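The role of Focal loss described above can be made concrete with the standard per-example formula, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t). The sketch below is a generic illustration, not the F_MixBERT code; the defaults `gamma=2.0` and `alpha=1.0` are assumptions and are not taken from the paper.

```python
import math


def cross_entropy(p_true):
    """Ordinary cross-entropy for one example, given the true-class probability."""
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one example.

    p_true: predicted probability of the example's true class.
    gamma:  focusing parameter; larger values shrink the loss of easy examples.
    alpha:  class weight; a smaller alpha for the majority class reduces its
            share of the total loss (assumed usage, set per class by the caller).
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)
```

With `gamma=2.0`, an easy example predicted at 0.9 keeps only 1% of its cross-entropy loss, while a hard example predicted at 0.1 keeps 81%, so training concentrates on hard and minority-class examples.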

A Study of Users' Ideological Propensity in the Comments of Online News: Focusing upon the Stories of the Web Portal Sites and the Press Website News Related to the 20th presidential Election (온라인 뉴스 댓글에 나타난 뉴스 이용자들의 이념적 성향에 관한 연구: 포털과 언론사닷컴의 20대 대선 관련 뉴스기사를 중심으로)

  • Kwang Soon Park;Jong Mook Ahn
    • Journal of Industrial Convergence / v.20 no.12 / pp.135-143 / 2022
  • This paper aims to determine users' ideological propensity from their comments on Web Portal News and Press Website News. The analytical results reveal the political propensities not only of the Web Portal News and the Press Website News but also of the voters who use these news media. The data were collected from the comments on 174 news stories over about 90 days before election day. For the analysis, a t-test was used to compare Naver News with Daum News, the Minjoo Party of Korea with the People Power Party, and the Press Website News with Naver News. The analysis shows that comments on Naver News had a higher percentage of positive writing about the candidates of the conservative party, whereas comments on Daum News had a higher percentage about those of the progressive party. Accordingly, Naver News appears to be used mainly by politically conservative users, while Daum News is used mostly by progressive ones.
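A two-sample comparison of the kind described above can be sketched as follows. The abstract does not say which t-test variant was used, so this example assumes Welch's unequal-variance version; the function name `welch_t` is hypothetical, and any input values are illustrative rather than data from the study.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample (sample variance / n).
    va, vb = variance(sample_a) / na, variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df
```

The inputs could be, for instance, per-article percentages of positive comments on each portal; the resulting statistic is then compared against the t distribution with `df` degrees of freedom.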

The Political Recognition Surrounding Candlelight Rally and Taegeukgi Rally: A Big Data Analytics on Online News Comments (촛불 집회와 태극기 집회를 둘러싼 정국 인식: 온라인 뉴스 댓글에 대한 빅데이터 분석)

  • Kim, ChanWoo;Jung, Byungkee
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.8 no.6 / pp.875-885 / 2018
  • This study analyzed the major issues of the Candlelight Rally and the Taegeukgi Rally registered in news comments in the politics section of a portal site from October 24, 2016 to March 19, 2017. We examined the political recognition of the two rallies with Named Entity Recognition. The main analytical items are the responsibility for impeachment, the subject and method of settlement, and other major issues. The analysis shows that the comments on the Candlelight Rally focused on support for impeachment and legal penalties for the regime's ministers, and insisted on resolving the political situation through the next election after impeachment. The comments on the Taegeukgi Rally focused on rejecting the impeachment to maintain the regime and insisted on rejecting the impeachment of the Constitutional Court. The conflicts between the group that supported the Candlelight Rallies and the group that supported the Taegeukgi Rallies are predicted to last at least for the time being (Park Geun-hye's trial period) after the presidential election. After the impeachment of the President and the replacement of the regime, this conflict will develop into a confrontation between the pursuit of liquidation and new politics and the attempt to influence the trial of Park Geun-hye. Therefore, efforts to integrate society in the aftermath are necessary.

Online Word-of-Mouth: Motivation for Writing Product Reviews on Internet Shopping Sites (온라인 구전 커뮤니케이션: 온라인 쇼핑몰에서의 소비자 사용후기 작성동기)

  • Kim, Sung-Hee
    • Journal of Fashion Business / v.14 no.2 / pp.81-94 / 2010
  • The online shopping environment has radically changed consumer shopping behavior. Without the actual physical shopping experience in a brick-and-mortar store, consumers make purchasing decisions over the Internet. They make an effort to obtain product information not only from online merchants, but also from previous purchasers in order to make an informed decision. Accordingly, customer comments are expected to have a significant impact on decisions to purchase goods and services online. This paper focuses on one type of electronic word-of-mouth, the online consumer review. It derives several motivations for why customers post product reviews on shopping mall sites. Customer motives were identified through in-depth one-on-one interviews with twenty female respondents, conducted twice between June 17th and September 11th, 2009. The interviews lasted between 40 and 60 minutes. The results showed that consumers write product reviews based on six motivations: to receive a reward or remuneration for writing a product review, to share information with other customers, to improve the quality of goods and services, to reduce customer dissatisfaction, to recommend products and services, and to derive pleasure.

A Study on the Impact of Negativity Bias on Online Spread of Reputation : With a Case Study of Election Campaign (온라인상에서 부정적 편향에 따른 평판 확산 차이에 관한 연구 : 선거 사례를 중심으로)

  • Kim, Na-Ra;Shin, Kyung-Shik
    • Journal of Information Technology Services / v.14 no.1 / pp.263-276 / 2015
  • As social beings, people can cooperate with and control one another through the power of reputation, which is a critical opinion of someone given by others. Nevertheless, it has been difficult to clarify the nature of traditional types of reputation, for they are mostly passed by word of mouth among members of a society. However, due to dramatic technological advancement and the widespread use of the Internet and social media, we can now clearly see and analyze written reputations, which used to be passed only from mouth to mouth. Against this background, this study examines whether a negativity bias (the notion that an event of a more negative nature has a greater effect on one's psychological state than a positive event) applies to the spread of reputation online, and examines related factors and effects. To this end, reputation-related online comments left by social media users during the election period of Korea's 6th provincial election on 4 June 2014 were analyzed. For the analysis, a Bass diffusion model, which is based on innovation diffusion theory, was used. The analysis results confirmed that, in online forums, negative reputations spread more quickly and more widely than positive ones and had a greater impact, and that mass media such as online news outlets had a significant influence on the spread of reputation online.
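The Bass diffusion model named in this abstract can be sketched in its common discrete-time form, where new adoptions in each period are (p + q * N / m) * (m - N) for cumulative adoptions N. The function name `bass_adoption` and the parameter values in the usage note are illustrative assumptions, not estimates from the study.

```python
def bass_adoption(p, q, m, periods):
    """Return new adopters per period under a discrete-time Bass model.

    p: coefficient of innovation (external influence, e.g. mass media)
    q: coefficient of imitation (internal influence, e.g. word of mouth)
    m: total potential number of adopters
    """
    cumulative = 0.0
    new_per_period = []
    for _ in range(periods):
        # Remaining potential adopters convert at rate p + q * (share adopted).
        new = (p + q * cumulative / m) * (m - cumulative)
        cumulative += new
        new_per_period.append(new)
    return new_per_period
```

Fitting `p` and `q` separately to positive and negative comment counts, and comparing the fitted curves, is one way such a model could be used to contrast how quickly the two kinds of reputation spread; for example, `bass_adoption(0.03, 0.38, 1000, 20)` produces the familiar rise-and-fall adoption curve.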

Identifying Regional Tourism Resources Using Webometric Network Analysis: A case of Suseong-gu in Daegu, South Korea (웹보메트릭스를 활용한 지역관광자원 발굴 및 네트워크 분석: 대구 수성구를 중심으로)

  • Song, Hwa Young;Zhu, Yu Peng;Kim, Ji Eun;Oh, Jung Hyun;Park, Han Woo
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.7 / pp.475-486 / 2020
  • The purpose of the present study is to identify regional tourism resources using Webometric network analysis. The study focuses on the Suseong area in Daegu metropolitan city. Various kinds of web-based data, for example, hit counts, online news, and public comments, were used to discover hot places and people's responses. The research questions are as follows: First, what is the search engine optimization level for Suseong? Second, what is the online presence of tourism resources in Suseong, and which region is the center of tourism with a high level of presence? Third, what are the main contents of news articles and comments related to Suseong pond? The results show that the search engine optimization level in Suseong is lower than that in other areas in Daegu. In other words, tourism information and content regarding Suseong are not highly visible in cyberspace. Importantly, Suseong pond had the highest online presence. A close analysis of both online news and users' comments on Suseong pond, however, revealed that the biggest concern was the need to improve public accessibility to tourism infrastructure. The findings are expected to contribute to policy development and service operation related to tourism resources in Suseong.

Sentiment Analysis of Product Reviews to Identify Deceptive Rating Information in Social Media: A SentiDeceptive Approach

  • Marwat, M. Irfan;Khan, Javed Ali;Alshehri, Dr. Mohammad Dahman;Ali, Muhammad Asghar;Hizbullah;Ali, Haider;Assam, Muhammad
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.3 / pp.830-860 / 2022
  • [Introduction] Many companies are shifting their businesses online because customers increasingly prefer to shop online. [Problem] Users share a vast amount of information about products, making it difficult for end-users to make decisions. [Motivation] Therefore, we need a mechanism to automatically analyze end-user opinions, thoughts, or feelings about products on social media platforms, which might help customers make or change their purchasing decisions. [Proposed Solution] For this purpose, we propose an automated SentiDeceptive approach, which classifies end-user reviews into negative, positive, and neutral sentiments and identifies deceptive crowd-user rating information on social media platforms to help users in decision-making. [Methodology] We first collected 11,781 end-user comments from the Amazon store and the Flipkart web application covering distinct products, such as watches, mobiles, shoes, clothes, and perfumes. Next, we developed a coding guideline used as a base for the comment annotation process. We then applied the content analysis approach and the existing VADER library to annotate the end-user comments in the data set with the identified codes, which resulted in a labelled data set used as input to the machine learning classifiers. Finally, we applied the sentiment analysis approach to identify end-user opinions and overcome the deceptive rating information on social media platforms by first preprocessing the input data to remove irrelevant data (stop words, special characters, etc.) from the data set, employing two standard resampling approaches to balance the data set, i.e., oversampling and under-sampling, extracting different features (TF-IDF and BOW) from the textual data, and then training and testing the machine learning algorithms with a standard cross-validation approach (KFold and Shuffle Split). [Results/Outcomes] Furthermore, to support our research study, we developed an automated tool that analyzes each piece of customer feedback and displays the collective sentiments of customers about a specific product in a graph, which helps customers make decisions. In a nutshell, our proposed approach produces good results when identifying customer sentiments from online user feedback, i.e., it obtained an average of 94.01% precision, 93.69% recall, and 93.81% F-measure for classifying positive sentiments.
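Two of the pipeline steps named in this abstract, TF-IDF feature extraction and oversampling, can be sketched in plain Python. This is a minimal illustration under stated assumptions: whitespace tokenization, the textbook weighting tf * log(N / df) (library implementations often use smoothed variants), and hypothetical function names `tf_idf` and `random_oversample`. It is not the authors' tool.

```python
import math
import random
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    return [
        {term: count * math.log(n / df[term]) for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples until every class matches the largest."""
    rng = random.Random(seed)
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for label, group in by_label.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_samples.extend(group + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels
```

A term that occurs in every document receives a weight of zero under this variant, which is why TF-IDF tends to suppress uninformative words. Oversampling should be applied only to the training folds during cross-validation, so that duplicated samples do not leak into the test fold.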