• Title/Summary/Keyword: Online mining

Exploring an Optimal Feature Selection Method for Effective Opinion Mining Tasks

  • Eo, Kyun Sun;Lee, Kun Chang
    • Journal of the Korea Society of Computer and Information (한국컴퓨터정보학회논문지) / Vol. 24, No. 2 / pp. 171-177 / 2019
  • This paper aims to find the most effective feature selection method for opinion mining tasks. Opinion mining is a form of sentiment analysis, which classifies opinions in online texts as positive or negative from a text mining point of view. Using a dataset of five product groups (apparel, books, DVDs, electronics, and kitchen), TF-IDF and Bag-of-Words (BOW) are calculated to form the product review feature sets. Next, we applied the feature selection methods to see which one gives the most robust results. The results show that the stacking classifier built on features chosen by the Information Gain feature selection method yields the best result.
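
As a rough illustration of the Information Gain criterion named in the abstract above, the sketch below scores each term by how much knowing its presence reduces label entropy. It is not the authors' pipeline: the four tokenized reviews, the labels, and the function names are made up for the example, and a real study would compute this over TF-IDF or BOW features from the full dataset.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(docs, labels, term):
    """Entropy reduction from splitting documents on presence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = ((len(with_term) / n) * entropy(with_term)
                   + (len(without_term) / n) * entropy(without_term))
    return entropy(labels) - conditional

# Toy, made-up reviews (tokenized as sets); not data from the paper.
docs = [{"great", "fit"}, {"great", "book"}, {"poor", "fit"}, {"poor", "book"}]
labels = ["pos", "pos", "neg", "neg"]

# Rank the vocabulary from most to least informative.
vocabulary = {t for d in docs for t in d}
ranked = sorted(vocabulary,
                key=lambda t: information_gain(docs, labels, t),
                reverse=True)
```

In this toy set the sentiment words separate the classes perfectly while the product words carry no label information, so a selector keeping the top-ranked terms would retain only the former.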

프로세스 마이닝을 이용한 쇼핑몰 웹로그 데이터 분석 (Analyzing the weblog data of a shopping mall using process mining)

  • 김채영;용혜련;황현석
    • Journal of the Korea Academia-Industrial cooperation Society (한국산학기술학회논문지) / Vol. 21, No. 11 / pp. 777-787 / 2020
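  • The online market has grown rapidly with the development of the Internet and the spread of mobile devices. As shopping mall usage has increased explosively, research has been conducted on data-driven analysis of user behavior, personalized product recommendation, and service development. This paper therefore uses process mining to analyze the overall process of an online shopping mall and to identify the factors that influence users' purchases. The analysis used data from a large online shopping mall operated by an unnamed company, with R as the analysis tool. The results show that customer activity was most prominent in categories with event elements such as deep-discount sales and monthly prize giveaways. In contrast, the search, login, and campaign activities saw less activity than their importance warrants. These activities are very important because they can provide clues to customers' information and needs. Managing them, for example by refining related-search-term recommendations and offering coupons at login, is therefore considered necessary. Beyond the points discussed above, the paper proposes various business strategies for improving the shopping mall's competitiveness and increasing its profits.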
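
A common first step in process mining is building a directly-follows graph from an event log, counting how often one activity is immediately followed by another within a session. The sketch below shows that step only, under stated assumptions: the paper used R, whereas this is plain Python, and the session IDs, timestamps, and activity names (`view_item`, `purchase`) are hypothetical rather than taken from the study's data.

```python
from collections import Counter, defaultdict

def directly_follows(events):
    """Count how often activity a is directly followed by b within a session.

    `events` is an iterable of (session_id, timestamp, activity) tuples.
    """
    sessions = defaultdict(list)
    for session_id, timestamp, activity in events:
        sessions[session_id].append((timestamp, activity))
    edges = Counter()
    for trace in sessions.values():
        # Order each session by timestamp before pairing neighbors.
        ordered = [activity for _, activity in sorted(trace)]
        edges.update(zip(ordered, ordered[1:]))
    return edges

# Hypothetical weblog rows; all values here are made up for illustration.
log = [
    ("s1", 1, "login"), ("s1", 2, "search"),
    ("s1", 3, "view_item"), ("s1", 4, "purchase"),
    ("s2", 1, "search"), ("s2", 2, "view_item"),
    ("s3", 2, "view_item"), ("s3", 1, "login"), ("s3", 3, "purchase"),
]
dfg = directly_follows(log)
```

Edge counts such as `dfg[("search", "view_item")]` can then be compared across activities to see which transitions are frequent and which are rare.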

국내 소비자의 일본 패션제품에 대한 정치적 소비 연구 (Korean Consumers' Political Consumption of Japanese Fashion Products)

  • 최영현;이규혜
    • Journal of the Korean Society of Clothing and Textiles (한국의류학회지) / Vol. 44, No. 2 / pp. 295-309 / 2020
  • In 2019, Japan announced trade regulations against Korean products; consequently, sales of Japanese products in Korea dropped due to a boycott by Korean consumers. This study measured Korean consumers' political consumption behavior toward Japanese fashion products. Unstructured text data were collected from online media sources and consumer-posted sources such as blogs and SNS, and were processed using text mining techniques and semantic network analysis. The results identified boycotting of Japanese fashion products and buycotting of alternative products and Korean brands as forms of consumers' political consumption. Two brand cases were investigated in detail. Online text data from before and after the political action were compared, and significant changes in consumption as well as in emotional expressions were identified. Related industry sectors were also identified: the liquor, automobile, and tourism sectors were closely linked to the fashion sector in terms of boycotting. More "boycott" and "buycott" fashion brands (reflected in consumer attitudes and feelings) were detected in consumer-driven texts than in media-driven sources.

키워드 기반 주제중심 분석을 이용한 비정형데이터 처리 (Unstructured Data Processing Using Keyword-Based Topic-Oriented Analysis)

  • 고명숙
    • KIPS Transactions on Software and Data Engineering (정보처리학회논문지:소프트웨어 및 데이터공학) / Vol. 6, No. 11 / pp. 521-526 / 2017
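  • Data are not only diverse in format and vast in volume but also generated very quickly, so new management and analysis methods are required in place of conventional data processing. Text mining techniques can extract useful information from unstructured text written in human language in online documents on social networks. Identifying trends in messages about politics, the economy, and culture left on social media is a way of identifying which topics people are interested in. In this study, we performed text mining on online news about a given keyword using a topic-oriented analysis technique. Using LDA (Latent Dirichlet Allocation), we extracted information from web documents and analyzed which topics people are actually interested in with respect to the given keyword and which topics, among the related core values, the discussion spreads around.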

텍스트마이닝을 활용한 온라인 판매 여성 청바지 상품명에 나타난 키워드의 정보 특성 분석 (A Study on Keyword Information Characteristics of Product Names for Online Sales of Women's Jeans Using Text Mining)

  • 강여선
    • Journal of the Korean Society of Clothing and Textiles (한국의류학회지) / Vol. 47, No. 1 / pp. 35-51 / 2023
  • This study used text mining to extract 2,842 keywords from 7,397 product names and organized them into categories in order to analyze the characteristics of keywords appearing in the product names of jeans after 2020. The item category included denim, Chungbaji [청바지], and Ilja [일자], while the silhouette category included wide and bootcut. In addition, high-waist and banding comprised the making category, and the materials category consisted of napping, spandex, and soft blue. Denim surpassed the others in frequency, co-occurrence frequency, and centrality, and co-appeared with various other keywords. The co-appearance of item and silhouette keywords was also prominent, and many keyword combinations showed characteristics related to (a) high waist; (b) hemline detail; (c) rubber band; and (d) partial tearing. Furthermore, idiomatic expressions such as 'slim fit' and 'back tearing', which were not highlighted by co-occurrence frequency, were additionally confirmed through correlation. The product name analysis therefore effectively identified the detailed characteristics of the silhouette and making of jeans preferred by consumers.
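
The co-occurrence frequency and centrality measures mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the study's procedure: the three keyword lists are invented product names built from keywords the abstract mentions, and degree centrality is assumed as the centrality measure because the abstract does not say which one was used.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(keyword_lists):
    """Count unordered keyword pairs that appear in the same product name."""
    pairs = Counter()
    for keywords in keyword_lists:
        # sorted(set(...)) gives each pair one canonical (a, b) ordering.
        pairs.update(combinations(sorted(set(keywords)), 2))
    return pairs

def degree_centrality(pairs):
    """Share of the other keywords each keyword co-occurs with at least once."""
    neighbors = {}
    for a, b in pairs:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {}
    return {k: len(v) / (n - 1) for k, v in neighbors.items()}

# Hypothetical tokenized product names; not data from the paper.
names = [
    ["denim", "wide", "high-waist"],
    ["denim", "bootcut", "spandex"],
    ["denim", "wide", "banding"],
]
pairs = cooccurrence(names)
centrality = degree_centrality(pairs)
```

Because "denim" appears in every toy product name, it co-occurs with every other keyword and gets the highest centrality, which mirrors the kind of pattern the abstract reports.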

텍스트 마이닝을 활용한 고객 리뷰의 유용성 지수 개선에 관한 연구 (A Study on Classifications of Useful Customer Reviews by Applying Text Mining Approach)

  • 이홍주
    • Journal of Information Technology Services (한국IT서비스학회지) / Vol. 14, No. 4 / pp. 159-169 / 2015
  • Customer reviews are an important source for purchase decisions in online stores, and online stores have tried to surface useful reviews on product pages. To assess the usefulness of a review before enough users have voted on it, previous studies have drawn on diverse aspects of reviews, most often style and semantic information. This study tests diverse algorithms and datasets to identify a proper classification method and threshold for classifying useful reviews. Most prior research used a ratio-type helpfulness index, as Amazon.com does. However, TripAdvisor.com and Yelp.com use another type, a count-type helpfulness index, for which no proper threshold for classifying useful reviews has yet been established. This study used restaurant reviews from Yelp.com and their usefulness votes to build diverse datasets and applied text mining approaches to classify useful reviews. Random Forest, SVM, and GLMNET showed higher accuracy than the other approaches.
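
The difference between the two index types described above can be made concrete with a short sketch. The function names and both threshold values (0.6 and 3) are placeholders chosen for illustration; the abstract does not state which thresholds the study settled on.

```python
def ratio_helpfulness(helpful_votes, total_votes):
    """Ratio-type index: share of all votes that marked the review helpful."""
    return helpful_votes / total_votes if total_votes else None

def label_by_ratio(helpful_votes, total_votes, threshold=0.6):
    """Label a review useful when its helpful-vote ratio meets the threshold.

    Returns None when there are no votes yet. The 0.6 default is a
    placeholder, not a value reported in the paper.
    """
    ratio = ratio_helpfulness(helpful_votes, total_votes)
    return None if ratio is None else ratio >= threshold

def label_by_count(useful_votes, threshold=3):
    """Count-type index: only the number of 'useful' votes is available.

    The default of 3 is likewise a placeholder threshold.
    """
    return useful_votes >= threshold
```

With a count-type index there is no denominator, so a review with few votes cannot be told apart from one that many readers saw and ignored, which is one reason the threshold has to be chosen empirically.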

Analysis on Review Data of Restaurants in Google Maps through Text Mining: Focusing on Sentiment Analysis

  • Shin, Bee;Ryu, Sohee;Kim, Yongjun;Kim, Dongwhan
    • Journal of Multimedia Information System / Vol. 9, No. 1 / pp. 61-68 / 2022
  • Online reviews matter more as more people look up goods or places online before deciding to visit or purchase. However, such reviews usually come as short sentences or mere star ratings, which fail to give a general overview of customer preferences and decision factors. This study explored and broke down restaurant reviews found on Google Maps. After collecting and analyzing 5,427 reviews, we vectorized the importance of words using TF-IDF. We used a random forest machine learning algorithm to calculate the coefficient of positivity and negativity of the words used in reviews. As a result, we were able to build a dictionary of positive and negative sentiment words using each word's coefficient. We classified words into four major evaluation categories and derived insights into sentiment for each category. We believe the dictionary of review words and the analysis of the major evaluation categories can help prospective restaurant visitors read between the lines of restaurant reviews found on the Web.
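
For readers unfamiliar with the TF-IDF weighting mentioned above, here is a minimal sketch of one textbook variant. It assumes tf = count / document length and idf = ln(N / df); the study does not say which variant or library it used, and the three short reviews are invented.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses tf = count / doc length and idf = ln(N / df). Libraries often
    apply smoothed variants, so absolute values will differ.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({t: (c / len(doc)) * math.log(n_docs / df[t])
                        for t, c in counts.items()})
    return weights

# Made-up reviews for illustration; not data from the paper.
reviews = [
    "food was tasty".split(),
    "service was slow".split(),
    "tasty food friendly service".split(),
]
weights = tf_idf(reviews)
```

A word that appears in only one review, such as "slow", gets a higher weight there than a word spread across several reviews, such as "was".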

Understanding Brand Image from Consumer-generated Hashtags

  • Park, Keeyeon Ki-cheon;Kim, Hye-jin
    • Asia Marketing Journal / Vol. 22, No. 3 / pp. 71-85 / 2020
  • Social media has emerged as a major hub of engagement between brands and consumers in recent years, and allows user-generated content to serve as a powerful means of encouraging communication between the two sides. However, user-generated content is challenging to work with owing to its lack of structure and sheer volume. This study focuses on the hashtag, a metadata tag that reflects customers' brand perception on social media platforms. Online users share their knowledge and impressions using a wide variety of hashtags. We examine hashtags that co-occur with particular branded hashtags on the social media platform Instagram to derive insights about brand perception. We apply text mining and network analysis to identify how consumers on the site perceive brand images, which helps distinguish among the diverse personalities of the brands. This study highlights the value of hashtags in constructing brand personality in the context of online marketing.

Building Brand Loyalty and Recommendation through the Establishment of Brand Communities

  • Ulani Yunus;Yuniarti Rahayu;RA Christanti Taurina
    • Asian Journal for Public Opinion Research / Vol. 12, No. 3 / pp. 184-213 / 2024
  • This research investigates the intricate dynamics governing loyalty and recommendation behaviors. The primary objective is to discern the impact of community development on user loyalty and its subsequent influence on product recommendations, using the Indonesian online brand community of the software Micromine as a case study. The technology acceptance model, which argues that adoption is driven by perceived ease, and cognitive dissonance theory, which describes how individuals adjust to reduce discomfort, provide the framework for this study. Using a quantitative methodology, all 300 members of the online Micromine Indonesia community were surveyed. The findings reveal that community members establish emotional connections through active participation in community forums. Satisfaction with the software's solutions in mining endeavors is prevalent among Micromine community members. Regression analysis showed that a positive attitude toward the brand community was positively correlated with both brand loyalty (R² = .83) and the likelihood of recommending the brand (R² = .78). This supports both theories, in that brand community members adopt technology and reduce discomfort by supporting community activities.

랜드마크 윈도우 기반의 빈발 패턴 마이닝 기법의 분석 및 성능평가 (Analysis and Evaluation of Frequent Pattern Mining Technique based on Landmark Window)

  • 편광범;윤은일
    • Journal of Internet Computing and Services (인터넷정보학회논문지) / Vol. 15, No. 3 / pp. 101-107 / 2014
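  • This paper analyzes frequent pattern mining techniques based on the landmark window and evaluates their performance. We analyze the Lossy counting algorithm and the hMiner algorithm. hMiner, a state-of-the-art landmark algorithm, mines frequent patterns every time a transaction occurs. Landmark-based frequent pattern mining of the kind hMiner performs is therefore called online mining. We evaluate and analyze the performance of Lossy counting, an early landmark window mining algorithm, and hMiner, the latest one. As performance measures, we evaluate the mining time and the average processing time per transaction. We also evaluate maximum memory usage to assess the efficiency of the storage structures. Finally, to assess whether the algorithms can mine stably, we perform a scalability evaluation that varies the number of items in the database. The evaluation of the two algorithms shows that landmark window-based frequent pattern mining is suited to real-time systems but uses a large amount of memory.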
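
To give a feel for how Lossy counting processes a stream under a landmark window, the sketch below implements the basic single-item version of the algorithm as it is usually described (fixed-width buckets with pruning at each bucket boundary). It is an illustration under assumptions: the paper evaluates frequent pattern (itemset) mining, which this simplified version does not cover, and the function names, stream, and parameter values are made up.

```python
import math

def lossy_count(stream, epsilon):
    """Lossy counting over single items.

    Returns ({item: [count, max_undercount]}, items_seen).
    """
    width = math.ceil(1 / epsilon)        # bucket width
    entries = {}                          # item -> [count, delta]
    seen = 0
    for item in stream:
        seen += 1
        bucket = math.ceil(seen / width)  # id of the current bucket
        if item in entries:
            entries[item][0] += 1
        else:
            # delta bounds how many occurrences may have been missed.
            entries[item] = [1, bucket - 1]
        if seen % width == 0:             # bucket boundary: prune rare items
            entries = {k: v for k, v in entries.items()
                       if v[0] + v[1] > bucket}
    return entries, seen

def frequent_items(entries, seen, support, epsilon):
    """Items whose tracked count is at least (support - epsilon) * seen."""
    return {k for k, (count, _) in entries.items()
            if count >= (support - epsilon) * seen}

# Made-up stream in which "a" is the only frequent item.
stream = ["a", "b", "a", "c", "a", "d", "a", "e"]
entries, seen = lossy_count(stream, epsilon=0.25)
result = frequent_items(entries, seen, support=0.5, epsilon=0.25)
```

A smaller `epsilon` widens the buckets and delays pruning, which tightens the count error at the cost of keeping more entries in memory between boundaries.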