• Title/Summary/Keyword: Online data mining

Development of Online Fashion Thesaurus and Taxonomy for Text Mining (텍스트마이닝을 위한 패션 속성 분류체계 및 말뭉치 웹사전 구축)

  • Seyoon Jang;Ha Youn Kim;Songmee Kim;Woojin Choi;Jin Jeong;Yuri Lee
    • Journal of the Korean Society of Clothing and Textiles / v.46 no.6 / pp.1142-1160 / 2022
  • Text data plays a significant role in understanding and analyzing trends in the consumer, business, and social sectors. Text analysis requires a corpus that reflects domain-specific knowledge, but the fashion field lacks a sufficient professional corpus. This study aims to develop a taxonomy and thesaurus that reflect the specialized nature of fashion products. To this end, about 100,000 fashion vocabulary terms were collected by crawling text data from WGSN, Pantone, and online platforms; the text was then extracted through preprocessing with Python. The taxonomy is composed of seven attributes of clothes: items, silhouettes, details, styles, colors, textiles, and patterns/prints. The corpus was completed by processing synonyms of terms drawn from fashion references such as dictionaries. Finally, 10,294 vocabulary words, including 1,956 standard Korean words, were classified in the taxonomy. All data were then built into a web dictionary system. Quantitative and qualitative performance tests of the results were conducted through expert reviews. The performance of the thesaurus was also verified by comparing its text mining results with those obtained using a previously developed corpus. This study contributes to a text data standard and enables meaningful text mining analysis in the fashion field.
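
As a rough illustration of how a thesaurus with synonym handling can support text mining, the sketch below maps synonyms found in a text to a standard term and its taxonomy attribute. The entries in `THESAURUS` and the `normalize` function are hypothetical and are not taken from the study's dictionary or code.

```python
import re

# Hypothetical mini-thesaurus: synonym -> (standard term, taxonomy attribute).
THESAURUS = {
    "tee": ("t-shirt", "item"),
    "t-shirt": ("t-shirt", "item"),
    "polka dot": ("dot", "pattern/print"),
    "dotted": ("dot", "pattern/print"),
    "navy": ("navy", "color"),
}


def normalize(text):
    """Return the sorted (standard term, attribute) pairs found in text."""
    text = text.lower()
    found = set()
    # Try longer synonyms first so multi-word entries win over their parts.
    for synonym in sorted(THESAURUS, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(synonym) + r"(?!\w)"
        if re.search(pattern, text):
            found.add(THESAURUS[synonym])
            text = re.sub(pattern, " ", text)  # consume the matched span
    return sorted(found)


print(normalize("Navy polka dot tee"))
```

Matching longest synonyms first is one simple design choice; a production system would more likely rely on a tokenizer or morphological analyzer suited to Korean text.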

EXTENDED ONLINE DIVISIVE AGGLOMERATIVE CLUSTERING

  • Musa, Ibrahim Musa Ishag;Lee, Dong-Gyu;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / 2008.10a / pp.406-409 / 2008
  • Clustering data streams is important in many applications, such as sensor networks. Existing hierarchical methods follow a semi-fuzzy clustering approach that yields duplicate clusters. To solve this problem, we propose an extended online divisive agglomerative clustering method for data streams. It builds a tree-like, top-down hierarchy of clusters that evolves with the data stream, using a geometric time frame for snapshots. It enhances Online Divisive Agglomerative Clustering (ODAC) with a pruning strategy that avoids duplicate clusters. Its main feature is that update time and memory usage are independent of the number of examples in the data stream. It can be used for clustering sensor data, network monitoring, and web click streams.

Causal model analysis between quantity and quality for deriving ranking model of Online reviews (온라인리뷰의 랭킹모델링을 위한 양과 질의 인과모형 분석)

  • Lee, Changyong;Kim, Keunhyung
    • The Journal of Information Systems / v.28 no.1 / pp.1-16 / 2019
  • Purpose: This study analyzes the causal relationship between quantity and quality in order to derive a ranking model for online reviews, and draws implications for retrieving online reviews more effectively. Design/methodology/approach: We collected online reviews from Tripadvisor, a world-famous tourism website. Using opinion mining tools available as R packages, we transformed the natural-language reviews into quantified data consisting of positive opinions, negative opinions, modification opinions, review lengths, and grade scores. We then ran correlation and regression analyses on the data. Findings: The empirical analysis confirmed that review length influenced positive opinion, negative opinion, and modification opinion. It also confirmed that negative opinion and modification opinion influenced the grade score.

Intelligent Marketing and Merchandising Techniques for an Internet Shopping Mall (인터넷 쇼핑몰에서의 지능화된 마케팅과 상품화 계획 기법)

  • Ha, Sung-Ho;Park, Sang-Chan
    • Asia pacific journal of information systems / v.12 no.3 / pp.71-88 / 2002
  • In this paper, intelligent marketing and merchandising methods utilizing data mining and Web mining techniques are proposed to help online retailers survive and gain competitive advantage in a highly competitive environment. The first part of this paper explains the procedures of one-to-one marketing based on customer relationship management (CRM) techniques and the generation of personalized recommendation lists. The second part illustrates Web merchandising methods utilizing data mining techniques, such as association and sequential pattern mining. We expect that our Web marketing and merchandising methods will both provide a currently operating Internet shopping mall with more selling opportunities and give more useful product information to customers.
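
Association mining, mentioned in the abstract above, is usually described in terms of the support and confidence of a rule. The sketch below computes both for a toy set of shopping baskets. The data and function names are illustrative assumptions, not the paper's procedure.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0


# Hypothetical baskets from an online shop.
baskets = [
    {"shirt", "tie"},
    {"shirt", "tie", "belt"},
    {"shirt", "belt"},
    {"tie"},
]

print(support(baskets, {"shirt", "tie"}))       # share of baskets with both items
print(confidence(baskets, {"shirt"}, {"tie"}))  # how often a tie accompanies a shirt
```

A rule such as shirt -> tie would typically be kept for merchandising only if both values exceed thresholds chosen by the analyst.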

Brand Fandom Dynamic Analysis Framework based on Customer Data in Online Communities

  • Yu Cheng;Sangwoo Park;Inseop Lee;Changryong Kim;Sanghun Sul
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.8 / pp.2222-2240 / 2023
  • Brand fandom refers to a collection of consumers with strong emotions toward a brand. Studying the dynamics of brand fandom can help brands understand which services or strategies influence their consumers to become a part of brand fandom. However, existing literature on fandom in the last three decades has mainly used qualitative methods, and there is still a lack of research on fandom using quantitative methods. Specifically, previous studies lack a framework for locating fandoms from online textual data and analyzing their dynamics. This study proposes a framework for exploring brand fandom dynamics based on online textual data. This framework consists of four phases based on the design thinking model: Preparing Data, Defining Fandom Categories, Generating Fandom Dynamics, and Analyzing Fandom Dynamics. This framework uses techniques such as social network analysis and process mining, combined with brand personality theory. We demonstrate the applicability of this framework using case studies of two Korean home appliance brands. The dataset contains 14,593 posts by consumers in 374 online communities. The results show that the proposed framework can analyze brand fandom dynamics using textual customer data. Our study contributes to the interdisciplinary research at the intersection of data-driven service design and consumer culture quantification.

Design and Implementation of a USN Middleware for Context-Aware and Sensor Stream Mining

  • Jin, Cheng-Hao;Lee, Yang-Koo;Lee, Seong-Ho;Yun, Un-il;Ryu, Keun-Ho
    • Spatial Information Research / v.19 no.1 / pp.127-133 / 2011
  • Recently, with advances in sensor techniques and network computing, the Ubiquitous Sensor Network (USN) has received a lot of attention from various communities. The sensor nodes distributed in a sensor network tend to continuously generate a large amount of data, called stream data. Sensor stream data arrives in an online manner, so it is high-speed, real-time, and unbounded, and it requires fast processing to keep results up to date. Data streams have many application domains, such as traffic analysis, physical distribution, and U-healthcare. There is therefore a strong need for a USN middleware that processes such online stream data and provides corresponding services to diverse applications. In this paper, we propose a novel USN middleware that provides users with both context-aware services and meaningful sequential patterns. The proposed middleware focuses mainly on location-based applications that use stream location data. We also present its implementation. Using the proposed USN middleware reduces the development cost of providing context-aware services and stream sequential patterns, mainly in location-based applications.

Development of Hybrid Recommender System Using Review Data Mining: Kindle Store Data Analysis Case (리뷰 데이터 마이닝을 이용한 하이브리드 추천시스템 개발: Amazon Kindle Store 데이터 분석사례)

  • Yihua Zhang;Qinglong Li;Ilyoung Choi;Jaekyeong Kim
    • Information Systems Review / v.23 no.1 / pp.155-172 / 2021
  • With the recent increase in online product purchases, recommender systems that recommend products based on users' preferences continue to be studied. A recommender system provides personalized product recommendation services to users. Collaborative Filtering (CF), which uses user ratings on products, is one of the most widely used recommendation algorithms. In CF, the item-based method uses the ratings a user left on purchased products to obtain the similarity between purchased and unpurchased products. CF takes a lot of time to calculate the similarity between products, and it takes even more time with text-based big data such as Amazon store review data. This paper suggests a hybrid recommendation system that uses a 2-phase methodology and text data mining to calculate the similarity between products easily and quickly. To this end, we collected about 980,000 online consumer ratings and reviews from the online commerce store Amazon Kindle Store. Several experiments confirmed that the suggested hybrid recommendation system, which reflects both user ratings and review data, achieved similar recommendation time but higher accuracy compared to the CF-based benchmark recommender systems. The suggested system is therefore expected to increase user satisfaction and sales.
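
The item-based similarity step that the abstract describes as costly can be sketched as a cosine similarity between the rating vectors of two items. This is a minimal, generic illustration of item-based CF, not the paper's 2-phase hybrid method. The `ratings` data and all function names are assumptions.

```python
import math

# Hypothetical ratings: user -> {item: rating}.
ratings = {
    "u1": {"A": 5, "B": 4},
    "u2": {"A": 4, "B": 5, "C": 1},
    "u3": {"B": 2, "C": 5},
}


def item_vector(ratings, item):
    """Collect the ratings given to one item, keyed by user."""
    return {user: r[item] for user, r in ratings.items() if item in r}


def cosine(v1, v2):
    """Cosine similarity of two sparse rating vectors."""
    dot = sum(v1[u] * v2[u] for u in set(v1) & set(v2))
    n1 = math.sqrt(sum(x * x for x in v1.values()))
    n2 = math.sqrt(sum(x * x for x in v2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


def most_similar(ratings, item):
    """Return the other item whose rating vector is closest to item's."""
    items = {i for r in ratings.values() for i in r}
    target = item_vector(ratings, item)
    scores = {
        other: cosine(target, item_vector(ratings, other))
        for other in items
        if other != item
    }
    return max(scores, key=scores.get)


print(most_similar(ratings, "A"))
```

Because every pair of items is compared, the cost grows roughly quadratically with the number of items, which is the kind of overhead the abstract points to.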

A Clustering Algorithm for Sequence Data Using Rough Set Theory (러프 셋 이론을 이용한 시퀀스 데이터의 클러스터링 알고리즘)

  • Oh, Seung-Joon;Park, Chan-Woong
    • Journal of the Korea Society of Computer and Information / v.13 no.2 / pp.113-119 / 2008
  • The World Wide Web is a dynamic collection of pages that includes a huge number of hyperlinks and huge volumes of usage information. The resulting growth in online information, combined with the largely unstructured nature of web data, necessitates the development of powerful web data mining tools. Recently, a number of approaches have been developed for dealing with specific aspects of web usage mining for the purpose of automatically discovering user profiles. We analyze sequence data, such as web logs, protein sequences, and retail transactions. We propose a clustering algorithm for sequence data using rough set theory. We present a simple example and experimental results using a splice dataset and synthetic datasets.
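
The abstract does not spell out the algorithm, so the sketch below shows only the basic rough-set building block that such methods rest on: the lower and upper approximations of a target set with respect to a partition into indiscernibility classes. The sequence identifiers and the partition are hypothetical.

```python
def approximations(classes, target):
    """Return the rough-set (lower, upper) approximations of target.

    classes: iterable of sets that partition the universe.
    target:  the set to approximate.
    """
    target = set(target)
    lower, upper = set(), set()
    for cls in classes:
        cls = set(cls)
        if cls <= target:   # class lies entirely inside the target
            lower |= cls
        if cls & target:    # class overlaps the target
            upper |= cls
    return lower, upper


# Hypothetical grouping of sequences that are indistinguishable from each other.
classes = [{"s1", "s2"}, {"s3"}, {"s4", "s5"}]
lower, upper = approximations(classes, {"s1", "s2", "s4"})
print(sorted(lower), sorted(upper))
```

Elements in the upper but not the lower approximation form the boundary region, that is, sequences whose membership in the target set is uncertain.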

Exploring Subcultural Capital in Sneakerhead Culture -A Netnographic Investigation- (스니커헤드 하위문화에 대한 네트노그라피 분석 -하위문화자본 개념을 중심으로-)

  • Solhwi Kim;Eunhyuk Yim
    • Journal of the Korean Society of Clothing and Textiles / v.47 no.5 / pp.943-958 / 2023
  • This study explores the sneakerhead subculture through the lens of subcultural capital, primarily focusing on online community interactions. The analysis utilizes text mining techniques and netnographic research methods to examine textual data extracted from the online sneakerhead community and aims to elucidate manifestations of subcultural capital within the subculture. The findings underscore several key points: Firstly, shared experiences cultivated by the collective consciousness of subcultural capital foster solidarity among members. Secondly, ongoing validation of authenticity and comprehension of sneakers' cultural significance are member requirements. Thirdly, exhibiting greater levels of subcultural capital empowers members, resulting in hierarchical structures both within and beyond the community. Fourthly, resale-driven sneaker commercialization yields positive outcomes, including individual profit and cultural expansion, yet also brings negative consequences, such as market distortion and intra-community conflict. Lastly, the online community fills a pivotal role in dictating subcultural trends, effectively functioning as an institutional network. Given sneakers' enduring status as a fashion phenomenon, further examination of this realm is warranted.

Dynamic Subspace Clustering for Online Data Streams (온라인 데이터 스트림에서의 동적 부분 공간 클러스터링 기법)

  • Park, Nam Hun
    • Journal of Digital Convergence / v.20 no.2 / pp.217-223 / 2022
  • Subspace clustering for online data streams requires a large amount of memory resources as all subsets of data dimensions must be examined. In order to track the continuous change of clusters for a data stream in a finite memory space, in this paper, we propose a grid-based subspace clustering algorithm that effectively uses memory resources. Given an n-dimensional data stream, the distribution information of data items in data space is monitored by a grid-cell list. When the frequency of data items in the grid-cell list of the first level is high and it becomes a unit grid-cell, the grid-cell list of the next level is created as a child node in order to find clusters of all possible subspaces from the grid-cell. In this way, a maximum n-level grid-cell subspace tree is constructed, and a k-dimensional subspace cluster can be found at the kth level of the subspace grid-cell tree. Through experiments, it was confirmed that the proposed method uses computing resources more efficiently by expanding only the dense space while maintaining the same accuracy as the existing method.
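
The first-level step described above, monitoring how often stream items fall into each grid cell and flagging dense cells, can be sketched as follows. This covers only the first level of one-dimensional cells. The class name `GridCellList`, the fixed `cell_width`, and the `min_support` threshold are assumptions made for illustration, and the expansion of dense cells into child levels is omitted.

```python
from collections import Counter


class GridCellList:
    """Counts stream items per (dimension, cell index) on a 1-D grid."""

    def __init__(self, cell_width, min_support):
        self.cell_width = cell_width    # assumed fixed width of every cell
        self.min_support = min_support  # density threshold in [0, 1]
        self.counts = Counter()
        self.total = 0

    def update(self, point):
        """Fold one n-dimensional data item into the per-dimension counts."""
        self.total += 1
        for dim, value in enumerate(point):
            cell = int(value // self.cell_width)
            self.counts[(dim, cell)] += 1

    def dense_cells(self):
        """Cells whose relative frequency reaches min_support."""
        return sorted(
            cell
            for cell, n in self.counts.items()
            if n / self.total >= self.min_support
        )


grid = GridCellList(cell_width=1.0, min_support=0.5)
for item in [(0.1, 5.2), (0.2, 9.7), (0.3, 1.1), (7.5, 3.3)]:
    grid.update(item)
print(grid.dense_cells())
```

In the approach summarized in the abstract, a dense cell like this would become the parent of a next-level grid-cell list over the remaining dimensions, so memory is spent only where the data is dense.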