• Title/Summary/Keyword: Stream Data Mining


Method for Preference Score Based on User Behavior (웹 사이트 이용 고객의 행동 정보를 기반으로 한 고객 선호지수 산출 방법)

  • Seo, Dong-Yal;Kim, Doo-Jin;Yun, Jeong-Ki;Kim, Jae-Hoon;Moon, Kang-Sik;Oh, Jae-Hoon
    • CRM연구 / v.4 no.1 / pp.55-68 / 2011
  • With the growth of Web services that draw on a wide variety of web content, studies on user experience and personalization based on web usage have recently attracted much attention. Most personalization analyses have been carried out on existing stored data, primarily using databases and statistical models. These approaches are difficult to apply in a timely manner and are limited in reflecting true behavioral characteristics, because the data record only the results of customers' behavior. Recent studies and commercial web analytics products, however, try to track and analyze every action from landing to exit in order to provide personalized service. In this study, by analyzing customers' click-stream behavior, we define the U-Score (Usage Score), P-Score (Preference Score), and M-Score (Mania Score) to indicate a variety of customer preferences. With these three indicators, we can identify customer preferences more precisely, provide in-depth customer reports and customer relationship management, and support personalized recommender services.


Efficient Dynamic Weighted Frequent Pattern Mining by using a Prefix-Tree (Prefix-트리를 이용한 동적 가중치 빈발 패턴 탐색 기법)

  • Jeong, Byeong-Soo;Farhan, Ahmed
    • The KIPS Transactions:PartD / v.17D no.4 / pp.253-258 / 2010
  • Traditional frequent pattern mining assigns an equal profit/weight value to every item. Weighted Frequent Pattern (WFP) mining has become an important research issue in data mining and knowledge discovery because it considers different weights for different items. Existing algorithms in this area are based on fixed weights. In real-world scenarios, however, the price/weight/importance of a pattern may vary frequently due to unavoidable circumstances. Tracking these dynamic changes is necessary in application areas such as retail market basket analysis and web click stream management. In this paper, we propose a novel concept of dynamic weight and an algorithm, DWFPM (dynamic weighted frequent pattern mining). Our algorithm can handle situations in which the price/weight of a pattern varies dynamically. It scans the database exactly once and is therefore also suitable for real-time data processing. To our knowledge, this is the first research work to mine weighted frequent patterns using dynamic weights. Extensive performance analyses show that our algorithm is very efficient and scalable for WFP mining using dynamic weights.
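
The idea of weights that change over time can be illustrated with a small single-pass sketch. It is not the DWFPM algorithm from the paper: the abstract does not give the weighted-support formula or the prefix-tree layout, so the code assumes (hypothetically) that the data arrive in batches, that each batch carries its own item weight table, and that a pattern's weighted support is the sum over batches of its average item weight times its frequency in that batch. The function name `dynamic_weighted_support` and the `max_len` cutoff are illustrative only.

```python
from collections import defaultdict
from itertools import combinations


def dynamic_weighted_support(batches, max_len=2):
    """Accumulate weighted support per pattern in one pass over the batches.

    batches: list of (weights, transactions) pairs, where weights maps
    item -> weight valid for that batch only (assumed input format).
    """
    totals = defaultdict(float)
    for weights, transactions in batches:
        # Count every pattern of up to max_len items inside this batch.
        counts = defaultdict(int)
        for tx in transactions:
            items = sorted(set(tx))
            for k in range(1, max_len + 1):
                for pattern in combinations(items, k):
                    counts[pattern] += 1
        # Weight the batch-local frequency with the batch-local weights.
        for pattern, count in counts.items():
            avg_weight = sum(weights[i] for i in pattern) / len(pattern)
            totals[pattern] += avg_weight * count
    return dict(totals)


batches = [
    ({"a": 0.5, "b": 1.0}, [["a", "b"], ["a"]]),  # earlier weights
    ({"a": 1.0, "b": 0.5}, [["a", "b"]]),         # weights after a change
]
support = dynamic_weighted_support(batches)
```

A pattern would then be reported as weighted-frequent when its accumulated value meets a chosen threshold. Enumerating all item combinations is only practical for short transactions, which is the cost a prefix-tree structure is meant to avoid.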

Streaming Decision Tree for Continuity Data with Changed Pattern (패턴의 변화를 가지는 연속성 데이터를 위한 스트리밍 의사결정나무)

  • Yoon, Tae-Bok;Sim, Hak-Joon;Lee, Jee-Hyong;Choi, Young-Mee
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.1 / pp.94-100 / 2010
  • Data mining is mainly used for pattern extraction and information discovery from collected data. However, previous methods have difficulty reflecting patterns that change over time. In this paper, we introduce the Streaming Decision Tree (SDT), which analyzes data that is continuous, large in scale, and subject to changing patterns. SDT divides continuous data into blocks and extracts rules using a decision tree learning method. The extracted rules are combined by considering time of occurrence, frequency, and contradiction. In experiments, we applied SDT to time series data and confirmed reasonable results.

Iceberg Query Evaluation Technical Using a Cuboid Prefix Tree (큐보이드 전위트리를 이용한 빙산질의 처리)

  • Han, Sang-Gil;Yang, Woo-Sock;Lee, Won-Suk
    • Journal of KIISE:Databases / v.36 no.3 / pp.226-234 / 2009
  • A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Because of these characteristics, it is impossible to save all the data elements of a data stream. It is therefore necessary to define a new synopsis structure that stores summary information about the stream. For this purpose, this paper proposes a cuboid prefix tree that can be effectively employed in evaluating an iceberg query over data streams. A cuboid prefix tree stores only those itemsets that consist of the grouping attributes used in a GROUP BY query. In addition, a cuboid prefix tree can compute multiple iceberg queries simultaneously by sharing their common sub-expressions. A series of experiments verifies that a cuboid prefix tree evaluates an iceberg query over an infinitely generated data stream while efficiently reducing memory usage and processing time.
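
An iceberg query is a GROUP BY aggregate that returns only the groups whose aggregate value reaches a threshold. The sketch below shows how one prefix tree over an ordered list of grouping attributes can answer several such queries at once, because the counts for a coarser grouping sit on the path to the finer one. It is a minimal illustration under stated assumptions, not the paper's cuboid prefix tree: the class names `Node` and `PrefixCounter` are hypothetical, counts are exact, and nothing is pruned, so memory grows with the number of distinct groups.

```python
class Node:
    """One grouping-attribute value together with its running count."""

    def __init__(self):
        self.count = 0
        self.children = {}


class PrefixCounter:
    """Counts stream records along a fixed order of grouping attributes."""

    def __init__(self, attributes):
        self.attributes = attributes
        self.root = Node()

    def add(self, record):
        # Walk one path per record, incrementing every prefix on the way.
        node = self.root
        for attr in self.attributes:
            node = node.children.setdefault(record[attr], Node())
            node.count += 1

    def iceberg(self, depth, threshold):
        """Groups over the first `depth` attributes with COUNT >= threshold."""
        results = {}

        def walk(node, prefix):
            if len(prefix) == depth:
                if node.count >= threshold:
                    results[prefix] = node.count
                return
            for value, child in node.children.items():
                walk(child, prefix + (value,))

        walk(self.root, ())
        return results


counter = PrefixCounter(["region", "item"])
for region, item in [("east", "x"), ("east", "x"), ("east", "y"), ("west", "x")]:
    counter.add({"region": region, "item": item})
by_region = counter.iceberg(depth=1, threshold=2)
by_region_item = counter.iceberg(depth=2, threshold=2)
```

Here `by_region` corresponds to `GROUP BY region HAVING COUNT(*) >= 2` and `by_region_item` to the same query grouped by `region, item`; both are read from the same tree without a second pass over the stream.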

Customer Behavior Data Model using User Profile Analysis

  • Jung, Yong Gyu;Lee, Agatha;Lee, Jeong Chan;Lee, Young Dae
    • International Journal of Advanced Culture Technology / v.1 no.2 / pp.13-17 / 2013
  • Today, most companies face numerous issues in taking advantage of the data within their organizations. Modeling techniques can be described using profile and historical log data as a data mining tool. Many data mining tools increasingly cover the data entry, research, processing, modeling, and reporting components in the form of easy-to-use icons. A visual data mining process can create a data stream. In this paper, customer behavior on pages or products is predicted using history profile analysis, with navigation items used to predict unknown features.


Ubiquitous Data Mining Using Hybrid Support Vector Machine (변형된 Support Vector Machine을 이용한 유비쿼터스 데이터 마이닝)

  • Jun Sung-Hae
    • Journal of the Korean Institute of Intelligent Systems / v.15 no.3 / pp.312-317 / 2005
  • Ubiquitous computing has affected politics, economics, society, culture, education, and so forth. For effective management of huge ubiquitous network environments, the various computers connected to the networks have to make optimal decisions automatically and intelligently. Currently, data mining is used effectively in many areas to construct intelligent systems. We propose a hybrid support vector machine for ubiquitous data mining to realize an intelligent ubiquitous computing environment. Large amounts of data are collected by sensor networks in a ubiquitous computing environment, and these data contain much noise. The aim of the proposed method is to eliminate noise from the stream data produced by sensor networks. In experiments, we verified the performance of the proposed method using simulated data for ubiquitous sensor networks.

Implementation of Customer Behavior Evaluation System Using Real-time Web Log Stream Data (실시간 웹로그 스트림데이터를 이용한 고객행동평가시스템 구현)

  • Lee, Hanjoo;Park, Hongkyu;Lee, Wonsuk
    • The Journal of Korean Institute of Information Technology / v.16 no.12 / pp.1-11 / 2018
  • Recently, the online shopping market has continued to grow rapidly, so it is important to provide customized services based on customer behavior evaluation. Existing systems only provide analysis data on consumers' profiles and behaviors, and their real-time processing is limited by disk-based mining. Applying existing systems to web services that require real-time processing and analysis therefore raises problems of accuracy and system performance. The system proposed in this paper analyzes web click log streams generated in real time to calculate the concentration level on specific products, finds interested customers who are likely to purchase those products, and provides intensive promotions to them. We also verify the efficiency and accuracy of the proposed system.

Evaluation of Water Quality Prediction Models at Intake Station by Data Mining Techniques (데이터마이닝 기법을 적용한 취수원 수질예측모형 평가)

  • Kim, Ju-Hwan;Chae, Soo-Kwon;Kim, Byung-Sik
    • Journal of Environmental Impact Assessment / v.20 no.5 / pp.705-716 / 2011
  • For the efficient discovery of knowledge and information from observed systems, data mining techniques can be a useful tool for predicting water quality at intake stations in rivers. Water quality at an intake station can deteriorate in the dry season due to insufficient flow. This calls for additional outflow from the dam, since some of the deterioration can be attenuated by operating the dam reservoir to control outflow in light of predicted water quality. A seasonal occurrence of high ammonia nitrogen ($NH_3$-N) concentrations has hampered the chemical treatment processes of a water plant on the Geum river. Monthly flow allocation from the upstream dam is important for downstream $NH_3$-N control. In this study, water quality prediction models based on multiple regression (MR), artificial neural network, and data mining methods were developed to understand water quality variation and to support dam operations by providing predicted $NH_3$-N concentrations at the intake station. The models were calibrated with eight years of monthly data and verified with another two years of independent data. In these models, the $NH_3$-N concentration at the next time step depends on the dam outflow and on river water quality at the previous time step, such as alkalinity, temperature, and $NH_3$-N. Model performance is compared and evaluated by error analysis and by statistical characteristics such as the correlation and determination coefficients between observed and predicted water quality. These data mining techniques are expected to offer more efficient data-driven tools at the modelling stage, and the models are found to predict water quality well in stream and river systems.

A Review of Window Query Processing for Data Streams

  • Kim, Hyeon Gyu;Kim, Myoung Ho
    • Journal of Computing Science and Engineering / v.7 no.4 / pp.220-230 / 2013
  • In recent years, progress in hardware technology has made it possible to monitor many events in real time. The volume of incoming data may be so large that monitoring every individual data item becomes intractable. Revisiting any particular record can also be impossible in this environment. Therefore, many database schemes, such as aggregation, join, frequent pattern mining, and indexing, become more challenging in this context. This paper surveys previous efforts to resolve these issues in processing data streams. The emphasis is on specifying and processing sliding window queries, which are supported in many stream processing engines. We also review related work on stream query processing, including synopsis structures, plan sharing, operator scheduling, load shedding, and disorder control.
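
A sliding window query restricts an aggregate to the most recent part of the stream, so that each record is handled once on arrival and once on expiry rather than being revisited. The following is a minimal sketch of a time-based sliding-window sum, not taken from the survey: the class name `SlidingWindowSum` and the parameter `range_seconds` are hypothetical, and the code assumes records arrive in timestamp order (out-of-order input would need the disorder control the abstract mentions).

```python
from collections import deque


class SlidingWindowSum:
    """Running sum over records newer than `range_seconds` (assumed in-order input)."""

    def __init__(self, range_seconds):
        self.range_seconds = range_seconds
        self.window = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0

    def insert(self, timestamp, value):
        # Add the new record, then expire everything that fell out of range.
        self.window.append((timestamp, value))
        self.total += value
        while self.window[0][0] <= timestamp - self.range_seconds:
            _, expired = self.window.popleft()
            self.total -= expired
        return self.total


window = SlidingWindowSum(range_seconds=10)
sums = [window.insert(t, v) for t, v in [(0, 1), (5, 2), (10, 4), (16, 8)]]
```

Keeping a running total makes each insert amortized constant time, at the cost of storing every record that is still inside the window.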

User Identification and Session completion in Input Data Preprocessing for Web Mining (웹 마이닝을 위한 입력 데이타의 전처리과정에서 사용자구분과 세션보정)

  • 최영환;이상용
    • Journal of KIISE:Software and Applications / v.30 no.9 / pp.843-849 / 2003
  • Web usage mining is a data mining technique that analyzes web users' usage patterns from large web logs. To use web usage mining, users and user sessions must be classified correctly during preprocessing, but they cannot be classified completely from log files in the standard web log format alone. Classifying users and user sessions is complicated by many problems, such as local caches, firewalls, ISPs, user privacy, and cookies, and there is currently no definitive method for solving them. The local cache problem in particular is the most difficult one for classifying user sessions, which serve as input to web mining systems. In this paper, we propose a heuristic method that solves the local cache problem using only server-side click stream data, such as the referrer log, agent log, and access log, and that classifies user sessions and completes sessions.
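
The preprocessing steps named in this abstract can be sketched with widely used heuristics. The code below is not the authors' method, whose details the abstract does not give. It assumes, hypothetically, that a user is identified by the (IP address, user agent) pair, that a session ends after 30 minutes of inactivity, and that a request whose referrer is an earlier page of the session, but not the immediately preceding one, implies a revisit served from the local cache. The function name `sessionize`, the `TIMEOUT` value, and the log tuple layout are all assumptions.

```python
from collections import defaultdict

TIMEOUT = 30 * 60  # assumed inactivity limit in seconds


def sessionize(log):
    """Group requests into per-user sessions and fill in cached page views.

    log: iterable of (timestamp, ip, agent, url, referrer) tuples sorted by
    timestamp; an empty referrer means the field was missing.
    """
    sessions = defaultdict(list)  # user -> list of sessions (lists of URLs)
    last_seen = {}
    for ts, ip, agent, url, referrer in log:
        user = (ip, agent)
        # Start a new session for a new user or after a long pause.
        if user not in last_seen or ts - last_seen[user] > TIMEOUT:
            sessions[user].append([])
        current = sessions[user][-1]
        # The referrer was visited earlier but is not the previous page:
        # infer that the user went back to it through the local cache.
        if current and referrer and current[-1] != referrer and referrer in current:
            current.append(referrer)
        current.append(url)
        last_seen[user] = ts
    return dict(sessions)


log = [
    (0, "1.1.1.1", "ua", "/a", ""),
    (10, "1.1.1.1", "ua", "/b", "/a"),
    (20, "1.1.1.1", "ua", "/c", "/a"),  # back to /a from cache, then on to /c
    (5000, "1.1.1.1", "ua", "/a", ""),  # more than 30 minutes later
]
result = sessionize(log)
```

In this example the first session is completed to `/a, /b, /a, /c`, because the server never logged the cached return to `/a`, and the late request opens a second session.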