• Title/Summary/Keyword: web log mining

A Study on the Development of Internet Purchase Support Systems Based on Data Mining and Case-Based Reasoning (데이터마이닝과 사례기반추론 기법에 기반한 인터넷 구매지원 시스템 구축에 관한 연구)

  • 김진성
    • Journal of the Korean Operations Research and Management Science Society / v.28 no.3 / pp.135-148 / 2003
  • In this paper we introduce an Internet-based purchase support system using data mining and case-based reasoning (CBR). Internet business activity that involves the end user is undergoing a significant revolution. The ability to track users' browsing behavior has brought the vendor and the end customer closer than ever before. It is now possible for a vendor to personalize a product message for individual customers at massive scale. Most earlier researchers in this area used data mining techniques to predict the customer's future behavior and to improve the frequency of repurchase. Data mining can be defined here as efficiently discovering association rules from large collections of data. However, the basic association rule-based data mining technique is not flexible: if there are no inference rules for tracking the customer's future behavior, an association rule-based data mining system cannot present any further information. To resolve this problem, we combine association rule-based data mining with a CBR mechanism. CBR is used to reason about and search for customer preferences and to learn from cases. The hybrid purchase support mechanism based on data mining and CBR can reflect both association rule-based logical inference and case-based information reuse. Web-log data gathered from a real-world Internet shopping mall are used to illustrate the quality of the proposed system.
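
The hybrid idea in this abstract (try an association rule first, fall back to reusing a similar past case when no rule applies) can be sketched roughly as follows. This is a minimal illustration, not the paper's system: the function names, the thresholds, the single-item rule form, and the use of Jaccard similarity for case retrieval are all assumptions made for the example.

```python
from collections import Counter
from itertools import permutations


def mine_rules(baskets, min_support=2, min_confidence=0.6):
    """Mine single-item rules a -> b; keep the most confident rule per antecedent."""
    item_count = Counter()
    pair_count = Counter()
    for basket in baskets:
        items = set(basket)
        item_count.update(items)
        # Ordered pairs (a, b) count how often a and b occur together.
        pair_count.update(permutations(sorted(items), 2))
    rules = {}
    for (a, b), n in pair_count.items():
        confidence = n / item_count[a]
        if n >= min_support and confidence >= min_confidence:
            if a not in rules or confidence > rules[a][1]:
                rules[a] = (b, confidence)
    return rules


def jaccard(x, y):
    """Set overlap used here as a stand-in case-similarity measure (assumption)."""
    x, y = set(x), set(y)
    return len(x & y) / len(x | y) if x | y else 0.0


def recommend(session, baskets, rules):
    """Return (item, source): rule-based inference first, then case reuse."""
    for item in session:
        if item in rules and rules[item][0] not in session:
            return rules[item][0], "rule"
    # No rule fired: retrieve the most similar past basket and reuse it.
    best = max(baskets, key=lambda b: jaccard(session, b))
    extra = [i for i in best if i not in session]
    return (extra[0] if extra else None), "case"
```

The fallback is what gives the combination its flexibility in this sketch: a session whose items never reached the support threshold still gets a suggestion from the nearest stored case.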

A Dynamic Recommendation System Using User Log Analysis and Document Similarity in Clusters (사용자 로그 분석과 클러스터 내의 문서 유사도를 이용한 동적 추천 시스템)

  • 김진수;김태용;최준혁;임기욱;이정현
    • Journal of KIISE:Software and Applications / v.31 no.5 / pp.586-594 / 2004
  • Because web documents are created and disappear rapidly, users need a recommendation system that lets them browse web documents conveniently and accurately. One largely untapped source of knowledge about large data collections is contained in the cumulative experiences of individuals finding useful information in the collection. Recommendation systems attempt to extract such useful information by capturing and mining one or more measures of the usefulness of the data. Existing information filtering systems have the shortcoming that they require a user profile. Collaborative filtering systems have the shortcoming that users must first rate each web document, and in high-quantity, low-quality environments users may cover only a tiny percentage of the available documents. Dynamic recommendation systems that use the user's browsing pattern also provide users with unrelated web documents. This paper classifies web documents using the similarity between documents of the same document type, and extracts a database of users' sequential browsing patterns from the session information in the web server log file. When a user accesses a web document, the proposed dynamic recommendation system recommends the top-N associated web documents that are highly similar to the current document, and also recommends a set of documents with sequential specificity, using the extracted information and the users' session information.
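
The "top-N most similar documents" step can be illustrated with a small sketch. The abstract does not say which similarity measure or document representation is used, so cosine similarity over raw term counts is an assumption here, as are the function names; the clustering and sequential-pattern parts of the system are not covered.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(x * x for x in a.values()))
    norm_b = sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_n_similar(current, docs, n=2):
    """docs: {name: text}. Return the n documents most similar to `current`."""
    vectors = {name: Counter(text.lower().split()) for name, text in docs.items()}
    scores = [
        (cosine(vectors[current], vec), name)
        for name, vec in vectors.items()
        if name != current
    ]
    # Highest similarity first; ties broken by name for a stable order.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [name for _, name in scores[:n]]
```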

Recommending System of Products based on Data mining Technique (데이터 마이닝 기법을 이용한 상품 추천 시스템)

  • Jung, Min-A.;Park, Kyung-Woo;Cho, Sung-Eui
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.3 / pp.608-613 / 2006
  • The revitalization of e-commerce has produced many e-shopping malls. A product recommendation system is needed to save customers time and effort. In this paper, we propose a system that applies classification, one of the data mining techniques, to the analysis of customer log data. The log data contain user accesses and product purchases. The proposed system operates in two phases. The first phase consists of a data filter module and a module that extracts associations among web pages. The second phase consists of a personalization module and a rule generation module. Customers can easily identify the recommended sites because the proposed system presents a ranking of the recommended web pages. As a result, the proposed system can recommend products to customers efficiently.

Detecting smartphone user habits using sequential pattern analysis

  • Lu, Dang Nhac;Nguyen, Thu Trang;Nguyen, Thi Hau;Nguyen, Ha Nam;Choi, Gyoo Seok
    • International Journal of Internet, Broadcasting and Communication / v.7 no.1 / pp.20-22 / 2015
  • Recently, the study of smartphone user habits has become a highly focused topic due to the rapid growth of the smartphone market. Sequential pattern analysis methods have long been used efficiently for web-based user habit mining. However, simulations show that these methods may fail for smartphone-based user habit mining. In this paper, we propose a novel approach that considerably improves the performance of traditional sequential pattern analysis methods by cutting each chronological sequence of user logs on a device into shorter ones, which represent the sequential user activities in various periods of a day.
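
The preprocessing step described here, cutting one long chronological log into shorter per-period sequences, might look like the sketch below. The period boundaries, the `(timestamp, app)` event shape, and the function names are assumptions for illustration; the abstract does not specify how the periods are chosen.

```python
from datetime import datetime
from itertools import groupby


def period_of_day(ts):
    """Map a timestamp to a coarse period (boundaries are hypothetical)."""
    hour = ts.hour
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 24:
        return "evening"
    return "night"


def split_by_period(events):
    """events: list of (datetime, app_name).

    Returns a list of ((date, period), [app_name, ...]) in chronological order,
    one short sequence per day and period.
    """
    ordered = sorted(events)

    def key(event):
        return (event[0].date(), period_of_day(event[0]))

    # Events are time-ordered, so each (date, period) forms one contiguous run.
    return [(k, [app for _, app in group]) for k, group in groupby(ordered, key=key)]
```

The shorter sequences produced this way could then be passed to any sequential pattern miner in place of the single long log.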

Web Structure Mining by Extracting Hyperlinks from Web Documents and Access Logs (웹 문서와 접근로그의 하이퍼링크 추출을 통한 웹 구조 마이닝)

  • Lee, Seong-Dae;Park, Hyu-Chan
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.11 / pp.2059-2071 / 2007
  • If the correct structure of a Web site is known, the information provider can discover users' behavior patterns and characteristics to offer better services, and users can find useful information easily and accurately. However, it can be difficult to extract the exact structure of a Web site because documents on the Web tend to change frequently. This paper proposes a new method for extracting such Web structure automatically. The method consists of two phases. The first phase extracts the hyperlinks among Web documents and then constructs a directed graph to represent the structure of the Web site. This phase, however, cannot discover hyperlinks embedded in Flash and Java applets. The second phase finds such hidden hyperlinks by using the Web access log. It first extracts the click streams from the access log, and then extracts the hidden hyperlinks by comparing them with the directed graph. Several experiments have been conducted to evaluate the proposed method.
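
The two phases can be sketched as follows, under simplifying assumptions: pages are given as an in-memory `{url: html}` mapping, links are taken verbatim from `href` attributes without URL normalization, and click streams are already reconstructed as lists of URLs. The helper names are hypothetical and this is not the paper's implementation.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect href targets of <a> tags (phase 1: static hyperlinks)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def build_link_graph(pages):
    """pages: {url: html}. Return the directed graph as a set of (source, target)."""
    edges = set()
    for url, html in pages.items():
        parser = LinkExtractor()
        parser.feed(html)
        edges.update((url, target) for target in parser.links)
    return edges


def hidden_links(click_streams, edges):
    """Phase 2: transitions seen in click streams but missing from the graph."""
    observed = {
        (src, dst)
        for stream in click_streams
        for src, dst in zip(stream, stream[1:])
    }
    return observed - edges
```

A transition that users demonstrably followed but that has no matching edge in the static graph is a candidate hidden hyperlink, for example one embedded in a Flash menu.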

A Web Caching Method based on the Object Reference Probability Distribution Characteristics and the Life Time of Web Object (웹 객체의 참조확률분포특성과 평균수명 기반의 웹 캐싱 기법)

  • Na, Yun-Ji;Ko, Il-Seok
    • Convergence Security Journal / v.6 no.4 / pp.91-99 / 2006
  • Studies of web caching generally aim at performance improvement through structural approaches, propose new hybrid methods that combine existing methods, or examine the caching method itself. Existing analyses of reference characteristics rely on history analysis, user preferences, and a data mining view based on log analysis. In this study, we analyze the reference characteristics of web objects in terms of their probability distribution and the mean lifetime of a web object. Using this result, we propose a new method for improving web caching performance.

An Integrated Data Mining Model for Customer Relationship Management (고객관계관리를 위한 통합 데이터마이닝 모형 연구)

  • Song, In-Young;Yi, Tae-Seok;Shin, Ki-Jeong;Kim, Kyung-Chang
    • Journal of Intelligence and Information Systems / v.13 no.3 / pp.83-99 / 2007
  • The advancement of digital information technology has increased interest in the management and use of information and has stimulated research in this area. In this paper, we propose an integrated data mining model that can provide the necessary information and interface to users of a scientific information portal service according to their respective classification groups. The integrated model classifies users based on their behavioral patterns, using log files collected automatically by the web server. By classifying the existing users of the web site, which provides an information service, and analyzing their patterns, we propose a web site utilization methodology that provides a dynamic interface and a user-oriented site operating policy. In addition, we believe that our research can provide continuous web site user support, as well as information services tailored to user classification groups.

Web Log Analysis Using Support Vector Regression

  • Jun, Sung-Hae;Lim, Min-Taik;Jorn, Hong-Seok;Hwang, Jin-Soo;Park, Seong-Yong;Kim, Jee-Yun;Oh, Kyung-Whan
    • Communications for Statistical Applications and Methods / v.10 no.1 / pp.61-77 / 2003
  • Due to the wide expansion of the Internet, people can freely get the information they want with less effort. However, without adequate forms or rules to follow, it is becoming more and more difficult to get the necessary information. Because of the seemingly chaotic status of the current web environment, it is sometimes called the "Dizzy web". The user must wander from page to page to get the necessary information. Therefore, we need to construct a system that recommends appropriate information to the general user. The representative research field for such systems is the Recommendation System (RS), and the collaborative recommendation system is one kind of RS. It is known to perform better than the other systems. In web user modeling and other web-mining tasks, continuous feedback data are very important and frequently used. In this paper, we propose a collaborative recommendation system that can deal with continuous feedback data, and we attempt to construct a web page prediction system. We use a user's sojourn time as continuous feedback data and combine the traditional model-based algorithm framework with the Support Vector Regression technique. In our experiments, we show the accuracy of our system and the computing time of page prediction compared with Pearson's correlation algorithm.
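
For orientation, the Pearson's correlation baseline named at the end of this abstract can be sketched over sojourn times. This illustrates only the baseline similarity measure, not the proposed Support Vector Regression model; the `{page: seconds}` data shape and the choice to return 0.0 when the correlation is undefined are assumptions.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two users' sojourn times over co-visited pages.

    u, v: {page: seconds spent}. Returns 0.0 when the correlation is
    undefined (fewer than two shared pages, or no variance).
    """
    common = sorted(set(u) & set(v))
    n = len(common)
    if n < 2:
        return 0.0
    mean_u = sum(u[p] for p in common) / n
    mean_v = sum(v[p] for p in common) / n
    cov = sum((u[p] - mean_u) * (v[p] - mean_v) for p in common)
    std_u = sqrt(sum((u[p] - mean_u) ** 2 for p in common))
    std_v = sqrt(sum((v[p] - mean_v) ** 2 for p in common))
    if std_u == 0 or std_v == 0:
        return 0.0
    return cov / (std_u * std_v)
```

In a memory-based collaborative filter, a score like this would weight other users' sojourn times when predicting how long the current user would stay on an unseen page.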

An Efficient Algorithm for Mining Interactive Communication Sequence Patterns (대화형 통신 순서열 패턴의 마이닝을 위한 효율적인 알고리즘)

  • Haam, Deok-Min;Song, Ji-Hwan;Kim, Myoung-Ho
    • Journal of KIISE:Databases / v.36 no.3 / pp.169-179 / 2009
  • Communication log data consist of communication events such as sending and receiving e-mail or instant messages and visiting web sites. Many countries, including the USA and the EU, require communication service providers to retain these data for the purpose of investigating or detecting criminals through the Internet. Because the retained data are very large, law enforcement authorities need an efficient method for extracting valuable information from them. This paper defines Interactive Communication Sequence Patterns (ICSPs), which carry important information when each communication event in the communication log data consists of the sender, the receiver, and the timestamp of the event. We also define a mining problem for discovering such patterns and propose a method called Fast Discovering Interactive Communication Sequence Pattern (FDICSP) to solve it. FDICSP exploits the characteristics of ICS to reduce the search space when it finds longer sequences by using shorter sequences. Thus, FDICSP can find Interactive Communication Sequence Patterns efficiently.

Development of Recommendation Agents through Web Log Analysis (웹 로그 분석을 이용한 추천 에이전트의 개발)

  • 김성학;이창훈
    • Journal of the Korea Computer Industry Society / v.4 no.10 / pp.621-630 / 2003
  • Web logs are the information recorded by a web server when users access web sites, and with the rapid rise of Internet usage, their practical value has become increasingly important. Analyzing such logs can be used to determine the patterns representing users' navigational behavior in a Web site and to restructure a Web site to create a more effective organizational presence. For these applications, the key methods generally used in many studies are association rules and sequential patterns based on Apriori algorithms, which are widely used to extract correlations among patterns. However, Apriori is inherently inefficient in computing cost when applied to large databases. In this paper, we develop a new algorithm for mining interesting patterns that is faster than the Apriori algorithm, and recommendation agents that can provide a system manager with valuable information about pages that are accessed sequentially by many users.
