• Title/Summary/Keyword: web logs


An Optimized User Behavior Prediction Model Using Genetic Algorithm On Mobile Web Structure

  • Hussan, M.I. Thariq;Kalaavathi, B.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.5 / pp.1963-1978 / 2015
  • With the advancement of mobile web environments, identifying and analyzing user behavior plays a significant role, yet it remains a challenging task because of the variations observed in the models. This paper presents an efficient method for mining an optimized user behavior prediction model using a genetic algorithm on mobile web structure. The framework integrates temporary and permanent register information and stores it immediately as integrated logs, which offer higher precision and reduce the time needed to determine user behavior. The logs are then segmented according to their temporal characteristics to obtain a suitable time interval table; the table that splits the huge data logs is found using a genetic algorithm. Existing cluster-based temporal mobile sequential arrangement provides efficiency without reducing accuracy but compromises precision when predicting user behavior. To discover mobile users' behavior efficiently, the prediction model is associated with region and requested services, and a method called optimized user behavior Prediction Model using Genetic Algorithm (PM-GA) on mobile web structure is introduced. This paper also provides a technique called MAA for cases where the number of models related to the region and requested services increases. Based on our analysis, we contend that PM-GA provides improved performance in terms of precision, number of mobile models generated, execution time, and prediction accuracy. Experiments are conducted with different parameters on a real dataset in a mobile web environment. Analytical and empirical results show that the approach mines and predicts user behavior efficiently and effectively on mobile web structure.

Development of Recommendation Agents through Web Log Analysis (웹 로그 분석을 이용한 추천 에이전트의 개발)

  • 김성학;이창훈
    • Journal of the Korea Computer Industry Society / v.4 no.10 / pp.621-630 / 2003
  • Web logs are the information recorded by a web server when users access a web site, and with the rapid rise in internet usage their practical value has become increasingly important. Analyzing such logs can be used to determine the patterns representing users' navigational behavior in a Web site and to restructure the site to create a more effective organizational presence. For these applications, the key methods used in many studies are association rules and sequential patterns based on Apriori algorithms, which are widely used to extract correlations among patterns. However, Apriori is inherently inefficient in computing cost when applied to large databases. In this paper, we develop a new algorithm for mining interesting patterns that is faster than the Apriori algorithm, together with recommendation agents that can provide a system manager with valuable information that is accessed sequentially by many users.


Framework for Efficient Web Page Prediction using Deep Learning

  • Kim, Kyung-Chang
    • Journal of the Korea Society of Computer and Information / v.25 no.12 / pp.165-172 / 2020
  • Recently, due to the exponential growth of access information on the web, predicting a user's next web page has become increasingly important. One method that can be used for this prediction is deep learning. To predict the next web page, web logs are first analyzed through data preprocessing, and a deep learning algorithm then predicts the user's next web page from the output of the analyzed web logs. In this paper, we propose a framework for web page prediction that includes methods for web log preprocessing followed by deep learning techniques for web prediction. To speed up the preprocessing of large web logs, a Hadoop-based MapReduce programming model is used. In addition, we present a web prediction system that uses an efficient deep learning technique on the output of web log preprocessing for training and prediction. Through experiments, we show the performance improvement of our proposed method over traditional methods. We also show the accuracy of our prediction.
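The preprocessing step described in this abstract can be illustrated with a minimal map/reduce sketch in plain Python. It is not the paper's Hadoop job: the Common Log Format layout, the filter rules (successful GET requests only, static assets skipped), and the names `map_line`, `reduce_user`, and `run` are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed Common Log Format; the abstract does not state the log layout.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
)
# Hypothetical filter: requests for static assets are not page views.
SKIP_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def map_line(line):
    """Map phase: emit (ip, (timestamp, url)) for successful page requests."""
    m = LOG_RE.match(line)
    if not m or m["status"] != "200" or m["method"] != "GET":
        return []
    if m["url"].lower().endswith(SKIP_SUFFIXES):
        return []
    ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return [(m["ip"], (ts, m["url"]))]


def reduce_user(ip, values):
    """Reduce phase: order one user's requests by time into a page sequence."""
    return ip, [url for _, url in sorted(values)]


def run(lines):
    """Drive both phases locally; the dict stands in for the shuffle step."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_line(line):
            groups[key].append(value)
    return dict(reduce_user(ip, vals) for ip, vals in groups.items())
```

The per-user page sequences returned by `run` are the kind of output a downstream sequence model could be trained on; on a real cluster the same two functions would be wrapped as a mapper and a reducer.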

An Efficient Approach for Single-Pass Mining of Web Traversal Sequences (단일 스캔을 통한 웹 방문 패턴의 탐색 기법)

  • Kim, Nak-Min;Jeong, Byeong-Soo;Ahmed, Chowdhury Farhan
    • Journal of KIISE:Databases / v.37 no.5 / pp.221-227 / 2010
  • Web access sequence mining can discover the web pages frequently accessed by users. Utility-based web access sequence mining handles non-binary occurrences of web pages and extracts more useful knowledge from web logs. However, the existing utility-based approach considers web access sequences from the very beginning of the web logs and is therefore not suitable for mining data streams, where the volume of data is huge and unbounded. Nor can it adaptively find recent changes of knowledge in data streams. The existing approach has other limitations as well: it considers only forward references of web access sequences, suffers from the level-wise candidate generation-and-test methodology, and needs several database scans. In this paper, we propose a new approach for high utility web access sequence mining over data streams with a sliding window method. Our approach can not only handle large-scale data but also efficiently discover the recently generated information from data streams. Moreover, it can solve the other limitations of the existing algorithm over data streams. Extensive performance analyses show that our approach is very efficient and outperforms the existing algorithm.
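The sliding-window idea can be shown with a deliberately simplified sketch. It is not the paper's algorithm: it tracks only consecutive page pairs instead of general sequences, and it assumes a hypothetical utility (for example, seconds spent on a page) attached to every visit. The class name `SlidingWindowUtility` and its methods are illustrative.

```python
from collections import Counter, deque


class SlidingWindowUtility:
    """Keep utility totals of consecutive page pairs over the last N sessions."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()    # per-session pair utilities, oldest first
        self.totals = Counter()  # pair -> utility summed over the window

    @staticmethod
    def _pair_utilities(session):
        # session is a list of (page, utility); a pair's utility is the
        # sum of the utilities of its two visits (an assumed definition).
        pairs = Counter()
        for (p1, u1), (p2, u2) in zip(session, session[1:]):
            pairs[(p1, p2)] += u1 + u2
        return pairs

    def add(self, session):
        """Process one arriving session in a single pass and expire the oldest."""
        pairs = self._pair_utilities(session)
        self.window.append(pairs)
        self.totals.update(pairs)
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            for pair, utility in expired.items():
                self.totals[pair] -= utility
                if self.totals[pair] <= 0:
                    del self.totals[pair]

    def high_utility(self, min_utility):
        """Return the pairs whose windowed utility reaches the threshold."""
        return {p: u for p, u in self.totals.items() if u >= min_utility}
```

Because each session is read once on arrival and subtracted once on expiry, the totals always reflect only the most recent `window_size` sessions without rescanning older data.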

Designing Summary Tables for Mining Web Log Data

  • Ahn, Jeong-Yong
    • Journal of the Korean Data and Information Science Society / v.16 no.1 / pp.157-163 / 2005
  • On the Web, data is generally gathered automatically by Web servers and collected in server or access logs. However, as users access larger and larger amounts of data, query response times for extracting information inevitably get slower. One way to resolve this issue is to use summary tables. In this short note, we design a prototype of summary tables that can efficiently extract information from Web log data. We also present the relative performance of the summary tables against a sampling technique and a method that uses raw data.
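A summary table of the kind this note describes can be sketched with Python's built-in `sqlite3` module. The schema (`access_log` with `ts`, `url`, `bytes`), the summary table name `daily_page_hits`, and the chosen aggregates are assumptions for illustration, not the prototype from the paper.

```python
import sqlite3


def load_raw(conn, rows):
    """Store raw log rows as (timestamp 'YYYY-MM-DD HH:MM:SS', url, bytes)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS access_log (ts TEXT, url TEXT, bytes INTEGER)"
    )
    conn.executemany("INSERT INTO access_log VALUES (?, ?, ?)", rows)


def build_summary(conn):
    """Pre-aggregate the raw log once into one row per (day, url)."""
    conn.executescript("""
        DROP TABLE IF EXISTS daily_page_hits;
        CREATE TABLE daily_page_hits AS
            SELECT date(ts) AS day, url,
                   COUNT(*) AS hits, SUM(bytes) AS total_bytes
            FROM access_log
            GROUP BY date(ts), url;
    """)


def top_pages(conn, day, limit=10):
    """Answer a typical query from the small summary table, not the raw log."""
    cur = conn.execute(
        "SELECT url, hits FROM daily_page_hits "
        "WHERE day = ? ORDER BY hits DESC, url LIMIT ?",
        (day, limit),
    )
    return cur.fetchall()
```

The trade-off is the usual one: the summary must be rebuilt or incrementally updated as new log rows arrive, in exchange for queries that scan far fewer rows than the raw data.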


User Access Patterns Discovery based on Apriori Algorithm under Web Logs (웹 로그에서의 Apriori 알고리즘 기반 사용자 액세스 패턴 발견)

  • Ran, Cong-Lin;Joung, Suck-Tae
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.12 no.6 / pp.681-689 / 2019
  • Web usage pattern discovery is an advanced use of web log data and a specific application of data mining technology to Web logs. Data mining (DM) in education is the application of data mining techniques to educational data (such as the Web logs of universities, e-learning, adaptive hypermedia, and intelligent tutoring systems), and its objective is to analyze these types of data in order to resolve educational research issues. In this paper, the Web log data of a university are used as the research object of data mining. Using database OLAP technology, the Web log data are preprocessed into a format that can be used for data mining, and the processing results are stored in MSSQL. At the same time, basic data statistics and analysis are completed based on the processed Web log records. In addition, we introduce the Apriori algorithm for Web usage pattern mining and its implementation process, develop an Apriori program in a Python development environment, report the performance of the Apriori algorithm, and realize the mining of Web user access patterns. The results have important theoretical significance for applying the patterns in the development of teaching systems. Future research will explore improving the Apriori algorithm in a distributed computing environment.
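For readers unfamiliar with Apriori, the following is a minimal textbook-style sketch in Python. It is not the authors' program; the function name `apriori`, the use of absolute support counts, and the representation of each session as a set of page names are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count step: one scan of the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent
```

The one-scan-per-level count step is the cost that grows with database size, which is why distributed variants are a natural next step.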

Trends of Search Behavior of Korean Web Users (국내 웹 이용자의 검색 행태 추이 분석)

  • Park Soyeon;Lee Joon Ho
    • Journal of the Korean Society for Library and Information Science / v.39 no.2 / pp.147-160 / 2005
  • This study examines trends in the types and topics of web queries submitted to NAVER over a one-year period by analyzing query logs and click logs. There was a seasonal difference in the distribution of query types. Query type distribution also differed between weekdays and weekends, and between different days of the week. The log data show seasonal changes in the topics of queries. Search topics seem to change between weekdays and weekends, and between different days of the week. However, there was little change in overall patterns of search behavior across the year. The implications for system designers and web content providers are discussed.

Consumer behavior prediction using Airbnb web log data (에어비앤비(Airbnb) 웹 로그 데이터를 이용한 고객 행동 예측)

  • An, Hyoin;Choi, Yuri;Oh, Raeeun;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.32 no.3 / pp.391-404 / 2019
  • Customers' fixed characteristics have often been used to predict customer behavior. As customer activities move from offline to online, it has recently become possible to track customer web logs and to collect large amounts of web log data; however, researchers have focused only on organizing the log data or describing its technical characteristics. In this study, we predict the decision-making time until each customer makes the first reservation, using Airbnb customer data provided by the Kaggle website. This data set includes basic customer information such as gender and age, as well as web logs. We use various methodologies to find the optimal model and compare prediction errors for cases with and without web log data. We consider six models, including Lasso, SVM, Random Forest, and XGBoost, to explore the effectiveness of the web log data. As a result, we choose Random Forest as our optimal model, with a misclassification rate of about 20%. In addition, we confirm that using web log data in our study doubles the prediction accuracy in predicting customer behavior compared to not using it.

A Personal Memex System Using Uniform Representation of the Data from Various Devices (다양한 기기로부터의 데이터 단일 표현을 통한 개인 미멕스 시스템)

  • Min, Young-Kun;Lee, Bog-Ju
    • The KIPS Transactions:PartB / v.16B no.4 / pp.309-318 / 2009
  • Research on systems that automatically record and retrieve one's everyday life has been relatively active recently. These systems, called personal memex or life log systems, usually entail dedicated devices such as the SenseCam in the MyLifeBits project. This research pays attention to digital devices that people use every day, such as mobile phones, credit cards, and digital cameras. The system enables a person to systematically store everyday-life data saved in the devices or on device-related web pages (e.g., phone records at the cellular phone company) and to refer to it quickly later. The data collection agent in the proposed system, called MyMemex, collects the personal life log "web data" using the web services that the web sites provide and stores the web data on the server. The "file data" stored in off-line digital devices are also loaded onto the server. Each item of file data or web data is viewed as a memex event that can be described in 4W1H form. The different types of data from different services are transformed into memex event data in 4W1H form using the memex event ontology. Users can sign in to the web server of this service to view their life logs in chronological order. Users can also search the life logs using keywords. Moreover, the life logs can be viewed in a diary or story style by converting the memex events to sentences. Related memex events are grouped by a heuristic identification method and displayed as an "episode". An experiment on episode identification using the real life log data of one of the authors yielded highly accurate results.

Mining Association Rules from the Web Access Log of an Online News website (온라인 뉴스 웹사이트의 로그를 이용한 연관규칙 발견에 관한 연구)

  • Hwang, Hyunseok;Yoo, Keedong
    • Journal of Korea Society of Industrial Information Systems / v.18 no.2 / pp.47-57 / 2013
  • Today many functional areas of a firm are operated on the Web. Online shopping malls analyze web logs recording customers' activities on the web to connect them to business outcomes. Not only commercial websites but also online news sites need to collect and analyze web logs to understand their readers' interests. However, little research has been performed on this yet. In this research, we mined the web access log of an online news website and conducted Market Basket Analysis to uncover the association rules among the categories of news articles. The research is composed of two stages: 1) identifying the individual sessions of a visitor; 2) mining association rules from the news articles read in each session. We gathered 7-day access logs twice. The results of log mining and the meanings of the association rules are presented with managerial implications in the conclusion section.
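The two stages named in this abstract can be sketched as follows. The 30-minute inactivity timeout, the input layout `(visitor_id, unix_time, category)`, the restriction to single-category rules, and the function names `sessionize` and `pair_rules` are assumptions for illustration; the abstract does not give the authors' actual parameters.

```python
from collections import Counter, defaultdict
from itertools import permutations


def sessionize(hits, timeout=1800):
    """Stage 1: split each visitor's hits into sessions of news categories.

    A new session starts after `timeout` seconds of inactivity (assumed).
    """
    by_visitor = defaultdict(list)
    for visitor, ts, category in hits:
        by_visitor[visitor].append((ts, category))

    sessions = []
    for events in by_visitor.values():
        events.sort()
        current, last_ts = set(), None
        for ts, category in events:
            if last_ts is not None and ts - last_ts > timeout:
                sessions.append(current)
                current = set()
            current.add(category)
            last_ts = ts
        sessions.append(current)
    return sessions


def pair_rules(sessions, min_support, min_confidence):
    """Stage 2: rules A -> B between categories read in the same session."""
    n = len(sessions)
    item_counts = Counter(c for s in sessions for c in s)
    pair_counts = Counter(p for s in sessions for p in permutations(sorted(s), 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                 # share of sessions with both
        confidence = count / item_counts[a]  # share of A-sessions that have B
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)
```

Each returned tuple reads as "sessions that include category A also include category B", with the support and confidence of that rule.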