• Title/Summary/Keyword: Clustering behavior

Analysis of Departing Passengers' Dwell Time using Clustering Techniques (클러스터링 기법을 활용한 출발 여객 체류 시간 분석)

  • An, Deok-bae;Kim, Hui-yang;Baik, Ho-jong
    • Journal of Advanced Navigation Technology / v.23 no.5 / pp.380-385 / 2019
  • This paper analyzes departing passengers' dwell time using real system data. Previous research emphasizes the importance of dwell time analysis from the perspective of airport terminal planning and non-aeronautical revenue. However, using passengers' dwell time in short-term airport operation has been considered impossible because passenger behavior data were not available. Recently, in line with the smart airport trend, the world's leading airports have been systematically collecting passenger data, so it is now feasible to analyze passengers' dwell time with the data accumulated in airport databases. We conducted a dwell time analysis using data from Incheon International Airport. To handle the passenger data, we adopted a clustering algorithm, a data mining technique. The clustering divided passengers into three clusters: passengers whose dwell time is relatively short and who tend to spend a longer time airside; passengers whose dwell time is close to 3 hours; and passengers whose total dwell time is extremely long.
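
The abstract above does not name the clustering algorithm, so the following is only an illustrative sketch that uses plain k-means (Lloyd's algorithm) as a stand-in. The feature pair (landside hours, airside hours), the sample values, and the initial centers are all hypothetical and are not taken from the paper.

```python
from math import dist

def kmeans(points, centers, iterations=20):
    """Plain k-means (Lloyd's algorithm) with caller-supplied initial centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: each center moves to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [min(range(len(centers)), key=lambda i: dist(p, centers[i])) for p in points]
    return centers, labels

# Hypothetical passengers: (landside hours, airside hours).
points = [
    (0.5, 1.0), (0.4, 1.2), (0.6, 1.1),   # short total dwell time
    (1.5, 1.5), (1.4, 1.6), (1.6, 1.4),   # about 3 hours in total
    (4.0, 3.0), (4.5, 2.5), (5.0, 3.5),   # very long dwell time
]
centers, labels = kmeans(points, centers=[points[0], points[3], points[6]])
```

With this toy data the three groups are well separated, so the labels simply recover the three blocks of passengers; real dwell-time data would need feature scaling and a principled choice of the number of clusters.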

A Modeling Methodology for Analysis of Dynamic Systems Using Heuristic Search and Design of Interface for CRM (휴리스틱 탐색을 통한 동적시스템 분석을 위한 모델링 방법과 CRM 위한 인터페이스 설계)

  • Jeon, Jin-Ho;Lee, Gye-Sung
    • Journal of the Korea Society of Computer and Information / v.14 no.4 / pp.179-187 / 2009
  • Most real-world systems exhibit a series of dynamic and complex phenomena. A common way to understand such systems is to build a model and analyze its behavior. We propose a two-step methodology for analyzing time series data: clustering, followed by model creation. An interface is designed for CRM (Customer Relationship Management) that uses system modeling to provide users with 1:1 customized information. Experiments confirmed that a model-based approach yields better clustering than a similarity-based one. After clustering, a model is created for each clustered group, from which the future direction of the time series can be predicted. The effectiveness of the method was validated by checking how closely the values predicted by the models track real data such as stock prices.

Hybrid Movie Recommendation System Using Clustering Technique (클러스터링 기법을 이용한 하이브리드 영화 추천 시스템)

  • Sophort Siet;Sony Peng;Yixuan Yang;Sadriddinov Ilkhomjon;DaeYoung Kim;Doo-Soon Park
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.357-359 / 2023
  • This paper proposes a hybrid recommendation system (RS) model that overcomes limitations of traditional approaches, such as data sparsity, cold start, and scalability, by combining collaborative filtering (CF) and context-aware techniques. The objective of the model is to improve recommendation accuracy and provide personalized suggestions by leveraging the strengths of collaborative filtering and incorporating user context features to capture users' preferences and behavior more effectively. The approach uses a novel method that combines contextual attributes with the original user-item rating matrix of CF-based algorithms. Furthermore, we integrate k-means++ clustering to group users with similar preferences and finally recommend items that have been highly rated by other users in the same cluster. Partitioning the rating matrix into clusters based on contextual information offers several advantages. First, it avoids computation over the entire data set, reducing runtime and improving scalability. Second, the partitioned clusters hold similar ratings, which have a greater influence on each other, leading to more accurate recommendations and providing flexibility in the clustering process. Keywords: Context-aware Recommendation, Collaborative Filtering, K-means++ Clustering.
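
The final step described above, recommending items that are highly rated by other users in the same cluster, can be sketched roughly as follows. This is a minimal illustration, not the authors' model: the cluster assignment `cluster_of` is assumed to have been produced already (for example by k-means++ on context-augmented ratings), and the users, titles, ratings, and the `recommend` helper are hypothetical.

```python
from collections import defaultdict

def recommend(user, ratings, cluster_of, top_n=2):
    """Rank items the user has not rated by their mean rating among same-cluster users."""
    scores = defaultdict(list)
    for other, items in ratings.items():
        # Only peers from the same cluster contribute.
        if other == user or cluster_of[other] != cluster_of[user]:
            continue
        for item, score in items.items():
            if item not in ratings[user]:
                scores[item].append(score)
    means = {item: sum(s) / len(s) for item, s in scores.items()}
    # Highest mean first; ties broken alphabetically for a stable order.
    return sorted(means, key=lambda item: (-means[item], item))[:top_n]

# Hypothetical ratings (1-5) and a precomputed cluster assignment.
ratings = {
    "u1": {"Alien": 5, "Heat": 4},
    "u2": {"Alien": 5, "Heat": 5, "Ronin": 4, "Up": 2},
    "u3": {"Alien": 4, "Ronin": 5, "Up": 1},
    "u4": {"Up": 5, "Coco": 5},
}
cluster_of = {"u1": 0, "u2": 0, "u3": 0, "u4": 1}

suggestions = recommend("u1", ratings, cluster_of)
```

Because only same-cluster users are scanned, the cost per recommendation depends on the cluster size rather than on the full user base, which is the scalability argument the abstract makes.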

Anomaly Detection Analysis using Repository based on Inverted Index (역방향 인덱스 기반의 저장소를 이용한 이상 탐지 분석)

  • Park, Jumi;Cho, Weduke;Kim, Kangseok
    • Journal of KIISE / v.45 no.3 / pp.294-302 / 2018
  • With the emergence of new service industries driven by the development of information and communication technology, cyberspace risks such as personal information infringement and leakage of industrial secrets have diversified, and security has become a critical issue. In this paper, we propose a behavior-based anomaly detection method suited to real-time, large-volume data analysis. Using large-volume user log data on in-company personal information abuse and internal information leakage, we show that the proposed method is superior to existing signature-based security countermeasures. Because the proposed method requires a technique for processing large amounts of data, we use Elasticsearch, a real-time search engine based on an inverted index. In addition, statistics-based frequency analysis and preprocessing were performed for data analysis, and the DBSCAN algorithm, a density-based clustering method, was applied to classify abnormal data, with an example visualized for easy analysis. Unlike existing anomaly detection systems, the proposed behavior-based technique is promising because it enables anomaly detection without setting a threshold value separately, and it was proposed from a statistical perspective.
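
To illustrate how a density-based method such as DBSCAN can flag abnormal records as noise, here is a small pure-Python sketch. It is not the authors' pipeline (which runs on Elasticsearch log data); the two features, the sample values, and the parameters `eps` and `min_pts` are hypothetical.

```python
from math import dist

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    NOISE = -1
    labels = [None] * len(points)          # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE              # may be relabeled as a border point later
            continue
        cluster += 1                       # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster        # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is also a core point: keep expanding
                queue.extend(reachable)
    return labels

# Hypothetical per-user features: (queries per hour, records viewed per query).
points = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
    (9.0, 0.5),                            # isolated behavior
]
labels = dbscan(points, eps=0.5, min_pts=3)
anomalies = [p for p, label in zip(points, labels) if label == -1]
```

Points that belong to no dense region come back with label -1, so the isolated record is reported as an anomaly based on neighborhood density (`eps`, `min_pts`) rather than on a hand-set cutoff for any single feature.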

Travel Behavior Analysis for Short-term Railroad Passenger Demand Forecasting in KTX (KTX 단기수요 예측을 위한 통행행태 분석)

  • Kim, Han-Soo;Yun, Dong-Hee
    • Proceedings of the KSR Conference / 2011.05a / pp.1282-1289 / 2011
  • Railroad operations require short-term rather than long-term passenger demand forecasts. Rail passenger demand can be classified by purpose. First, it is used to restructure line planning on lines currently in operation. Second, it is used for line planning on new lines and for purchasing train vehicles. The objective of this study is to analyze the travel behavior of rail passengers in order to model short-term demand forecasting. The scope of the research is KTX passengers. Travel behavior was analyzed in terms of daily trips and origin/destination trips of KTX passengers, using ANOVA and clustering analysis. The results indicate directions for the short-term demand forecasting model.

New Customer Segmentation and Purchase-forecasting Using Changes in Customer Behavior (고객의 행동 변화를 통한 신규고객 세분화와 구매항목 예측)

  • Do, Hee Jung;Kim, Jae Yearn
    • Journal of Korean Institute of Industrial Engineers / v.33 no.3 / pp.339-348 / 2007
  • Since the 1980s, the marketing paradigm has rapidly changed from product-driven marketing to customer-driven marketing. Recently, as the amount of available information has increased, customer-differentiation strategies have been emphasized more than product-differentiation strategies. This paper suggests a methodology for new customer segmentation and purchase forecasting using changes in customer behavior. The methodology includes a segmentation method for new customers that uses existing customers' characteristics and a purchase-forecasting system that uses the purchase-behavior patterns of existing customers. The proposed methodology not only provides differentiated services through the segmentation system but also recommends differentiated items through the purchase-forecasting system for new and existing customers.

Factors Associated with Physical Activity and Sedentary Behavior among Elementary School Students (일부 초등학교 5, 6학년 학생의 신체활동과 좌식생활 관련 요인)

  • Kim, Bong-Jeong
    • Korean Journal of Health Education and Promotion / v.27 no.3 / pp.33-47 / 2010
  • Objectives: The purpose of this study was to identify personal and social environmental factors associated with physical activity and sedentary behavior among elementary school students. Methods: Cross-sectional self-reported data were collected from a convenience cluster sample of 1538 grade 5 and 6 students attending 19 elementary schools in Seoul metropolitan city and Gyeonggi province. Data were analyzed using the chi-square test and multiple logistic regression analysis. Results: In multiple logistic regression analyses, the factors significantly associated with schoolchildren's physical activity were gender, father's job, social support for physical activity, friend support, and participation in school physical education class. Father's education level, mother's job, family functioning, and urban residence were significantly associated with TV viewing, and gender, age, BMI (obesity), mother's job, family functioning, and urban residence were significantly associated with playing computer games. These results showed that physical activity among elementary school students was most associated with social environmental factors, while sedentary behavior was most associated with personal and family environmental factors. Conclusion: Health care providers should develop interventions that improve these family and social environmental factors in order to increase physical activity levels and decrease sedentary behavior among elementary schoolchildren.

Real-world multimodal lifelog dataset for human behavior study

  • Chung, Seungeun;Jeong, Chi Yoon;Lim, Jeong Mook;Lim, Jiyoun;Noh, Kyoung Ju;Kim, Gague;Jeong, Hyuntae
    • ETRI Journal / v.44 no.3 / pp.426-437 / 2022
  • To understand the multilateral characteristics of human behavior and physiological markers related to physical, emotional, and environmental states, extensive lifelog data collection in a real-world environment is essential. Here, we propose a data collection method using multimodal mobile sensing and present a long-term dataset from 22 subjects and 616 days of experimental sessions. The dataset contains over 10 000 hours of data, including physiological data such as photoplethysmography, electrodermal activity, and skin temperature, in addition to multivariate behavioral data. Furthermore, it includes 10 372 user labels with emotional states and 590 days of sleep quality data. To demonstrate feasibility, human activity recognition was applied to the sensor data using a convolutional neural network-based deep learning model, achieving 92.78% recognition accuracy. From the activity recognition result, we extracted daily behavior patterns and discovered five representative models by applying spectral clustering. This demonstrates that the dataset contributes toward understanding human behavior using multimodal data accumulated throughout daily life under natural conditions.

Analysis of shopping website visit types and shopping pattern (쇼핑 웹사이트 탐색 유형과 방문 패턴 분석)

  • Choi, Kyungbin;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.85-107 / 2019
  • Online consumers may browse products belonging to a particular product line or brand with the intention of buying, or may simply navigate widely and leave without making a purchase. Research on the behavior and purchases of online consumers has progressed steadily, and services and applications based on consumer behavior data have been developed in practice. In recent years, customization strategies and recommendation systems have been deployed thanks to the development of big data technology, and attempts are being made to optimize users' shopping experience. Even so, it is very unlikely that an online consumer who visits a website will actually move on to the purchase stage. This is because online consumers do not visit a website only to purchase products; they use and browse websites differently according to their shopping motives and purposes. Therefore, to understand the behavior of online consumers, it is important to analyze the various types of visits, not only visits that end in a purchase. In this study, we performed a session-level clustering analysis on clickstream data from an e-commerce company in order to explain the diversity and complexity of online consumers' search behavior and to classify that behavior into types. For the analysis, we converted more than 8 million page-level data points into visit-level sessions, resulting in over 500,000 website visit sessions. For each visit session, 12 characteristics such as page views, duration, search diversity, and page type concentration were extracted for the clustering analysis. Considering the size of the data set, we used the Mini-Batch K-means algorithm, which has advantages in learning speed and efficiency while maintaining clustering performance similar to that of K-means. The optimal number of clusters was found to be four, and differences in session characteristics and purchase rates were identified for each cluster. Online consumers visit a website several times, learn about a product, and then decide whether to purchase. To analyze the purchasing process across several visits, we constructed visit sequence data for each consumer based on the navigation patterns derived from the clustering analysis. Each visit sequence covers the series of visits leading up to one purchase, and the items in a sequence are the cluster labels derived above. We built separate sequence data for consumers who made purchases and for consumers who only explored products without purchasing during the same period. Sequential pattern mining was then applied to extract frequent patterns from each set of sequence data. The minimum support was set to 10%, and each frequent pattern consists of a sequence of cluster labels. While some patterns are common to both sets of sequence data, others are frequent in only one of them. A comparative analysis of the extracted frequent patterns showed that consumers who made purchases repeatedly searched for a specific product while deciding to buy it. The implication of this study is that it analyzes the search types of online consumers using large-scale clickstream data and analyzes their patterns to explain the purchasing process from a data-driven point of view. Most studies on typologies of online consumers have focused on the characteristics of each type and on the key factors that distinguish the types. In this study, we classified the behavior of online consumers into types and further analyzed the order in which these types combine into a series of search patterns. In addition, online retailers will be able to try to improve their purchase conversion through marketing strategies and recommendations tailored to the various visit types, and to evaluate the effect of those strategies through changes in consumers' visit patterns.
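
The notion of support for a sequence of cluster labels can be made concrete with a brute-force sketch. This is an illustration only: the abstract does not say which sequential pattern mining algorithm was used, the labels A-D and the sequences are hypothetical, and the toy threshold is 50% rather than the paper's 10% because the sample has only four sequences.

```python
from itertools import product

def contains(sequence, pattern):
    """True if pattern occurs in sequence in order (gaps between items are allowed)."""
    remaining = iter(sequence)
    # 'label in remaining' consumes the iterator up to the first match,
    # so later labels must appear after earlier ones.
    return all(label in remaining for label in pattern)

def frequent_patterns(sequences, labels, min_support, max_len=2):
    """Enumerate every pattern up to max_len and keep those meeting min_support."""
    result = {}
    for length in range(1, max_len + 1):
        for pattern in product(labels, repeat=length):
            support = sum(contains(s, pattern) for s in sequences) / len(sequences)
            if support >= min_support:
                result[pattern] = support
    return result

# Hypothetical visit sequences leading up to one purchase;
# each item is the cluster label (A-D) of one visit session.
purchase_sequences = [
    ["A", "B", "B"],
    ["B", "B"],
    ["A", "C", "B"],
    ["D", "B", "B"],
]
patterns = frequent_patterns(purchase_sequences, "ABCD", min_support=0.5)
```

Running the same function on the sequences of non-purchasing consumers and comparing the two result sets mirrors the comparison described in the abstract. Exhaustive enumeration is only practical for a handful of labels and short patterns.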

A Cluster validity Index for Fuzzy Clustering

  • Lee, Haiyoung
    • Journal of the Korean Institute of Intelligent Systems / v.9 no.6 / pp.621-626 / 1999
  • This paper proposes a new cluster validity index that is heuristic but eliminates the monotonically decreasing tendency that occurs when the number of clusters c becomes very large and approaches the number of data points n. We review the FCM algorithm and some conventional cluster validity criteria, discuss the limiting behavior of the proposed validity index, and provide numerical examples showing its effectiveness.
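
For context, the standard fuzzy c-means (FCM) formulation that the abstract refers to can be written as below, using c for the number of clusters and n for the number of data points as in the abstract. The symbols x_k (data point), v_i (cluster center), u_ik (membership of x_k in cluster i), and m (fuzzifier) are the usual textbook notation and are assumed here. The proposed index itself is not given in the abstract and is not reproduced.

```latex
% FCM objective, minimized subject to the membership constraint
J_m(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m} \, \lVert x_k - v_i \rVert^{2},
\qquad \sum_{i=1}^{c} u_{ik} = 1 \ \text{for each } k, \quad m > 1

% Alternating update rules
u_{ik} = \left[ \sum_{j=1}^{c}
  \left( \frac{\lVert x_k - v_i \rVert}{\lVert x_k - v_j \rVert} \right)^{2/(m-1)}
\right]^{-1},
\qquad
v_i = \frac{\sum_{k=1}^{n} u_{ik}^{m} \, x_k}{\sum_{k=1}^{n} u_{ik}^{m}}
```

As c approaches n, each data point can be given its own center, so the objective J_m tends to 0. Validity measures built directly on J_m therefore tend to keep decreasing as c grows, which is the monotonic tendency the abstract mentions.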
