• Title/Summary/Keyword: Pattern mining

Performance Comparison of Clustering using Discritization Algorithm (이산화 알고리즘을 이용한 계층적 클러스터링의 실험적 성능 평가)

  • Won, Jae Kang;Lee, Jeong Chan;Jung, Yong Gyu;Lee, Young Ho
    • Journal of Service Research and Studies / v.3 no.2 / pp.53-60 / 2013
  • Various data mining techniques have been developed for obtaining information from large data sets. In pattern recognition and machine learning, which have been among the most active areas in recent years, most existing learning algorithms build rules or decision models from categorical attributes. Real-world data, however, often consist of numeric attributes, or mix numeric attributes with ordinary categorical ones. In such cases the numeric attributes must be converted into a type that the learning algorithm can use. In this paper, the domain of each numeric attribute is divided into several segments using discretization techniques, and clustering is described alongside other data mining techniques. Clustering splits a large database into smaller groups of records with similar characteristics: given a finite set of patterns in the pattern space, patterns that lie close to each other together make up a cluster. Patterns are extracted from the given data without specifying any particular category in advance. Clustering is described as a technique that classifies data by grouping similar items.

A New Memory-based Learning using Dynamic Partition Averaging (동적 분할 평균을 이용한 새로운 메모리 기반 학습기법)

  • Yih, Hyeong-Il
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.4 / pp.456-462 / 2008
  • Classification, which assigns new data to one of a set of given classes, is one of the most widely used data mining techniques. Memory-Based Reasoning (MBR) is a reasoning method for classification problems. MBR simply keeps many patterns in memory in their original feature-vector form, without rules for reasoning, and uses a distance function to classify a test pattern. As the number of training patterns grows, both the memory size and the amount of computation needed for reasoning grow as well. NGE, FPA, and RPA are well-known MBR algorithms that have been shown to perform satisfactorily, but they suffer from serious problems of memory usage and lengthy computation. In this paper, we propose the DPA (Dynamic Partition Averaging) algorithm. It chooses partition points by calculating the Gini index over the entire pattern space and partitions the pattern space dynamically. If all patterns in a partition belong to a single class, it generates a representative pattern from that partition; otherwise, it partitions that partition again by the same method. The proposed method is shown to perform comparably to k-NN with far fewer patterns, and better than the EACH system, which implements the NGE theory, and than FPA and RPA.
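
The abstract names only the ingredients of DPA (Gini-index partition points, and a representative pattern for each single-class partition), not its details. The following is a minimal sketch of those two steps on a single numeric feature, not the authors' implementation. The function names `gini`, `best_split`, and `representative`, the midpoint thresholds, and the use of the mean as the representative pattern are all assumptions made for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best binary split of one feature.

    Candidate thresholds are midpoints between consecutive distinct values
    (an assumption). If no split improves on the unsplit impurity, the
    threshold is None.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold fits between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((low + high) / 2, score)
    return best


def representative(patterns):
    """Mean vector of the patterns in a single-class partition (assumed averaging rule)."""
    dim = len(patterns[0])
    return [sum(p[d] for p in patterns) / len(patterns) for d in range(dim)]
```

A recursive partitioner would call `best_split` on each impure partition and `representative` on each pure one, so that only one stored pattern per pure partition is compared against a test pattern.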

Analysis of shopping website visit types and shopping pattern (쇼핑 웹사이트 탐색 유형과 방문 패턴 분석)

  • Choi, Kyungbin;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.85-107 / 2019
  • Online consumers either browse products belonging to a particular product line or brand with the intention of purchasing, or simply navigate widely and leave without making a purchase. Research on the behavior and purchases of online consumers has progressed steadily, and related services and applications based on consumer behavior data have been developed in practice. In recent years, customization strategies and recommendation systems have been adopted thanks to the development of big data technology, and attempts are being made to optimize users' shopping experience. Even so, it is still very unlikely that online consumers who visit a website will actually move on to the purchase stage. This is because online consumers do not visit a website only to purchase products; they use and browse websites differently according to their shopping motives and purposes. Therefore, to understand the behavior of online consumers, it is important to analyze the various types of visits, not only visits that lead to a purchase. In this study, we performed a clustering analysis of sessions based on the clickstream data of an e-commerce company in order to explain the diversity and complexity of online consumers' search behavior and to classify that behavior into types. For the analysis, we converted more than 8 million page-level data points into visit-level sessions, resulting in a total of over 500,000 website visit sessions. For each visit session, 12 characteristics such as page views, duration, search diversity, and page type concentration were extracted for the clustering analysis. Considering the size of the data set, we performed the analysis using the Mini-Batch K-means algorithm, which has advantages in learning speed and efficiency while maintaining clustering performance similar to that of the K-means algorithm. The optimal number of clusters was found to be four, and differences in session characteristics and purchase rates were identified for each cluster. An online consumer visits a website several times, learns about a product, and then decides whether to purchase it. To analyze this purchasing process over several visits, we constructed visit sequence data for each consumer based on the navigation patterns derived from the clustering analysis. Each visit sequence contains the series of visits up to one purchase, and the items that make up a sequence are the cluster labels derived above. We built separate sequence data for consumers who made purchases and for consumers who only explored products without purchasing during the same period. Sequential pattern mining was then applied to extract frequent patterns from each set of sequence data. The minimum support was set to 10%, and each frequent pattern consists of a sequence of cluster labels. While some patterns are common to both sets of sequence data, other frequent patterns are derived from only one of them. Through a comparative analysis of the extracted frequent patterns, we found that consumers who made purchases showed a visit pattern of repeatedly searching for a specific product before deciding to purchase it. The implication of this study is that we analyzed the search types of online consumers using large-scale clickstream data and analyzed their patterns to explain the purchasing process from a data-driven point of view. Most studies on the typology of online consumers have focused on the characteristics of each type and on the key factors that distinguish the types. In this study, we classified the behavior of online consumers into types and further analyzed the order in which these types combine to form a series of search patterns. In addition, online retailers will be able to try to improve their purchase conversion through marketing strategies and recommendations tailored to the various visit types, and to evaluate the effect of those strategies through changes in consumers' visit patterns.
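
The abstract does not say which sequential pattern mining algorithm was used. As a rough illustration of what a 10% minimum support over sequences of cluster labels means, here is a minimal level-wise sketch. It assumes that a pattern is supported by a visit sequence when its labels appear in the same order, with gaps allowed; the function names and the `max_len` cap are hypothetical.

```python
def contains(sequence, pattern):
    """True if `pattern` occurs in `sequence` as an ordered subsequence (gaps allowed)."""
    remaining = iter(sequence)
    # `label in remaining` consumes the iterator up to the first match,
    # so later labels must be found after earlier ones.
    return all(label in remaining for label in pattern)


def support(sequences, pattern):
    """Fraction of sequences that contain `pattern`."""
    return sum(contains(s, pattern) for s in sequences) / len(sequences)


def frequent_patterns(sequences, min_support=0.10, max_len=3):
    """Level-wise search: extend each frequent pattern by one label at a time.

    Extending only frequent patterns is safe because every prefix of a
    frequent pattern is itself at least as frequent.
    """
    labels = sorted({label for s in sequences for label in s})
    frequent = {}
    level = [()]
    for _ in range(max_len):
        next_level = []
        for prefix in level:
            for label in labels:
                candidate = prefix + (label,)
                share = support(sequences, candidate)
                if share >= min_support:
                    frequent[candidate] = share
                    next_level.append(candidate)
        level = next_level
    return frequent
```

Running `frequent_patterns` separately on the purchaser sequences and the non-purchaser sequences, then comparing the two sets of keys, mirrors the comparison of common and one-sided patterns described in the abstract.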

A Study of a Personalized Curation Service and Business Model based on Book Information (도서정보 기반의 고객 맞춤형 큐레이션 서비스 및 비즈니스 모델 연구)

  • Kwon, Hyeog-In;Na, Yun-Bin;Yu, Mi-Ok;Choi, Kwang-Sun
    • Journal of Information Technology Services / v.14 no.1 / pp.251-262 / 2015
  • This study examines the conceptual definition of domestic book curation, which is still in its early stages, the need to develop related services and businesses, and domestic and overseas cases of such services. It also investigates the problems of book recommendation services and the difficulties anticipated in implementing them, and proposes a business model as a new IT service to address them. Specifically, on the service side, it presents the collection of book information and customer information (interests and purchase patterns), the procedure for mining the collected information, and the process of implementing visualization. On the business side, it presents the technology transfer of the developed solution, the construction cost, and a method of charging commission on content sales. Developing and using such services is expected to bring diverse social and economic effects, namely promoting the distribution of excellent books that have so far been kept in dead storage for lack of marketing support, and conveniently recommending suitable and necessary books to readers.

The effect of ball size on the hollow center cracked disc (HCCD) in Brazilian test

  • Haeri, Hadi;Sarfarazi, Vahab;Zhu, Zheming;Moradizadeh, Masih
    • Computers and Concrete / v.22 no.4 / pp.373-381 / 2018
  • A hollow center cracked disc (HCCD) in the Brazilian test was modelled numerically to study crack propagation in the pre-cracked disc. Pre-existing edge cracks in the disc models were considered in order to investigate the crack propagation and coalescence paths within the modelled samples. The effect of particle size on the HCCD in the Brazilian test was also considered. The results show that the failure pattern remains constant as the ball diameter increases. Tensile cracks are the dominant mode of failure. These cracks initiate from the notch tip, propagate parallel to the loading axis, and coalesce with the upper model boundary. The number of cracks increases as the ball diameter decreases. Also, the tensile fracture toughness decreases as the particle size increases. This research aims to improve the understanding of crack propagation and crack coalescence in brittle materials, which is of paramount importance in the stability analysis of rock and concrete structures such as underground openings, rock slopes, and tunnels.

Analysis on the Usage of Internet Games for Children with Decision Tree Rules (의사결정규칙을 이용한 아동의 교육용 인터넷 게임 활용실태 분석)

  • Kim, Yong-Dae;Jung, Hui-Suk;Choi, Eun-Jeong;Park, Byung-Sun;Han, Jeong-Hye
    • Journal of The Korean Association of Information Education / v.5 no.3 / pp.389-400 / 2001
  • Internet games have spread quickly on the web, and there are many kinds of fun games that users can play easily, so they can be applied to ICT (Information and Communication Technology) education. In this paper, we analyze the use of Internet games by children and teachers with the decision tree algorithm, one of the popular data mining techniques. The results show the patterns of children's and teachers' use of Internet games.

K-means clustering using a center of gravity for grid-based sample (그리드 기반 표본의 무게중심을 이용한 케이-평균군집화)

  • Lee, Sun-Myung;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.21 no.1 / pp.121-128 / 2010
  • K-means clustering is an iterative algorithm in which items are moved among sets of clusters until the desired set is reached. It has been widely used in many applications, such as market research, pattern analysis and recognition, and image processing. It can identify dense and sparse regions among data attributes or object attributes. However, the k-means algorithm requires a long time to obtain the desired k clusters because it is primitive and exploratory. In this paper, we propose a new method of k-means clustering that uses the center of gravity of a grid-based sample. It is faster than traditional clustering methods while maintaining accuracy.
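
The abstract does not spell out how the grid-based sample is built. One plausible reading, sketched here purely as an assumption, is to bin the points into grid cells, replace each non-empty cell by its center of gravity weighted by its point count, and run k-means on that much smaller sample. The names `grid_centers` and `weighted_kmeans`, the 2-D square grid, and the caller-supplied starting centers are all illustrative choices.

```python
from collections import defaultdict


def grid_centers(points, cell):
    """Bin 2-D points into square cells of side `cell`.

    Returns one ((x, y), count) pair per non-empty cell, where (x, y) is the
    center of gravity of the points that fell in that cell.
    """
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell), int(y // cell))].append((x, y))
    return [
        ((sum(p[0] for p in ps) / len(ps), sum(p[1] for p in ps) / len(ps)), len(ps))
        for ps in cells.values()
    ]


def weighted_kmeans(samples, centers, iters=20):
    """Lloyd iterations on ((x, y), weight) samples, starting from `centers`."""
    centers = list(centers)
    for _ in range(iters):
        sums = [[0.0, 0.0, 0.0] for _ in centers]  # weighted x, weighted y, total weight
        for (x, y), w in samples:
            nearest = min(
                range(len(centers)),
                key=lambda k: (x - centers[k][0]) ** 2 + (y - centers[k][1]) ** 2,
            )
            sums[nearest][0] += w * x
            sums[nearest][1] += w * y
            sums[nearest][2] += w
        # A center that attracted no samples keeps its previous position.
        updated = [
            (sx / sw, sy / sw) if sw else center
            for (sx, sy, sw), center in zip(sums, centers)
        ]
        if updated == centers:
            break
        centers = updated
    return centers
```

Because each iteration touches one weighted sample per occupied cell rather than every original point, the cost per iteration depends on the grid resolution instead of the data set size, which is the kind of speed-up the abstract claims.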

Proactive Retrieval Method Using Context Patterns in Ubiquitous Computing (유비쿼터스 컴퓨팅에서 컨텍스트 패턴을 이용한 프로액티브 검색 기법)

  • Kim, Sung-Rim;Kwon, Joon-Hee
    • Journal of Korea Multimedia Society / v.7 no.8 / pp.1017-1024 / 2004
  • A ubiquitous system requires an intelligent environment and a system that perceives context proactively. This paper describes a proactive retrieval method using context patterns in ubiquitous computing. As the user's context changes, new information is delivered proactively based on the user's context patterns. For proactive retrieval, we extract context patterns based on sequential pattern discovery and association rules from data mining. By using the context patterns to store only the information that will be needed in the near future, we address the speed and storage capacity limitations of mobile devices in ubiquitous computing. We explain the algorithms with an example. Several experiments were performed, and the experimental results show that our method achieves good information retrieval performance.

Semiautomatic Pattern Mining for Training a Relation Extraction Model (관계추출 모델 학습을 위한 반자동 패턴 마이닝)

  • Choi, GyuHyeon;Nam, Sangha;Choi, Key-Sun
    • 한국어정보학회:학술대회논문집 / 2016.10a / pp.257-262 / 2016
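  • This paper describes a study on relation extraction, which identifies structured triples expressing the relation between two entities from unstructured natural-language sentences. Compared with the rule-based approach, in which a person enters the forms in which triples are expressed after manual linguistic analysis, the machine-learning-based approach, in which a machine learns the expression forms from data, can capture a wider variety of expression forms. Machine learning requires training data for the model, and approaches can be divided into supervised learning, distant supervision, and others according to how the training data are collected. Supervised learning has the drawback of requiring much human effort because people must create the training data, but because it uses high-quality data it makes a high-performance relation extraction model easier to build. Distant supervision can create training data without human effort, but because the data quality is lower, high performance is hard to expect from the resulting relation extraction model. This study proposes a learning method for training relation extraction models that offers a compromise, in which the weaknesses of supervised learning and distant supervision offset each other.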

Investigation of Some Blast Design and Evaluation Parameters for Fragmentation in Limestone Quarries (석회석 광산의 파쇄도 관련 발파설계 및 평가 변수들에 대한 고찰)

  • Rai, Piyush;Yang, Hyung-Sik
    • Tunnel and Underground Space / v.20 no.3 / pp.183-193 / 2010
  • This paper highlights some important fragmentation issues experienced in limestone quarry blast rounds. In light of these issues, the paper outlines the influence of a few important design parameters that can alter blast performance and thereby resolve the issues in field-scale blast rounds. A comprehensive field-based program for evaluating such blast rounds is also suggested. The knowledge presented in the paper, backed by sufficient images, is largely based on the authors' experience in designing, implementing, and evaluating numerous field-scale blast rounds in cement-grade limestone quarries.