• Title/Summary/Keyword: Comparison mining

Study on the Comparison and Analysis of Data Mining Models for the Efficient Customer Credit Evaluation (효율적인 신용평가를 위한 데이터마이닝 모형의 비교.분석에 관한 연구)

  • 김갑식
    • Journal of Information Technology Applications and Management
    • /
    • v.11 no.1
    • /
    • pp.161-174
    • /
    • 2004
  • This study aims to suggest an optimized data mining model for efficient customer credit evaluation in the capital finance industry. To accomplish the research objective, various data mining models for customer credit evaluation are compared and analyzed. Existing models such as Multi-Layered Perceptrons, Multivariate Discriminant Analysis, Radial Basis Function, Decision Tree, and Logistic Regression are employed to analyze customer information in the capital finance market and the detailed data of capital financing transactions. Finally, the results from an integrated model utilizing a genetic algorithm are compared with those of each individual model mentioned above. The results reveal that the integrated model is superior to the existing models.

Comparison of Multiway Discretization Algorithms for Data Mining

  • Kim, Jeong-Suk;Jang, Young-Mi;Na, Jong-Hwa
    • Journal of the Korean Data and Information Science Society
    • /
    • v.16 no.4
    • /
    • pp.801-813
    • /
    • 2005
  • Discretization algorithms for continuous data have been actively studied in the area of data mining. Discretization is very important in data analysis, especially for efficient model selection in data mining. In this paper, we introduce the principles of some multiway discretization algorithms, including the KEX, 1R, and CN4 algorithms, and investigate their efficiency through a numerical study. For various underlying distributions, we compare these algorithms in terms of misclassification rate.

Comparison of the Center for Children's Foodservice Management in 2012, 2014, and 2016 Using Big Data and Opinion Mining (2012년, 2014년과 2016년의 어린이급식관리지원센터에 대한 빅데이터와 오피니언 마이닝을 통한 비교)

  • Jung, Eun-Jin;Chang, Un-Jae
    • Journal of the Korean Dietetic Association
    • /
    • v.23 no.2
    • /
    • pp.192-201
    • /
    • 2017
  • This study compared the Center for Children's Foodservice Management in 2012, 2014, and 2016 using big data and opinion mining. Data on the Center for Children's Foodservice Management were collected from the portal site Naver, from January 1 to December 31 in 2012, 2014, and 2016, and analyzed by keyword frequency analysis, influx route analysis of data, polarity analysis via opinion mining, and positive and negative keyword analysis by polarity analysis. The results showed that "nursery" had the highest rank every year and that education supported by the Center for Children's Foodservice Management has increased significantly. The influx route analysis showed that the influx of data has increased. Blogs and cafés, which contain a considerable amount of information posted by mothers, should be helpful as channels for public relations and participant recruitment. Polarity analysis using opinion mining showed that the positive image of the Center for Children's Foodservice Management has increased. Therefore, the Center for Children's Foodservice Management was well suited to its purpose, and public interest has been increasing steadily. In the near future, the Center for Children's Foodservice Management is expected to gain good recognition if various programs for family participation are developed and advertised.

Association Rule Discovery Considering Strategic Importance: WARM (전략적 중요도를 고려한 연관규칙의 발견: WARM)

  • Choi, Doug-Won
    • The KIPS Transactions:PartD
    • /
    • v.17D no.4
    • /
    • pp.311-316
    • /
    • 2010
  • This paper presents a weight adjusted association rule mining algorithm (WARM). The key ideas of the algorithm are assigning a weight to each strategic factor and normalizing raw scores within each strategic factor. It is an extension of the earlier algorithm TSAA (transitive support association Apriori), and strategic importance is reflected by considering factors such as the profit, marketing value, and customer satisfaction of each item. A performance analysis based on a real-world database was carried out, and a comparison of the mining outcomes obtained from three association rule mining algorithms (Apriori, TSAA, and WARM) is provided. The results indicate that each algorithm exhibits distinct and characteristic behavior in association rule mining.
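
The two ideas named in this abstract, normalizing raw scores within each strategic factor and weighting the factors, can be illustrated with a minimal sketch. The abstract does not give WARM's formulas, so everything below is an assumption made for illustration: min-max normalization, a weighted average across factors, and a weighted support defined as plain support multiplied by the mean item weight. The function names and sample data are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale one factor's raw scores to [0, 1] (assumed normalization)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All items score the same on this factor; treat them as equal.
        return {item: 1.0 for item in scores}
    return {item: (value - lo) / (hi - lo) for item, value in scores.items()}


def item_weights(factor_scores, factor_weights):
    """Combine normalized factor scores into one weight per item.

    factor_scores: {factor: {item: raw score}}
    factor_weights: {factor: strategic weight}
    Uses a weighted average, which is an assumption, not the paper's formula.
    """
    normalized = {f: min_max_normalize(s) for f, s in factor_scores.items()}
    total = sum(factor_weights[f] for f in factor_scores)
    items = next(iter(factor_scores.values())).keys()
    return {
        item: sum(factor_weights[f] * normalized[f][item] for f in factor_scores) / total
        for item in items
    }


def weighted_support(itemset, transactions, weights):
    """Plain support scaled by the mean weight of the items (assumed definition)."""
    count = sum(1 for t in transactions if itemset <= t)
    support = count / len(transactions)
    mean_weight = sum(weights[i] for i in itemset) / len(itemset)
    return support * mean_weight


# Hypothetical data: two strategic factors scored for three items.
factor_scores = {
    "profit": {"a": 0, "b": 5, "c": 10},
    "satisfaction": {"a": 4, "b": 2, "c": 0},
}
factor_weights = {"profit": 3, "satisfaction": 1}
transactions = [{"a", "b"}, {"b", "c"}, {"a", "b", "c"}, {"c"}]

weights = item_weights(factor_scores, factor_weights)
print(weights)
print(weighted_support({"b", "c"}, transactions, weights))
```

Normalizing within each factor keeps a factor measured on a large scale (such as profit) from dominating one measured on a small scale before the strategic weights are applied.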

Data Mining for Knowledge Management in a Health Insurance Domain

  • Chae, Young-Moon;Ho, Seung-Hee;Cho, Kyoung-Won;Lee, Dong-Ha;Ji, Sun-Ha
    • Journal of Intelligence and Information Systems
    • /
    • v.6 no.1
    • /
    • pp.73-82
    • /
    • 2000
  • This study examined the characteristics of knowledge discovery and data mining algorithms to demonstrate how they can be used to predict health outcomes and provide policy information for hypertension management using the Korea Medical Insurance Corporation database. Specifically, this study validated the predictive power of data mining algorithms by comparing the performance of logistic regression and two decision tree algorithms, CHAID (Chi-squared Automatic Interaction Detection) and C5.0 (a variant of C4.5), since logistic regression has assumed a major position in the healthcare field as a method for predicting or classifying health outcomes based on the specific characteristics of each individual case. This comparison was performed using the test set of 4,588 beneficiaries and the training set of 13,689 beneficiaries that were used to develop the models. Contrary to the previous study, the CHAID algorithm performed better than logistic regression in predicting hypertension, but C5.0 had the lowest predictive power. In addition, the CHAID algorithm and association rules also provided the segment characteristics for the risk factors that may be used in developing hypertension management programs. This showed that the data mining approach can be a useful analytic tool for predicting and classifying health outcomes data.

Performance Comparison of Clustering Techniques for Spatio-Temporal Data (시공간 데이터를 위한 클러스터링 기법 성능 비교)

  • Kang Nayoung;Kang Juyoung;Yong Hwan-Seung
    • Journal of Intelligence and Information Systems
    • /
    • v.10 no.2
    • /
    • pp.15-37
    • /
    • 2004
  • With the growth in the size of datasets, data mining has recently become an important research topic. In particular, interest has grown in spatio-temporal data mining, a method for analyzing massive spatio-temporal data collected from a wide variety of applications, such as GPS data, trajectory data from surveillance systems, and earth geographic data. In earlier approaches, conventional clustering algorithms were applied as spatio-temporal data mining techniques without any modification. In this paper, we focus on SOM, the most common clustering algorithm applied to clustering analysis in data mining, and develop a spatio-temporal data mining module based on it. In addition, we analyze the clustering results of the developed SOM module and compare them with those of the K-means and agglomerative hierarchical algorithms in terms of homogeneity, separation, silhouette width, and accuracy. We also developed a specialized visualization module for more accurate interpretation of mining results.

A Comparison of Performance between STMP/MST and Existing Spatio-Temporal Moving Pattern Mining Methods (STMP/MST와 기존의 시공간 이동 패턴 탐사 기법들과의 성능 비교)

  • Lee, Yon-Sik;Kim, Eun-A
    • Journal of Internet Computing and Services
    • /
    • v.10 no.5
    • /
    • pp.49-63
    • /
    • 2009
  • The performance of spatio-temporal moving pattern mining depends on how the huge set of spatio-temporal data is analyzed and processed. Several methods have been presented to solve the problems of existing spatio-temporal moving pattern mining methods[1-10], such as increasing execution time and required memory size during pattern mining, but they have not solved them adequately. Thus, in preceding research we proposed the STMP/MST method[11] to effectively extract sequential and/or periodic frequent moving patterns from a huge set of spatio-temporal moving data. The proposed method reduces pattern mining execution time by using a moving sequence tree based on a hash tree. To minimize the required memory space, it also generalizes detailed historical data, including spatio-temporal attributes, into real-world scopes of space and time by using a spatio-temporal concept hierarchy. In this paper, to verify the effectiveness of the STMP/MST method, we compare and analyze its performance against existing spatio-temporal moving pattern mining methods based on the quantity of mining data and the minimum support factor.

A Sequential Pattern Mining based on Dynamic Weight in Data Stream (스트림 데이터에서 동적 가중치를 이용한 순차 패턴 탐사 기법)

  • Choi, Pilsun;Kim, Hwan;Kim, Daein;Hwang, Buhyun
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.2
    • /
    • pp.137-144
    • /
    • 2013
  • Sequential pattern mining finds frequent patterns in a data set in time order. In this field, dynamic weighted sequential pattern mining is applied to computing environments that change over time, and it can be used in a variety of environments by applying changes in dynamic weight. In this paper, we propose a new sequence data mining method that explores stream data by applying dynamic weights. This method reduces the candidate patterns that must be explored by using a dynamic weight according to the relative time sequence, and it can quickly find frequent sequence patterns as data is input and output by using a hash structure. The method reduces memory usage and processing time compared with existing methods. We show the importance of dynamic weighted mining through a comparison of different weighted sequential pattern mining techniques.

Comparison of similarity measures and community detection algorithms using collaboration filtering (협업 필터링을 사용한 유사도 기법 및 커뮤니티 검출 알고리즘 비교)

  • Ugli, Sadriddinov Ilkhomjon Rovshan;Hong, Minpyo;Park, Doo-Soon
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2022.05a
    • /
    • pp.366-369
    • /
    • 2022
  • The glut of information has complicated data analysis and other procedures, including data mining. Many algorithms have been devised in big data and data mining to address this problem. In this paper, we compare several similarity measures and community detection algorithms in collaborative filtering for movie recommendation systems. The MovieLens data set was used for the empirical experiment. We applied three similarity measures: cosine, Euclidean, and Pearson. Moreover, betweenness and eigenvector centrality were used to detect communities in the network. As a result, we identified which algorithm is more suitable than its counterpart in terms of recommendation accuracy.
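
For readers unfamiliar with the three similarity measures named above, the following is a minimal sketch of how each could be computed between two users' rating vectors. It is not the authors' implementation: the function names and sample ratings are hypothetical, the vectors are assumed to cover the same movies in the same order, and mapping Euclidean distance to a similarity as 1 / (1 + distance) is one common convention chosen here as an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # undefined for a zero vector; return 0 by convention
    return dot / (norm_a * norm_b)


def euclidean_similarity(a, b):
    """Euclidean distance mapped into (0, 1] via 1 / (1 + d) (assumed mapping)."""
    distance = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + distance)


def pearson_similarity(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    dev_a = [x - mean_a for x in a]
    dev_b = [y - mean_b for y in b]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    return numerator / denominator if denominator else 0.0


# Hypothetical ratings by two users for the same four movies.
user_1 = [5, 3, 4, 1]
user_2 = [4, 2, 5, 1]

for name, measure in [
    ("cosine", cosine_similarity),
    ("euclidean", euclidean_similarity),
    ("pearson", pearson_similarity),
]:
    print(name, round(measure(user_1, user_2), 3))
```

Pearson correlation subtracts each user's mean rating first, so it is insensitive to one user rating everything systematically higher than another, whereas cosine and Euclidean similarity are not.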