• Title/Summary/Keyword: 연관 규칙 알고리즘

Search Result 200, Processing Time 0.032 seconds

Partition Algorithm for Updating Discovered Association Rules in Data Mining (데이터마이닝에서 기존의 연관규칙을 갱신하는 분할 알고리즘)

  • 이종섭;황종원;강맹규
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.23 no.54
    • /
    • pp.1-11
    • /
    • 2000
  • This study suggests the partition algorithm for updating the discovered association rules in large database, because a database may allow frequent or occasional updates, and such update may not only invalidate some existing strong association rules, but also turn some weak rules into strong ones. the Partition algorithm updates strong association rules efficiently in the whole update database reuseing the information of the old large itemsets. Partition algorithms that is suggested in this study scans an incremental database in view of the fact that it is difficult to find the new set of large itemset in the whole updated database after an incremental database is added to the original database. This method of generating large itemsets is different from that of FUP(Fast Update) and KDP(Kim Dong Pil)

  • PDF

-An Algorithm for Cube-based Mining Association Rules and Application to Database Marketing (데이터 큐브를 이용한 연관규칙 발견 알고리즘)

  • 한경록;김재련
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.23 no.54
    • /
    • pp.27-36
    • /
    • 2000
  • The problem of discovering association rules is an emerging research area, whose goal is to extract significant patterns or interesting rules from large databases and several algorithms for mining association rules have been applied to item-oriented sales transaction databases. Data warehouses and OLAP engines are expected to be widely available. OLAP and data mining are complementary; both are important parts of exploiting data. Our study shows that data cube is an efficient structure for mining association rules. OLAP databases are expected to be a major platform for data mining in the future. In this paper, we present an efficient and effective algorithm for mining association rules using data cube. The algorithm can be applicable to enhance the power of competitiveness of business organizations by providing rapid decision support and efficient database marketing through customer segmentation.

  • PDF

Development of Intelligent Job Classification System based on Job Posting on Job Sites (구인구직사이트의 구인정보 기반 지능형 직무분류체계의 구축)

  • Lee, Jung Seung
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.123-139
    • /
    • 2019
  • The job classification system of major job sites differs from site to site and is different from the job classification system of the 'SQF(Sectoral Qualifications Framework)' proposed by the SW field. Therefore, a new job classification system is needed for SW companies, SW job seekers, and job sites to understand. The purpose of this study is to establish a standard job classification system that reflects market demand by analyzing SQF based on job offer information of major job sites and the NCS(National Competency Standards). For this purpose, the association analysis between occupations of major job sites is conducted and the association rule between SQF and occupation is conducted to derive the association rule between occupations. Using this association rule, we proposed an intelligent job classification system based on data mapping the job classification system of major job sites and SQF and job classification system. First, major job sites are selected to obtain information on the job classification system of the SW market. Then We identify ways to collect job information from each site and collect data through open API. Focusing on the relationship between the data, filtering only the job information posted on each job site at the same time, other job information is deleted. Next, we will map the job classification system between job sites using the association rules derived from the association analysis. We will complete the mapping between these market segments, discuss with the experts, further map the SQF, and finally propose a new job classification system. As a result, more than 30,000 job listings were collected in XML format using open API in 'WORKNET,' 'JOBKOREA,' and 'saramin', which are the main job sites in Korea. After filtering out about 900 job postings simultaneously posted on multiple job sites, 800 association rules were derived by applying the Apriori algorithm, which is a frequent pattern mining. Based on 800 related rules, the job classification system of WORKNET, JOBKOREA, and saramin and the SQF job classification system were mapped and classified into 1st and 4th stages. In the new job taxonomy, the first primary class, IT consulting, computer system, network, and security related job system, consisted of three secondary classifications, five tertiary classifications, and five fourth classifications. The second primary classification, the database and the job system related to system operation, consisted of three secondary classifications, three tertiary classifications, and four fourth classifications. The third primary category, Web Planning, Web Programming, Web Design, and Game, was composed of four secondary classifications, nine tertiary classifications, and two fourth classifications. The last primary classification, job systems related to ICT management, computer and communication engineering technology, consisted of three secondary classifications and six tertiary classifications. In particular, the new job classification system has a relatively flexible stage of classification, unlike other existing classification systems. WORKNET divides jobs into third categories, JOBKOREA divides jobs into second categories, and the subdivided jobs into keywords. saramin divided the job into the second classification, and the subdivided the job into keyword form. The newly proposed standard job classification system accepts some keyword-based jobs, and treats some product names as jobs. In the classification system, not only are jobs suspended in the second classification, but there are also jobs that are subdivided into the fourth classification. This reflected the idea that not all jobs could be broken down into the same steps. We also proposed a combination of rules and experts' opinions from market data collected and conducted associative analysis. Therefore, the newly proposed job classification system can be regarded as a data-based intelligent job classification system that reflects the market demand, unlike the existing job classification system. This study is meaningful in that it suggests a new job classification system that reflects market demand by attempting mapping between occupations based on data through the association analysis between occupations rather than intuition of some experts. However, this study has a limitation in that it cannot fully reflect the market demand that changes over time because the data collection point is temporary. As market demands change over time, including seasonal factors and major corporate public recruitment timings, continuous data monitoring and repeated experiments are needed to achieve more accurate matching. The results of this study can be used to suggest the direction of improvement of SQF in the SW industry in the future, and it is expected to be transferred to other industries with the experience of success in the SW industry.

Development of Recommendation Agents through Web Log Analysis (웹 로그 분석을 이용한 추천 에이전트의 개발)

  • 김성학;이창훈
    • Journal of the Korea Computer Industry Society
    • /
    • v.4 no.10
    • /
    • pp.621-630
    • /
    • 2003
  • Web logs are the information recorded by a web server when users access the web sites, and due to a speedy rising of internet usage, the worth of their practical use has become increasingly important. Analyzing such logs can use to determine the patterns representing users' navigational behavior in a Web site and restructure a Web site to create a more effective organizational presence. For these applications, the generally used key methods in many studies are association rules and sequential patterns based by Apriori algorithms, which are widely used to extract correlation among patterns. But Apriori inhere inefficiency in computing cost when applied to large databases. In this paper, we develop a new algorithm for mining interesting patterns which is faster than Apriori algorithm and recommendation agents which could provide a system manager with valuable information that are accessed sequentially by many users.

  • PDF

Implementation of abnormal behavior detection Algorithm and Optimizing the performance of Algorithm (비정상행위 탐지 알고리즘 구현 및 성능 최적화 방안)

  • Shin, Dae-Cheol;Kim, Hong-Yoon
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.11 no.11
    • /
    • pp.4553-4562
    • /
    • 2010
  • With developing networks, information security is going to be important and therefore lots of intrusion detection system has been developed. Intrusion detection system has abilities to detect abnormal behavior and unknown intrusions also it can detect intrusions by using patterns studied from various penetration methods. Various algorithms are studying now such as the statistical method for detecting abnormal behavior, extracting abnormal behavior, and developing patterns that can be expected. Etc. This study using clustering of data mining and association rule analyzes detecting areas based on two models and helps design detection system which detecting abnormal behavior, unknown attack, misuse attack in a large network.

A Study on a Working Pattern Analysis Prototype using Correlation Analysis and Linear Regression Analysis in Welding BigData Environment (용접 빅데이터 환경에서 상관분석 및 회귀분석을 이용한 작업 패턴 분석 모형에 관한 연구)

  • Jung, Se-Hoon;Sim, Chun-Bo
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.9 no.10
    • /
    • pp.1071-1078
    • /
    • 2014
  • Recently, information providing service using Big Data is being expanded. Big Data processing technology is actively being academic research to an important issue in the IT industry. In this paper, we analyze a skilled pattern of welder through Big Data analysis or extraction of welding based on R programming. We are going to reduce cost on welding work including weld quality, weld operation time by providing analyzed results non-skilled welder. Welding has a problem that should be invested long time to be a skilled welder. For solving these issues, we apply connection rules algorithms and regression method to much pattern variable for welding pattern analysis of skilled welder. We analyze a pattern of skilled welder according to variable of analyzed rules by analyzing top N rules. In this paper, we confirmed the pattern structure of power consumption rate and wire consumption length through experimental results of analyzed welding pattern analysis.

BCDR algorithm for network estimation based on pseudo-likelihood with parallelization using GPU (유사가능도 기반의 네트워크 추정 모형에 대한 GPU 병렬화 BCDR 알고리즘)

  • Kim, Byungsoo;Yu, Donghyeon
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.2
    • /
    • pp.381-394
    • /
    • 2016
  • Graphical model represents conditional dependencies between variables as a graph with nodes and edges. It is widely used in various fields including physics, economics, and biology to describe complex association. Conditional dependencies can be estimated from a inverse covariance matrix, where zero off-diagonal elements denote conditional independence of corresponding variables. This paper proposes a efficient BCDR (block coordinate descent with random permutation) algorithm using graphics processing units and random permutation for the CONCORD (convex correlation selection method) based on the BCD (block coordinate descent) algorithm, which estimates a inverse covariance matrix based on pseudo-likelihood. We conduct numerical studies for two network structures to demonstrate the efficiency of the proposed algorithm for the CONCORD in terms of computation times.

Performance Improvement of Collaborative Filtering System Using Associative User′s Clustering Analysis for the Recalculation of Preference and Representative Attribute-Neighborhood (선호도 재계산을 위한 연관 사용자 군집 분석과 Representative Attribute -Neighborhood를 이용한 협력적 필터링 시스템의 성능향상)

  • Jung, Kyung-Yong;Kim, Jin-Su;Kim, Tae-Yong;Lee, Jung-Hyun
    • The KIPS Transactions:PartB
    • /
    • v.10B no.3
    • /
    • pp.287-296
    • /
    • 2003
  • There has been much research focused on collaborative filtering technique in Recommender System. However, these studies have shown the First-Rater Problem and the Sparsity Problem. The main purpose of this Paper is to solve these Problems. In this Paper, we suggest the user's predicting preference method using Bayesian estimated value and the associative user clustering for the recalculation of preference. In addition to this method, to complement a shortcoming, which doesn't regard the attribution of item, we use Representative Attribute-Neighborhood method that is used for the prediction when we find the similar neighborhood through extracting the representative attribution, which most affect the preference. We improved the efficiency by using the associative user's clustering analysis in order to calculate the preference of specific item within the cluster item vector to the collaborative filtering algorithm. Besides, for the problem of the Sparsity and First-Rater, through using Association Rule Hypergraph Partitioning algorithm associative users are clustered according to the genre. New users are classified into one of these genres by Naive Bayes classifier. In addition, in order to get the similarity value between users belonged to the classified genre and new users, and this paper allows the different estimated value to item which user evaluated through Naive Bayes learning. As applying the preference granted the estimated value to Pearson correlation coefficient, it can make the higher accuracy because the errors that cause the missing value come less. We evaluate our method on a large collaborative filtering database of user rating and it significantly outperforms previous proposed method.

Negative Selection Algorithm based Multi-Level Anomaly Intrusion Detection for False-Positive Reduction (과탐지 감소를 위한 NSA 기반의 다중 레벨 이상 침입 탐지)

  • Kim, Mi-Sun;Park, Kyung-Woo;Seo, Jae-Hyun
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.16 no.6
    • /
    • pp.111-121
    • /
    • 2006
  • As Internet lastly grows, network attack techniques are transformed and new attack types are appearing. The existing network-based intrusion detection systems detect well known attack, but the false-positive or false-negative against unknown attack is appearing high. In addition, The existing network-based intrusion detection systems is difficult to real time detection against a large network pack data in the network and to response and recognition against new attack type. Therefore, it requires method to heighten the detection rate about a various large dataset and to reduce the false-positive. In this paper, we propose method to reduce the false-positive using multi-level detection algorithm, that is combine the multidimensional Apriori algorithm and the modified Negative Selection algorithm. And we apply this algorithm in intrusion detection and, to be sure, it has a good performance.

Classification and Analysis of Data Mining Algorithms (데이터마이닝 알고리즘의 분류 및 분석)

  • Lee, Jung-Won;Kim, Ho-Sook;Choi, Ji-Young;Kim, Hyon-Hee;Yong, Hwan-Seung;Lee, Sang-Ho;Park, Seung-Soo
    • Journal of KIISE:Databases
    • /
    • v.28 no.3
    • /
    • pp.279-300
    • /
    • 2001
  • Data mining plays an important role in knowledge discovery process and usually various existing algorithms are selected for the specific purpose of the mining. Currently, data mining techniques are actively to the statistics, business, electronic commerce, biology, and medical area and currently numerous algorithms are being researched and developed for these applications. However, in a long run, only a few algorithms, which are well-suited to specific applications with excellent performance in large database, will survive. So it is reasonable to focus our effort on those selected algorithms in the future. This paper classifies about 30 existing algorithms into 7 categories - association rule, clustering, neural network, decision tree, genetic algorithm, memory-based reasoning, and bayesian network. First of all, this work analyzes systematic hierarchy and characteristics of algorithms and we present 14 criteria for classifying the algorithms and the results based on this criteria. Finally, we propose the best algorithms among some comparable algorithms with different features and performances. The result of this paper can be used as a guideline for data mining researches as well as field applications of data mining.

  • PDF