• Title/Summary/Keyword: Association rule mining

Association Rule Mining Algorithm and Analysis of Missing Values

  • Lee, Jae-Wan;Bobby D. Gerardo;Kim, Gui-Tae;Jeong, Jin-Seob
    • Journal of information and communication convergence engineering / v.1 no.3 / pp.150-156 / 2003
  • This paper explores an association rule mining algorithm together with a method for handling missing data, which produced enhanced association patterns on the data illustrated here. The evaluation showed that the second analysis generated more association patterns than the first, suggesting more meaningful rules. The model also yielded more precise and important association rules, which are more valuable when applied to business decision making. With the discovery of accurate association rules or business patterns, strategies can be planned and implemented efficiently to improve marketing schemes. The investigation raises a number of issues that could be explored further, such as the effect of outliers and missing data on detecting fraud and devious database entries.

Development of association rule threshold by balancing of relative rule accuracy (상대적 규칙 정확도의 균형화에 의한 연관성 측도의 개발)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.25 no.6 / pp.1345-1352 / 2014
  • Data mining is a representative methodology for obtaining meaningful information in the era of big data. According to Wikipedia, association rule learning is a popular and well-researched method for discovering interesting relationships between itemsets in large databases using association thresholds. It is intended to identify strong rules discovered in databases using different interestingness measures. Unlike general association rule mining, inverse association rule mining finds rules stating that a specific item does not occur if another item does not occur. If the two types of association rule are considered simultaneously, we can obtain marketing information for related products as well as for specific products. In this paper, we propose a balanced attributable relative accuracy applicable to these association rule techniques, and then check the three conditions for interestingness measures given by Piatetsky-Shapiro (1991). Comparative studies of rule accuracy, relative accuracy, attributable relative accuracy, and balanced attributable relative accuracy are presented through a numerical example. The results show that balanced attributable relative accuracy is better than the other accuracy measures.
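
The three Piatetsky-Shapiro (1991) conditions mentioned in this abstract can be probed numerically: a measure should be zero when A and B are independent, increase with P(A,B) when P(A) and P(B) are fixed, and decrease with P(A) when P(A,B) and P(B) are fixed. The abstract does not define balanced attributable relative accuracy, so the sketch below uses two stand-in measures instead, leverage P(A,B) - P(A)P(B) and plain confidence. The function names and probe values are assumptions chosen for illustration, and a spot check on a few points is not a proof.

```python
def leverage(p_ab, p_a, p_b):
    """Stand-in measure: P(A,B) - P(A)P(B)."""
    return p_ab - p_a * p_b


def confidence(p_ab, p_a, p_b):
    """Stand-in measure: P(B|A) = P(A,B) / P(A); p_b is unused."""
    return p_ab / p_a


def check_conditions(measure, eps=1e-12):
    """Spot-check the three conditions on a few hand-picked valid probabilities."""
    # Condition 1: the measure is 0 when A and B are independent.
    c1 = abs(measure(0.4 * 0.5, 0.4, 0.5)) < eps
    # Condition 2: increasing in P(A,B) with P(A)=0.4 and P(B)=0.5 fixed.
    vals = [measure(p_ab, 0.4, 0.5) for p_ab in (0.05, 0.1, 0.2, 0.3, 0.4)]
    c2 = all(x < y for x, y in zip(vals, vals[1:]))
    # Condition 3: decreasing in P(A) with P(A,B)=0.2 and P(B)=0.5 fixed.
    vals = [measure(0.2, p_a, 0.5) for p_a in (0.2, 0.3, 0.4, 0.5, 0.6)]
    c3 = all(x > y for x, y in zip(vals, vals[1:]))
    return c1, c2, c3


leverage_result = check_conditions(leverage)
confidence_result = check_conditions(confidence)
print("leverage:  ", leverage_result)
print("confidence:", confidence_result)
```

On these probe points leverage passes all three checks, while plain confidence fails the first because it does not vanish under independence.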

A Study for Statistical Criterion in Negative Association Rules Using Boolean Analyzer

  • Lee, Keun-Woo;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.19 no.2 / pp.569-576 / 2008
  • Association rule mining searches for interesting relationships among items in a given database. Association rules are frequently used by retail stores to assist in marketing, advertising, floor placement, and inventory control. There are three primary quality measures for an association rule: support, confidence, and lift. An association rule describes an interesting relationship among purchased items in a transaction, whereas a negative association rule is one that includes items that are not purchased. Boolean Analyzer is a method for producing negative association rules using PIM. However, PIM is subjective. In this paper, we present an objective statistical criterion for negative association rules using Boolean Analyzer.
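
To make the three quality measures concrete, here is a minimal sketch that computes support, confidence, and lift for a rule A -> b and for its negative counterpart A -> not b. It uses the standard definitions only. It does not reproduce Boolean Analyzer or PIM, which the abstract does not specify, and the toy transactions and function names are assumptions.

```python
def rule_metrics(transactions, antecedent, item, negative=False):
    """Support, confidence and lift for antecedent -> item (or -> NOT item).

    Assumes the antecedent and the (possibly negated) consequent each occur
    in at least one transaction, so no division by zero arises.
    """
    antecedent = frozenset(antecedent)
    n = len(transactions)

    def consequent_holds(t):
        return (item not in t) if negative else (item in t)

    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent_holds(t))
    n_ac = sum(1 for t in transactions if antecedent <= t and consequent_holds(t))

    support = n_ac / n                # P(A, c)
    confidence = n_ac / n_a           # P(c | A)
    lift = confidence / (n_c / n)     # P(c | A) / P(c)
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical market baskets, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "butter"},
]

positive = rule_metrics(baskets, {"bread"}, "milk")
negative = rule_metrics(baskets, {"bread"}, "milk", negative=True)
print("bread -> milk    ", positive)
print("bread -> not milk", negative)
```

In this toy data the positive rule has lift below 1 and the negative rule has lift above 1, which is the kind of pattern negative association rules are meant to surface.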

A Study for Statistical Criterion in Negative Association Rules Using Boolean Analyzer

  • Shin, Sang-Jin;Lee, Keun-Woo
    • Proceedings of the Korean Data and Information Science Society Conference / 2006.11a / pp.145-151 / 2006
  • Association rule mining searches for interesting relationships among items in a given database. Association rules are frequently used by retail stores to assist in marketing, advertising, floor placement, and inventory control. There are three primary quality measures for an association rule: support, confidence, and lift. An association rule describes an interesting relationship among purchased items in a transaction, whereas a negative association rule is one that includes items that are not purchased. Boolean Analyzer is a method for producing negative association rules using PIM. However, PIM is subjective. In this paper, we present an objective statistical criterion for negative association rules using Boolean Analyzer.

Odoo Data Mining Module Using Market Basket Analysis

  • Yulia, Yulia;Budhi, Gregorius Satia;Hendratha, Stefani Natalia
    • Journal of information and communication convergence engineering / v.16 no.1 / pp.52-59 / 2018
  • Odoo is an enterprise resource planning information system that provides modules to support basic business functions in companies. This research examines the development of an additional module for Odoo: a data mining module that applies Market Basket Analysis (MBA) with the FP-Growth algorithm to turn OLTP sales transaction data into useful information that helps users analyze the company's business strategy. The FP-Growth algorithm used in the application was able to produce multidimensional association rules. The company can learn more about its sales and its customers' buying habits, and sales trend analysis gives valuable insight into the inner workings of the business. The module was tested using data from X Supermarket. The final output of the module is generated by a data mining process in the form of association rules, which are presented in narrative and graphical form to make them easier to understand.
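
The abstract names FP-Growth but gives no implementation details. The following is a minimal textbook-style sketch of FP-Growth frequent-itemset mining, not the Odoo module's code: it builds a prefix tree of frequency-ordered transactions and recursively mines conditional pattern bases. The class and function names, the toy baskets, and the minimum count of 3 are all assumptions.

```python
from collections import defaultdict


class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count, self.children = 0, {}


def build_tree(weighted_transactions, min_count):
    """Build an FP-tree; return (header table, counts of frequent items)."""
    counts = defaultdict(int)
    for items, weight in weighted_transactions:
        for item in set(items):
            counts[item] += weight
    frequent = {i: c for i, c in counts.items() if c >= min_count}
    root, header = Node(None, None), defaultdict(list)
    for items, weight in weighted_transactions:
        # Keep frequent items only, most frequent first (ties broken by name).
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += weight
    return header, frequent


def fp_growth(weighted_transactions, min_count, suffix=frozenset()):
    """Return {frozenset(itemset): count} for all frequent itemsets."""
    header, frequent = build_tree(weighted_transactions, min_count)
    result = {}
    for item, nodes in header.items():
        itemset = suffix | {item}
        result[itemset] = frequent[item]
        # Conditional pattern base: prefix paths leading to each node of `item`.
        base = []
        for node in nodes:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        result.update(fp_growth(base, min_count, itemset))
    return result


# Hypothetical sales baskets, for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "diaper", "beer", "eggs"],
    ["milk", "diaper", "beer", "cola"],
    ["bread", "milk", "diaper", "beer"],
    ["bread", "milk", "diaper", "cola"],
]
frequent_itemsets = fp_growth([(b, 1) for b in baskets], min_count=3)
for itemset, count in sorted(frequent_itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Association rules would then be derived from these frequent itemsets by filtering on confidence or lift. One common way to obtain multidimensional rules is to encode each item as an attribute=value string, though the abstract does not say how the module does it.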

A patent analysis method for identifying core technologies: Data mining and multi-criteria decision making approach (핵심 기술 파악을 위한 특허 분석 방법: 데이터 마이닝 및 다기준 의사결정 접근법)

  • Kim, Chul-Hyun
    • Journal of the Korea Safety Management & Science / v.16 no.1 / pp.213-220 / 2014
  • This study suggests a new approach to identifying core technologies through patent analysis. Specifically, the approach applies a data mining technique and a multi-criteria decision making method to the co-classification information of registered patents. First, technological interrelationship matrices from the intensity, relatedness, and cross-impact perspectives are constructed from the support, lift, and confidence values calculated by association rule mining on the co-classification information of patent data. Second, the analytic network process is applied to the constructed matrices to produce the importance values of technologies from each perspective. Finally, data envelopment analysis is applied to the derived importance values to identify the priorities of technologies, combining the three perspectives. The suggested approach is expected to help technology planners formulate strategy and policy for technological innovation.

The Development of Relative Interestingness Measure for Comparing with Degrees of Association

  • Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.19 no.4 / pp.1269-1279 / 2008
  • Data mining is a technique for finding useful information in huge databases. One of the well-studied problems in data mining is the search for association rules. An association rule technique finds relations among items in massive databases using several interestingness measures. An important and useful classification scheme for interestingness measures is based on user involvement, which yields two categories: objective and subjective measures. This paper presents some relative interestingness measures for comparing the degrees of association of two groups. A comparative study of these relative interestingness measures is presented through a numerical example. The results show that relative net confidence is the best relative interestingness measure.

A Study on the Keyword Extraction for ESG Controversies Through Association Rule Mining (연관규칙 분석을 통한 ESG 우려사안 키워드 도출에 관한 연구)

  • Ahn, Tae Wook;Lee, Hee Seung;Yi, June Suh
    • The Journal of Information Systems / v.30 no.1 / pp.123-149 / 2021
  • Purpose: This study defines the anti-ESG activities of companies as recognized by the media, in light of the recent attention to ESG, and extracts keywords for ESG controversies through association rule mining. Design/methodology/approach: A research framework is designed to extract keywords for ESG controversies as follows. 1) From DeepSearch DB, we collect 23,837 articles on anti-ESG activities of 294 listed companies with ESG ratings, reported by 130 media outlets from 2013 to 2018. 2) We set keywords related to environment, social, and governance, and delete or merge them with other keywords based on the support, confidence, and lift derived from association rule mining. 3) We illustrate the importance of keywords and the relevance between keywords through density, degree centrality, and closeness centrality in network analysis. Findings: We identify a total of 26 keywords for ESG controversies. "Gapjil" has the highest frequency, followed by "corruption", "bribery", and "collusion". Of the 26 keywords, 16 are related to governance, 8 to social, and 2 to environment. The highly ranked keywords are mostly related to the responsibility of shareholders within corporate governance. ESG controversies associated with social issues are often related to unfair trade. Confidence analysis shows that the keywords related to social and governance are clustered, and the probability of mutual occurrence between keywords is high within each group. In particular, "owner's arrest" follows from "bribery" and "misappropriation" with 80% confidence. Network analysis shows that "corruption" sits at the center of the network, is the keyword most likely to occur alone, and is highly related to "breach of duty", "embezzlement", and "bribery".
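
The three network measures named in step 3 can be sketched on a small undirected keyword graph. The example below uses the standard definitions: density is 2m / (n(n-1)), degree centrality is degree / (n-1), and closeness centrality is (n-1) divided by the sum of shortest-path distances. The keywords come from the abstract, but the edges are made up for illustration and are not the study's data. The helper names are assumptions, and the closeness function assumes a connected graph.

```python
from collections import deque


def build_graph(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def density(graph):
    n = len(graph)
    m = sum(len(neighbors) for neighbors in graph.values()) / 2
    return 2 * m / (n * (n - 1))


def degree_centrality(graph):
    n = len(graph)
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def closeness_centrality(graph):
    """(reachable nodes) / (sum of BFS distances); assumes a connected graph."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


# Hypothetical co-occurrence edges between keywords from the abstract.
edges = [
    ("corruption", "breach of duty"),
    ("corruption", "embezzlement"),
    ("corruption", "bribery"),
    ("bribery", "embezzlement"),
]
graph = build_graph(edges)
net_density = density(graph)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
print(net_density, degree, closeness, sep="\n")
```

In this toy graph "corruption" is adjacent to every other keyword, so it has the highest degree and closeness centrality. This only shows how a central keyword appears in these measures and says nothing about the study's actual network.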

Encoding of XML Elements for Mining Association Rules

  • Hu Gongzhu;Liu Yan;Huang Qiong
    • The Journal of Information Systems / v.14 no.3 / pp.37-47 / 2005
  • Association rule mining finds associations among data items that appear together in transactions or business activities. To date, algorithms for association rule mining, as well as for other data mining tasks, have mostly been applied to relational databases. As XML is adopted as a universal format for data storage and exchange, mining associations from XML data is becoming an area of attention for researchers and developers. The challenge is that the semi-structured data format of XML is not directly suitable for traditional data mining algorithms and tools. In this paper, we present a method for encoding XML tree nodes. The method is used to store the XML data in a Value Table and a Transaction Table that can be easily accessed via indexing. The hierarchical relationships of the original XML tree structure are embedded in the encoding. We applied this method to association rule mining of XML data that may have missing data.
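
The abstract does not describe the encoding itself, so the sketch below is only a stand-in for the general idea. It assigns each XML node a dotted positional code (so a node's parent is recoverable by dropping the last component), stores leaf values in a value table, and lists each transaction's leaf codes in a transaction table. The XML sample, the coding scheme, and all names are assumptions, not the paper's method.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML data; the second order lacks a customer (missing data).
XML = """
<orders>
  <order><customer>c1</customer><item>bread</item><item>milk</item></order>
  <order><item>bread</item></order>
  <order><customer>c7</customer><item>milk</item><item>butter</item></order>
</orders>
"""


def encode(xml_text):
    """Return (value_table, transaction_table) keyed by dotted positional codes."""
    root = ET.fromstring(xml_text)
    value_table = {}        # leaf code -> (tag, text)
    transaction_table = {}  # transaction code -> list of leaf codes

    def walk(elem, code, leaves):
        children = list(elem)
        if not children:
            value_table[code] = (elem.tag, (elem.text or "").strip())
            leaves.append(code)
        for position, child in enumerate(children, start=1):
            walk(child, f"{code}.{position}", leaves)

    # Each child of the root is treated as one transaction.
    for position, transaction in enumerate(root, start=1):
        leaves = []
        walk(transaction, f"1.{position}", leaves)
        transaction_table[f"1.{position}"] = leaves
    return value_table, transaction_table


def parent_code(code):
    """The hierarchy is embedded in the code: drop the last component."""
    return code.rsplit(".", 1)[0]


value_table, transaction_table = encode(XML)
# Turn each transaction into a set of "tag=value" items ready for rule mining.
itemsets = {
    t_code: {"{}={}".format(*value_table[c]) for c in codes}
    for t_code, codes in transaction_table.items()
}
print(transaction_table)
print(itemsets)
```

A transaction with a missing element simply has fewer codes in its row, so the resulting itemsets can be passed to any frequent-itemset miner without padding.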

Analysis of Characteristic Factors for Non-fatal Accidents in Construction Projects using Association Rule Mining (연관 규칙 탐색 기법을 이용한 건설공사 비사망 재해의 특성 요인 분석)

  • Gayeon, Lee;Sung Woo, Shin
    • Journal of the Korean Society of Safety / v.37 no.6 / pp.40-49 / 2022
  • Simple frequency-based statistical analyses, such as Pareto analysis, are widely used in conventional accident analysis. However, due to the dynamic and complex nature of construction work, many factors can simultaneously affect or be involved in the occurrence of accidents in construction projects. Identifying the complex relationships between such factors is therefore important for establishing relevant and effective safety management policies and programs. In this study, characteristic factors and the contribution of their relationships to non-fatal accidents in construction projects are analyzed using the association rule mining (ARM) technique. To this end, a total of 59,202 construction accident records from 2015 to 2019 were collected, and ARM was performed to retrieve specific relationships, referred to as association rules, among the classified factors in the data. The characteristics of the retrieved relationships are analyzed and compared with the results of conventional Pareto analysis. The results show that falls and trips are notable accident forms with characteristic relations to other factors in non-fatal construction accidents. Small-scale construction, workers in their 50s, a working period of less than one month, and architectural construction are also found to be important factors for non-fatal accidents in construction projects.