• Title/Summary/Keyword: ARM(Association Rule Mining)


Violation Pattern Analysis for Good Manufacturing Practice for Medicine using t-SNE Based on Association Rule and Text Mining (우수 의약품 제조 기준 위반 패턴 인식을 위한 연관규칙과 텍스트 마이닝 기반 t-SNE분석)

  • Jun-O, Lee;So Young, Sohn
    • Journal of Korean Society for Quality Management, v.50 no.4, pp.717-734, 2022
  • Purpose: This study aims to detect concurrent violations of Good Manufacturing Practice that drug manufacturers have concealed. Methods: We present a framework for analyzing regulatory violation patterns that combines Association Rule Mining (ARM), text mining, and t-distributed Stochastic Neighbor Embedding (t-SNE) to make on-site inspection more effective. Results: Applying ARM to FDA inspection data collected from October 2008 to February 2022 revealed a number of simultaneous violation patterns. Some of these were 'concurrent violation patterns' that arise because two or more regulations cover similar regulatory ranges. Such patterns do not help predict violations that appear simultaneously but belong to different regulations, so these unnecessary patterns were excluded by applying text-mining-based t-SNE. Conclusion: The proposed approach enables inspectors to recognize simultaneous violation patterns during on-site inspection. It is expected to shorten detection time by increasing the likelihood of finding intentionally concealed violations.
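
The rule-mining step described in this abstract can be illustrated with a minimal sketch. It assumes each inspection is recorded as a set of cited regulation codes; the codes "A" to "D", the thresholds, and the function name `pair_rules` are hypothetical and are not taken from the paper, which may use a different algorithm and different measures.

```python
from collections import Counter
from itertools import combinations


def pair_rules(inspections, min_support=0.3, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for violation pairs."""
    n = len(inspections)
    item_counts = Counter()
    pair_counts = Counter()
    for violations in inspections:
        items = sorted(set(violations))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of inspections citing both codes
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]  # estimate of P(y cited | x cited)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


# Hypothetical inspection records; each set lists codes cited together.
inspections = [
    {"A", "B", "C"},
    {"A", "C"},
    {"B", "D"},
    {"A", "C", "D"},
]

for antecedent, consequent, support, confidence in pair_rules(inspections):
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

Rules between regulations with overlapping scope would still appear in this output; the abstract's t-SNE step for removing them is not reproduced here.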

Exploring Convergence Fields of Safety Technology Using ARM-Based Patent Co-Classification Analysis (공통특허분류 분석을 활용한 안전기술융합분야 탐색 : Association Rule Mining(ARM) 접근법)

  • Suh, Yongyoon
    • Journal of the Korean Society of Safety, v.32 no.5, pp.88-95, 2017
  • As safety concerns expand into a variety of industrial fields, safety technology has developed through convergence among fields such as mechanics, ergonomics, electronics, chemistry, construction, and information science. Because this convergence drives recent advances in safety technology, it is important to explore its trends in order to understand which industrial technologies have been integrated thus far. Patents are a useful source for studying technology trends because they provide ample information on new technology, and they have also been used to identify patterns of technology convergence through various quantitative methods. This study therefore aims to identify the convergence patterns and fields of safety technology using association rule mining (ARM)-based patent co-classification (co-class) analysis. Patent co-class data are especially useful for constructing a convergence network between technological fields. Through the linkages between technological fields, the core and hub classes of the convergence network are explored to provide insight into the fields of safety technology. As a representative method for analyzing patent co-class networks, ARM is used to estimate the likelihood of co-occurrence of patent classes, and an ARM network is presented to visualize the convergence network of safety technology. As a result, we find three major convergence fields of safety technology: working safety, medical safety, and vehicle safety.
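
As a rough illustration of an ARM-based co-classification network, the sketch below builds directed edges weighted by confidence and sums incoming confidence as a simple hub indicator. The class labels "C1" to "C4", the threshold, and the hub score are assumptions made for illustration; the paper's own network measures are not stated in the abstract.

```python
from collections import Counter, defaultdict
from itertools import permutations


def coclass_edges(patents, min_confidence=0.5):
    """Directed edges (a, b) -> confidence(a -> b) between co-assigned classes."""
    class_counts = Counter()
    pair_counts = Counter()
    for classes in patents:
        unique = set(classes)
        class_counts.update(unique)
        pair_counts.update(permutations(sorted(unique), 2))

    edges = {}
    for (a, b), together in pair_counts.items():
        confidence = together / class_counts[a]  # share of a's patents that also carry b
        if confidence >= min_confidence:
            edges[(a, b)] = confidence
    return edges


def hub_scores(edges):
    """Sum of incoming confidence per class (an assumed, simple hub indicator)."""
    scores = defaultdict(float)
    for (_, target), confidence in edges.items():
        scores[target] += confidence
    return dict(scores)


# Hypothetical patents, each with a set of placeholder class labels.
patents = [
    {"C1", "C2"},
    {"C1", "C2", "C3"},
    {"C3", "C2"},
    {"C4"},
]

edges = coclass_edges(patents)
scores = hub_scores(edges)
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

Classes that never co-occur with another class, such as "C4" here, get no edges and therefore no hub score.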

Analysis of Characteristic Factors for Non-fatal Accidents in Construction Projects using Association Rule Mining (연관 규칙 탐색 기법을 이용한 건설공사 비사망 재해의 특성 요인 분석)

  • Gayeon, Lee;Sung Woo, Shin
    • Journal of the Korean Society of Safety, v.37 no.6, pp.40-49, 2022
  • Simple frequency-based statistical analyses, such as Pareto analysis, are widely used in conventional accident analysis. However, because construction work is dynamic and complex, many factors can simultaneously affect or be involved in the occurrence of accidents in construction projects. Identifying the complex relationships among such factors is therefore important for establishing relevant and effective safety management policies and programs. In this study, characteristic factors and the contribution of their relationships to non-fatal accidents in construction projects are analyzed using the association rule mining (ARM) technique. To this end, a total of 59,202 construction accident records from 2015 to 2019 are collected, and ARM is performed to retrieve specific relationships, called association rules, among classified factors in the data. The characteristics of the retrieved relationships are analyzed and compared with the results of conventional Pareto analysis. The results show that falls and trips are notable accident forms that have characteristic relations with other factors for non-fatal accidents in construction projects. Small-scale construction, workers in their 50s, a working period of less than 1 month, and architectural construction are also found to be important factors for non-fatal accidents in construction projects.
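
The abstract does not name the mining algorithm. One common choice is a level-wise (Apriori-style) search for frequent factor combinations, sketched below under that assumption. The accident records, factor labels, and threshold are invented for illustration and are not data from the study.

```python
from itertools import combinations


def frequent_itemsets(records, min_support):
    """Level-wise (Apriori-style) search for itemsets meeting min_support."""
    n = len(records)
    records = [frozenset(r) for r in records]

    def support(itemset):
        return sum(itemset <= r for r in records) / n

    items = sorted({item for r in records for item in r})
    current = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    result = {}
    size = 1
    while current:
        for itemset in current:
            result[itemset] = support(itemset)
        size += 1
        # Join step: merge frequent sets into candidates that are one item larger.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == size}
        # Prune step: keep a candidate only if all its smaller subsets are frequent.
        current = [
            c for c in candidates
            if all(frozenset(s) in result for s in combinations(c, size - 1))
            and support(c) >= min_support
        ]
    return result


# Hypothetical accident records described by categorical factors.
records = [
    {"fall", "small_site", "age_50s"},
    {"fall", "small_site"},
    {"trip", "age_50s"},
    {"fall", "small_site", "under_1_month"},
    {"trip", "small_site", "age_50s"},
]

itemsets = frequent_itemsets(records, min_support=0.4)
for itemset, value in sorted(itemsets.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(itemset), round(value, 2))
```

Unlike a Pareto chart, which ranks single factors by frequency, this search also surfaces combinations of factors that occur together often.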

Identifying Core Robot Technologies by Analyzing Patent Co-classification Information

  • Jeon, Jeonghwan;Suh, Yongyoon;Koh, Jinhwan;Kim, Chulhyun;Lee, Sanghoon
    • Asian Journal of Innovation and Policy, v.8 no.1, pp.73-96, 2019
  • This study suggests a new approach for identifying core robot technologies based on technological cross-impact. Specifically, the approach applies data mining techniques and multi-criteria decision-making methods to the co-classification information of registered robot patents. First, a cross-impact matrix is constructed from confidence values obtained by applying association rule mining (ARM) to the co-classification information of patents. The analytic network process (ANP) is then applied to the co-classification frequency matrix to derive a weight for each robot technology. Finally, the technique for order performance by similarity to ideal solution (TOPSIS) is applied to the derived cross-impact matrix and weights to identify core robot technologies from the overall cross-impact perspective. The proposed approach is expected to help robot technology managers formulate strategy and policy for technology planning in the robot area.
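
The TOPSIS step can be sketched as follows. This is a generic TOPSIS implementation under stated assumptions: rows are technologies, columns are their confidence-based impacts on each technology, every column is treated as a benefit criterion, and the weights stand in for ANP output. The matrix and weight values are hypothetical, and the paper's exact formulation may differ.

```python
import math


def topsis(matrix, weights):
    """Closeness coefficient per row, treating every column as a benefit criterion."""
    n_cols = len(weights)
    # Vector-normalize each column (fall back to 1.0 for an all-zero column).
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n_cols)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_cols)] for row in matrix]
    ideal = [max(r[j] for r in weighted) for j in range(n_cols)]
    worst = [min(r[j] for r in weighted) for j in range(n_cols)]

    scores = []
    for r in weighted:
        d_pos = math.dist(r, ideal)  # distance to the ideal solution
        d_neg = math.dist(r, worst)  # distance to the anti-ideal solution
        total = d_pos + d_neg
        scores.append(d_neg / total if total else 0.0)
    return scores


# Hypothetical cross-impact matrix: entry [i][j] is confidence(tech i -> tech j).
cross_impact = [
    [0.0, 0.8, 0.6],
    [0.2, 0.0, 0.3],
    [0.5, 0.4, 0.0],
]
# Hypothetical weights standing in for ANP results.
weights = [0.5, 0.3, 0.2]

scores = topsis(cross_impact, weights)
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
print(ranking, [round(s, 3) for s in scores])
```

A higher closeness coefficient means the technology lies nearer the ideal profile, which is how a ranking of core technologies could be read off.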

Association Rule Analysis between lifestyle risk behaviors and multimorbidity: Findings from KHANES (국민건강영양조사 자료를 활용한 라이프스타일 위험요인과 다중이환간의 연관관계분석)

  • Hyun-Ju Lee;Sungmin Myoung
    • The Journal of Korean Society for School & Community Health Education, v.25 no.1, pp.29-41, 2024
  • Objectives: This study used an efficient data mining algorithm to explore association rules between lifestyle risk behaviors and multimorbidity (having more than one chronic disease) in Korean adults. Methods: We used data from the 8th Korean National Health and Nutrition Examination Survey (2019-2020) for 7,609 adults aged ≥19 years. Six lifestyle risk behaviors and 11 morbidities were analyzed, with ARM performed in R and RStudio. Results: Among 117 association rules, combinations of hypertension with dyslipidemia and of diabetes with hypertension played an important role in rules involving inadequate sleep, physical inactivity, and inadequate weight. Conclusion: The findings are significant because they demonstrate the importance of lifestyle risk factors and the role of multiple chronic diseases using big data analytics such as association rule mining. We recommend developing selective and focused health education programs, such as exercise programs to address physical inactivity, dietary interventions to address inadequate weight, and mental health education programs to address inadequate sleep.

A Regression-Model-based Method for Combining Interestingness Measures of Association Rule Mining (연관상품 추천을 위한 회귀분석모형 기반 연관 규칙 척도 결합기법)

  • Lee, Dongwon
    • Journal of Intelligence and Information Systems, v.23 no.1, pp.127-141, 2017
  • Advances in Internet technologies and the proliferation of mobile devices have given consumers access to a wide range of goods and services, but with the adverse effect that consumers have a hard time finding items that suit them even when they devote much time to searching. Accordingly, businesses use recommender systems to help consumers find desired items more easily. Association Rule Mining (ARM) is advantageous for recommender systems because it provides rules in an intuitive form, together with interestingness measures (support, confidence, and lift) that describe the relationship between items. Given an item, its relevant items can be distinguished with the help of these measures, which show the strength of the relationship between items. Based on that strength, the most pertinent items can be chosen and displayed on a given item's web page. However, the diversity of the measures can make it unclear which items are more recommendable. Given two rules, for example, one rule's support and confidence may not both be superior to the other rule's. Such disagreement among the measures about which rule is superior makes it difficult to select proper items for recommendation. In addition, in an online environment where a web page or mobile screen can show only a limited number of recommendations, prudent selection of the items to include in the recommendation list is very important. Exposing items of little interest may lead consumers to ignore the recommendations, and such consumers may then stop paying attention to other forms of marketing activity. The measures should therefore be aligned with the probability that a consumer accepts a recommendation. For this reason, this study proposes a model-based approach that combines the measures into one unified measure that can consistently determine the ranking of recommended items.
A regression model was designed to describe how well the measures (independent variables: support, confidence, and lift) explain consumers' acceptance of recommendations (dependent variable: hit rate of recommended items). The model is intuitive and easy to use in that its equation consists of the measures commonly used in ARM and can be used to estimate hit rates. An experiment using transaction data from one of Korea's largest online shopping malls was conducted to show that the proposed model can improve the hit rates of recommendations. From the top of the list to 13th place, items ranked higher by the proposed model show higher hit rates than those ranked by the competing model. The result shows that the proposed model outperforms the competing model in an online recommendation environment. On a web page, consumers are shown around ten recommendations, a range in which the proposed model outperforms. Moreover, a mobile device cannot show many items at once because of its limited screen size, so the result suggests that the newly devised recommendation technique is also suitable for mobile recommender systems. Although this study covers cross-selling in online shopping malls that handle merchandise, the proposed method can be expected to apply in various situations in which association rules are used. For example, the model could be applied to medical diagnostic systems that predict candidate diseases from a patient's symptoms. To increase the efficiency of the model, future studies will need to consider additional variables. Price, for example, is a good candidate explanatory variable because it has a major impact on consumer purchase decisions: if the prices of recommended items are much higher than those of the items in which a consumer is interested, the consumer may hesitate to accept the recommendations.
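
The three measures and the idea of a single combined score can be sketched as below. Support, confidence, and lift follow their standard definitions. The linear form of `combined_score`, its coefficients, and the transactions are placeholders for illustration only; in the study the coefficients would come from a regression fitted to observed hit rates, and the abstract does not give the fitted equation.

```python
def rule_measures(transactions, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    n = len(transactions)
    a = sum(antecedent in t for t in transactions)
    c = sum(consequent in t for t in transactions)
    both = sum(antecedent in t and consequent in t for t in transactions)
    support = both / n                         # P(A and C)
    confidence = both / a if a else 0.0        # P(C | A)
    lift = confidence / (c / n) if c else 0.0  # P(C | A) / P(C)
    return support, confidence, lift


def combined_score(measures, coefficients, intercept=0.0):
    """Linear combination of the measures (assumed form, placeholder coefficients)."""
    return intercept + sum(b * m for b, m in zip(coefficients, measures))


# Hypothetical shopping baskets.
transactions = [
    {"tent", "lantern"},
    {"tent", "lantern", "stove"},
    {"tent"},
    {"stove"},
]

measures = rule_measures(transactions, "tent", "lantern")
score = combined_score(measures, coefficients=(0.2, 0.5, 0.1))  # placeholder values
print(measures, score)
```

Ranking candidate items by one combined score sidesteps the case where one rule has the higher support and another the higher confidence.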

A R&D strategies for development using structured association map (구조화된 연관맵을 이용한 연구개발 전략 수립)

  • Song, Wonho;Lee, Junseok;Park, Sangsung
    • Journal of the Korean Institute of Intelligent Systems, v.26 no.3, pp.190-195, 2016
  • Technology develops continuously in a rapidly changing global market, and a company requires an appropriate R&D strategy to adapt to this environment. That is, the technologies owned by the company need to be thoroughly analyzed to improve its competitiveness. Technology classification using IPC codes has recently been carried out for this purpose. Because the International Patent Classification (IPC) is an internationally specified classification system, it supports objective and quantitative patent analysis of technology. In this study, all of the patents owned by company C are investigated, and a matrix representing the IPC codes of each patent is created. A structured association map of the patents is then built through association rule mining based on confidence. The association map can be used to inspect the current state of a company's patents, and it allows highly associated technologies to be clustered. Using the association map, this study analyzes the technologies of company C and how they change over time. A strategy for future technologies is established based on the result.

Considering Customer Buying Sequences to Enhance the Quality of Collaborative Filtering (구매순서를 고려한 개선된 협업필터링 방법론)

  • Cho, Yeong-Bin;Cho, Yoon-Ho
    • Journal of Intelligence and Information Systems, v.13 no.2, pp.69-80, 2007
  • The preferences of customers change over time. However, existing collaborative filtering (CF) systems are static, since they only incorporate information on whether a customer buys a product during a certain period and do not make use of customers' purchase sequences. The quality of typical CF recommendations could therefore be improved by using information on such sequences. In this study, we propose a new methodology that uses customer purchase sequences to enhance the quality of CF recommendation. The proposed methodology is applied to a large department store in Korea and compared with existing CF techniques. Various experiments using real-world data demonstrate that the proposed methodology provides higher-quality recommendations and better performance than typical CF techniques.

The Behavior Analysis of Exhibition Visitors using Data Mining Technique at the KIDS & EDU EXPO for Children (유아교육 박람회에서 데이터마이닝 기법을 이용한 전시 관람 행동 패턴 분석)

  • Jung, Min-Kyu;Kim, Hyea-Kyeong;Choi, Il-Young;Lee, Kyoung-Jun;Kim, Jae-Kyeong
    • Journal of Intelligence and Information Systems, v.17 no.2, pp.77-96, 2011
  • An exhibition is a market event of a specific duration at which exhibitors present their main products to business or private visitors, and it plays a key role as an effective marketing channel. As exhibitions have grown in importance, the domestic exhibition industry has achieved great quantitative growth. In contrast, its qualitative growth has not kept pace. To improve the quality of exhibitions, we need to understand the preferences and behavioral characteristics of visitors and, through that understanding, raise visitors' attention and satisfaction. In this paper, we therefore used the observation survey method, a kind of field research, to understand visitors and collect real data for analyzing behavior patterns. This research proposes a methodology framework consisting of three steps. The first step is to select a suitable exhibition to which to apply our method. The second step is to implement the observation survey method and collect real data for further analysis. In this paper, we conducted the observation survey to obtain real data from the KIDS & EDU EXPO for Children at SETEC. The survey covered 160 visitors and 78 booths from November 4th to 6th, 2010. The last step is to analyze the data recorded through observation. In this step, we first analyze the features of the exhibition using demographic characteristics collected by the observation survey. We then analyze individual booth features from the records of visited booths. Through the analysis of individual booth features, we can figure out what kinds of events attract visitors' attention and what kinds of marketing activities affect visitors' behavior patterns.
However, because previous research considered only individual features influenced by the exhibition, little research has examined the correlation among features. In this research, additional analysis is therefore carried out with data mining techniques to supplement the existing research. We analyze the relations among booths using data mining techniques to learn visitors' behavior patterns. We use two data mining techniques: clustering analysis and Association Rule Mining (ARM) analysis. In the clustering analysis, we use the K-means algorithm to figure out the correlation among booths. Through these techniques, we find two important features that affect visitors' behavior patterns at an exhibition: the geographical features of booths and the exhibit contents of booths. These features should be considered when the organizer plans the next exhibition. The results of our analysis are therefore expected to provide guidelines for understanding visitors and valuable insights for the earlier phases of exhibition planning. This research could also be a good way to increase visitor satisfaction. Visitors' movement paths, booth locations, and distances between booths should be considered in advance when planning the next exhibition. This research was conducted at the KIDS & EDU EXPO for Children at SETEC (Seoul Trade Exhibition & Convention), so there are constraints on applying it directly to other exhibitions. The results were also derived from a limited number of data samples. To obtain more accurate and reliable results, more experiments based on larger data samples and on exhibitions of a variety of genres are necessary.
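
The clustering step can be illustrated with a plain K-means (Lloyd's algorithm) sketch. The abstract does not say which booth features were clustered, so the 2-D points below are hypothetical stand-ins (for example, booth positions), and the deterministic seeding from the first k points is a simplification chosen for reproducibility.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm); the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return labels, centroids


# Hypothetical booth feature vectors (illustrative 2-D coordinates).
booths = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]

labels, centroids = kmeans(booths, k=2)
print(labels, centroids)
```

Booths that land in the same cluster could then be compared against the ARM results to check whether location or exhibit content explains why visitors see them together.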