• Title/Summary/Keyword: Apriori


An analysis of operation status depending on the characteristics of R&D projects in Sciences and Engineering universities (이공계 대학 연구과제 특성 별 운영 형태 현황)

  • Lee, Sang-Soog; Yoo, Inhyeok; Kim, Jinhee
    • Journal of Digital Convergence / v.20 no.4 / pp.93-100 / 2022
  • This study aimed to understand the current status of science and engineering university (SEU) R&D operations depending on research project characteristics (e.g., stages and characteristics), and then to provide implications for future university R&D support systems and related policies. To this end, an online survey targeting SEU R&D recipients was conducted between October 4 and November 5, 2021. Applying the Apriori algorithm to 445 valid responses yielded 16 association rules for R&D operation according to research project characteristics. The rules show that, regardless of research characteristics, SEU R&D projects, particularly in applied research, were funded or operated under the leadership of government or public institutions. For basic research, individual researchers had a higher level of autonomy in determining research topics; yet these projects had a short duration (3 years) and an evaluation period of more than 3 years. These findings can serve as empirical evidence of the relationships among the variables involved in operating SEU R&D.
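
The kind of rule mining this abstract describes can be sketched as follows. This is a minimal, unoptimized illustration of level-wise Apriori with support and confidence thresholds, not the authors' analysis pipeline; the function names `apriori` and `rules` and both threshold parameters are assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item that appears anywhere.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    out = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent,
                # so frequent[lhs] is always present.
                conf = supp / frequent[lhs]
                if conf >= min_confidence:
                    out.append((lhs, itemset - lhs, supp, conf))
    return out
```

Each survey response would be encoded as a set of categorical items (for example, a research stage and a funding source) before being passed to `apriori`; how the study encoded its variables is not stated in the abstract.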

Development of the Goods Recommendation System using Association Rules and Collaborating Filtering (연관규칙과 협업적 필터링을 이용한 상품 추천 시스템 개발)

  • Kim, Ji-Hye; Park, Doo-Soon
    • The Journal of Korean Association of Computer Education / v.9 no.1 / pp.71-80 / 2006
  • As e-commerce develops rapidly, research is focusing on how to discover customers' behavior patterns and realize commerce intelligence using Web mining technology. One of the most successful and widely used technologies for building personalization and goods recommendation systems is collaborative filtering. However, collaborative filtering suffers from a serious data sparsity problem, and traditional association rules do not consider a user's interests or preferences when providing a specific personalized service. In this paper, we propose a goods recommendation system that integrates a collaborative filtering algorithm based on item-to-item correlation with an improved Apriori algorithm. The system takes the user's interests and preferences into account to provide a specific personalized service.


Mining Maximal Frequent Contiguous Sequences in Biological Data Sequences

  • Kang, Tae-Ho; Yoo, Jae-Soo; Kim, Hak-Yong; Lee, Byoung-Yup
    • International Journal of Contents / v.3 no.2 / pp.18-24 / 2007
  • Biological sequences such as DNA and amino acid sequences typically contain a large number of items. They contain contiguous sequences that ordinarily consist of hundreds of frequent items or more. In biological sequence analysis (BSA), frequent contiguous sequence search is one of the most important operations. Many studies have addressed efficient mining of sequential patterns, and most existing methods are based on the Apriori algorithm. In particular, the PrefixSpan algorithm is one of the most efficient sequential pattern mining schemes based on the Apriori algorithm. However, since the algorithm expands sequential patterns from frequent patterns of length 1, it is not suitable for biological datasets with long frequent contiguous sequences. In recent years, the MacosVSpan algorithm was proposed, based on the idea of the PrefixSpan algorithm, to significantly reduce its recursive process. However, that algorithm is still inefficient for mining frequent contiguous sequences from long biological data sequences. In this paper, we propose an efficient method to mine maximal frequent contiguous sequences in large biological data sequences by constructing a spanning tree with a fixed length. To verify the superiority of the proposed method, we perform experiments in various environments. The experiments show that the proposed method is much more efficient than MacosVSpan in terms of retrieval performance.
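
To make the notion of a maximal frequent contiguous sequence concrete, here is a small sketch. It uses plain fixed-length window counting that grows one item at a time, which is a deliberately naive stand-in and not the fixed-length spanning tree proposed in the paper; the function names and the `start_length` and `min_support` parameters are assumptions for illustration.

```python
from collections import defaultdict


def frequent_contiguous(sequences, length, min_support):
    """Contiguous substrings of `length` found in at least `min_support` sequences."""
    containing = defaultdict(set)
    for sid, seq in enumerate(sequences):
        for i in range(len(seq) - length + 1):
            # Record which sequences contain this window (not how often).
            containing[seq[i:i + length]].add(sid)
    return {sub: len(ids) for sub, ids in containing.items() if len(ids) >= min_support}


def maximal_frequent_contiguous(sequences, start_length, min_support):
    """Frequent contiguous patterns not contained in a longer frequent pattern."""
    found = {}
    length = start_length
    while True:
        level = frequent_contiguous(sequences, length, min_support)
        if not level:
            # No frequent pattern of this length, so none can be longer.
            break
        found.update(level)
        length += 1
    return {
        p: s for p, s in found.items()
        if not any(p != q and p in q for q in found)
    }
```

Starting from a window longer than 1 skips the short patterns that make length-1 expansion expensive on long sequences, at the cost of rescanning the data once per window length.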

SuffixSpan: A Formal Approach For Mining Sequential Patterns (SuffixSpan: 순차패턴 마이닝을 위한 형식적 접근방법)

  • Cho, Dong-Young
    • The Journal of Korean Association of Computer Education / v.5 no.4 / pp.53-60 / 2002
  • Typical Apriori-like methods for mining sequential patterns have problems such as generating many candidate patterns and repeatedly scanning a large database. PrefixSpan constructs prefix-projected databases that are partitioned stepwise during the mining process. This reduces the search space needed to estimate the support of candidate patterns, but the cost of constructing the projected databases is still high. For efficient sequential pattern mining, we need to reduce both the cost of generating candidate patterns and the search space for the generated ones. To solve these problems, we propose SuffixSpan (Suffix checked Sequential Pattern mining), a new method for sequential pattern mining, and present a formal approach to our method.


Design and Implementation of e-SRM System Supporting Individual Adjusting Feedback in Web-based Learning Environment (웹 기반 학습 환경에서 개별 적응적 피드백을 지원하는 e-SRM 시스템의 설계 및 구현)

  • Baek, Jang-Hyeon; Kim, Yung-Sik
    • Journal of The Korean Association of Information Education / v.8 no.3 / pp.307-317 / 2004
  • In a web-based education environment, it is necessary to provide individually adaptive feedback according to each learner's characteristics. Despite this need, it remains difficult to derive the variables that describe learners' characteristics, and systematic strategies and practical tools for providing individually adaptive feedback are lacking. This study used the Apriori algorithm to analyze learners' learning patterns, one of the learner characteristic variables regarded as important in web-based teaching and learning environments, and grouped the learners by learning pattern. Under this framework, the e-SRM feedback system was designed and developed to provide learning content, learning channels, learning situations, and so on for individual learners. The proposed system is expected to provide an optimal learning environment suited to each learner's characteristics.


Mining Frequent Contiguous Sequence Patterns in Biological Sequences (생물학적 서열들에서 빈발한 연속 서열 패턴 마이닝)

  • Kang, Tae-Ho; Yoo, Jae-Soo
    • Proceedings of the Korean Information Science Society Conference / 2007.06b / pp.27-31 / 2007
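  • Biological sequence data consist mainly of DNA base sequences and protein amino acid sequences. These sequences generally contain a large number of items and are therefore very long. Biological data sequences usually contain frequently occurring contiguous subsequences, and finding them can be useful in various kinds of sequence analysis. Early work proposed many methods that use sequential pattern mining algorithms based on the Apriori algorithm. Among them, the PrefixSpan algorithm is the most efficient Apriori-based sequential pattern mining technique. However, because this algorithm extends sequence patterns starting from frequent patterns of length 1, it is not suitable as a search method for biological data sequences that contain long contiguous sequences. More recently, MacosVSpan was proposed, which uses the existing PrefixSpan approach while reducing its repetitive processing. However, this algorithm also incurs a high cost because it uses a separate projection database that is larger than the original database, and it is especially inefficient for long sequences. In this paper, we therefore propose a method that effectively finds frequent contiguous sequences in large volumes of biological data sequences using a fixed-length expansion tree. Experiments in various environments show that the proposed method outperforms the MacosVSpan algorithm in search performance.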


Automatic Error Detection of Morpho-syntactic Errors of English Writing Using Association Rule Analysis Algorithm (연관 규칙 분석 알고리즘을 활용한 영작문 형태.통사 오류 자동 발견)

  • Kim, Dong-Sung
    • Annual Conference on Human and Language Technology / 2010.10a / pp.3-8 / 2010
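  • In this study, we generate association rules from refined data on English writing error types collected in a series of studies, and we automatically detect morpho-syntactic errors in English writing data using association rules whose usefulness has been verified through learning. Finding morpho-syntactic errors in English writing data takes a great deal of time and resources, so automation is essential. Whereas previous studies have concentrated on lexical errors using statistical models or on syntactic processing grounded in a linguistic theoretical framework, this study generates association rules from refined data through data mining, verifies them, and then detects morpho-syntactic errors. Previous studies required building large knowledge bases, such as rules fitted to a theoretical framework or large corpus data for language model construction, whereas this study uses a small amount of refined data. We used the Apriori algorithm to generate morpho-syntactic association rules for English writing error types. Because some of the rules generated by the algorithm may be incorrect, we learned only valid rules by applying statistical tests of rule usefulness such as a correlation test and cosine similarity. The association rules accumulated in this way were used in an experiment on automatically detecting English writing errors.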

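
The rule-filtering step mentioned in the abstract above can be illustrated with standard interestingness measures. The abstract does not say which correlation test or which cosine formulation the author used, so the sketch below assumes the common definitions: lift as a simple correlation indicator and cosine as the joint count divided by the geometric mean of the two marginal counts. The function name `rule_measures` and its count-based parameters are hypothetical.

```python
import math


def rule_measures(n_total, n_a, n_b, n_ab):
    """Interestingness measures for a rule A -> B, computed from raw counts.

    n_total: number of transactions
    n_a, n_b: transactions containing A, containing B
    n_ab: transactions containing both A and B
    """
    if n_total <= 0 or n_a <= 0 or n_b <= 0:
        raise ValueError("n_total, n_a and n_b must all be positive")
    support = n_ab / n_total
    confidence = n_ab / n_a
    # Lift above 1 suggests positive correlation; lift near 1 suggests independence.
    lift = (n_ab * n_total) / (n_a * n_b)
    # Cosine is unaffected by transactions that contain neither A nor B.
    cosine = n_ab / math.sqrt(n_a * n_b)
    return {
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "cosine": cosine,
    }
```

A rule with high confidence but lift close to 1 is a typical candidate for removal, since its consequent is about as likely with or without the antecedent.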

A Personalized Clothing Recommender System Based on the Algorithm for Mining Association Rules (연관 규칙 생성 알고리즘 기반의 개인화 의류 추천 시스템)

  • Lee, Chong-Hyeon; Lee, Suk-Hoon; Kim, Jang-Won; Baik, Doo-Kwon
    • Journal of the Korea Society for Simulation / v.19 no.4 / pp.59-66 / 2010
  • We present a personalized clothing recommender system that mines association rules from transactions described in ontologies and infers recommendations from the rules. The recommender system can forecast frequently changing clothing trends using the Onto-Apriori algorithm, and it makes appropriate recommendations possible for each user through inference marked as meta nodes. We simulate the rule generator and the inferential search engine of the system with a focus on accuracy and efficiency, and our results validate the system.

Deep Learning Framework with Convolutional Sequential Semantic Embedding for Mining High-Utility Itemsets and Top-N Recommendations

  • Siva S; Shilpa Chaudhari
    • Journal of information and communication convergence engineering / v.22 no.1 / pp.44-55 / 2024
  • High-utility itemset mining (HUIM) is a dominant technology that enables enterprises to make real-time decisions in areas including supply chain management, customer segmentation, and business analytics. However, classical support value-driven Apriori solutions are limited and unable to meet real-time enterprise demands, especially for large amounts of input data. This study introduces a groundbreaking model for top-N high utility itemset mining in real-time enterprise applications. Unlike traditional Apriori-based solutions, the proposed convolutional sequential embedding metrics-driven cosine-similarity-based multilayer perceptron learning model leverages global and contextual features, including semantic attributes, for enhanced top-N recommendations over sequential transactions. The MATLAB-based simulations of the model on diverse datasets demonstrated an impressive precision (0.5632), mean absolute error (MAE) (0.7610), hit rate (HR)@K (0.5720), and normalized discounted cumulative gain (NDCG)@K (0.4268). The average MAE across different datasets and latent dimensions was 0.608. Additionally, the model achieved remarkable cumulative accuracy and precision of 97.94% and 97.04%, respectively, surpassing existing state-of-the-art models. This affirms the robustness and effectiveness of the proposed model in real-time enterprise scenarios.

Discovering Association Rules using Item Clustering on Frequent Pattern Network (빈발 패턴 네트워크에서 아이템 클러스터링을 통한 연관규칙 발견)

  • Oh, Kyeong-Jin; Jung, Jin-Guk; Ha, In-Ay; Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.14 no.1 / pp.1-17 / 2008
  • Data mining is defined as the process of discovering meaningful and useful patterns in large volumes of data. In particular, finding association rules between items in a database of customer transactions has become an important task. Since the Apriori algorithm, several data structures and algorithms have been proposed for storing meaningful information compressed from an original database in order to find frequent itemsets. Although existing methods find all association rules, analyzing them takes considerable effort because there are too many rules. In this paper, we propose a new data structure, called a Frequent Pattern Network (FPN), which represents items as vertices and 2-itemsets as edges of the network. To utilize the FPN, we construct it using item frequencies. We then use a clustering method to group the vertices of the network into clusters so that intracluster similarity is maximized and intercluster similarity is minimized, and we generate association rules based on the clusters. Our experiments measured the accuracy of clustering items on the network using confidence, correlation, and edge weight similarity methods. We also generated association rules using the clusters and compared the traditional method with ours. The results show that confidence similarity had a stronger influence than the other measures on the frequent pattern network, and that the FPN was flexible with respect to the minimum support value.

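
The network structure described in the abstract above, with items as vertices and 2-itemsets as edges, can be sketched roughly as follows. The abstract does not define its similarity measures, so `edge_confidence` below assumes the ordinary directional rule confidence; `build_fpn`, `edge_confidence`, and `min_count` are hypothetical names, and the clustering step is not shown.

```python
from collections import Counter
from itertools import combinations


def build_fpn(transactions, min_count=1):
    """Build a frequent-pattern-network-like graph from transactions.

    Returns (vertices, edges): vertices maps item -> frequency, and edges
    maps a sorted item pair -> co-occurrence count.
    """
    vertex_counts = Counter()
    edge_counts = Counter()
    for t in transactions:
        items = sorted(set(t))  # de-duplicate and fix the pair ordering
        vertex_counts.update(items)
        edge_counts.update(combinations(items, 2))
    # Drop edges (2-itemsets) that occur fewer than min_count times.
    edges = {pair: c for pair, c in edge_counts.items() if c >= min_count}
    return dict(vertex_counts), edges


def edge_confidence(vertices, edges, a, b):
    """Confidence of the rule a -> b, read directly from the network."""
    pair = tuple(sorted((a, b)))
    return edges.get(pair, 0) / vertices[a]
```

Because the vertex and edge counts are stored unfiltered apart from `min_count`, a different minimum support can be applied later by re-filtering the edges without rescanning the transactions, which is one plausible reading of the flexibility the abstract reports.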