• Title/Summary/Keyword: high utility patterns

A Novel Approach for Mining High-Utility Sequential Patterns in Sequence Databases

  • Ahmed, Chowdhury Farhan;Tanbeer, Syed Khairuzzaman;Jeong, Byeong-Soo
    • ETRI Journal
    • /
    • v.32 no.5
    • /
    • pp.676-686
    • /
    • 2010
  • Mining sequential patterns is an important research issue in data mining and knowledge discovery with broad applications. However, existing sequential pattern mining approaches consider only binary frequency values of items in sequences and assign equal importance/significance to distinct items. Therefore, they cannot adequately represent many real-world scenarios. In this paper, we propose a novel framework for mining high-utility sequential patterns that extracts information more applicable to real life from sequence databases with non-binary frequency values of items in sequences and different importance/significance values for distinct items. Moreover, we propose two new algorithms for mining high-utility sequential patterns: UtilityLevel, which uses a level-wise candidate generation approach, and UtilitySpan, which uses a pattern growth approach. Extensive performance analyses show that our algorithms are very efficient and scalable for mining high-utility sequential patterns.

Mining High Utility Sequential Patterns Using Sequence Utility Lists (시퀀스 유틸리티 리스트를 사용하여 높은 유틸리티 순차 패턴 탐사 기법)

  • Park, Jong Soo
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.7 no.2
    • /
    • pp.51-62
    • /
    • 2018
  • High utility sequential pattern (HUSP) mining is considered an important research topic in data mining. Although some algorithms have been proposed for this topic, they suffer from a large search space for HUSPs. A tighter utility upper bound of a sequence can prune more unpromising patterns early in the search space. In this paper, we propose sequence expected utility (SEU) as a new utility upper bound of each sequence, which is the maximum expected utility of a sequence and all its descendant sequences. A sequence utility list for each pattern is used as a new data structure to maintain essential information for mining HUSPs. We devise an algorithm, high sequence utility list-span (HSUL-Span), to identify HUSPs by employing SEU. Experimental results on both synthetic and real datasets from different domains show that HSUL-Span generates considerably fewer candidate patterns and outperforms other algorithms in terms of execution time.

A single-phase algorithm for mining high utility itemsets using compressed tree structures

  • Bhat B, Anup;SV, Harish;M, Geetha
    • ETRI Journal
    • /
    • v.43 no.6
    • /
    • pp.1024-1037
    • /
    • 2021
  • Mining high utility itemsets (HUIs) from transaction databases considers such factors as the unit profit and quantity of purchased items. Two-phase tree-based algorithms transform a database into compressed tree structures and generate candidate patterns through a recursive pattern-growth procedure. This procedure requires a lot of memory and time to construct conditional pattern trees. To address this issue, this study employs two compressed tree structures, namely, Utility Count Tree and String Utility Tree, to enumerate valid patterns and thus promote fast utility computation. Furthermore, the study presents an algorithm called single-phase utility computation (SPUC) that leverages these two tree structures to mine HUIs in a single phase by incorporating novel pruning strategies. Experiments conducted on both real and synthetic datasets demonstrate the superior performance of SPUC compared with the IHUP, UP-Growth, and UP-Growth+ algorithms.
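
To make the notion of utility in this abstract concrete, the following is a minimal sketch of how an itemset's utility can be computed from unit profits and purchased quantities. It is a brute-force illustration, not the SPUC algorithm or its tree structures; the toy data and the names `unit_profit`, `transactions`, `itemset_utility`, and `high_utility_itemsets` are assumptions made for the example.

```python
from itertools import combinations

# Hypothetical toy data: unit profit per item, and purchased quantity per item
# in each transaction.
unit_profit = {"a": 5, "b": 2, "c": 1}
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1},
]


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over the transactions that contain every item."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_util):
    """Enumerate every itemset and keep those whose utility reaches min_util.

    Exhaustive enumeration is exponential in the number of items; it is used
    here only because it is easy to verify by hand.
    """
    items = sorted(unit_profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = itemset_utility(itemset, transactions, unit_profit)
            if u >= min_util:
                result[itemset] = u
    return result


print(high_utility_itemsets(transactions, unit_profit, 10))
# {('a',): 15, ('a', 'c'): 13}
```

In this toy data the itemset `('c',)` has utility 4 and falls below the threshold, while its superset `('a', 'c')` has utility 13 and passes. Utility is therefore not anti-monotone, which is why dedicated pruning strategies are needed instead of plain Apriori-style pruning.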

Pattern Selection Using the Bias and Variance of Ensemble (앙상블의 편기와 분산을 이용한 패턴 선택)

  • Shin, Hyunjung;Cho, Sungzoon
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.28 no.1
    • /
    • pp.112-127
    • /
    • 2002
  • A useful pattern is a pattern that contributes much to learning. For a classification problem those patterns near the class boundary surfaces carry more information to the classifier. For a regression problem the ones near the estimated surface carry more information. In both cases, the usefulness is defined only for those patterns either without error or with negligible error. Using only the useful patterns gives several benefits. First, computational complexity in memory and time for learning is decreased. Second, overfitting is avoided even when the learner is over-sized. Third, learning results in more stable learners. In this paper, we propose a pattern 'utility index' that measures the utility of an individual pattern. The utility index is based on the bias and variance of a pattern trained by a network ensemble. In classification, the pattern with a low bias and a high variance gets a high score. In regression, on the other hand, the one with a low bias and a low variance gets a high score. Based on the distribution of the utility index, the original training set is divided into a high-score group and a low-score group. Only the high-score group is then used for training. The proposed method is tested on synthetic and real-world benchmark datasets. The proposed approach gives a better or at least similar performance.

Performance Analysis of Top-K High Utility Pattern Mining Methods (상위 K 하이 유틸리티 패턴 마이닝 기법 성능분석)

  • Ryang, Heungmo;Yun, Unil;Kim, Chulhong
    • Journal of Internet Computing and Services
    • /
    • v.16 no.6
    • /
    • pp.89-95
    • /
    • 2015
  • Traditional frequent pattern mining discovers valid patterns whose frequency is no smaller than a user-defined minimum threshold. In this framework, a threshold that is too low may extract an enormous number of patterns, which makes result analysis difficult, while one that is too high may yield no valid pattern. Setting an appropriate threshold is not easy since it requires prior knowledge of the domain. Because the framework cannot predict and control mining results precisely from the given threshold, a pattern mining approach that does not depend on domain knowledge became necessary. Top-k frequent pattern mining was proposed to solve this problem; it mines the top-k important patterns without any threshold setting. With this method, users can find patterns from the one with the highest frequency down to the one with the k-th highest frequency, regardless of the database. In this paper, we provide background on both frequent and top-k pattern mining. Although top-k frequent pattern mining extracts the top-k significant patterns without a threshold setting, it considers neither item quantities in transactions nor the relative importance of items in databases, so it cannot meet the requirements of many real-world applications. That is, in these applications, patterns with low frequency can be meaningful, and vice versa. High utility pattern mining was proposed to reflect the characteristics of non-binary databases, but it requires a minimum threshold. Recently, top-k high utility pattern mining has been developed, through which users can mine the desired number of high utility patterns without prior knowledge. In this paper, we analyze two algorithms for top-k high utility pattern mining in detail. We also conduct various experiments with the algorithms on real datasets and, through performance analysis of the experimental results, study points for improvement and directions for the development of top-k high utility pattern mining.
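
The top-k idea described in this abstract can be sketched as follows: instead of a user-supplied minimum threshold, a min-heap of size k holds the best itemsets seen so far, and its smallest utility serves as an internal threshold that rises during the search. This is a brute-force illustration under assumed toy data, not either of the two algorithms the paper analyzes; `top_k_high_utility` and the other names are hypothetical.

```python
import heapq
from itertools import combinations

# Hypothetical toy data: unit profit per item and quantities per transaction.
unit_profit = {"a": 5, "b": 2, "c": 1}
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1},
]


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over the transactions that contain every item."""
    return sum(
        sum(t[item] * unit_profit[item] for item in itemset)
        for t in transactions
        if all(item in t for item in itemset)
    )


def top_k_high_utility(transactions, unit_profit, k):
    """Return the k itemsets with the highest utility, best first."""
    items = sorted(unit_profit)
    heap = []  # min-heap of (utility, itemset); heap[0] holds the k-th best so far
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = itemset_utility(itemset, transactions, unit_profit)
            if u == 0:
                continue  # the itemset never occurs in the data
            if len(heap) < k:
                heapq.heappush(heap, (u, itemset))
            elif u > heap[0][0]:
                # The internal threshold rises: evict the current k-th best.
                heapq.heapreplace(heap, (u, itemset))
    return sorted(heap, reverse=True)


print(top_k_high_utility(transactions, unit_profit, 2))
# [(15, ('a',)), (13, ('a', 'c'))]
```

Here the rising threshold only filters results. Practical top-k algorithms also use it to prune the search space, which this sketch does not attempt.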

A Study on Consumer Information Search Patterns and Search Outcomes (I) (소비자 정보탐색유형과 탐색성과에 관한연구(I))

  • 채정숙
    • Journal of the Korean Home Economics Association
    • /
    • v.32 no.5
    • /
    • pp.67-82
    • /
    • 1994
  • The major purpose of this study was to identify the factors that explain information search patterns and to determine whether search outcomes differ significantly by search pattern. The data for this study were collected in a survey conducted in March 1993. The final sample consisted of 327 respondents who had purchased a refrigerator and 340 who had purchased a bed. The important findings of this study are as follows. First, the variables related to search cost-benefit play an important role in identifying the search patterns of consumers. Second, search outcomes differed among the four information search patterns for each information source. The overall search outcomes, namely the level of purchase knowledge and of post-purchase satisfaction, were relatively high for the high-search and high-reliance group compared with the other groups. The results also indicate that although some consumers search less than others, they can still make good purchase decisions and maximize their utility if they choose useful information sources selectively and use them effectively. The findings of this study provide some implications for consumer education programs, consumer information provision policies, and future research methods.

High Utility Itemset Mining by Using Binary PSO Algorithm with V-shaped Transfer Function and Nonlinear Acceleration Coefficient Strategy

  • Tao, Bodong;Shin, Ok Keun;Park, Hyu Chan
    • Journal of information and communication convergence engineering
    • /
    • v.20 no.2
    • /
    • pp.103-112
    • /
    • 2022
  • The goal of pattern mining is to identify novel patterns in a database. High utility itemset mining (HUIM) is a research direction in pattern mining. Unlike frequent itemset mining (FIM), HUIM additionally considers the quantity and profit of the commodity. Several algorithms have been used to mine high utility itemsets (HUIs). The original BPSO algorithm lacks local search capabilities in the later stage, so it mines an insufficient number of HUIs. Compared to the transfer function used in the original PSO algorithm, the V-shaped transfer function better reflects the probabilistic relationship between the velocity and the position change of the particles. Considering the influence of the acceleration factor on the particle motion mode and trajectory, a nonlinear acceleration strategy was used to enhance the search ability of the particles. Experiments show that the number of mined HUIs is 73% higher than that of the original BPSO algorithm, which indicates better performance of the proposed algorithm.
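
The role of a V-shaped transfer function in a binary PSO can be sketched as follows. The abstract does not say which V-shaped function or update rule the authors use, so this example assumes one common choice, V(v) = |tanh(v)|, together with the usual rule that a bit is flipped with probability V(v). The function names are hypothetical, and the nonlinear acceleration coefficient strategy is not modelled.

```python
import math
import random


def v_shaped(v):
    """Assumed V-shaped transfer function: maps a velocity to a flip probability.

    It is symmetric around zero, so a large velocity in either direction means
    a high probability of changing the bit, and zero velocity means no change.
    """
    return abs(math.tanh(v))


def update_position(position, velocity, rng):
    """Flip each bit of a binary position with probability v_shaped(velocity)."""
    new_position = []
    for bit, v in zip(position, velocity):
        if rng.random() < v_shaped(v):
            new_position.append(1 - bit)  # complement the bit
        else:
            new_position.append(bit)      # keep the bit unchanged
    return new_position


rng = random.Random(0)  # seeded only to make the demonstration repeatable
print(update_position([1, 0, 1], [0.0, 0.0, 0.0], rng))    # zero velocity: no bit changes
print(update_position([1, 0, 1], [50.0, -50.0, 50.0], rng))  # saturated velocity: every bit flips
```

In an itemset-mining setting, each bit of the position would indicate whether an item belongs to the candidate itemset, so a flip adds or removes that item.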

Comparison and Optimization of Parallel-Transmission RF Coil Elements for 3.0 T Body MRI (3.0 T MRI를 위한 Parallel-Transmission RF 코일 구조의 비교와 최적화)

  • Oh, Chang-Hyun;Lee, Heung-K.;Ryu, Yeun-Chul;Hyun, Jung-Ho;Choi, Hyuk-Jin
    • Proceedings of the KIEE Conference
    • /
    • 2007.04a
    • /
    • pp.61-63
    • /
    • 2007
  • In high field (> 3 T) MR imaging, the magnetic field inhomogeneity in the target object increases due to the nonuniform electromagnetic characteristics and relatively high Larmor frequency. In body imaging especially, this effect causes more serious problems, resulting in locally high SAR (specific absorption rate). To overcome these unwanted effects, we propose an optimized parallel-transmission RF coil element structure and show the utility of the coil by FDTD simulations. Three types of TX coil elements are tested to maximize efficiency, and their driving patterns (amplitude and phase) are optimized to achieve adequate field homogeneity, a proper SAR level, and sufficient field strength. For the proposed coil element with a 25 cm × 8 cm loop structure and 12 channels for a 3.0 T body coil, the 73% field non-uniformity without optimization was reduced to about 26% after optimization of the driving patterns. The experimental and simulation results show that the proposed parallel driving scheme is clinically useful for (ultra) high field MRI.

Development of Rushan (襦衫) and Qun (裙) Patterns for Traditional Chinese Wedding Dresses Using a Virtual Fitting Program

  • Liu, Xiang;Suh, Chuyeon
    • Journal of the Korean Society of Clothing and Textiles
    • /
    • v.46 no.2
    • /
    • pp.250-271
    • /
    • 2022
  • Traditional wedding dresses have had a high market demand in China in recent years. Traditional wedding dresses from the Tang dynasty occupy an important position among traditional Chinese dresses, and they are also favored by young women. This study was conducted to develop the rushan and qun patterns of traditional wedding dress styles from the Tang dynasty for women in their twenties in China. For this purpose, the rushan and qun patterns of Tang and Song dynasty dresses and modern traditional dresses were collected and analyzed. Additionally, the developed patterns were validated for suitability through appearance evaluations of virtual and real fittings. The following proportions of the developed patterns were proposed: H/3.3 for rushan length, H/33 for collar width, H/1.08 for total sleeve length, H/6 for sleeve width, H/8.5 for sleeve hem width, and H/1.55 for qun length. In addition, the developed patterns received high scores in the appearance evaluations of the virtual and real fittings. Therefore, the developed rushan and qun patterns are expected to have high utility in the current traditional wedding dress industry.

High Utility Pattern Mining using a Prefix-Tree (Prefix-Tree를 이용한 높은 유틸리티 패턴 마이닝 기법)

  • Jeong, Byeong-Soo;Ahmed, Chowdhury Farhan;Lee, In-Gi;Yong, Hwan-Seong
    • Journal of KIISE:Databases
    • /
    • v.36 no.5
    • /
    • pp.341-351
    • /
    • 2009
  • Recently, high utility pattern (HUP) mining has become one of the most important research issues in data mining since it can consider the different weight values of items. However, existing mining algorithms suffer from performance degradation because they cannot easily apply the Apriori principle for pattern mining. In this paper, we introduce a new high utility pattern mining approach that uses a prefix-tree as in the FP-Growth algorithm. Our approach stores the weight value of each item in a node and uses these values to prune unnecessary patterns. We compare the performance characteristics of three different prefix-tree structures. Through thorough experimentation, we also show that our approach improves performance to some degree.