• Title/Summary/Keyword: 빈발도 (frequency of occurrence)


Mining Association Rules in Multidimensional Stream Data (다차원 스트림 데이터의 연관 규칙 탐사 기법)

  • Kim, Dae-In;Park, Joon;Kim, Hong-Ki;Hwang, Bu-Hyun
    • The KIPS Transactions:PartD
    • /
    • v.13D no.6 s.109
    • /
    • pp.765-774
    • /
    • 2006
  • Association rule discovery, a technique for analyzing data stored in databases to uncover potentially useful information, has been a popular topic in stream data systems. Most previous research has dealt with a single data stream, an approach that overlooks associations spanning multidimensional stream data. In this paper, we study techniques for discovering association rules in multidimensional stream data. We propose the AR-MS method, which reflects the characteristics of stream data by building summarization information in a single data scan, and which discovers association rules for significant rare data that appear infrequently in the database but are highly associated with a specific event. The AR-MS method can also discover the maximal frequent items of multidimensional stream data by using the summarization information. Through analysis and experiments, we show that the AR-MS method is superior to previous methods.
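
To illustrate what "maximal frequent" means in the abstract above, here is a minimal sketch. It is not the AR-MS method, whose summarization structure is not described in the abstract; the function name `maximal_itemsets` and the toy input are assumptions made for illustration. Given itemsets already known to be frequent, it keeps those that have no frequent proper superset.

```python
def maximal_itemsets(frequent):
    """Return the itemsets that have no proper superset in `frequent`."""
    sets = [frozenset(s) for s in frequent]
    # `s < other` is the proper-subset test for sets.
    return [s for s in sets if not any(s < other for other in sets)]


# Hypothetical frequent itemsets, e.g. produced by an earlier counting pass.
frequent = [{"temp"}, {"humidity"}, {"temp", "humidity"}, {"pressure"}]
maximal = maximal_itemsets(frequent)
print(maximal)
```

Here `{"temp"}` and `{"humidity"}` are dropped because `{"temp", "humidity"}` is also frequent, while `{"pressure"}` is kept because nothing frequent extends it.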

Protein Disorder/Order Region Classification Using EPs-TFP Mining Method (EPs-TFP 마이닝 기법을 이용한 단백질 Disorder/Order 지역 분류)

  • Lee, Heon Gyu;Shin, Yong Ho
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.17 no.6
    • /
    • pp.59-72
    • /
    • 2012
  • Since a protein displays its specific functions when a disorder region of the protein sequence transitions to an order region while provoking a biological reaction, separating disorder regions from order regions in the sequence data is urgently needed for predicting the three-dimensional structure and characteristics of the protein. To classify disorder and order regions efficiently, this paper proposes a classification/prediction method that uses sequence data, yields results that are not biased toward specific characteristics of the protein, and improves classification speed. The emerging-pattern-based EPs-TFP method uses only the essential emerging patterns, with redundant emerging patterns removed. This classification method finds the sequence patterns of disorder regions, that is, patterns that appear frequently in disorder regions but relatively infrequently in order regions. To improve the performance of the proposed algorithm, we extend the TFP method, which is built on the P-tree and T-tree, into a classification/prediction method. We evaluated the EPs-TFP technique on the Disprot 4.9 and CASP 7 data; the order/disorder classification results show a sensitivity of 73.6, a specificity of 69.51, and an accuracy of 74.2.
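
The idea of a pattern that is frequent in one class but infrequent in the other is commonly measured by a growth rate, the ratio of the pattern's support in the two classes. The sketch below shows that ratio only; it does not reproduce EPs-TFP or its P-tree/T-tree structures. The function names and the toy residue strings are assumptions, not data from the paper.

```python
def support(pattern, sequences):
    """Fraction of sequences that contain `pattern` as a substring."""
    return sum(pattern in s for s in sequences) / len(sequences)


def growth_rate(pattern, target, background):
    """Support in the target class divided by support in the background class."""
    s_target = support(pattern, target)
    s_background = support(pattern, background)
    if s_background == 0:
        # Present only in the target class (or absent from both).
        return float("inf") if s_target > 0 else 0.0
    return s_target / s_background


# Made-up residue strings standing in for disorder and order regions.
disorder = ["AKEPS", "PSEKA", "GGPSA"]
order = ["LLVIA", "PSLLV", "VVIIL", "AAVLI"]

print(growth_rate("PS", disorder, order))  # 1.0 / 0.25 = 4.0
```

A pattern would count as emerging for the disorder class when its growth rate exceeds a chosen threshold; the threshold itself is a tuning choice not given in the abstract.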

Clustering Normal User Behavior for Anomaly Intrusion Detection (비정상행위 탐지를 위한 사용자 정상행위 클러스터링 기법)

  • Oh, Sang-Hyun;Lee, Won-Suk
    • The KIPS Transactions:PartC
    • /
    • v.10C no.7
    • /
    • pp.857-866
    • /
    • 2003
  • To detect an intrusion from anomalies in a user's activities, previous work has concentrated on statistical techniques for analyzing an audit data set. However, since these techniques mainly analyze the average behavior of a user's activities, some anomalies can be detected inaccurately. In this paper, a new clustering algorithm for modeling the normal pattern of a user's activities is proposed. Since clustering can identify an arbitrary number of dense ranges in an analysis domain, it can eliminate the inaccuracy caused by statistical analysis. Clustering can also be used to model common knowledge that occurs frequently in a set of transactions. Consequently, the common activities of a user can be found more accurately. The common knowledge is represented by the occurrence frequency of similar data objects per transaction as well as the common repetitive ratio of similar data objects in each transaction. Furthermore, the proposed method addresses how to maintain the identified common knowledge as a concise profile. As a result, the profile can be used to detect any anomalous behavior in an online transaction.

Semi-Automatic Ontology Generation about XML Documents using Data Mining Method (데이터 마이닝 기법을 이용한 XML 문서의 온톨로지 반자동 생성)

  • Gu Mi-Sug;Hwang Jeong-Hee;Ryu Keun-Ho;Hong Jang-Eui
    • The KIPS Transactions:PartD
    • /
    • v.13D no.3 s.106
    • /
    • pp.299-308
    • /
    • 2006
  • As XML has recently become the standard for exchanging web documents and public documentation, XML data are increasing in many areas. To retrieve information from XML documents efficiently, the ontology-based semantic web is emerging. Existing ontologies have been constructed manually, which is time-consuming and costly. Therefore, in this paper we propose a semi-automatic ontology generation technique that uses a data mining technique, association rules. Using the data mining algorithm, the proposed method determines what types of conceptual relationships exist and how many, and determines the ontology domain level for automatic ontology generation. By applying association rules to the XML documents, we find the frequent patterns of XML tags and, from them, the conceptual relationships needed to construct the ontology. Using the conceptual ontology domain level extracted by data mining, we implemented the ontology-based semantic web with XML Topic Maps (XTM) and the topic map engine TM4J.

Expansion of Opinion Mining based on Entity Association Network Model (개체연관망 모델에 의한 오피니언마이닝의 확장)

  • Kim, Keun-Hyung
    • The KIPS Transactions:PartD
    • /
    • v.18D no.4
    • /
    • pp.237-244
    • /
    • 2011
  • Opinion mining classifies the sentiment-laden opinions that customers express about the attributes of products or services in large volumes of online reviews as positive or negative, and summarizes them. Because customers express their interests through objective facts as well as subjective opinions, existing opinion mining techniques, which can analyze only the sentiment-laden opinions, need to be expanded. In this paper, we propose a novel entity association network model that expands the existing opinion mining techniques. The entity association network model can represent not only the positive and negative degree of sentiment-laden opinions but also the degree of association and the relative importance between entities. We designed and implemented a customer review analysis system based on the entity association network model. We found that the system can represent richer information than the existing opinion mining techniques.

An Associative Class Set Generation Method for supporting Location-based Services (위치 기반 서비스 지원을 위한 연관 클래스 집합 생성 기법)

  • 김호숙;용환승
    • Journal of KIISE:Databases
    • /
    • v.31 no.3
    • /
    • pp.287-296
    • /
    • 2004
  • Recently, various location-based services have become very popular in mobile environments. In this paper, we propose a new concept of a frequent item set, called the “associative class set”, for supporting location-based services that use a large spatial database in mobile computing environments, and then present a new method for efficiently generating the associative class set. The associative class set is generated by considering the temporal relation of queries, the spatial distance of required objects, and the access patterns of users. The result of our research can play a fundamental role in efficiently supporting location-based services and in overcoming the limitations of mobile environments. The associative class set can be applied to recommendation systems of geographic information systems in mobile computing environments, mobile advertisement, city development planning, and client cache policies for mobile users.

The Efficient Spatio-Temporal Moving Pattern Mining using Moving Sequence Tree (이동 시퀀스 트리를 이용한 효율적인 시공간 이동 패턴 탐사 기법)

  • Lee, Yon-Sik;Ko, Hyun
    • The KIPS Transactions:PartD
    • /
    • v.16D no.2
    • /
    • pp.237-248
    • /
    • 2009
  • Recently, research on pattern mining methods based on the dynamic location or mobility of moving objects has been actively pursued, with the aim of extracting more useful patterns from various moving patterns for the development of location-based services. The performance of moving pattern mining depends on how the huge set of spatio-temporal data is analyzed and processed. Some traditional spatio-temporal pattern mining methods [1-6,8-11] have been proposed to solve these problems, but they did not adequately reduce mining execution time or minimize the required memory space. Therefore, in this paper, we propose a new spatio-temporal pattern mining method that efficiently extracts sequential and periodic frequent moving patterns from the huge set of spatio-temporal moving data. The proposed method reduces the execution time of frequent moving pattern mining by 83% to 93% by using a moving sequence tree, which is generated from the historical data of moving objects based on a hash tree. In addition, to minimize the required memory space, it generalizes the detailed historical data, including spatio-temporal attributes, into the real-world scope of space and time using a spatio-temporal concept hierarchy.

Multi-parametric Diagnosis Indexes and Emerging Pattern based Classification Technique for Diagnosing Cardiovascular Disease (심혈관계 질환 진단을 위한 복합 진단 지표와 출현 패턴 기반의 분류 기법)

  • Lee, Heon-Gyu;Noh, Ki-Yong;Ryu, Keun-Ho;Jung, Doo-Young
    • The KIPS Transactions:PartD
    • /
    • v.16D no.1
    • /
    • pp.11-26
    • /
    • 2009
  • To diagnose cardiovascular disease, we propose an EP-based (emerging-pattern-based) classification technique using multi-parametric diagnosis indexes. To build the multi-parametric diagnosis indexes, we analyzed linear and nonlinear features of HRV for three recumbent postures and extracted four diagnosis indexes from ST-segments. In this paper, a classification model using essential emerging patterns was applied to disease diagnosis. This classification technique discovers the disease patterns of the patient group; these emerging patterns are frequent in patients with cardiovascular disease but not frequent in the normal group. To evaluate the proposed classification algorithm, data from 120 patients with AP (angina pectoris), 13 patients with ACS (acute coronary syndrome), and 128 normal people were used. When the multi-parametric indexes were used, the accuracy in classifying the three groups turned out to be about 88.3%.

Personalized Recommendation System using FP-tree Mining based on RFM (RFM기반 FP-tree 마이닝을 이용한 개인화 추천시스템)

  • Cho, Young-Sung;Ho, Ryu-Keun
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.2
    • /
    • pp.197-206
    • /
    • 2012
  • Existing recommendation systems that use association rules suffer from slow processing caused by frequent scans of large data sets, as well as from scalability and accuracy problems. In this paper, using an implicit method that does not rely on user profiles for rating, we propose a personalized recommendation system based on a new method, FP-tree mining based on RFM. Combining RFM analysis with FP-tree mining is necessary to reflect the attributes of customers and items, drawing on the data of all customers and their purchases, in order to find items with high purchasability. The proposed system produces frequent items and creates association rules using FP-tree mining based on RFM, without generating candidate sets. Items can then be recommended efficiently: recommendable items are generated according to basic thresholds on the support, confidence, and lift of the association rules. To estimate performance, the proposed system is compared with an existing system. In an experiment with a data set collected from an internet cosmetics shopping mall, the system showed improvement when evaluated according to the criteria of logicality.
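
The three rule thresholds named in the abstract above (support, confidence, and lift) have standard definitions, sketched below for a single rule. This shows only the metrics, not the paper's RFM-based FP-tree mining; the function name `rule_metrics` and the toy baskets are assumptions for illustration.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    c = frozenset(consequent)
    n = len(transactions)
    count_a = sum(1 for t in transactions if a <= t)
    count_c = sum(1 for t in transactions if c <= t)
    count_both = sum(1 for t in transactions if (a | c) <= t)

    support = count_both / n                      # P(A and C)
    confidence = count_both / count_a if count_a else 0.0   # P(C | A)
    lift = confidence / (count_c / n) if count_c else 0.0   # P(C | A) / P(C)
    return support, confidence, lift


# Hypothetical purchase baskets.
baskets = [
    {"toner", "lotion"},
    {"toner", "lotion", "mask"},
    {"toner"},
    {"lotion", "mask"},
]
print(rule_metrics(baskets, {"toner"}, {"lotion"}))
```

For these baskets the rule toner -> lotion has support 0.5, confidence 2/3, and lift 8/9. A lift below 1 means that buying toner makes lotion slightly less likely than its baseline rate, so a lift threshold would filter this rule out.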

Searching association rules based on purchase history and usage-time of an item (콘텐츠 구매이력과 사용시간을 고려한 연관규칙탐색)

  • Lee, Bong-Kyu
    • Journal of Software Assessment and Valuation
    • /
    • v.16 no.1
    • /
    • pp.81-88
    • /
    • 2020
  • Various methods of differentiating digital content services for individual users have been studied. Searching for association rules is a very useful way to discover individual preferences in digital content services. The Apriori algorithm is useful as an association rule extractor based on frequent itemsets. However, the Apriori algorithm is not well suited to an actual content service because it considers only the reference count of each content item. In this paper, we propose a new Apriori-based algorithm that searches for association rules using the purchase history and usage time of each item. The proposed algorithm uses the usage time as a weight on the purchased items, which makes it possible to extract the preferences of actual users more precisely. We implement the proposed algorithm and verify its performance on real data from an actual content service system.
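
One way to picture how usage time can act as a weight on plain occurrence counts is sketched below. The weighting formula is an assumption chosen for illustration; the abstract does not give the paper's formula. Each purchase history maps an item to the hours it was used, and an itemset contributes its least-used item's time relative to the most-used item in that history, instead of a flat count of 1.

```python
def weighted_support(histories, itemset):
    """Usage-time-weighted support of `itemset` (illustrative weighting only)."""
    items = set(itemset)
    total = 0.0
    for usage in histories:  # usage: dict mapping item -> hours used
        if items.issubset(usage):
            longest = max(usage.values())
            if longest > 0:
                # Weight in [0, 1]: the least-used item of the itemset,
                # relative to the most-used item in this history.
                total += min(usage[i] for i in items) / longest
    return total / len(histories)


# Hypothetical purchase histories with usage hours.
histories = [
    {"game_a": 10, "game_b": 5},
    {"game_a": 2, "game_b": 8, "video_c": 8},
    {"game_a": 4},
]
print(weighted_support(histories, {"game_a", "game_b"}))  # (0.5 + 0.25) / 3 = 0.25
```

A count-only support for the same pair would be 2/3, since it appears in two of three histories. The weighted value is lower because one item of the pair was barely used in the second history.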