• Title/Summary/Keyword: Database Mining

Approximation of Frequent Itemsets with Maximum Size by One-scan for Association Rule Mining Application (연관 규칙 탐사 응용을 위한 한 번 읽기에 의한 최대 크기 빈발항목 추정기법)

  • Han, Gab-Soo
    • The KIPS Transactions:PartD / v.15D no.4 / pp.475-484 / 2008
  • Advances in data processing have led to a growing number of data mining applications that operate on continuous, online, real-time data. Association rule mining in such applications requires new techniques for finding frequent itemsets. Most existing techniques scan the entire database repeatedly, which is impossible when data arrives continuously in real time; frequent itemsets must instead be found with a single scan of each data interval. This paper proposes an approximation technique that estimates the maximum size of the frequent itemsets, and the items contained in the itemsets of that size, for use in association rule mining.
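
To make the single-scan constraint concrete, the sketch below derives a simple upper bound on the largest frequent itemset size from one pass over the data. It is not the paper's approximation technique, which the abstract does not detail; the function name `one_scan_bound`, the sample transactions, and the 0.4 support threshold are all hypothetical. The bound rests on two facts: a frequent itemset of size k needs at least the minimum number of transactions with k or more items, and it needs k individually frequent items.

```python
from collections import Counter


def one_scan_bound(transactions, min_support):
    """Return (upper bound on max frequent itemset size, frequent single items).

    Everything is collected in a single pass: per-item counts and a
    histogram of transaction lengths. This is an illustrative bound,
    not the approximation method proposed in the paper.
    """
    item_counts = Counter()
    length_hist = Counter()
    n = 0
    for t in transactions:  # the only scan over the data
        items = set(t)
        n += 1
        item_counts.update(items)
        length_hist[len(items)] += 1

    min_count = min_support * n
    frequent_items = {i for i, c in item_counts.items() if c >= min_count}

    # Largest k such that at least min_count transactions hold >= k items.
    max_size = 0
    at_least = 0
    for k in sorted(length_hist, reverse=True):
        at_least += length_hist[k]
        if at_least >= min_count:
            max_size = k
            break

    # A k-itemset also needs k individually frequent items.
    return min(max_size, len(frequent_items)), frequent_items


# Hypothetical data for illustration only.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c", "d"],
    ["a", "b", "c"],
]
bound, items = one_scan_bound(transactions, 0.4)
print(bound, sorted(items))
```

The result is only an upper bound: the true maximum size can be smaller, and confirming it would normally require candidate counting, which is exactly the repeated scanning that a streaming setting rules out.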

Practical Text Mining for Trend Analysis: Ontology to visualization in Aerospace Technology

  • Kim, Yoosin;Ju, Yeonjin;Hong, SeongGwan;Jeong, Seung Ryul
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.8 / pp.4133-4145 / 2017
  • Advances in science and technology improve our lives but also demand ever greater investment. Governments have therefore funded promising future technologies, and for several decades a large share of public resources has gone into science and technology R&D projects. However, the performance of these public investments remains unclear in many respects, so planning and evaluation of new investment should rest on data-driven decisions backed by factual evidence. To this end, the government has sought evidence on trends and issues in science and technology and has accumulated a large database of research papers, patents, project reports, and R&D information. This database now supports activities such as policy planning, budget allocation, and investment evaluation, but the quality of the information falls short of expectations because of the limitations of text mining in extracting information from unstructured data such as reports and papers. To address this problem, this study proposes a practical text mining methodology for science and technology trend analysis, applies it to the case of aerospace technology, and carries out ontology development, topic analysis, network analysis, and their visualization.

No-reference Sharpness Index for Scanning Electron Microscopy Images Based on Dark Channel Prior

  • Li, Qiaoyue;Li, Leida;Lu, Zhaolin;Zhou, Yu;Zhu, Hancheng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.5 / pp.2529-2543 / 2019
  • Scanning electron microscopy (SEM) images offer a view of the microscopic world by reflecting the interaction between electrons and materials. SEM images are easily subject to blurring distortions during the imaging process. Inspired by the fact that the dark channel prior captures the changes that blurring causes in SEM images, we propose a method to evaluate SEM image sharpness based on the dark channel prior. An SEM image database is first established, with mean opinion scores collected as ground truth. To assess the quality of an SEM image, its dark channel map is generated. Since blurring is typically characterized by the spreading of edges, the edges of the dark channel map are extracted. Noise is then removed by an edge-preserving filter. Finally, the maximum gradient and the average gradient of the image are combined to generate the final sharpness score. Experimental results on the blurred SEM image database show that the proposed algorithm outperforms both existing state-of-the-art image sharpness metrics and general-purpose no-reference quality metrics.
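
The pipeline described above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the authors' implementation: the image is a grayscale 2D list, so the dark channel reduces to a local minimum filter; the edge extraction and edge-preserving filtering steps are omitted; and the equal-weight combination of maximum and average gradient (`weight=0.5`) is a hypothetical choice, since the abstract does not give the combination rule.

```python
def dark_channel(img, radius=1):
    """Local minimum over a (2*radius+1)^2 window of a grayscale image."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = min(
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            )
    return out


def gradient_magnitudes(img):
    """Forward-difference gradient magnitudes, skipping the last row/column."""
    h, w = len(img), len(img[0])
    mags = []
    for y in range(h - 1):
        for x in range(w - 1):
            gx = img[y][x + 1] - img[y][x]
            gy = img[y + 1][x] - img[y][x]
            mags.append((gx * gx + gy * gy) ** 0.5)
    return mags


def sharpness_score(img, radius=1, weight=0.5):
    """Combine max and mean gradient of the dark channel map (assumed weights)."""
    mags = gradient_magnitudes(dark_channel(img, radius))
    return weight * max(mags) + (1 - weight) * sum(mags) / len(mags)


# Hypothetical 6x6 test images: a hard step edge and a smooth ramp.
sharp = [[0, 0, 0, 100, 100, 100] for _ in range(6)]
blurred = [[0, 20, 40, 60, 80, 100] for _ in range(6)]
print(sharpness_score(sharp), sharpness_score(blurred))
```

On these toy images the step edge scores higher than the ramp, which is the qualitative behavior a no-reference sharpness index should show.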

A Study on the Hybrid Data Mining Mechanism Based on Association Rules and Fuzzy Neural Networks (연관규칙과 퍼지 인공신경망에 기반한 하이브리드 데이터마이닝 메커니즘에 관한 연구)

  • Kim Jin Sung
    • Proceedings of the Korean Operations and Management Science Society Conference / 2003.05a / pp.884-888 / 2003
  • In this paper, we introduce a hybrid data mining mechanism based on association rules and fuzzy neural networks (FNN). Most data mining mechanisms depend on an association rule extraction algorithm. However, basic association rule-based data mining has no learning ability. In addition, sequential patterns of association rules cannot represent complicated fuzzy logic. To resolve these problems, we suggest a hybrid mechanism that combines association rule-based data mining with fuzzy neural networks. The mechanism consists of four phases. First, we use a general association rule mining mechanism to develop the initial rule base. Second, we use the fuzzy neural networks to learn the historical patterns embedded in the database. Third, a fuzzy rule extraction algorithm extracts the implicit knowledge from the FNN. Fourth, we combine the association knowledge base and the fuzzy rules. The proposed hybrid data mining mechanism can reflect both association rule-based logical inference and complicated fuzzy logic.

Use of Information Component (IC) and Relative Risk (RR) for Signal Detection of Drug Interactions of Clopidogrel : Data-mining Study Using Health Insurance Review & Assessment Service (HIRA) Claims Database (정보 성분과 상대위험도를 이용한 clopidogrel의 약물상호작용 시그널 검색 : 건강보험데이터베이스를 대상으로 한 데이터마이닝 연구)

  • Kim, Jin-Hyung;Choi, Chung-Am;Oh, Jung-Mi;Son, Sung-Ho;Shin, Wan-Gyoon
    • Korean Journal of Clinical Pharmacy / v.21 no.2 / pp.90-99 / 2011
  • The Health Insurance Review & Assessment Service (HIRA) claims database has high potential for detecting signals of new drug interactions. The aim of this study was to evaluate the usefulness of the information component (IC) and relative risk (RR) as tools for signal detection, and to analyze possible drug interactions caused by clopidogrel using the HIRA claims database. The study was performed in elderly patients over 65 years of age who were administered clopidogrel from January 2005 to June 2006 in South Korea. Serious adverse events (SAEs) as drug interactions of clopidogrel were defined as any ambulatory hospitalization for ischemic diseases within the concomitant medication period of clopidogrel. IC and RR were calculated to compare the proportion of drug-SAE pairs in order to select drug-specific SAEs. IC and RR signals of clopidogrel drug interaction were flagged when the 95% confidence interval of IC was greater than 0 and the 95% confidence interval of RR was greater than 1, respectively. All detected signals were compared with references such as Micromedex® and 2010 Drug Interaction Facts™. Sensitivity, specificity, positive predictive value, and negative predictive value were used to evaluate the usefulness of this method. Among 13,252,930 cases of elderly patients who were co-administered clopidogrel and other drugs, 47,485 cases were detected as SAEs. Of these, 109 cases were detected by the IC-based data-mining approach and 91 cases by the RR-based data-mining approach. In total, 163 unrecognized signals were detected by IC or RR. Twelve signals from IC-based data mining (57.1%) and eight signals from RR-based data mining (38.1%) corresponded to drug interactions listed in the references. These signals include proton pump inhibitors, calcium channel blockers, and HMG-CoA reductase inhibitors, which are known to affect CYP450 metabolism. Further studies using the HIRA claims database are necessary to develop an appropriate data-mining measure.
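
As a rough illustration of the two measures, the sketch below computes RR with the standard log-scale 95% confidence interval and the IC point estimate from a 2x2 table. It is not the study's code. The counts are hypothetical, the function names are illustrative, and the IC confidence interval used in the study is omitted because the abstract does not say how it was derived. Here `a` and `b` are co-medicated patients with and without the SAE, and `c` and `d` are the corresponding counts for patients not taking the second drug.

```python
import math


def relative_risk(a, b, c, d, z=1.96):
    """RR of the event for exposed vs. unexposed, with a log-scale CI."""
    rr = (a / (a + b)) / (c / (c + d))
    # Standard error of ln(RR) for a 2x2 table.
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


def information_component(a, b, c, d):
    """IC point estimate: log2 of observed over expected co-occurrence."""
    n = a + b + c + d
    return math.log2(a * n / ((a + b) * (a + c)))


# Hypothetical counts for illustration only.
a, b, c, d = 40, 960, 200, 8800
rr, rr_low, rr_high = relative_risk(a, b, c, d)
ic = information_component(a, b, c, d)

# Signal rule from the abstract for RR: lower 95% bound above 1.
print(f"RR={rr:.2f} ({rr_low:.2f}-{rr_high:.2f}), signal={rr_low > 1}")
print(f"IC={ic:.2f}")
```

With these counts the RR interval lies entirely above 1, so the pair would count as an RR signal; the IC point estimate is positive, but a signal decision would still need its confidence interval.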

Adaptive Decision Tree Algorithm for Data Mining in Real-Time Machine Status Database (실시간 기계 상태 데이터베이스에서 데이터 마이닝을 위한 적응형 의사결정 트리 알고리듬)

  • Baek, Jun-Geol;Kim, Kang-Ho;Kim, Sung-Shick;Kim, Chang-Ouk
    • Journal of Korean Institute of Industrial Engineers / v.26 no.2 / pp.171-182 / 2000
  • Over the last five years, data mining has drawn much attention from researchers and practitioners because of its many application domains. This article presents an adaptive decision tree algorithm for dynamically reasoning about the causes of machine failure from a real-time, large-scale machine status database. Among the many data mining methods, intelligent decision tree building algorithms are of particular interest because they enable the automatic generation of decision rules from the tree, facilitating the construction of expert systems. Experiments on a semiconductor etching machine verified that our model outperforms previously proposed decision tree models.

A Data Mining Technique for Customer Behavior Association Analysis in Cyber Shopping Malls (가상상점에서 고객 행위 연관성 분석을 위한 데이터 마이닝 기법)

  • 김종우;이병헌;이경미;한재룡;강태근;유관종
    • The Journal of Society for e-Business Studies / v.4 no.1 / pp.21-36 / 1999
  • Using user monitoring techniques on the web, marketing decision makers in cyber shopping malls can gather customer behavior data as well as sales transaction data and customer profiles. In this paper, we present a marketing rule extraction technique for customer behavior analysis in cyber shopping malls. The technique is an application of market basket analysis, a representative data mining technique for extracting association rules. Market basket analysis is applied to a customer behavior log table, which yields association rules about web pages in a cyber shopping mall. The extracted association rules can be used for mall layout design, product packaging, web page link design, and product recommendation. A prototype cyber shopping mall with customer monitoring features and a customer behavior analysis algorithm is implemented using Java Web Server, Servlet, JDBC (Java Database Connectivity), and a relational database on Windows NT.
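
A minimal sketch of market basket analysis over page-visit sessions follows. It is written in Python for brevity rather than the Java stack of the prototype, and it does not reproduce the paper's algorithm: the function name `page_rules`, the session data, and the thresholds are hypothetical. Each session plays the role of a basket, and a rule A → B is kept when its support and confidence reach the given minimums.

```python
from collections import Counter
from itertools import permutations


def page_rules(sessions, min_support, min_confidence):
    """Return pairwise rules (lhs, rhs, support, confidence) between pages."""
    n = len(sessions)
    page_counts = Counter()
    pair_counts = Counter()
    for session in sessions:
        pages = set(session)  # a page counts once per session
        page_counts.update(pages)
        # Ordered pairs, so both A -> B and B -> A are candidates.
        pair_counts.update(permutations(sorted(pages), 2))

    rules = []
    for (lhs, rhs), count in pair_counts.items():
        support = count / n  # share of sessions with both pages
        confidence = count / page_counts[lhs]  # P(rhs | lhs)
        if support >= min_support and confidence >= min_confidence:
            rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical behavior log, one list of visited pages per session.
sessions = [
    ["home", "shoes", "cart"],
    ["home", "shoes"],
    ["home", "bags"],
    ["shoes", "cart"],
]
rules = page_rules(sessions, min_support=0.5, min_confidence=0.9)
print(rules)
```

In this toy log the only surviving rule is cart → shoes: every session that reaches the cart page also visited the shoes page, which is the kind of relationship that could inform link design or recommendations.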

Mining Association Rules on Significant Rare Data using Relative Support (상대 지지도를 이용한 의미 있는 희소 항목에 대한 연관 규칙 탐사 기법)

  • Ha, Dan-Shim;Hwang, Bu-Hyun
    • Journal of KIISE:Databases / v.28 no.4 / pp.577-586 / 2001
  • Data mining, which analyzes stored data to discover potential knowledge and information in large databases, has recently become a key topic in database research. In this paper, we study methods for discovering association rules, one of the main data mining techniques. We propose a technique that discovers association rules using relative support, in order to take into account significant rare data, that is, rare data that have high relative support with respect to some other data. We also compare and evaluate existing methods and the proposed method in terms of discovering significant rare data.
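
The abstract does not define relative support, so the sketch below assumes one formulation: the support of an itemset divided by the support of each of its items, taking the maximum ratio. Under that assumption, an itemset that is rare overall can still score highly when it accounts for most occurrences of one of its items. The function names and the transactions are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def relative_support(itemset, transactions):
    """Assumed definition: max over items of sup(itemset) / sup(item)."""
    joint = support(itemset, transactions)
    if joint == 0:
        return 0.0  # also avoids dividing by a zero item support
    return max(joint / support([item], transactions) for item in itemset)


# Hypothetical data: "caviar" is rare but always bought with "bread".
transactions = (
    [["bread", "caviar"]] * 2
    + [["bread"]] * 6
    + [["milk"]] * 2
)
pair = ["bread", "caviar"]
print(support(pair, transactions), relative_support(pair, transactions))
```

With an ordinary minimum support of, say, 0.3, the pair would be discarded at support 0.2, while its relative support of 1.0 marks it as a significant rare pattern.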

A Study on the Database Marketing using Data Mining in the Traditional Medicine (데이터마이닝을 활용한 한방분야에서의 데이터베이스 마케팅에 대한 연구)

  • Lee Sang-Young;Lee Yun-Seok
    • Journal of the Korea Society of Computer and Information / v.10 no.5 s.37 / pp.271-280 / 2005
  • This study uses decision trees to identify the factors that affect medical examinations in traditional medicine and uses clustering analysis to characterize patient groups. It also draws results from an association analysis of disease types in the re-hospitalized patient group. The results were analyzed for their effect on hospital profits. By applying data mining techniques to database marketing in traditional medicine, the characteristics of patient clients can be identified, allowing the factors that affect hospital profits to be derived objectively. Practical application of database marketing as presented in this study will improve the efficiency of hospital management and help revitalize it.

A Methodology for Searching Frequent Pattern Using Graph-Mining Technique (그래프마이닝을 활용한 빈발 패턴 탐색에 관한 연구)

  • Hong, June Seok
    • Journal of Information Technology Applications and Management / v.26 no.1 / pp.65-75 / 2019
  • As the use of the XML-based semantic web grows in the field of data management, many studies have tried to extract useful information from data stored in ontologies using association rule mining. Ontology data has the advantage that data can be expressed freely, because its structure is flexible and scalable, unlike a conventional database with a predefined structure. On the other hand, this makes it difficult to find frequent patterns with a uniform analysis method. The goal of this study is to provide a basis for extracting useful knowledge from ontologies by searching for frequently occurring subgraph patterns, applying transaction-based graph mining techniques to the ontology schema graph data and instance graph data that constitute an ontology. To overcome the structural limitations of existing ontology mining, the proposed frequent pattern search methodology borrows from graph mining and applies an iterative node chunking method to find frequent patterns in the graph data structure of the ontology. The suggested methodology is expected to play an important role in knowledge extraction.