• Title/Abstract/Keywords: Component mining

Search Result 142, Processing Time 0.027 seconds

Performance evaluation of principal component analysis for clustering problems

  • Kim, Jae-Hwan; Yang, Tae-Min; Kim, Jung-Tae
    • Journal of Advanced Marine Engineering and Technology / v.40 no.8 / pp.726-732 / 2016
  • Clustering analysis is widely used in data mining to classify data into categories on the basis of their similarity. Over the decades, many clustering techniques have been developed, including hierarchical and non-hierarchical algorithms. In gene profiling problems, because of the large number of genes and the complexity of biological networks, dimensionality reduction techniques are critical exploratory tools for clustering analysis of gene expression data. Recently, clustering analysis that applies dimensionality reduction techniques has also been proposed. PCA (principal component analysis) is a popular dimensionality reduction method for clustering problems. However, previous studies analyzed the performance of PCA only on full data sets. In this paper, to evaluate the performance of PCA for clustering analysis specifically and robustly, we apply an improved FCBF (fast correlation-based filter), a feature selection method, to supervised clustering data sets and employ two well-known clustering algorithms: k-means and k-medoids. Computational results on the supervised data sets show that PCA performs very poorly when the number of features is large.
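
The kind of comparison this abstract describes (clustering on all features versus clustering on a few principal components) can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the synthetic data, the purity score, and the helper names `pca_project`, `kmeans`, and `purity` are assumptions made for the example, and the FCBF feature selection and k-medoids steps are omitted.

```python
import numpy as np

def pca_project(X, n_components):
    """Project rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns a cluster label for every row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def purity(labels, truth):
    """Fraction of points that belong to the majority class of their cluster."""
    total = 0
    for j in np.unique(labels):
        total += np.bincount(truth[labels == j]).max()
    return total / len(truth)

# Hypothetical data: two informative features plus 50 pure-noise features.
rng = np.random.default_rng(42)
truth = np.repeat([0, 1], 100)
signal = rng.normal(size=(200, 2)) + 6.0 * truth[:, None]
noise = rng.normal(size=(200, 50))
X = np.hstack([signal, noise])

labels_full = kmeans(X, k=2)
labels_pca = kmeans(pca_project(X, n_components=2), k=2)
print("purity, all features:", purity(labels_full, truth))
print("purity, 2 principal components:", purity(labels_pca, truth))
```

Because the class labels are known, a supervised score such as purity can be used to judge the clustering, which mirrors the abstract's use of supervised data sets.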

An Implementation of Story Path Recommendation System of Interactive Drama Using PCA and NMF

  • Lee, Yeon-Chang; Jang, Jae-Hee; Kim, Myung-Gwan
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.12 no.4 / pp.95-102 / 2012
  • An interactive drama is a story that requires the user's free choice and participation. In this study, we capture users' preferences by building training data from the characters of an interactive drama. Furthermore, we describe the implementation of a system that recommends to new users the story paths that match their preferences. We used PCA and NMF to extract the characteristics of user preference. The recommendation success rate was 75% with PCA and 62.5% with NMF.
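
As a rough illustration of extracting preference features with PCA and NMF, the sketch below factorizes a small user-by-choice matrix both ways. Everything in it is assumed for the example: the matrix `V` is invented, `pca_scores` and `nmf` are hypothetical helper names, and the NMF uses the standard multiplicative-update rule for the Frobenius objective, which may differ from what the authors implemented.

```python
import numpy as np

def pca_scores(V, n_components):
    """Coordinates of each row of V on the top principal components."""
    Vc = V - V.mean(axis=0)
    _, _, Vt = np.linalg.svd(Vc, full_matrices=False)
    return Vc @ Vt[:n_components].T

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Non-negative factorization V ~ W @ H via multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical training data: 6 users x 5 story choices (higher = stronger preference).
V = np.array([
    [5, 4, 0, 1, 0],
    [4, 5, 1, 0, 0],
    [5, 5, 0, 0, 1],
    [0, 1, 5, 4, 4],
    [1, 0, 4, 5, 5],
    [0, 0, 5, 5, 4],
], dtype=float)

W, H = nmf(V, rank=2)
print("NMF user features:\n", W.round(2))
print("PCA user features:\n", pca_scores(V, 2).round(2))
print("NMF reconstruction error:", np.linalg.norm(V - W @ H))
```

Either set of per-user features could then feed a nearest-neighbour style recommender; that step is left out here because the abstract does not describe it.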

HisCoM-mimi: software for hierarchical structural component analysis for miRNA-mRNA integration model for binary phenotypes

  • Kim, Yongkang; Park, Taesung
    • Genomics & Informatics / v.17 no.1 / pp.10.1-10.3 / 2019
  • To identify miRNA-mRNA interaction pairs associated with binary phenotypes, we propose a hierarchical structural component model for miRNA-mRNA integration (HisCoM-mimi). Information on known mRNA targets provided by TargetScan is used to perform HisCoM-mimi. However, multiple databases can be used to find miRNA-mRNA signatures with known biological information through different algorithms. To take these additional databases into account, we present our advanced application software for HisCoM-mimi for binary phenotypes. The proposed HisCoM-mimi supports both TargetScan and miRTarBase, which provides manually verified information initially gathered by text-mining the literature. By integrating information from miRTarBase into HisCoM-mimi, a broad range of target information derived from the research literature can be analyzed. Another improvement of the new HisCoM-mimi approach is the inclusion of updated algorithms that provide the lasso and elastic-net penalties for users who want to fit a model with a smaller number of selected miRNAs and mRNAs. We expect that our HisCoM-mimi software will make advanced methods accessible to researchers who want to identify miRNA-mRNA interaction pairs related to binary phenotypes.

Lp (p ≥ 1) SOLUTIONS OF MULTIDIMENSIONAL BSDES WITH TIME-VARYING QUASI-HÖLDER CONTINUITY GENERATORS IN GENERAL TIME INTERVALS

  • Lishun, Xiao; Shengjun, Fan
    • Communications of the Korean Mathematical Society / v.35 no.2 / pp.667-684 / 2020
  • The objective of this paper is to solve multidimensional backward stochastic differential equations on general time intervals in the Lp (p ≥ 1) sense, where the generator g satisfies a time-varying Osgood condition in y and a time-varying quasi-Hölder continuity condition in z, and its ith component depends on the ith row of z. Our result strengthens some existing works even in the case of finite time intervals.

Dynamic Simulation and Modelling of the Switched Reluctance Motor

  • Lee Ju-Hyun; Chen Hao; Ahn Jin-Woo
    • Proceedings of the KIPE Conference / 2004.07b / pp.922-925 / 2004
  • This paper presents the components of a switched reluctance motor drive system and their models, under angle position-current chopping control and under fixed-angle pulse width modulation control. The parameter calculations and the simulation models, built with the MATLAB SIMULINK software package, are introduced using a four-phase 8/6 prototype with a four-phase asymmetric bridge power converter. The start-up of the prototype is simulated with these models under the different control strategies and at different given rotor speeds.

Variable Arrangement for Data Visualization

  • Huh, Moon Yul; Song, Kwang Ryeol
    • Communications for Statistical Applications and Methods / v.8 no.3 / pp.643-650 / 2001
  • Some classical plots, such as scatterplot matrices and parallel coordinates, are valuable tools for data visualization. These tools are used extensively in modern data mining software to explore the inherent data structure, and hence to visually classify or cluster the database into appropriate groups. However, the interpretation of these plots is very sensitive to the arrangement of the variables. In this work, we introduce two methods for arranging the variables for data visualization. The first method is based on the work of Wegman (1999) and arranges the variables using the minimum distance among all pairwise permutations of the variables. The second method uses the idea of principal components. We investigate the effectiveness of these methods with parallel coordinates on real data sets, and show that each of the two proposed methods has its own strengths in different respects.
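
The abstract does not spell out either arrangement rule, so the sketch below only illustrates the general idea of ordering variables before drawing parallel coordinates. Both functions are assumptions made for this example rather than the authors' methods: `order_by_greedy_distance` chains variables by the distance 1 - |correlation|, and `order_by_first_pc` sorts variables by their loading on the first principal component of the correlation matrix.

```python
import numpy as np

def order_by_greedy_distance(X):
    """Greedy chain: start at column 0 and repeatedly append the unused
    column closest to the last one, using distance 1 - |correlation|."""
    dist = 1.0 - np.abs(np.corrcoef(X, rowvar=False))
    order = [0]
    remaining = set(range(1, X.shape[1]))
    while remaining:
        nxt = min(remaining, key=lambda j: dist[order[-1], j])
        order.append(nxt)
        remaining.remove(nxt)
    return order

def order_by_first_pc(X):
    """Sort columns by their loading on the first principal component
    of the correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    _, eigvecs = np.linalg.eigh(corr)   # eigenvalues come back in ascending order
    first_pc = eigvecs[:, -1]           # eigenvector of the largest eigenvalue
    return [int(j) for j in np.argsort(first_pc)]

# Hypothetical data: columns 0 and 2 are near-copies, as are columns 1 and 3.
rng = np.random.default_rng(0)
a = rng.normal(size=300)
b = rng.normal(size=300)
X = np.column_stack([
    a,
    b,
    a + 0.1 * rng.normal(size=300),
    b + 0.1 * rng.normal(size=300),
])

print("greedy order:", order_by_greedy_distance(X))
print("first-PC order:", order_by_first_pc(X))
```

In a parallel-coordinates plot, placing strongly related variables on adjacent axes tends to make cluster structure easier to see, which is the motivation for reordering in the first place.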

Business Performance Analysis System based on Knowledge Discovery in Databases

  • Jo, Seong-Hun; Jeong, Min-Yong
    • Journal of Korean Society of Industrial and Systems Engineering / v.23 no.57 / pp.11-20 / 2000
  • In a dynamic management environment, a CEO must make efficient decisions using IT (Information Technology)-based information and knowledge management systems. As a key component for coping with this trend, we propose a business performance analysis system based on KDD (Knowledge Discovery in Databases). The theoretical model combines Value-Added, from the stakeholders' perspective, with Economic Value-Added, from the shareholders' perspective. As the physical model, we use a DBMS and a data mining method based on Genetic Algorithms. To demonstrate the performance of the system, we analyze the domestic motor industry. The empirical case is based on financial data from the KISFAS (Korea Investors Services Financial Analysis System) database. The samples in the study cover H motors and S motors over the 16 years from 1981 to 1996.

Implementation of Management Performance Analysis System with KDD

  • An, Dong-Gyu; Jo, Seong-Hun
    • Proceedings of the Korea Digital Policy Society Conference / 2004.05a / pp.575-592 / 2004
  • In the modern dynamic management environment, there is growing recognition that information and knowledge management systems are essential for a CEO's efficient and effective decision making. As a key component for coping with this trend, we propose a management performance analysis system based on Knowledge Discovery in Databases (KDD). The system measures management performance using both VA (Value-Added), which represents the stakeholders' point of view, and EVA (Economic Value-Added), which represents the shareholders' point of view. The relationship between management performance and some 80 financial ratios is analyzed, and the important financial ratios are then identified. In analyzing the relationship, we applied a KDD process that includes multidimensional cubes, OLAP (On-Line Analytical Processing), data mining, and AHP (Analytic Hierarchy Process). To demonstrate the performance of the system, we conducted a case study using financial data on the Korean automobile industry over the 16 years from 1981 to 1996, taken from the KISFAS (Korea Investors Services Financial Analysis System) database.

Hybrid Neural Networks for Intrusion Detection System

  • Jirapummin, Chaivat; Kanthamanon, Prasert
    • Proceedings of the IEEK Conference / 2002.07b / pp.928-931 / 2002
  • A network-based intrusion detection system is a computer network security tool. In this paper, we present an intrusion detection system based on Self-Organizing Maps (SOM) and a Resilient Propagation Neural Network (RPROP) for visualizing and classifying intrusion and normal patterns. We introduce a cluster matching equation for finding principal associated components in component planes. We use data from the Third International Knowledge Discovery and Data Mining Tools Competition (KDD Cup '99) to train and test our prototype. In our experiments with different network data, our scheme achieves a detection rate of more than 90 percent and a false alarm rate of less than 5 percent for one SYN flooding attack type and two port scanning attack types.
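
The two figures quoted in this abstract are usually computed from a confusion matrix. The sketch below assumes the common definitions (detection rate = TP / (TP + FN), false alarm rate = FP / (FP + TN)); the paper may define them differently, and the function name `detection_metrics` and the sample labels are made up for the example.

```python
def detection_metrics(y_true, y_pred):
    """Return (detection_rate, false_alarm_rate) for binary labels,
    where 1 marks an intrusion and 0 marks normal traffic."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    # Guard against empty classes so the function never divides by zero.
    detection_rate = tp / (tp + fn) if (tp + fn) else float("nan")
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else float("nan")
    return detection_rate, false_alarm_rate

# Hypothetical labels: 4 attack records followed by 6 normal records.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

dr, far = detection_metrics(y_true, y_pred)
print(f"detection rate: {dr:.2%}, false alarm rate: {far:.2%}")
```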

Thermodynamic Study of Liquid Pb-Bi, Pb-Na, Bi-Na Binaries and Pb-Bi-Na Ternary Solutions

  • Koh, Chang-Shik
    • Journal of the Korean Chemical Society / v.6 no.2 / pp.133-142 / 1962
  • This study was carried out to investigate the lead-bismuth-sodium ternary system, which is the basis of the Dittmer method, as part of "the fundamental study of pyrometallurgical debismuthizing of lead". The thermodynamic properties of the liquid Pb-Bi and Pb-Na binaries, as well as of the liquid Pb-Bi-Na ternary solution, were measured from the e.m.f. of concentration cells, and those of each component were also determined. Furthermore, iso-activity lines covering the Pb-rich compositions of the Pb-Bi-Na ternary solution were determined. The relationship between these thermodynamic characteristics and the tendency toward intermetallic compound formation is discussed on the basis of these experiments.
