• Title/Summary/Keyword: dimensionality

Search Result 559, Processing Time 0.031 seconds

Abnormality Detection to Non-linear Multivariate Process Using Supervised Learning Methods (지도학습기법을 이용한 비선형 다변량 공정의 비정상 상태 탐지)

  • Son, Young-Tae;Yun, Deok-Kyun
    • IE interfaces / v.24 no.1 / pp.8-14 / 2011
  • Principal Component Analysis (PCA) reduces the dimensionality of a process by creating a new set of variables, principal components (PCs), which attempt to reflect the true underlying process dimension. However, for highly nonlinear processes, this form of monitoring may not be efficient, since the process dimensionality cannot be represented by a small number of PCs. Examples include semiconductor, pharmaceutical, and chemical processes. Nonlinearly correlated process variables can be reduced to a set of nonlinear principal components through Kernel Principal Component Analysis (KPCA). Support Vector Data Description (SVDD), which has its roots in supervised learning theory, is a training algorithm based on structural risk minimization. Its control limit does not depend on the distribution but adapts to the real data. This paper therefore proposes a nonlinear process monitoring technique based on supervised learning methods and KPCA. Simulated examples show that the proposed monitoring chart is more effective than the $T^2$ chart for nonlinear processes.
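
For orientation, the two monitoring statistics contrasted in the abstract above can be written in their usual textbook forms. These are standard definitions and not necessarily the exact formulation used in the paper; the symbols $\bar{x}$, $S$, $\alpha_i$, $k(\cdot,\cdot)$, and $R$ are notation assumed here.

```latex
% Hotelling T^2 for an observation x, with in-control sample mean \bar{x}
% and sample covariance S:
T^2 = (x - \bar{x})^{\top} S^{-1} (x - \bar{x})

% SVDD: squared kernel distance of a new point z from the center of the
% description learned on training points x_i with multipliers \alpha_i:
d^2(z) = k(z, z) - 2 \sum_{i} \alpha_i \, k(x_i, z)
         + \sum_{i} \sum_{j} \alpha_i \alpha_j \, k(x_i, x_j)

% z is flagged as abnormal when d^2(z) > R^2, where R is the radius of the
% learned boundary, so the limit follows the data, not an assumed distribution.
```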

Investigation of gene-gene interactions of clock genes for chronotype in a healthy Korean population

  • Park, Mira;Kim, Soon Ae;Shin, Jieun;Joo, Eun-Jeong
    • Genomics & Informatics / v.18 no.4 / pp.38.1-38.9 / 2020
  • Chronotype is an important moderator of psychiatric illnesses and seems to be controlled in part by genetic factors. Clock genes are the most relevant genes for chronotype. In addition to the roles of individual genes, gene-gene interactions of clock genes substantially contribute to chronotype. We investigated genetic associations and gene-gene interactions of the clock genes BHLHB2, CLOCK, CSNK1E, NR1D1, PER1, PER2, PER3, and TIMELESS for chronotype in 1,293 healthy Korean individuals. Regression analysis was conducted to find associations between single nucleotide polymorphisms (SNPs) and chronotype. For gene-gene interaction analyses, the quantitative multifactor dimensionality reduction (QMDR) method, a nonparametric model-free method for quantitative phenotypes, was performed. No individual SNP or haplotype showed a significant association with chronotype in either the regression analysis or the single-locus QMDR model. QMDR analysis identified NR1D1 rs2314339 and TIMELESS rs4630333 as the best SNP pair among two-locus interaction models associated with chronotype (cross-validation consistency [CVC] = 8/10, p = 0.041). For the three-locus interaction model, the SNP combination of NR1D1 rs2314339, TIMELESS rs4630333, and PER3 rs228669 showed the best results (CVC = 4/10, p < 0.001). However, because the mean differences between genotype combinations were minor, the clinical roles of clock gene interactions are unlikely to be critical.
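
The core dimensionality-reduction step of QMDR can be illustrated roughly as follows: each multilocus genotype combination is labeled "high" or "low" depending on whether its mean phenotype exceeds the overall mean, which collapses many genotype cells into one two-level attribute. This is a minimal sketch of that general idea only; the function name `qmdr_labels` and the toy data are hypothetical, and the cross-validation and test statistic used in the study are omitted.

```python
from collections import defaultdict
from statistics import mean


def qmdr_labels(genotypes, phenotype):
    """Label each genotype combination 'high' or 'low' by its mean phenotype.

    genotypes: list of tuples, one multilocus genotype per individual.
    phenotype: list of quantitative trait values, same order.
    """
    overall = mean(phenotype)
    cells = defaultdict(list)
    for g, y in zip(genotypes, phenotype):
        cells[g].append(y)
    # A cell is 'high' when its mean is above the overall mean.
    return {g: ("high" if mean(ys) > overall else "low") for g, ys in cells.items()}


# Toy two-locus data (hypothetical, not from the study).
toy_genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 1), (1, 1)]
toy_phenotype = [10, 12, 20, 22, 14, 30]
labels = qmdr_labels(toy_genotypes, toy_phenotype)
```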

Classification of Imbalanced Data Based on MTS-CBPSO Method: A Case Study of Financial Distress Prediction

  • Gu, Yuping;Cheng, Longsheng;Chang, Zhipeng
    • Journal of Information Processing Systems / v.15 no.3 / pp.682-693 / 2019
  • Traditional classification methods mostly assume that the class distribution of the data is balanced, whereas imbalanced data are common in the real world, so solving the classification problem for imbalanced data is important. In the Mahalanobis-Taguchi system (MTS) algorithm, the classification model is constructed from a reference space and a measurement scale that come from a single normal group, which makes it suitable for handling imbalanced data. In this paper, an improved method, MTS-CBPSO, is constructed by introducing chaotic mapping and the binary particle swarm optimization algorithm in place of the orthogonal array and signal-to-noise ratio (SNR) to select the valid variables, with G-means, F-measure, and dimensionality reduction as the classification optimization targets. The proposed method is also applied to the financial distress prediction of Chinese listed companies. Compared with the traditional MTS and common classification methods such as SVM, C4.5, and k-NN, the MTS-CBPSO method is shown to achieve better prediction accuracy and dimensionality reduction.
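
The reason a single normal group suffices can be seen in a small sketch of the scaled Mahalanobis distance that MTS uses as its measurement scale. This is an illustration for two variables only, assuming population statistics; the names `fit_reference` and `scaled_md` and the sample values are hypothetical, and the CBPSO variable selection from the paper is not shown.

```python
from statistics import mean, pstdev


def fit_reference(normal):
    """Build the reference space from the normal group (list of (x1, x2))."""
    cols = list(zip(*normal))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) for c in cols]
    z = [[(v - m) / s for v, m, s in zip(row, mu, sd)] for row in normal]
    r = mean([a * b for a, b in z])  # correlation of the standardized variables
    return mu, sd, r


def scaled_md(x, ref):
    """Mahalanobis distance of x from the reference space, divided by k = 2."""
    mu, sd, r = ref
    z1, z2 = [(v - m) / s for v, m, s in zip(x, mu, sd)]
    # Inverse of [[1, r], [r, 1]] is 1 / (1 - r^2) * [[1, -r], [-r, 1]].
    q = (z1 * z1 - 2 * r * z1 * z2 + z2 * z2) / (1 - r * r)
    return q / 2


# Hypothetical normal group; no abnormal samples are needed to fit the scale.
normal_group = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6)]
ref = fit_reference(normal_group)
normal_mds = [scaled_md(x, ref) for x in normal_group]
abnormal_md = scaled_md((5, 1), ref)  # breaks the positive correlation
```

With this scaling the normal group averages a distance of 1, so an observation far above 1 is a candidate abnormal case.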

Wake dynamics of a 3D curved cylinder in oblique flows

  • Lee, Soonhyun;Paik, Kwang-Jun;Srinil, Narakorn
    • International Journal of Naval Architecture and Ocean Engineering / v.12 no.1 / pp.501-517 / 2020
  • Three-dimensional numerical simulations were performed to study the effects of flow direction and flow velocity on the flow regime behind a curved pipe represented by a curved circular cylinder. The cylinder is based on a previous study and consists of a quarter segment of a ring and a horizontal part at the end of the ring. The cylinder was rotated in the computational domain to examine five incident flow angles from 0° to 180° at 45° intervals, at Reynolds numbers of 100 and 500. The detailed wake topologies, represented by the λ2 criterion, were captured using a Large Eddy Simulation (LES). The curved cylinder leads to different flow regimes along the span, which shows the three-dimensionality of the wake field. At a Reynolds number of 100, the shedding was suppressed after a flow angle of 135°, and oblique flow was observed at 90°. At a Reynolds number of 500, vortex dislocation was detected at 90° and 135°. These observations are in good agreement with the three-dimensionality of the wake field that arose due to the curved shape.

Centroid and Nearest Neighbor based Class Imbalance Reduction with Relevant Feature Selection using Ant Colony Optimization for Software Defect Prediction

  • B., Kiran Kumar;Gyani, Jayadev;Y., Bhavani;P., Ganesh Reddy;T, Nagasai Anjani Kumar
    • International Journal of Computer Science & Network Security / v.22 no.10 / pp.1-10 / 2022
  • Software defect prediction (SDP) is currently one of the most active research areas in software engineering. Early detection of defects lowers the cost of software and also improves reliability. Machine learning techniques are widely used to create SDP models based on programming measures. The majority of defect prediction models in the literature have problems with class imbalance and high dimensionality. In this paper, we propose the Centroid and Nearest Neighbor based Class Imbalance Reduction (CNNCIR) technique, which considers dataset distribution characteristics to generate symmetry between defective and non-defective records in imbalanced datasets. The proposed approach is compared with SMOTE (Synthetic Minority Oversampling Technique). The high-dimensionality problem is addressed with the Ant Colony Optimization (ACO) technique by choosing relevant features. We used nine different classifiers to analyze six open-source software defect datasets from the PROMISE repository, and seven performance measures were used to evaluate them. The results show that the proposed CNNCIR method with ACO-based feature selection outperforms SMOTE in the majority of cases.
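
As background for the comparison above, the SMOTE baseline creates synthetic minority records by interpolating between a minority sample and one of its nearest minority neighbors. The sketch below shows only that generic interpolation step, not CNNCIR, whose details the abstract does not give; the helper names, the parameter defaults, and the sample points are assumptions for illustration.

```python
import random


def nearest_neighbors(points, i, k):
    """Indices of the k points closest to points[i] (squared Euclidean distance)."""
    def d2(j):
        return sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
    others = [j for j in range(len(points)) if j != i]
    return sorted(others, key=d2)[:k]


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating toward a near neighbor."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        j = rng.choice(nearest_neighbors(minority, i, k))
        u = rng.random()  # position along the segment from point i to point j
        synthetic.append(
            tuple(a + u * (b - a) for a, b in zip(minority[i], minority[j]))
        )
    return synthetic


# Hypothetical minority-class (defective) records with two features.
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.2), (3.0, 3.0)]
new_points = smote(minority, n_new=4)
```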

A Study of the Style Type and Formative Properties of Short Front and Long Back Skirts in the Early Joseon Dynasty (조선 전기 전단후장형 치마의 스타일 유형과 조형적 특성 연구)

  • Yi Ji Hwang;Sohee Kim
    • Journal of the Korean Society of Clothing and Textiles / v.47 no.2 / pp.215-231 / 2023
  • This study classifies short front long back skirts from the Joseon Dynasty by style type, identifies their formative characteristics based on their external morphological properties and internal composition, and examines their correlation with Korean thought. A literature review and empirical research were conducted for this study. The style of short front long back skirts is classified as inverted "b"-shaped, lower lip, wavy, trapezoid with a raised center hem, or half-circle. As such, this skirt possesses the formative properties of imbalance, variability of shape, intentional three-dimensionality, and confluence. In other words, with an imbalance resulting from the difference in length between the front and back, these skirts are characterized by variability in shape created by intentional three-dimensionality expressed as intentional three-dimensional beauty, the confluence of planes and dimensions, as well as of materials and colors. These properties are correlated with Korean ways of viewing the world. This study contributes to the development of Korean designs.

Exploration of the dimensionality of Iran's trade show performance and application of R-IPA (이란 전시회 성과요인 탐색 및 무역박람회에 수정된 중요도-성취도분석 (R-IPA) 적용 방안)

  • Yoon-say Jeong
    • Korea Trade Review / v.45 no.4 / pp.45-63 / 2020
  • This study aims to identify the dimensions of trade show performance in Iranian trade shows and apply the revised importance-performance analysis. The IPA method integrates two types of indirect importance and a composite I-P mapping using the traditional four quadrants as well as a diagonal line on a two-dimensional grid. Based on the analysis results, this study presents several suggestions to contribute to the development of the trade show industry. First, it is noted that the dimensionality of trade show performance in a developing country context can differ from that reported in prior literature. Given the different industry development stages of show-hosting countries, it is necessary to examine the dimensions of each trade show's performance and make every effort to derive proper implications for exhibitors. Second, the use of statistically derived importance is recommended, considering respondents' convenience, to reduce their time and fatigue when collecting data at busy booths. Further, applying composite I-P mapping is suggested as an effective diagnostic tool to provide optimal trade show strategies for exhibitors in a dynamic and ever-changing global business environment.

An Efficient Bitmap Indexing Method for Multimedia Data Reflecting the Characteristics of MPEG-7 Visual Descriptors (MPEG-7 시각 정보 기술자의 특성을 반영한 효율적인 멀티미디어 데이타 비트맵 인덱싱 방법)

  • Jeong Jinguk;Nang Jongho
    • Journal of KIISE:Computer Systems and Theory / v.32 no.1 / pp.9-20 / 2005
  • Recently, the MPEG-7 standard, a multimedia content description standard, has been widely used for content-based image/video retrieval systems. However, since the descriptors standardized in MPEG-7 are usually multidimensional and suffer from the so-called 'curse of dimensionality', previously proposed indexing methods (for example, multidimensional indexing methods, dimensionality reduction methods, and filtering methods) cannot be used to effectively index a multimedia database represented in MPEG-7. This paper proposes an efficient multimedia data indexing mechanism reflecting the characteristics of MPEG-7 visual descriptors. In the proposed indexing mechanism, the descriptor is transformed into a histogram of some attributes. By representing the value of each bin as a binary number, the histogram itself, which is a visual descriptor for an object in the multimedia database, can be represented as a bit string. The bit strings for all objects in the multimedia database are collected to form an index file, the bitmap index. By XORing them with the descriptor of the query object, the candidate solutions for similarity search can be computed easily, and they are then checked again against the query object to compute the similarity precisely with an exact metric such as the L1-norm. These indexing and searching mechanisms are efficient because the filtering process is performed by simple bit operations and reduces the search space dramatically. In experiments with more than 100,000 real images, the proposed indexing and searching mechanisms were about 15 times faster than sequential search, with more than 90% accuracy.
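
The filter-then-refine idea described above can be sketched as follows: each histogram is packed into a bit string, an XOR plus bit count cheaply screens candidates, and only the survivors are ranked by the exact L1 distance. This is a minimal sketch under assumptions: the 2-bit-per-bin encoding, the `max_bits` threshold, the function names, and the toy histograms are all hypothetical, and the paper's actual encoding may differ. The bit-level screen is approximate, so it can discard some true neighbors.

```python
def encode(hist, bits_per_bin=2):
    """Pack quantized bin values into a single integer used as a bit string."""
    code = 0
    for v in hist:
        if not 0 <= v < (1 << bits_per_bin):
            raise ValueError("bin value does not fit in bits_per_bin bits")
        code = (code << bits_per_bin) | v
    return code


def differing_bits(a, b):
    """Number of bit positions where two bit strings differ (XOR, then count)."""
    return bin(a ^ b).count("1")


def l1(h1, h2):
    """Exact L1 distance between two histograms."""
    return sum(abs(x - y) for x, y in zip(h1, h2))


def search(query, database, max_bits, top_k):
    """Filter with the bitmap index, then rank survivors by exact L1 distance."""
    q = encode(query)
    index = [encode(h) for h in database]  # the bitmap index
    candidates = [i for i, c in enumerate(index) if differing_bits(q, c) <= max_bits]
    return sorted(candidates, key=lambda i: l1(query, database[i]))[:top_k]


# Toy 4-bin histograms with values in 0..3 (hypothetical data).
database = [[0, 1, 2, 3], [0, 1, 2, 2], [3, 3, 0, 0], [1, 1, 2, 3]]
query = [0, 1, 2, 3]
result = search(query, database, max_bits=2, top_k=2)
```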

Simulation Study on E-commerce Recommender System by Use of LSI Method (LSI 기법을 이용한 전자상거래 추천자 시스템의 시뮬레이션 분석)

  • Kwon, Chi-Myung
    • Journal of the Korea Society for Simulation / v.15 no.3 / pp.23-30 / 2006
  • A recommender system for an e-commerce site receives information from customers about which products they are interested in and recommends products that are likely to fit their needs. In this paper, we investigate several methods applied to large-scale product purchase data for the purpose of producing useful recommendations for customers. We apply the traditional data mining techniques of cluster analysis and collaborative filtering (CF), as well as CF with reduction of product dimensionality by latent semantic indexing (LSI). If the reduced product dimensionality obtained from LSI shows a latent trend in customers' purchases similar to that of the original customer-product purchase data, we expect that the smaller computational effort needed to find the nearest neighbors of a target customer may improve the efficiency of recommendation. In simulation experiments on synthetic customer-product purchase data, the CF-based method with reduced product dimensionality performs better than the traditional CF methods with respect to recall, precision, and the F1 measure. In general, recommendation quality increases as the size of the neighborhood increases. However, our simulation results show that, after a certain point, the improvement diminishes. We also find that, as the number of recommended products increases, precision becomes worse, while the gain in recall is relatively small after a certain point. We consider that this information may be useful in applying recommender systems.
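
The precision/recall trade-off noted at the end of the abstract can be reproduced on a toy example. The sketch below computes the three evaluation measures named above for a top-N recommendation list; the function name, the ranked list, and the purchase set are hypothetical and unrelated to the paper's simulation data.

```python
def precision_recall_f1(recommended, purchased):
    """Precision, recall, and F1 of a recommendation list against actual purchases."""
    hits = len(set(recommended) & set(purchased))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(purchased) if purchased else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical ranked recommendations and the products actually bought.
ranked = ["a", "b", "c", "d", "e", "f"]
purchased = {"a", "c"}

top2 = precision_recall_f1(ranked[:2], purchased)
top4 = precision_recall_f1(ranked[:4], purchased)
top6 = precision_recall_f1(ranked[:6], purchased)
```

In this toy case recall already reaches its maximum at N = 4, so recommending more products beyond that point only lowers precision.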

On Optimizing LDA-extentions Using a Pre-Clustering (사전 클러스터링을 이용한 LDA-확장법들의 최적화)

  • Kim, Sang-Woon;Koo, Byum-Yong;Choi, Woo-Young
    • Journal of the Institute of Electronics Engineers of Korea CI / v.44 no.3 / pp.98-107 / 2007
  • For high-dimensional pattern recognition, such as face classification, a small number of training samples leads to the Small Sample Size problem when the number of pattern samples is smaller than the dimensionality. Recently, various LDA-extensions have been developed, including LDA, PCA+LDA, and Direct-LDA, to address the problem. This paper proposes a method of improving the classification efficiency by increasing the number of (sub-)classes through pre-clustering the training set prior to the execution of Direct-LDA. In LDA (or Direct-LDA), the number of classes in the training set limits the dimensionality of the reduced space, so the number of classes is increased to the number of sub-classes obtained through clustering in order to improve the classification performance of LDA-extensions. In other words, the eigenspace of the training set consists of the range space and the null space, and the dimensionality of the range space increases as the number of classes increases. Therefore, when constructing the transformation matrix, minimizing the null space minimizes the loss of discriminative information resulting from this space. Experimental results on artificial X-OR sample data as well as the benchmark AT&T and Yale face databases demonstrate that the proposed method improves classification efficiency.
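
The limit that the number of classes places on the reduced dimensionality follows from the rank of the between-class scatter matrix. The short derivation below uses the standard LDA definitions; the symbols $C$, $n_c$, $\mu_c$, $\mu$, $S_B$, and $C'$ are notation assumed here, not taken from the paper.

```latex
% Between-class scatter for C classes with sizes n_c, class means \mu_c,
% and overall mean \mu:
S_B = \sum_{c=1}^{C} n_c \, (\mu_c - \mu)(\mu_c - \mu)^{\top}

% The C vectors (\mu_c - \mu) are linearly dependent, because
\sum_{c=1}^{C} n_c \, (\mu_c - \mu) = 0 ,

% so they span at most C - 1 dimensions, and therefore
\operatorname{rank}(S_B) \le C - 1 .

% Splitting the classes into C' > C sub-classes by pre-clustering raises
% this bound to C' - 1, which enlarges the range space available to LDA.
```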