• Title/Abstract/Keywords: Dimensionality

570 search results (processing time: 0.024 s)

Abnormality Detection to Non-linear Multivariate Process Using Supervised Learning Methods

  • 손영태;윤덕균
    • 산업공학
    • /
    • Vol. 24, No. 1
    • /
    • pp.8-14
    • /
    • 2011
  • Principal Component Analysis (PCA) reduces the dimensionality of a process by creating a new set of variables, the principal components (PCs), which attempt to reflect the true underlying process dimension. For highly nonlinear processes, however, this form of monitoring may not be efficient, because the process dimensionality cannot be represented by a small number of PCs. Examples include semiconductor, pharmaceutical, and chemical processes. Nonlinearly correlated process variables can be reduced to a set of nonlinear principal components through Kernel Principal Component Analysis (KPCA). Support Vector Data Description (SVDD), which has its roots in supervised learning theory, is a training algorithm based on structural risk minimization. Its control limit does not depend on the distribution but adapts to the real data. This paper therefore proposes a nonlinear process monitoring technique based on supervised learning methods and KPCA. Simulated examples show that the proposed monitoring chart is more effective than the $T^2$ chart for nonlinear processes.
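The idea of a control limit that "adapts to the real data" can be illustrated with a much simpler stand-in than the paper's method. The sketch below is neither KPCA nor a trained SVDD: it scores a point by its squared distance to the mean of the in-control data in an RBF kernel feature space (SVDD would optimize the centre and radius instead) and takes the limit as an empirical quantile of the in-control scores. The kernel width `gamma`, the quantile, and the ring-shaped toy data are all assumptions chosen for illustration.

```python
import math

def rbf(a, b, gamma=5.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def fit_monitor(train, gamma=5.0, quantile=0.95):
    """Return a scoring function and a data-driven control limit."""
    n = len(train)
    # Mean of all pairwise kernel values = squared norm of the feature-space mean.
    mean_kk = sum(rbf(a, b, gamma) for a in train for b in train) / n ** 2

    def score(z):
        # Squared feature-space distance from phi(z) to the mean of the training data.
        cross = sum(rbf(z, x, gamma) for x in train) / n
        return rbf(z, z, gamma) - 2.0 * cross + mean_kk

    # Empirical quantile of the in-control scores: no distributional assumption.
    scores = sorted(score(x) for x in train)
    limit = scores[min(n - 1, int(quantile * n))]
    return score, limit

# Toy in-control data on a ring: a nonlinear structure that a linear
# T^2 ellipse cannot separate from the ring's own centre.
ring = [(math.cos(2 * math.pi * i / 40), math.sin(2 * math.pi * i / 40))
        for i in range(40)]
score, limit = fit_monitor(ring)

for name, z in [("on the ring", (1.0, 0.0)),
                ("ring centre", (0.0, 0.0)),
                ("far away", (3.0, 3.0))]:
    print(f"{name}: score={score(z):.3f}, limit={limit:.3f}")
```

With these settings the ring centre scores above the limit even though it coincides with the sample mean, which is the kind of fault a chart built on a linear, mean-centred statistic would tend to miss.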

Investigation of gene-gene interactions of clock genes for chronotype in a healthy Korean population

  • Park, Mira;Kim, Soon Ae;Shin, Jieun;Joo, Eun-Jeong
    • Genomics & Informatics
    • /
    • Vol. 18, No. 4
    • /
    • pp.38.1-38.9
    • /
    • 2020
  • Chronotype is an important moderator of psychiatric illnesses and appears to be controlled in part by genetic factors. Clock genes are the most relevant genes for chronotype. In addition to the roles of individual genes, gene-gene interactions of clock genes substantially contribute to chronotype. We investigated genetic associations and gene-gene interactions of the clock genes BHLHB2, CLOCK, CSNK1E, NR1D1, PER1, PER2, PER3, and TIMELESS for chronotype in 1,293 healthy Korean individuals. Regression analysis was conducted to find associations between single nucleotide polymorphisms (SNPs) and chronotype. For gene-gene interaction analyses, the quantitative multifactor dimensionality reduction (QMDR) method, a nonparametric model-free method for quantitative phenotypes, was used. No individual SNP or haplotype showed a significant association with chronotype in either the regression analysis or the single-locus QMDR model. QMDR analysis identified NR1D1 rs2314339 and TIMELESS rs4630333 as the best SNP pair among two-locus interaction models associated with chronotype (cross-validation consistency [CVC] = 8/10, p = 0.041). For the three-locus interaction model, the SNP combination of NR1D1 rs2314339, TIMELESS rs4630333, and PER3 rs228669 showed the best results (CVC = 4/10, p < 0.001). However, because the mean differences between genotype combinations were minor, the clinical roles of clock gene interactions are unlikely to be critical.
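The core reduction step of QMDR can be sketched as follows: each multi-locus genotype cell is labelled "high" or "low" by comparing its mean trait value with the overall mean, and the pooled high and low groups are then compared with a t statistic. This is a minimal sketch under assumptions: it omits the cross-validation that produces the CVC values, it uses Welch's t statistic as one possible choice, and the genotype codes and trait values are made-up toy numbers, not data from the study.

```python
import math
from collections import defaultdict
from statistics import mean, variance

def qmdr_score(genotypes, trait):
    """genotypes: list of tuples (one code per SNP); trait: list of floats."""
    overall = mean(trait)
    cells = defaultdict(list)
    for g, y in zip(genotypes, trait):
        cells[g].append(y)
    # Collapse the multi-locus genotype table into a single binary attribute.
    high_cells = {g for g, ys in cells.items() if mean(ys) > overall}
    high = [y for g, y in zip(genotypes, trait) if g in high_cells]
    low = [y for g, y in zip(genotypes, trait) if g not in high_cells]
    # Welch's t statistic between the pooled high and low groups.
    se = math.sqrt(variance(high) / len(high) + variance(low) / len(low))
    return high_cells, (mean(high) - mean(low)) / se

# Toy two-SNP data with a purely interactive (XOR-like) effect.
genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1), (1, 1)]
trait = [5.0, 5.2, 6.1, 5.9, 6.0, 6.2, 5.1, 4.9]
high_cells, t_stat = qmdr_score(genotypes, trait)
print(sorted(high_cells), round(t_stat, 2))
```

In this toy table neither SNP shifts the mean on its own, yet the pair splits the samples cleanly, which is the kind of pattern a single-locus regression would not detect.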

Classification of Imbalanced Data Based on MTS-CBPSO Method: A Case Study of Financial Distress Prediction

  • Gu, Yuping;Cheng, Longsheng;Chang, Zhipeng
    • Journal of Information Processing Systems
    • /
    • Vol. 15, No. 3
    • /
    • pp.682-693
    • /
    • 2019
  • Traditional classification methods mostly assume that the class distribution of the data is balanced, whereas imbalanced data are widespread in the real world, so classification with imbalanced data is an important problem. In the Mahalanobis-Taguchi system (MTS) algorithm, the classification model is constructed from a reference space and a measurement scale derived from a single normal group, which makes it suitable for imbalanced data. In this paper, an improved method, MTS-CBPSO, is constructed by introducing chaotic mapping and the binary particle swarm optimization algorithm, instead of the orthogonal array and signal-to-noise ratio (SNR), to select the valid variables, with G-means, F-measure, and dimensionality reduction as the classification optimization targets. The proposed method is applied to the financial distress prediction of Chinese listed companies. Compared with the traditional MTS and common classification methods such as SVM, C4.5, and k-NN, the MTS-CBPSO method achieves better prediction accuracy and dimensionality reduction.
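The two imbalance-aware targets named in the abstract, G-means and F-measure, can be computed from a confusion matrix as sketched below. The counts are invented for illustration (10 distressed firms among 1,000), and the zero-division guards are a convention chosen for this sketch.

```python
import math

def imbalance_metrics(tp, fn, fp, tn):
    """Return (accuracy, G-mean, F-measure) for the positive (minority) class."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    g_mean = math.sqrt(sensitivity * specificity)
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return accuracy, g_mean, f_measure

# A classifier that always predicts "normal" looks accurate but detects nothing.
always_normal = imbalance_metrics(tp=0, fn=10, fp=0, tn=990)
# A classifier that finds 8 of the 10 minority cases at the cost of 50 false alarms.
detector = imbalance_metrics(tp=8, fn=2, fp=50, tn=940)
print(always_normal)
print(detector)
```

The first classifier has the higher accuracy and a G-mean of zero, which is why accuracy alone is a poor optimization target on imbalanced data.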

Wake dynamics of a 3D curved cylinder in oblique flows

  • Lee, Soonhyun;Paik, Kwang-Jun;Srinil, Narakorn
    • International Journal of Naval Architecture and Ocean Engineering
    • /
    • Vol. 12, No. 1
    • /
    • pp.501-517
    • /
    • 2020
  • Three-dimensional numerical simulations were performed to study the effects of flow direction and flow velocity on the flow regime behind a curved pipe represented by a curved circular cylinder. The cylinder is based on a previous study and consists of a quarter segment of a ring and a horizontal part at the end of the ring. The cylinder was rotated in the computational domain to examine five incident flow angles of 0-180° with 45° intervals at Reynolds numbers of 100 and 500. The detailed wake topologies represented by λ2 criterion were captured using a Large Eddy Simulation (LES). The curved cylinder leads to different flow regimes along the span, which shows the three-dimensionality of the wake field. At a Reynolds number of 100, the shedding was suppressed after flow angle of 135°, and oblique flow was observed at 90°. At a Reynolds number of 500, vortex dislocation was detected at 90° and 135°. These observations are in good agreement with the three-dimensionality of the wake field that arose due to the curved shape.

Centroid and Nearest Neighbor based Class Imbalance Reduction with Relevant Feature Selection using Ant Colony Optimization for Software Defect Prediction

  • B., Kiran Kumar;Gyani, Jayadev;Y., Bhavani;P., Ganesh Reddy;T, Nagasai Anjani Kumar
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 22, No. 10
    • /
    • pp.1-10
    • /
    • 2022
  • Software defect prediction (SDP) is currently one of the most active research areas in software engineering. Early detection of defects lowers the cost of the software and also improves reliability. Machine learning techniques are widely used to create SDP models based on programming measures. The majority of defect prediction models in the literature have problems with class imbalance and high dimensionality. In this paper, we propose the Centroid and Nearest Neighbor based Class Imbalance Reduction (CNNCIR) technique, which considers dataset distribution characteristics to generate symmetry between defective and non-defective records in imbalanced datasets. The proposed approach is compared with SMOTE (Synthetic Minority Oversampling Technique). The high-dimensionality problem is addressed by choosing relevant features with the Ant Colony Optimization (ACO) technique. We used nine different classifiers to analyze six open-source software defect datasets from the PROMISE repository, and seven performance measures are used to evaluate them. The results show that the proposed CNNCIR method with ACO-based feature selection outperforms SMOTE in the majority of cases.
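For orientation, the SMOTE baseline mentioned in the abstract can be sketched in a few lines: each synthetic record is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours. This is a sketch of the baseline only, not of CNNCIR, whose details the abstract does not give. The neighbour count `k`, the seed, and the sample points are illustrative assumptions.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # The k nearest minority neighbours of x (excluding x itself).
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        t = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(x, nb)))
    return synthetic

# Hypothetical defective-module records described by two code metrics.
defective = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_records = smote(defective, n_new=6)
print(new_records)
```

Because every synthetic point is a convex combination of two existing minority points, the new records stay inside the region the minority class already occupies.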

A Study of the Style Type and Formative Properties of Short Front and Long Back Skirts in the Early Joseon Dynasty

  • 황이지;김소희
    • 한국의류학회지
    • /
    • Vol. 47, No. 2
    • /
    • pp.215-231
    • /
    • 2023
  • This study classifies short front long back skirts from the Joseon Dynasty by style type, identifies their formative characteristics based on their external morphological properties and internal composition, and examines their correlation with Korean thought. A literature review and empirical research were conducted for this study. The style of short front long back skirts is classified as inverted "b"-shaped, lower lip, wavy, trapezoid with a raised center hem, or half-circle. As such, this skirt possesses the formative properties of imbalance, variability of shape, intentional three-dimensionality, and confluence. In other words, with an imbalance resulting from the difference in length between the front and back, these skirts are characterized by variability in shape created by intentional three-dimensionality expressed as intentional three-dimensional beauty, the confluence of planes and dimensions, as well as of materials and colors. These properties are correlated with Korean ways of viewing the world. This study contributes to the development of Korean designs.

Exploration of the dimensionality of Iran's trade show performance and application of R-IPA

  • 정윤세
    • 무역학회지
    • /
    • Vol. 45, No. 4
    • /
    • pp.45-63
    • /
    • 2020
  • This study aims to identify the dimensions of trade show performance in Iranian trade shows and to apply the revised importance-performance analysis (R-IPA). The IPA method integrates two types of indirect importance and a composite I-P mapping that uses the traditional four quadrants as well as a diagonal line on a two-dimensional grid. Based on the analysis results, this study presents several suggestions to contribute to the development of the trade show industry. First, the dimensionality of trade show performance in a developing-country context can differ from that reported in prior literature. Given the different industry development stages of show-hosting countries, it is necessary to examine the dimensions of each trade show's performance and to make every effort to derive proper implications for exhibitors. Second, the use of statistically derived importance is recommended, since it spares respondents time and fatigue when data are collected at busy booths. Further, applying composite I-P mapping is suggested as an effective diagnostic tool for providing optimal trade show strategies to exhibitors in a dynamic and ever-changing global business environment.
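The traditional four-quadrant grid and the diagonal line mentioned in the abstract can be sketched as below. This shows only the classical IPA mapping, with the grand means as crosshairs. It does not reproduce the study's revised method or its indirect importance measures. The attribute names and scores are hypothetical.

```python
from statistics import mean

def ipa_map(items):
    """items: dict of attribute name -> (importance, performance)."""
    imp_cut = mean(i for i, _ in items.values())
    perf_cut = mean(p for _, p in items.values())
    result = {}
    for name, (imp, perf) in items.items():
        if imp >= imp_cut:
            quadrant = "keep up the good work" if perf >= perf_cut else "concentrate here"
        else:
            quadrant = "possible overkill" if perf >= perf_cut else "low priority"
        # Diagonal rule: importance above performance signals an improvement gap.
        result[name] = (quadrant, imp > perf)
    return result

# Hypothetical exhibitor ratings on a 5-point scale.
ratings = {
    "sales leads": (4.6, 3.1),
    "brand exposure": (4.4, 4.5),
    "market research": (3.2, 4.2),
    "networking": (3.0, 2.9),
}
mapping = ipa_map(ratings)
for name, (quadrant, gap) in mapping.items():
    print(name, "|", quadrant, "| above diagonal:", gap)
```

The two rules can disagree: an attribute may fall in the "low priority" quadrant and still lie above the diagonal, which is one reason to read both views together.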

Writer verification using feature selection based on genetic algorithm: A case study on handwritten Bangla dataset

  • Jaya Paul;Kalpita Dutta;Anasua Sarkar;Kaushik Roy;Nibaran Das
    • ETRI Journal
    • /
    • Vol. 46, No. 4
    • /
    • pp.648-659
    • /
    • 2024
  • Author verification is challenging because of the diversity in writing styles. We propose an enhanced handwriting verification method that combines handcrafted and automatically extracted features. The method uses a genetic algorithm to reduce the dimensionality of the feature set. We consider offline Bangla handwriting content and evaluate the proposed method using handcrafted features with a simple logistic regression, radial basis function network, and sequential minimal optimization as well as automatically extracted features using a convolutional neural network. The handcrafted features outperform the automatically extracted ones, achieving an average verification accuracy of 94.54% for 100 writers. The handcrafted features include Radon transform, histogram of oriented gradients, local phase quantization, and local binary patterns from interwriter and intrawriter content. The genetic algorithm reduces the feature dimensionality and selects salient features using a support vector machine. The top five experimental results are obtained from the optimal feature set selected using a consensus strategy. Comparisons with other methods and features confirm the satisfactory results.
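Genetic-algorithm feature selection of the kind described here can be sketched with bit masks: each individual marks which features are kept, and fitness rewards useful features while penalizing the size of the subset. In the paper the fitness comes from a support vector machine. The `toy_fitness` function below is a hypothetical stand-in so that the sketch stays self-contained, and the population size, mutation rate, and set of "informative" features are assumptions.

```python
import random

def ga_select(n_features, fitness, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Return the best feature mask (list of 0/1) found by a simple GA."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        nxt = [pop[0][:], pop[1][:]]                    # elitism: keep the two best masks
        while len(nxt) < pop_size:
            a, b = rng.sample(pop[:pop_size // 2], 2)   # parents from the fitter half
            cut = rng.randrange(1, n_features)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # bit-flip mutation
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

# Hypothetical fitness: features 0, 3 and 7 carry signal; every kept feature costs 0.1.
INFORMATIVE = {0, 3, 7}

def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)

best = ga_select(10, toy_fitness)
print(best, round(toy_fitness(best), 2))
```

Replacing `toy_fitness` with a cross-validated classifier score would turn this into a wrapper method, at the cost of one model fit per fitness evaluation.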

Feature Engineering and Evaluation for Android Malware Detection Scheme

  • Jaemin Jung;Jihyeon Park;Seong-je Cho;Sangchul Han;Minkyu Park;Hsin-Hung Cho
    • Journal of Internet Technology
    • /
    • Vol. 22, No. 2
    • /
    • pp.423-439
    • /
    • 2021
  • Android is one of the most popular platforms for mobile and Internet of Things (IoT) devices. This popularity has made Android-based devices a valuable target of malicious apps. Thus, it is essential to devise automatic and portable malware detection approaches for the Android platform. There are many studies on detecting mobile malware using machine learning techniques. In these studies, however, the dataset is imbalanced or is not large enough to generalize the machine learning model, or the dimensionality of features is too high to apply nonlinear classifiers. In this article, we propose a machine learning-based Android malware detection scheme that uses API calls and permissions as features. To restrict the dimensionality of features, we propose minimal domain knowledge-based and Gini importance-based feature selection. We construct large and balanced real-world datasets to build a generalized and non-skewed model and verify our model through experiments. We achieve 96.51% classification accuracy using a Random Forest classifier with low overhead. In addition, we provide a detailed analysis of falsely classified samples. The analysis results show that API hiding can degrade the performance of API call information-based malware detection systems.
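Gini-based ranking of binary features such as "permission requested" or "API called" can be sketched as the impurity decrease of a single split on each feature. Random Forest Gini importance accumulates such decreases over many trees. The single-split version below is a deliberate simplification, and the six-app dataset is invented for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def gini_gain(feature, labels):
    """Impurity decrease from splitting on a binary feature (present or absent)."""
    present = [y for f, y in zip(feature, labels) if f]
    absent = [y for f, y in zip(feature, labels) if not f]
    n = len(labels)
    weighted = len(present) / n * gini(present) + len(absent) / n * gini(absent)
    return gini(labels) - weighted

def top_k_features(columns, labels, k):
    """Rank feature names by single-split Gini gain and keep the best k."""
    ranked = sorted(columns, key=lambda name: gini_gain(columns[name], labels),
                    reverse=True)
    return ranked[:k]

# Toy data: 1 = malware, 0 = benign; columns mark whether each app uses a permission.
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "INTERNET": [1, 1, 1, 1, 1, 1],       # used by every app: no information
    "SEND_SMS": [1, 1, 1, 0, 0, 0],       # separates the classes perfectly here
    "READ_CONTACTS": [1, 1, 0, 1, 0, 0],  # weakly informative
}
selected = top_k_features(columns, labels, k=2)
print(selected)
```

A feature present in every app gets zero gain and is dropped first, which is how this kind of filter keeps the feature dimensionality in check.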

An Efficient Bitmap Indexing Method for Multimedia Data Reflecting the Characteristics of MPEG-7 Visual Descriptors

  • 정진국;낭종호
    • 한국정보과학회논문지:시스템및이론
    • /
    • Vol. 32, No. 1
    • /
    • pp.9-20
    • /
    • 2005
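  • MPEG-7, a standard for describing multimedia information, has recently been proposed and is beginning to be used in applications such as image and video retrieval systems. However, most MPEG-7 visual descriptors are high-dimensional, and because of the "curse of dimensionality" that arises in high-dimensional spaces, existing indexing methods (for example, tree-based multidimensional indexing, dimensionality reduction, and compression techniques such as quantization) cannot support efficient retrieval. This paper proposes an efficient indexing method that reflects the characteristics of MPEG-7 visual descriptors. The proposed method transforms each descriptor into an attribute histogram, represents the value of each histogram bin in binary form to generate a bit string, and builds a bitmap index from these bit strings. When a query object is given, the bitmap index is used to generate a list of candidate objects that may belong to the result: an XOR (exclusive OR) operation is performed between each object's index entry and the query object's bit string. The comparison function used for the descriptor, such as the L1-norm, is then evaluated only for the objects in this list, and the final result objects are presented to the user. Because the proposed algorithm extracts the objects likely to appear in the search result with simple bit operations, the search completes quickly. Experiments show that the proposed method is more than 15 times faster than sequential search in retrieval time while maintaining an accuracy of over 90%.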
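The filter-then-refine idea can be sketched as follows: histograms are reduced to bit strings, XOR plus a bit count discards objects whose bit pattern differs too much from the query, and the L1-norm is computed only for the survivors. The abstract does not say how each bin is binarized or how candidates are cut off, so the bin threshold and the `max_diff` bound on differing bits are assumptions of this sketch, as is the toy data.

```python
def to_bits(hist, threshold=0.1):
    """Encode a histogram as an integer bit string: 1 where a bin reaches the threshold."""
    bits = 0
    for value in hist:
        bits = (bits << 1) | (value >= threshold)
    return bits

def build_index(database, threshold=0.1):
    """Bitmap index: one bit string per stored object."""
    return [to_bits(h, threshold) for h in database]

def search(query, database, index, max_diff=1, top=2):
    q = to_bits(query)
    # Filter step: XOR marks the bins that differ; count them with a popcount.
    candidates = [i for i, b in enumerate(index)
                  if bin(b ^ q).count("1") <= max_diff]
    # Refine step: exact L1 distance, computed for the candidates only.
    def l1(i):
        return sum(abs(a - b) for a, b in zip(database[i], query))
    return sorted(candidates, key=l1)[:top]

# Toy 4-bin histograms standing in for visual descriptors.
database = [
    [0.50, 0.50, 0.00, 0.00],
    [0.40, 0.60, 0.00, 0.00],
    [0.00, 0.00, 0.50, 0.50],
    [0.45, 0.45, 0.10, 0.00],
]
index = build_index(database)
result = search([0.5, 0.4, 0.1, 0.0], database, index)
print(result)
```

The third object is rejected by the bit comparison alone, so its L1 distance is never computed. A tighter `max_diff` skips more distance computations but risks dropping true matches, which fits the accuracy and speed trade-off the abstract reports.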