• Title/Summary/Keyword: Dimensionality reduction model

Effective Dimensionality Reduction of Payload-Based Anomaly Detection in TMAD Model for HTTP Payload

  • Kakavand, Mohsen;Mustapha, Norwati;Mustapha, Aida;Abdullah, Mohd Taufik
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.8 / pp.3884-3910 / 2016
  • Intrusion detection systems (IDSs) generally process large amounts of data that are highly redundant and irrelevant. This leads to slow training and assessment procedures, high resource consumption, and poor detection rates. Because of their expensive computational requirements during both training and detection, IDSs are mostly ineffective for real-time anomaly detection. This paper proposes a dimensionality reduction technique based on Principal Component Analysis (PCA) that is able to enhance the performance of IDSs up to constant time O(1). Furthermore, the study offers a feature selection approach for identifying major components in real time. The PCA algorithm transforms high-dimensional feature vectors into a low-dimensional feature space, which is used to determine the optimum number of factors. The proposed approach was assessed using the HTTP packet payloads of the ISCX 2012 IDS and DARPA 1999 datasets. The experimental results show that the proposed anomaly detection achieved a 97% detection rate with a 1.2% false positive rate on the ISCX 2012 dataset and a 100% detection rate with a 0.06% false positive rate on the DARPA 1999 dataset. The proposed approach also achieved comparable computational complexity when compared with three state-of-the-art anomaly detection systems.
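
The PCA projection described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' TMAD implementation: the 256-dimensional feature vectors (for example, byte frequencies of a payload), the 95% variance threshold, and the function names are all assumptions made for the example.

```python
import numpy as np

def pca_fit(X, var_threshold=0.95):
    """Fit PCA on X (n_samples, n_features); keep enough components
    to explain at least var_threshold of the total variance."""
    mean = X.mean(axis=0)
    # SVD of the centered data: rows of Vt are the principal directions
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(explained, var_threshold)) + 1, len(s))
    return mean, Vt[:k]

def pca_transform(X, mean, components):
    """Project X onto the retained principal components."""
    return (X - mean) @ components.T

# Hypothetical data: 200 payloads, 256 features driven by 3 latent factors
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
X = latent @ rng.normal(size=(3, 256)) + 0.01 * rng.normal(size=(200, 256))

mean, comps = pca_fit(X)
Z = pca_transform(X, mean, comps)
print(X.shape, "->", Z.shape)
```

Once `mean` and `comps` are fitted offline, projecting a new payload is a single fixed-size matrix product, which is one way to read the constant-time claim for a fixed feature dimension.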

Using noise filtering and sufficient dimension reduction method on unstructured economic data (노이즈 필터링과 충분차원축소를 이용한 비정형 경제 데이터 활용에 대한 연구)

  • Jae Keun Yoo;Yujin Park;Beomseok Seo
    • The Korean Journal of Applied Statistics / v.37 no.2 / pp.119-138 / 2024
  • Text indicators are increasingly valuable in economic forecasting, but are often hindered by noise and high dimensionality. This study explores post-processing techniques, specifically noise filtering and dimensionality reduction, to normalize text indicators and enhance their utility through empirical analysis. The target variables for the empirical analysis include the monthly leading index cyclical variations, BSI (business survey index) all-industry sales performance, and BSI all-industry sales outlook, as well as the quarterly real GDP SA (seasonally adjusted) growth rate and real GDP YoY (year-on-year) growth rate. The study applies the Hodrick-Prescott filter, which is widely used in econometrics for noise filtering, and employs sufficient dimension reduction, a nonparametric dimensionality reduction methodology, in conjunction with unstructured text data. The results reveal that noise filtering of text indicators significantly improves predictive accuracy for both monthly and quarterly variables, particularly when the dataset is large. Moreover, applying dimensionality reduction further enhances predictive performance. These findings imply that post-processing techniques such as noise filtering and dimensionality reduction are crucial for enhancing the utility of text indicators and can contribute to improving the accuracy of economic forecasts.
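
For reference, the Hodrick-Prescott filter mentioned in this abstract splits a series y into a smooth trend and a cyclical remainder by minimizing the squared deviations plus a penalty lam on the squared second differences of the trend. The sketch below solves the resulting linear system directly with NumPy. The smoothing value 1600 is only the common convention for quarterly data; the value used in the study is not stated in the abstract, and the input series here is synthetic.

```python
import numpy as np

def hp_filter(y, lam=1600.0):
    """Return (trend, cycle) of the Hodrick-Prescott decomposition.

    The trend solves (I + lam * D'D) trend = y, where D is the
    second-difference operator of shape (n - 2, n).
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    D = np.diff(np.eye(n), n=2, axis=0)
    trend = np.linalg.solve(np.eye(n) + lam * D.T @ D, y)
    return trend, y - trend

# Synthetic noisy indicator: linear drift plus noise (illustrative only)
rng = np.random.default_rng(1)
line = np.linspace(0.0, 10.0, 40)
noisy = line + rng.normal(scale=0.5, size=line.size)
trend, cycle = hp_filter(noisy)
line_trend, _ = hp_filter(line)
```

A purely linear series has zero second differences, so the filter returns it unchanged as the trend; only the noisy part ends up in `cycle`.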

Major gene interaction identification in Hanwoo by adjusted environmental effects (환경적인 요인을 보정한 한우의 우수 유전자 조합 선별)

  • Lee, Jea-Young;Jin, Mi-Hyun
    • Journal of the Korean Data and Information Science Society / v.23 no.3 / pp.467-474 / 2012
  • Human diseases and livestock economic traits are not typically the result of variation at a single genetic locus, but rather of the interplay among multiple genes and a variety of environmental exposures. We use a linear regression model to adjust for environmental effects and the multifactor dimensionality reduction (MDR) method to identify gene-gene interaction effects. We use 5 SNPs (single nucleotide polymorphisms) that were studied recently by Oh et al. (2011). We apply the MDR method to identify major interaction effects of single nucleotide polymorphisms responsible for economic traits in a Korean cattle population.
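
The core MDR step can be sketched as below: each multilocus genotype cell is labeled high-risk when its case-to-control ratio exceeds the overall ratio, which collapses several SNPs into one binary attribute. This is a simplified sketch under stated assumptions, not the paper's procedure: the trait is taken to be binary (for example, a regression-adjusted trait split at a cutoff), cross-validation is omitted, and the genotype data are simulated.

```python
from collections import defaultdict
from itertools import combinations
import numpy as np

def mdr_score(G, y, snps):
    """Balanced accuracy of the MDR high/low-risk rule for one SNP set.

    G: (n_samples, n_snps) genotypes coded 0/1/2; y: 0 = control, 1 = case.
    """
    cols = G[:, list(snps)]
    threshold = y.sum() / (len(y) - y.sum())   # overall case:control ratio
    counts = defaultdict(lambda: [0, 0])       # cell -> [controls, cases]
    for row, label in zip(cols, y):
        counts[tuple(row)][int(label)] += 1
    high = {c for c, (ctrl, case) in counts.items() if case > threshold * ctrl}
    pred = np.array([tuple(row) in high for row in cols])
    sensitivity = pred[y == 1].mean()
    specificity = (~pred[y == 0]).mean()
    return (sensitivity + specificity) / 2

# Simulated data: 5 SNPs, trait driven by an interaction of SNP 0 and SNP 1
rng = np.random.default_rng(2)
G = rng.integers(0, 3, size=(400, 5))
y = ((G[:, 0] == 1) ^ (G[:, 1] == 1)).astype(int)

best_pair = max(combinations(range(5), 2), key=lambda s: mdr_score(G, y, s))
print(best_pair, mdr_score(G, y, best_pair))
```

In practice the score would be computed on held-out folds, and the combination with the best cross-validation consistency would be reported.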

Investigation of gene-gene interactions of clock genes for chronotype in a healthy Korean population

  • Park, Mira;Kim, Soon Ae;Shin, Jieun;Joo, Eun-Jeong
    • Genomics & Informatics / v.18 no.4 / pp.38.1-38.9 / 2020
  • Chronotype is an important moderator of psychiatric illnesses and seems to be controlled in part by genetic factors. Clock genes are the most relevant genes for chronotype. In addition to the roles of individual genes, gene-gene interactions of clock genes substantially contribute to chronotype. We investigated genetic associations and gene-gene interactions of the clock genes BHLHB2, CLOCK, CSNK1E, NR1D1, PER1, PER2, PER3, and TIMELESS for chronotype in 1,293 healthy Korean individuals. Regression analysis was conducted to find associations between single nucleotide polymorphisms (SNPs) and chronotype. For gene-gene interaction analyses, the quantitative multifactor dimensionality reduction (QMDR) method, a nonparametric model-free method for quantitative phenotypes, was applied. No individual SNP or haplotype showed a significant association with chronotype in either the regression analysis or the single-locus QMDR model. QMDR analysis identified NR1D1 rs2314339 and TIMELESS rs4630333 as the best SNP pair among two-locus interaction models associated with chronotype (cross-validation consistency [CVC] = 8/10, p = 0.041). For the three-locus interaction model, the SNP combination of NR1D1 rs2314339, TIMELESS rs4630333, and PER3 rs228669 showed the best results (CVC = 4/10, p < 0.001). However, because the mean differences between genotype combinations were minor, the clinical roles of clock gene interactions are unlikely to be critical.

Human Action Recognition Based on 3D Human Modeling and Cyclic HMMs

  • Ke, Shian-Ru;Thuc, Hoang Le Uyen;Hwang, Jenq-Neng;Yoo, Jang-Hee;Choi, Kyoung-Ho
    • ETRI Journal / v.36 no.4 / pp.662-672 / 2014
  • Human action recognition is used in areas such as surveillance, entertainment, and healthcare. This paper proposes a system to recognize both single and continuous human actions from monocular video sequences, based on 3D human modeling and cyclic hidden Markov models (CHMMs). First, for each frame in a monocular video sequence, the 3D coordinates of the joints of a human object, performing actions over multiple cycles, are extracted using 3D human modeling techniques. The 3D coordinates are then converted into a set of geometrical relational features (GRFs) for dimensionality reduction and increased discrimination. For further dimensionality reduction, k-means clustering is applied to the GRFs to generate clustered feature vectors. These vectors are used to train CHMMs separately for different types of actions, based on the Baum-Welch re-estimation algorithm. To recognize continuous actions that are concatenated from several distinct types of actions, a designed graphical model is used to systematically concatenate the separately trained CHMMs. The experimental results show the effective performance of the proposed system in both single and continuous action recognition problems.
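
One plausible reading of the k-means step is vector quantization: each frame's feature vector is replaced by the index of its nearest cluster center, giving a compact symbol sequence for HMM training. The sketch below implements plain Lloyd iterations on synthetic data. The feature dimension, the number of clusters, and the use of cluster indices as HMM observations are assumptions for illustration, not details confirmed by the abstract.

```python
import numpy as np

def nearest_center(X, centers):
    """Index of the closest center for every row of X."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = nearest_center(X, centers)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:          # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers, nearest_center(X, centers)

# Hypothetical per-frame features: 300 frames, 12 geometric features each
rng = np.random.default_rng(3)
frames = rng.normal(size=(300, 12))
centers, symbols = kmeans(frames, k=8)
print(symbols[:20])   # discrete observation sequence, one symbol per frame
```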

Agglomerative Hierarchical Clustering Analysis with Deep Convolutional Autoencoders (합성곱 오토인코더 기반의 응집형 계층적 군집 분석)

  • Park, Nojin;Ko, Hanseok
    • Journal of Korea Multimedia Society / v.23 no.1 / pp.1-7 / 2020
  • Clustering methods essentially take a two-step approach: extracting feature vectors for dimensionality reduction and then applying a clustering algorithm to the extracted feature vectors. However, for clustering images, traditional methods such as stacked auto-encoder based k-means are not effective because they tend to ignore local information. In this paper, we propose a method that first reduces data dimensionality using a convolutional auto-encoder, which captures and reflects local information, and then clusters similar data samples using a hierarchical clustering approach. The experimental results confirm that the proposed model improves clustering accuracy and normalized mutual information.

An extension of multifactor dimensionality reduction method for detecting gene-gene interactions with the survival time (생존시간과 연관된 유전자 간의 교호작용에 관한 다중차원축소방법의 확장)

  • Oh, Jin Seok;Lee, Seung Yeoun
    • Journal of the Korean Data and Information Science Society / v.25 no.5 / pp.1057-1067 / 2014
  • Many genetic variants have been identified as associated with complex diseases such as hypertension, diabetes, and cancers through genome-wide association studies (GWAS). However, a serious missing heritability problem still exists, since the proportion explained by genetic variants from GWAS is very small, less than 10~15%. Gene-gene interaction studies may help explain the missing heritability, because most complex disease mechanisms involve more than a single SNP, including multiple SNPs or gene-gene interactions. This paper focuses on gene-gene interactions with the survival phenotype by extending the multifactor dimensionality reduction (MDR) method to the accelerated failure time (AFT) model. The standardized residual from the AFT model is used as a residual score for classifying multilocus genotypes into high- and low-risk groups, and the MDR algorithm is then applied. We call this method AFT-MDR and compare its power with that of Surv-MDR and Cox-MDR in simulation studies. Real data from Korean leukemia patients are also analyzed. The power of AFT-MDR was found to be greater than that of Surv-MDR and comparable with that of Cox-MDR, but very sensitive to the censoring fraction.

Support vector machine and multifactor dimensionality reduction for detecting major gene interactions of continuous data (서포트 벡터 머신 알고리즘을 활용한 연속형 데이터의 다중인자 차원축소방법 적용)

  • Lee, Jea-Young;Lee, Jong-Hyeong
    • Journal of the Korean Data and Information Science Society / v.21 no.6 / pp.1271-1280 / 2010
  • The multifactor dimensionality reduction (MDR) method has generally been used to study gene-gene interaction effects in statistical models. However, the MDR method could not be applied to continuous data. In this paper, we propose using the support vector machine (SVM) algorithm so that the MDR method can be applied to continuous-type data, and we provide an introduction to the technique. We also apply the method to identify major interaction effects of single nucleotide polymorphisms (SNPs) responsible for economic traits in a Korean cattle population.

Classification of Imbalanced Data Based on MTS-CBPSO Method: A Case Study of Financial Distress Prediction

  • Gu, Yuping;Cheng, Longsheng;Chang, Zhipeng
    • Journal of Information Processing Systems / v.15 no.3 / pp.682-693 / 2019
  • Traditional classification methods mostly assume that the class distribution of the data is balanced, whereas imbalanced data are widely found in the real world, so it is important to solve the problem of classifying imbalanced data. In the Mahalanobis-Taguchi system (MTS) algorithm, the classification model is constructed from a reference space and a measurement scale derived from a single normal group, and thus it is suitable for handling imbalanced data. In this paper, an improved method, MTS-CBPSO, is constructed by introducing chaotic mapping and the binary particle swarm optimization algorithm, instead of the orthogonal array and signal-to-noise ratio (SNR), to select the valid variables, with G-means, F-measure, and dimensionality reduction as the classification optimization targets. The proposed method is also applied to the financial distress prediction of Chinese listed companies. Compared with the traditional MTS and common classification methods such as SVM, C4.5, and k-NN, the MTS-CBPSO method is shown to give better prediction accuracy and dimensionality reduction.
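
The reference space built from a single normal group can be illustrated with the scaled Mahalanobis distance that MTS uses as its measurement scale. The sketch below covers only that step. The chaotic binary PSO variable selection is not shown, and the four features and sample sizes are made up for the example.

```python
import numpy as np

def fit_reference_space(normal):
    """Mean, standard deviation and inverse correlation matrix of the normal group."""
    mean = normal.mean(axis=0)
    std = normal.std(axis=0, ddof=1)
    corr_inv = np.linalg.inv(np.corrcoef(normal, rowvar=False))
    return mean, std, corr_inv

def mahalanobis_distance(X, mean, std, corr_inv):
    """Scaled Mahalanobis distance z' C^-1 z / k for each row of X."""
    Z = (X - mean) / std
    k = X.shape[1]
    return np.einsum("ij,jk,ik->i", Z, corr_inv, Z) / k

# Hypothetical data: 100 normal firms and 20 distressed firms, 4 indicators
rng = np.random.default_rng(4)
normal = rng.normal(size=(100, 4))
abnormal = rng.normal(loc=5.0, size=(20, 4))

space = fit_reference_space(normal)
md_normal = mahalanobis_distance(normal, *space)
md_abnormal = mahalanobis_distance(abnormal, *space)
print(md_normal.mean(), md_abnormal.mean())
```

Because the scale is fitted on the normal group alone, its distances average close to 1 regardless of how few abnormal samples exist, which is why the approach suits imbalanced data.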

A personalized exercise recommendation system using dimension reduction algorithms

  • Lee, Ha-Young;Jeong, Ok-Ran
    • Journal of the Korea Society of Computer and Information / v.26 no.6 / pp.19-28 / 2021
  • Interest in health care is increasing due to the coronavirus disease (COVID-19), and many people are training at home because fitness centers and shared public facilities have become harder to use. In this paper, we propose a personalized exercise recommendation algorithm that uses personalized propensity information to provide more accurate and meaningful exercise recommendations to home training users. We classify the data according to obesity criteria with a k-nearest neighbor algorithm, using personal information that can represent individuals, such as eating habits and physical condition. Furthermore, we differentiate the exercise dataset by the level of exercise activity. Based on the neighborhood information of each dataset, we provide personalized exercise recommendations to users through a dimensionality reduction algorithm (SVD), one of the model-based collaborative filtering methods. This addresses the data sparsity and scalability problems of memory-based collaborative filtering techniques, and we verify the accuracy and performance of the proposed algorithms.
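
The SVD step of model-based collaborative filtering can be sketched as a low-rank reconstruction of a sparse user-by-exercise rating matrix. This is a minimal sketch under assumptions: the ratings are invented, unrated cells are filled with each user's mean rating before factorization (one simple baseline among several), and the k-nearest neighbor grouping from the abstract is not shown.

```python
import numpy as np

def svd_predict(R, rank):
    """Rank-limited reconstruction of a user x item matrix R (0 = unrated)."""
    rated = R > 0
    # Fill unrated cells with the user's mean rating before factorizing
    user_mean = R.sum(axis=1) / np.maximum(rated.sum(axis=1), 1)
    filled = np.where(rated, R, user_mean[:, None])
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

# Hypothetical ratings: 4 users x 5 exercises, 0 means not yet rated
R = np.array([
    [5.0, 4.0, 0.0, 1.0, 0.0],
    [4.0, 0.0, 0.0, 1.0, 2.0],
    [1.0, 1.0, 0.0, 5.0, 4.0],
    [0.0, 1.0, 5.0, 4.0, 0.0],
])
pred = svd_predict(R, rank=2)

# Recommend the unrated exercise with the highest predicted score for user 0
user = 0
recommended = int(np.argmax(np.where(R[user] > 0, -np.inf, pred[user])))
print(recommended)
```

Keeping only a few singular values gives every user a dense score for every exercise, which is how the factorization sidesteps the sparsity of the raw ratings.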