• Title/Summary/Keyword: dimension reduction based methods


DR-LSTM: Dimension reduction based deep learning approach to predict stock price

  • Ah-ram Lee; Jae Youn Ahn; Ji Eun Choi; Kyongwon Kim
    • Communications for Statistical Applications and Methods, v.31 no.2, pp.213-234, 2024
  • In recent decades, increasing research attention has been directed toward predicting stock prices in financial markets using deep learning methods. For instance, the recurrent neural network (RNN) is known to be competitive for time-series data. Long short-term memory (LSTM) further improves the RNN by mitigating the vanishing gradient problem, and it gains predictive accuracy by retaining memory over longer horizons. In this paper, we combine both supervised and unsupervised dimension reduction methods with LSTM to enhance forecasting performance and refer to this as the dimension reduction based LSTM (DR-LSTM) approach. As supervised dimension reduction methods, we use sliced inverse regression (SIR), sparse SIR, and kernel SIR; as unsupervised methods, we use principal component analysis (PCA), sparse PCA, and kernel PCA. Using real stock market index data (S&P 500, STOXX Europe 600, and KOSPI), we present a comparative study of predictive accuracy between the six DR-LSTM methods and time series modeling.
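A minimal sketch of the unsupervised DR-LSTM variant (PCA followed by an LSTM forecaster). Synthetic data stands in for an index series, and the window size, component count, and network architecture are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.decomposition import PCA
import tensorflow as tf

rng = np.random.default_rng(0)
T, p = 500, 20                       # 500 days, 20 raw predictors
X = rng.standard_normal((T, p)).cumsum(axis=0)
y = X[:, :3].mean(axis=1) + 0.1 * rng.standard_normal(T)  # target index

k, window = 3, 10
Z = PCA(n_components=k).fit_transform(X)     # reduce p -> k predictors

# Build (samples, window, k) sequences predicting the next day's value.
seqs = np.stack([Z[t - window:t] for t in range(window, T)])
targets = y[window:]

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(window, k)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(seqs, targets, epochs=5, batch_size=32, verbose=0)
pred = model.predict(seqs[-1:], verbose=0)   # one-step-ahead forecast
```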

Fused inverse regression with multi-dimensional responses

  • Cho, Youyoung; Han, Hyoseon; Yoo, Jae Keun
    • Communications for Statistical Applications and Methods, v.28 no.3, pp.267-279, 2021
  • Regression with multi-dimensional responses is quite common in the so-called big data era. In such regression, to relieve the curse of dimensionality arising from the high dimension of the responses, dimension reduction of the predictors is essential in analysis. Sufficient dimension reduction provides effective tools for this reduction, but few sufficient dimension reduction methodologies exist for multivariate regression. To fill this gap, we propose two new fused slice-based inverse regression methods. The proposed approaches are robust to the number of clusters or slices and improve estimation over existing methods by fusing many kernel matrices. Numerical studies are presented and compared with existing methods, and a real data analysis confirms the practical usefulness of the proposed methods.
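A toy sketch of the fusing idea for a univariate response: SIR kernel matrices computed over several slice counts are summed so that no single choice of slice number dominates. The paper's methods fuse kernels for multi-dimensional responses, which this simplified version does not attempt.

```python
import numpy as np

def sir_kernel(X, y, h):
    """SIR candidate matrix Cov(E[Z|Y]) estimated with h slices."""
    n, p = X.shape
    L = np.linalg.cholesky(np.cov(X.T))
    Z = (X - X.mean(0)) @ np.linalg.inv(L).T   # standardized predictors
    M = np.zeros((p, p))
    for s in np.array_split(np.argsort(y), h): # slice by ordered response
        m = Z[s].mean(0)
        M += (len(s) / n) * np.outer(m, m)
    return M

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 6))
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.2 * rng.standard_normal(300)

# Fuse kernels across several slice numbers, then eigendecompose.
M_fused = sum(sir_kernel(X, y, h) for h in (3, 5, 8, 12))
vals, vecs = np.linalg.eigh(M_fused)
B_hat = vecs[:, ::-1][:, :2]       # estimated basis of the subspace
```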

Aerosol optical depth prediction based on dimension reduction methods

  • Jungkyun Lee; Yaeji Lim
    • Communications for Statistical Applications and Methods, v.31 no.5, pp.521-533, 2024
  • As the concentration of fine dust has increased in recent years, numerous related studies have been conducted to address this issue. Aerosol optical depth (AOD) is a vital atmospheric parameter for measuring the optical properties of aerosols in the atmosphere, providing crucial information related to fine dust. In this paper, we apply three dimension reduction methods, nonnegative matrix factorization (NMF), empirical orthogonal function (EOF) analysis, and independent component analysis (ICA), to AOD data to analyze the patterns of fine dust in the East Asia region. Comparing the three methods, we observe that some patterns appear in all three, while other information is extracted only by a specific method. Additionally, we forecast AOD levels based on the three methods and compare their predictive performance.
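A minimal sketch applying the three decompositions to an AOD-style matrix (time by grid cells). The data here are synthetic and the component counts are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import NMF, PCA, FastICA

rng = np.random.default_rng(2)
X = np.abs(rng.standard_normal((120, 400)))  # 120 months x 400 grid cells

# EOF analysis amounts to PCA: rows of components_ are spatial
# patterns, and the transformed scores are their time series.
eof = PCA(n_components=4).fit(X)
pcs = eof.transform(X)

# NMF requires nonnegative data, which suits AOD concentrations.
nmf = NMF(n_components=4, init="nndsvda", max_iter=500).fit(X)
W = nmf.transform(X)                         # temporal weights

# ICA extracts statistically independent spatial sources.
ica = FastICA(n_components=4, random_state=0).fit(X)
S = ica.transform(X)
```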

Applications of response dimension reduction in large p-small n problems

  • Minjee Kim; Jae Keun Yoo
    • Communications for Statistical Applications and Methods, v.31 no.2, pp.191-202, 2024
  • The goal of this paper is to show how multivariate regression analysis with high-dimensional responses is facilitated by response dimension reduction. Multivariate regression, characterized by multi-dimensional response variables, is increasingly prevalent across diverse fields such as repeated measures, longitudinal studies, and functional data analysis. One of the key challenges in analyzing such data is managing the response dimensions, which can complicate the analysis through an exponential increase in the number of parameters. Although response dimension reduction methods have been developed, there is no practically useful illustration for various types of data, such as so-called large p-small n data. This paper aims to fill this gap by showcasing how response dimension reduction can enhance the analysis of high-dimensional response data, thereby providing significant assistance to statistical practitioners and contributing to advances in multiple scientific domains.
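A generic illustration of the underlying idea, not the authors' estimator: compress a high-dimensional response by principal components before fitting a multivariate regression, so the parameter count stays manageable even when the response dimension p exceeds n. All sizes are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n, p, q = 50, 200, 3                 # large p (responses), small n
X = rng.standard_normal((n, q))      # a few predictors
B = rng.standard_normal((q, 2))
Y = X @ B @ rng.standard_normal((2, p)) + 0.1 * rng.standard_normal((n, p))

pca = PCA(n_components=2).fit(Y)     # reduce responses 200 -> 2
Y_red = pca.transform(Y)
fit = LinearRegression().fit(X, Y_red)         # 2*q slopes, not 200*q
Y_hat = pca.inverse_transform(fit.predict(X))  # back on original scale
```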

Classification of Microarray Gene Expression Data by MultiBlock Dimension Reduction

  • Oh, Mi-Ra; Kim, Seo-Young; Kim, Kyung-Sook; Baek, Jang-Sun; Son, Young-Sook
    • Communications for Statistical Applications and Methods, v.13 no.3, pp.567-576, 2006
  • In this paper, we applied multiblock dimension reduction methods to the classification of tumors based on microarray gene expression data. The procedure involves clustering selected genes, multiblock dimension reduction, and classification using linear discriminant analysis and quadratic discriminant analysis.
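A rough sketch of the multiblock pipeline described above: cluster genes into blocks, reduce each block separately, and classify on the pooled scores. KMeans and per-block PCA are stand-ins for the paper's choices, and the block and component counts are guesses.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(4)
X = rng.standard_normal((60, 500))          # 60 samples x 500 genes
y = rng.integers(0, 2, 60)                  # tumor class labels

# Cluster genes (columns) into 5 blocks, then reduce each block.
blocks = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X.T)
scores = np.hstack([
    PCA(n_components=2).fit_transform(X[:, blocks == b]) for b in range(5)
])                                          # 2 scores per gene block
clf = LinearDiscriminantAnalysis().fit(scores, y)
print(clf.score(scores, y))
```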

Classification Using Sliced Inverse Regression and Sliced Average Variance Estimation

  • Lee, Hakbae
    • Communications for Statistical Applications and Methods, v.11 no.2, pp.275-285, 2004
  • We explore classification analysis using graphical methods such as sliced inverse regression and sliced average variance estimation based on dimension reduction. Useful information about classification analysis is obtained through the dimension reduction performed by sliced inverse regression and sliced average variance estimation. Two examples are illustrated, and classification rates from sliced inverse regression and sliced average variance estimation are compared with those from discriminant analysis and logistic regression.
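A minimal sketch of SIR used for classification: with a categorical response the "slices" are simply the classes, and the leading eigenvectors of the between-class kernel give a low-dimensional view in which a simple classifier can be applied. Illustrative only, with simulated data.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(5)
n = 300
y = rng.integers(0, 3, n)                   # three classes
means = 2 * rng.standard_normal((3, 5))
X = rng.standard_normal((n, 5)) + np.eye(3)[y] @ means

L = np.linalg.cholesky(np.cov(X.T))
Z = (X - X.mean(0)) @ np.linalg.inv(L).T    # standardize predictors

# SIR kernel: class proportions times outer products of class means.
M = sum(np.mean(y == g) * np.outer(Z[y == g].mean(0), Z[y == g].mean(0))
        for g in np.unique(y))
vecs = np.linalg.eigh(M)[1][:, ::-1][:, :2] # top 2 SIR directions
proj = Z @ vecs
print(KNeighborsClassifier().fit(proj, y).score(proj, y))
```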

Method-Free Permutation Predictor Hypothesis Tests in Sufficient Dimension Reduction

  • Lee, Kyungjin; Oh, Suji; Yoo, Jae Keun
    • Communications for Statistical Applications and Methods, v.20 no.4, pp.291-300, 2013
  • In this paper, we propose method-free permutation predictor hypothesis tests in the context of sufficient dimension reduction. Unlike an existing method-free bootstrap approach, predictor hypotheses are evaluated based on p-values, which typical statistical practitioners are likely to prefer. Numerical studies validate the developed theories, and a real data application is provided.
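A generic permutation test for whether a predictor contributes, as a simplified stand-in for the paper's SDR-specific statistic: permuting column j breaks its link to y, and the permutation distribution of a fit statistic yields a p-value. The R-squared statistic and all settings here are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(6)
X = rng.standard_normal((200, 4))
y = X[:, 0] - 2 * X[:, 2] + 0.5 * rng.standard_normal(200)

def stat(X, y):
    return LinearRegression().fit(X, y).score(X, y)  # R^2 as statistic

j, B = 2, 499
obs = stat(X, y)
perm = []
for _ in range(B):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])   # break X_j's link to y
    perm.append(stat(Xp, y))

# Small p-value: predictor j carries information the others do not.
p_value = (1 + np.sum(np.array(perm) >= obs)) / (B + 1)
print(f"p-value for predictor {j}: {p_value:.3f}")
```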

Intensive numerical studies of optimal sufficient dimension reduction with singularity

  • Yoo, Jae Keun; Gwak, Da-Hae; Kim, Min-Sun
    • Communications for Statistical Applications and Methods, v.24 no.3, pp.303-315, 2017
  • Yoo (2015, Statistics and Probability Letters, 99, 109-113) derives theoretical results for optimal sufficient dimension reduction with a singular inner-product matrix. The results are promising, but Yoo (2015) presents only one simulation study, so an evaluation of its practical usefulness based on numerical studies is necessary. This paper studies the asymptotic behavior of Yoo (2015) through various simulation models and presents a real data example focusing on ordinary least squares. Intensive numerical studies show that the $\chi^2$ test by Yoo (2015) outperforms the existing optimal sufficient dimension reduction method. The basis estimation by the former can be theoretically sub-optimal; however, there are no notable differences from that by the latter. This investigation confirms the practical usefulness of Yoo (2015).

Incremental Linear Discriminant Analysis for Streaming Data Using the Minimum Squared Error Solution (스트리밍 데이터에 대한 최소제곱오차해를 통한 점층적 선형 판별 분석 기법)

  • Lee, Gyeong-Hoon; Park, Cheong Hee
    • Journal of KIISE, v.45 no.1, pp.69-75, 2018
  • For streaming data, where data samples arrive sequentially in time, dimension reduction methods based on batch learning are difficult to apply, so incremental dimension reduction methods for streaming data have been studied. In this paper, we propose an incremental linear discriminant analysis method using the minimum squared error solution. Instead of computing scatter matrices directly, the proposed method incrementally updates the projective direction for dimension reduction using the information in each new incoming sample. The experimental results demonstrate that the proposed method is more efficient than previously proposed incremental dimension reduction methods.
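A rough sketch of the least-squares route to an incremental LDA direction for two classes: the ordinary least squares solution for suitably coded labels is proportional to Fisher's direction, and recursive least squares (Sherman-Morrison updates) tracks it one sample at a time without storing scatter matrices. This illustrates the idea under those assumptions; it is not the paper's algorithm.

```python
import numpy as np

p = 4
P = 1e3 * np.eye(p + 1)          # inverse Gram matrix (with intercept)
w = np.zeros(p + 1)              # projection direction being tracked

def rls_update(P, w, x, t):
    """One recursive least-squares step for sample x, coded label t."""
    xb = np.append(x, 1.0)                 # append intercept term
    k = P @ xb / (1.0 + xb @ P @ xb)       # gain vector
    w = w + k * (t - xb @ w)               # correct prediction error
    P = P - np.outer(k, xb @ P)            # Sherman-Morrison downdate
    return P, w

rng = np.random.default_rng(7)
for _ in range(500):             # stream of samples, one at a time
    c = rng.integers(0, 2)
    x = rng.standard_normal(p) + 2.0 * c            # class-shifted sample
    P, w = rls_update(P, w, x, 1.0 if c else -1.0)  # +/-1 label coding
direction = w[:p]                # discriminant direction (up to scale)
```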

A Comparative Experiment on Dimensional Reduction Methods Applicable for Dissimilarity-Based Classifications (비유사도-기반 분류를 위한 차원 축소방법의 비교 실험)

  • Kim, Sang-Woon
    • Journal of the Institute of Electronics and Information Engineers, v.53 no.3, pp.59-66, 2016
  • This paper presents an empirical evaluation of dimensionality reduction strategies by which dissimilarity-based classification (DBC) can be implemented efficiently. In DBC, classification is based not on feature measurements of individual objects (a set of attributes) but on a suitable dissimilarity measure among the objects (pair-wise object comparisons). One problem of DBC is the high dimensionality of the dissimilarity space when many objects are treated. To address this issue, two kinds of solutions have been proposed in the literature: prototype selection (PS)-based methods and dimension reduction (DR)-based methods. In this paper, instead of the PS-based or DR-based methods, a way of performing DBC in Eigen spaces (ES) is considered and empirically compared. In ES-based DBC, classification proceeds as follows: first, a set of principal eigenvectors is extracted from the training data set using principal component analysis; second, an Eigen space is spanned by a subset of the extracted and selected eigenvectors; third, after measuring distances among the projected objects in the Eigen space using $l_p$-norms as the dissimilarity, classification is performed. The experimental results, obtained using the nearest neighbor rule with artificial and real-life benchmark data sets, demonstrate that, when the dimensionality of the Eigen space is selected appropriately, ES-based DBC can improve classification accuracy compared to the PS-based and DR-based methods.
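A minimal sketch of the three-step ES-based DBC pipeline described above: project onto principal eigenvectors, measure an $l_p$ dissimilarity in the Eigen space, and classify with the nearest neighbor rule. The component count, the choice p = 1, and the synthetic data are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(8)
X = rng.standard_normal((200, 30))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

# Steps 1-2: build the Eigen space from the training data.
pca = PCA(n_components=10).fit(X_train)
Z_train, Z_test = pca.transform(X_train), pca.transform(X_test)

# Step 3: nearest neighbor with an l_p dissimilarity (here p = 1).
knn = KNeighborsClassifier(n_neighbors=1, metric="minkowski", p=1)
knn.fit(Z_train, y_train)
print("accuracy:", knn.score(Z_test, y_test))
```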