• Title/Summary/Keyword: Data Matrix


Estimation of high-dimensional sparse cross correlation matrix

  • Yin, Cao;Kwangok, Seo;Soohyun, Ahn;Johan, Lim
    • Communications for Statistical Applications and Methods / v.29 no.6 / pp.655-664 / 2022
  • Motivated by an integrative study of multi-omics data, we are interested in estimating the structure of the sparse cross correlation matrix of two high-dimensional random vectors. We recast the problem as a multiple testing problem and propose a new method to estimate the sparse structure of the cross correlation matrix. To do so, we test the correlation coefficients simultaneously and threshold them by controlling the FDR at a predetermined level α. Further, we apply the proposed method and an alternative adaptive thresholding procedure by Cai and Liu (2016) to the integrative analysis of the protein expression data (X) and the mRNA expression data (Y) in the TCGA breast cancer cohort. By varying the FDR level α, we show that the new procedure is consistently more efficient than the alternative in estimating the sparse structure of the cross correlation matrix.
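The thresholding idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a Fisher z-approximation for each correlation's p-value and the standard Benjamini-Hochberg step-up rule for FDR control, and the names `pearson`, `fisher_pvalue`, `bh_threshold`, and `sparse_cross_correlation` are hypothetical.

```python
import math
import random

def pearson(u, v):
    """Sample Pearson correlation of two equal-length, non-constant lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def fisher_pvalue(r, n):
    """Two-sided p-value for H0: rho = 0 (assumed Fisher z-approximation)."""
    r = max(min(r, 0.999999), -0.999999)   # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def bh_threshold(pvalues, alpha):
    """Largest sorted p-value p_(k) with p_(k) <= k * alpha / m, or 0.0 if none."""
    m = len(pvalues)
    cutoff = 0.0
    for k, p in enumerate(sorted(pvalues), start=1):
        if p <= k * alpha / m:
            cutoff = p
    return cutoff

def sparse_cross_correlation(X, Y, alpha=0.05):
    """X, Y are lists of variables, each a list of n samples.
    Correlations whose p-value exceeds the BH cutoff are set to zero."""
    n = len(X[0])
    R = [[pearson(x, y) for y in Y] for x in X]
    P = [[fisher_pvalue(r, n) for r in row] for row in R]
    cutoff = bh_threshold([p for row in P for p in row], alpha)
    return [[r if p <= cutoff else 0.0 for r, p in zip(rr, pr)]
            for rr, pr in zip(R, P)]

# Toy data: X[0] is a noisy copy of Y[0]; every other pair is independent.
rng = random.Random(0)
n = 200
Y = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(3)]
X = [[y + 0.1 * rng.gauss(0, 1) for y in Y[0]]]
X += [[rng.gauss(0, 1) for _ in range(n)] for _ in range(2)]
S = sparse_cross_correlation(X, Y, alpha=0.05)
```

Lowering `alpha` makes the cutoff stricter, so fewer entries of `S` survive; this is the knob the abstract describes varying.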

ASSVD: Adaptive Sparse Singular Value Decomposition for High Dimensional Matrices

  • Ding, Xiucai;Chen, Xianyi;Zou, Mengling;Zhang, Guangxing
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.6 / pp.2634-2648 / 2020
  • In this paper, an adaptive sparse singular value decomposition (ASSVD) algorithm is proposed to estimate the signal matrix when only one data matrix is observed and there is high-dimensional white noise. We assume that the signal matrix is low-rank and has sparse singular vectors, i.e., it is simultaneously low-rank and sparse. It is a structured matrix, since the non-zero entries are confined to a few small blocks. The proposed algorithm estimates the singular values and singular vectors separately by exploring the structure of the singular vectors, using a recent development in random matrix theory known as the anisotropic Marchenko-Pastur law. We then prove that when the signal is strong, in the sense that the signal-to-noise ratio is above some threshold, our estimator is consistent and outperforms many state-of-the-art algorithms. Moreover, our estimator is adaptive to the data set and does not require the variance of the noise to be known or estimated. Numerical simulations indicate that ASSVD still works well when the signal matrix is not very sparse.

Improving on Matrix Factorization for Recommendation Systems by Using a Character-Level Convolutional Neural Network (문자 수준 컨볼루션 뉴럴 네트워크를 이용한 추천시스템에서의 행렬 분해법 개선)

  • Son, Donghee;Shim, Kyuseok
    • KIISE Transactions on Computing Practices / v.24 no.2 / pp.93-98 / 2018
  • Recommendation systems are used to provide items of interest to users in order to maximize a company's profit. Matrix factorization, which is based on an incomplete user-item rating matrix, is frequently used by recommendation systems. However, as the number of items and users increases, it becomes difficult to make accurate recommendations due to the sparsity of the data. To overcome this drawback, the use of text data related to items was recently suggested for matrix factorization algorithms. Among these algorithms, a word-level convolutional neural network was shown to be effective in extracting word-level features from the text data. However, the word-level convolutional neural network has a large number of parameters to learn. Thus, we propose a matrix factorization algorithm which uses a character-level convolutional neural network to extract character-level features from the text data. We also conducted a performance study with real-life datasets to show the effectiveness of the proposed matrix factorization algorithm.
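As background for the abstract above, the following is a minimal sketch of plain matrix factorization on an incomplete user-item rating matrix, trained by stochastic gradient descent. It does not include the paper's character-level convolutional network; the function names, hyperparameters, and toy ratings are all assumptions made for illustration.

```python
import math
import random

def train_mf(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02,
             epochs=500, seed=0):
    """Fit user factors P and item factors Q so that dot(P[u], Q[i]) ~ r
    for every observed (u, i, r); unobserved cells are simply skipped."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(p * q for p, q in zip(P[u], Q[i]))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error over the observed ratings."""
    se = sum((r - sum(p * q for p, q in zip(P[u], Q[i]))) ** 2
             for u, i, r in ratings)
    return math.sqrt(se / len(ratings))

# Toy data: (user, item, rating) triples; most of the 4x3 matrix is missing.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
           (2, 1, 2), (2, 2, 5), (3, 0, 5), (3, 2, 2)]
P0, Q0 = train_mf(ratings, 4, 3, epochs=0)   # untrained baseline
P, Q = train_mf(ratings, 4, 3)
rmse_before, rmse_after = rmse(ratings, P0, Q0), rmse(ratings, P, Q)
```

Text-aware variants such as the one described in the abstract typically replace or regularize the item factors `Q` with features extracted from item text.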

An accelerated Levenberg-Marquardt algorithm for feedforward network

  • Kwak, Young-Tae
    • Journal of the Korean Data and Information Science Society / v.23 no.5 / pp.1027-1035 / 2012
  • This paper proposes a new Levenberg-Marquardt algorithm that is accelerated by adjusting a Jacobian matrix and a quasi-Hessian matrix. The proposed method partitions the Jacobian matrix into block matrices and employs the inverse of a partitioned matrix to find the inverse of the quasi-Hessian matrix. Our method avoids expensive operations and saves memory when calculating the inverse of the quasi-Hessian matrix. It can thus shorten the training time and converge faster. In tests on a large application, our method reduced training time by about 20% compared with other algorithms.
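The "inverse of a partitioned matrix" mentioned above is usually computed through the Schur complement. The sketch below shows that identity for a 4×4 matrix split into 2×2 blocks; the paper's actual partitioning scheme is not given in the abstract, so the block sizes, the helper names, and the example matrix are assumptions.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def neg(A):
    return [[-a for a in row] for row in A]

def inv2(M):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def block_inverse(A, B, C, D):
    """Inverse of [[A, B], [C, D]] with 2x2 blocks, using the Schur
    complement S = D - C A^-1 B. Requires A and S to be invertible."""
    Ai = inv2(A)
    AiB = matmul(Ai, B)
    CAi = matmul(C, Ai)
    Si = inv2(add(D, neg(matmul(C, AiB))))
    top_left = add(Ai, matmul(matmul(AiB, Si), CAi))
    top_right = neg(matmul(AiB, Si))
    bottom_left = neg(matmul(Si, CAi))
    top = [l + r for l, r in zip(top_left, top_right)]
    bottom = [l + r for l, r in zip(bottom_left, Si)]
    return top + bottom

# Example: a symmetric, diagonally dominant matrix, similar in shape to a
# damped quasi-Hessian J^T J + mu * I.
A = [[4.0, 1.0], [1.0, 3.0]]
B = [[0.5, 0.0], [0.0, 0.5]]
C = [[0.5, 0.0], [0.0, 0.5]]
D = [[5.0, 1.0], [1.0, 4.0]]
M = [ra + rb for ra, rb in zip(A, B)] + [rc + rd for rc, rd in zip(C, D)]
M_inv = block_inverse(A, B, C, D)
```

Only two small inverses are needed, one for `A` and one for the Schur complement, rather than one inverse of the full matrix. Savings of this kind are what the abstract attributes to its method.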

A Study on Image Data Processing Using the Weighted Hadamard Transform (Weighted Hadamard 변환을 이용한 Image Data 처리에 관한 연구)

  • 소상호;윤재우;이문호
    • Proceedings of the Korean Institute of Communication Sciences Conference / 1983.10a / pp.68-72 / 1983
  • The Hadamard matrix is a symmetric matrix whose entries are plus and minus ones. Therefore, using the Hadamard transform in image processing requires only real-number operations, which gives a computational advantage. Recently, however, certain degradation effects have been reported. In this paper we propose a WH matrix which retains the main properties of the Hadamard matrix. Computer-simulated images demonstrate the actual improvement in image transmission in the inner part of the picture. Orthogonal transforms are a useful tool in digital signal processing. As the size of the transmission block increases, however, the number of bits assigned to each data value must increase exponentially. Thus the SNR of the image tends to decline accordingly. As an attempt to increase the SNR, we propose the WH matrix, whose elements are $\pm$1, $\pm$2, $\pm$3 and whose unit form is an 8$\times$8 matrix.

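For reference, here is a minimal sketch of the ordinary (unweighted) 8×8 Hadamard transform of an image block. The weighted WH matrix proposed in the paper is not specified in the abstract and is not reproduced here. The Sylvester construction and the function names `hadamard` and `wht2d` are assumptions.

```python
def hadamard(n):
    """Sylvester-type Hadamard matrix of order n (n must be a power of 2).
    Entries are +1/-1 and the matrix is symmetric."""
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def wht2d(block, H):
    """2-D Hadamard transform H X H / n. Because H is symmetric and
    H H = n I, applying this function twice returns the original block."""
    n = len(H)
    T = matmul(matmul(H, block), H)
    return [[x / n for x in row] for row in T]

H8 = hadamard(8)
# Toy 8x8 "image" block with small integer pixel values.
block = [[(8 * r + c) % 7 for c in range(8)] for r in range(8)]
coeffs = wht2d(block, H8)       # forward transform
restored = wht2d(coeffs, H8)    # inverse transform (same operation)
```

Since every entry of `H8` is ±1, the matrix products need only additions and subtractions, which is the computational advantage the abstract refers to.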

Global Covariance based Principal Component Analysis for Speaker Identification (화자식별을 위한 전역 공분산에 기반한 주성분분석)

  • Seo, Chang-Woo;Lim, Young-Hwan
    • Phonetics and Speech Sciences / v.1 no.1 / pp.69-73 / 2009
  • This paper proposes an efficient global covariance-based principal component analysis (GCPCA) for speaker identification. Principal component analysis (PCA) is a feature extraction method which reduces the dimension of the feature vectors and the correlation among them by projecting the original feature space onto a small subspace through a transformation. However, conventional PCA requires a large amount of training data, because the eigenvalue and eigenvector matrices are found from a full covariance matrix for each speaker. The proposed method first calculates the global covariance matrix using the training data of all speakers. It then finds the eigenvalue matrix and the corresponding eigenvector matrix from the global covariance matrix. Compared to conventional PCA and Gaussian mixture model (GMM) methods, the proposed method shows better performance while requiring less storage space and lower complexity in speaker identification.

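The two steps named in the abstract (pool all speakers into one covariance matrix, then take its eigenvectors) can be sketched as follows. This is an illustration only: it finds just the leading eigenvector by power iteration, and the toy feature vectors, speaker labels, and function names are assumptions rather than the authors' setup.

```python
import math

def global_covariance(speakers):
    """Pool the feature vectors of all speakers and return (mean, covariance)."""
    pooled = [v for vecs in speakers.values() for v in vecs]
    n, d = len(pooled), len(pooled[0])
    mean = [sum(v[j] for v in pooled) / n for j in range(d)]
    cov = [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in pooled) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov

def top_eigenvector(cov, iters=200):
    """Leading eigenvector of a symmetric matrix by power iteration."""
    d = len(cov)
    w = [1.0] + [0.5] * (d - 1)   # arbitrary non-zero starting vector
    for _ in range(iters):
        w = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    return w

def project(vec, mean, w):
    """Coordinate of a feature vector along the shared principal axis."""
    return sum((x - m) * c for x, m, c in zip(vec, mean, w))

# Toy 2-D features for two hypothetical speakers.
speakers = {
    "spk_a": [[0.0, 0.0], [1.0, 1.1], [2.0, 1.9]],
    "spk_b": [[3.0, 3.2], [4.0, 3.9], [5.0, 5.1]],
}
mean, cov = global_covariance(speakers)
w = top_eigenvector(cov)
scores = {name: [project(v, mean, w) for v in vecs]
          for name, vecs in speakers.items()}
```

Because one transformation `w` is shared by all speakers, only a single eigenvector matrix has to be stored, which matches the storage saving the abstract claims.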

Pattern Extraction of Manufacturing Time Series Data Using Matrix Profile (매트릭스 프로파일을 이용한 제조 시계열 데이터 패턴 추출)

  • Kim, Tae-hyun;Jin, Kyo-hong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.10a / pp.210-212 / 2022
  • In the manufacturing industry, various sensors are attached to production facilities to monitor their status. In many cases, the data obtained through these sensors is time series data. To determine whether the status of a production facility is abnormal, patterns must first be extracted from the time series data, and various methods for doing so have been studied. In this paper, we use the matrix profile algorithm to extract patterns from the collected multivariate time series data. In this way, we extract patterns from the multi-sensor data currently being collected from a CNC machine.

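A matrix profile stores, for every subsequence of a time series, the distance to its nearest non-overlapping neighbor; low values mark repeated patterns and high values mark anomalies. The sketch below is a naive O(n²) univariate version. The paper works with multivariate sensor data and presumably a faster algorithm; the exclusion-zone width of `m // 2` and the toy series are assumptions.

```python
import math

def znorm(window):
    """Z-normalize a window so that only its shape is compared."""
    m = len(window)
    mean = sum(window) / m
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / m)
    if std == 0:
        return [0.0] * m          # flat window: no shape to compare
    return [(x - mean) / std for x in window]

def matrix_profile(series, m):
    """Return (profile, index): for each length-m subsequence, the distance
    to and the position of its nearest neighbor outside the exclusion zone."""
    n = len(series) - m + 1
    subs = [znorm(series[i:i + m]) for i in range(n)]
    excl = max(1, m // 2)         # skip trivial matches near position i
    profile, index = [math.inf] * n, [-1] * n
    for i in range(n):
        for j in range(n):
            if abs(i - j) <= excl:
                continue
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(subs[i], subs[j])))
            if d < profile[i]:
                profile[i], index[i] = d, j
    return profile, index

# Toy series: the pattern [0, 1, 3, 1, 0] occurs at positions 0 and 11.
series = [0, 1, 3, 1, 0, 5, 2, 7, 4, 9, 6, 0, 1, 3, 1, 0, 8, 3, 6, 2]
profile, index = matrix_profile(series, m=5)
motif_start = min(range(len(profile)), key=profile.__getitem__)
```

Here `motif_start` and `index[motif_start]` give the two occurrences of the repeated pattern.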

Context-aware Connectivity Analysis Method using Context Data Prediction Model in Delay Tolerant Networks (Delay Tolerant Networks에서 속성정보 예측 모델을 이용한 상황인식 연결성 분석 기법)

  • Jeong, Rae-Jin;Oh, Young-Jun;Lee, Kang-Whan
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.4 / pp.1009-1016 / 2015
  • In this paper, we propose the EPCM (Efficient Prediction-based Context-awareness Matrix) algorithm, which analyzes connectivity by predicting a cluster's context data, such as velocity and direction. In existing DTNs, unrestricted relay node selection increases delay and packet loss, and overhead arises from limited storage and capability. Therefore, the EPCM algorithm analyzes predicted context data using a context matrix and an adaptive revision weight, and selects the relay node by considering the connectivity between the cluster and the base station. The proposed algorithm saves context data to the context matrix, analyzes the context according to its variation, and predicts context data after revising it with the adaptive revision weight. Simulation results show that the EPCM algorithm achieves a high packet delivery ratio by selecting relay nodes according to the predicted context data matrix.

Comparative analysis on the distinctive functions and usability of bibliographic data analysis softwares (서지데이터 분석 툴에 대한 특성 및 편의성 비교분석)

  • Lee, Bang-rae;Lee, June;Yeo, Woon-dong;Lee, Chang-Hoan;Moon, Young-Ho;Kwon, Oh-jin
    • Proceedings of the Korea Contents Association Conference / 2007.11a / pp.501-505 / 2007
  • KISTI has recently developed KnowledgeMatrix, a stand-alone bibliographic data analysis software package. In this paper, we benchmark the performance of KnowledgeMatrix against well-known software packages such as VantagePoint and BibTechMon. We compare the distinctive functions and usability of each package across the categories Data, Matrix, Analysis, Visualization and Preprocessing. Test results show that each package has its own distinctive features, but that there are some performance gaps. Overall, KnowledgeMatrix performs better than the others.


Integration of Heterogeneous Models with Knowledge Consolidation (지식 결합을 이용한 서로 다른 모델들의 통합)

  • Bae, Jae-Kwon;Kim, Jin-Hwa
    • Korean Management Science Review / v.24 no.2 / pp.177-196 / 2007
  • For better predictions and classifications in customer recommendation, this study proposes an integrative model that efficiently combines the statistical and artificial intelligence models currently in use. In particular, this study suggests an integrative prediction model built by integrating models such as Association Rule, Frequency Matrix, and Rule Induction. The integrated models comprise four models: the ASFM model, which combines Association Rule (A) and Frequency Matrix (B); the ASRI model, which combines Association Rule (A) and Rule Induction (C); the FMRI model, which combines Frequency Matrix (B) and Rule Induction (C); and the ASFMRI model, which combines Association Rule (A), Frequency Matrix (B), and Rule Induction (C). The data set for the tests was collected from convenience store G, the number-one brand in South Korea. It contains sales information on customer transactions from September 1, 2005 to December 7, 2005. About 1,000 transactions were selected for a specific item. Using this data set, the study builds an integrated model that predicts whether or not a customer buys a specific product, for use in target marketing strategy. The performance of the integrated model is compared with that of the other models. The experimental results show that the integrated model outperforms all of the individual models, namely Association Rule, Frequency Matrix, and Rule Induction.