• Title/Summary/Keyword: Matrix multiplication

Secure Outsourced Computation of Multiple Matrix Multiplication Based on Fully Homomorphic Encryption

  • Wang, Shufang;Huang, Hai
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.11 / pp.5616-5630 / 2019
  • Fully homomorphic encryption allows a third party to perform arbitrary computation over encrypted data and is especially suitable for secure outsourced computation. This paper investigates secure outsourced computation of multiple matrix multiplication based on fully homomorphic encryption. Our work significantly improves on the recent work of Mishra et al. We improve their matrix encoding method by introducing a column-order matrix encoding method that requires smaller parameters. This enables us to develop a binary multiplication method for multiple matrix multiplication, which multiplies adjacent matrices pairwise in a tree structure instead of sequentially from left to right as in Mishra et al.'s method. The binary multiplication method results in a logarithmic-depth circuit and is therefore much more efficient than the sequential method, which has a linear-depth circuit. Experimental results show that for the product of ten 32×32 (64×64) square matrices our method takes only several thousand seconds, while Mishra et al.'s method would take tens of thousands of years, which is impractical. In addition, we generalize our result from square matrices to non-square matrices. Experimental results show that the binary multiplication method and the classical dynamic programming method have similar performance for the multiplication of ten non-square matrices.
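The difference between the sequential and the binary (tree) ordering can be illustrated on plain, unencrypted matrices. The following is a minimal sketch, not the paper's encrypted implementation: the function names are hypothetical, matrices are ordinary lists of rows, and "depth" simply counts how many multiplications lie on the longest path to the result.

```python
def matmul(a, b):
    """Plain (unencrypted) product of two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must agree"
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
            for row in a]


def sequential_product(mats):
    """Left-to-right chain: depth grows linearly with the number of matrices."""
    result, depth = mats[0], 0
    for m in mats[1:]:
        result = matmul(result, m)
        depth += 1
    return result, depth


def binary_product(mats):
    """Tree-shaped chain: adjacent matrices are multiplied pairwise, level by level."""
    level, depth = list(mats), 0
    while len(level) > 1:
        nxt = [matmul(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # the unpaired last matrix moves up unchanged
        level, depth = nxt, depth + 1
    return level[0], depth


# Ten small illustrative matrices (the paper uses 32x32 and 64x64 inputs).
mats = [[[1, i], [0, 1]] for i in range(1, 11)]
seq, seq_depth = sequential_product(mats)
tree, tree_depth = binary_product(mats)
```

Both orderings give the same product because matrix multiplication is associative and the pairing keeps the left-to-right order. For ten matrices the sequential chain has depth 9 and the tree has depth 4.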

Probability distribution-based approximation matrix multiplication simplification algorithm (확률분포 생성을 통한 근사 행렬 곱셈 간소화 방법)

  • Kwon, Oh-Young;Seo, Kyoung-Taek
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.11 / pp.1623-1629 / 2022
  • Matrix multiplication is a fundamental operation widely used in science and engineering. Approximate matrix multiplication is one way to reduce its computational cost. It determines an appropriate probability distribution over the columns and rows of the matrices and computes an approximate product from the columns and rows selected according to this distribution. Conventionally, the probability distribution is generated by considering both matrices A and B participating in the multiplication. In this paper, we propose a method that generates the probability distribution for selecting columns and rows using only matrix A. Approximate matrix multiplication was performed on 1000×1000 to 5000×5000 matrices using the existing and proposed methods. The proposed method yielded results closer to the exact matrix multiplication result than the conventional method, by 0.02% to 2.34% on average.
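Sampling-based approximate multiplication can be sketched as follows. This is the standard randomized estimator with a distribution built from A alone (probability proportional to the squared norm of each column of A). That choice is an assumption made for illustration and is not necessarily the distribution proposed in the paper. All function names are hypothetical.

```python
import random


def column_norm_probs(a):
    """Sampling distribution built from A only: p_k proportional to ||A[:, k]||^2."""
    weights = [sum(row[k] ** 2 for row in a) for k in range(len(a[0]))]
    total = sum(weights)
    return [w / total for w in weights]


def approx_matmul(a, b, samples, rng):
    """Estimate A @ B from `samples` randomly selected column/row pairs."""
    probs = column_norm_probs(a)
    n, m = len(a), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for k in rng.choices(range(len(probs)), weights=probs, k=samples):
        scale = 1.0 / (samples * probs[k])  # rescaling keeps the estimate unbiased
        for i in range(n):
            aik = a[i][k] * scale
            for j in range(m):
                out[i][j] += aik * b[k][j]  # outer product of column k of A and row k of B
    return out


# Degenerate example: only column 0 of A is nonzero, so every sample picks k = 0
# and the estimate equals the exact product.
a = [[2, 0], [3, 0]]
b = [[1, 4], [5, 6]]
estimate = approx_matmul(a, b, samples=4, rng=random.Random(0))
```

With general inputs the estimate is random. Its error shrinks as `samples` grows, and the cost per sample is one outer product rather than a full multiplication.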

Parallel Algorithm for Matrix-Matrix Multiplication on the GPU (GPU 기반 행렬 곱셈 병렬처리 알고리즘)

  • Park, Sangkun
    • Journal of Institute of Convergence Technology / v.9 no.1 / pp.1-6 / 2019
  • Matrix multiplication is a fundamental mathematical operation with numerous applications across most scientific fields. In this paper, we present a parallel GPU algorithm for dense matrix-matrix multiplication using OpenGL compute shaders, which can serve as a fundamental building block for many high-performance computing applications. Experimental results on an NVIDIA Quad 4000 show that the proposed algorithm runs about 208 times faster than the previous CPU algorithm and achieves 75 GFLOPS in single precision for dense matrices of size 4,096. This performance shows that our algorithm is practical for real applications.

A Parallel-Architecture Processor Design for the Fast Multiplication of Homogeneous Transformation Matrices (Homogeneous Transformation Matrix의 곱셈을 위한 병렬구조 프로세서의 설계)

  • Kwon Do-All;Chung Tae-Sang
    • The Transactions of the Korean Institute of Electrical Engineers D / v.54 no.12 / pp.723-731 / 2005
  • The $4{\times}4$ homogeneous transformation matrix is a compact representation of the orientation and position of an object in robotics and computer graphics. A coordinate transformation is accomplished through successive multiplications of homogeneous matrices, each of which represents the orientation and position of the corresponding link. Thus, real-time control applications in robotics and animation in computer graphics demand fast multiplication of homogeneous matrices. In this paper, a parallel-architecture vector processor is designed for this purpose. The processor has several key features. For computational accuracy in real applications, its operands are floating-point numbers conforming to IEEE Standard 754. For parallelism and reduced hardware redundancy, the processor takes column vectors of homogeneous matrices as its unit of multiplication. To further improve throughput, the processor structure and its control are pipelined. Since the designed processor can be used as a special-purpose coprocessor in robotics and computer graphics, it includes, in addition to special matrix/matrix and matrix/vector multiplication, several other useful instructions for various transformation algorithms. The suggested instruction set is intended to serve as a standard for future processor designs in robotics and computer graphics. The design is verified using an FPGA implementation. The performance improvement of the proposed design over a uniprocessor approach is also studied to assess its suitability for real-time applications.
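The chaining of link transforms, and the idea of treating a column vector as the unit of multiplication, can be sketched in software. This illustrates the arithmetic only and does not model the processor's instruction set or pipeline. The helper names and the two-link arm are hypothetical.

```python
import math


def rot_z(theta):
    """4x4 homogeneous rotation about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]


def translate(x, y, z):
    """4x4 homogeneous translation."""
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]


def mat_vec(a, v):
    """Matrix/vector product: the basic unit of work."""
    return [sum(a[i][k] * v[k] for k in range(4)) for i in range(4)]


def compose(a, b):
    """A @ B computed one column of B at a time."""
    cols = [mat_vec(a, [b[r][j] for r in range(4)]) for j in range(4)]
    return [[cols[j][i] for j in range(4)] for i in range(4)]


def chain(links):
    """Successive multiplication of per-link transforms, base to tip."""
    total = translate(0.0, 0.0, 0.0)  # identity matrix
    for t in links:
        total = compose(total, t)
    return total


# Two-link planar arm: each joint rotates 90 degrees about z, each link is 1 unit long.
link = compose(rot_z(math.pi / 2), translate(1.0, 0.0, 0.0))
tip = mat_vec(chain([link, link]), [0.0, 0.0, 0.0, 1.0])
```

The four `mat_vec` calls inside `compose` are independent of one another, which is what makes a column-wise formulation suitable for parallel or pipelined hardware. Here `tip` comes out at approximately (-1, 1, 0), up to floating-point rounding.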

A Hybrid Approach on Matrix Multiplication

  • Tolentino Maribel;Kim Myung-Kyu;Chae Soo-Hoan
    • Proceedings of the Korean Information Science Society Conference / 2006.06a / pp.400-402 / 2006
  • Matrix multiplication is an important problem in linear algebra. Its main significance for combinatorial algorithms is its equivalence to a variety of other problems, such as transitive closure and reduction, solving linear systems, and matrix inversion. Thus the development of high-performance matrix multiplication implies faster algorithms for all of these problems. In this paper, we present a quantitative comparison of the theoretical and empirical performance of key matrix multiplication algorithms and use our analysis to develop a faster algorithm. We propose a hybrid approach combining Winograd's and Strassen's algorithms that improves performance, and we discuss the performance of the hybrid Winograd-Strassen algorithm. Since Strassen's algorithm is based on a $2{\times}2$ matrix multiplication, its recursive nature makes the implementation very slow for larger matrices. Because we cannot obtain the theoretical threshold value of Strassen's algorithm, we determine the threshold that optimizes the use of Strassen's algorithm in nodes through various experiments and summarize the results in a table and graphs.
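The role of the threshold can be sketched with a plain Strassen recursion that falls back to the conventional algorithm below a cut-off size. This is a generic textbook sketch, not the authors' hybrid Winograd-Strassen method. The default threshold is an arbitrary placeholder, since the abstract says the practical value was found experimentally.

```python
def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]


def naive(a, b):
    """Conventional cubic-time product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def split(m):
    """Split a square matrix of even size into four quadrants."""
    h = len(m) // 2
    return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
            [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])


def strassen(a, b, threshold=2):
    """Strassen recursion that switches to `naive` at or below `threshold`."""
    n = len(a)
    if n <= threshold or n % 2:  # small or odd-sized blocks use the conventional product
        return naive(a, b)
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)
    m1 = strassen(add(a11, a22), add(b11, b22), threshold)
    m2 = strassen(add(a21, a22), b11, threshold)
    m3 = strassen(a11, sub(b12, b22), threshold)
    m4 = strassen(a22, sub(b21, b11), threshold)
    m5 = strassen(add(a11, a12), b22, threshold)
    m6 = strassen(sub(a21, a11), add(b11, b12), threshold)
    m7 = strassen(sub(a12, a22), add(b21, b22), threshold)
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    return [r + s for r, s in zip(c11, c12)] + [r + s for r, s in zip(c21, c22)]


a = [[(3 * i + j) % 5 for j in range(8)] for i in range(8)]
b = [[(i - 2 * j) % 7 for j in range(8)] for i in range(8)]
c_fast = strassen(a, b, threshold=2)
c_full_recursion = strassen(a, b, threshold=1)
```

Each recursion level replaces eight block products with seven but adds many block additions, so very small blocks are cheaper to multiply conventionally. Raising `threshold` trades recursion overhead for more conventional work.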

Matrix Multiplication Acceleration with GPU and Locality (GPU와 지역성을 이용한 행렬 곱셈 가속)

  • Kwon, Oh-Young;Lee, Chang-Mug
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2009.10a / pp.902-903 / 2009
  • Matrix multiplication is widely used in science and engineering. Locality can improve the execution performance of matrix multiplication. This paper presents a method for accelerating matrix multiplication that uses the computing power of both the CPU and the GPU in a PC. The presented method improved execution time by about 15~30% compared with a method that uses only the GPU.

An Analytical Evaluation of 2D Mesh-connected SIMD Architecture for Parallel Matrix Multiplication (2D Mesh SIMD 구조에서의 병렬 행렬 곱셈의 수치적 성능 분석)

  • Kim, Cheong-Ghil
    • Journal of The Institute of Information and Telecommunication Facilities Engineering / v.10 no.1 / pp.7-13 / 2011
  • Matrix multiplication is a fundamental operation of linear algebra and arises in many areas of science and engineering. This paper introduces an efficient parallel matrix multiplication scheme on an $N{\times}N$ mesh-connected SIMD array processor, called multiple hierarchical SIMD architecture (HMSA). The architectural characteristic of HMSA is its hierarchically structured control units, which consist of a global control unit, N local control units configured diagonally, and $N^2$ processing elements (PEs) arranged in an $N{\times}N$ array. PEs communicate through local buses connecting four adjacent neighbor PEs in mesh-torus networks and through global buses running across the rows and columns, called horizontal buses and vertical buses, respectively. This architecture gives HMSA diagonally indexed concurrent broadcast and alternating access to either the rows (row control mode) or the columns (column control mode) of the 2D PE array. An algorithmic mapping method is used for performance evaluation by mapping matrix multiplication onto the proposed architecture. The asymptotic time complexities are evaluated, and the results show that parallel matrix multiplication on HMSA can provide significant performance improvement.

A Study on the Efficient Multiplication with All m$\times$k Boolean Matrices (모든 m$\times$k 불리언 행렬과의 효율적 곱셈에 관한 연구)

  • Han, Jae-Il
    • The Journal of the Korea Contents Association / v.6 no.2 / pp.27-33 / 2006
  • Boolean matrices are applied in a variety of areas and used successfully in many applications, and there is much research on them. Most of this research deals with the multiplication of boolean matrices, but it focuses on the multiplication of two boolean matrices, and very little deals with the multiplication between many n$\times$m boolean matrices and all m$\times$k boolean matrices. The paper discusses why the existing optimal algorithms for the multiplication of two boolean matrices are not suitable for the multiplication between an n$\times$m boolean matrix and all m$\times$k boolean matrices, establishes a theory that enables the efficient multiplication of an n$\times$m boolean matrix and all m$\times$k boolean matrices, and shows the execution results of a multiplication algorithm designed with this theory.
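To make the problem size concrete, the sketch below multiplies one n×m boolean matrix by every m×k boolean matrix by brute force. It is only a baseline and does not implement the theory developed in the paper. The bitmask row encoding (bit t of a row is column t) and the function names are assumptions made for illustration.

```python
from itertools import product


def bool_matmul(a_rows, b_rows):
    """Boolean product with each row packed into an int bitmask.

    a_rows: n ints of m bits; bit t of a row selects row t of B.
    b_rows: m ints of k bits. Returns n ints of k bits.
    """
    out = []
    for a in a_rows:
        acc, t = 0, 0
        while a:
            if a & 1:
                acc |= b_rows[t]  # OR plays the role of addition, AND of multiplication
            a >>= 1
            t += 1
        out.append(acc)
    return out


def products_with_all(a_rows, m, k):
    """Brute-force baseline: pair A with each of the 2**(m*k) m-by-k boolean matrices."""
    for b_rows in product(range(2 ** k), repeat=m):
        yield b_rows, bool_matmul(a_rows, b_rows)


# A = [[1, 0], [1, 1]] and B = [[1, 0], [0, 1]], with column 0 stored in the lowest bit.
example = bool_matmul([0b01, 0b11], [0b01, 0b10])
count = sum(1 for _ in products_with_all([0b01, 0b11], m=2, k=2))
```

The number of right-hand matrices is 2^(m·k), so it grows exponentially with the matrix dimensions. This is why an approach tuned for a single pair of matrices does not carry over directly to this setting.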

Efficient Multiplication of Boolean Matrices and Algorithm for D-Class Computation (D-클래스 계산을 위한 불리언 행렬의 효율적 곱셈 및 알고리즘)

  • Han, Jae-Il;Shin, Bum-Joo
    • Journal of Korea Society of Industrial Information Systems / v.12 no.2 / pp.68-78 / 2007
  • A D-class is defined as a set of equivalent $n{\times}n$ boolean matrices according to a given equivalence relation. D-class computation requires the multiplication of three boolean matrices for each of all possible triples of $n{\times}n$ boolean matrices. However, almost all research on boolean matrices has focused on the efficient multiplication of only two boolean matrices, and only a few recent studies deal with the multiplication of all boolean matrices. The paper suggests a mathematical theory that enables the efficient multiplication of all possible boolean matrix triples and the efficient computation of all D-classes, and discusses algorithms designed with the theory and their execution results.

Optimizing 2-stage Tiling-based Matrix Multiplication in FPGA-based Neural Network Accelerator (FPGA기반 뉴럴네트워크 가속기에서 2차 타일링 기반 행렬 곱셈 최적화)

  • Jinse, Kwon;Jemin, Lee;Yongin, Kwon;Jeman, Park;Misun, Yu;Taeho, Kim;Hyungshin, Kim
    • IEMEK Journal of Embedded Systems and Applications / v.17 no.6 / pp.367-374 / 2022
  • The acceleration of neural networks has become an important topic in the field of computer vision. An accelerator is essential for speeding up lightweight models. Most accelerator-supported operators focus on direct convolution operations. If the accelerator does not provide a GEMM operation, it is usually replaced by a CPU operation. In this paper, we propose an optimization technique for 2-stage tiling-based GEMM routines on VTA. We improve the performance of the matrix multiplication routine by maximizing the reuse of the input matrix and optimizing operation pipelining. In addition, we applied the proposed technique to the DarkNet framework to check the performance improvement of the matrix multiplication routine. The proposed GEMM method showed a performance improvement of more than 2.4 times compared to the non-optimized GEMM method. The inference performance of our DarkNet framework also improved by at least 2.3 times.
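Two-stage tiling can be sketched in software as a loop nest with an outer and an inner tile size. This is a generic illustration under assumed tile sizes. It does not model VTA's buffers, instruction pipeline, or the authors' specific optimizations, and the function names are hypothetical.

```python
def tiles(start, stop, size):
    """Yield (lo, hi) bounds of consecutive tiles covering [start, stop)."""
    for lo in range(start, stop, size):
        yield lo, min(lo + size, stop)


def tiled_gemm(a, b, outer=4, inner=2):
    """C = A @ B with two levels of tiling over rows, columns, and the shared dimension."""
    n, depth, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    # Stage 1: large tiles, e.g. a block that would be staged in a larger buffer.
    for i0, i1 in tiles(0, n, outer):
        for j0, j1 in tiles(0, m, outer):
            for k0, k1 in tiles(0, depth, outer):
                # Stage 2: small tiles inside the current large tile.
                for ia, ib in tiles(i0, i1, inner):
                    for ja, jb in tiles(j0, j1, inner):
                        for ka, kb in tiles(k0, k1, inner):
                            for i in range(ia, ib):
                                for kk in range(ka, kb):
                                    aik = a[i][kk]  # reused across the whole inner row of C
                                    for j in range(ja, jb):
                                        c[i][j] += aik * b[kk][j]
    return c


# Sizes deliberately not divisible by the tile sizes to exercise the edge tiles.
a = [[i + 2 * k for k in range(3)] for i in range(5)]
b = [[k - j for j in range(7)] for k in range(3)]
c_tiled = tiled_gemm(a, b)
c_ref = [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(7)] for i in range(5)]
```

Tiling changes only the order in which the partial products are accumulated, so the result matches the untiled reference. The benefit on real hardware comes from each loaded tile being reused many times before it is evicted.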