• Title/Abstract/Keywords: Matrix Multiplication

169 search results (processing time: 0.021 seconds)

The Limit Distribution of an Invariant Test Statistic for Multivariate Normality

  • Kim, Namhyun
    • Communications for Statistical Applications and Methods / Vol. 12, No. 1 / pp. 71-86 / 2005
  • Testing for normality has always been an important part of statistical methodology. This paper proposes a test statistic for multivariate normality. The underlying idea is to examine all linear combinations that reduce to the standard normal distribution under the null hypothesis and to compare their order statistics with the theoretical normal quantiles. The suggested statistic is invariant under nonsingular matrix multiplication and vector addition. We show that the limit distribution of an approximation to the suggested statistic can be represented as the supremum over an index set of the integral of a suitable Gaussian process.

공분산형 ARMA 고속 Transversal 필터에 관한 연구 (A Covariance Type ARMA Fast Transversal Filter)

  • 이철희;장영수
    • 한국음향학회지 / Vol. 11, No. 1 / pp. 67-79 / 1992
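  • A covariance-type ARMA fast transversal filter algorithm is proposed for online ARMA coefficient estimation suited to adaptive or real-time processing. For an ARMA model, the shift-invariance property of the correlation matrix holds within each block; the proposed algorithm exploits this to implement ELS (Extended Least Squares) for the covariance case as a fast time-update algorithm. The derivation uses a geometric approach based on projection operators. The proposed algorithm requires 13N+37 MADPR of computation and allows the AR and MA parts to have different orders.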

Optimization of ARIA Block-Cipher Algorithm for Embedded Systems with 16-bits Processors

  • Lee, Wan Yeon;Choi, Yun-Seok
    • International Journal of Internet, Broadcasting and Communication / Vol. 8, No. 1 / pp. 42-52 / 2016
  • In this paper, we propose a 16-bit optimized design of the ARIA block-cipher algorithm for embedded systems with 16-bit processors. The proposed design uses 16-bit XOR operations and rotate-shift operations wherever possible. It also extends 8-bit array variables into 16-bit array variables for faster chained matrix multiplication. In evaluation experiments, our design is compared with the previous 32-bit optimized design and 8-bit optimized design. Our 16-bit optimized design yields about 20% faster execution and about a 28% smaller footprint than the 32-bit optimized code. It also yields about 91% faster execution than the 8-bit optimized code, at the cost of a larger footprint.

The Limit Distribution and Power of a Test for Bivariate Normality

  • Kim, Namhyun
    • Communications for Statistical Applications and Methods / Vol. 9, No. 1 / pp. 187-196 / 2002
  • Testing for normality has always been a focus of practical and theoretical interest in statistical research. This paper proposes a test statistic for bivariate normality. The underlying idea is to examine all linear combinations that reduce to the standard normal distribution under the null hypothesis and to compare their order statistics with the theoretical normal quantiles. The suggested statistic is invariant under nonsingular matrix multiplication and vector addition. We show that the limit distribution of an approximation to the suggested statistic can be represented as the supremum over an index set of the integral of a suitable Gaussian process. We also simulate the null distribution of the statistic and give some critical values of the distribution along with power results.

백색광주사간섭무늬의 정점검출을 위한 이차다항식맞춤 알고리즘 (Quadratic polynomial fitting algorithm for peak point detection of white light scanning interferograms)

  • 박민철;김승우
    • 한국광학회지 / Vol. 9, No. 4 / pp. 245-250 / 1998
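  • This paper proposes a new digital processing algorithm for peak detection in white-light scanning interferograms. The algorithm assumes that the visibility function of a white-light scanning interferogram is a quadratic polynomial and locates the peak of the visibility function by directly curve-fitting the measured light intensity values with the least-squares method. Compared with existing peak-detection algorithms, this quadratic polynomial fitting algorithm needs no separate computation to extract the visibility function, so it completes with a small multiplication count of only 3N+29. In addition, the use of the least-squares method effectively suppresses external disturbances in the interferogram and yields a stable solution.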
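
The general idea of locating a peak by least-squares quadratic fitting can be sketched as follows. This is only an illustrative sketch, not the paper's 3N+29 algorithm: the function name `quadratic_peak` and the sample data are hypothetical, and the sketch fits a parabola directly to (position, value) samples rather than modeling the interferogram intensity.

```python
def quadratic_peak(xs, ys):
    """Least-squares fit y ~ a*x^2 + b*x + c; return the vertex x = -b / (2a).

    Hypothetical helper for illustration only.
    """
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three (x, y) samples of equal length")

    # Power sums for the normal equations of the quadratic fit.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]

    # Normal equations M @ [a, b, c] = r.
    M = [[s[4], s[3], s[2]],
         [s[3], s[2], s[1]],
         [s[2], s[1], s[0]]]
    r = [t[2], t[1], t[0]]

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    def with_column(col):
        # Copy of M with column `col` replaced by r (Cramer's rule).
        return [[r[i] if j == col else M[i][j] for j in range(3)]
                for i in range(3)]

    d = det3(M)
    if d == 0:
        raise ValueError("degenerate sample positions")
    a = det3(with_column(0)) / d
    b = det3(with_column(1)) / d
    if a >= 0:
        raise ValueError("fitted parabola has no maximum")
    return -b / (2 * a)


# Samples taken from a parabola whose maximum is at x = 2.5.
positions = [0, 1, 2, 3, 4]
values = [10 - (x - 2.5) ** 2 for x in positions]
print(quadratic_peak(positions, values))
```

Cramer's rule is used here only to keep the sketch dependency-free; a numerical library's least-squares routine would be the more usual choice for noisy data.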

인공지능 컴퓨팅 프로세서 반도체 동향과 ETRI의 자율주행 인공지능 프로세서 (Trends in AI Computing Processor Semiconductors Including ETRI's Autonomous Driving AI Processor)

  • 양정민;권영수;강성원
    • 전자통신동향분석 / Vol. 32, No. 6 / pp. 57-65 / 2017
  • Neural network based AI computing is a promising technology that mirrors human recognition and decision-making. Early AI computing processors were composed of GPUs and CPUs; however, the dramatic increase in floating-point operations calls for an energy-efficient AI processor with a highly parallelized architecture. In this paper, we analyze the trends in processor architectures for AI computing. Some architectures are still built from GPUs; however, they reduce the size of each processing unit by allowing half-precision operations and thereby raise the processing-unit density. Other architectures concentrate on matrix multiplication and require dedicated hardware for fast vector operations. Finally, we propose our own inAB processor architecture and introduce domestic cutting-edge processor design capabilities.

밀결합 멀티프로세서 시스템의 구현 및 성능평가 (Implementation and Performance Evaluation of a Tightly-Coupled Multiprocessor System)

  • 김덕진;김영천;박석천
    • 대한전자공학회논문지 / Vol. 24, No. 5 / pp. 777-785 / 1987
  • In this paper, a tightly-coupled multiprocessor system is implemented with four processing elements (PEs) based on the MC68000 CPU, a common memory (128 KB), and a single time-shared bus. The multitasking operating system, MTOS, is modified so that the multiprocessor system can support multitasking and multiprocessing. The performance of the proposed system is evaluated by stochastic Petri net system modeling. The efficiency and the processing power are simulated for various load factors and for up to 16 PEs. By running benchmark programs such as quicksort, FFT, and matrix multiplication, the speed of parallel processing is compared with that of a single processor.

Derivation Algorithm of State-Space Equation for Production Systems Based on Max-Plus Algebra

  • Goto, Hiroyuki;Masuda, Shiro
    • Industrial Engineering and Management Systems / Vol. 3, No. 1 / pp. 1-11 / 2004
  • This paper proposes a new algorithm for determining an optimal control input for production systems. In many production systems, completion times should be planned within the due dates by taking into account precedence constraints and processing times. The max-plus algebra is an effective approach to this problem. It is an algebraic system in which the max operation plays the role of addition and the plus operation plays the role of multiplication, and it follows operation rules similar to those of conventional algebra. Using the max-plus algebra, the constraints of the system are expressed in a way analogous to the state-space description in modern control theory. However, the formulation of a system is currently performed manually, which is very inefficient when applied to practical systems. Hence, in this paper, we propose a new algorithm for deriving a state-space description and determining an optimal control input with several constraint matrices and parameter vectors. Furthermore, the effectiveness of the proposed algorithm is verified through execution examples.
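
The max-plus operations described in this abstract can be illustrated with a small matrix product in which `max` replaces the sum and `+` replaces the product. This is a minimal sketch of the algebra only, not the paper's derivation algorithm; the function name `maxplus_matmul` and the example values are hypothetical.

```python
NEG_INF = float("-inf")  # neutral element of max, i.e. the max-plus "zero"


def maxplus_matmul(A, B):
    """Max-plus product: C[i][j] = max over k of (A[i][k] + B[k][j])."""
    inner = len(B)
    if any(len(row) != inner for row in A):
        raise ValueError("inner dimensions do not match")
    cols = len(B[0])
    return [[max(A[i][k] + B[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(len(A))]


# Hypothetical two-step example: A[i][k] is a processing time linking step k
# to step i (NEG_INF means no dependency); x holds earliest start times.
A = [[2, NEG_INF],
     [3, 4]]
x = [[0],
     [1]]
print(maxplus_matmul(A, x))  # [[2], [5]]
```

The second entry is max(3 + 0, 4 + 1) = 5: a step can only complete once the slowest of its predecessors has finished, which is why this algebra suits precedence constraints.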

멀티플렉서 기반의 비트 연속 승산기를 이용한 시스톨릭 어레이 띠 행렬 승산기 구현 (Implementation of the Systolic Array for Band Matrix Multiplication using Mutiplexer-based Bit-serial Multiplier)

  • 한영욱;김진만;유명근;송기용
    • 융합신호처리학회 학술대회논문집 / Proceedings of the 2003 Summer Conference, 한국신호처리시스템학회 / pp. 288-291 / 2003
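  • This paper describes the implementation of a bit-serial multiplier for two band matrices using a systolic array with modularity and scalability. Given 4$\times$4 band matrices with a bandwidth of 3, a two-dimensional systolic array is derived from the three-dimensional DG for the word-level multiplier design. The word-level PEs are then converted into bit-level PEs using bit-serial multipliers and adders, giving a bit-level multiplier for band matrices. The implemented word-level and bit-level multipliers were modeled in VHDL at the RT level and their operation was verified. The verified systolic-array-based word-level and bit-level multipliers were synthesized with Synopsys Design Compiler using the 0.35$\mu\textrm{m}$ cell library provided by Hynix.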
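
As a software analogue of the arithmetic involved, the sketch below multiplies two band matrices while visiting only the entries inside the band. It does not model the systolic array, the PEs, or bit-serial arithmetic; the function name `band_matmul`, the `half_bw` parameter, and the matrix values are hypothetical. A bandwidth of 3 corresponds to `half_bw = 1` (tridiagonal).

```python
def band_matmul(A, B, half_bw):
    """Multiply two n x n band matrices stored densely.

    Entries with |row - col| > half_bw are assumed to be zero in both
    inputs, so the loops skip them.
    """
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        # A[i][k] can be nonzero only when |i - k| <= half_bw.
        for k in range(max(0, i - half_bw), min(n, i + half_bw + 1)):
            # B[k][j] can be nonzero only when |k - j| <= half_bw.
            for j in range(max(0, k - half_bw), min(n, k + half_bw + 1)):
                C[i][j] += A[i][k] * B[k][j]
    return C


# 4x4 band matrices with bandwidth 3, matching the size in the abstract.
A = [[1, 2, 0, 0],
     [3, 4, 5, 0],
     [0, 6, 7, 8],
     [0, 0, 9, 10]]
B = [[1, 1, 0, 0],
     [1, 2, 1, 0],
     [0, 1, 3, 1],
     [0, 0, 1, 4]]
for row in band_matmul(A, B, half_bw=1):
    print(row)
```

The product of two matrices of bandwidth 3 generally has bandwidth 5, so the result fills two more diagonals than either input.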

인공지능 응용을 위한 하이브리드 메모리 설계 탐색 기법 (An Design Exploration Technique of a Hybrid Memory for Artificial Intelligence Applications)

  • 조두산
    • 한국산업융합학회 논문집 / Vol. 24, No. 5 / pp. 531-536 / 2021
  • As artificial intelligence technology advances, it is being applied to a variety of fields. Artificial intelligence performs well in image recognition and classification, and chip designs specialized for this field are being actively studied. Artificial intelligence-specific chips are designed to provide optimal performance for their applications. In the design task, memory component optimization is becoming an important issue. This study presents an optimal algorithm for memory size exploration; the optimal memory size is becoming an important factor in providing a proper design that meets the requirements of performance, cost, and power consumption.