• Title/Summary/Keyword: m-parallel (m-병렬)

Search Result 787, Processing Time 0.026 seconds

A Parallel Implementation of Purge Process for Lustre File System (Lustre 파일 시스템을 위한 Purge 기능의 병렬화 구현)

  • Kwon, Min-Woo;Yoon, Jun-Weon;Hong, Tae-Young;Park, Chan-Yeol
    • Proceedings of the Korea Information Processing Society Conference / 2016.10a / pp.64-65 / 2016
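  • Supercomputers rely on high-performance parallel file systems such as Lustre to manage large volumes of data efficiently. On supercomputers accessed by many users, such as the second-phase Tachyon system (the fourth supercomputer at the Korea Institute of Science and Technology Information), user data accumulates without bound, which degrades the performance of the Lustre file system. To keep user data from accumulating, this paper implements a purge function that automatically deletes data that has not been used for a long period. In particular, to cope with the exponentially growing capacity of parallel file systems, a high-speed purge function is implemented using parallel computing techniques. Deleting 1,517 GB took 221.2 seconds on a single compute node and 49.9 seconds in a parallel environment with 16 compute nodes, a speedup of about 4.4 times over the single-node implementation.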
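
The purge idea in this abstract can be illustrated with a minimal sketch. It assumes that "unused" means a last-access time older than a cutoff, and it uses threads on one machine as a stand-in for the paper's 16 compute nodes. The function names and the `max_age_days` parameter are hypothetical and not taken from the paper.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def find_stale_files(root, max_age_days, now=None):
    """Return paths under root whose last access time is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_atime < cutoff:
                    stale.append(path)
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return stale


def _remove_chunk(paths):
    """Delete one worker's share of the files and return how many were removed."""
    removed = 0
    for path in paths:
        try:
            os.remove(path)
            removed += 1
        except OSError:
            pass  # already gone or not permitted; leave it
    return removed


def parallel_purge(root, max_age_days, workers=4, now=None):
    """Purge stale files, splitting the deletions round-robin across worker threads."""
    stale = find_stale_files(root, max_age_days, now)
    chunks = [stale[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(_remove_chunk, chunks))
```

The reported timings correspond to a speedup of 221.2 / 49.9 ≈ 4.4 on 16 nodes. A sketch like this would not reproduce that figure, since the real system distributes work across nodes and Lustre servers.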

A Parallel Processing System for Visual Media Applications (시각매체를 위한 병렬처리 시스템)

  • Lee, Hyung;Park, Jong-Won
    • The Journal of Korean Institute of Communications and Information Sciences / v.27 no.1A / pp.80-88 / 2002
  • Visual media (image, graphic, and video) processing poses challenges from several perspectives, particularly real-time implementation and scalability. Several approaches have been taken to obtain the speedups that multimedia processing demands, ranging from media processors to special-purpose implementations, and these implementations adopt a variety of parallel processing strategies. We have investigated a parallel processing system for improving the processing speed of visual-media applications. The proposed system is similar to a pipelined memory system (MAMS). The multi-access memory system consists of m memory modules and a memory controller that performs parallel memory access to $1\times pq$, $pq\times 1$, and $p\times q$ subarrays, which improves both cost and control complexity. Facial recognition, Phong shading, and automatic segmentation of moving objects in image sequences have been implemented on the parallel processing system and achieved satisfactory processing speeds. This paper describes the parallel processing system and its use in these three time-consuming applications.
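
To make the multi-access memory idea concrete, the sketch below checks whether a module-assignment function lets every $1\times pq$, $pq\times 1$, and $p\times q$ subarray be fetched in one parallel access, meaning no two of its elements fall in the same memory module. The skewing function `(i*q + j) mod m` is an assumption chosen for illustration, since the abstract does not give the paper's actual function. It is conflict-free whenever m is a prime greater than pq.

```python
from itertools import product


def module_of(i, j, q, m):
    """Assumed skewing function: index of the memory module holding element (i, j)."""
    return (i * q + j) % m


def subarray_cells(i, j, shape):
    """Coordinates of a rows x cols subarray whose top-left corner is (i, j)."""
    rows, cols = shape
    return [(i + a, j + b) for a in range(rows) for b in range(cols)]


def is_conflict_free(p, q, m, size=16):
    """True if every 1 x pq, pq x 1 and p x q access touches pq distinct modules."""
    shapes = [(1, p * q), (p * q, 1), (p, q)]
    for shape in shapes:
        # Try every base position inside a size x size window of the array.
        for i, j in product(range(size), repeat=2):
            cells = subarray_cells(i, j, shape)
            modules = {module_of(r, c, q, m) for r, c in cells}
            if len(modules) != p * q:
                return False  # two elements collided in one module
    return True
```

With p = q = 2, five modules pass the check while four do not, because a column access then maps rows that differ by two onto the same module.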

Development of Thin and Parallel XYθ Alignment Stage (박형 병렬구조 XYθ 정렬 스테이지 개발)

  • Kang, Dong-Bae;Ahn, Jung-Hwan;Son, Seong-Min
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.1 / pp.74-79 / 2011
  • Alignment systems with multi-axis motion are used to set the vertical arrangement of multilayer assemblies such as LCDs, PDPs, and MLCCs. This study reports the development of an XY$\theta$ alignment stage designed with a thin-type structure and parallel actuation. The thin-type parallel XY$\theta$ alignment stage maintains a repeatability error below $1\,\mu\mathrm{m}$. Its squareness and straightness also allow precise alignment motion. In an alignment experiment using a vision system on the parallel XY$\theta$ alignment stage, the measured error was $\pm 6.25\,\mu\mathrm{m}$.

An Architecture of the Fast Parallel Multiplier over Finite Fields using AOP (AOP를 이용한 유한체 위에서의 고속 병렬연산기의 구조)

  • Kim, Yong-Tae
    • The Journal of the Korea institute of electronic communication sciences / v.7 no.1 / pp.69-79 / 2012
  • In this paper, we restrict attention to the case where m is odd and n = mk, and we propose and explicitly exhibit the architecture of a new parallel multiplier over the field GF($2^m$) with a type k Gaussian period. Because GF($2^m$) is a subfield of GF($2^n$), the multiplier implements multiplication in GF($2^m$) using a parallel multiplier over the extension field GF($2^n$). The time and area complexity of our multiplier is the same as that of Reyhani-Masoleh and Hasan's multiplier, which is the most efficient among the known multipliers in the type IV case.

Low Power Parallel Acquisition Scheme for UWB Systems (저전력 병렬탐색기법을 이용한 UWB시스템의 동기 획득)

  • Kim, Sang-In;Cho, Kyoung-Rok
    • The Journal of the Korea Contents Association / v.7 no.1 / pp.147-154 / 2007
  • In this paper, we propose a new parallel search algorithm for synchronization acquisition in UWB (Ultra Wideband) systems that reduces the amount of correlation computation. Conventional synchronization acquisition algorithms check all possible signal phases simultaneously using multiple correlators. Although this reduces the acquisition time, it leads to high power consumption because of the increased number of correlations. The proposed algorithm divides the preamble signal fed to the correlator into m-bit bunches. We check the correlation result for the first m-bit bunch and predict whether it carries any synchronization acquisition information. This eliminates unnecessary operations and reduces the number of correlations. We evaluate the proposed algorithm under AWGN and multipath channel models in MATLAB. The proposed parallel search scheme reduces the number of correlations by 65% on the AWGN channel and by 20% on the multipath fading channel.
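
The two-stage screening described in this abstract can be sketched as follows. This is only an illustration under simplifying assumptions: chips are ±1, the channel is noiseless, the preamble is cyclic, and a phase is rejected when its first-stage partial correlation falls below a fixed `threshold`. The decision rule and all names here are hypothetical; the paper's actual predictor is not given in the abstract.

```python
def acquire(received, code, m, threshold):
    """Return (best_phase, mac_count) using an m-chip first-stage screen.

    received : list of +1/-1 samples, one period of the cyclic preamble
    code     : local +1/-1 reference code of the same length
    m        : number of chips correlated in the first stage
    threshold: minimum first-stage correlation needed to continue
    """
    n = len(code)
    best_phase, best_corr, macs = None, float("-inf"), 0
    for phase in range(n):
        # Stage 1: correlate only the first m chips of this phase hypothesis.
        partial = sum(received[(phase + k) % n] * code[k] for k in range(m))
        macs += m
        if partial < threshold:
            continue  # unlikely to be aligned; skip the remaining n - m chips
        # Stage 2: finish the correlation over the remaining chips.
        rest = sum(received[(phase + k) % n] * code[k] for k in range(m, n))
        macs += n - m
        full = partial + rest
        if full > best_corr:
            best_phase, best_corr = phase, full
    return best_phase, macs


def rotate(code, shift):
    """Build a received period in which the code starts at index `shift`."""
    n = len(code)
    return [code[(i - shift) % n] for i in range(n)]
```

A full search costs n multiply-accumulates for each of the n phases. The screen spends only m on phases it rejects, which is where the saving in correlation work comes from. With noise, too high a threshold would risk rejecting the true phase.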

A Low Complexity Bit-Parallel Multiplier over Finite Fields with ONBs (최적정규기저를 갖는 유한체위에서의 저 복잡도 비트-병렬 곱셈기)

  • Kim, Yong-Tae
    • The Journal of the Korea institute of electronic communication sciences / v.9 no.4 / pp.409-416 / 2014
  • In hardware implementations of finite fields, a normal basis has several advantages, and the optimal normal basis in particular is the most efficient for hardware implementation in $GF(2^m)$. The finite field $GF(2^m)$ with a type I optimal normal basis (ONB) has the disadvantage that it is not applicable to some cryptographic schemes because m is even. Finite fields $GF(2^m)$ with a type II ONB, however, such as $GF(2^{233})$, are applicable to ECDSA as recommended by NIST. In this paper, we propose a bit-parallel multiplier over $GF(2^m)$ with a type II ONB that performs multiplication over $GF(2^m)$ in the extension field $GF(2^{2m})$. The time and area complexity of the proposed multiplier is the same as or partially better than that of the best known type II ONB bit-parallel multiplier.
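
The remark that type I ONBs force m to be even, while $GF(2^{233})$ admits a type II ONB, can be checked with the standard existence conditions. These conditions are not stated in the abstract and are used here as background. A type I ONB exists when m + 1 is prime and 2 is a primitive root modulo m + 1. A type II ONB exists when 2m + 1 is prime and either 2 is a primitive root modulo 2m + 1, or 2m + 1 ≡ 3 (mod 4) and 2 has order m modulo 2m + 1. The function names are illustrative.

```python
def is_prime(n):
    """Trial-division primality test; adequate for the small moduli used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def multiplicative_order(a, p):
    """Smallest k >= 1 with a**k == 1 (mod p); assumes gcd(a, p) == 1."""
    k, x = 1, a % p
    while x != 1:
        x = (x * a) % p
        k += 1
    return k


def has_type1_onb(m):
    """Type I ONB: m + 1 is an odd prime and 2 is a primitive root modulo m + 1."""
    p = m + 1
    return is_prime(p) and p > 2 and multiplicative_order(2, p) == m


def has_type2_onb(m):
    """Type II ONB: 2m + 1 is prime and 2 generates enough of the residues mod 2m + 1."""
    p = 2 * m + 1
    if not is_prime(p):
        return False
    order = multiplicative_order(2, p)
    return order == 2 * m or (p % 4 == 3 and order == m)
```

Since m + 1 must be an odd prime, every m with a type I ONB is even. For m = 233 the type II condition holds because 467 is prime and 2 is a primitive root modulo 467.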

Construction of High-Speed Parallel Multiplier on Finite Fields GF($3^m$) (유한체 GF($3^m$)상의 고속 병렬 승산기의 구성)

  • Choi, Yong-Seok;Park, Seung-Yong;Seong, Hyeon-Kyeong
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.3 / pp.510-520 / 2011
  • In this paper, we propose a new multiplication algorithm over the finite field $GF(3^m)$ for primitive polynomials whose coefficients are all 1, covering both odd and even m, and construct a multiplier with a parallel input-output modular structure based on this algorithm. The proposed multiplier consists of $(m+1)^2$ identical basic cells, each containing a mod-3 addition gate and a mod-3 multiplication gate. Because the basic cells contain no latch circuit, the multiplier circuit is very simple and has a short delay of $T_A+T_X$ per cell. Thanks to the regularity and modularity of its cell array, the proposed multiplier is easy to extend to large m and is well suited to VLSI implementation.

Design of High-Speed Parallel Multiplier with All Coefficients 1's of Primitive Polynomial over Finite Fields GF($2^m$) (유한체 GF($2^m$)상의 기약다항식의 모든 계수가 1을 갖는 고속 병렬 승산기의 설계)

  • Seong, Hyeon-Kyeong
    • Journal of the Korea Society of Computer and Information / v.18 no.2 / pp.9-17 / 2013
  • In this paper, we propose a new algorithm for multiplying two polynomials over the finite field GF($2^m$) using a primitive polynomial whose coefficients are all 1, and design a multiplier with a high-speed parallel input-output modular structure based on this algorithm. The proposed multiplier consists of $m^2$ identical basic cells, each containing a 2-input XOR gate and a 2-input AND gate. Because the basic cells contain no latch circuit, the multiplier circuit is very simple and has a short delay of $D_A+D_X$ per cell. Thanks to the regularity and modularity of its cell array, the proposed multiplier is easy to extend to large m and is well suited to VLSI implementation.
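
As a worked illustration of why an all-one polynomial (AOP) leads to a regular AND/XOR array, the sketch below multiplies in GF($2^m$) modulo $x^m + x^{m-1} + \dots + x + 1$. It is not the paper's cell design. It uses the well-known identity $(x+1)(x^m + \dots + x + 1) = x^{m+1} + 1$ over GF(2), so the product can first be taken as a cyclic convolution of length m + 1 and then folded back to m bits. The sketch assumes the AOP of degree m is irreducible, which holds for m = 2 and m = 4, for example.

```python
def aop_multiply(a, b, m):
    """Multiply a and b in GF(2^m) defined by the all-one polynomial of degree m.

    Elements are bit lists [c0, ..., c_{m-1}], where c_i is the coefficient of x^i.
    """
    n = m + 1
    # Step 1: cyclic convolution modulo x^(m+1) + 1, which the AOP divides.
    c = [0] * n
    for i in range(m):
        for j in range(m):
            # One AND gate and one XOR gate per (i, j) pair: m*m pairs in total.
            c[(i + j) % n] ^= a[i] & b[j]
    # Step 2: x^m = x^(m-1) + ... + x + 1 modulo the AOP,
    # so the top coefficient c[m] is folded into every lower coefficient.
    return [c[k] ^ c[m] for k in range(m)]
```

For m = 2 the field is GF(4) with modulus $x^2 + x + 1$, and `aop_multiply([0, 1], [0, 1], 2)` returns `[1, 1]`, which is $x \cdot x = x + 1$.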

A Study on the Parallel Stream Cipher by Nonlinear Combiners (비선형 결합함수에 빠른 병렬 스트림 암호에 관한 연구)

  • 이훈재;변우익
    • Proceedings of the Korea Society for Industrial Systems Conference / 2001.05a / pp.77-83 / 2001
  • In recent years, the AES effort in North America and the NESSIE project in Europe have been in progress. Six proposals have been submitted to the NESSIE project in the synchronous stream cipher category, including LILI-128 by Simpson in Australia. These proposals tend toward designs that exploit parallelism in the algorithms to increase speed. In this paper, we consider the PS-LFSR and propose effective implementations of various nonlinear combiners: the memoryless nonlinear combiner, the nonlinear combiner with memory, the nonlinear filter function, and the clock-controlled function. Finally, we propose the m-parallel SUM-BSG and a parallel implementation of LILI-128 as examples, and evaluate their security and performance.
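
The general idea of producing m keystream bits per clock from an LFSR can be sketched as below. This is a generic unrolled Fibonacci LFSR, not the paper's PS-LFSR construction, whose details the abstract does not give. The state is advanced by the m-th power of the one-step transition matrix over GF(2), and the m output bits are read directly from the current state, which requires m to be no larger than the register length. Tap positions and function names are illustrative assumptions.

```python
def lfsr_step(state, taps):
    """Advance a Fibonacci LFSR one clock; return (output_bit, new_state)."""
    fb = 0
    for t in taps:
        fb ^= state[t]
    return state[-1], [fb] + state[:-1]


def serial_keystream(state, taps, nbits):
    """Reference generator: one output bit per clock."""
    bits = []
    for _ in range(nbits):
        bit, state = lfsr_step(state, taps)
        bits.append(bit)
    return bits


def transition_matrix(length, taps):
    """One-step state update as a matrix over GF(2); taps must be distinct."""
    t = [[0] * length for _ in range(length)]
    for tap in taps:
        t[0][tap] = 1          # feedback bit enters position 0
    for i in range(1, length):
        t[i][i - 1] = 1        # every other bit shifts by one position
    return t


def mat_mul(a, b):
    """Matrix product over GF(2)."""
    n = len(a)
    return [[sum(a[i][k] & b[k][j] for k in range(n)) & 1 for j in range(n)]
            for i in range(n)]


def mat_vec(a, v):
    """Matrix-vector product over GF(2)."""
    return [sum(x & y for x, y in zip(row, v)) & 1 for row in a]


def parallel_keystream(state, taps, m, nclocks):
    """Emit m bits per clock by jumping the state m steps at a time."""
    length = len(state)
    if m > length:
        raise ValueError("this sketch needs m <= register length")
    step = transition_matrix(length, taps)
    jump = [[int(i == j) for j in range(length)] for i in range(length)]
    for _ in range(m):
        jump = mat_mul(step, jump)       # jump = step ** m over GF(2)
    bits = []
    for _ in range(nclocks):
        bits.extend(state[::-1][:m])     # the next m outputs are the last m bits
        state = mat_vec(jump, state)
    return bits
```

In hardware, the rows of the jump matrix would correspond to XOR networks evaluated in a single clock. The sketch only confirms that the m-parallel output matches the serial keystream bit for bit.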

A Study on the Communication Performance Improvement of the Parallel Finite-Difference Time-Domain Simulator by using the MPI Persistent Communication (MPI의 지속 통신 메커니즘을 이용한 병렬 유한차분시간영역 전산모사 프로그램의 통신 성능 향상에 관한 연구)

  • Kim, Huioon;Chun, Kyungwon;Kim, Hyeong-gyu;Hong, Hyunpyo;Chung, Youngjoo
    • Proceedings of the Korea Information Processing Society Conference / 2009.04a / pp.942-945 / 2009
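  • The finite-difference time-domain (FDTD) method is a numerical technique widely used for simulation in fields involving electromagnetic waves. Simulation programs built on this method require substantial computing resources and are therefore often run in parallel computing environments. In such environments, the communication speed between the processes running in parallel and the latency of the network create a computational bottleneck that degrades overall performance. This paper therefore uses MPI's persistent communication mechanism to speed up synchronization between parallel processes, aiming to improve MPI communication performance in an FDTD simulation program, and presents the results as graphs. The performance is also compared with that of the existing two-sided and one-sided communication mechanisms, the advantages and disadvantages of persistent communication for parallel FDTD simulation programs are presented, and its usefulness is discussed.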