• Title/Summary/Keyword: SIMD computer

SIMD Optimization for Improving the Performance of a CPU-Based Graph Engine (SIMD 최적화를 이용한 CPU 기반 그래프 엔진의 성능 개선)

  • Ikhyeon Jo;Myung-Hwan Jang;Sang-Wook Kim
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.383-385 / 2023
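  • RealGraph, a state-of-the-art single-machine graph engine, improves performance through thread-level parallelism but does not consider parallelism within a thread. This paper improves the parallelism of RealGraph by using SIMD instructions. To raise efficiency within each thread, we analyzed the structure of RealGraph and its graph algorithms to identify the regions where SIMD instructions can be applied. In experiments, applying SIMD instructions to perform vector operations within threads yielded average execution-time reductions of 7.6%, 11.7%, and 9.2%, confirming how much SIMD instructions can contribute to the analysis performance of a graph engine.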
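
As a rough illustration of what "vector operations within a thread" can look like, the sketch below emulates a lane-wise accumulation over a vertex's neighbor list in plain Python. The names `scalar_sum`, `lanewise_sum`, and `LANES`, and the choice of a neighbor-value sum as the kernel, are assumptions made for illustration; the abstract does not say which RealGraph loops were vectorized or which instruction set was used.

```python
LANES = 8  # assumed vector width (e.g. eight 32-bit lanes); not taken from the paper


def scalar_sum(values, neighbors):
    """Baseline: one add per loop iteration."""
    total = 0.0
    for v in neighbors:
        total += values[v]
    return total


def lanewise_sum(values, neighbors, lanes=LANES):
    """Emulate a SIMD gather-and-add: keep `lanes` partial sums, then reduce."""
    acc = [0.0] * lanes
    full = len(neighbors) - len(neighbors) % lanes
    for base in range(0, full, lanes):
        # Stands in for one vector instruction: the same add on every lane.
        for lane in range(lanes):
            acc[lane] += values[neighbors[base + lane]]
    total = sum(acc)            # horizontal reduction of the lane sums
    for v in neighbors[full:]:  # scalar tail for the leftover elements
        total += values[v]
    return total
```

In real SIMD code the inner `for lane` loop would be a single gather plus a single vector add; the chunking, the final reduction, and the scalar tail are the parts that carry over.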

Parallel Factorization using Quadratic Sieve Algorithm on SIMD machines (SIMD상에서의 이차선별법을 사용한 병렬 소인수분해 알고리즘)

  • Kim, Yang-Hee
    • The KIPS Transactions:PartA / v.8A no.1 / pp.36-41 / 2001
  • In this paper, we first design a parallel quadratic sieve algorithm for integer factoring. We then present a parallel factoring algorithm that factors a large odd integer by repeatedly applying the parallel quadratic sieve algorithm, based on the divide-and-conquer strategy, on SIMD machines with DMM. We show that this algorithm is optimal in terms of the product of time and the number of processors.
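
The two ingredients named in the abstract, a quadratic sieve used as a splitting step and a divide-and-conquer recursion over the resulting cofactors, can be sketched sequentially as follows. This is a toy, single-polynomial sieve written for illustration, not the paper's parallel algorithm: the function names (`find_split`, `factorize`, and the helpers) and the parameters `bound` and `span` are hypothetical, and nothing here models the SIMD machine or its memory organization. The per-prime sieving loop and the two recursive calls are the places where data parallelism would naturally appear.

```python
from math import gcd, isqrt


def primes_up_to(limit):
    flags = [True] * (limit + 1)
    flags[0:2] = [False, False]
    for i in range(2, isqrt(limit) + 1):
        if flags[i]:
            for j in range(i * i, limit + 1, i):
                flags[j] = False
    return [i for i, f in enumerate(flags) if f]


def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))


def parity_mask(value, base):
    """Bit j is set when prime base[j] divides value an odd number of times."""
    mask = 0
    for j, p in enumerate(base):
        while value % p == 0:
            value //= p
            mask ^= 1 << j
    return mask


def find_split(n, bound=50, span=5000):
    """Return a non-trivial factor of composite n, or None if none was found."""
    base = primes_up_to(bound)
    for p in base:
        if n % p == 0 and n != p:
            return p
    root = isqrt(n)
    if root * root == n:
        return root
    start = root + 1
    values = [(start + i) ** 2 - n for i in range(span)]
    rest = values[:]
    for p in base:  # sieve: strip p from every position where x^2 = n (mod p)
        for r in range(p):
            if (r * r - n) % p == 0:
                for i in range((r - start) % p, span, p):
                    while rest[i] % p == 0:
                        rest[i] //= p
    relations = [i for i in range(span) if rest[i] == 1]  # smooth values
    pivots = {}  # lowest set bit -> (parity mask, set of relations combined)
    for k, i in enumerate(relations):
        mask, combo = parity_mask(values[i], base), 1 << k
        while mask:  # Gaussian elimination over GF(2)
            low = mask & -mask
            if low not in pivots:
                pivots[low] = (mask, combo)
                break
            mask, combo = mask ^ pivots[low][0], combo ^ pivots[low][1]
        else:  # the combined product of values is a perfect square
            x, square = 1, 1
            for j in range(k + 1):
                if combo >> j & 1:
                    x = x * (start + relations[j]) % n
                    square *= values[relations[j]]
            g = gcd(x - isqrt(square), n)
            if 1 < g < n:
                return g
    return None


def factorize(n):
    """Divide and conquer: split n, then factor both cofactors independently."""
    if n == 1:
        return []
    if is_prime(n):
        return [n]
    d = find_split(n)
    if d is None:
        raise ValueError("no split found; try a larger bound or span")
    return sorted(factorize(d) + factorize(n // d))
```

For example, `factorize(87463)` splits the number into 149 and 587 using the relations at x = 296 and x = 316, whose values 153 and 12393 multiply to the square 1377².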

A Design of Modified SIMD Architecture for block level parallelism (Block level parallelism 을 위한 수정된 SIMD Architecture 설계)

  • Jonghee Youn;Daeho Kim;Minwook Ahn;Youngkyu Choi;Hokyun Kim;Seungjun Yang;Yunheung Paek
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.857-859 / 2008
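  • In media applications, and video applications in particular, overall performance depends heavily on how efficiently the kernel code is processed. To process such kernel code efficiently, we designed an architecture that adds a SIMD structure to a general DSP co-processor, so as to improve the overall performance of video applications.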

Efficient Parallel Logic Simulation on SIMD Computers (SIMD 컴퓨터상에서 효율적인 병렬처리 논리 시뮬레이션)

  • Chung, Yun-Mo
    • The Transactions of the Korea Information Processing Society / v.3 no.2 / pp.315-326 / 1996
  • As the complexity of VLSI circuits has increased, verifying their correctness has come to require a great deal of simulation time. This paper presents efficient parallel logic simulation protocols, data structures, and algorithms for implementing fast logic simulation on SIMD parallel processing computers. The performance results of the presented schemes on the CM-2 are given and analyzed.
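
One common way to map logic simulation onto data-parallel hardware is pattern-parallel simulation, where every gate is evaluated for many input patterns at once. The sketch below shows that idea using Python integers as bit vectors (bit k of each signal holds its value in pattern k). The abstract does not state which scheme the paper uses on the CM-2, so this should be read as a generic illustration; `OPS`, `simulate`, and the `FULL_ADDER` netlist are hypothetical names.

```python
# Each gate operates on whole bit vectors: bit k belongs to input pattern k.
OPS = {
    "AND": lambda a, b, mask: a & b,
    "OR": lambda a, b, mask: a | b,
    "XOR": lambda a, b, mask: a ^ b,
    "NOT": lambda a, b, mask: ~a & mask,  # second operand is ignored
}

# Netlist entries are (output, gate, input_a, input_b) in topological order.
FULL_ADDER = [
    ("t", "XOR", "a", "b"),
    ("sum", "XOR", "t", "cin"),
    ("g", "AND", "a", "b"),
    ("pr", "AND", "t", "cin"),
    ("cout", "OR", "g", "pr"),
]


def simulate(netlist, inputs, width):
    """Evaluate the netlist for `width` input patterns simultaneously."""
    mask = (1 << width) - 1
    signals = dict(inputs)
    for out, gate, a, b in netlist:
        rhs = signals[b] if b is not None else 0
        signals[out] = OPS[gate](signals[a], rhs, mask)
    return signals


# All eight input combinations of a full adder, evaluated in one pass.
result = simulate(
    FULL_ADDER,
    {"a": 0b11110000, "b": 0b11001100, "cin": 0b10101010},
    width=8,
)
```

Each bitwise operation plays the role of one instruction applied to all patterns, which is the single-instruction, multiple-data structure in miniature.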

Parallel Algorithms for the Discrete Logarithm Problem on SIMD Machines (SIMD상에서 이산대수 문제에 대한 병렬 알고리즘)

  • 김양희;정창성
    • Review of KIISC / v.4 no.2 / pp.40-46 / 1994
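  • In fields that require high-speed computation, the need for parallel processing, which raises speed by using multiple processing elements, keeps growing. In cryptography in particular, the discrete logarithm and factorization problems take a long time to solve, so parallel processing for high-speed computation is very important. This paper presents two parallel discrete logarithm algorithms that run the Pohlig-Hellman discrete logarithm algorithm at high speed on a SIMD parallel computer, and reports a performance evaluation of their implementation on KOPS (Korea Parallel System), a parallel computer composed of 16 transputers.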
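
For orientation, a sequential sketch of the Pohlig-Hellman method is given below. It assumes a prime modulus `p` and a base `g` that is a primitive root modulo `p`, and it finds each digit by brute force, which is only practical when p − 1 has small prime factors. The function names are hypothetical and the paper's two parallel variants are not reproduced; the sketch only shows that the subproblems for the different prime powers of p − 1 are independent, which is what makes the method a natural candidate for parallel execution. It relies on `pow(x, -1, m)` for modular inverses, available in Python 3.8 and later.

```python
def factor_small(n):
    """Trial-division factorization: returns {prime: exponent}."""
    factors, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def dlog_prime_power(g, h, p, q, e):
    """Solve g^x = h (mod p) for x mod q^e, where g has order q^e."""
    order = q ** e
    gamma = pow(g, q ** (e - 1), p)  # element of order q
    x = 0
    for k in range(e):
        # Strip the digits found so far, then project into the order-q subgroup.
        hk = pow(pow(g, order - x, p) * h % p, q ** (e - 1 - k), p)
        digit = next(d for d in range(q) if pow(gamma, d, p) == hk)
        x += digit * q ** k
    return x


def crt(residues):
    """Combine (remainder, modulus) pairs with pairwise coprime moduli."""
    x, mod = 0, 1
    for r, m in residues:
        t = (r - x) * pow(mod, -1, m) % m
        x, mod = x + mod * t, mod * m
    return x


def pohlig_hellman(g, h, p):
    """Return x in [0, p - 1) with g^x = h (mod p), g a primitive root."""
    n = p - 1
    residues = []
    for q, e in factor_small(n).items():  # independent subproblems
        m = q ** e
        residues.append(
            (dlog_prime_power(pow(g, n // m, p), pow(h, n // m, p), p, q, e), m)
        )
    return crt(residues)
```

For instance, with p = 37 and g = 2 (a primitive root modulo 37, where p − 1 = 2²·3²), `pohlig_hellman(2, pow(2, 5, 37), 37)` recovers the exponent 5.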

SIMD Extended VLIW ASIP architecture (SIMD 명령어가 추가된 VLIW ASIP 프로세서)

  • Yang, Seungjun;Park, Sanghyun;Heo, Ingoo;Lee, Jongwon;Kim, Yongjoo;Paek, Yunheung
    • Proceedings of the Korea Information Processing Society Conference / 2010.11a / pp.1589-1590 / 2010
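  • The VLIW architecture is widely used for embedded applications because it executes several instructions simultaneously while remaining relatively small and consuming little power. In this paper, we design a VLIW architecture extended with SIMD instructions so that media applications such as video processing can be handled effectively.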

A Design of compiler for partitioned SIMD reconfigurable parallel processor (분할 SIMD 재구성형 병렬 프로세서를 위한 컴파일러 설계)

  • Kwon, Yongin;Kim, Yongjoo;Yoon, Jonghee;Ahn, Minwook;Choi, Youngkyu;Paek, Yunheung
    • Proceedings of the Korea Information Processing Society Conference / 2009.11a / pp.11-12 / 2009
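  • This paper introduces a reconfigurable parallel processor for providing real-time services on portable terminals, and designs a new programming language and compiler for expressing its partitioned SIMD capability. This approach facilitates rapid application development and improves the performance of each application.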

Implementation of SIMD-based Many-Core Processor for Efficient Image Data Processing (효율적인 영상데이터 처리를 위한 SIMD기반 매니코어 프로세서 구현)

  • Choi, Byong-Kook;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information / v.16 no.1 / pp.1-9 / 2011
  • Recently, as mobile multimedia devices have become more widely used, the need for high-performance, low-energy multimedia processors has been increasing. Application-specific integrated circuits (ASICs) can meet the high performance required for mobile multimedia, but they provide limited, if any, generality for various application requirements. DSP-based systems can be used for various types of applications due to their generality, but they incur higher cost and energy consumption and deliver lower performance than ASICs. To solve this problem, this paper proposes a single instruction multiple data (SIMD) based many-core processor that supports high-performance, low-power image data processing while retaining generality. The proposed processor, composed of 16 processing elements (PEs), exploits the large data parallelism inherent in image data processing. Experimental results indicate that the proposed SIMD-based many-core processor achieves higher performance (22 times better), energy efficiency (7 times better), and area efficiency (3 times better) than conventional commercial high-performance processors.

Implementation of Parallel Processor for Sound Synthesis of Guitar (기타의 음 합성을 위한 병렬 프로세서 구현)

  • Choi, Ji-Won;Kim, Yong-Min;Cho, Sang-Jin;Kim, Jong-Myon;Chong, Ui-Pil
    • The Journal of the Acoustical Society of Korea / v.29 no.3 / pp.191-199 / 2010
  • Physical modeling is a synthesis method that produces high-quality sound similar to the real sound of a musical instrument. However, because physical modeling requires many parameters to synthesize the sound of an instrument, it hinders real-time processing for instruments that produce a large number of sounds simultaneously. To solve this problem, this paper proposes a single instruction multiple data (SIMD) parallel processor that supports real-time sound synthesis for the guitar, a representative plucked-string instrument. To control the six strings of a guitar, we used a SIMD parallel processor consisting of six processing elements (PEs). Each PE models the corresponding string. The proposed SIMD processor can generate the synthesized sounds of six strings simultaneously when the parallel synthesis algorithm receives the excitation signals and parameters of each string as input. Experimental results using a sampling rate of 44.1 kHz and 16-bit quantization indicate that the sounds synthesized by the proposed parallel processor were very similar to the original sound. In addition, the proposed parallel processor outperforms TI's commercial TMS320C6416 in terms of execution time (8.9x better) and energy efficiency (39.8x better).
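
The one-string-per-PE arrangement can be pictured with a much simpler string model than the one in the paper. The sketch below uses the Karplus-Strong plucked-string algorithm (a delay line with an averaging feedback filter), which is swapped in here purely for illustration; the abstract does not describe the paper's actual physical model. `DAMPING`, `synthesize`, the noise-burst excitation, and the use of open-string standard tuning are all assumptions. At every sample step, each of the six strings performs the same update on its own delay line, which is the SIMD pattern the abstract describes.

```python
import random

SAMPLE_RATE = 44100  # Hz, matching the 44.1 kHz rate mentioned in the abstract
# Open-string frequencies of a guitar in standard tuning (E2 A2 D3 G3 B3 E4).
OPEN_STRINGS_HZ = [82.41, 110.00, 146.83, 196.00, 246.94, 329.63]
DAMPING = 0.996  # assumed per-sample decay factor


def synthesize(num_samples, seed=0):
    """Mix six Karplus-Strong strings, advancing them in lockstep."""
    rng = random.Random(seed)
    # One delay line per string ("PE"), filled with a noise-burst excitation.
    lines = [
        [rng.uniform(-1.0, 1.0) for _ in range(round(SAMPLE_RATE / f))]
        for f in OPEN_STRINGS_HZ
    ]
    positions = [0] * len(lines)
    mix = []
    for _ in range(num_samples):
        total = 0.0
        for s, line in enumerate(lines):  # same operation on every string
            i = positions[s]
            j = (i + 1) % len(line)
            total += line[i]
            line[i] = DAMPING * 0.5 * (line[i] + line[j])  # averaging filter
            positions[s] = j
        mix.append(total / len(lines))
    return mix
```

Calling `synthesize(SAMPLE_RATE)` yields one second of mixed output as floats in the range [-1, 1]; scaling by 32767 would give 16-bit samples.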

On the Conceptual Design of the SIMD Vector Machine Attachable to SISD Machine (SISD 머신에 부착 가능한 SIMD 벡터 머신의 개념적 설계)

  • Cho Young-Il;Ko Young-Woong
    • The KIPS Transactions:PartA / v.12A no.3 s.93 / pp.263-272 / 2005
  • In a von Neumann (SISD) computer, operand addressing is handled in software, because the hardware provides no address counter for operands. Addressing a vector therefore also has to be done in software, with as many variables specified as the vector has elements. The reason is that a hardware counter, the program counter (PC), exists only for instructions and not for operands. A vector, however, has a regular structure and dimension. In this paper, we propose a hardware unit, physically external to the CPU, that addresses only the elements of a vector according to its structure and dimension. To achieve high-speed vector processing, the unit is designed as a SIMD pipeline mechanism. The proposed mechanism is evaluated through simulation. Our results show a 12% to 30% performance improvement over the CRAY architecture under the same hardware conditions (processing units).