• Title/Summary/Keyword: Parallel-Addition

Parallel-Addition Convolution Algorithm in Grayscale Image (그레이스케일 영상의 병렬가산 컨볼루션 알고리즘)

  • Choi, Jong-Ho
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.10 no.4 / pp.288-294 / 2017
  • Recently, deep learning using convolutional neural networks (CNNs) has been extensively studied in image recognition. Convolution consists of addition and multiplication. In hardware implementations, multiplication is computationally expensive relative to addition. It is also an important factor limiting chip design in embedded deep learning systems. In this paper, I propose a parallel-addition processing algorithm that converts grayscale images into a superposition of binary images and performs convolution using addition only. Experiments conducted to verify the feasibility of the proposed algorithm confirm that convolution can be performed by a parallel-addition method capable of reducing the processing time.
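The idea of convolving a superposition of binary images with additions only can be illustrated with a minimal sketch. The abstract does not say how the grayscale image is decomposed, so threshold decomposition (one binary image per gray level) is an assumption here, and the function names are hypothetical. Like most CNN code, the sketch slides the kernel without flipping it.

```python
import numpy as np

def conv2d_reference(image, kernel):
    """Valid-mode sliding-window sum using multiply-accumulate, for comparison."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow), dtype=np.int64)
    for i in range(oh):
        for j in range(ow):
            window = image[i:i + kh, j:j + kw].astype(np.int64)
            out[i, j] = np.sum(window * kernel)
    return out

def conv2d_addition_only(image, kernel, levels=256):
    """Same result, but each binary image only selects kernel weights to add."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow), dtype=np.int64)
    for t in range(1, levels):
        # Assumed decomposition: image == sum over t of (image >= t).
        binary = image >= t
        if not binary.any():
            break  # higher thresholds give empty binary images too
        for i in range(oh):
            for j in range(ow):
                window = binary[i:i + kh, j:j + kw]
                # Boolean mask picks the weights under "1" pixels; only sums remain.
                out[i, j] += kernel[window].sum()
    return out
```

Each threshold level is independent of the others, so the per-level sums could be accumulated in parallel; the loop above runs them sequentially for clarity.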

Quench Characteristics of Superconducting Elements using Reactors at Series and Parallel Connections (직·병렬연결시 리액터를 이용한 초전도 소자의 퀜치 특성)

  • Choi, Hyo-Sang;Lim, Sung-Hun;Cho, Yong-Sun;Nam, Gueng-Hyun;Lee, Na-young;Park, Hyoung-Min
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.18 no.9 / pp.863-869 / 2005
  • We investigated the quench characteristics of superconducting elements connected in series and in parallel with each other. Serial and parallel connections of superconducting elements make simultaneous quench difficult because of slight differences between their critical current densities. In order to induce simultaneous quench, we fabricated four types of circuits: a serially connected circuit before parallel connection, a circuit connected in parallel before serial connection, a serially connected circuit before parallel connection with reactors, and a circuit connected in parallel before serial connection with reactors. We confirmed that simultaneous quenches occurred in serial and parallel connections of superconducting elements using reactors. In addition, the power burden of the superconducting elements was smaller than that in serial and parallel connections of superconducting elements without reactors.

Multiplexed Optical Correlation Filter for Optical Parallel Addition Based on Symbolic Substitution with Redundant Binary Number (기호치환을 기초로 한 잉여 이진수 광병렬 가산용 다중 광상관 필터)

  • 노덕수;조웅호;김정우;이하운;김수중
    • Journal of the Korean Institute of Telematics and Electronics B / v.33B no.3 / pp.109-119 / 1996
  • We propose a multiplexed optical correlation filter method for optical parallel addition based on symbolic substitution. In the proposed method, we use redundant binary numbers, which make it easy to minimize the number of symbolic substitution rules. We chose the MACE filter, which has very low sidelobes and a good correlation peak compared with the SDF filter, as the optical correlation recognition filter, and encoded the input numbers appropriately to increase the discrimination capability. To minimize the number of symbolic substitution rules, sixteen input patterns were divided into six groups with the same addition results, and six filters were used to recognize the input patterns. These filters were multiplexed in two MMACE filter planes, and the corresponding substitution method was proposed. Through computer simulation, we confirmed that the proposed method is suitable for implementing an optical parallel adder.
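Redundant binary numbers suit parallel addition because each result digit depends only on a fixed neighbourhood of input digits, so no carry ripples along the word. The sketch below uses a standard textbook rule set for signed-digit addition with digits in {-1, 0, 1}. It is an assumption for illustration, not the paper's symbolic substitution rules or its optical encoding, and the function names are hypothetical.

```python
def to_int(digits):
    """Value of a redundant binary number, least significant digit first."""
    return sum(d * (1 << i) for i, d in enumerate(digits))

def rb_add(x, y):
    """Add two equal-length digit lists (digits in {-1, 0, 1}, LSB first)."""
    n = len(x)
    interim = [0] * n          # interim sum digit at each position
    transfer = [0] * (n + 1)   # transfer digit passed one position up
    for i in range(n):
        p = x[i] + y[i]
        # The next-lower digit pair tells us the sign of the incoming transfer.
        lower_nonneg = i == 0 or (x[i - 1] >= 0 and y[i - 1] >= 0)
        if p == 2:
            t, w = 1, 0
        elif p == 1:
            t, w = (1, -1) if lower_nonneg else (0, 1)
        elif p == 0:
            t, w = 0, 0
        elif p == -1:
            t, w = (0, -1) if lower_nonneg else (-1, 1)
        else:  # p == -2
            t, w = -1, 0
        transfer[i + 1], interim[i] = t, w
    # Interim digit plus incoming transfer always stays within {-1, 0, 1}.
    return [interim[i] + transfer[i] for i in range(n)] + [transfer[n]]
```

The loop is written sequentially, but every iteration reads only positions `i` and `i - 1` of the inputs, so all positions could be evaluated at once, which is what makes a pattern-matching (symbolic substitution) realisation possible.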

High Throughput Parallel Decoding Method for H.264/AVC CAVLC

  • Yeo, Dong-Hoon;Shin, Hyun-Chul
    • ETRI Journal / v.31 no.5 / pp.510-517 / 2009
  • A high throughput parallel decoding method is developed for context-based adaptive variable length codes. In this paper, several new design ideas are devised and implemented for scalable parallel processing, a reduction in area, and a reduction in power requirements. First, simplified logical operations instead of memory lookups are used for parallel processing. Second, the codes are grouped based on their lengths for efficient logical operation. Third, up to M bits of the input stream can be analyzed simultaneously. For comparison, we designed a logical-operation-based parallel decoder for M=8 and a conventional parallel decoder. High-speed parallel decoding becomes possible with our method. In addition, for similar decoding rates (1.57 codes/cycle for M=8), our new approach uses 46% less chip area than the conventional method.

Parallel VHDL Simulation on IBM SP2 and SGI Origin 2000 (IBM SP2와 SGI Origin 2000에서의 병렬 VHDL 시뮬레이션)

  • 정영식
    • Journal of the Korea Society for Simulation / v.7 no.1 / pp.69-83 / 1998
  • In this paper, we present the results of running parallel VHDL simulation on typical MPP (Massively Parallel Processor) systems such as IBM SP2 and SGI Origin 2000. The parallel simulation uses a synchronous protocol, and the parallel program is implemented using MPI (Message Passing Interface), which is based on the message-passing model, so that it can run on any parallel programming environment that supports MPI, a standard communication library. GVT (Global Virtual Time) computation for the parallel simulation is based on global broadcasting with MPI_Bcast(), a standard MPI function, and on piggybacking. Our benchmark shows that as the size of the VHDL grows, the parallel simulation performs better than the sequential simulation. In addition, we show the results of an indirect comparison between IBM SP2 and SGI Origin 2000 obtained by running the same application on both.

A Six-Degree-of-Freedom Force-Reflecting Master Hand Controller using Fivebar Parallel Mechanism (5각 관절 병렬 구조를 이용한 6자유도 힘 반사형 마스터 콘트롤러)

  • 진병대;우기영;권동수
    • Journal of Institute of Control, Robotics and Systems / v.5 no.3 / pp.288-296 / 1999
  • A force-reflecting hand controller can provide the kinesthetic information obtained from a slave manipulator to the operator of a teleoperation system. The goal is to construct a compact hand controller that provides a large workspace and good force-reflecting capability. This paper presents the design and analysis of a 6-degree-of-freedom force-reflecting hand controller using a fivebar parallel mechanism. The forward kinematics of the fivebar parallel mechanism is calculated in real time using three pin-joint sensors in addition to six actuator position sensors. A force decomposition approach is used to compute the Jacobian. To evaluate the characteristics of the fivebar parallel mechanism, it is compared with three other parallel mechanisms in terms of workspace and manipulability measure. The hand controller using the fivebar parallel mechanism has been constructed and tested to verify the feasibility of the design concept.

Grid-Enabled Parallel Simulation Based on Parallel Equation Formulation

  • Andjelkovic, Bojan;Litovski, Vanco B.;Zerbe, Volker
    • ETRI Journal / v.32 no.4 / pp.555-565 / 2010
  • Parallel simulation is an efficient way to cope with long runtimes and high computational requirements in simulations of modern complex integrated electronic circuits and systems. This paper presents an algorithm for parallel simulation based on parallelization in equation formulation and simultaneous calculation of matrix contributions for nonlinear analog elements. In addition, the paper describes the development of a grid interface for a parallel simulator that enables a designer to perform simulations on distant computer clusters. The performance of the developed parallel simulation algorithm is evaluated by simulating a microelectromechanical system.

Adaptive and optimized agent placement scheme for parallel agent-based simulation

  • Jin, Ki-Sung;Lee, Sang-Min;Kim, Young-Chul
    • ETRI Journal / v.44 no.2 / pp.313-326 / 2022
  • This study presents a novel scheme for distributed and parallel simulations with optimized agent placement for simulation instances. Traditional parallel simulation is limited in that it does not provide sufficient performance even when using multiple resources. The main reason for this discrepancy is that supporting parallelism inevitably incurs costs beyond the base simulation cost. We present a comprehensive study of parallel simulation architectures, execution flows, and characteristics. Then, we identify critical challenges for optimizing large simulations for parallel instances. Based on our cost-benefit analysis, we propose a novel approach to overcome the performance constraints of agent-based parallel simulations. We also propose a solution for eliminating the synchronization cost among local instances. Our method ensures balanced performance through optimal deployment of agents to local instances and an adaptive agent placement scheme according to the simulation load. Additionally, our empirical evaluation reveals that the proposed model achieves better performance than conventional methods under several conditions.

New High Speed Parallel Multiplier for Real Time Multimedia Systems (실시간 멀티미디어 시스템을 위한 새로운 고속 병렬곱셈기)

  • Cho, Byung-Lok;Lee, Mike-Myung-Ok
    • The KIPS Transactions:PartA / v.10A no.6 / pp.671-676 / 2003
  • In this paper, we propose a new First Partial product Addition (FPA) architecture with a new compressor (or parallel counter) for the CSA tree built in the process of adding partial products in a fast parallel multiplier. It improves the speed of calculating partial products by about 20% compared with an existing parallel counter using full adders. Using the novel FPA architecture, the new circuit reduces the number of CLA bits used to find the final sum by N/2. A multiplication time of 5.14 ns is obtained for the $16{\times}16$ multiplier using $0.25\,\mu\mathrm{m}$ CMOS technology. The architecture of the multiplier is easily adapted to pipeline design and demonstrates high-speed performance.
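For readers unfamiliar with CSA trees, the sketch below shows generic carry-save reduction of partial products followed by a single carry-propagating addition. It is not the FPA architecture or the new compressor described in the abstract; it only illustrates the baseline structure those ideas refine, and the function names are hypothetical.

```python
def carry_save_add(a, b, c):
    """3:2 compression: three words in, a sum word and a carry word out."""
    s = a ^ b ^ c                                # per-bit sum, no carry chain
    carry = ((a & b) | (b & c) | (a & c)) << 1   # per-bit majority, shifted up
    return s, carry

def multiply_csa(x, y, width=16):
    """Multiply two unsigned width-bit integers via carry-save reduction."""
    # One partial product per multiplier bit.
    partials = [(x << i) if (y >> i) & 1 else 0 for i in range(width)]
    while len(partials) > 2:
        reduced = []
        # Compress every full group of three words into two.
        for k in range(0, len(partials) - 2, 3):
            reduced.extend(carry_save_add(*partials[k:k + 3]))
        # Words left over from an incomplete group pass through unchanged.
        reduced.extend(partials[len(partials) - len(partials) % 3:])
        partials = reduced
    # Final carry-propagating addition (the role a CLA plays in hardware).
    return sum(partials)
```

Within one reduction level, every 3:2 compressor works independently, so the delay of the tree grows with the number of levels rather than with the number of partial products.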

On a Parallel-Structured High-Speed Implementation of the Word-Based Stream Cipher (워드기반 스트림암호의 병렬화 고속 구현 방안)

  • Lee, Hoon-Jae;Do, Kyung-Hoon
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.4 / pp.859-867 / 2010
  • In this paper, we propose parallel structures for the word-based nonlinear combining functions in word-based stream ciphers, which are high-speed versions of general (bit-based) nonlinear combining functions. In particular, we propose high-speed structures for four popular kinds of word-based nonlinear combiners using PS-WFSR (Parallel-Shifting or Parallel-Structured Word-based FSR): an m-parallel word-based nonlinear combiner without memory, an m-parallel word-based nonlinear combiner with memories, an m-parallel word-based nonlinear filter function, and an m-parallel word-based clock-controlled function. In addition, we propose an implementation example of the m-parallel word-based DRAGON stream cipher and determine its cryptographic security and performance.