• 제목/요약/키워드: Hardware Efficient

검색결과 1,130건 처리시간 0.023초

Efficient VLSI architecture for one-dimensional discrete wavelet transform using a sealable data reorder unit

  • Park, Taegeun
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 2002년도 ITC-CSCC -1
    • /
    • pp.353-356
    • /
    • 2002
  • In this paper, we design an efficient, scalable one-dimensional discrete wavelet transform (1DDWT) filter using data reorder unit (DRU). At each level, the required hardware is optimized by sharing multipliers and adders because the input rate is reduced by a factor of two at each level due to decimation. The proposed architecture shows 100% hardware utilization by balancing hardware with input rate. Furthermore, sharing the coefficients of the highpass and the lowpass filters using the mirror filter property reduces the number of multipliers and adders by half. We designed a scalable DRU that efficiently reorders and feeds inputs to highpass and lowpass filters. The proposed DRU-based architecture is so regular and scalable that it can be easily extended to an arbitrary 1D DWT structure with M taps and J levels. Compared to other architectures, the proposed DWT filter shows efficiency in performance with relatively less hardware.

  • PDF

GF(p)의 타원곡선 암호 시스템을 위한 효율적인 하드웨어 몽고메리 모듈러 역원기 (Efficient Hardware Montgomery Modular Inverse Module for Elliptic Curve Cryptosystem in GF(p))

  • 최필주;김동규
    • 한국멀티미디어학회논문지
    • /
    • 제20권2호
    • /
    • pp.289-297
    • /
    • 2017
  • When implementing a hardware elliptic curve cryptosystem (ECC) module, the efficient design of Modular Inverse (MI) algorithm is especially important since it requires much more computation than other finite field operations in ECC. Among the MI algorithms, binary Right-Shift modular inverse (RS) algorithm has good performance when implemented in hardware, but Montgomery Modular Inverse (MMI) algorithm is not considered in [1, 2]. Since MMI has a similar structure to that of RS, we show that the area-improvement idea that is applied to RS is applicable to MMI, and that we can improve the speed of MMI. We designed area- and speed-improved MMI variants as hardware modules and analyzed their performance.

연산공유 승산 알고리즘을 이용한 내적의 최적화 및 이를 이용한 1차원 DCT 프로세서 설계 (Optimization Design Method for Inner Product Using CSHM Algorithm and its Application to 1-D DCT Processor)

  • 이태욱;조상복
    • 대한전기학회논문지:시스템및제어부문D
    • /
    • 제53권2호
    • /
    • pp.86-93
    • /
    • 2004
  • The DCT algorithm needs an efficient hardware architecture to compute inner product. The conventional design method, like ROM-based DA(Distributed Arithmetic), has large hardware complexity. Because of this reason, a CSHM(Computation Sharing Multiplication) was proposed for implementing inner product by Park. However, the Park's CSHM has inefficient hardware architecture in the precomputer and select units. Therefore it degrades the performance of the multiplier. In this paper, we presents the optimization design method for inner product using CSHM algorithm and applied it to implementation of 1-D DCT processor. The experimental results show that the proposed multiplier is more efficient than Park's when hardware architectures and logic synthesis results were compared. The designed 1-D DCT processor by using proposed design method is more high performance than typical methods.

Efficient hardware implementation and analysis of true random-number generator based on beta source

  • Park, Seongmo;Choi, Byoung Gun;Kang, Taewook;Park, Kyunghwan;Kwon, Youngsu;Kim, Jongbum
    • ETRI Journal
    • /
    • 제42권4호
    • /
    • pp.518-526
    • /
    • 2020
  • This paper presents an efficient hardware random-number generator based on a beta source. The proposed generator counts the values of "0" and "1" and provides a method to distinguish between pseudo-random and true random numbers by comparing them using simple cumulative operations. The random-number generator produces labeled data indicating whether the count value is a pseudo- or true random number according to its bit value based on the generated labeling data. The proposed method is verified using a system based on Verilog RTL coding and LabVIEW for hardware implementation. The generated random numbers were tested according to the NIST SP 800-22 and SP 800-90B standards, and they satisfied the test items specified in the standard. Furthermore, the hardware is efficient and can be used for security, artificial intelligence, and Internet of Things applications in real time.

공유메모리 다중처리기에서 효율적인 프로세서 동기화 기법 (An Efficient Processor Synchronization Scheme on Shared Memory Multiprocessor)

  • 윤석한;원철호;김덕진
    • 전자공학회논문지B
    • /
    • 제32B권5호
    • /
    • pp.683-692
    • /
    • 1995
  • Many kinds of large scale multiprocessing and parallel-processing systems have recently been developed. The contention on the shared data caused by multiple processors may degrade system performance. So, processor synchronization has become one of the important issues in these systems. To solve the synchornization issues, a lot of software and hardware schemes based on spin lock have been proposed. Although software schemes are easy to implement, hardware schemes are preferred in many systems to gain optimized performance. This paper proposes an efficient processor synchronization scheme, called QCX,and describes its design considerations, hardware, algorithm, protocol. Also, in this paper, the performance of QCX has been evaluated with QOLB[5] and LBP[7] using a simulation. The simulation, with varying the number of processor and the contention on shared variables, measured the average execution times of a workload. The simulation results show that the performances of QCX is best when practicability is considered. QCX is more efficient than QOLB and LBP in two aspects. First, the hardware of QCX is more simple and cost-effective because the cache structure need not be changed. Secondly, QCX is more general because it uses a generic atomic instruction.

  • PDF

블록암호 알고리듬 LEA의 효율적인 하드웨어 구현 (An Efficient Hardware Implementation of Block Cipher Algorithm LEA)

  • 성미지;박장녕;신경욱
    • 한국정보통신학회:학술대회논문집
    • /
    • 한국정보통신학회 2014년도 추계학술대회
    • /
    • pp.777-779
    • /
    • 2014
  • LEA(Lightweight Encryption Algorithm)는 2012년 국가보안기술연구소(NSRI)에서 개발한 128비트 고속 경량 블록암호 알고리듬이다. LEA는 128/192/256비트 마스터키를 사용하여 128비트 평문을 128비트 암호문으로, 또는 그 역으로 변환한다. 라운드 변환블록의 암호화 연산과 복호화 연산의 하드웨어 자원이 공유되도록 설계하였으며, 또한 키 스케줄러도 암호화와 복호화의 하드웨어 자원이 공유되도록 설계하여 저전력, 저면적 구현을 실현했다. 설계된 LEA 프로세서는 FPGA 구현을 통해 하드웨어 동작을 검증하였다.

  • PDF

효율적인 하드웨어 구현을 위한 정렬 알고리즘에 대한 분석 (Analysis of Sorting Algorithm for Efficient Hardware Implementation)

  • 김한결;강봉순
    • 전기전자학회논문지
    • /
    • 제23권3호
    • /
    • pp.978-983
    • /
    • 2019
  • 자율주행, AI의 시대가 도래함에 따라 카메라를 통하여 물체를 정확히 인식 및 판단하는 것이 중요해졌다. 특히 카메라를 이용하여 물체를 인식하는 방법은 다른 여러 방법들에 비하여 시각적으로 많은 양의 정보를 얻을 수 있기 때문에 정확한 영상을 추출하기 위하여 많은 영상 신호 처리 방법들이 연구되고 있다. 또한, 이러한 영상 신호 처리의 기능을 실제 하드웨어로 구현하기 위하여 많은 연구도 동시에 진행되고 있다. 본 논문에서는 영상 신호 처리에서 자주 사용되는 정렬 알고리즘에 대하여 동작원리 및 특징을 비교하고 성능에 대한 평가를 정리하였다. 이를 토대로 대표적인 정렬 알고리즘 중 하드웨어로 구현할 때 효율적인 알고리즘에 대하여 정의한다.

Hardware-Aware Rate Monotonic Scheduling Algorithm for Embedded Multimedia Systems

  • Park, Jae-Beom;Yoo, Joon-Hyuk
    • ETRI Journal
    • /
    • 제32권5호
    • /
    • pp.657-664
    • /
    • 2010
  • Many embedded multimedia systems employ special hardware blocks to co-process with the main processor. Even though an efficient handling of such hardware blocks is critical on the overall performance of real-time multimedia systems, traditional real-time scheduling techniques cannot afford to guarantee a high quality of multimedia playbacks with neither delay nor jerking. This paper presents a hardware-aware rate monotonic scheduling (HA-RMS) algorithm to manage hardware tasks efficiently and handle special hardware blocks in the embedded multimedia system. The HA-RMS prioritizes the hardware tasks over software tasks not only to increase the hardware utilization of the system but also to reduce the output jitter of multimedia applications, which results in reducing the overall response time.

하드웨어 소프트웨어 통합 설계에 의한 H.263 동영상 코덱 구현 (An Efficient Hardware-Software Co-Implementation of an H.263 Video Codec)

  • 장성규;김성득;이재헌;정의철;최건영;김종대;나종범
    • 한국통신학회논문지
    • /
    • 제25권4B호
    • /
    • pp.771-782
    • /
    • 2000
  • 이 논문에서는 하드웨어와 소프트웨어의 통합 설계에 의한 H.263 동영상 코덱을 구현한다. 동영상의 부호화와 복호화를 실시간으로 수행하기 위해 동작 속도 및 응용성을 동시에 고려하여 H.263 코덱의 각 부분 중 어느 부분이 하드웨어 또는 소프트웨어로 구현된는 것이 바람직한지 결정하였다. 하드웨어로 구현하는 부분은 움직임 추정부 및 보상부와 메모리 제어부이고, 나머지 부분은 RISC (reduced instruction set computer) 프로세서를 사용하여 소프트웨어로 처리한다. 이 논문에서는 하드웨어 및 소프트웨어 모듈의 효과적인 구현 방법을 소개한다. 특히 하드웨어로 구현되는 움직임 추정부를 위해서 주변 움직임 변위의 상관성 및 계층적 탐색을 이용한 다수의 움직임 후보를 가지고 알고리즘을 사용하였으며, 이 알고리즘에 기반한 소면적 구조를 제안한다. 소프트웨어로 처리되는 DCT (discrete cosine transform) 부분의 최적화를 위해서 움직임 추정부에서 얻어진 SAD (sum of absolute difference) 값에 근거하여 DCT 이후 양자화된 계수들의 통계적 특성을 분류하는 기법을 사용한다. 제안된 방법을 실제 RISC 프로세서와 gate array를 이용하여 구\ulcorner하고, 그 성능이 우수함을 확인하였다.

  • PDF

An Efficient Multicast Addressing Scheme for the Self-Routing Multistage Networks

  • Kim, Hong-Ryul;Lim, Chae-Tak
    • Journal of Electrical Engineering and information Science
    • /
    • 제2권3호
    • /
    • pp.22-28
    • /
    • 1997
  • In this paper, we propose an efficient multicast addressing scheme for he self-routing multistage networks. Using only N-bit routing header an the simple hardware logic, the new scheme can efficiently provides all point-to-multipoint connections in single pass through the multistage copy networks. We also designed a hardware logic of switching element to implementation of multicasting in ATM switches are performed.

  • PDF