• Title/Summary/Keyword: Hardware Resources

Search Result 444, Processing Time 0.023 seconds

Implementation and Design of Digital Instruments System using FPGA (FPGA를 이용한 디지털 계측 시스템의 설계 및 구현)

  • Choi, Hyun Jun;Jang, Seok Woo
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.9 no.2
    • /
    • pp.55-61
    • /
    • 2013
  • A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by a customer or a designer after manufacturing. The FPGA configuration is generally specified using a hardware description language (HDL), similar to that used for an application-specific integrated circuit (ASIC) (circuit diagrams were previously used to specify the configuration, as they were for ASICs, but this is increasingly rare). Contemporary FPGAs have large resources of logic gates and RAM blocks to implement complex digital computations. In this paper, we implement a system of digital instrumentation using FPGA. This system consists of the trigger part, memory address controller part, control FSM part, Encoder part, LCD controller part. The hardware implement using FPGA and the verification of the operation is done in a PC simulation. The proposed hardware was mapped into Cyclone III EP2C5Q208 from Altera and used 1,700(40%) of Logic Element (LE). The implemented circuit used 24,576-bit memory element with 6-bit input signal. The result from implementing in hardware (FPGA) could operate stably in 140MHz.

An Efficient Hardware Implementation of Block Cipher Algorithm LEA (블록암호 알고리듬 LEA의 효율적인 하드웨어 구현)

  • Sung, Mi-ji;Park, Jang-nyeong;Shin, Kyung-wook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2014.10a
    • /
    • pp.777-779
    • /
    • 2014
  • The LEA(Lightweight Encryption Algorithm) is a 128-bit high-speed/lightweight block cipher algorithm developed by National Security Research Institute(NSRI) in 2012. The LEA encrypts plain text of 128-bit using cipher key of 128/192/256-bit, and produces cipher text of 128-bit, and vice versa. To reduce hardware complexity, we propose an efficient architecture which shares hardware resources for encryption and decryption in round transformation block. Hardware sharing technique for key scheduler was also devised to achieve area-efficient and low-power implementation. The designed LEA cryptographic processor was verified by using FPGA implementation.

  • PDF

Training-Free Hardware-Aware Neural Architecture Search with Reinforcement Learning

  • Tran, Linh Tam;Bae, Sung-Ho
    • Journal of Broadcast Engineering
    • /
    • v.26 no.7
    • /
    • pp.855-861
    • /
    • 2021
  • Neural Architecture Search (NAS) is cutting-edge technology in the machine learning community. NAS Without Training (NASWOT) recently has been proposed to tackle the high demand of computational resources in NAS by leveraging some indicators to predict the performance of architectures before training. The advantage of these indicators is that they do not require any training. Thus, NASWOT reduces the searching time and computational cost significantly. However, NASWOT only considers high-performing networks which does not guarantee a fast inference speed on hardware devices. In this paper, we propose a multi objectives reward function, which considers the network's latency and the predicted performance, and incorporate it into the Reinforcement Learning approach to search for the best networks with low latency. Unlike other methods, which use FLOPs to measure the latency that does not reflect the actual latency, we obtain the network's latency from the hardware NAS bench. We conduct extensive experiments on NAS-Bench-201 using CIFAR-10, CIFAR-100, and ImageNet-16-120 datasets, and show that the proposed method is capable of generating the best network under latency constrained without training subnetworks.

A Study of Integral Image Hardware Design for Memory Size Efficiency (메모리 크기에 효율적인 적분영상 하드웨어 설계 연구)

  • Lee, Su-Hyun;Jeong, Yong-Jin
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.51 no.9
    • /
    • pp.75-81
    • /
    • 2014
  • The integral image is the sum of input image pixel values. It is mainly used to speed up processing of a box filter operation, such as Haar-like features. However, large memory for integral image data can be an obstacle on an embedded hardware environment with limited memory resources. Therefore, an efficient method to store the integral image is necessary. In this paper, we propose a memory size reduction hardware design for integral image. The hardware design is used two methods. It is the new integral image memory and modulo calculation for reducing integral image data. The new integral image memory has additional calculation overhead, but it is not obstacle in hardware environment that parallel processing is possible. In the Xilinx Virtex5-LX330T targeted experimental result, integral image memory can be reduced by 50% on a $640{\times}480$ 8-bit gray-scale input image.

A STUDY ON THE RELIABILITY OF THE DAEJEON HARDWARE CORRELATOR FOR THE KVN OBSERVATION MODES (KVN 관측모드별 대전상관기의 상관결과 고찰)

  • OH, SE-JIN;ROH, DUK-GYOO;YEOM, JAE-HWAN;OH, CHUNG-SIK;LEE, SANG-SUNG;JUNG, DONG-KYU;KIM, HYO-RYOUNG;CHUNG, HYUN-SOO
    • Publications of The Korean Astronomical Society
    • /
    • v.31 no.2
    • /
    • pp.11-19
    • /
    • 2016
  • This paper presents the results of test observations toward a point source, 4C39.25, for observation modes with various bandwidths and numbers of IF streams in order to examine a reliability of the Daejeon hardware correlator performance for correlating VLBI (Very Long Baseline Interferometry) data obtained with the several observation modes of the KVN (Korean VLBI Network). We used a DiFX software correlator (DiFX) as a reference, for investigating the output visibilities from the Daejeon corelator. It is found that the band shapes of the output visibilities from two correlators are similar to each other and the correlated flux density for each baseline obtained from the Daejeon hardware correlator is lower by 3 - 7% than that from the DiFX. The flux difference is attributed to the limitation of FPGA resources and the difference of fringe rotation algorithm of the Daejeon hardware correlator. The conversion factor, 0.93 ~ 0.97, is proposed for future correlation with the Daejeon hardware correlator.

A New Parallelizing Algorithm and Cell-based Hardware Architecture for High-speed Generation of Digital Hologram (디지털 홀로그램의 고속 생성을 위한 병렬화 알고리즘 및 셀 기반의 하드웨어 구조)

  • Seo, Young-Ho;Choi, Hyun-Jun;Yoo, Ji-Sang;Kim, Dong-Wook
    • Journal of Broadcast Engineering
    • /
    • v.16 no.1
    • /
    • pp.54-63
    • /
    • 2011
  • This paper proposes a new equation to calculate computer-generated hologram (CGH) in a high speed and its cell-based VLSI (veri large scale integrated circuit) architecture. After finding the calculational regularity in the horizontal or vertical direction from the basic CGH equation, we induce the new equation to calculate the horizontal or vertical hologram pixel values in parallel. We also propose the architecture of the CGH cell consisting of a initial parameter calculator and update-phase calculator(s) on the basis of the equation and implement them in hardware. Also we show a hardware architecture to parallelize the calculation in the horizontal direction by extending CGH. In the experiments we analyze the used hardware resources. These analyses makes it possible to select the amount of hardware to the precision of the results. Here, for the CGH kernel and the structure of the processor, we used the platform from our previous works.

A New Hardware Architecture of High-Speed Motion Estimator for H.264 Video CODEC (H.264 비디오 코덱을 위한 고속 움직임 예측기의 하드웨어 구조)

  • Lim, Jeong-Hun;Seo, Young-Ho;Choi, Hyun-Jun;Kim, Dong-Wook
    • Journal of Broadcast Engineering
    • /
    • v.16 no.2
    • /
    • pp.293-304
    • /
    • 2011
  • In this paper, we proposed a new hardware architecture for motion estimation (ME) which is the most time-consuming unit among H.264 algorithms and designed to the type of intellectual property (IP). The proposed ME hardware consists of buffer, processing unit (PU) array, SAD (sum of absolute difference) selector, and motion vector (MVgenerator). PU array is composed of 16 PUs and each PU consists of 16 processing elements (PUs). The main characteristics of the proposed hardware are that current and reference frames are re-used to reduce the number of access to the external memory and that there is no clock loss during SAD operation. The implemented ME hardware occupies 3% hardware resources of StatixIII EP3SE80F1152C2 which is a FPGA of Altera Inc. and can operate at up to 446.43MHz. Therefore it can process up to 50 frames of 1080p in a second.

Hardware Implementation of Minimized Serial-Divider for Image Frame-Unit Processing in Mobile Phone Camera. (Mobile Phone Camera의 이미지 프레임 단위 처리를 위한 소형화된 Serial-Divider의 하드웨어 구현)

  • Kim, Kyung-Rin;Lee, Sung-Jin;Kim, Hyun-Soo;Kim, Kang-Joo;Kang, Bong-Soon
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2007.10a
    • /
    • pp.119-122
    • /
    • 2007
  • In this paper, we propose the method of hardware-design for the division operation of image frame-unit processing in mobile phone camera. Generally, there are two types of the data processing, which are the parallel and serial type. The parallel type makes it possible to process in realtime, but it needs significant hardware size due to many comparators and buffer memories. Compare the serial type with the parallel type, the hardware size of the serial type is smaller than the other because it uses only one comparator, but serial type is not able to process in realtime. To use the hardware resources efficiently, we employ the serial divider since frame-unit operation for image processing does not need realtime process. When compared with both in the same bit size and operating frequency, the hardware size of the serial divider is approximately in the ratio of 13 percentage compared with the parallel divider.

  • PDF

Study of an In-order SMT Architecture and Grouping Schemes

  • Moon, Byung-In;Kim, Moon-Gyung;Hong, In-Pyo;Kim, Ki-Chang;Lee, Yong-Surk
    • International Journal of Control, Automation, and Systems
    • /
    • v.1 no.3
    • /
    • pp.339-350
    • /
    • 2003
  • In this paper, we propose a simultaneous multithreading (SMT) architecture that improves instruction throughput by exploiting instruction level parallelism (ILP) and thread level parallelism (TLP). The proposed architecture issues and completes instructions belonging to the same thread in exact program order. The issue and completion policy greatly reduces the design complexity and hardware cost of our architecture, compared with others that employ out-of-order issue and completion. On the other hand, when the instructions belong to different threads, the issue and completion orders for those instructions may not necessarily be identical to the fetch order. The processor issues instructions simultaneously from multiple threads to functional units by exploiting ILP and TLP, and by dynamic resource sharing. That parallel execution notably improves performance and resource utilization with minimal additional hardware cost over the conventional superscalar processors. This paper proposes an SMT architecture with grouping as well as one without grouping. Without grouping, all threads dynamically and flexibly share most resources. On the other hand, in the SMT architecture with grouping, in which resources and threads are divided into several groups for design simplification, resources are shared only among threads belonging to the same group as those resources. Simulation results show that our processors with four and eight threads improve performance by three or more times over the conventional superscalar processor with comparable execution resources and policies, and that reasonable grouping reduces the design complexity of SMT processors with little negative effect on performance.