• Title/Summary/Keyword: FPGA 합성

Search Result 263, Processing Time 0.022 seconds

FPGA/GPU-based Autostereoscopic 3D Video Generation System (FPGA/GPU 기반 다시점 영상 생성 시스템)

  • Shin, Hong-Chang;Um, Gi-Mun;Kim, Chan;Cheong, Won-Sik;Hur, Namho
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2012.11a
    • /
    • pp.220-223
    • /
    • 2012
  • 본 논문에서는 스테레오 영상으로부터 무안경 3D 디스플레이를 위한 다시점 영상을 생성하는 시스템을 제안한다. 제안한 시스템에서는 먼저 비디오 캡쳐 카드를 통해 입력되는 스테레오 영상으로부터 FPGA 상에서 구현된 Trellis 동적 프로그래밍 기법에 의해 좌우 변이 영상을 실시간으로 추출한다. 이 변이 영상을 기반으로 좌우 영상 사이에서 중간 시점 영상을 생성한다. 이렇게 추출된 좌우 변이 영상과 좌우 스테레오 영상은 각각 USB 3.0 과 PCI-express 인터페이스를 통해 GPU 로 전송되고, GPU 에서는 이들 데이터를 사용하여 변이 기반 영상 합성 방법을 통해 다시점 영상을 생성한다. 생성된 다시점 영상은 다시점 3 차원 디스플레이 규격에 맞게 재배치되어 재생된다.

  • PDF

Logic Synthesis Algorithm for Multiplexer-based FPGA's Using BDD (멀티플렉서 구조의 FPGA를 위한 BDD를 이용한 논리 합성 알고리듬)

  • 강규현;이재흥;정정화
    • Journal of the Korean Institute of Telematics and Electronics A
    • /
    • v.30A no.12
    • /
    • pp.117-124
    • /
    • 1993
  • In this paper we propose a new thchnology mapping algorithm for multiplexer-based FPGA's The algorithm consists of three phases` First, it converts the logic functions and the basic logic mocule into BDD's. Second. it covers the logic function with the basic logic modules. Lastly, it reduces the number of basic logic modules used to implement the logic function after going through cell merging procedure. The binate selection is employed to determine the order of input variables of the logic function to constructs the balanced BDD with low level. That enables us to constructs the circuit that has small size and delay time. Technology mapping algorithm of previous work used one basic logic module to implement a two-input or three-input function in logic functions. The algorithm proposed here merges almost all pairs of two-input and three-input functions that occupy one basic logic module. and improves the mapping results. We show the effectiveness of the algorithm by comparing the results of our experiments with those of previous systems.

  • PDF

Digital Circuit Synthesis on FPGA by using Genetic Algorithm (유전자알고리즘을 이용한 FPGA에서의 디지털 회로의 합성)

  • Park, Tae-Suh;Wee, Jae-Woo;Lee, Chong-Ho
    • Proceedings of the KIEE Conference
    • /
    • 1999.07g
    • /
    • pp.2944-2946
    • /
    • 1999
  • In this paper, digital circuit evolution is proposed as an intrinsic evolvable system. Evolutionary hardware is a reconfigurable one which adapt itself to the environment and evolve its structure to realize desired performance. By using special FPGA and genetic algorithm, we have made a prototype of intrinsic hardware evolution system. As an example for digital circuit evolution, full adder realization is performed. As the result of this, a very complex structure of digital circuit performing full adder was created. Analysis made on the hardware revealed that some undetermined circuits were developed.

  • PDF

The Design of DWT Processor for RealTime Image Compression (실시간 영상압축을 위한 DWT 프로세서 설계)

  • Gu, Dae Seong;Kim, Jong Bin
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.5C
    • /
    • pp.654-654
    • /
    • 2004
  • 본 논문에서는 이산웨이블렛 변환을 이용한 영상 압축 프로세서를 하드웨어로 구현하였다. 웨이블렛 변환을 위하여 필터뱅크 및 피라미드 알고리즘을 이용하였고 각 필터들은 FIR 필터로 구현하였다. 병렬구조로 이루어져 동일 클럭 싸이클에서 하이패스와 로패스를 동시에 수행함으로써 속도를 향상시킬 뿐 아니라 QMF 특성을 이용하여 DWT 연산에 필요한 승산기의 수를 절반으로 줄임으로써 하드웨어 크기를 줄이고 이용효율 또한 높일 수 있다. 다중 해상도 분해 시 필요한 메모리 컨트롤러를 하드웨어로 구현하여 DWT 계산이 수행되므로 이 융자는 단순한 파라메터 입력만으로 효과적인 압축율을 얻을 수 있도록 구조적으로 설계하였다. 실시간 영상압축 프로세서의 성능 예측을 위하여 MATLAB을 통하여 시뮬레이션 하였고, VHDL을 이용하여 각 모듈들을 설계하였다. 설계한 영상압축기는 Leonaro-Spectrum에서 합성하였고, ALTERA FLEX10KE(EPF10K100 EFC256) FPGA에 이식하여 하드웨어적으로 동작을 검증하였다. 설계된 부호화기는 512×512 Woman 영상에 대하여 33㏈의 PSNR값을 갖는다. 그리고 설계된 프로세서를 FPGA 구현 시 35㎒에서 정상적으로 동작한다.

Design of an Optimized GPGPU for Data Reuse in DeepLearning Convolution (딥러닝 합성곱에서 데이터 재사용에 최적화된 GPGPU 설계)

  • Nam, Ki-Hun;Lee, Kwang-Yeob;Jung, Jun-Mo
    • Journal of IKEEE
    • /
    • v.25 no.4
    • /
    • pp.664-671
    • /
    • 2021
  • This paper proposes a GPGPU structure that can reduce the number of operations and memory access by effectively applying a data reuse method to a convolutional neural network(CNN). Convolution is a two-dimensional operation using kernel and input data, and the operation is performed by sliding the kernel. In this case, a reuse method using an internal register is proposed instead of loading kernel from a cache memory until the convolution operation is completed. The serial operation method was applied to the convolution to increase the effect of data reuse by using the principle of GPGPU in which instructions are executed by the SIMT method. In this paper, for register-based data reuse, the kernel was fixed at 4×4 and GPGPU was designed considering the warp size and register bank to effectively support it. To verify the performance of the designed GPGPU on the CNN, we implemented it as an FPGA and then ran LeNet and measured the performance on AlexNet by comparison using TensorFlow. As a result of the measurement, 1-iteration learning speed based on AlexNet is 0.468sec and the inference speed is 0.135sec.

Fast CA-CFAR Processor Design with Low Hardware Complexity (하드웨어 복잡도를 줄인 고속 CA-CFAR 프로세서 설계)

  • Hyun, Eu-Gin;Oh, Woo-Jin;Lee, Jong-Hun
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.48 no.5
    • /
    • pp.123-128
    • /
    • 2011
  • In this paper, we design the CA-CFAR processor using a root-square approximation approach and a fixed-point operation to improve hardware complexity and reduce computational effort. We also propose CA-CFAR processor with multi-window, which is capable of concurrent parallel processing. The proposed architecture is synthesized and implemented into the FPGA and the performance is compared with the conventional processor designed by root-square libarary licensed by FPGA corporation.

Design of an SPI Interface for multimedia cards in ARM Embedded Systems (ARM 내장 임베디드 시스템용 멀티미디어카드를 위한 SPI 인터페이스 설계)

  • Moon, San-Gook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.16 no.2
    • /
    • pp.273-278
    • /
    • 2012
  • In this contribution, we design and implement an SPI hardware interface for the microprocessor to communicate with the MMC (Multi-Media Card) in an embedded system. Proposed architecture is compatible with the APB in AMBA bus architecture. Embedding OS in an embedded system means a big burden in terms of hardware and software ending up with performance decline. In this paper, we adopt the concept of SPI communication without using OS in the embedded system and implement in a form of FPGA chip. The designed SPI module was automatically synthesized, placed, and routed. Implementation was performed through the Altera FPGA and well operated at 25MHz clock frequency, which satisfied our target speed.

A Hardware Architecture of Multibyte-based Regular Expression Pattern Matching for NIDS (NIDS를 위한 다중바이트 기반 정규표현식 패턴매칭 하드웨어 구조)

  • Yun, Sang-Kyun;Lee, Kyu-Hee
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.34 no.1B
    • /
    • pp.47-55
    • /
    • 2009
  • In recent network intrusion detection systems, regular expressions are used to represent malicious packets. In order to process incoming packets through high speed networks in real time, we should perform hardware-based pattern matching using the configurable device such as FPGAs. However, operating speed of FPGAs is slower than giga-bit speed network and so, multi-byte processing per clock cycle may be needed. In this paper, we propose a hardware architecture of multi-byte based regular expression pattern matching and implement the pattern matching circuit generator. The throughput improvements in four-byte based pattern matching circuit synthesized in FPGA for several Snort rules are $2.62{\sim}3.4$ times.

Implementation of Encryption Module for Securing Contents in System-On-Chip (콘텐츠 보호를 위한 시스템온칩 상에서 암호 모듈의 구현)

  • Park, Jin;Kim, Young-Geun;Kim, Young-Chul;Park, Ju-Hyun
    • The Journal of the Korea Contents Association
    • /
    • v.6 no.11
    • /
    • pp.225-234
    • /
    • 2006
  • In this paper, we design a combined security processor, ECC, MD-5, and AES, as a SIP for cryptography of securing contents. Each SIP is modeled and designed in VHDL and implemented as a reusable macro through logic synthesis, simulation and FPGA verification. To communicate with an ARM9 core, we design a BFM(Bus Functional Model) according to AMBA AHB specification. The combined security SIP for a platform-based SoC is implemented by integrating ECC, AES and MD-5 using the design kit including the ARM9 RISC core, one million-gate FPGA. Finally, it is fabricated into a MPW chip using Magna chip $0.25{\mu}m(4.7mm{\times}4.7mm$) CMOS technology.

  • PDF

An Area-efficient Implementation of Layered LDPC Decoder for IEEE 802.11n WLAN (IEEE 802.11n WLAN 표준용 Layered LDPC 복호기의 저면적 구현)

  • Jeong, Sang-Hyeok;Na, Young-Heon;Shin, Kyung-Wook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2010.05a
    • /
    • pp.486-489
    • /
    • 2010
  • This paper describes a layered LDPC decoder which supports block length of 1,944 bits and code rate 1/2 for IEEE 802.11n WLAN standard. To reduce the hardware complexity, the min-sum algorithm and layered architecture is adopted. A novel memory reduction technique suitable for min-sum algorithm reduces memory size by 75% compared with conventional method. The designed processor has 200,400 gates and 19,400 bits memory, and it is verified by FPGA implementation. The estimated throughput is about 200 Mbps at 120 MHz clock by using Xilinx Virtex-4 FPGA device.

  • PDF