• Title/Summary/Keyword: SystemVerilog

Search Result 198, Processing Time 0.026 seconds

FPGA Design of a Parallel Canny Edge Detector with Optimized Local Buffers (로컬 버퍼 최적화를 통한 병렬 처리 캐니 경계선 검출기의 FPGA 설계)

  • Ingi Min;Suhyun Sim;Seungwon Hwang;Sunhee Kim
    • Journal of the Semiconductor & Display Technology
    • /
    • v.22 no.4
    • /
    • pp.59-65
    • /
    • 2023
  • Edge detection in image processing and computer vision is one of the most fundamental operations. Canny edge detection algorithm has excellent performance and is currently widely used. However, it is difficult to process the algorithm in real-time because the algorithm is complex. In this study, the equations required in the algorithm were simplified to facilitate hardware implementation, and the calculation speed was increased by using a parallel structure. In particular, the size and management of local buffers were selected in consideration of parallel processing and filter size so that data could be processed without bottlenecks. It was designed in verilog and implemented in FPGA to verify operation and performance.

  • PDF

SLEDS:A System-Level Event-Driven Simulator for Asynchronous Microprocessors (SLEDS:비동기 마이크로프로세서를 위한 상위 수준 사건구동식 시뮬레이터)

  • Choi, Sang-Ik;Lee, Jeong-Gun;Kim, Eui-Seok;Lee, Dong-Ik
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.29 no.1
    • /
    • pp.42-56
    • /
    • 2002
  • It is possible but not efficient to model and simulate asynchronous microprocessors with the existing HDLs(HARDware Description Languages) such as VHDL or Verilog. The reason it that the description becomes too complex. and also the simulation time becomes too long to explore the design space. Therefore it is necessary to establish a methodology and develop a tool for modeling the handshake protocol of asynchronous microprocessors very easily and simulating it very fast. Under this objective an efficient CAD(Computer Aided Design) tool SLEDS(System Level Event-Driven Simulator) was developed which can evaluate performance of a processor through modeling with a simple description an simulating with event driven engine in the system level. The ultimate goal in the tool SLEDS is to fin the optimal conditions for a system to produce high performance by balancing the delay of each module in the system. Besides SLEDS aims at verifying the design through comparing the expected results with the actual ones by performing the defined behavior.

Comparison among Gamma(${\gamma}$) Line Systems for Non-Linear Gamma Curve (비선형 감마 커브를 위한 감마 라인 시스템의 비교)

  • Jang, Won-Woo;Lee, Sung-Mok;Ha, Joo-Young;Kim, Joo-Hyun;Kim, Sang-Choon;Kang, Bong-Soon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.11 no.2
    • /
    • pp.265-272
    • /
    • 2007
  • This proposed gamma (${\gamma}$) correction system is developed to reduce the difference between non-linear gamma curve produced by a typical formula and result produced by the proposed algorithm. In order to reduce the difference, the proposed system is using the Least Squares Polynomial which is calculating the best fitting polynomial through a set of points which is sampled. Each system is consisting of continuous several kinds of equations and having their own overlap sections to get more precise. Based on the algorithm verified by MATLAB, the proposed systems are implemented by using Verilog-HDL. This paper will compare the previous algorithm of gamma system such as Existing system with Seed Table with the latest that such as Proposed system. The former and the latter system have 1, 2 clock latency; each 1 result per clock. Because each of the error range (LSB) is $1{\sim}+1,\;0{\sim}+36$, we can how that Proposed system is improved. Under the condition of SAMSUNG STD90 0.35 worst case, each gate count is 2,063, 2,564 gates and each maximum data arrival time is 29.05[ns], 17.52[ns], respectively.

Design of 10bit gamma line system with small size of gate count and 4bit error(LSB) to implement non-linear gamma curve (비선형 감마 커브 구현을 위한 작은 크기와 4bit(LSB) 오차를 가진 10비트 감마 라인 시스템의 설계)

  • Jang, Won-Woo;Kim, Hyun-Sik;Lee, Sung-Mok;Kim, In-Kyu;Kang, Bong-Soon
    • Proceedings of the Korea Institute of Convergence Signal Processing
    • /
    • 2005.11a
    • /
    • pp.353-356
    • /
    • 2005
  • In this paper, the proposed $gamma({\gamma})$ line system is developed for reducing the error between non-linear gamma curve produced by a formula and result produced by hardware implementation. The proposed algorithm and system is based on the specific gamma value 2.2, namely the formula is represented by {0,1}$^{2.2}$ and the bit width of input and out data is 10bit. In order to reduce the error, the system is using least squares polynomial of the numerical method which is calculating the best fitting polynomial through a set of points. The proposed gamma line is consisting of nine kinds of quadratic equations, each with their own overlap sections to get more precise. Based on the algorithm verified by $MATLAB^{TM}$ 7.0, the proposed system is implemented by using Verilog-HDL. The proposed system has 2 clock latency; 1 result per clock. The error range (LSB) is -4 and +3. Its standard deviation is 1.287956238. The total gate count of system is 2,083 gates and the maximum timing is 15.56[ns].

  • PDF

Design of Message Passing Engine Based on Processing Node Status for MPI Collective Communication (MPI 집합통신을 위한 프로세싱 노드 상태 기반의 메시지 전달 엔진 설계)

  • Chung, Won-Young;Lee, Yong-Surk
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.37 no.8B
    • /
    • pp.668-676
    • /
    • 2012
  • In this paper, on the assumption that MPI collective communication function is converted into a group of point-to-point communication functions in the transaction level, an algorithm that optimizes broadcast, scatter and gather function among MPI collective communication is proposed. The MPI hardware engine that operates the proposed algorithm was designed, and it was named the OCC-MPE (Optimized Collective Communication Message Passing Engine). The OCC-MPE operates point-to-point communication by using the standard send mode. The transmission order is arranged according to the algorithm that proposes the most frequently used broadcast, scatter and gather functions among the collective communications, so the whole communication time is reduced. To measure the performance of the proposed algorithm, the OCC-MPE with the Bus Functional Model (BFM) based on SystemC was designed. After evaluating the performance through the BFM based on SystemC, the proposed OCC-MPE is designed by using VerilogHDL. As a result of synthesizing with the TSMC $0.18{\mu}m$, the gate count of each OCC-MPE is approximately 1978.95 with four processing nodes. That occupies approximately 4.15% in the whole system, which means it takes up a relatively small amount. Improved performance is expected with relatively small amounts of area increase if the OCC-MPE operated by the proposed algorithm is added to the MPSoC (Multi-Processor System on a Chip).

System Development and IC Implementation of High-quality and High-performance Image Downscaler Using 2-D Phase-correction Digital Filters (2차원 위상 교정 디지털 필터를 이용한 고성능/고화질의 영상 축소기 시스템 개발 및 IC 구현)

  • 강봉순;이영호;이봉근
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.2 no.3
    • /
    • pp.93-101
    • /
    • 2001
  • In this paper, we propose an image downscaler used in multimedia video applications, such as DTV, TV-PIP, PC-video, camcorder, videophone and so on. The proposed image downscaler provides a scaled image of high-quality and high-performance. This paper will explain the scaling theory using two-dimensional digital filters. It is the method that removes an aliasing noise and decreases the hardware complexity, compared with Pixel-drop and Upsamling. Also, this paper will prove it improves scaling precisians and decreases the loss of data, compared with the Scaler32, the Bt829 of Brooktree, and the SAA7114H of Philips. The proposed downscaler consists of the following four blocks: line memory, vertical scaler, horizontal scaler, and FIFO memory. In order to reduce the hardware complexity, the using digital filters are implemented by the multiplexer-adder type scheme and their all the coefficients can be simply implemented by using shifters and adders. It also decreases the loss of high frequency data because it provides the wider BW of 6MHz as adding the compensation filter. The proposed downscaler is modeled by using the Verilog-HDL and the model is verified by using the Cadence simulator. After the verification is done, the model is synthesized into gates by using the Synopsys. The synthesized downscaler is Placed and routed by the Mentor with the IDEC-C632 0.65${\mu}{\textrm}{m}$ library for further IC implementation. The IC master is fixed in size by 4,500${\mu}{\textrm}{m}$$\times$4,500${\mu}{\textrm}{m}$. The active layout size of the proposed downscaler is 2,528${\mu}{\textrm}{m}$$\times$3,237${\mu}{\textrm}{m}$.

  • PDF

Design and Implementation of Seismic Data Acquisition System using MEMS Accelerometer (MEMS형 가속도 센서를 이용한 지진 데이터 취득 시스템의 설계 및 구현)

  • Choi, Hun;Bae, Hyeon-Deok
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.61 no.6
    • /
    • pp.851-858
    • /
    • 2012
  • In this paper, we design a seismic data acquisition system(SDAS) and implement it. This system is essential for development of a noble local earthquake disaster preventing system in population center. In the system, we choose a proper MEMS-type triaxial accelerometer as a sensor, and FPGA and ARM processor are used for implementing the system. In the SDAS, each module is realized by Verilog HDL and C Language. We carry out the ModelSim simulation to verify the performances of important modules. The simulation results show that the FPGA-based data acquisition module can guarantee an accurate time-synchronization for the measured data from each axis sensor. Moreover, the FPGA-ARM based embedded technology in system hardware design can reduce the system cost by the integration of data logger, communication sever, and facility control system. To evaluate the data acquisition performance of the SDAS, we perform experiments for real seismic signals with the exciter. Performances comparison between the acquired data of the SDAS and the reference sensor shows that the data acquisition performance of the SDAS is valid.

Implementation of the Digital Current Control System for an Induction Motor Using FPGA (FPGA를 이용한 유도 전동기의 디지털 전류 제어 시스템 구현)

  • Yang, Oh
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.35C no.11
    • /
    • pp.21-30
    • /
    • 1998
  • In this paper, a digital current control system using a FPGA(Field Programmable Gate Array) was implemented, and the system was applied to an induction motor widely used as an industrial driving machine. The FPGA designed by VHDL(VHSIC Hardware Description Language) consists of a PWM(Pulse Width Modulation) generation block, a PWM protection block, a speed measuring block, a watch dog timer block, an interrupt control block, a decoder logic block, a wait control block and digital input and output blocks respectively. Dedicated clock inputs on the FPGA were used for high-speed execution, and an up-down counter and a latch block were designed in parallel, in order that the triangle wave could be operated at 40 MHz clock. When triangle wave is compared with many registers respectively, gate delay occurs from excessive fan-outs. To reduce the delay, two triangle wave registers were implemented in parallel. Amplitude and frequency of the triangle wave, and dead time of PWM could be changed by software. This FPGA was synthesized by pASIC 2SpDE and Synplify-Lite synthesis tool of Quick Logic company. The final simulation for worst cases was successfully performed under a Verilog HDL simulation environment. And the FPGA programmed for an 84 pin PLCC package was applied to digital current control system for 3-phase induction motor. The digital current control system of the 3 phase induction motor was configured using the DSP(TMS320C31-40 MHz), FPGA, A/D converter and Hall CT etc., and experimental results showed the effectiveness of the digital current control system.

  • PDF

Using DSP Algorithms for CRC in a CAN Controller

  • Juan, Ronnie O. Serfa;Kim, Hi Seok
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.5 no.1
    • /
    • pp.29-34
    • /
    • 2016
  • A controller area network (CAN) controller is an integral part of an electronic control unit, particularly in an advanced driver assistance system application, and its characteristics should always be advantageous in all aspects of functionality especially in real time application. The cost should be low, while maintaining the functionality and reliability of the technology. However, a CAN protocol implementing serial operation results in slow throughput, especially in a cyclical redundancy checking (CRC) unit. In this paper, digital signal processing (DSP) algorithms are implemented, namely pipelining, unfolding, and retiming the CAN controller in the CRC unit, particularly for the encoder and decoder sections. It must attain a feasible iteration bound, a critical path that is appropriate for a CAN system, and must obtain a superior design of a high-speed parallel circuit for the CRC unit in order to have a faster transmission rate. The source code for the encoder and decoder was formulated in the Verilog hardware description language.

Implementation of the Multi-Channel Network Controller using Buffer Sharing Mechanism (버퍼공유기법을 사용한 멀티채널 네트워크 컨트롤러 구현)

  • Lee, Tae-Su;Park, Jae-Hyun
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.56 no.4
    • /
    • pp.784-789
    • /
    • 2007
  • This paper presents an implementation of a new type of architecture to improve an overflow problem on the network buffer. Each receiver channel of network system stores the message in its own buffer. If some receiver channel receives many messages, buffer overflow problem may occur for the channel. This paper proposes a network controller that implements a receiver channel with shared-memory to save all of the received messages from the every incomming channels. The proposed architecture is applied to ARINC-429, a real-time control network for commercial avionics system. For verifying performance of the architecture, ARINC-429 controller is designed using a SOPC platform, designed by Verilog and targeted to Xilinx Virtex-4 with a built-in PPC405 core.