• Title/Summary/Keyword: barrel shifter

Log N-stage self-routing ATM concentrator (Log N 단 자기루팅 ATM 셀 집중기)

  • 이성창
    • Journal of the Korean Institute of Telematics and Electronics S
    • /
    • v.35S no.2
    • /
    • pp.50-57
    • /
    • 1998
  • In this paper, we propose a log N-stage ATM concentrator. An ATM concentrator is used in the ATM access network to concentrate the traffic offered at the UNI, so that high utilization of network resources is achieved. The concentrator may also be used as a building block in the design of an ATM switch. We define a basic element, named the equalizer, and describe its function and the theory for constructing an efficient concentrator from it. In addition, a control scheme that enhances the concentrator into a superconcentrator is presented, which enables the concentration to start from an arbitrary output. This scheme makes it possible to construct an efficient distributor or barrel shifter, both of which are often used in ATM switches and other applications. The proposed concentrator has a low hardware complexity of O(N log N), so it is economical to implement. Also, the time complexity of the proposed concentrator for determining the routing is O(N log N), which is faster than that of existing ones.

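
To illustrate why a barrel shifter fits naturally into a log N-stage structure, the following is a minimal sketch of a generic logarithmic barrel rotator: stage k rotates by 2^k positions when bit k of the shift amount is set. This is a textbook construction, not the equalizer-based design of the paper; the function name `barrel_rotate` and the list representation of the lines are assumptions made for illustration.

```python
def barrel_rotate(lines, shift):
    """Rotate `lines` left by `shift` positions using log2(N) stages.

    Hypothetical helper: each stage models one layer of 2:1 multiplexers
    that either passes its input through or rotates it by a power of two.
    """
    n = len(lines)
    if n == 0 or n & (n - 1):
        raise ValueError("number of lines must be a power of two")
    shift %= n
    stage = 0
    while (1 << stage) < n:
        step = 1 << stage
        if shift & step:
            # Output i takes input (i + step) mod n.
            lines = [lines[(i + step) % n] for i in range(n)]
        stage += 1
    return lines
```

For N lines the loop runs log2(N) times, and each stage touches N elements, which mirrors the O(N log N) element count of such a network.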

Design of a Bidirectional Switching Network for High-Speed Processing of LSI Pattern Data (LSI패턴 데이타 고속처리용 양방향 스위칭 네트워크 설계)

  • Kim, Seong-Jin;Seo, Hui-Don
    • The Transactions of the Korea Information Processing Society
    • /
    • v.1 no.1
    • /
    • pp.99-104
    • /
    • 1994
  • This paper proposes a method for processing large amounts of pattern data two-dimensionally at high speed in the physical design of LSI. It also shows that a switching network, which transmits pattern data bidirectionally between memory and processing elements at high speed, has been designed using a barrel shifter and simulated with a VHDL design system.


Implementation of A Pulse-mode Digital Neural Network with On-chip Learning Using Stochastic Computation (On-Chip 학습기능을 가진 확률연산 펄스형 디지털 신경망의 구현)

  • Wee, Jae-Woo;Lee, Chong-Ho
    • Proceedings of the KIEE Conference
    • /
    • 1998.07g
    • /
    • pp.2296-2298
    • /
    • 1998
  • In this paper, an on-chip learning pulse-mode digital neural network with a massively parallel yet compact and flexible network architecture is proposed. Algebraic neural operations are replaced by stochastic processes using pseudo-random sequences, and simple logic gates are used as the basic computing elements. Using the back-propagation algorithm, both the feed-forward and learning phases are efficiently implemented with simple logic gates. An RNG architecture using an LFSR and a barrel shifter is adopted to avoid correlation between pulse trains. The proposed network is designed as a digital circuit, and its performance is verified by computer simulation.

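
The idea of combining an LFSR with a barrel shifter can be sketched roughly as follows: a single LFSR is shared, and each pulse train compares its value against a differently rotated copy of the LFSR state. The 8-bit width, the tap set (8, 6, 5, 4), the seed, and the names `lfsr8_step`, `rotl8`, and `pulse_streams` are all assumptions for illustration; the abstract does not specify the actual RNG parameters.

```python
def lfsr8_step(state):
    """Advance an 8-bit Fibonacci LFSR by one step (assumed taps 8, 6, 5, 4)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
    return (state >> 1) | (bit << 7)


def rotl8(value, amount):
    """Barrel-rotate an 8-bit value left by `amount` bits."""
    amount %= 8
    return ((value << amount) | (value >> (8 - amount))) & 0xFF


def pulse_streams(values, seed=0x5A, length=255):
    """Emit one stochastic pulse train per input value (each in 0..255).

    Stream k compares its value with the shared LFSR state rotated by k
    bits, so the streams do not all see the same pseudo-random number.
    """
    state = seed
    streams = [[] for _ in values]
    for _ in range(length):
        for k, v in enumerate(values):
            streams[k].append(1 if rotl8(state, k) < v else 0)
        state = lfsr8_step(state)
    return streams
```

The fraction of ones in each stream approximates `value / 256`, which is the quantity a stochastic-computing gate would then operate on.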

Design of a Low Power Reconfigurable DSP with Fine-Grained Clock Gating (정교한 클럭 게이팅을 이용한 저전력 재구성 가능한 DSP 설계)

  • Jung, Chan-Min;Lee, Young-Geun;Chung, Ki-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.45 no.2
    • /
    • pp.82-92
    • /
    • 2008
  • Recently, digital signal processing (DSP) applications such as H.264, CDMA, and MP3 have become predominant tasks for modern high-performance portable devices. These applications are generally computation-intensive and therefore require quite complicated accelerator units to improve performance. Designing such specialized yet fixed DSP accelerators takes a lot of effort, so DSPs with multiple accelerators often have a very poor time-to-market and an unacceptable area overhead. To avoid such long time-to-market and high area overhead, dynamically reconfigurable DSP architectures have attracted a lot of attention lately. Dynamically reconfigurable DSPs typically employ a multi-functional DSP accelerator that executes multiple similar yet different kinds of computations for DSP applications. With this type of dynamically reconfigurable DSP accelerator, the time-to-market is reduced significantly. However, integrating multiple functionalities into a single IP often results in excessive control and area overhead, so delay and power consumption often turn out to be quite excessive. In this thesis, to reduce the power consumption of dynamically reconfigurable IPs, we propose a novel fine-grained clock gating scheme, and to reduce the size of dynamically reconfigurable IPs, we propose a compact multiplier-less multiplication unit in which shifters and adders carry out constant multiplications.
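
Multiplier-less constant multiplication with shifters and adders can be sketched as below. This uses a plain binary decomposition of the constant (one shift per set bit); it is not the unit proposed in the thesis, and the function names are hypothetical.

```python
def shift_add_terms(constant):
    """Return the bit positions that are set in a non-negative constant."""
    if constant < 0:
        raise ValueError("constant must be non-negative")
    return [i for i in range(constant.bit_length()) if (constant >> i) & 1]


def multiply_by_constant(x, constant):
    """Compute x * constant using only left shifts and additions."""
    return sum(x << i for i in shift_add_terms(constant))
```

For example, multiplying by 10 becomes `(x << 1) + (x << 3)`, so a fixed set of shift amounts and one adder tree replaces a general multiplier.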

Functional-Level Design and Simulation of a Graphics Processor (그래픽스 프로세서의 기능적 설계 및 시뮬레이션)

  • Bae, Seong-Ok;Lee, Hee-Choul;Kyung, Chong-Min
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.25 no.10
    • /
    • pp.1252-1262
    • /
    • 1988
  • This paper describes the functional-level design and simulation of a Graphics Processor (GP) that can be used in various graphics systems. The GP is divided into two parts: one is the CPU, and the other is the interface to I/O peripherals. To achieve fast execution of graphics instructions, the CPU has a special ALU, a barrel shifter, a window comparator, and a FIFO for instruction prefetch. The I/O part controls the DRAM and VRAM that constitute the GP's local memory, generates the signals that drive the monitor, and communicates with the host processor. The functional simulation of the CPU was done on a Daisy workstation, while the I/O part was designed using GENESIL, a silicon compiler.


Design of Hardwired Variable Length Decoder for H.264/AVC (하드웨어 구조의 H.264/AVC 가변길이 복호기 설계)

  • Yu, Yong-Hoon;Lee, Chan-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.45 no.11
    • /
    • pp.71-76
    • /
    • 2008
  • H.264 (or MPEG-4/AVC pt.10) is a widely used high-performance video coding standard. The variable length code (VLC) of the H.264 standard compresses data using the statistical distribution of values. A decoder parses the compressed bit stream and looks up decoded values in lookup tables, and the decoding process is not easy to implement in hardware. We propose an architecture of a variable length decoder (VLD) for the H.264 baseline profile (BP) L4. The CAVLD decodes syntax elements using a combination of arithmetic units and lookup tables for an optimized hardware architecture. A barrel shifter and a first 1's detector parse the NAL bit stream and are shared by the Exp-Golomb decoder and the CAVLD. A FIFO memory between the CAVLD and the reorder unit and a buffer at the output of the reorder unit eliminate the bottleneck in the data stream. The proposed VLD is designed in Verilog-HDL and implemented on an FPGA. The synthesis result using a $0.18{\mu}m$ standard CMOS technology shows that the gate count is 22,604 and that the decoder can process HD ($1920{\times}1080$) video at 120 MHz.
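
How a barrel shifter and a first 1's detector cooperate in Exp-Golomb parsing can be sketched in software as follows. The bit stream is modeled as a string of '0'/'1' characters, and slicing it at `pos` stands in for the barrel shifter that aligns the stream; the 32-bit window and the names `first_one_position` and `decode_ue` are assumptions, not details of the proposed hardware.

```python
def first_one_position(window, width):
    """Count the leading zeros of a `width`-bit window (first 1's detector)."""
    for i in range(width):
        if (window >> (width - 1 - i)) & 1:
            return i
    return width


def decode_ue(bits, pos, width=32):
    """Decode one unsigned Exp-Golomb code from `bits` starting at `pos`.

    Returns (code_num, new_pos).
    """
    # Barrel-shifter role: align the stream so the current bit is the MSB.
    chunk = bits[pos:pos + width].ljust(width, "0")
    window = int(chunk, 2)
    zeros = first_one_position(window, width)
    length = 2 * zeros + 1  # leading zeros, the marker 1, then `zeros` info bits
    if length > width or pos + length > len(bits):
        raise ValueError("truncated or oversized Exp-Golomb code")
    code_num = (window >> (width - length)) - 1
    return code_num, pos + length
```

Calling `decode_ue` repeatedly with the returned position walks through consecutive codes, with the position update playing the part of the shift amount fed back to the barrel shifter.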

Design of Low Cost H.264/AVC Entropy Coding Unit Using Code Table Pattern Analysis (코드 테이블 패턴 분석을 통한 저비용 H.264/AVC 엔트로피 코딩 유닛 설계)

  • Song, Sehyun;Kim, Kichul
    • Journal of IKEEE
    • /
    • v.17 no.3
    • /
    • pp.352-359
    • /
    • 2013
  • This paper proposes an entropy coding unit for the H.264/AVC baseline profile. Entropy coding requires code tables for macroblock encoding, and there are patterns in the codewords of each code table. In this paper, the patterns between codewords are analyzed to reduce the hardware cost. The entropy coding unit consists of an Exp-Golomb unit and a CAVLC unit. The Exp-Golomb unit can process five code types in a single unit and performs Exp-Golomb processing using only two adders. While typical CAVLC units use various code tables that require large amounts of resources, the proposed CAVLC unit exploits relationships between table elements to reduce the table sizes to about 40% or less of those in typical CAVLC units. After the Exp-Golomb unit and the CAVLC unit generate code values, the entropy unit uses a small shifter for bit-stream generation, whereas typical methods use barrel shifters.
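
For reference, the unsigned and signed Exp-Golomb mappings can be sketched with one addition each, which hints at why such a unit can be built from very few adders. This covers only two code types and follows the standard mapping; it is not the paper's five-type unit, and the function names are hypothetical.

```python
def encode_ue(code_num):
    """Return (value, length) of the unsigned Exp-Golomb code for code_num."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    value = code_num + 1
    length = 2 * value.bit_length() - 1  # leading zeros plus the value bits
    return value, length


def encode_se(v):
    """Map a signed value to its code number, then encode it as ue."""
    code_num = 2 * v - 1 if v > 0 else -2 * v
    return encode_ue(code_num)


def to_bits(value, length):
    """Render a (value, length) pair as a '0'/'1' string."""
    return format(value, f"0{length}b")
```

Each encoder returns a right-aligned value together with its bit length, which is the form a downstream shifter would need for bit-stream generation.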

Design of Entropy Encoder for Image Data Processing (화상정보처리를 위한 엔트로피 부호화기 설계)

  • Lim, Soon-Ja;Kim, Hwan-Yong
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.36C no.1
    • /
    • pp.59-65
    • /
    • 1999
  • In this paper, we design an entropy encoder for the HDTV/DTV encoder blocks based on MPEG-II. The designed entropy encoder outputs its bit stream at a 9 Mbps bit rate, inserting a zero-stepping block to prevent buffer depletion when the generated bit stream is stored in the buffer, and uses not only PROM but also combinational circuits to solve the critical-path problem. The packer block, part of the submerge block, is designed to pack data into 24-bit units using a barrel shifter. The encoder is composed of header information encoder, input information delay, submerge, and buffer control blocks. The designed circuits are verified by VHDL functional simulation. After performing P&R with a Gate compiler that applies the $0.8{\mu}m$ Gate Array specification, the pin count and gate count of the total circuit are 235 and about 120,000, respectively.

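
Packing variable-length codes into 24-bit units can be sketched as below. The shift that makes room for each incoming code plays the role the barrel shifter has in hardware. The class name `Packer24`, the accumulator layout, and the zero padding on flush are assumptions for illustration; the abstract does not describe the internals of the packer block.

```python
class Packer24:
    """Pack (value, length) variable-length codes into 24-bit words."""

    WORD = 24

    def __init__(self):
        self.acc = 0     # pending bits, right-aligned
        self.count = 0   # number of pending bits
        self.words = []  # completed 24-bit words

    def push(self, value, length):
        """Append the low `length` bits of `value` to the stream."""
        if length < 0 or not 0 <= value < (1 << length):
            raise ValueError("value does not fit in the given length")
        # Shift the pending bits left to make room for the new code.
        self.acc = (self.acc << length) | value
        self.count += length
        while self.count >= self.WORD:
            self.count -= self.WORD
            self.words.append((self.acc >> self.count) & 0xFFFFFF)
            self.acc &= (1 << self.count) - 1

    def flush(self):
        """Zero-pad any pending bits into a final word and return all words."""
        if self.count:
            self.words.append((self.acc << (self.WORD - self.count)) & 0xFFFFFF)
            self.acc = 0
            self.count = 0
        return self.words
```

A typical use is to call `push` once per encoded symbol and `flush` at the end of a block, so that codes straddling a word boundary are split across two output words.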

An Efficient Bit Stream Instruction-set for Network Packet Processing Applications (네트워크 패킷 처리를 위한 효율적인 비트 스트림 명령어 세트)

  • Yoon, Yeo-Phil;Lee, Yong-Surk;Lee, Jung-Hee
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.45 no.10
    • /
    • pp.53-58
    • /
    • 2008
  • This paper proposes a new set of instructions to improve the packet processing capacity of a network processor. The proposed instructions achieve more efficient packet processing by accelerating the integration of packet headers. Furthermore, a hardware configuration dedicated to processing overlay instructions was designed to reduce the additional hardware cost. For this purpose, the basic architecture of the network processor was designed using LISA, and the overlay block was optimized based on a barrel shifter. The block was synthesized to compare the area and the operation delay, and it was mapped to a C-level macro function using the compiler-known function (CKF) mechanism. The improvement in performance was confirmed by comparing the execution cycles and the execution time of an application program. Experiments were conducted using the Processor Designer and the Compiler Designer from CoWare. Synthesis with Synopsys tools for the TSMC $0.25{\mu}m$ process indicated a reduction in operation delay of 20.7% and an improvement in performance of 30.8% with the proposed set of instructions over the entire execution cycle.
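
One plausible reading of an overlay operation is writing a bit field into a word at a given offset, which in hardware needs a shifter to align the field and a mask to merge it. The sketch below shows that reading only; the semantics, the 32-bit word size, and the name `overlay` are assumptions, since the abstract does not define the instruction.

```python
def overlay(word, field, offset, width, word_bits=32):
    """Write the low `width` bits of `field` into `word` at bit `offset`."""
    if width <= 0 or offset < 0 or offset + width > word_bits:
        raise ValueError("field does not fit in the word")
    full = (1 << word_bits) - 1
    mask = ((1 << width) - 1) << offset
    # Shift the field into position, then merge it under the mask.
    return (word & ~mask & full) | ((field << offset) & mask)
```

Doing this in a single instruction replaces the separate shift, mask, and OR steps that a general-purpose instruction set would need when assembling a packet header.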