• Title/Summary/Keyword: SRAM Memory

Search Result 176, Processing Time 0.018 seconds

Implementation and verification of H.264 / AVC Intra Predictor for mobile environment (모바일 환경에서의 H.264 / AVC를 위한 인트라 예측기의 구현 및 검증)

  • Yun, Cheol-Hwan;Jeong, Yong-Jin
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.12
    • /
    • pp.93-101
    • /
    • 2007
  • Small area and low power implementation are important requirements for various multimedia processing hardware, especially for mobile environment. This paper presents a hardware architecture of H.264/AVC Intra Prediction module aiming on small area and low power. A single arithmetic unit was shared and processed sequentially for all mode decisions and computations to predict an image frame. As a result, we could get smaller area and smaller memory size compared to other existing implementations. The proposed architecture was verified using the Altera Excalibur device, and the implemented hardware has been described in Verilog-HDL and synthesized on Samsung STD130 0.18um CMOS Standard Cell Library using Synopsys Design Compiler. The synthesis result was about 11.9K logic gates and 1078 byte internal SRAM and the maximum operating frequency was 107Mhz. It consumes 879,617 clocks to process one QCIF frame, which means it can process 121.5 QCIF$(176\times144)$ frames per second, therefore it shows that it can be used for real time H.264/AVC encoding of various multimedia applications.

A Study on the Pixel-Parallel Usage Processing Using the Format Converter (포맷 변환기를 이용한 화소-병렬 화상처리에 관한 연구)

  • Kim, Hyeon-Gi;Lee, Cheon-Hui
    • The KIPS Transactions:PartA
    • /
    • v.9A no.2
    • /
    • pp.259-266
    • /
    • 2002
  • In this paper we implemented various image processing filtering using the format converter. This design method is based on realized the large processor-per-pixel array by integrated circuit technology. These two types of integrated structure are can be classify associative parallel processor and parallel process DRAM (or SRAM) cell. Layout pitch of one-bit-wide logic is Identical memory cell pitch to array high density PEs in integrate structure. This format converter design has control path implementation efficiently, and can be utilize the high technology without complicated controller hardware. Sequence of array instruction are generated by host computer before process start, and instructions are saved on unit controller. Host computer is executed the pixel-parallel operation starting at saved instructions after processing start. As a result, we obtained three result that 1) simple smoothing suppresses higher spatial frequencies, reducing noise but also blurring edges, 2) a smoothing and segmentation process reduces noise while preserving sharp edges, and 3) median filtering may be applied to reduce image noise. Median filtering eliminates spikes while maintaining sharp edges and preserving monotonic variations in pixel values.

A Fast IP Lookups using Dynamic Trie Compression (능동적 트라이 압축을 이용한 고속 IP 검색)

  • Oh, Seung-Hyun
    • The KIPS Transactions:PartA
    • /
    • v.10A no.5
    • /
    • pp.453-462
    • /
    • 2003
  • IP address lookup of router searches and decide proper output link using destination address of IP packet that arrie into router. The IP address lookup is essential part in te development of high-speed router needed to high-speed backbone network as one of bottleneck of router performance. This paper introduces DTC data structure that can support gigabit IP address lookup by dynamic trie compression technique that just uses small memory in conventional Pentium CPU. When make a forwarding table by trie compression, the DTC can dynamically select a size of data structure with considering correlation between table's size and searching speed. Also, when compress the prefix trie, DTC makes IP address lookup on the forwarding table of a search on the high speed SRAM cache by minimizing the size of data structure reflecting the structure of the trie. In the experiment result, the DTC data structure recorded performance of maximum $12.5{\times}10^5$ LPS (lookup per second) in conventional Pentium CPU through a dynamic building of most suitable compression over variety of routing tables.

Score Arbitration Scheme For Decrease of Bus Latency And System Performance Improvement (버스 레이턴시 감소와 시스템 성능 향상을 위한 스코어 중재 방식)

  • Lee, Kook-Pyo;Yoon, Yung-Sup
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.2
    • /
    • pp.38-44
    • /
    • 2009
  • Bus system consists of several masters, slaves, arbiter and decoder in a bus. Master means the processor that performs data command like CPU, DMA, DSP and slave means the memory that responds the data command like SRAM, SDRAM and register. Furthermore, as multiple masters can't use a bus concurrently, arbiter plays an role in bus arbitration. In compliance with the selection of arbitration method bus system performance can be charged definitely. Fixed priority and round-robin are used in general arbitration method and TDMA and Lottery bus methods are proposed currently as the improved arbitration schemes. In this stuff, we proposed the score arbitration method and composed TLM algorithm. Also we analyze the performance compared with general arbitration methods through simulation. In the future, bus arbitration policy will be developed with the basis of the score arbitration method and improve the performance of bus system.

A High Performance Flash Memory Solid State Disk (고성능 플래시 메모리 솔리드 스테이트 디스크)

  • Yoon, Jin-Hyuk;Nam, Eyee-Hyun;Seong, Yoon-Jae;Kim, Hong-Seok;Min, Sang-Lyul;Cho, Yoo-Kun
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.14 no.4
    • /
    • pp.378-388
    • /
    • 2008
  • Flash memory has been attracting attention as the next mass storage media for mobile computing systems such as notebook computers and UMPC(Ultra Mobile PC)s due to its low power consumption, high shock and vibration resistance, and small size. A storage system with flash memory excels in random read, sequential read, and sequential write. However, it comes short in random write because of flash memory's physical inability to overwrite data, unless first erased. To overcome this shortcoming, we propose an SSD(Solid State Disk) architecture with two novel features. First, we utilize non-volatile FRAM(Ferroelectric RAM) in conjunction with NAND flash memory, and produce a synergy of FRAM's fast access speed and ability to overwrite, and NAND flash memory's low and affordable price. Second, the architecture categorizes host write requests into small random writes and large sequential writes, and processes them with two different buffer management, optimized for each type of write request. This scheme has been implemented into an SSD prototype and evaluated with a standard PC environment benchmark. The result reveals that our architecture outperforms conventional HDD and other commercial SSDs by more than three times in the throughput for random access workloads.

Radiation-Induced Soft Error Detection Method for High Speed SRAM Instruction Cache (고속 정적 RAM 명령어 캐시를 위한 방사선 소프트오류 검출 기법)

  • Kwon, Soon-Gyu;Choi, Hyun-Suk;Park, Jong-Kang;Kim, Jong-Tae
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.35 no.6B
    • /
    • pp.948-953
    • /
    • 2010
  • In this paper, we propose multi-bit soft error detection method which can use an instruction cache of superscalar CPU architecture. Proposed method is applied to high-speed static RAM for instruction cache. Using 1D parity and interleaving, it has less memory overhead and detects more multi-bit errors comparing with other methods. It only detects occurrence of soft errors in static RAM. Error correction is treated like a cache miss situation. When soft errors are occurred, it is detected by 1D parity. Instruction cache just fetch the words from lower-level memory to correct errors. This method can detect multi-bit errors in maximum 4$\times$4 window.

Laser Thermal Processing System for Creation of Low Temperature Polycrystalline Silicon using High Power DPSS Laser and Excimer Laser

  • Kim, Doh-Hoon;Kim, Dae-Jin
    • 한국정보디스플레이학회:학술대회논문집
    • /
    • 2006.08a
    • /
    • pp.647-650
    • /
    • 2006
  • Low temperature polycrystalline silicon (LTPS) technology using a high power laser have been widely applied to thin film transistors (TFTs) for liquid crystal, organic light emitting diode (OLED) display, driver circuit for system on glass (SOG) and static random access memory (SRAM). Recently, the semiconductor industry is continuing its quest to create even more powerful CPU and memory chips. This requires increasing of individual device speed through the continual reduction of the minimum size of device features and increasing of device density on the chip. Moreover, the flat panel display industry also need to be brighter, with richer more vivid color, wider viewing angle, have faster video capability and be more durable at lower cost. Kornic Systems Co., Ltd. developed the $KORONA^{TM}$ LTP/GLTP series - an innovative production tool for fabricating flat panel displays and semiconductor devices - to meet these growing market demands and advance the volume production capabilities of flat panel displays and semiconductor industry. The $KORONA^{TM}\;LTP/GLTP$ series using DPSS laser and XeCl excimer laser is designed for the new generation of the wafer & FPD glass annealing processing equipment combining advanced low temperature poly-silicon (LTPS) crystallization technology and object-oriented software architecture with a semistandard graphical user interface (GUI). These leading edge systems show the superior annealing ability to the conventional other method. The $KORONA^{TM}\;LTP/GLTP$ series provides technical and economical benefits of advanced annealing solution to semiconductor and FPD production performance with an exceptional level of productivity. High throughput, low cost of ownership and optimized system efficiency brings the highest yield and lowest cost per wafer/glass on the annealing market.

  • PDF

Design and implementation of the SliM image processor chip (SliM 이미지 프로세서 칩 설계 및 구현)

  • 옹수환;선우명훈
    • Journal of the Korean Institute of Telematics and Electronics A
    • /
    • v.33A no.10
    • /
    • pp.186-194
    • /
    • 1996
  • The SliM (sliding memory plane) array processor has been proposed to alleviate disadvantages of existing mesh-connected SIMD(single instruction stream- multiple data streams) array processors, such as the inter-PE(processing element) communication overhead, the data I/O overhead and complicated interconnections. This paper presents the deisgn and implementation of SliM image processor ASIC (application specific integrated circuit) chip consisting of mesh connected 5 X 5 PE. The PE architecture implemented here is quite different from the originally proposed PE. We have performed the front-end design, such as VHDL (VHSIC hardware description language)modeling, logic synthesis and simulation, and have doen the back-end design procedure. The SliM ASIC chip used the VTI 0.8$\mu$m standard cell library (v8r4.4) has 55,255 gates and twenty-five 128 X 9 bit SRAM modules. The chip has the 326.71 X 313.24mil$^{2}$ die size and is packed using the 144 pin MQFP. The chip operates perfectly at 25 MHz and gives 625 MIPS. For performance evaluation, we developed parallel algorithms and the performance results showed improvement compared with existing image processors.

  • PDF

Multi-match Packet Classification Scheme Combining TCAM with an Algorithmic Approach

  • Lim, Hysook;Lee, Nara;Lee, Jungwon
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.6 no.1
    • /
    • pp.27-38
    • /
    • 2017
  • Packet classification is one of the essential functionalities of Internet routers in providing quality of service. Since the arrival rate of input packets can be tens-of-millions per second, wire-speed packet classification has become one of the most challenging tasks. While traditional packet classification only reports a single matching result, new network applications require multiple matching results. Ternary content-addressable memory (TCAM) has been adopted to solve the multi-match classification problem due to its ability to perform fast parallel matching. However, TCAM has a fundamental issue: high power dissipation. Since TCAM is designed for a single match, the applicability of TCAM to multi-match classification is limited. In this paper, we propose a cost- and energy-efficient multi-match classification architecture that combines TCAM with a tuple space search algorithm. The proposed solution uses two small TCAM modules and requires a single-cycle TCAM lookup, two SRAM accesses, and several Bloom filter query cycles for multi-match classifications.

Design of a Fast Multi-Reference Frame Integer Motion Estimator for H.264/AVC

  • Byun, Juwon;Kim, Jaeseok
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.13 no.5
    • /
    • pp.430-442
    • /
    • 2013
  • This paper presents a fast multi-reference frame integer motion estimator for H.264/AVC. The proposed system uses the previously proposed fast multi-reference frame algorithm. The previously proposed algorithm executes a full search area motion estimation in reference frames 0 and 1. After that, the search areas of motion estimation in reference frames 2, 3 and 4 are minimized by a linear relationship between the motion vector and the distances from the current frame to the reference frames. For hardware implementation, the modified algorithm optimizes the search area, reduces the overlapping search area and modifies a division equation. Because the search area is reduced, the amount of computation is reduced by 58.7%. In experimental results, the modified algorithm shows an increase of bit-rate in 0.36% when compared with the five reference frame standard. The pipeline structure and the memory controller are also adopted for real-time video encoding. The proposed system is implemented using 0.13 um CMOS technology, and the gate count is 1089K with 6.50 KB of internal SRAM. It can encode a Full HD video ($1920{\times}1080P@30Hz$) in real-time at a 135 MHz clock speed with 5 reference frames.