• Title/Summary/Keyword: multi-bit memory

Design of a Fast Multi-Reference Frame Integer Motion Estimator for H.264/AVC

  • Byun, Juwon;Kim, Jaeseok
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.13 no.5
    • /
    • pp.430-442
    • /
    • 2013
  • This paper presents a fast multi-reference frame integer motion estimator for H.264/AVC. The proposed system builds on a previously proposed fast multi-reference frame algorithm, which performs full-search motion estimation in reference frames 0 and 1 and then narrows the search areas in reference frames 2, 3 and 4 using the linear relationship between the motion vector and the distance from the current frame to each reference frame. For hardware implementation, the modified algorithm optimizes the search area, reduces the overlapping search area and modifies a division equation. Because the search area is smaller, the amount of computation is reduced by 58.7%. In experiments, the modified algorithm increases the bit-rate by 0.36% compared with the five-reference-frame standard. A pipeline structure and a memory controller are also adopted for real-time video encoding. The proposed system is implemented in 0.13 μm CMOS technology, with a gate count of 1089K and 6.50 KB of internal SRAM. It can encode Full HD video ($1920{\times}1080$p at 30 Hz) in real time at a 135 MHz clock speed with 5 reference frames.
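
The linear motion-vector relationship and the real-time budget in this abstract can be illustrated with a short Python sketch. It is not the authors' hardware algorithm: the function name `scale_motion_vector`, the example vector, and the assumption that reference frame k lies k+1 frames before the current frame are all hypothetical, as is the use of 16×16 macroblocks with the frame height padded to a multiple of 16.

```python
from fractions import Fraction

def scale_motion_vector(mv, src_dist, dst_dist):
    """Linearly scale a motion vector found at temporal distance src_dist
    to predict the displacement at temporal distance dst_dist."""
    ratio = Fraction(dst_dist, src_dist)
    return tuple(round(component * ratio) for component in mv)

# Hypothetical example: a vector of (4, -2) pixels found in reference frame 1
# (assumed to be 2 frames back) predicts the centre of the reduced search
# area in reference frame 3 (assumed to be 4 frames back).
center = scale_motion_vector((4, -2), src_dist=2, dst_dist=4)

# Clock-cycle budget per macroblock for 1920x1080 at 30 Hz and 135 MHz.
mb_cols = 1920 // 16                  # 120 macroblock columns
mb_rows = -(-1080 // 16)              # 68 rows (ceiling division, padded height)
mb_per_second = mb_cols * mb_rows * 30
cycles_per_mb = 135_000_000 / mb_per_second

print(center, mb_per_second, round(cycles_per_mb, 1))
```

Under these assumptions the estimator has roughly 550 clock cycles to process each macroblock across all five reference frames, which suggests why shrinking the search area in frames 2 to 4 matters for real-time operation.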

A study on the hybrid communication system to remove the communication shadow area for controller system of navigational aids (전파 음영지역 해소를 위한 항로표지관리용 하이브리드 통신 시스템에 관한 연구)

  • Jeon, Joong Sung
    • Journal of Advanced Marine Engineering and Technology
    • /
    • v.37 no.4
    • /
    • pp.409-417
    • /
    • 2013
  • A multi-communication board is designed around the ATxmega128A1, a low-power 8-bit microcontroller. The ATxmega128A1 provides 8 UART (universal asynchronous receiver/transmitter) ports, each of which can be given a suitable user interface through a command line interpreter (CLI) program, together with 2 kbytes of EEPROM, 128 kbytes of flash memory and 8 kbytes of SRAM. The 8 UART ports serve the multi-communication modems, the GPS module and other peripherals; the EEPROM stores the configuration used when the program runs; the 128 kbytes of flash memory store the firmware; and the 8 kbytes of SRAM hold the stack and global variables while the program runs. Using hybrid communication with path optimization across VHF, TRS and CDMA to remotely control AtoN (aids to navigation) makes it possible to remove the communication shadow area: even where an individual communication method has a shadow area, an optimum communication method can be selected. Data compatibility is improved by using the same data frame for every communication device. In the test, 8640 data records were collected from each buoy over 30 days at 5-minute intervals, and the data reception rate exceeded 99.4%.
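
The test figures and the idea of falling back between links can be sketched in Python as follows. The priority order in `LINK_PRIORITY` and the function `select_link` are hypothetical; the abstract does not describe how the optimum communication method is actually chosen.

```python
SAMPLE_INTERVAL_MIN = 5
TEST_DAYS = 30

# One report every 5 minutes for 30 days.
expected_reports = TEST_DAYS * 24 * 60 // SAMPLE_INTERVAL_MIN

# Smallest number of received reports that is strictly above 99.4 %.
min_received = expected_reports * 994 // 1000 + 1
max_lost = expected_reports - min_received

# Hypothetical priority order; the paper's selection rule is not given.
LINK_PRIORITY = ("VHF", "TRS", "CDMA")

def select_link(available):
    """Return the first link in priority order that is currently usable,
    or None if the buoy is in a shadow area for every link."""
    for link in LINK_PRIORITY:
        if available.get(link, False):
            return link
    return None

print(expected_reports, min_received, max_lost)
print(select_link({"VHF": False, "TRS": True, "CDMA": True}))
```

The first part reproduces the 8640 records quoted in the abstract; a reception rate above 99.4% then corresponds to at most 51 lost records per buoy.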

Analysis of read speed latency in 6T-SRAM cell using multi-layered graphene nanoribbon and cu based nano-interconnects for high performance memory circuit design

  • Sandip, Bhattacharya;Mohammed Imran Hussain;John Ajayan;Shubham Tayal;Louis Maria Irudaya Leo Joseph;Sreedhar Kollem;Usha Desai;Syed Musthak Ahmed;Ravichander Janapati
    • ETRI Journal
    • /
    • v.45 no.5
    • /
    • pp.910-921
    • /
    • 2023
  • In this study, we designed a 6T-SRAM cell using a 16-nm CMOS process and analyzed its performance in terms of read-speed latency. Temperature-dependent Cu and multilayered graphene nanoribbon (MLGNR)-based nano-interconnect materials are used throughout the circuit (primarily bit/bit-bars [red lines] and word lines [write lines]). The read-speed analysis is performed at four chip operating temperatures (150 K, 250 K, 350 K, and 450 K) using both Cu and graphene nanoribbon (GNR) nano-interconnects of different lengths (from 10 μm to 100 μm), for read-0 and read-1 operations. The read operation uses the 16-nm PTM-HPC model for the CMOS technology and the 16-nm ITRS-13 model for the interconnect technology. The complete design is simulated using TSPICE simulation tools (by Mentor Graphics). The read-speed latency increases rapidly with interconnect length for both Cu and GNR interconnects. However, the Cu interconnect has three to six times more latency than the GNR. In addition, we observe that the read-speed latency for the GNR interconnect is ~10.29 ns across wide temperature variations (150 K to 450 K), whereas that for the Cu interconnect varies between ~32 ns and 65 ns over the same temperature range. The above analysis is useful for the design of next-generation, high-speed memories using different nano-interconnect materials.
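
As a quick consistency check, the latency figures quoted in the abstract can be compared directly in Python. The variable names are illustrative, and the values are only the approximate numbers given above.

```python
# Approximate read latencies quoted in the abstract, in nanoseconds.
gnr_latency_ns = 10.29
cu_latency_ns = (32.0, 65.0)   # lower and upper end of the Cu range

# Ratio of Cu latency to GNR latency at both ends of the range.
ratios = tuple(cu / gnr_latency_ns for cu in cu_latency_ns)

print([round(r, 2) for r in ratios])
```

The ratios come out at roughly 3.1 and 6.3, which matches the statement that the Cu interconnect has three to six times more latency than the GNR.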

Performance Analysis of Implementation on Image Processing Algorithm for Multi-Access Memory System Including 16 Processing Elements (16개의 처리기를 가진 다중접근기억장치를 위한 영상처리 알고리즘의 구현에 대한 성능평가)

  • Lee, You-Jin;Kim, Jea-Hee;Park, Jong-Won
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.49 no.3
    • /
    • pp.8-14
    • /
    • 2012
  • Faster image processing is in great demand with the spread of high-quality visual media and massive image applications such as 3D TV, movies, and AR (augmented reality). A SIMD computer attached to a host computer can accelerate various image processing and massive data operations. MAMS is a multi-access memory system that, together with multiple processing elements (PEs), is suitable for building a high-performance pipelined SIMD machine. MAMS supports simultaneous access to pq data elements within a horizontal, a vertical, or a block subarray with a constant interval at an arbitrary position in an $M{\times}N$ array of data elements, where the number of memory modules (MMs), m, is a prime number greater than pq. MAMS-PP4 is the first realization of the MAMS architecture and consists of four PEs in a single chip and five MMs. This paper presents the implementation of image processing algorithms and a performance analysis for MAMS-PP16, which consists of 16 PEs with 17 MMs as an extension of the prior work, MAMS-PP4. The newly designed MAMS-PP16 has a 64-bit instruction format and an application-specific instruction set. The authors developed a simulator of the MAMS-PP16 system on which the implemented algorithms can be executed, and the performance analysis was carried out by running the implemented image processing algorithms on this simulator. Comparing the pyramid operation on MAMS-PP16 with a Pentium-based serial processor, the analysis shows that MAMS-PP16 has a consistent processing time, whereas the response time of the serial processor varies randomly.
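
Why a prime number of memory modules greater than pq allows conflict-free access can be checked with a small Python sketch. The module assignment function `module_of` below is an assumed linear skewing scheme, (i·q + j) mod m; the abstract does not state the function MAMS actually uses, and the 4×4 PE arrangement for MAMS-PP16 is likewise an assumption.

```python
P, Q = 4, 4          # assumed PE arrangement, pq = 16
M_MODULES = 17       # number of memory modules, a prime greater than pq

def module_of(i, j):
    """Assumed linear skewing: memory module holding array element (i, j)."""
    return (i * Q + j) % M_MODULES

def subarray(kind, i, j, r):
    """Coordinates of the pq elements of a subarray at (i, j) with interval r."""
    if kind == "block":
        return [(i + a * r, j + b * r) for a in range(P) for b in range(Q)]
    if kind == "horizontal":
        return [(i, j + b * r) for b in range(P * Q)]
    if kind == "vertical":
        return [(i + a * r, j) for a in range(P * Q)]
    raise ValueError(kind)

def conflict_free(kind, i, j, r):
    """True if all pq elements fall in distinct memory modules."""
    modules = [module_of(x, y) for x, y in subarray(kind, i, j, r)]
    return len(set(modules)) == len(modules)

# Check every subarray type at many positions for intervals 1..8.
all_ok = all(
    conflict_free(kind, i, j, r)
    for kind in ("block", "horizontal", "vertical")
    for i in range(20)
    for j in range(20)
    for r in range(1, 9)
)
print(all_ok)
```

With this assumed mapping, every tested block, horizontal and vertical subarray touches 16 distinct modules, so all 16 PEs could be served in one memory cycle. The property breaks down only when the interval is a multiple of 17, which is where the primality of m comes in.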

Research on Fault Tolerant Avionics Memory Design through Multi Level Cell Flash Memory Reliability Analysis (멀티 레벨 셀 플래시 메모리 신뢰성 분석을 통한 항공 전자장비용 내결함성 메모리 설계 연구)

  • Jeong, Sang-gyu;Jun, Byung-kyu;Kim, Young-mok;Chang, In-ki
    • Journal of Advanced Navigation Technology
    • /
    • v.20 no.4
    • /
    • pp.321-328
    • /
    • 2016
  • Typical MLC NAND flash devices are considered less reliable than SLC NAND flash devices. Although the raw bit error rate (RBER) of MLC flash had been considered approximately 1000 times or more higher than that of SLC flash, recent research conducted on Google's data center shows that it is much lower than expected. Based on that research, we devised the In Drive Data Duplication (IDDD) scheme, which efficiently exploits the ample capacity of MLC flash to improve its data reliability without added structural complexity by using the SSD's intrinsic firmware layer, and we showed from measured and calculated error rates that the expected data reliability of MLC flash could be significantly higher than that of SLC flash. Even though the RBER of SLC flash was lower than that of MLC flash in 44 of the 48 cases we studied, with the IDDD scheme applied the RBER of MLC flash became lower than that of SLC flash in all 48 cases, and the uncorrectable bit error rate (UBER) of MLC flash became lower than that of SLC flash in 45 of the 48 cases.

A Design of Parameterized Viterbi Decoder for Multi-standard Applications (다중 표준용 파라미터화된 비터비 복호기 IP 설계)

  • Park, Sang-Deok;Jeon, Heung-Woo;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.12 no.6
    • /
    • pp.1056-1063
    • /
    • 2008
  • This paper describes an efficient design of a multi-standard Viterbi decoder that supports multiple constraint lengths and code rates. The Viterbi decoder is parameterized for code rates 1/2 and 1/3 and constraint lengths 7 and 9, and thus has four operation modes. To achieve low hardware complexity and low power, an efficient architecture based on hardware sharing techniques is devised. In addition, optimizing the ACCS (Accumulate-Subtract) circuit for the one-point trace-back algorithm reduces its area by about 35% compared to the fully parallel ACCS circuit. The parameterized Viterbi decoder core has 79,818 gates and 25,600 bits of memory, and the estimated throughput is about 105 Mbps at a 70 MHz clock frequency. The simulation results for BER (Bit Error Rate) performance show that the Viterbi decoder achieves a BER of $10^{-4}$ at an $E_b/N_o$ of 3.6 dB when it operates with code rate 1/3 and constraint length 7.
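
The four operation modes and the throughput figure can be spelled out with a small Python sketch. The dictionary layout is purely illustrative; the only facts used are the code rates, constraint lengths, throughput and clock frequency from the abstract, plus the standard result that a convolutional code with constraint length K has 2^(K-1) trellis states.

```python
from itertools import product

CODE_RATES = ("1/2", "1/3")
CONSTRAINT_LENGTHS = (7, 9)

# One mode per (code rate, constraint length) combination.
modes = [
    {"rate": rate, "K": k, "states": 2 ** (k - 1)}
    for rate, k in product(CODE_RATES, CONSTRAINT_LENGTHS)
]

# Decoded bits per clock cycle implied by about 105 Mbps at 70 MHz.
bits_per_cycle = 105e6 / 70e6

for mode in modes:
    print(mode)
print(bits_per_cycle)
```

The decoder therefore has to handle trellises of 64 and 256 states, which is presumably where the hardware sharing mentioned in the abstract pays off, and the quoted throughput corresponds to about 1.5 decoded bits per clock cycle.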

A VLSI implementation of 32-bit RISC embedded controller (내장형 32비트 RISC 콘트롤러의 VLSI 구현)

  • 이문기;최병윤;이승호
    • Journal of the Korean Institute of Telematics and Electronics A
    • /
    • v.31A no.10
    • /
    • pp.141-151
    • /
    • 1994
  • This paper describes the design and implementation of a RISC processor for embedded control systems. This RISC processor integrates a register file, a pipelined execution unit, an FPU interface, a memory interface, and an instruction prefetcher. Its characteristics include single-cycle execution of most instructions with a two-phase 20 MHz clock and a worst-case interrupt latency of 7 cycles with vectored interrupt handling, which makes it applicable to real-time processing systems. For efficient handling of multi-cycle instructions, a data-stationary hardwired control scheme equipped with a cycle counter was used. The chip integrates about 139K transistors and occupies 9.1 mm $\times$ 9.1 mm in a 1.0 μm DLM CMOS technology. The power dissipation is 0.8 W from a 5 V supply at 20 MHz operation.
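
The figures in this abstract translate into a few derived quantities, worked out in the Python sketch below. It assumes that one instruction cycle equals one period of the 20 MHz clock, which the abstract does not state explicitly; the variable names are illustrative.

```python
clock_hz = 20e6

# Assumption: one cycle is one period of the 20 MHz clock.
cycle_ns = 1e9 / clock_hz
worst_case_interrupt_ns = 7 * cycle_ns

# Die area and average transistor density.
die_area_mm2 = 9.1 * 9.1
transistors_per_mm2 = 139_000 / die_area_mm2

# Average energy per clock cycle at 0.8 W.
energy_per_cycle_nj = 0.8 / clock_hz * 1e9

print(cycle_ns, worst_case_interrupt_ns)
print(round(die_area_mm2, 2), round(transistors_per_mm2))
print(round(energy_per_cycle_nj, 1))
```

Under that assumption, the worst-case interrupt latency of 7 cycles corresponds to 350 ns, the die area is about 82.8 mm², and the chip dissipates about 40 nJ per clock cycle.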

A SSN-Reduced 5Gb/s Parallel Transmitter

  • Lee, Seon-Kyoo;Kim, Young-Sang;Park, Hong-June;Sim, Jae-Yoon
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.7 no.4
    • /
    • pp.235-240
    • /
    • 2007
  • A current-balancing segmented group-inverting transmitter is presented for multi-Gb/s single-ended parallel links. With 4 additional pins, 16-bit data is efficiently encoded onto 20 pins to achieve current balancing and eliminate the simultaneous switching noise. Since the proposed coding is a simple inversion-or-not transformation of pre-defined groups of binary data, it can be implemented with simplified logic circuits. The transmitter is designed in a $0.18{\mu}m$ CMOS technology, and simulated eye diagrams at 5 Gb/s show dramatic improvements in signal integrity.

An Improving Motion Estimator based on multi arithmetic Architecture (고밀도 성능향상을 위한 다중연산구조기반의 움직임추정 프로세서)

  • Lee, Kang-Whan
    • Proceedings of the IEEK Conference
    • /
    • 2006.06a
    • /
    • pp.631-632
    • /
    • 2006
  • In this paper, to obtain an SoC design better suited to fast hierarchical motion estimation, we exploit a foreground and background search algorithm (FBSA) based on the dual arithmetic processor element (DAPE). It can estimate motion displacement over a large search area using half the number of PEs required by general operation methods. The proposed MHME architecture improves the VLSI hardware design by using the proposed FBSA structure with DAPE to remove the local memory. The proposed FBSA, which uses bit array processing in the search area, can improve structures such as the multiple processor array unit (MPAU).

Design of Multi-Mode Radar Signal Processor for UAV Detection (무인기 탐지를 위한 멀티모드 레이다 신호처리 프로세서 설계)

  • Lee, Seunghyeok;Jung, Yongchul;Jung, Yunho
    • Journal of Advanced Navigation Technology
    • /
    • v.23 no.2
    • /
    • pp.134-141
    • /
    • 2019
  • Radar systems are divided into the pulse Doppler (PD) radar and the frequency modulated continuous wave (FMCW) radar depending on the transmission waveform. In particular, the PD radar is advantageous for long-range target detection, and the FMCW radar is suitable for short-range target detection. In this paper, we present design and implementation results for a multi-mode radar signal processor (RSP) that can support both PD and FMCW radar systems to detect unmanned aerial vehicles (UAVs) at short distances as well as long distances. The proposed radar signal processor can be implemented based on Altera Cyclone-IV FPGA with 19,623 logic elements, 9,759 registers, and 25,190,400 memory bits. The logic elements and registers of the proposed radar signal processor are reduced by approximately 43% and 30%, respectively, compared to the sum of logic elements and registers of the conventional PD radar and FMCW radar signal processor.