• Title/Summary/Keyword: Multi-access Memory System

A Parallel Processing System for Visual Media Applications (시각매체를 위한 병렬처리 시스템)

  • Lee, Hyung;Park, Jong-Won
    • The Journal of Korean Institute of Communications and Information Sciences / v.27 no.1A / pp.80-88 / 2002
  • Visual media (image, graphic, and video) processing poses challenges from several perspectives, particularly real-time implementation and scalability. Several approaches have been taken to obtain the speedups needed to meet the computing demands of multimedia processing, ranging from media processors to special-purpose implementations. These implementations adopt a variety of parallel processing strategies to achieve the required speedups. We have investigated a parallel processing system for improving the processing speed of visual media applications. The parallel processing system we propose is similar to a pipelined memory system (MAMS). The multi-access memory system is made up of m memory modules and a memory controller that performs parallel memory access with a variety of combinations of $1 \times pq$, $pq \times 1$, and $p \times q$ subarrays, which reduces both the cost and the complexity of control. Facial recognition, Phong shading, and automatic segmentation of moving objects in image sequences are among the applications that have been run on the parallel processing system, with satisfactory processing speed. This paper describes the parallel processing system and its use in speeding up these three time-consuming applications.
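
The abstract does not give the rule that maps array elements to the m memory modules. As a minimal sketch, assume a linear skewing rule, `(i * q + j) mod m`, which is a common textbook choice and not necessarily the one the paper uses. Under that assumption, the check below tests whether the three access shapes named in the abstract touch each module at most once. The function names and the `extent` parameter are hypothetical.

```python
def module_of(i, j, q, m):
    """Memory module holding element (i, j) under the ASSUMED skewing rule."""
    return (i * q + j) % m


def is_conflict_free(cells, q, m):
    """True if every cell in one parallel access maps to a different module."""
    modules = [module_of(i, j, q, m) for i, j in cells]
    return len(set(modules)) == len(modules)


def subarray(i, j, rows, cols):
    """Cells of a rows x cols subarray whose top-left corner is (i, j)."""
    return [(i + a, j + b) for a in range(rows) for b in range(cols)]


def all_shapes_conflict_free(p, q, m, extent=16):
    """Check the 1 x pq, pq x 1, and p x q shapes at every base address
    inside an extent x extent window."""
    shapes = [(1, p * q), (p * q, 1), (p, q)]
    return all(
        is_conflict_free(subarray(i, j, rows, cols), q, m)
        for i in range(extent)
        for j in range(extent)
        for rows, cols in shapes
    )
```

With p = 2 and q = 4, `all_shapes_conflict_free(2, 4, 11)` holds, while m = 8 fails because the column access `pq x 1` revisits a module when m shares a factor with q. This is why schemes of this kind typically use more than pq modules, often a prime number of them.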

A Block Allocation Policy to Enhance Wear-leveling in a Flash File System (플래시 파일시스템에서 wear-leveling 개선을 위한 블록 할당 정책)

  • Jang, Si-Woong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2007.10a / pp.574-577 / 2007
  • Whereas a disk can overwrite data in place on update, flash memory cannot, so updated data are written to a new area. If data are updated frequently, garbage collection, which works by erasing blocks, must be performed to reclaim space. Because the number of erase operations each block can endure is limited by the characteristics of flash memory, all blocks should be written and erased evenly. However, when data with access locality are processed by the cost-benefit algorithm with separation of hot and cold blocks, processing performance is high but wear is not leveled evenly. In this paper, we propose the CB-MB (Cost Benefit between Multi Bank) algorithm, in which hot data are allocated to one bank and cold data to another, and the roles of the hot bank and the cold bank are exchanged every period. CB-MB performed similarly to other methods under a uniform workload and much better than them under a workload with access locality.
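
The bank-role exchange described above can be sketched as follows. This is only an illustration of the idea, not the paper's CB-MB implementation: the class name, the two-bank limit, and measuring the period in number of writes are all assumptions, and garbage collection is left out.

```python
class TwoBankAllocator:
    """Hypothetical sketch: hot writes go to one bank, cold writes to the
    other, and the two roles are swapped after every `swap_period` writes."""

    def __init__(self, swap_period):
        self.swap_period = swap_period      # assumed unit: number of writes
        self.hot_bank = 0                   # bank currently receiving hot data
        self.writes = 0
        self.writes_per_bank = [0, 0]

    def allocate(self, is_hot):
        """Return the bank (0 or 1) that should receive this write."""
        bank = self.hot_bank if is_hot else 1 - self.hot_bank
        self.writes_per_bank[bank] += 1
        self.writes += 1
        if self.writes % self.swap_period == 0:
            self.hot_bank = 1 - self.hot_bank   # exchange hot and cold roles
        return bank
```

With a skewed workload of three hot writes per cold write and `swap_period=4`, each bank ends up with the same number of writes after two periods, which is the leveling effect the role exchange is meant to produce.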

Real-time Implementation of a Tone Sender/Receiver on a High Performance DSP (고성능 DSP를 이용한 톤 송수신기의 실시간 구현)

  • 최용수;함정표;조성범;강태익;윤정현
    • The Journal of the Acoustical Society of Korea / v.22 no.4 / pp.276-285 / 2003
  • In this paper, we present a real-time implementation of an R2MFC/DTMF (R2 Multi Frequency Combinations/Dual Tone Multiple Frequency) tone receiver/sender on a high-performance DSP (Digital Signal Processor) and apply it to a carrier-class VoIP (Voice over Internet Protocol) gateway system. The receiver uses the Goertzel filter and the sender adopts the harmonic resonant filter. We describe in detail the techniques for multi-channel real-time implementation on a Texas Instruments TMS320C62x DSP, such as efficient PCM (Pulse Code Modulation) input/output by means of DMA (Direct Memory Access) and McBSP (Multi Channel Buffered Serial Port), and message communication via HPI (Host Port Interface). Experimental results confirmed that the optimized code provides a capacity of 780 channels on a 250 MHz C6202 and that our R2MFC/DTMF receiver/sender meets ITU-T (International Telecommunication Union-Telecommunication) specifications.
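
For readers unfamiliar with the Goertzel filter mentioned above, here is a minimal floating-point sketch of DTMF detection with it. It is not the paper's optimized DSP code: the 8 kHz sample rate, the 205-sample block, and the helper names are assumptions, and a real receiver would add the power, twist, and timing checks that the ITU-T specifications require.

```python
import math

ROW_FREQS = (697, 770, 852, 941)        # standard DTMF low-group tones (Hz)
COL_FREQS = (1209, 1336, 1477, 1633)    # standard DTMF high-group tones (Hz)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq, sample_rate):
    """Squared magnitude of `samples` at `freq` using the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_dtmf(samples, sample_rate=8000):
    """Pick the strongest row and column tone (no validity thresholds)."""
    row = max(range(4), key=lambda r: goertzel_power(samples, ROW_FREQS[r], sample_rate))
    col = max(range(4), key=lambda c: goertzel_power(samples, COL_FREQS[c], sample_rate))
    return KEYPAD[row][col]


def dtmf_tone(key, n=205, sample_rate=8000):
    """Synthesize n samples of the two-tone signal for `key` (test helper)."""
    for r, line in enumerate(KEYPAD):
        if key in line:
            f_row, f_col = ROW_FREQS[r], COL_FREQS[line.index(key)]
            return [
                math.sin(2 * math.pi * f_row * t / sample_rate)
                + math.sin(2 * math.pi * f_col * t / sample_rate)
                for t in range(n)
            ]
    raise ValueError(f"not a DTMF key: {key!r}")
```

The filter needs only one multiply per sample per frequency, which is why it suits detecting a handful of known tones on many channels better than a full FFT.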

A Dual Slotted Ring Organization for Reducing Memory Access Latency in Distributed Shared Memory System (분산 공유 메모리 시스템에서 메모리 접근지연을 줄이기 위한 이중 슬롯링 구조)

  • Min, Jun-Sik;Chang, Tae-Mu
    • The KIPS Transactions:PartA / v.8A no.4 / pp.419-428 / 2001
  • Advances in circuit and integration technology are continuously boosting the speed of processors. One of the main challenges presented by such developments is the effective use of powerful processors in shared memory multiprocessor systems. We believe that the interconnection problem is not solved even for small-scale shared memory multiprocessors, since the speed of shared buses is unlikely to keep up with the bandwidth requirements of new powerful processors. In the past few years, point-to-point unidirectional connections have emerged as a very promising interconnection technology. The single slotted ring is the simplest form of point-to-point interconnection. Its main limitation is that access latency increases linearly with the number of processors in the ring. For this reason, we propose the dual slotted ring as an alternative to the single slotted ring for cache-based multiprocessor systems. In this paper, we analyze the proposed dual slotted ring architecture using a new snooping protocol and run simulations to compare it with the single slotted ring.
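
The linear growth of latency on a single unidirectional ring can be seen from a simple hop count. The sketch below counts hops only and ignores slot contention and the snooping protocol. The abstract does not say how the dual ring is organized, so the second function assumes two counter-rotating rings purely for comparison; that topology is an assumption, not the paper's design.

```python
def mean_hops_unidirectional(n):
    """Average hops from a node to another node on a one-way ring of n nodes."""
    return sum(range(1, n)) / (n - 1)


def mean_hops_counter_rotating(n):
    """Average hops when a message may take the shorter of two directions
    (ASSUMED counter-rotating pair of rings, for illustration only)."""
    return sum(min(d, n - d) for d in range(1, n)) / (n - 1)
```

On the one-way ring the average is n/2 hops, so doubling the processor count doubles the average distance a request travels.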

A method for improving wear-leveling of flash file systems in workload of access locality (접근 지역성을 가지는 작업부하에서 플래시 파일시스템의 wear-leveling 향상 기법)

  • Jang, Si-Woong
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.1 / pp.108-114 / 2008
  • Since flash memory cannot be overwritten in place, updated data are written to a new area. If data are updated frequently, garbage collection, which works by erasing blocks, must be performed to reclaim space. Because the number of erase operations is limited by the characteristics of flash memory, all blocks should be written and erased evenly. However, when data with access locality are processed by the cost-benefit algorithm with separation of hot and cold blocks, processing performance is high but wear is not leveled evenly. In this paper, we propose the CB-MB (Cost Benefit between Multi Bank) algorithm, in which hot data are allocated to one bank and cold data to another, and the roles of the hot bank and the cold bank are exchanged every period. CB-MB performs 30% better than the cost-benefit algorithm with separation of hot and cold blocks, and the standard deviation of its wear-leveling is about one third of that algorithm's.
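
The abstract reports wear-leveling as a standard deviation. A plausible reading, assumed here, is the standard deviation of per-block erase counts, where a smaller value means more even wear. The function name and the sample counts below are illustrative, not data from the paper.

```python
from statistics import pstdev


def wear_spread(erase_counts):
    """Population standard deviation of per-block erase counts.

    0 means every block has been erased equally often; larger values
    mean a few blocks are wearing out faster than the rest.
    """
    return pstdev(erase_counts)
```

For example, `wear_spread([10, 10, 10, 10])` is 0, while a skewed history such as `[2, 4, 4, 4, 5, 5, 7, 9]` gives 2.0.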

Design of Main Computer Board for MSC on KOMPSAT-2

  • Heo, H.P.;Kong, J.P.;Yong, S.S.;Kim, Y.S.;Park, J.E.;Youn, H.S.;Paik, H.Y.
    • Proceedings of the KSRS Conference / 2003.11a / pp.1096-1098 / 2003
  • An SBC (Single Board Computer) is being developed for the MSC (Multi-Spectral Camera) on KOMPSAT-2 (Korea Multi-Purpose Satellite). The SBC controls all the units of the MSC system, receiving commands from and sending telemetry to the spacecraft bus via a 1553 communication channel. Because the SBC plays a critical role in MSC system operation and operates with a 100% duty cycle, it is designed for high reliability. The SBC uses an Intel 80486 as its main processor and includes eight serial communication channels, one MIL-STD-1553 interface channel, and several discrete interfaces. It incorporates 2 Mbyte of radiation-hardened SRAM (Static Random Access Memory) and 1 Mbyte of flash memory, as well as a PIC (Programmable Interrupt Controller), a counter, and a WDT (Watch Dog Timer). This paper presents the design of the SBC.

An Incremental Multi Partition Averaging Algorithm Based on Memory Based Reasoning (메모리 기반 추론 기법에 기반한 점진적 다분할평균 알고리즘)

  • Yih, Hyeong-Il
    • Journal of IKEEE / v.12 no.1 / pp.65-74 / 2008
  • One of the popular methods used for pattern classification is the MBR (Memory-Based Reasoning) algorithm. It simply computes the distances between a test pattern and the training patterns or hyperplanes stored in memory, then assigns the class of the nearest training pattern; as a result, it is notorious for its memory usage and cannot learn additional information from new data. To overcome these problems, we propose an incremental learning algorithm (iMPA). iMPA divides the entire pattern space into a fixed number of partitions and generates representatives from each partition. It can also learn additional information from new data without requiring access to the original training data. The proposed methods are shown, on benchmark data sets from the UCI Machine Learning Repository, to perform comparably to k-NN with far fewer patterns and to outperform the EACH system, which implements the NGE theory.
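
The two ideas named in the abstract, one representative per partition and learning without revisiting old data, can be illustrated with running means. The sketch below is a simplification and not the paper's iMPA: the uniform grid partitioning, the `cell_width` parameter, and the class name are all assumptions.

```python
import math


class PartitionAverager:
    """Hypothetical sketch: keep one running-mean representative per
    (grid cell, class label) and classify by the nearest representative."""

    def __init__(self, cell_width):
        self.cell_width = cell_width
        self.reps = {}   # (cell, label) -> (count, mean vector)

    def learn(self, pattern, label):
        """Fold one new pattern into its representative; old data is not needed."""
        cell = tuple(math.floor(v / self.cell_width) for v in pattern)
        count, mean = self.reps.get((cell, label), (0, [0.0] * len(pattern)))
        count += 1
        mean = [m + (v - m) / count for m, v in zip(mean, pattern)]
        self.reps[(cell, label)] = (count, mean)

    def classify(self, pattern):
        """Return the label of the nearest stored representative."""
        (_, label), _ = min(
            self.reps.items(),
            key=lambda item: math.dist(item[1][1], pattern),
        )
        return label
```

Memory use depends on the number of occupied partitions rather than the number of training patterns, which is the trade the abstract describes against plain nearest-neighbor storage.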

Hot Data Identification For Flash Based Storage Systems Considering Continuous Write Operation

  • Lee, Seung-Woo;Ryu, Kwan-Woo
    • Journal of the Korea Society of Computer and Information / v.22 no.2 / pp.1-7 / 2017
  • NAND flash memory is rapidly replacing the HDD (Hard Disk Drive) as a storage medium thanks to advantages such as fast access speed, low power consumption, and easy portability. To use NAND flash memory in a computer system, a Flash Translation Layer (FTL) is indispensable. The FTL provides a number of features such as address mapping, garbage collection, wear leveling, and hot data identification. In particular, hot data identification is an algorithm that identifies the specific pages where data updates occur frequently; identifying and managing hot data separately helps improve overall performance. The MHF (Multi hash framework) technique, a known hot data identification technique, records the number of write operations in memory and evaluates the recorded value to judge whether data are hot. However, counting the number of write requests alone is not enough to judge a page to be a hot data page. In this paper, we propose a hot data identification method that considers not only the number of write requests but also the persistence of write requests.
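
The counter-based approach that the abstract attributes to MHF can be sketched as a small table of saturating counters indexed by several hash functions. The sketch covers only that baseline, not the persistence-aware method the paper proposes. The table size, threshold, decay rule, and the two multiplicative hash functions are illustrative assumptions.

```python
class MultiHashHotness:
    """Hypothetical sketch of hash-indexed write counters for hot data."""

    def __init__(self, table_size=1024, threshold=4, max_count=15):
        self.table = [0] * table_size
        self.threshold = threshold
        self.max_count = max_count                 # saturating counter limit
        self.hash_params = ((31, 7), (17, 3))      # assumed toy hash functions

    def _slots(self, lba):
        return [(lba * a + b) % len(self.table) for a, b in self.hash_params]

    def record_write(self, lba):
        """Count one write request against every slot the address hashes to."""
        for slot in set(self._slots(lba)):
            self.table[slot] = min(self.table[slot] + 1, self.max_count)

    def is_hot(self, lba):
        """Hot only if all of the address's counters reach the threshold."""
        return all(self.table[s] >= self.threshold for s in self._slots(lba))

    def decay(self):
        """Halve every counter so that old write activity fades over time."""
        self.table = [c >> 1 for c in self.table]
```

Because the counters record only how many writes arrived, a short burst and a sustained stream of writes can look identical, which is the limitation the abstract raises.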

Design of a Low Memory Bandwidth Inter Predictor Using Implicit Weighted Prediction Technique (묵시적 가중 예측기법을 이용한 저 메모리 대역폭 인터 예측기 설계)

  • Kim, Jinyoung;Ryoo, Kwangki
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.12 / pp.2725-2730 / 2012
  • In this paper, to improve H.264/AVC hardware performance, we propose an inter predictor hardware design using a multi-reference-frame selector and an implicit weighted predictor. Previous reference frames are reused to lower memory bandwidth. The size of the reference memory in the predictor was reduced by about 46% and the external memory access rate by about 24% compared with the reference software JM16.0. We designed the proposed system in Verilog HDL and synthesized the inter predictor circuit using the Magnachip 0.18 um CMOS standard cell library. The synthesis result shows that the gate count is about 2,061k and that the design can run at 91 MHz.
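
In implicit weighted prediction, the bi-prediction weights are not transmitted but derived from the picture order count (POC) distances between the current picture and its two references. The sketch below follows the derivation as I understand it from the H.264/AVC standard for short-term reference pictures and 8-bit samples; it should be checked against the standard before being relied on, and it says nothing about the paper's hardware architecture. The function names are hypothetical.

```python
def clip3(lo, hi, x):
    """Clamp x to the inclusive range [lo, hi]."""
    return max(lo, min(hi, x))


def trunc_div(a, b):
    """Integer division that truncates toward zero, as in C."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


def implicit_weights(poc_cur, poc_ref0, poc_ref1):
    """Return (w0, w1), which sum to 64, from POC distances."""
    tb = clip3(-128, 127, poc_cur - poc_ref0)
    td = clip3(-128, 127, poc_ref1 - poc_ref0)
    if td == 0:
        return 32, 32                      # fall back to a plain average
    tx = trunc_div(16384 + abs(td) // 2, td)
    dist_scale = clip3(-1024, 1023, (tb * tx + 32) >> 6)
    w1 = dist_scale >> 2
    if w1 < -64 or w1 > 128:
        return 32, 32                      # out of range: plain average
    return 64 - w1, w1


def bipred_sample(p0, p1, w0, w1):
    """Weighted bi-prediction of one 8-bit sample with rounding."""
    return clip3(0, 255, (p0 * w0 + p1 * w1 + 32) >> 6)
```

A picture one quarter of the way from reference 0 to reference 1 gets weights (48, 16), so `bipred_sample(100, 200, 48, 16)` gives 125, the value linear interpolation would predict.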

Scheduler for parallel processing with finely grained tasks

  • Hosoi, Takafumi;Kondoh, Hitoshi;Hara, Shinji
    • Institute of Control, Robotics and Systems: Conference Proceedings / 1991.10b / pp.1817-1822 / 1991
  • A method of reducing the overhead caused by processor synchronization and common memory accesses in finely grained tasks is described. We propose a scheduler that accounts for preparation time during its search in order to minimize redundant accesses to shared memory. Because the suggested hardware (synchronizer) folds the synchronization process into the bus arbitration process, it determines the access order of processors and the bus arbitration simultaneously, and the synchronization time vanishes. This synchronizer therefore incurs no overhead from processor synchronization [1]. The proposed scheduling algorithm runs in parallel: the processes share the upper bound found by each search, and the lower bound function is built to account for the preparation time so as to eliminate as many searches as possible. Applying the proposed method to a multi-DSP system that calculates inverse dynamics for robot arms showed that the sampling time can be halved compared with the conventional method.
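
The pruning idea in the abstract, discarding a partial schedule as soon as its lower bound cannot beat the best upper bound found so far, can be shown with a small branch-and-bound search. The sketch is sequential and schedules independent tasks only. It omits the paper's preparation time, task precedence, and parallel sharing of the bound, and the function name is hypothetical.

```python
def min_makespan(durations, n_procs):
    """Smallest achievable finish time for independent tasks on n_procs
    identical processors, found by branch and bound."""
    tasks = sorted(durations, reverse=True)
    best = [sum(tasks)]                     # upper bound: all on one processor

    def search(idx, loads):
        if idx == len(tasks):
            best[0] = min(best[0], max(loads))
            return
        remaining = sum(tasks[idx:])
        # Lower bound: the busiest processor so far, or a perfectly even split.
        lower = max(max(loads), (sum(loads) + remaining) / n_procs)
        if lower >= best[0]:
            return                          # cannot improve on the upper bound
        tried = set()
        for p in range(n_procs):
            if loads[p] in tried:
                continue                    # equal loads give symmetric branches
            tried.add(loads[p])
            loads[p] += tasks[idx]
            search(idx + 1, loads)
            loads[p] -= tasks[idx]

    search(0, [0] * n_procs)
    return best[0]
```

The tighter the lower bound, the more branches are cut, which is the motivation the abstract gives for building preparation time into the bound.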
