• Title/Summary/Keyword: real memory

Design of an OLED Controller to Display Realtime Moving Pictures on Mobile Display (실시간 동영상 구현을 위한 모바일용 OLED 제어기 설계)

  • Cho, Young-Sung;Lee, Yong-Hwan
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.877-880 / 2005
  • As DMB, 3D games, Internet access, and movies are offered on recent mobile devices, high-resolution displays beyond VGA are coming into use. Displaying real-time moving pictures at 30 fps through software alone is difficult because mobile processors are not fast enough, but full-frame moving pictures can be supported with dedicated hardware. In this paper, an OLED controller consisting of a flash memory controller and an OLED interface is proposed for real-time moving pictures on mobile displays. The proposed OLED controller is implemented in an FPGA and its performance is evaluated.

Development of self-centring energy-dissipative rocking columns equipped with SMA tension braces

  • Li, Yan-Wen;Yam, Michael C.H.;Zhang, Ping;Ke, Ke;Wang, Yan-Bo
    • Structural Engineering and Mechanics / v.82 no.5 / pp.611-628 / 2022
  • Energy-dissipative rocking (EDR) columns are a class of seismic mitigation device capable of dissipating seismic energy and preventing weak-story failure of moment resisting frames (MRFs). An EDR consists of two hinge-supported steel columns interconnected by steel dampers along its height. Under earthquakes, the input seismic energy can be dissipated by plastic energy of the steel dampers in the EDR column. However, the unrecoverable plastic deformation of steel dampers generally results in residual drifts in the structural system. This paper presents a proof-of-concept study on an innovative device, namely self-centring energy-dissipative rocking (SC-EDR) column, aiming at enabling self-centring capability of the EDR column by installing a set of shape memory alloy (SMA) tension braces. The working mechanism of the SC-EDR column is presented in detail, and the feasibility of the new device is carefully examined via experimental and numerical studies considering the parameters of the SMA bar diameter and the steel damper plate thickness. The seismic responses including load carrying capacities, stress distributions, base rocking behaviour, source of residual deformation, and energy dissipation are discussed in detail. A rational combination of the steel damper and the SMA tension braces can achieve excellent energy dissipation and self-centring performance.

High Performance Coprocessor Architecture for Real-Time Dense Disparity Map (실시간 Dense Disparity Map 추출을 위한 고성능 가속기 구조 설계)

  • Kim, Cheong-Ghil;Srini, Vason P.;Kim, Shin-Dug
    • The KIPS Transactions:PartA / v.14A no.5 / pp.301-308 / 2007
  • This paper proposes a high-performance coprocessor architecture for real-time dense disparity computation based on a phase-based binocular stereo matching technique called local weighted phase-correlation (LWPC). The algorithm, which consists of four stages, combines the robustness of wavelet-based phase difference methods with the basic control strategy of phase correlation methods. For parallel and efficient hardware implementation, the proposed architecture employs a SIMD (Single Instruction, Multiple Data) organization for each functional stage, and all stages operate in pipelined mode. The newly devised pipelined linear array processor is thereby optimized for row-column image processing, eliminating the need for a transpose memory while preserving generality and high throughput. The proposed architecture is implemented with a Xilinx HDL tool, and the required hardware resources are reported in terms of look-up tables, flip-flops, slices, and the amount of memory. The result shows that the proposed architecture can potentially be integrated into one chip while maintaining video-rate processing speed.

Design and Verification of Pipelined Face Detection Hardware (파이프라인 구조의 얼굴 검출 하드웨어 설계 및 검증)

  • Kim, Shin-Ho;Jeong, Yong-Jin
    • Journal of Korea Multimedia Society / v.15 no.10 / pp.1247-1256 / 2012
  • Many filter-based image processing algorithms require a huge amount of computation and memory access, which makes real-time performance hard to attain, especially in embedded applications. In this paper, we propose a pipelined hardware structure for a filter-based face detection algorithm to show that real-time performance can be achieved by hardware design. In our design, the whole computation is divided into three pipeline stages: resizing the image (Resize), transforming the image (ICT), and finding candidate areas (Find Candidate). Each stage is optimized by exploiting the parallelism of the computation to reduce the number of cycles and by using line memory to minimize memory accesses. The resulting hardware uses 507 KB of internal SRAM and occupies 9,039 LUTs when synthesized and configured on a Xilinx Virtex5LX330 FPGA. It operates at a maximum clock of 165 MHz, giving a performance of 108 frames/s while detecting up to 20 faces.

A Comparative Analysis of Recursive Query Algorithm Implementations based on High Performance Distributed In-Memory Big Data Processing Platforms (대용량 데이터 처리를 위한 고속 분산 인메모리 플랫폼 기반 재귀적 질의 알고리즘들의 구현 및 비교분석)

  • Kang, Minseo;Kim, Jaesung;Lee, Jaegil
    • Journal of KIISE / v.43 no.6 / pp.621-626 / 2016
  • Recursive query algorithms are used in many social network services, e.g., for reachability queries in social networks. As social network services evolve, the size of social network data has grown to the point where it is almost impossible to run a recursive query algorithm on a single machine. In this paper, we implement recursive queries on two popular in-memory distributed platforms, Spark and Twister, to solve this problem. We evaluate the performance of the two implementations using 50 machines on Amazon EC2 and two real-world data sets, LiveJournal and ClueWeb. The results show that the recursive query algorithm performs better on Spark for the LiveJournal data set, which has a relatively high average degree but fewer vertices. However, the recursive query on Twister is superior to Spark for the ClueWeb data set, which has a relatively low average degree but many vertices.
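
To make the notion of a recursive reachability query concrete, the following is a minimal single-machine sketch in Python. It is not the paper's Spark or Twister implementation; it uses semi-naive iteration, a standard way to evaluate such queries, and the function name `reachable_pairs` and the toy edge list are assumptions made for illustration.

```python
from collections import defaultdict


def reachable_pairs(edges):
    """Return every (src, dst) pair such that dst is reachable from src."""
    # Index edges by source vertex so each join step is a dictionary lookup.
    successors = defaultdict(set)
    for src, dst in edges:
        successors[src].add(dst)

    closure = set(edges)  # all pairs derived so far
    delta = set(edges)    # pairs that were new in the previous round
    while delta:
        # Recursive step: extend only the newly found paths by one more edge.
        candidates = {
            (src, nxt)
            for src, mid in delta
            for nxt in successors.get(mid, ())
        }
        delta = candidates - closure
        closure |= delta
    return closure


# Hypothetical toy graph: a -> b -> c -> d
edges = [("a", "b"), ("b", "c"), ("c", "d")]
pairs = reachable_pairs(edges)
```

Each round joins only the newly discovered pairs with the edge set, so the loop ends once a round produces nothing new. A distributed platform would partition `delta` and the edge index across machines, which this sketch does not attempt.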

Real-time Volume Rendering using Point-Primitive (포인트 프리미티브를 이용한 실시간 볼륨 렌더링 기법)

  • Kang, Dong-Soo;Shin, Byeong-Seok
    • Journal of Korea Multimedia Society / v.14 no.10 / pp.1229-1237 / 2011
  • Volume ray-casting is a direct volume rendering method that produces high-quality images and can handle semi-transparent objects. Although it produces high-quality images by sampling in the region of interest, its rendering speed is slow because color acquisition involves repeated memory references and accumulation of sample values. GPU-based acceleration techniques have been introduced recently, but they require preprocessing or additional memory. In this paper, we propose an efficient point-primitive-based method that avoids the complicated computation of GPU ray-casting. It renders semi-transparent objects, yet requires neither preprocessing nor additional memory. Our method is fast because it generates point primitives from the volume dataset during the sampling process and projects them onto the image plane. It also copes easily with OTF changes because point primitives can be added or deleted in real time.

FPGA Implementation of SURF-based Feature extraction and Descriptor generation (SURF 기반 특징점 추출 및 서술자 생성의 FPGA 구현)

  • Na, Eun-Soo;Jeong, Yong-Jin
    • Journal of Korea Multimedia Society / v.16 no.4 / pp.483-492 / 2013
  • SURF is an algorithm that extracts feature points and generates their descriptors from input images, and it is used in many applications such as object recognition, tracking, and panorama construction. Although SURF is known to be robust to changes of scale, rotation, and viewpoint, it is hard to run in real time because of its complex and repetitive computations. In our experiment on a 3.3 GHz Pentium, it takes 240 ms to extract feature points and create descriptors for a VGA image containing about 1,000 feature points, which means that a software implementation cannot meet the real-time requirement, especially in embedded systems. In this paper, we present a hardware architecture that computes the SURF algorithm very fast while consuming minimal hardware resources. The two key concepts of our architecture are parallelism (for repetitive computations) and efficient line memory usage (obtained by analyzing memory access patterns). When synthesized for a Xilinx Virtex5LX330 FPGA, it occupies 101,348 LUTs and 1,367 KB of on-chip memory and achieves 30 frames per second at a 100 MHz clock.
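
The figures quoted in this abstract imply a simple timing budget, worked out below in Python. Only the 240 ms, 30 frames per second, and 100 MHz values come from the abstract; the variable names are illustrative, and treating 30 frames per second as the real-time target is an assumption.

```python
# Figures taken from the abstract.
software_ms_per_frame = 240   # measured software time per VGA frame
target_fps = 30               # frame rate reached by the hardware
clock_hz = 100_000_000        # FPGA clock frequency

# Time available per frame at the target rate, in milliseconds.
frame_budget_ms = 1000 / target_fps

# Frame rate the software implementation can sustain.
software_fps = 1000 / software_ms_per_frame

# How far the software overshoots the per-frame budget.
slowdown = software_ms_per_frame / frame_budget_ms

# Clock cycles the hardware may spend on each frame.
cycles_per_frame = clock_hz / target_fps
```

Under these assumptions the software path delivers roughly 4 frames per second, about 7 times slower than the target, while the hardware has on the order of 3.3 million cycles to spend on each frame.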

Efficient Image Data Processing using a Real Time Concurrent Single Memory Input/Output Access (실시간 단일 메모리 동시 입출력을 이용한 효율적인 영상 데이터 처리)

  • Lee, Gunjoong;Han, Geumhee;Ryoo, Kwangki
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2012.10a / pp.103-106 / 2012
  • Reading data from memory in an order different from the order in which they were written is a simple but important procedure in many image compression standards, such as JPEG, MPEG-1/2/4, H.264, and HEVC. For real-time processing, double buffering is widely used: two block-sized buffers are accessed concurrently, alternating between reading and writing. In some cases with a simple and regular access order, such as the transpose memory in a 2D DCT, single buffering, which requires only one block-sized buffer, can be used. This paper shows that even for complex access orders the update orders repeat regularly within a finite number of turns, and it suggests an effective implementation method that uses a single block-sized buffer to process concurrent read/write operations with different access orders.
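
The single-buffer idea can be illustrated with a short Python sketch. It is a general address-permutation scheme under stated assumptions, not the authors' circuit: each incoming sample is written into the cell that the outgoing sample has just vacated, so the write addresses of block k+1 are those of block k composed with the output permutation. The function name `reorder_stream` and the parameter `perm` are hypothetical, and hardware would perform the read and write of a step concurrently rather than one after the other.

```python
def reorder_stream(blocks, perm):
    """Reorder every block by perm using one block-sized buffer.

    perm[j] is the input index of the sample that must leave at output
    position j. Every block must contain len(perm) samples.
    """
    n = len(perm)
    buffer = [None] * n
    write_addr = list(range(n))  # cell holding input sample i of the stored block
    outputs = []
    for k, block in enumerate(blocks):
        if k == 0:
            # The first block only fills the buffer; nothing can be read yet.
            for i, sample in enumerate(block):
                buffer[write_addr[i]] = sample
            continue
        # Cell visited at step i holds the stored block's sample for output i.
        visit = [write_addr[perm[i]] for i in range(n)]
        out = []
        for i, sample in enumerate(block):
            out.append(buffer[visit[i]])  # read the old sample first
            buffer[visit[i]] = sample     # then reuse the freed cell
        outputs.append(out)
        write_addr = visit                # address order for the block just stored
    if blocks:
        # Drain the last block; no new data arrives to overwrite it.
        outputs.append([buffer[write_addr[perm[i]]] for i in range(n)])
    return outputs


# 2x2 transpose: samples arrive row by row and must leave column by column.
size = 2
transpose = [(j % size) * size + (j // size) for j in range(size * size)]
result = reorder_stream([[0, 1, 2, 3], [10, 11, 12, 13], [20, 21, 22, 23]], transpose)
```

Because `perm` is a permutation, the sequence of address orders returns to the starting order after a finite number of blocks (two for a transpose), which is consistent with the regularity the abstract describes.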

VLSI architecture design of CAVLC entropy encoder/decoder for H.264/AVC (H.264/AVC를 위한 CAVLC 엔트로피 부/복호화기의 VLSI 설계)

  • Lee Dae-joon;Jeong Yong-jin
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.5C / pp.371-381 / 2005
  • In this paper, we propose an advanced hardware architecture for a CAVLC entropy encoder/decoder engine for real-time video compression. CAVLC (Context-based Adaptive Variable Length Coding) is a lossless compression method in H.264/AVC that offers high compression efficiency at the cost of high computational complexity. The reference memory size is optimized using a partitioned storing method and a memory reuse method, both based on the partiality of memory referencing. Among several encoder/decoder architectures, we choose the one most suitable for mobile devices and improve its performance using parallel processing. The proposed architecture has been verified on an ARM-interfaced emulation board using an Altera Excalibur and synthesized in Samsung 0.18 um CMOS technology. The synthesis result shows that the encoder can process about 300 CIF frames/s at 150 MHz and the decoder about 250 CIF frames/s at 140 MHz. The hardware architectures are being used as core modules in implementing a complete H.264/AVC video encoder/decoder chip for real-time multimedia applications.

A Tool for On-the-fly Repairing of Atomicity Violation in GPU Program Execution

  • Lee, Keonpyo;Lee, Seongjin;Jun, Yong-Kee
    • Journal of the Korea Society of Computer and Information / v.26 no.9 / pp.1-12 / 2021
  • In this paper, we propose a tool called ARCAV (Automatic Recovery of CUDA Atomicity Violation) that automatically repairs atomicity violations in GPU (Graphics Processing Unit) programs. ARCAV monitors every barrier and memory access so that actual memory writes occur at the end of the barrier region, or so that the program executes the barrier region again. Existing methods only detect atomicity violations in GPU programs and do not repair them, because GPU programs generally do not support the lock and sleep instructions needed for repair. ARCAV is designed for the GPU execution model. It detects and repairs four patterns of atomicity violations that represent real-world cases, and it is independent of the memory hierarchy and thread configuration. Our experiments show that the performance of ARCAV is stable regardless of the number of threads or blocks. The overhead of ARCAV is evaluated using four real-world kernels, and its slowdown is 2.1x native execution time on average.