• Title/Summary/Keyword: Buffer(Memory)

369 search results (processing time: 0.03 seconds)

A Buffer Replacement Policy using Hot Page Management Scheme for Improving Performance of Flash Memory (플래시 메모리 성능향상을 위한 핫 페이지 관리 기법을 이용한 버퍼교체 정책)

  • Daeyoung Kim;Junghan Kim;Hyun-jin Cho;Young Ik Eom
    • Proceedings of the Korea Information Processing Society Conference / Korea Information Processing Society 2008 Fall Conference / pp.860-863 / 2008
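  • Flash memory is one of the portable storage devices widely used in everyday life. It offers fast I/O, low power consumption, silent operation, and small size, but it cannot overwrite data in place, and its erase operation is much slower than its read and write operations. To compensate, a buffer cache is placed between the host and the flash memory, and the replacement policy used in the buffer cache strongly affects the performance of the flash memory device. This paper proposes the HPLRU scheme, which improves on the shortcomings of block-level LRU. HPLRU collects hot pages, that is, pages that have been referenced frequently in the recent past, into a separately managed list; this improves the page hit ratio and mitigates the eviction of hot pages caused by other pages. The algorithm performs well on random data patterns and substantially reduces the number of writes.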
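
The idea of keeping hot pages on their own list can be illustrated with a small sketch. This is not the paper's HPLRU: the abstract does not give the promotion rule or the block-level details, so the class name `HotPageLRU`, the `hot_threshold` parameter, and the "evict cold pages first" rule are all assumptions made for illustration.

```python
from collections import OrderedDict


class HotPageLRU:
    """Toy buffer cache that shields frequently referenced ("hot") pages.

    Assumptions (not from the paper): a page becomes hot once it has been
    re-referenced `hot_threshold` times, and victims come from the cold
    list before the hot list.
    """

    def __init__(self, capacity, hot_threshold=2):
        self.capacity = capacity
        self.hot_threshold = hot_threshold
        self.cold = OrderedDict()  # page -> re-reference count, oldest first
        self.hot = OrderedDict()   # hot pages, oldest first
        self.flushes = 0           # evictions (a flash write if the page is dirty)

    def access(self, page):
        """Reference a page; return True on a buffer hit."""
        if page in self.hot:
            self.hot.move_to_end(page)
            return True
        if page in self.cold:
            hits = self.cold.pop(page) + 1
            if hits >= self.hot_threshold:
                self.hot[page] = hits   # promote to the hot list
            else:
                self.cold[page] = hits  # re-insert as most recently used
            return True
        if len(self.cold) + len(self.hot) >= self.capacity:
            self._evict()
        self.cold[page] = 0
        return False

    def _evict(self):
        # Prefer a cold victim so hot pages survive a run of other pages.
        victims = self.cold if self.cold else self.hot
        victims.popitem(last=False)
        self.flushes += 1
```

With `capacity=3` and `hot_threshold=1`, referencing page 1 twice and then pages 2, 3, 4, 5 still leaves page 1 in the buffer, whereas a plain LRU of the same size would already have evicted it when page 4 arrived.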

Performance Evaluation of Catalog Management Schemes for Distributed Main Memory Databases (분산 주기억장치 데이터베이스에서 카탈로그 관리 기법의 성능평가)

  • Jeong, Han-Ra;Hong, Eui-Kyeong;Kim, Myung
    • Journal of Korea Multimedia Society / Vol. 8, No. 4 / pp.439-449 / 2005
  • Distributed main memory database management systems (DMM-DBMSs) store the database in the main memories of the participating sites. They provide high performance through fast access to the local databases and high-speed communication among the sites. Recently, many research results on DMM-DBMSs have been reported. However, to the best of our knowledge, there is no known research result on the performance of catalog management schemes for DMM-DBMSs. In this work, we evaluated the performance of partitioned catalog management schemes through experimental analysis. First, we classified the partitioned catalog management schemes into three categories: Partitioned Catalogs Without Caching (PCWC), Partitioned Catalogs With Incremental Caching (PCWIC), and Partitioned Catalogs With Full Caching (PCWFC). Experiments were conducted by varying the number of sites, the number of terminals per site, buffer size, write query ratio, and local query ratio. The experiments show that PCWFC outperforms the other two schemes in all cases. The results also indicate that the performance of PCWIC gradually increases over time. It should be noted that PCWFC does not guarantee high performance for disk-based distributed DBMSs when the workload of an individual site is high, the catalog write ratio is high, or remote data objects are accessed very frequently. The main reason PCWFC performs best for DMM-DBMSs is that query compilation and remote catalog access can be done at very high speed, even when the catalogs of the remote data objects are frequently updated.


The 1/f Noise Analysis of 3D SONOS Multi Layer Flash Memory Devices Fabricated on Nitride or Oxide Layer (산화막과 질화막 위에 제작된 3D SONOS 다층 구조 플래시 메모리소자의 1/f 잡음 특성 분석)

  • Lee, Sang-Youl;Oh, Jae-Sub;Yang, Seung-Dong;Jeong, Kwang-Seok;Yun, Ho-Jin;Kim, Yu-Mi;Lee, Hi-Deok;Lee, Ga-Won
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / Vol. 25, No. 2 / pp.85-90 / 2012
  • In this paper, we compared and analyzed 3D silicon-oxide-nitride-oxide-silicon (SONOS) multi-layer flash memory devices fabricated on a nitride layer and on an oxide layer. The device fabricated on the nitride layer has inferior electrical properties to the one fabricated on the oxide layer. Nevertheless, the device on the nitride layer has a faster program/erase speed (P/E speed) than the device on the oxide layer. To find out why the device on nitride has the faster P/E speed, we investigated the 1/f noise of both devices. In terms of gate bias dependence, both devices follow the mobility fluctuation model, which results from lattice scattering and defects in the channel layer. In addition, the device on nitride, which has the better memory characteristics, has a higher normalized drain current noise power spectral density ($S_{ID}/I^2_D$), which means that it has more traps and defects in the channel layer. The apparent Hooge's noise parameter (${\alpha}_{app}$), which represents the grain boundary trap density and the height of the grain boundary potential barrier, is also considered. The device on nitride has higher ${\alpha}_{app}$ values, which can be explained by its larger number of grain boundary traps. Therefore, the difference in P/E speed between the devices on nitride and on oxide can be explained by the trapping/de-trapping of free carriers at the more numerous grain boundary trap sites in the channel layer.

Hexagon-shape Line Search Algorithm for Fast Motion Estimation on Media Processor (미디어프로세서 상의 고속 움직임 탐색을 위한 Hexagon 모양 라인 탐색 알고리즘)

  • Jung Bong-Soo;Jeon Byeung-Woo
    • Journal of the Institute of Electronics Engineers of Korea SP / Vol. 43, No. 4 / pp.55-65 / 2006
  • Most fast block motion estimation algorithms reported so far in the literature aim to reduce computation in terms of the number of search points, and thus do not fit well with multimedia processors because of their irregular data flow. For multimedia processors, proper reuse of data is more important than reducing the number of absolute difference operations, because the execution cycle performance depends strongly on the number of off-chip memory accesses. Therefore, in this paper, we propose a Hexagon-shape line search (HEXSLS) algorithm that uses a line search pattern to increase data reuse from the on-chip local buffer and checks sub-sampled points in the line search pattern to reduce unnecessary SAD operations. Our experimental results show that the prediction error (MAE) performance of the proposed HEXSLS is similar to that of the full search block matching algorithm (FSBMA) and better than that of the hexagon-based search (HEXBS). The proposed HEXSLS also requires far fewer off-chip memory accesses than conventional fast motion estimation algorithms such as the hexagon-based search (HEXBS) and the predictive line search (PLS). As a result, the proposed HEXSLS algorithm requires fewer execution cycles on a media processor.

FPGA Implementation of Real-time 2-D Wavelet Image Compressor (실시간 2차원 웨이블릿 영상압축기의 FPGA 구현)

  • 서영호;김왕현;김종현;김동욱
    • The Journal of Korean Institute of Communications and Information Sciences / Vol. 27, No. 7A / pp.683-694 / 2002
  • In this paper, a digital image compression codec using the 2D DWT (Discrete Wavelet Transform) is designed using FPGA technology for real-time operation. The implemented wavelet-decomposition image compression codec consists of a wavelet kernel for the wavelet filtering process, a quantizer/Huffman coder for quantization and Huffman encoding of the wavelet coefficients, a memory controller for interfacing with external memories, an input interface to process image pixels from the A/D converter, an output interface that packs Huffman codes, which have irregular bit sizes, into regular 32-bit data, a memory-kernel buffer to arrange data for real-time processing, a PCI interface, and several modules that set the timing between modules. Because a memory mapping method is used that converts column-direction reads into row-direction reads, the reads in the vertical-direction wavelet decomposition are processed very efficiently. The global operation of the wavelet codec is synchronized with the field signal of the A/D converter. The hardware as a whole is pipelined in units of fields, and each field operation is divided according to the decomposition levels of the wavelet transform. The implemented hardware uses 11119 (45%) LABs and 28352 (9%) ESBs of an APEX20KC EP20k600CB652-7 FPGA device and is mapped into a single FPGA without additional external logic. It can process 33 frames (66 fields) per second, so real-time image compression is possible.

Optimized Hardware Design using Sobel and Median Filters for Lane Detection

  • Lee, Chang-Yong;Kim, Young-Hyung;Lee, Yong-Hwan
    • Journal of Advanced Information Technology and Convergence / Vol. 9, No. 1 / pp.115-125 / 2019
  • In this paper, an image is received from a camera and the lane is detected. There are various ways to detect lanes. Edge detection is generally done with Sobel or Canny edge detection. The design minimizes the use of multiplication and division to suit a hardware implementation. The test images come from a black box mounted on a vehicle. Because the top of the black box image is mostly background, it is excluded from the calculation. Also, to speed up processing, YCbCr is calculated from the image, and only the data for the desired lane colors, white and yellow, is used to detect the lane. A median filter is used to remove noise from the images. Median filters excel at noise rejection, but they generally take a long time because all values must be compared. In this paper, the time is shortened by using addition to obtain the result of the median filter. Sobel edge detection is faster but more noise-sensitive than Canny edge detection. These shortcomings are addressed using complementary algorithms. The design also organizes and processes data in parallel processing pipelines. To reduce memory size, the system does not store all data at each step; instead, it uses four line buffers. Three line buffers perform mask operations, and one line buffer stores new data at the same time as the operation. With this design, the processing speed is reported to be six times faster and the memory quantity about 33% greater than the other methods presented in the paper. The system is designed for a target operating frequency of 50 MHz. At the target operating frequency, the design can process 2157 fps for 640×360 images, 540 fps for HD images, and 240 fps for Full HD images, which is sufficient for most 30 fps and 60 fps video. At the maximum operating frequency, even more frames can be processed.
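
The four-line-buffer arrangement described in this abstract can be modeled in software as follows. This is a sketch under assumptions, not the authors' hardware: the function name `sobel_line_buffered`, the |Gx| + |Gy| magnitude approximation, and the ring-buffer indexing are illustrative choices. Shifts replace the multiplications by 2, in the spirit of the abstract's remark about avoiding multipliers.

```python
def sobel_line_buffered(rows, width):
    """Apply a 3x3 Sobel mask to a stream of rows using four line buffers.

    At step r, the three buffers holding rows r-3, r-2, r-1 feed the mask
    while the fourth buffer receives row r, so reads and the write never
    touch the same buffer.
    """
    n = len(rows)
    buffers = [[0] * width for _ in range(4)]  # ring of four line buffers
    out = []
    for r in range(n + 1):  # one extra step drains the last three rows
        if r >= 3:
            top, mid, bot = (buffers[(r - k) % 4] for k in (3, 2, 1))
            line = []
            for x in range(1, width - 1):
                gx = (top[x + 1] + (mid[x + 1] << 1) + bot[x + 1]) \
                    - (top[x - 1] + (mid[x - 1] << 1) + bot[x - 1])
                gy = (bot[x - 1] + (bot[x] << 1) + bot[x + 1]) \
                    - (top[x - 1] + (top[x] << 1) + top[x + 1])
                line.append(abs(gx) + abs(gy))  # cheap magnitude estimate
            out.append(line)
        if r < n:
            buffers[r % 4][:] = rows[r]  # the write buffer for this step
    return out
```

For an image of `n` rows the function returns `n - 2` output rows of `width - 2` values each, because border pixels have no complete 3×3 neighborhood.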

A Design of Fractional Motion Estimation Engine with 4×4 Block Unit of Interpolator & SAD Tree for 8K UHD H.264/AVC Encoder (8K UHD(7680×4320) H.264/AVC 부호화기를 위한 4×4블럭단위 보간 필터 및 SAD트리 기반 부화소 움직임 추정 엔진 설계)

  • Lee, Kyung-Ho;Kong, Jin-Hyeung
    • Journal of the Institute of Electronics and Information Engineers / Vol. 50, No. 6 / pp.145-155 / 2013
  • In this paper, we propose a $4{\times}4$ block parallel interpolation architecture for high-performance H.264/AVC Fractional Motion Estimation in real-time 8K UHD ($7680{\times}4320$) video processing. To improve throughput, we design a $4{\times}4$ block parallel interpolator. To supply the $10{\times}10$ reference data for interpolation, we design a 2D cache buffer that consists of $10{\times}10$ memory arrays. We minimize redundant storage of reference pixels by applying the Search Area Stripe Reuse (SASR) scheme, and we implement a high-speed plane interpolator with a 3-stage pipeline (horizontal/vertical 1/2 interpolation, diagonal 1/2 interpolation, 1/4 interpolation). The proposed architecture was simulated with a 0.13 µm standard cell library. The gate count is 436.5 Kgates. The proposed H.264/AVC Fractional Motion Estimation can support 8K UHD at 30 frames per second when running at 187 MHz.
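
The throughput figures quoted above imply a cycle budget per $4{\times}4$ block, which can be checked with a few lines of arithmetic. The helper name `cycles_per_block` is hypothetical, and the calculation assumes each $4{\times}4$ block position is counted once per frame; a real motion estimator evaluates several candidates per position, so this is only an average budget, not a description of the pipeline.

```python
def cycles_per_block(width, height, fps, clock_hz, block=4):
    """Average clock cycles available per block position at a given frame rate."""
    blocks_per_frame = (width // block) * (height // block)
    return clock_hz / (blocks_per_frame * fps)


# 8K UHD at 30 fps and 187 MHz: 2,073,600 blocks per frame.
budget_8k = cycles_per_block(7680, 4320, 30, 187e6)
print(f"{budget_8k:.3f} cycles per 4x4 block position")
```

Under these assumptions the budget comes to roughly three cycles per $4{\times}4$ block position.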

OpenGL ES 1.1 Implementation Using OpenGL (OpenGL을 이용한 OpenGL ES 1.1 구현)

  • Lee, Hwan-Yong;Baek, Nak-Hoon
    • The KIPS Transactions:PartA / Vol. 16A, No. 3 / pp.159-168 / 2009
  • In this paper, we present an efficient way of implementing the OpenGL ES 1.1 standard for environments with a hardware-supported OpenGL API, such as desktop PCs. Although OpenGL ES started from the existing OpenGL features, it has become a new three-dimensional graphics library customized for embedded systems by introducing fixed-point arithmetic operations, buffer management with support for fixed-point data types, completely new texture mapping functionality, and other features. Currently, it is the official three-dimensional graphics library for Google Android, Apple iPhone, PlayStation3, etc. In this paper, we improved the arithmetic operations for the fixed-point number representation, which is the most characteristic data type of OpenGL ES. For the conversion of fixed-point data types to the floating-point representation used by the underlying OpenGL, we show an efficient conversion process that still satisfies the requirements of the OpenGL ES standard. We also introduce a simple memory management scheme to manage the converted data for buffers containing fixed-point numbers. For texture processing, the requirements of the two standards are quite different, so we used completely new software implementations. Our final OpenGL ES library provides all of the more than 200 functions in the OpenGL ES 1.1 standard and completely passed its conformance test, showing its compliance with the standard. In terms of efficiency, we measured its execution times for several OpenGL ES-specific application programs and achieved up to a 33.147-fold improvement, making it the fastest among the OpenGL ES implementations in the same category.
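
The fixed-point to floating-point conversion mentioned above can be sketched as follows. OpenGL ES 1.x represents its fixed-point type as a signed 16.16 value, so the conversion is a division by 65536; the function names and the list-based "buffer" below are hypothetical and do not reflect the paper's implementation or its memory management scheme.

```python
FIXED_ONE = 1 << 16  # the value 1.0 in signed 16.16 fixed point


def fixed_to_float(x):
    """Convert a 16.16 fixed-point integer to a float."""
    return x / FIXED_ONE


def float_to_fixed(f):
    """Convert a float to the nearest 16.16 fixed-point integer."""
    return int(round(f * FIXED_ONE))


def convert_buffer(fixed_values):
    """Convert a whole buffer of fixed-point values, e.g. vertex coordinates."""
    return [fixed_to_float(v) for v in fixed_values]
```

A wrapper library would typically convert a fixed-point vertex array once and hand the floating-point copy to the desktop OpenGL call, which is why caching the converted data matters for performance.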

A Design of 4×4 Block Parallel Interpolation Motion Compensation Architecture for 4K UHD H.264/AVC Decoder (4K UHD급 H.264/AVC 복호화기를 위한 4×4 블록 병렬 보간 움직임보상기 아키텍처 설계)

  • Lee, Kyung-Ho;Kong, Jin-Hyeung
    • Journal of the Institute of Electronics and Information Engineers / Vol. 50, No. 5 / pp.102-111 / 2013
  • In this paper, we propose a $4{\times}4$ block parallel interpolation architecture for high-performance H.264/AVC Motion Compensation in real-time 4K UHD ($3840{\times}2160$) video processing. To improve throughput, we design a $4{\times}4$ block parallel interpolator. To supply the $9{\times}9$ reference data for interpolation, we design a 2D cache buffer that consists of $9{\times}9$ memory arrays. We minimize redundant storage of reference pixels by applying the Search Area Stripe Reuse (SASR) scheme, and we implement a high-speed plane interpolator with a 3-stage pipeline (horizontal/vertical 1/2 interpolation, diagonal 1/2 interpolation, 1/4 interpolation). The proposed architecture was simulated with a 0.13 µm standard cell library. The maximum operating frequency is 150 MHz. The gate count is 161 Kgates. The proposed H.264/AVC Motion Compensation can support 4K UHD at 72 frames per second when running at 150 MHz.
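
The $9{\times}9$ reference window follows from the six-tap half-sample filter that H.264/AVC uses for luma: each output needs six neighboring samples, so four outputs in a row need 4 + 5 = 9 inputs. The sketch below shows only the horizontal half-sample step for 8-bit samples; the function name `half_pel_row` is hypothetical, and the paper's parallel hardware datapath is not reproduced here.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # H.264/AVC luma half-sample filter taps


def half_pel_row(samples):
    """Half-sample values for one row of 8-bit luma samples.

    Output i lies between samples[i + 2] and samples[i + 3], so a row of
    9 input samples yields the 4 half-sample values of a 4x4 block row.
    """
    out = []
    for i in range(len(samples) - 5):
        acc = sum(t * s for t, s in zip(TAPS, samples[i:i + 6]))
        out.append(min(255, max(0, (acc + 16) >> 5)))  # round, scale, clip
    return out
```

A flat row stays flat after filtering, while a sharp step such as `[0, 0, 0, 0, 0, 255, 255, 255, 255]` shows the overshoot that the final clipping removes.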

Research on the Design of TPO(Time, Place, Occasion)-Shift System for Mobile Multimedia Devices (휴대용 멀티미디어 디바이스를 위한 TPO(Time, Place, Occasion)-Shift 시스템 설계에 대한 연구)

  • Kim, Dae-Jin;Choi, Hong-Sub
    • Journal of the Korea Society of Computer and Information / Vol. 14, No. 2 / pp.9-16 / 2009
  • As broadband networks and multimedia technology develop, the commercial market for digital content and the use of IPTV are spreading widely. Against this background, the Time-Shift system was developed to meet multimedia requirements. This system is independent of Time but not of Place and Occasion. To solve this problem, we propose the TPO (Time, Place, Occasion)-Shift system for mobile multimedia devices. The profile applicable to mobile multimedia devices is very different from that of a set-top box, and typical mobile multimedia devices do not have memory large enough for multimedia data. It is therefore important to continuously store and manage multimedia data within the limited capacity, according to the mobile device's profile. To this end, we compose baskets using a defined time unit and manage these baskets for effective buffer management. In addition, since the file name of each basket includes the basket's time information, this time information can be used as the DTS (Decoding Time Stamp). When multimedia content is converted for use on portable multimedia devices, newly formatted content can be composed using this DTS information. With a basket-based buffer system, content can be composed in real time on mobile multimedia devices while saving memory. To verify the system's real-time operation and performance, we implemented the proposed TPO-Shift system on a mobile device, the MS340. The set-top box was designed using a DirectShow player in a Windows Vista environment. The results confirm the usefulness and real-time operation of the proposed system.
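
The basket idea, fixed time units whose file names carry the time information, can be sketched as below. Everything concrete here is an assumption for illustration: the class name `BasketBuffer`, the `basket_<seconds>.ts` naming pattern, and the policy of dropping the oldest basket when the limit is reached are not taken from the paper.

```python
from collections import deque


class BasketBuffer:
    """Toy time-sliced buffer: one basket (file name) per fixed time unit."""

    PREFIX, SUFFIX = "basket_", ".ts"  # hypothetical naming pattern

    def __init__(self, unit_seconds, max_baskets):
        self.unit = unit_seconds
        self.max_baskets = max_baskets
        self.baskets = deque()  # basket file names, oldest first

    def basket_name(self, t):
        """File name of the basket that covers time t (in seconds)."""
        start = int(t // self.unit) * self.unit
        return f"{self.PREFIX}{start:010d}{self.SUFFIX}"

    def store(self, t):
        """Record data arriving at time t; return the basket it belongs to."""
        name = self.basket_name(t)
        if not self.baskets or self.baskets[-1] != name:
            self.baskets.append(name)
            if len(self.baskets) > self.max_baskets:
                self.baskets.popleft()  # reclaim the oldest basket
        return name

    @classmethod
    def dts_of(cls, name):
        """Recover the basket start time, usable as a decoding time stamp."""
        return int(name[len(cls.PREFIX):-len(cls.SUFFIX)])
```

Because the start time is recoverable from the name alone, a converter can order baskets and stamp the output without keeping a separate index in memory.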