• Title/Summary/Keyword: memory access time

Search Result 410, Processing Time 0.027 seconds

An Efficient Adaptive Bitmap-based Selective Tuning Scheme for Spatial Queries in Broadcast Environments

  • Song, Doo-Hee;Park, Kwang-Jin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.5 no.10
    • /
    • pp.1862-1878
    • /
    • 2011
  • With the advances in wireless communication technology and the advent of smartphones, research on location-based services (LBSs) is being actively carried out. In particular, several spatial index methods have been proposed to provide efficient LBSs. However, finding an optimal indexing method that balances query performance and index size remains a challenge in the case of wireless environments that have limited channel bandwidths and device resources (computational power, memory, and battery power). Thus, mechanisms that make existing spatial indexing techniques more efficient and highly applicable in resource-limited environments should be studied. Bitmap-based Spatial Indexing (BSI) has been designed to support LBSs, especially in wireless broadcast environments. However, the access latency in BSI is extremely large because of the large size of the bitmap, and this may lead to increases in the search time. In this paper, we introduce a Selective Bitmap-based Spatial Indexing (SBSI) technique. Then, we propose an Adaptive Bitmap-based Spatial Indexing (ABSI) to improve the tuning time in the proposed SBSI scheme. The ABSI is applied to the distribution of geographical objects in a grid by using the Hilbert curve (HC). With the information in the ABSI, grid cells that have no objects placed, (i.e., 0-bit information in the spatial bitmap index) are not tuned during a search. This leads to an improvement in the tuning time on the client side. We have carried out a performance evaluation and demonstrated that our SBSI and ABSI techniques outperform the existing bitmap-based DSI (B DSI) technique.

A Study on Architecture of Motion Compensator for H.264/AVC Encoder (H.264/AVC부호화기용 움직임 보상기의 아키텍처 연구)

  • Kim, Won-Sam;Sonh, Seung-Il;Kang, Min-Goo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.12 no.3
    • /
    • pp.527-533
    • /
    • 2008
  • Motion compensation always produces the principal bottleneck in the real-time high quality video applications. Therefore, a fast dedicated hardware is needed to perform motion compensation in the real-time video applications. In many video encoding methods, the frames are partitioned into blocks of Pixels. In general, motion compensation predicts present block by estimating the motion from previous frame. In motion compensation, the higher pixel accuracy shows the better performance but the computing complexity is increased. In this paper, we studied an architecture of motion compensator suitable for H.264/AVC encoder that supports quarter-pixel accuracy. The designed motion compensator increases the throughput using transpose array and 3 6-tap Luma filters and efficiently reduces the memory access. The motion compensator is described in VHDL and synthesized in Xilinx ISE and verified using Modelsim_6.1i. Our motion compensator uses 36-tap filters only and performs in 640 clock-cycle per macro block. The motion compensator proposed in this paper is suitable to the areas that require the real-time video processing.

Implementation of FFT on Massively Parallel GPU for DVB-T Receiver (DVB-T 수신기를 위한 대규모 병렬처리 GPU 기반의 FFT 구현)

  • Lee, Kyu Hyung;Heo, Seo Weon
    • Journal of Broadcast Engineering
    • /
    • v.18 no.2
    • /
    • pp.204-214
    • /
    • 2013
  • Recently various research have been conducted relating to the implementation of signal processing or communication system by software using the massively parallel processing capability of the GPU. In this work, we focus on reducing software simulation time of 2K/8K FFT in DVB-T by using GPU. we estimate the processing time of the DVB-T system, which is one of the standards for DTV transmission, by CPU. Then we implement the FFT processing by the software using the NVIDIA's massively parallel GPU processor. In this paper we apply stream process method to reduce the overhead for data transfer between CPU and GPU, coalescing method to reduce the global memory access time and data structure design method to maximize the shared memory usage. The results show that our proposed method is approximately 20~30 times as fast as the CPU based FFT processor, and approximately 1.8 times as fast as the CUFFT library (version 2.1) which is provided by the NVIDIA when applied to the DVB-T 2K/8K mode FFT.

Hardware Design of High Performance In-loop Filter in HEVC Encoder for Ultra HD Video Processing in Real Time (UHD 영상의 실시간 처리를 위한 고성능 HEVC In-loop Filter 부호화기 하드웨어 설계)

  • Im, Jun-seong;Dennis, Gookyi;Ryoo, Kwang-ki
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2015.10a
    • /
    • pp.401-404
    • /
    • 2015
  • This paper proposes a high-performance in-loop filter in HEVC(High Efficiency Video Coding) encoder for Ultra HD video processing in real time. HEVC uses in-loop filter consisting of deblocking filter and SAO(Sample Adaptive Offset) to solve the problems of quantization error which causes image degradation. In the proposed in-loop filter encoder hardware architecture, the deblocking filter and SAO has a 2-level hybrid pipeline structure based on the $32{\times}32CTU$ to reduce the execution time. The deblocking filter is performed by 6-stage pipeline structure, and it supports minimization of memory access and simplification of reference memory structure using proposed efficient filtering order. Also The SAO is implemented by 2-statge pipeline for pixel classification and applying SAO parameters and it uses two three-layered parallel buffers to simplify pixel processing and reduce operation cycle. The proposed in-loop filter encoder architecture is designed by Verilog HDL, and implemented by 205K logic gates in TSMC 0.13um process. At 110MHz, the proposed in-loop filter encoder can support 4K Ultra HD video encoding at 30fps in realtime.

  • PDF

Design and Implementation of Buffer Management Method for Enhancing Performance of Open GIS Components (개방형 GIS 컴포넌트의 성능 개선을 위한 버퍼 관리 방법의 설계 및 구현)

  • Cho, Dae-Soo;Min, Kyoung-Wook
    • The KIPS Transactions:PartD
    • /
    • v.11D no.1
    • /
    • pp.51-60
    • /
    • 2004
  • In open GIS environment, a GIS client can access spatial data in different types of GIS sowers with the same Interfaces. This means that open GIS components software ensures the interoperability throughout the heterogeneous GIS servers. The user response time, however, tends to be increased, if the client makes use of the standard interfaces for data accesses that can ensure interoperability. This is because the format of spatial data accessed from a specific GIS server must be transformed into common format, such as Rowset in OLE/DB, which is compatible with the standard interfaces. In this paper, we develop efficient techniques for data buffering in GIS client to reduce the response time. We design the buffer management method, which Is based on the space partitioning, and Integrate buffer management components into MapBase, an open GIS component software. And we also, show that buffer management proposed in this paper yields significant performance improvement in GIS client.

The Electrical Characteristics of SRAM Cell with Stacked Single Crystal Silicon TFT Cell (단결정 실리콘 TFT Cell의 적용에 따른 SRAM 셀의 전기적 특성)

  • Lee, Deok-Jin;Kang, Ey-Goo
    • Journal of the Korea Computer Industry Society
    • /
    • v.6 no.5
    • /
    • pp.757-766
    • /
    • 2005
  • There have been great demands for higher density SRAM in all area of SRAM applications, such as mobile, network, cache, and embedded applications. Therefore, aggressive shrinkage of 6T Full CMOS SRAM had been continued as the technology advances, However, conventional 6T Full CMOS SRAM has a basic limitation in the cell size because it needs 6 transistors on a silicon substrate compared to 1 transistor in a DRAM cell. The typical cell area of 6T Full CMOS SRAM is $70{\sim}90F^{2}$, which is too large compared to $8{\sim}9F^{2}$ of DRAM cell. With 80nm design rule using 193nm ArF lithography, the maximum density is 72M bits at the most. Therefore, pseudo SRAM or 1T SRAM, whose memory cell is the same as DRAM cell, is being adopted for the solution of the high density SRAM applications more than 64M bits. However, the refresh time limits not only the maximum operation temperature but also nearly all critical electrical characteristics of the products such as stand_by current and random access time. In order to overcome both the size penalty of the conventional 6T Full CMOS SRAM cell and the poor characteristics of the TFT load cell, we have developed $S^{3}$ cell. The Load pMOS and the Pass nMOS on ILD have nearly single crystal silicon channel according to the TEM and electron diffraction pattern analysis. In this study, we present $S^{3}$ SRAM cell technology with 100nm design rule in further detail, including the process integration and the basic characteristics of stacked single crystal silicon TFT.

  • PDF

Electrical Characteristics of SRAM Cell with Stacked Single Crystal Silicon TFT Cell (Stacked Single Crystal Silicon TFT Cell의 적용에 의한 SRAM 셀의 전기적인 특성에 관한 연구)

  • Kang, Ey-Goo;Kim, Jin-Ho;Yu, Jang-Woo;Kim, Chang-Hun;Sung, Man-Young
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers
    • /
    • v.19 no.4
    • /
    • pp.314-321
    • /
    • 2006
  • There have been great demands for higher density SRAM in all area of SRAM applications, such as mobile, network, cache, and embedded applications. Therefore, aggressive shrinkage of 6 T Full CMOS SRAM had been continued as the technology advances. However, conventional 6 T Full CMOS SRAM has a basic limitation in the cell size because it needs 6 transistors on a silicon substrate compared to 1 transistor in a DRAM cell. The typical cell area of 6 T Full CMOS SRAM is $70{\sim}90\;F^2$, which is too large compared to $8{\sim}9\;F^2$ of DRAM cell. With 80 nm design rule using 193 nm ArF lithography, the maximum density is 72 Mbits at the most. Therefore, pseudo SRAM or 1 T SRAM, whose memory cell is the same as DRAM cell, is being adopted for the solution of the high density SRAM applications more than 64 M bits. However, the refresh time limits not only the maximum operation temperature but also nearly all critical electrical characteristics of the products such as stand_by current and random access time. In order to overcome both the size penalty of the conventional 6 T Full CMOS SRAM cell and the poor characteristics of the TFT load cell, we have developed S3 cell. The Load pMOS and the Pass nMOS on ILD have nearly single crystal silicon channel according to the TEM and electron diffraction pattern analysis. In this study, we present $S^3$ SRAM cell technology with 100 nm design rule in further detail, including the process integration and the basic characteristics of stacked single crystal silicon TFT.

Atomic Layer Deposition: Overview and Applications (원자층증착 기술: 개요 및 응용분야)

  • Shin, Seokyoon;Ham, Giyul;Jeon, Heeyoung;Park, Jingyu;Jang, Woochool;Jeon, Hyeongtag
    • Korean Journal of Materials Research
    • /
    • v.23 no.8
    • /
    • pp.405-422
    • /
    • 2013
  • Atomic layer deposition(ALD) is a promising deposition method and has been studied and used in many different areas, such as displays, semiconductors, batteries, and solar cells. This method, which is based on a self-limiting growth mechanism, facilitates precise control of film thickness at an atomic level and enables deposition on large and three dimensionally complex surfaces. For instance, ALD technology is very useful for 3D and high aspect ratio structures such as dynamic random access memory(DRAM) and other non-volatile memories(NVMs). In addition, a variety of materials can be deposited using ALD, oxides, nitrides, sulfides, metals, and so on. In conventional ALD, the source and reactant are pulsed into the reaction chamber alternately, one at a time, separated by purging or evacuation periods. Thermal ALD and metal organic ALD are also used, but these have their own advantages and disadvantages. Furthermore, plasma-enhanced ALD has come into the spotlight because it has more freedom in processing conditions; it uses highly reactive radicals and ions and for a wider range of material properties than the conventional thermal ALD, which uses $H_2O$ and $O_3$ as an oxygen reactant. However, the throughput is still a challenge for a current time divided ALD system. Therefore, a new concept of ALD, fast ALD or spatial ALD, which separate half-reactions spatially, has been extensively under development. In this paper, we reviewed these various kinds of ALD equipment, possible materials using ALD, and recent ALD research applications mainly focused on materials required in microelectronics.

Design of Systolic Array for High Speed Processing of Block Matching Motion Estimation Algorithm (블록 정합 움직임추정 알고리즘의 고속처리를 위한 시스토릭 어레이의 설계)

  • 추봉조;김혁진;이수진
    • Journal of the Korea Society of Computer and Information
    • /
    • v.3 no.2
    • /
    • pp.119-124
    • /
    • 1998
  • Block Matching Motion Estimation(BMME) Algorithm is demands a very large amount of computing power and have been proposed many fast algorithms. These algorithms are many problem that larger size of VLSI scale due to non-localized search block data and problem of non-reuse of input data for each processing step. In this paper, we designed systolic arry of high processing capacity, constraints input output pin size and reuse of input data for small VLSI size. The proposed systolic array is optimized memory access time because of iterative reuse of input data on search block and become independent of problem size due to increase of algorithm's parallelism and total processing elements connection is localized spatial and temporal. The designed systolic array is reduced O(N6) time complexity to O(N3) on moving vector and has O(N) input/output pin size.

  • PDF

ASIP Design for Real-Time Processing of H.264 (실시간 H.264/AVC 처리를 위한 ASIP설계)

  • Kim, Jin-Soo;SunWoo, Myung-Hoon
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.44 no.5
    • /
    • pp.12-19
    • /
    • 2007
  • This paper presents an ASIP(Application Specific Instruction Set Processor) for implementation of H.264/AVC, called VSIP(Video Specific Instruction-set Processor). The proposed VSIP has novel instructions and optimized hardware architectures for specific applications, such as intra prediction, in-loop deblocking filter, integer transform, etc. Moreover, VSIP has hardware accelerators for computation intensive parts in video signal processing, such as inter prediction and entropy coding. The VSIP has much smaller area and can dramatically reduce the number of memory access compared with commercial DSP chips, which result in low power consumption. The proposed VSIP can efficiently perform in real-time video processing and it can support various profiles and standards.