• Title/Summary/Keyword: memory load

A dual-link CC-NUMA System Tolerant to the Multiprogramming Environment (다중 프로그램 환경에 적합한 이중 연결 CC-NUMA 시스템)

  • Suh, Hyo-Joong
    • The KIPS Transactions:PartA / v.11A no.3 / pp.199-206 / 2004
  • In a multiprogrammed environment, the performance of a multiprocessor system is affected by the operating system's process allocation policy. The lowest communication cost is achieved when related processes are placed on adjacent processors. In practice, however, effective allocation is quite difficult, and executing the allocation policy itself consumes computation time. Dual-ring CC-NUMA systems exhibit considerable performance differences depending on the process allocation policy because of heavily unbalanced memory transactions on the interconnection network. In this paper, I propose a load-balanced dual-link CC-NUMA system that does not require a process allocation policy. In program-driven simulations, the proposed system shows no remarkable difference across allocation policies, whereas the dual-ring system shows a 10% performance improvement with process allocation. In addition, the proposed system outperforms the dual-ring system by about 1.5 times.

Fast Hilbert R-tree Bulk-loading Scheme using GPGPU (GPGPU를 이용한 Hilbert R-tree 벌크로딩 고속화 기법)

  • Yang, Sidong;Choi, Wonik
    • Journal of KIISE / v.41 no.10 / pp.792-798 / 2014
  • In spatial databases, the R-tree is one of the most widely used index structures, and many variants have been proposed to improve its performance. Among these variants, the Hilbert R-tree is a representative method that uses the Hilbert curve to process large amounts of data without the high-cost split techniques otherwise needed to construct an R-tree. The Hilbert R-tree, however, is hardly applicable to large-scale applications in practice, mainly because of high pre-processing costs and slow bulk-load time. To overcome these limitations, we propose a novel approach that parallelizes Hilbert mapping and thereby accelerates bulk-loading of the Hilbert R-tree in GPU memory. The GPU-based Hilbert R-tree improves bulk-loading performance by applying the inversed-cell method and exploiting parallelism when packing the R-tree structure. Our experimental results show that the proposed scheme is up to 45 times faster than traditional CPU-based bulk-loading schemes.
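
The two stages named in this abstract, Hilbert mapping and packing, can be illustrated with a minimal sequential sketch. It is not the authors' GPU implementation and does not include their inversed-cell method. It uses the well-known iterative coordinate-to-index conversion for the Hilbert curve, and the function names, the `capacity` parameter, and the integer-grid assumption are all hypothetical choices made for this example.

```python
def hilbert_index(order, x, y):
    """Map cell (x, y) of a 2**order by 2**order grid to its Hilbert curve index."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def mbr(rects):
    """Minimum bounding rectangle of (xmin, ymin, xmax, ymax) tuples."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def bulk_load(rects, capacity=4, order=16):
    """Pack rectangles bottom-up; returns a list of levels of node MBRs.

    Assumes integer coordinates in [0, 2**order). Level 0 holds the leaf
    nodes and the last level holds the single root.
    """
    if not rects:
        raise ValueError("need at least one rectangle")

    def center_key(r):
        return hilbert_index(order, (r[0] + r[2]) // 2, (r[1] + r[3]) // 2)

    nodes = sorted(rects, key=center_key)  # Hilbert mapping + sort
    levels = []
    while True:
        # Packing: every `capacity` consecutive entries form one parent node.
        nodes = [mbr(nodes[i:i + capacity])
                 for i in range(0, len(nodes), capacity)]
        levels.append(nodes)
        if len(nodes) == 1:
            return levels
```

Because both the per-object key computation and the per-node MBR computation are independent of one another, these are the parts that lend themselves to parallel execution; the sort is the only step here that couples all objects.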

Embedded Linux for Commercial Digital TV System (상용 디지털 TV를 위한 임베디드 리눅스 시스템)

  • Moon, Sang-Pil;Seo, Dae-Wha
    • The KIPS Transactions:PartA / v.10A no.6 / pp.595-604 / 2003
  • A digital TV system must handle data processing as well as video and audio processing. In interactive broadcasting in particular, it must manage a return channel provided over the Internet, the PSTN, and so on. Because of its many functions and multitasking workloads, it needs an operating system. Embedded Linux, as an open-source program, improves cost-effectiveness in the market and has many advantages: reusable device drivers and application programs, a more convenient development environment with a shell and file system, and easy problem resolution within the open-source community. In this paper, we modified the Embedded Linux kernel and the cross-development environment for a big-endian system, redesigned devices for kernel execution, and configured the system memory map in order to load a Linux kernel. We also developed a device driver for control of the entire system.

Synthesis of Ocean Wave Models and Simulation Using GPU (바다물결 모형의 합성 및 GPU를 이용한 시뮬레이션)

  • Lee, Dong-Min;Lee, Sung-Kee
    • The KIPS Transactions:PartA / v.14A no.7 / pp.421-434 / 2007
  • Among the many natural scenes generated with computer graphics, the representation of ocean surfaces is one of the most complicated and time-consuming problems because of its large extent and complex surface movement. We present a hybrid method to represent and animate unbounded deep-water ocean surfaces by using the graphics processor as both the simulation and the rendering core. Our technique is mainly based on spectral approaches that generate a highly detailed height field using the Fourier transform on a 2D regular grid. Additionally, we incorporate the Gerstner model and generate a low-detail height field on a 2D projected grid to represent large waves and the main structure of the ocean surface. There is no interruption between the CPU and GPU, and no need to transfer simulation results from system memory to graphics hardware, because the entire simulation and rendering processes run on the graphics processor. As a result, we can synthesize and render realistic water surfaces in real time. The proposed techniques are readily adoptable in real-time applications such as computer games, which place a heavy workload on the CPU but still demand plausible natural scenes.
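
For the Gerstner component mentioned in this abstract, a minimal CPU-side sketch of the classic Gerstner displacement is shown below. It assumes the deep-water dispersion relation (angular frequency squared equals g times the wavenumber) and a y-up coordinate system. The function name and the wave tuple layout are hypothetical, and the paper's spectral (Fourier) component and GPU pipeline are not reproduced here.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def gerstner_point(x0, z0, t, waves):
    """Displace the rest-position grid point (x0, z0) at time t.

    waves: iterable of (amplitude, wavelength, heading_radians) tuples.
    Returns (x, y, z) with y as the vertical axis.
    """
    x, y, z = x0, 0.0, z0
    for amplitude, wavelength, heading in waves:
        k = 2.0 * math.pi / wavelength        # wavenumber
        omega = math.sqrt(G * k)              # deep-water dispersion (assumed)
        dx, dz = math.cos(heading), math.sin(heading)
        phase = k * (dx * x0 + dz * z0) - omega * t
        # Horizontal motion toward the crest gives the sharpened peaks.
        x -= dx * amplitude * math.sin(phase)
        z -= dz * amplitude * math.sin(phase)
        y += amplitude * math.cos(phase)
    return x, y, z
```

Each grid point is evaluated independently of every other point, which is why this kind of model maps naturally onto a per-vertex computation on graphics hardware.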

A Study on the Thermal Buckling and Postbuckling of a Laminated Composite Beam with Embedded SMA Actuators (형상기억합금 선을 삽입한 복합적층 보의 열좌굴 및 좌굴후 거동에 관한 연구)

  • Choi, S.;Lee, J.J.;Lee, D.C.
    • Composites Research / v.12 no.3 / pp.55-65 / 1999
  • In this paper, the thermal buckling and postbuckling behaviour of a composite beam with embedded shape memory alloy (SMA) wires is investigated experimentally and analytically. The results of thermal buckling tests on uniformly heated, clamped composite beams embedded with SMA wire actuators are presented and discussed with respect to geometric imperfections, the slenderness ratio of the beam, and the embedding position of the SMA wire actuators. The shape recovery force can reduce the thermal expansion of the laminated composite beam, which results in an increase in the critical buckling temperature and a reduction in the lateral deflection during postbuckling. The temperature-load-deflection records show quantitatively how the shape recovery force affects thermal buckling. The cross tangential method is suggested for calculating the critical buckling temperature from the temperature-deflection plot. Based on the experimental analysis, a new formula is also proposed to describe the critical buckling temperature of a laminated composite beam with embedded SMA wire actuators.

Communication Schedule for GEN_BLOCK Redistribution (GEN_BLOCK간 재분산을 위한 통신 스케줄)

  • Yook, Hyun-Gyoo;Park, Myong-Soon
    • Journal of KIISE:Computer Systems and Theory / v.27 no.5 / pp.450-463 / 2000
  • Array redistribution is usually required to enhance algorithm performance in many parallel programs on distributed-memory multicomputers. GEN_BLOCK redistribution, which is redistribution between different GEN_BLOCK distributions, is essential for load balancing. However, prior research on redistribution has focused on regular redistribution, such as redistribution between different CYCLIC(N) distributions. GEN_BLOCK redistribution is very different from regular redistribution: message passing in regular redistribution involves repetitions of basic message-passing patterns, while message passing for GEN_BLOCK redistribution shows locality. This paper proves that two optimality conditions, reducing the number of communication steps and minimizing redistribution size, are essential in GEN_BLOCK redistribution. Additionally, by adding a relocation phase to list scheduling, we obtain an optimal scheduling algorithm for GEN_BLOCK redistribution. To evaluate the performance of the algorithm, we performed experiments on a CRAY T3E. The experiments show that the scheduling algorithm performs better and that the conditions are critical to enhancing the communication speed of GEN_BLOCK redistribution.
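
The locality of GEN_BLOCK message passing can be seen in a small sketch. A GEN_BLOCK distribution gives each processor one contiguous block of arbitrary size, so the messages needed for a redistribution are just the overlaps between source and destination blocks. The code below derives that message list and then groups the remote messages into steps with a plain greedy list scheduler, under the assumption that a processor sends at most one and receives at most one message per step. This greedy pass is only an illustration; it is not the paper's relocation-based algorithm and makes no optimality claim. All function names are hypothetical.

```python
from itertools import accumulate


def gen_block_messages(src_sizes, dst_sizes):
    """Return (src_proc, dst_proc, n_elements) for every block overlap."""
    if sum(src_sizes) != sum(dst_sizes):
        raise ValueError("both distributions must cover the same array")
    src_bounds = list(accumulate(src_sizes))  # exclusive end of each block
    dst_bounds = list(accumulate(dst_sizes))
    messages = []
    i = j = start = 0
    while i < len(src_bounds) and j < len(dst_bounds):
        end = min(src_bounds[i], dst_bounds[j])
        if end > start:
            messages.append((i, j, end - start))
        start = end
        advance_src = src_bounds[i] == end
        advance_dst = dst_bounds[j] == end
        if advance_src:
            i += 1
        if advance_dst:
            j += 1
    return messages


def greedy_schedule(messages):
    """Group remote messages into contention-free steps, largest first."""
    remote = [m for m in messages if m[0] != m[1]]  # skip local copies
    steps = []
    for msg in sorted(remote, key=lambda m: -m[2]):
        src, dst, _ = msg
        for step in steps:
            if all(src != s and dst != d for s, d, _ in step):
                step.append(msg)
                break
        else:
            steps.append([msg])
    return steps
```

For source block sizes `[2, 6, 4]` and destination sizes `[5, 2, 5]`, each processor exchanges data only with processors holding neighbouring index ranges, unlike the repeating all-to-all patterns of CYCLIC(N) redistribution.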

An RDF Ontology Access Control Model based on Relational Database (관계형 데이타베이스 기반의 RDF 온톨로지 접근 제어 모델)

  • Jeong, Dong-Won
    • Journal of KIISE:Databases / v.35 no.2 / pp.155-168 / 2008
  • This paper proposes an RDF Web ontology access control model based on the relational security model. The Semantic Web is recognized as the next-generation Web, and RDF is a Web ontology description language for realizing the Semantic Web. Much effort has gone into RDF, and most research has focused on editors, storage, and inference engines. However, little attention has been given to security, which is one of the most important requirements for information systems. Although several approaches to RDF ontology security have been proposed, they incur the overhead of loading all relevant data into memory and neglect the fact that most ontology stores are being developed on top of relational databases. This paper proposes a novel RDF Web ontology security model based on the relational database to resolve these issues. The proposed security model offers high practicality and usability, and it can easily be made stable owing to the stability of the relational database security model.

Improved Real-time Video Conferencing System with Memory Buffer Control Management (메모리 버퍼 제어 관리 기능을 갖춘 향상된 실시간 영상회의 시스템)

  • Yoo, Woo Jong;Kim, Sang Hyong
    • KIPS Transactions on Computer and Communication Systems / v.6 no.6 / pp.255-260 / 2017
  • The limitation of real-time video conferencing systems is that network and buffering delays are not handled efficiently and user information is not transmitted efficiently between systems, so real-time performance is not fully guaranteed. To overcome this problem, studies on extending the network infrastructure and on jitter delay are actively being carried out, but studies on buffering delay are insufficient. In this paper, we propose a frame-rate control buffer management (FRCB) scheme to solve the problem caused by buffering delay. FRCB prevents buffer overflow and underflow by adopting a two-stage buffer threshold consisting of the Fast-play THreshold (FTH) and the Slow-play THreshold (STH). As a result, it showed better performance than a jitter buffer even under high CPU load and proved suitable for high-quality real-time video conferencing.
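
One way to read the two-threshold idea is sketched below: play faster when the buffer fills past FTH (to avoid overflow), play slower when it drains below STH (to avoid underflow), and play at the nominal rate in between. This reading, the function name, and the specific rate factors are assumptions made for illustration; the abstract does not give the actual control rule or parameter values.

```python
def playback_rate(buffered_frames, fth, sth,
                  base_fps=30.0, speedup=1.25, slowdown=0.75):
    """Pick a playback frame rate from the current buffer occupancy.

    fth: Fast-play THreshold, occupancy at or above which playback speeds up.
    sth: Slow-play THreshold, occupancy at or below which playback slows down.
    base_fps, speedup, and slowdown are hypothetical example values.
    """
    if sth >= fth:
        raise ValueError("STH must be below FTH")
    if buffered_frames >= fth:
        return base_fps * speedup   # drain the buffer before it overflows
    if buffered_frames <= sth:
        return base_fps * slowdown  # let the buffer refill before it empties
    return base_fps
```

A receiver loop would call this once per rendered frame, for example `playback_rate(len(queue), fth=40, sth=10)`, and schedule the next frame after `1 / rate` seconds.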

AE32000B: a Fully Synthesizable 32-Bit Embedded Microprocessor Core

  • Kim, Hyun-Gyu;Jung, Dae-Young;Jung, Hyun-Sup;Choi, Young-Min;Han, Jung-Su;Min, Byung-Gueon;Oh, Hyeong-Cheol
    • ETRI Journal / v.25 no.5 / pp.337-344 / 2003
  • In this paper, we introduce a fully synthesizable 32-bit embedded microprocessor core called the AE32000B. The AE32000B core is based on the extendable instruction set computer architecture, so it has high code density and a low memory access rate. In order to improve the performance of the core, we developed and adopted various design options, including the load extension register instruction (LERI) folding unit, a high performance multiply and accumulate (MAC) unit, various DSP units, and an efficient coprocessor interface. The instructions per cycle count of the Dhrystone 2.1 benchmark for the designed core is about 0.86. We verified the synthesizability and the area and time performances of our design using two CMOS standard cell libraries: a 0.35-${\mu}m$ library and a 0.18-${\mu}m$ library. With the 0.35-${\mu}m$ library, the core can be synthesized with about 47,000 gates and operate at 70 MHz or higher, while it can be synthesized with about 53,000 gates and operate at 120 MHz or higher with the 0.18-${\mu}m$ library.

A 0.8-V Static RAM Macro Design utilizing Dual-Boosted Cell Bias Technique (이중 승압 셀 바이어스 기법을 이용한 0.8-V Static RAM Macro 설계)

  • Shim, Sang-Won;Jung, Sang-Hoon;Chung, Yeon-Bae
    • Journal of the Institute of Electronics Engineers of Korea SD / v.44 no.1 / pp.28-35 / 2007
  • In this paper, an ultra-low-voltage SRAM design method based on a dual-boosted cell bias technique is described. For each read/write cycle, the wordline and the cell power node of the selected SRAM cells are boosted to two different voltage levels. This raises the static noise margin (SNM) to a sufficient level without increasing the cell size, even at a sub-1-V supply voltage. It also improves the SRAM circuit speed owing to the increased cell read-out current. The proposed design technique has been demonstrated through a 0.8-V, 32K-byte SRAM macro design in a 0.18-${\mu}m$ CMOS technology. Compared with the conventional cell bias technique, simulation confirms a 135% enhancement of the cell SNM and a 31% faster speed at a 0.8-V supply voltage. The prototype chip shows an access time of 23 ns and a power dissipation of $125\;{\mu}W/Hz$.