• Title/Summary/Keyword: Memory systems

Search Result 2,153, Processing Time 0.029 seconds

Utilizing Channel Bonding-based M-n and Interval Cache on a Distributed VOD Server (효율적인 분산 VOD 서버를 위한 Channel Bonding 기반 M-VIA 및 인터벌 캐쉬의 활용)

  • Chung, Sang-Hwa;Oh, Soo-Cheol;Yoon, Won-Ju;kim, Hyun-Pil;Choi, Young-In
    • The KIPS Transactions:PartA
    • /
    • v.12A no.7 s.97
    • /
    • pp.627-636
    • /
    • 2005
  • This paper presents a PC cluster-based distributed video on demand (VOD) server that minimizes the load of the interconnection network by adopting channel bonding-based MVIA and the interval cache algorithm Video data is distributed to the disks of each server node of the distributed VOD server and each server node receives the data through the interconnection network and sends it to clients. The load of the interconnection network increases because of the large volume of video data transferred. We adopt two techniques to reduce the load of the interconnection network. First, an Msupporting channel bonding technique is adopted for the interconnection network. n which is a user-level communication protocol that reduces the overhead of the TCP/IP protocol in cluster systems, minimizes the time spent in communicating. We increase the bandwidth of the interconnection network using the channel bonding technique with MThe channel bonding technique expands the bandwidth by sending data concurrently through multiple network cards. Second, the interval cache reduces traffic on the interconnection network by caching the video data transferred from the remote disks in main memory Experiments using the distributed VOD server of this paper showed a maximum performance improvement of $30\%$ compared with a distributed VOD server without channel bonding-based MVIA and the interval cache, when used with a four-node PC cluster.

OpenGL ES 1.1 Implementation Using OpenGL (OpenGL을 이용한 OpenGL ES 1.1 구현)

  • Lee, Hwan-Yong;Baek, Nak-Hoon
    • The KIPS Transactions:PartA
    • /
    • v.16A no.3
    • /
    • pp.159-168
    • /
    • 2009
  • In this paper, we present an efficient way of implementing OpenGL ES 1.1 standard for the environments with hardware-supported OpenGL API, such as desktop PCs. Although OpenGL ES was started from the existing OpenGL features, it becomes a new three-dimensional graphics library customized for embedded systems through introducing fixed-point arithmetic operations, buffer management with fixed-point data type supports, completely new texture mapping functionalities and others. Currently, it is the official three dimensional graphics library for Google Android, Apple iPhone, PlayStation3, etc. In this paper, we achieved improvements on the arithmetic operations for the fixed-point number representation, which is the most characteristic data type for OpenGL ES. For the conversion of fixed-point data types to the floating-point number representations for the underlying OpenGL, we show the way of efficient conversion processes even with satisfying OpenGL ES standard requirements. We also introduced a simple memory management scheme to mange the converted data for the buffer containing fixed-point numbers. In the case of texture processing, the requirements in both standards are quite different and thus we used completely new software-implementations. Our final implementation result of OpenGL ES library provides all of over than 200 functions in OpenGL ES 1.1 standard and completely passed its conformance test, to show its compliance with the standard. From the efficiency viewpoint, we measured its execution times for several OpenGL ES-specific application programs and achieved at most 33.147 times improvements, to become the fastest one among the OpenGL ES implementations in the same category.

A Case Study on the Service Programs at the Presidential Library and Museum (대통령 기록관의 서비스 프로그램 사례 연구)

  • Jo, Min-Ji
    • Journal of Korean Society of Archives and Records Management
    • /
    • v.6 no.2
    • /
    • pp.157-184
    • /
    • 2006
  • Presidential records which have produced during a presidency as a national center are the evidence of the presidency and main historical records. We have the responsibility to establish fundamental systems to produce such main historical records and to manage such main historical records which could help people and history to judge the presidency based upon the evidence of their activities. The historical appraisal could be achieved not by memory but by evidence. A draft of a proposed law on the management of presidential records which includes the establishment of presidential libraries for the presidential records Mecca is being moored at the National Assembly now. The presidential library is to be considered as a multi-functional national institution which is carrying out the role as an Archives, Museums and Center for the education. In addition, it is imperative for a presidential library to provide user-oriented services to enrich the usability and the value of records, recognizing the change of administration paradigm from a supplier-oriented system to a customer-oriented system. This dissertation, in order to develop presidential library service programs focusing on customers rather than the convenience of administration, reviewed programs of the U.S. presidential libraries as a developed case and proposes guidelines and applicable samples for the development of the Korean presidential library service programs.

A High Speed Block Turbo Code Decoding Algorithm and Hardware Architecture Design (고속 블록 터보 코드 복호 알고리즘 및 하드웨어 구조 설계)

  • 유경철;신형식;정윤호;김근회;김재석
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.41 no.7
    • /
    • pp.97-103
    • /
    • 2004
  • In this paper, we propose a high speed block turbo code decoding algorithm and an efficient hardware architecture. The multimedia wireless data communication systems need channel codes which have the high-performance error correcting capabilities. Block turbo codes support variable code rates and packet sizes, and show a high performance due to a soft decision iteration decoding of turbo codes. However, block turbo codes have a long decoding time because of the iteration decoding and a complicated extrinsic information operation. The proposed algorithm using the threshold that represents a channel information reduces the long decoding time. After the threshold is decided by a simulation result, the proposed algorithm eliminates the calculation for the bits which have a good channel information and assigns a high reliability value to the bits. The threshold is decided by the absolute mean and the standard deviation of a LLR(Log Likelihood Ratio) in consideration that the LLR distribution is a gaussian one. Also, the proposed algorithm assigns '1', the highest reliable value, to those bits. The hardware design result using verilog HDL reduces a decoding time about 30% in comparison with conventional algorithm, and includes about 20K logic gate and 32Kbit memory sizes.

The use of SMA wire dampers to enhance the seismic performance of two historical Islamic minarets

  • El-Attar, Adel;Saleh, Ahmed;El-Habbal, Islam;Zaghw, Abdel Hamid;Osman, Ashraf
    • Smart Structures and Systems
    • /
    • v.4 no.2
    • /
    • pp.221-232
    • /
    • 2008
  • This paper represents the final results of a research program sponsored by the European Commission through project WIND-CHIME ($\underline{W}$ide Range Non-$\underline{IN}$trusive $\underline{D}$evices toward $\underline{C}$onservation of $\underline{HI}$storical Monuments in the $\underline{ME}$diterranean Area), in which the possibility of using advanced seismic protection technologies to preserve historical monuments in the Mediterranean area is investigated. In the current research, the dynamic characteristics of two outstanding Mamluk-Style minarets, which similar minarets were reported to experience extensive damage during Dahshur 1992 earthquake, are investigated. The first minaret is the Qusun minaret (1337 A.D, 736 Hijri Date (H.D)) located in El-Suyuti cemetery on the southern side of the Salah El-Din citadel. The minaret is currently separated from the surrounding building and is directly resting on the ground (no vaults underneath). The total height of the minaret is 40.28 meters with a base rectangular shaft of about 5.42 ${\times}$ 5.20 m. The second minaret is the southern minaret of Al-Sultaniya (1340 A.D, 739 H.D). It is located about 30.0 meters from Qusun minaret, and it is now standing alone but it seems that it used to be attached to a huge unidentified structure. The style of the minaret and its size attribute it to the first half of the fourteenth century. The minaret total height is 36.69 meters and has a 4.48 ${\times}$ 4.48 m rectangular base. Field investigations were conducted to obtain: (a) geometrical description of the minarets, (b) material properties of the minarets' stones, and (c) soil conditions at the minarets' location. Ambient vibration tests were performed to determine the modal parameters of the minarets such as natural frequencies and mode shapes. A $1/16^{th}$ scale model of Qusun minaret was constructed at Cairo University Concrete Research Laboratory and tested under free vibration with and without SMA wire dampers. The contribution of SMA wire dampers to the structural damping coefficient was evaluated under different vertical loads and vibration amplitudes. Experimental results were used along with the field investigation data to develop a realistic 3-D finite element model that can be used for seismic risk evaluation of the minarets. Examining the updated finite element models under different seismic excitations indicated the vulnerability of such structures to earthquakes with medium to high a/v ratio. The use of SMA wire dampers was found feasible for reducing the seismic risk for this type of structures.

Selective Skin Tone Reproduction using Preferred Skin Colors (선호 피부색을 사용한 선택적인 피부색 재현 기법)

  • Kim, Dae-Chul;Kyung, Wang-Jun;Ha, Yeong-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.49 no.4
    • /
    • pp.10-15
    • /
    • 2012
  • In a color image, people and especially facial patterns are important and interesting visual objects. Thus, effective skin color reproduction is essential, as skin color is a key memory color in color application systems. Previous studies suggested skin color reproduction by mapping only to the center value of preferred skin region. However, it is not suitable to determine one preference color because preference color from the observer's preference test is not dominant. In this paper, skin color reproduction using multiple preferred skin colors for each race is proposed. The proposed method first defines multiple preferred skin colors for each race according to their luminance level. After that, skin region is detected in an image. The race is then selected by calculating distance between average chromaticity of detected region and that of each racial skin from a database to assign preferred skin color for each race. Next, each corresponding preferred skin color is determined for each selected race. Finally, input skin color is proportionally mapped toward preferred skin color according to the difference between the input skin color and the preferred skin color for a smoothly reproduced skin color. In the experimental results, the proposed method gives better color correction on the objective and subjective evaluation than the previous methods.

Characteristics and Automatic Detection of Block Reference Patterns (블록 참조 패턴의 특성 분석과 자동 발견)

  • Choe, Jong-Mu;Lee, Dong-Hui;No, Sam-Hyeok;Min, Sang-Ryeol;Jo, Yu-Geun
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.26 no.9
    • /
    • pp.1083-1095
    • /
    • 1999
  • 최근 처리기와 입출력 시스템의 속도 차이가 점점 커짐에 따라 버퍼 캐쉬의 효율적인 관리가 더욱 중요해지고 있다. 버퍼 캐쉬는 블록 교체 정책과 선반입 정책에 의해 관리되며, 각 정책은 버퍼 캐쉬에서 블록의 가치 즉 어떤 블록이 더 가까운 미래에 참조될 것인가를 결정해야 한다. 블록의 가치는 응용들의 블록 참조 패턴의 특성에 기반하며, 블록 참조 패턴의 특성에 대한 정확한 분석은 올바른 결정을 가능하게 하여 버퍼 캐쉬의 효율을 높일 수 있다. 본 논문은 각 응용들의 블록 참조 패턴에 대한 특성을 분석하고 이를 자동으로 발견하는 기법을 제안한다. 제안된 기법은 블록의 속성과 미래 참조 거리간의 관계를 이용해 블록 참조 패턴을 발견한다. 이 기법은 2 단계 파이프라인 방법을 이용하여 온라인으로 참조 패턴을 발견할 수 있으며, 참조 패턴의 변화가 발생하면 이를 인식할 수 있다. 본 논문에서는 8개의 실제 응용 트레이스를 이용해 블록 참조 패턴의 발견을 실험하였으며, 제안된 기법이 각 응용의 블록 참조 패턴을 정확히 발견함을 확인하였다. 그리고 발견된 참조 패턴 정보를 블록 교체 정책에 적용해 보았으며, 실험 결과 기존의 대표적인 블록 교체 정책인 LRU에 비해 최대 57%까지 디스크 입출력 횟수를 줄일 수 있었다.Abstract As the speed gap between processors and disks continues to increase, the role of the buffer cache located in main memory is becoming increasingly important. The buffer cache is managed by block replacement policies and prefetching policies and each policy should decide the value of block, that is which block will be accessed in the near future. The value of block is based on the characteristics of block reference patterns of applications, hence accurate characterization of block reference patterns may improve the performance of the buffer cache. In this paper, we study the characteristics of block reference behavior of applications and propose a scheme that automatically detects the block reference patterns. The detection is made by associating block attributes of a block with the forward distance of the block. With the periodic detection using a two-stage pipeline technique, the scheme can make on-line detection of block reference patterns and monitor the changes of block reference patterns. We measured the detection capability of the proposed scheme using 8 real workload traces and found that the scheme accurately detects the block reference patterns of applications. Also, we apply the detected block reference patterns into the block replacement policy and show that replacement policies appropriate for the detected block reference patterns decreases the number of DISK I/Os by up to 57%, compared with the traditional LRU policy.

Implementation of a TCP/IP Offload Engine Using Lightweight TCP/IP on an Embedded System (임베디드 시스템상에서 Lightweight TCP/IP를 이용한 TCP/IP Offload Engine의 구현)

  • Yoon In-Su;Chung Sang-Hwa;Choi Bong-Sik;Jun Yong-Tae
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.33 no.7
    • /
    • pp.413-420
    • /
    • 2006
  • The speed of present-day network technology exceeds a gigabit and is developing rapidly. When using TCP/IP in these high-speed networks, a high load is incurred in processing TCP/IP protocol in a host CPU. To solve this problem, research has been carried out into TCP/IP Offload Engine (TOE). The TOE processes TCP/IP on a network adapter instead of using a host CPU; this reduces the processing burden on the host CPU. In this paper, we developed two software-based TOEs. One is the TOE implementation using an embedded Linux. The other is the TOE implementation using Lightweight TCP/IP (lwIP). The TOE using an embedded Linux did not have the bandwidth more than 62Mbps. To overcome the poor performance of the TOE using an embedded Linux, we ported the lwIP to the embedded system and enhanced the lwIP for the high performance. We eliminated the memory copy overhead of the lwIP. We added a delayed ACK and a TCP Segmentation Offload (TSO) features to the lwIP and modified the default parameters of the lwIP for large data transfer. With the aid of these modifications, the TOE using the modified lwIP shows a bandwidth of 194 Mbps.

A Study on the Prediction Accuracy Bounds of Instruction Prefetching (명령어 선인출 예측 정확도의 한계에 관한 연구)

  • Kim, Seong-Baeg;Min, Sang-Lyul;Kim, Chong-Sang
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.27 no.8
    • /
    • pp.719-729
    • /
    • 2000
  • Prefetching aims at reducing memory latency by fetching, in advance, data that are likely to be requested by the processor in a near future. The effectiveness of prefetching is determined by how accurate the prediction on the needed instructions and data is. Most previous studies on prefetching were limited to proposing a particular prefetch scheme and its performance evaluation, paying little attention to theoretical aspects of prefetching. This paper focuses on the theoretical aspects of instruction prefetching. For this purpose, we propose a clairvoyant prefetch model that makes use of perfect history information. Based on this theoretical model, we analyzed upper limits on the prefetch prediction accuracies of the SPEC benchmarks. The results show that the prefetch prediction accuracy is very high when there is no cache. However, as the size of the instruction cache increases, the prefetch prediction accuracy drops drastically. For example, in the case of the spice benchmark, the prefetch prediction accuracy drops from 53% to 39% when the cache size increases from 2Kbyte to 16Kbyte (assuming 16byte block size). These results indicate that as the cache size increases, most localities are captured by the cache and that instruction prefetching based on the information extracted from the references that missed in the cache suffers from prediction inaccuracies

  • PDF

Implementation of the AMBA AXI4 Bus interface for effective data transaction and optimized hardware design (효율적인 데이터 전송과 하드웨어 최적화를 위한 AMBA AXI4 BUS Interface 구현)

  • Kim, Hyeon-Wook;Kim, Geun-Jun;Jo, Gi-Ppeum;Kang, Bong-Soon
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.15 no.2
    • /
    • pp.70-75
    • /
    • 2014
  • Recently, the demand for high-integrated, low-powered, and high-powered SoC design has been increasing due to the multi-functionality and the miniaturization of digital devices and the high capacity of service informations. With the rapid evolution of the system, the required hardware performances have become diversified, the FPGA system has been increasingly adopted for the rapid verification, and SoC system using the FPGA and the ARM core for control has been growingly chosen. While the AXI bus is used in these kinds of systems in various ways, it is traditionally designed with AXI slave structure. In slave structure, there are problems with the CPU resources because CPU is continually involved in the data transfer and can't be used in other jobs, and with the decreased transmission efficiency because the time not used of AXI bus beomes longer. In this paper, an efficient AXI master interface is proposed to solve this problem. The simulation results show that the proposed system achieves reductions in the consumption clock by an average of 51.99% and in the slice by 31% and that the maximum operating frequency is increased to 107.84MHz by about 140%.