• Title/Summary/Keyword: Processor-sharing

Search Result 112, Processing Time 0.023 seconds

A Fair Scheduling Model Covering the History-Sensitiveness Spectrum (과거민감도 스펙트럼을 포괄하는 공정 스케줄링 모델)

  • Park, Kyeong-Ho;Hwang, Ho-Young;Lee, Chang-Gun;Min, Sangl-Yul
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.34 no.5_6
    • /
    • pp.249-256
    • /
    • 2007
  • GPS(generalized processor sharing) is a fair scheduling scheme that guarantees fair distribution of resources in an instantaneous manner, while virtual clock pursues fairness in the sense of long-term. In this paper, we notice that the degree of memorylessness is the key difference of the two schemes, and propose a unified scheduling model that covers the whole spectrum of history-sensitiveness. In this model, each application's resource right is represented in a value called deposit, which is accumulated at a predefined rate and is consumed for services. The unused deposit, representing non-usage history, gives the application more opportunity to be scheduled, hence relatively enhancing its response time. Decay of the deposit means partial erase of the history and, by adjusting the decaying rate, the degree of history-sensitiveness is controlled. In the spectrum, the memoryless end corresponds GPS and the other end with full history corresponds virtual clock. And there exists a tradeoff between average delay and long-term fairness. We examine the properties of the model by analysis and simulation.

Study of an In-order SMT Architecture and Grouping Schemes

  • Moon, Byung-In;Kim, Moon-Gyung;Hong, In-Pyo;Kim, Ki-Chang;Lee, Yong-Surk
    • International Journal of Control, Automation, and Systems
    • /
    • v.1 no.3
    • /
    • pp.339-350
    • /
    • 2003
  • In this paper, we propose a simultaneous multithreading (SMT) architecture that improves instruction throughput by exploiting instruction level parallelism (ILP) and thread level parallelism (TLP). The proposed architecture issues and completes instructions belonging to the same thread in exact program order. The issue and completion policy greatly reduces the design complexity and hardware cost of our architecture, compared with others that employ out-of-order issue and completion. On the other hand, when the instructions belong to different threads, the issue and completion orders for those instructions may not necessarily be identical to the fetch order. The processor issues instructions simultaneously from multiple threads to functional units by exploiting ILP and TLP, and by dynamic resource sharing. That parallel execution notably improves performance and resource utilization with minimal additional hardware cost over the conventional superscalar processors. This paper proposes an SMT architecture with grouping as well as one without grouping. Without grouping, all threads dynamically and flexibly share most resources. On the other hand, in the SMT architecture with grouping, in which resources and threads are divided into several groups for design simplification, resources are shared only among threads belonging to the same group as those resources. Simulation results show that our processors with four and eight threads improve performance by three or more times over the conventional superscalar processor with comparable execution resources and policies, and that reasonable grouping reduces the design complexity of SMT processors with little negative effect on performance.

An Efficient Hardware Implementation of ARIA Block Cipher Algorithm Supporting Four Modes of Operation and Three Master Key Lengths (4가지 운영모드와 3가지 마스터 키 길이를 지원하는 블록암호 알고리듬 ARIA의 효율적인 하드웨어 구현)

  • Kim, Dong-Hyeon;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.16 no.11
    • /
    • pp.2517-2524
    • /
    • 2012
  • This paper describes an efficient implementation of KS(Korea Standards) block cipher algorithm ARIA. The ARIA crypto-processor supports three master key lengths of 128/192/256-bit and four modes of operation including ECB, CBC, OFB and CTR. A hardware sharing technique, which shares round function in encryption/decryption with key initialization, is employed to reduce hardware complexity. It reduces about 20% of gate counts when compared with straightforward implementation. The ARIA crypto-processor is verified by FPGA implementation, and synthesized with a $0.13-{\mu}m$ CMOS cell library. It has 46,100 gates on an area of $684-{\mu}m{\times}684-{\mu}m$ and the estimated throughput is about 1.28 Gbps at 200 MHz@1.2V.

Sliding-DFT based multi-channel phase measurement FPGA system (Sliding-DFT를 이용한 다채널 위상 측정 FPGA 시스템)

  • Eo, Jin-Woo;Chang, Tae-Gyu
    • Journal of IKEEE
    • /
    • v.8 no.1 s.14
    • /
    • pp.128-135
    • /
    • 2004
  • This paper proposes a phase measurement algorithm which is based on the recursive implementation of sliding-DFT. The algorithm is designed to have a robust behavior against the erroneous factors of frequency drift, additive noise, and twiddle factor approximation. The size of phase error caused by the finite wordlength implementation of DFT twiddle factors is shown significantly lower than that of magnitude error. The drastic reduction of the phase error is achieved by the exploitation of the quadruplet symmetry characteristics of the approximated twiddle factors in the complex plane. Four channel power-line phase measurement system is also designed and implemented based on the time-multiplexed sharing architecture of the proposed algorithm. The operation of the developed system is also verified by the experiment performed under the test environment implemented with the multi-channel function generator and the on-line interfaced host processor system. The proposed algorithm's features of phase measurement accuracy and its robustness against the finite wordlength effects can provide a significant impact especially for the ASIC or microprocessor based embedded system applications where the enhanced processing speed and implementation simplicity are crucial design considerations.

  • PDF

Performance Comparison between Hardware & Software Cache Partitioning Techniques (하드웨어 캐시 파티셔닝과 소프트웨어 캐시 파티셔닝의 성능 비교)

  • Park, JiWoong;Yeom, HeonYoung;Eom, Hyeonsang
    • Journal of KIISE
    • /
    • v.42 no.2
    • /
    • pp.177-182
    • /
    • 2015
  • The era of multi-core processors has begun since the limit of the clock speed has been reached. These days, multi-core technology is used not only in desktops, servers, and table PCs, but also in smartphones. In this architecture, there is always interference between processes, because of the sharing of system resources. To address this problem, cache partitioning is used, which can be roughly divided into two types: software and hardware cache partitioning. When it comes to dynamic cache partitioning, hardware cache partitioning is superior to software cache partitioning, because it needs no page copy. In this paper, we compare the effectiveness of hardware and software cache partitioning on the AMD Opteron 6282 SE, which is the only commodity processor providing hardware cache partitioning, to see whether this technique can be effectively deployed in dynamic environments.

Efficient Interface circuits of Embedded Memory for RISC-based DSP Microprocessor (RICS-based DSP의 효율적인 임베디드 메모리 인터페이스)

  • Kim, You-Jin;Cho, Kyoung-Rok;Kim, Sung-Sik;Cheong, Eui-Seok
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.36C no.9
    • /
    • pp.1-12
    • /
    • 1999
  • In this paper, we designed an embedded processor with 128Kbytes EPROM and 4Kbytes SRAM based on GMS30C2132 which RISC processor with DSP functions. And a new architecture of bus sharing to control the embedded memory and external memory unit i proposed aiming at one-cycle access between memories and CPU. For embedded 128Kbytes EPROM, we designed the new expansion interface for data size at data ordering with memory organization and the efficient interface for test. The embedded SRAM supports an extended stack area high speed DSP operation, instruction cache and variable data-length control which is accessed with 4K modulo addressing schemes. The proposed new architecture and circuits reduced the memory access cycle time from 40ns and improved operation speed 2-times for program benchmark test. The chip is occupied $108.68mm^2$ using $0.6{\mu}m$ CMOS technology.

  • PDF

A Study on Data Recording and Play Method between Tactical Situations to Ensure Data Integrity with Data Link Processor Based on Multiple Data Links (다중데이터링크 기반에서 데이터링크 처리기와의 데이터 무결성 보장을 위한 전술상황전시기 간 데이터 기록 및 재생 방법 연구)

  • Lee, Hyunju;Jung, Eunmi;Lee, Sungwoo;Yeom, Jaegeol;Kim, Sangjun;Park, Jihyeon
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.13 no.2
    • /
    • pp.13-25
    • /
    • 2017
  • Recently, the high performance of tactical situation display console and tactical data links are used to integrate the operational situations in accordance with information age and NCW (Network Centric Warfare). The tendency to maximize the efficiency of task execution has been developed by sharing information and the state of the battle quickly through complex and diverse information exchange. Tactical data link is a communication system that shares the platform with core components of weapons systems and battlefield situation between the command and control systems to perform a Network Centric Warfare and provides a wide range of tactical data required for decision-making and implementation.It provides the tactical information such as tactical information such as operational information, the identification of the peer, and the target location in real time or near real time in the battlefield situation, and it is operated for the exchange of mass tactical information between the intellectuals by providing common situation recognition and cooperation with joint operations. In this study, still image management, audio file management, tactical screen recording and playback using the storage and playback, NITF (National Imagery Transmission Format) message received from the displayer integrates the tactical situation in three dimensions according to multiple data link operation to suggest ways to ensure data integrity between the data link processor during the entire operation time.

Bandwidth Management of WiMAX Systems and Performance Modeling

  • Li, Yue;He, Jian-Hua;Xing, Weixi
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.2 no.2
    • /
    • pp.63-81
    • /
    • 2008
  • WiMAX has been introduced as a competitive alternative for metropolitan broadband wireless access technologies. It is connection oriented and it can provide very high data rates, large service coverage, and flexible quality of services (QoS). Due to the large number of connections and flexible QoS supported by WiMAX, the uplink access in WiMAX networks is very challenging since the medium access control (MAC) protocol must efficiently manage the bandwidth and related channel allocations. In this paper, we propose and investigate a cost-effective WiMAX bandwidth management scheme, named the WiMAX partial sharing scheme (WPSS), in order to provide good QoS while achieving better bandwidth utilization and network throughput. The proposed bandwidth management scheme is compared with a simple but inefficient scheme, named the WiMAX complete sharing scheme (WCPS). A maximum entropy (ME) based analytical model (MEAM) is proposed for the performance evaluation of the two bandwidth management schemes. The reason for using MEAM for the performance evaluation is that MEAM can efficiently model a large-scale system in which the number of stations or connections is generally very high, while the traditional simulation and analytical (e.g., Markov models) approaches cannot perform well due to the high computation complexity. We model the bandwidth management scheme as a queuing network model (QNM) that consists of interacting multiclass queues for different service classes. Closed form expressions for the state and blocking probability distributions are derived for those schemes. Simulation results verify the MEAM numerical results and show that WPSS can significantly improve the network’s performance compared to WCPS.

A Kernel Module to Support High-Performance Intra-Node Communication for Multi-Core Systems (멀티 코어 시스템을 위한 고속 노드내 통신 지원 모듈)

  • Jin, Hyun-Wook;Kang, Hyun-Goo;Kim, Jong-Soon
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.34 no.9
    • /
    • pp.407-415
    • /
    • 2007
  • In parallel cluster computing systems, the efficiency of communication between computing nodes is one of important factors that decide overall system performance. Accordingly, many researchers have studied on high-performance inter-node communication. The recently launched multi-core processor, however. increases the importance of intra-node communication as well because the more the number of cores in a node, the more the number of parallel processes running in the same node. Though there have been studies on intra-node communications, these have limited considerations on the state-of-the-art systems. In this paper, we propose a Linux kernel module that minimizes the number of data copy by exploiting the memory mapping mechanism for high-performance intra-node communication. The proposed kernel module supports the Linux kernel version 2.6. The performance measurements over a multi-core system present that the proposed kernel module can achieve lower latency up to 62% and higher throughput up to 144% than an existing kernel module approach. In addition, the measurements reveal that the performance of intra-node communication can vary significantly based on whether the cores that run the communication processes are belong to the same processor package (i.e., sharing the L2 cache).

Design and Fabrication of Low Power Sensor Network Platform for Ubiquitous Health Care

  • Lee, Young-Dong;Jeong, Do-Un;Chung, Wan-Young
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2005.06a
    • /
    • pp.1826-1829
    • /
    • 2005
  • Recent advancement in wireless communications and electronics has enabled the development of low power sensor network. Wireless sensor network are often used in remote monitoring control applications, health care, security and environmental monitoring. Wireless sensor networks are an emerging technology consisting of small, low-power, and low-cost devices that integrate limited computation, sensing, and radio communication capabilities. Sensor network platform for health care has been designed, fabricated and tested. This system consists of an embedded micro-controller, Radio Frequency (RF) transceiver, power management, I/O expansion, and serial communication (RS-232). The hardware platform uses Atmel ATmega128L 8-bit ultra low power RISC processor with 128KB flash memory as the program memory and 4KB SRAM as the data memory. The radio transceiver (Chipcon CC1000) operates in the ISM band at 433MHz or 916MHz with a maximum data rate of 76.8kbps. Also, the indoor radio range is approximately 20-30m. When many sensors have to communicate with the controller, standard communication interfaces such as Serial Peripheral Interface (SPI) or Integrated Circuit ($I^{2}C$) allow sharing a single communication bus. With its low power, the smallest and low cost design, the wireless sensor network system and wireless sensing electronics to collect health-related information of human vitality and main physiological parameters (ECG, Temperature, Perspiration, Blood Pressure and some more vitality parameters, etc.)

  • PDF