• Title/Summary/Keyword: processor sharing


THE INFLUENCE OF THE TIME SLICING OF A PROCESSOR SHARING COMMUNICATION MODEL

  • LIM JONG SEUL;PARK CHIN HONG;AHN SEONG JOON
    • Journal of applied mathematics & informatics
    • /
    • v.17 no.1_2_3
    • /
    • pp.737-746
    • /
    • 2005
  • Average memory occupancy and congestion in a computer system or communication system may be reduced further if new jobs are admitted only when the number of jobs queued at the CPU is below a certain threshold, the run queue cutoff (RQ). In our previous paper we showed that the response time of a job is invariant with respect to RQ if jobs do not communicate with each other. In this paper, we prove the invariance property by considering the evolution of the queue lengths as point processes. We also present an approximate method for the delay due to context switching under time slicing.
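
    The admission rule described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' model: the class name RunQueueCutoff, its method names, and the reading of "below a certain threshold" as strictly fewer than RQ jobs are all assumptions made for the example.

```python
from collections import deque


class RunQueueCutoff:
    """Hypothetical sketch: admit jobs to the CPU queue only while it holds fewer than rq jobs."""

    def __init__(self, rq):
        self.rq = rq              # run queue cutoff (assumed to be a strict upper bound)
        self.cpu_queue = []       # jobs currently sharing the CPU
        self.waiting = deque()    # jobs held back outside the CPU queue

    def arrive(self, job):
        """A new job arrives; it waits outside unless the cutoff allows admission."""
        self.waiting.append(job)
        self._admit()

    def complete(self, job):
        """A job finishes service, which may free room for a waiting job."""
        self.cpu_queue.remove(job)
        self._admit()

    def _admit(self):
        # Move waiting jobs in, oldest first, while the CPU queue is below the cutoff.
        while self.waiting and len(self.cpu_queue) < self.rq:
            self.cpu_queue.append(self.waiting.popleft())


system = RunQueueCutoff(rq=2)
for name in ("a", "b", "c"):
    system.arrive(name)
print(system.cpu_queue, list(system.waiting))
system.complete("a")
print(system.cpu_queue, list(system.waiting))
```

    With a cutoff of 2, the third job stays outside the CPU queue until one of the first two completes.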

Excess Bandwidth Fair Queueing Using Excess Bandwidth Consumer Queue (잉여 대역폭 소비 큐를 이용한 잉여 대역폭 페어 큐잉)

  • 추호철;김영한
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.39 no.10
    • /
    • pp.1-10
    • /
    • 2002
  • Scheduling technology is one of the most important elements required to support QoS (quality of service) in the Internet, and many scheduling algorithms have been developed. However, most of these algorithms are not flexible in how they distribute excess bandwidth. To address this weakness of existing algorithms, DGPS (decoupled generalized processor sharing) has recently been suggested. However, the DGPS algorithm is complex to implement and difficult to apply to existing algorithms. In this paper, we propose a scheduling algorithm for the distribution of excess bandwidth that reduces the implementation complexity of DGPS and is easy to apply to ordinary algorithms.

A VLSI implementation of image processor for facsimile and digital copier (팩시밀리 및 디지털 복사기를 위한 고속 영상 처리기의 VLSI구현)

  • 박창대;정영훈;김형수;김진수;권오준;홍기상;장동구;박기용;김윤수
    • Journal of the Korean Institute of Telematics and Electronics S
    • /
    • v.35S no.1
    • /
    • pp.105-113
    • /
    • 1998
  • A new image processor is implemented for high-speed digital copiers and facsimiles. The image processor performs CCD and CIS interfacing, pre-processing, enlargement and reduction of gray-level images, and various halftoning algorithms. The implemented halftoning algorithms are simple thresholding, fuzzy-based mixed-mode thresholding, dithering, and edge-enhanced error diffusion. The result of binarization is transferred to a printer through serial or parallel output ports. A line-by-line pipelined data processing architecture is employed with time-sharing access to the external memory. In receiving mode, it converts the resolution of the received binary image for compatibility with conventional facsimile. In copy mode, a line of A3 paper at 400 dpi is processed within 2.5 ms. The prototype of the image processor was implemented using a Laser Programmable Gate Array (LPGA) with 0.8 µm technology.


Memory Allocation Scheme for Reducing False Sharing on Multiprocessor Systems (다중처리기 시스템에서 거짓 공유 완화를 위한 메모리 할당 기법)

  • Han, Boo-Hyung;Cho, Seong-Je
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.27 no.4
    • /
    • pp.383-393
    • /
    • 2000
  • In shared memory multiprocessor systems, false sharing occurs when several independent data objects, not shared but accessed by different processors, are allocated to the same coherency unit of memory. False sharing is one of the major factors that may degrade the performance of memory coherency protocols. This paper presents a new shared memory allocation scheme to reduce false sharing in parallel applications where a master processor controls the allocation of all shared objects. Our scheme first allocates the objects to a temporary address space, and later places each object in the address space of the processor that first accesses it. Its goal is to allocate independent objects that may have different access patterns to different pages. We use execution-driven simulation of real parallel applications to evaluate the effectiveness of our scheme. Experimental results show that our scheme removes a considerable amount of false sharing faults with low overhead.


A Cumulative Fair Service Model in Single Server (단일서버에서의 누적적 공정서비스 모델)

  • Lee Ju-Hyun;Park Kyeong-Ho;Hwang Ho-Young;Min Sang-Lyul
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.33 no.9
    • /
    • pp.585-591
    • /
    • 2006
  • The Generalized Processor Sharing (GPS) model provides instantaneous fair service to currently backlogged sessions. Since this fair service distributes server capacity to backlogged sessions in proportion to their weights, the fairness is valid only between sessions serviced at the same time. Over the long term, however, this fair service gives different server capacity to different sessions, even if these sessions have the same weights. This paper proposes a cumulative fair service (CFS) model to provide fair server capacity to all sessions over the long term. This model provides fair service from the session viewpoint because it distributes server capacity in proportion to the weights of sessions. The model and an algorithm referencing that model are analyzed for their properties and performance. Performance evaluations verify that the proposed algorithm provides proportional service capacity to sessions over the long term.
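
    The instantaneous GPS allocation described in this abstract, where capacity is split among backlogged sessions in proportion to their weights, can be sketched as follows. The function name gps_rates and its parameters are hypothetical, and the sketch covers only the standard GPS rule, not the proposed CFS model.

```python
def gps_rates(weights, backlogged, capacity=1.0):
    """Return the instantaneous GPS service rate of every session.

    weights: dict mapping session name to its weight (illustrative names).
    backlogged: set of sessions that currently have work queued.
    capacity: total server capacity to distribute.
    """
    total = sum(weights[s] for s in backlogged)
    rates = {}
    for session, weight in weights.items():
        if session in backlogged and total > 0:
            # Each backlogged session gets capacity in proportion to its weight.
            rates[session] = capacity * weight / total
        else:
            # Idle sessions receive nothing at this instant.
            rates[session] = 0.0
    return rates


weights = {"a": 1, "b": 1, "c": 2}
print(gps_rates(weights, {"a", "b"}))   # a competes with an equal-weight session
print(gps_rates(weights, {"a", "c"}))   # a competes with a heavier session
```

    Session a receives a different rate in the two calls even though its weight is unchanged, which illustrates why GPS fairness holds only among sessions served at the same time.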

A Comparative Performance Study for Compute Node Sharing

  • Park, Jeho;Lam, Shui F.
    • Journal of Computing Science and Engineering
    • /
    • v.6 no.4
    • /
    • pp.287-293
    • /
    • 2012
  • We introduce a methodology for studying the application-level performance of time-sharing parallel jobs on a set of compute nodes in high performance clusters and report our findings. We assume that parallel jobs arriving at a cluster need to share a set of nodes with the jobs of other users, in that they must compete for processor time in a time-sharing manner and for other limited resources such as memory and I/O in a space-sharing manner. Under this assumption, we developed a methodology to simulate job arrivals at a set of compute nodes, and to gather and process performance data to calculate the percentage slowdown of parallel jobs. Our goal in this study is to identify a better combination of jobs that minimizes performance degradation due to resource sharing and contention. Through our experiments, we found a couple of interesting behaviors of overlapped parallel jobs, which may be used to suggest alternative job allocation schemes aiming to reduce the slowdowns that inevitably result from resource sharing on a high performance computing cluster. We suggest three job allocation strategies based on our empirical results and propose further studies of the results using a supercomputing facility at the San Diego Supercomputing Center.
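
    The abstract does not define how percentage slowdown is computed. A common definition, assumed here purely for illustration, compares the run time of a job on shared nodes with its run time on dedicated nodes; the function name and parameters are hypothetical.

```python
def percentage_slowdown(shared_time, dedicated_time):
    """Percentage slowdown of a job when sharing nodes (assumed definition).

    shared_time: run time when the job shares nodes with other jobs.
    dedicated_time: run time when the job has the nodes to itself.
    """
    if dedicated_time <= 0:
        raise ValueError("dedicated_time must be positive")
    # Relative increase in run time, expressed as a percentage.
    return 100.0 * (shared_time - dedicated_time) / dedicated_time


# A job that takes 100 s alone and 150 s when sharing is slowed down by 50%.
print(percentage_slowdown(150.0, 100.0))
```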

Design and Implementation of Multi-mode Sensor Signal Processor on FPGA Device (다중모드 센서 신호 처리 프로세서의 FPGA 기반 설계 및 구현)

  • Soongyu Kang;Yunho Jung
    • Journal of Sensor Science and Technology
    • /
    • v.32 no.4
    • /
    • pp.246-251
    • /
    • 2023
  • Internet of Things (IoT) systems process signals from various sensors using signal processing algorithms suited to the signal characteristics. To analyze complex signals, these systems usually use signal processing algorithms in the frequency domain, such as the fast Fourier transform (FFT), filtering, and the short-time Fourier transform (STFT). In this study, we propose a multi-mode sensor signal processor (SSP) accelerator with an FFT-based hardware design. The FFT processor in the proposed SSP is designed with a radix-2 single-path delay feedback (R2SDF) pipeline architecture for high-speed operation. Moreover, based on this FFT processor, the proposed SSP can perform filtering and STFT operations. The proposed SSP is implemented on a field-programmable gate array (FPGA). By sharing the FFT processor among the algorithms, the required hardware resources are significantly reduced. The proposed SSP is implemented and verified on Xilinx's Zynq Ultrascale+ MPSoC ZCU104 with 53,591 look-up tables (LUTs), 71,451 flip-flops (FFs), and 44 digital signal processors (DSPs). The FFT, filtering, and STFT algorithm implementations on the proposed SSP achieve 185x average acceleration.
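
    The idea of sharing one FFT among the FFT, filtering, and STFT modes can be sketched in software as follows. This is only an illustration under stated assumptions: it uses a recursive radix-2 decimation-in-time FFT rather than the R2SDF pipeline of the paper, and the function names fft, ifft, fft_filter, and stft are hypothetical.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(spectrum):
    """Inverse FFT built on the same fft() via conjugation."""
    n = len(spectrum)
    y = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in y]


def fft_filter(x, freq_response):
    """Filtering mode: multiply the spectrum by a frequency response, then invert."""
    spectrum = fft(x)
    return ifft([a * b for a, b in zip(spectrum, freq_response)])


def stft(x, frame, hop):
    """STFT mode: apply the shared fft() to successive frames (no window, for brevity)."""
    return [fft(x[i:i + frame]) for i in range(0, len(x) - frame + 1, hop)]


signal = [1, 0, 0, 0, 0, 0, 0, 0]
print(len(stft(signal, frame=4, hop=2)))
```

    All three modes call the same fft() routine, which mirrors in software the resource sharing the abstract describes for hardware.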

Design of Processor Duplication using Extend Warm standby sharing (Warm standby sharing을 이용한 프로세서 이중화의 설계)

  • Goo, Jung-Du
    • Proceedings of the KAIS Fall Conference
    • /
    • 2010.05a
    • /
    • pp.336-338
    • /
    • 2010
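  • In a mobile communication system, the MCP of the RNC is responsible for call processing and requires reliability and real-time performance. Although the MCP is implemented to be highly robust, some fault rate is unavoidable, so the processor must be duplicated so that the standby processor can provide continuous service even if the active processor fails. Compared with warm standby sharing, hot standby sharing has many advantages, such as no data loss and no propagation of erroneous data, but it is difficult to implement in a real system because of synchronization problems. This study therefore aims to achieve ease of practical implementation and improved performance by building on the advantages of synchronization while improving on the problems of data loss and the propagation of false data.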


KAWS: Coordinate Kernel-Aware Warp Scheduling and Warp Sharing Mechanism for Advanced GPUs

  • Vo, Viet Tan;Kim, Cheol Hong
    • Journal of Information Processing Systems
    • /
    • v.17 no.6
    • /
    • pp.1157-1169
    • /
    • 2021
  • Modern graphics processing unit (GPU) architectures offer significant hardware resource enhancements for parallel computing. However, without software optimization, GPUs continuously exhibit hardware resource underutilization. In this paper, we show the need to switch between warp scheduling schemes during different kernel execution periods to improve resource utilization. Existing warp schedulers are not aware of kernel progress and so cannot provide an effective scheduling policy. In addition, we identified the potential for improving resource utilization in multiple-warp-scheduler GPUs by sharing stalling warps with selected warp schedulers. To address the efficiency issue of present GPUs, we coordinated a kernel-aware warp scheduler and a warp sharing mechanism (KAWS). The proposed warp scheduler tracks the execution progress of the running kernel and adapts to a more effective scheduling policy when the kernel progress reaches a point of resource underutilization. Meanwhile, the warp-sharing mechanism distributes stalling warps to different warp schedulers whose execution pipeline units are ready. Our design achieves performance that is on average 7.97% higher than that of the traditional warp scheduler, with marginal additional hardware overhead.

Design and Implementation Systolic Array FFT Processor Based on Shared Memory (공유 메모리 기반 시스토릭 어레이 FFT 프로세서 설계 및 구현)

  • Jeong, Dongmin;Roh, yunseok;Son, Hanna;Jung, Yongchul;Jung, Yunho
    • Journal of IKEEE
    • /
    • v.24 no.3
    • /
    • pp.797-802
    • /
    • 2020
  • In this paper, we present the design and implementation results of an FFT processor that supports 4096-point operation with less memory by merging the several memories used in the base-4 systolic array FFT processor into one shared memory. Sharing memory reduces the area and also simplifies the data flow, since data I/O proceeds through a single memory. The presented FFT processor was implemented and verified on an FPGA device. The implementation used 51,855 CLB LUTs, 29,712 CLB registers, 8 block RAM tiles, and 450 DSPs, and confirmed that the memory area could be reduced by 65% compared to the existing base-4 systolic array structure.