• Title/Summary/Keyword: shared memory multiprocessor

A Functional Design of Programmable Logic Controller Based on Parallel Architecture (병렬 구조에 의한 가변 논리제어장치의 기능적 설계)

  • 이정훈;신현식
    • The Transactions of the Korean Institute of Electrical Engineers
    • /
    • v.40 no.8
    • /
    • pp.836-844
    • /
    • 1991
  • A PLC (programmable logic controller) system is widely used for factory control. The PLC receives a ladder diagram, drawn by the user to implement the desired hardware logic, converts the ladder diagram into a sequence program executable on the PLC, and executes that sequence program indefinitely unless the user stops it. The sequence program processes on/off signal data and tolerates a one-scan delay as well as the loss of pulse-type signals shorter than one scan time, so no data dependency exists. By applying this characteristic to a multiprocessor architecture, we functionally design a parallel PLC and evaluate its performance improvement. The parallel PLC consists of a central processing module, N general processing units, and a shared memory, organized in master-slave fashion. Each module executes its allocated sequence program under the control of the central processing module. We can expect improved performance from parallel processing and improved reliability from relocating the sequence program when an error occurs in a processing module.
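
As a rough illustration only, not code from the paper: the C sketch below mimics the idea that dependency-free rungs of the compiled sequence program can be scanned in parallel by N processing units sharing an I/O image, with a barrier standing in for the central module's scan control. All names and sizes (NUM_UNITS, NUM_RUNGS, evaluate_rung) are my own assumptions.

    /* Parallel scan sketch; illustrative only.  Build: cc -pthread plc_sketch.c */
    #include <pthread.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define NUM_UNITS 4      /* N general processing units                  */
    #define NUM_RUNGS 64     /* rungs of the compiled sequence program      */
    #define NUM_SCANS 1000   /* fixed number of scan cycles for the demo    */

    static bool input_image[NUM_RUNGS];   /* shared memory: sampled inputs   */
    static bool output_image[NUM_RUNGS];  /* shared memory: computed outputs */
    static pthread_barrier_t scan_barrier;

    /* Rungs are independent (no data dependency), so each unit can evaluate
     * its slice without synchronizing with the other units inside a scan.   */
    static void evaluate_rung(int r)
    {
        output_image[r] = input_image[r];  /* stand-in for the rung's logic  */
    }

    static void *processing_unit(void *arg)
    {
        int id = (int)(long)arg;
        int lo = id * NUM_RUNGS / NUM_UNITS;
        int hi = (id + 1) * NUM_RUNGS / NUM_UNITS;

        for (int scan = 0; scan < NUM_SCANS; scan++) {
            for (int r = lo; r < hi; r++)
                evaluate_rung(r);
            /* The central module starts the next scan only after every unit
             * has finished; a barrier plays that role here.                 */
            pthread_barrier_wait(&scan_barrier);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t units[NUM_UNITS];

        pthread_barrier_init(&scan_barrier, NULL, NUM_UNITS);
        for (long i = 0; i < NUM_UNITS; i++)
            pthread_create(&units[i], NULL, processing_unit, (void *)i);
        for (int i = 0; i < NUM_UNITS; i++)
            pthread_join(units[i], NULL);
        pthread_barrier_destroy(&scan_barrier);

        printf("completed %d parallel scan cycles\n", NUM_SCANS);
        return 0;
    }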

Implementation of a block transfer protocol for a pipelined bus (파이프라인드 버스에서 블록 전송 방법의 구현)

  • 한종석;심원세;기안도;윤석한
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.33B no.9
    • /
    • pp.70-79
    • /
    • 1996
  • Block data transfer poses a serious problem in a pipelined bus, where each data transfer step is pipelined. In this paper, we describe the design and implementation of a variable-size block transfer protocol for the pipelined bus of a shared-memory multiprocessor. The proposed method maintains compatibility with the existing protocol for the pipelined bus and ensures fairness and effectiveness by preventing starvation. We present flow charts of the requester and the responder during a block transfer on a pipelined bus that uses the proposed protocol. The proposed protocol was implemented for the TICOM-III HiPi+Bus.
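
The abstract gives no protocol details beyond what is quoted above, so the C sketch below only illustrates the general shape of the idea: a variable-length block is streamed as a series of pipelined beats, and pending requests are served in FIFO order, which is one simple way to prevent starvation. The beat size, data structures, and names are my assumptions, not the HiPi+Bus protocol.

    /* Toy model of variable-length block transfer on a pipelined bus.      */
    #include <stdio.h>

    #define BEAT_WORDS 4                  /* words moved per pipelined beat */

    struct block_request {
        int requester;                    /* requesting processor id        */
        int words;                        /* variable block length in words */
    };

    /* Responder side: stream the block as a series of pipelined beats.     */
    static void transfer_block(const struct block_request *req)
    {
        int sent = 0;
        while (sent < req->words) {
            int beat = req->words - sent < BEAT_WORDS ? req->words - sent
                                                      : BEAT_WORDS;
            printf("responder -> requester %d: words %d..%d\n",
                   req->requester, sent, sent + beat - 1);
            sent += beat;
        }
    }

    int main(void)
    {
        /* Requests are served strictly in arrival (FIFO) order, so no
         * requester can be starved by later arrivals.                      */
        struct block_request pending[] = { {0, 10}, {1, 3}, {2, 7} };
        int n = (int)(sizeof pending / sizeof pending[0]);

        for (int i = 0; i < n; i++)
            transfer_block(&pending[i]);
        return 0;
    }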

Design and Simulation of Interconnection Network Based on Topological Combination (위상 결합을 기반으로 한 연결 망 설계 및 시뮬레이션)

  • 장창수;최창훈
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.6B
    • /
    • pp.563-574
    • /
    • 2004
  • In this paper, we propose a new class of MIN (Multistage Interconnection Network), called Combine MIN, which combines a static network topology with a dynamic network topology. Combine MIN provides multiple paths at a hardware cost lower than that of a MIN with the unique-path property. Combine MIN can be constructed to suit localized communication by providing a shortcut path and multiple paths inside a processor-memory cluster that has frequent data communication. According to the analysis and simulation results for performance evaluation, Combine MIN shows higher performance than MINs of the same network size under highly localized communication. Therefore, Combine MIN can be used as an attractive interconnection network for parallel applications with a localized communication pattern in shared-memory multiprocessor systems.
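
As a hedged illustration of the routing idea the abstract implies, the C sketch below takes a one-hop shortcut when source and destination sit in the same processor-memory cluster and otherwise routes through the multistage network. The cluster size, hop count, and names are invented for the example, not taken from the paper.

    /* Illustrative routing choice for a topology combining intra-cluster
     * shortcut links with a multistage network.                            */
    #include <stdio.h>

    #define CLUSTER_SIZE 4   /* processors/memories grouped per cluster     */
    #define STAGE_HOPS   3   /* hops through the multistage network         */

    static int cluster_of(int node) { return node / CLUSTER_SIZE; }

    /* Number of hops a message takes under this simple policy.             */
    static int route_hops(int src, int dst)
    {
        if (cluster_of(src) == cluster_of(dst))
            return 1;           /* local shortcut link inside the cluster   */
        return STAGE_HOPS;      /* one of the multiple multistage paths     */
    }

    int main(void)
    {
        printf("0 -> 2 : %d hop(s)\n", route_hops(0, 2));  /* same cluster  */
        printf("0 -> 9 : %d hop(s)\n", route_hops(0, 9));  /* cross cluster */
        return 0;
    }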

Cache Performance Analysis of Multiprocessor Systems for OLTP Applications based on a Memory-Resident DBMS (메모리 상주 DBMS 기반의 OLTP 응용을 위한 다중프로세서 시스템 캐쉬 성능 분석)

  • Chung, Yong-Wha;Hahn, Woo-Jong;Yoon, Suk-Han;Park, Jin-Won;Lee, Kang-Woo;Kim, Yang-Woo
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.6 no.4
    • /
    • pp.383-392
    • /
    • 2000
  • Currently, multiprocessors are evaluated almost exclusively with scientific applications. Commercial applications are rarely explored because it is difficult to obtain the source code of commercial DBMSs. Even when the source code is available, as for POSTGRES, understanding it well enough to perform detailed, meaningful performance evaluations is a daunting task for computer architects. To evaluate multiprocessors with commercial applications, we have developed our own DBMS, called EZDB. EZDB is a parallelized DBMS, loosely inspired by POSTGRES, running on top of a software architecture simulator. It is capable of executing parallel programs written in SQL. Unlike POSTGRES, EZDB is not intended as a prototype for a production-quality DBMS; its purpose is to make it easy to run commercial applications on multiprocessor architectures and evaluate their performance. To illustrate the usefulness of EZDB, we show the cache performance data collected for the TPC-B benchmark on a shared-memory multiprocessor. The simulation results showed that the data structures exhibited unique sharing characteristics and that their locality properties and working sets were very different from those in scientific applications.

Proposition and Evaluation of Parallelism-Independent Scheduling Algorithms for DAGs of Tasks with Non-Uniform Execution Time

  • Kirilka Nikolova;Atusi Maeda;Sowa, Masa-Hiro
    • Proceedings of the IEEK Conference
    • /
    • 2000.07a
    • /
    • pp.289-293
    • /
    • 2000
  • We propose two new algorithms for parallelism-independent scheduling. The machine code generated by a compiler that uses these algorithms in its scheduling phase is parallelism-independent code, executable in minimum time regardless of the number of processors in the parallel computer. Our new algorithms have the following phases: finding the minimum number of processors on which the program can be executed in minimal time, scheduling with a heuristic algorithm for this predefined number of processors, and serializing the parallel schedule according to the earliest start times of the tasks. At run time, tasks are taken from the serialized schedule and assigned to the processor that allows the earliest start time of the task. The order of the tasks decided at compile time is not changed at run time regardless of the number of available processors, which means there is no out-of-order issue and execution. The scheduling is done predominantly at compile time, and dynamic scheduling is reduced to the allocation of tasks to processors. We evaluate the proposed algorithms by comparing them, in terms of schedule length, to the CP/MISF algorithm. For the performance evaluation we use both randomly generated DAGs (directed acyclic graphs) and DAGs representing real applications. From a practical point of view, the algorithms we propose can be used successfully for scheduling programs for in-order superscalar processors and shared-memory multiprocessor systems. Superscalar processors with any number of functional units can execute the parallelism-independent code in minimum time without the need for dynamic scheduling and out-of-order issue hardware. This means that using our algorithms reduces both the complexity of the processor hardware and the run-time overhead related to dynamic scheduling.
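
A minimal C sketch of the compile-time/run-time split described above: a toy DAG is list-scheduled by earliest start time, the schedule is serialized by start time, and that serialized order is then replayed unchanged on different processor counts, with run-time work reduced to picking the processor that becomes free first. The DAG, task costs, and all names are illustrative assumptions, not the paper's algorithms or benchmarks.

    /* Parallelism-independent scheduling sketch (illustrative only).       */
    #include <stdio.h>

    #define T    6                      /* number of tasks                  */
    #define MAXP 16

    static const int cost[T]    = { 2, 3, 1, 4, 2, 3 };
    static const int npred[T]   = { 0, 1, 1, 1, 2, 1 };
    static const int pred[T][2] = { {0,0}, {0,0}, {0,0}, {1,0}, {1,2}, {3,0} };
    /* edges: 0->1, 0->2, 1->3, 1->4, 2->4, 3->5; tasks are listed in
     * topological order, and the serialized order preserves that property. */

    static int start_t[T], finish_t[T], order[T];

    /* Run the given task order on p processors, always choosing the
     * processor that becomes free first; returns the makespan.             */
    static int replay(const int *ord, int p)
    {
        int free_at[MAXP] = {0};
        int makespan = 0;

        for (int k = 0; k < T; k++) {
            int t = ord[k], ready = 0, q = 0;
            for (int j = 0; j < npred[t]; j++)       /* wait for parents    */
                if (finish_t[pred[t][j]] > ready)
                    ready = finish_t[pred[t][j]];
            for (int i = 1; i < p; i++)              /* earliest free CPU   */
                if (free_at[i] < free_at[q])
                    q = i;
            start_t[t]  = ready > free_at[q] ? ready : free_at[q];
            finish_t[t] = start_t[t] + cost[t];
            free_at[q]  = finish_t[t];
            if (finish_t[t] > makespan)
                makespan = finish_t[t];
        }
        return makespan;
    }

    int main(void)
    {
        int topo[T] = { 0, 1, 2, 3, 4, 5 };

        /* Compile time: schedule on a chosen processor count, then
         * serialize the tasks by their scheduled start times.              */
        replay(topo, 3);
        for (int i = 0; i < T; i++)
            order[i] = i;
        for (int i = 1; i < T; i++)                  /* insertion sort      */
            for (int j = i; j > 0 && start_t[order[j-1]] > start_t[order[j]]; j--) {
                int tmp = order[j]; order[j] = order[j-1]; order[j-1] = tmp;
            }

        /* "Run time": the serialized order stays fixed; only the mapping of
         * tasks to processors depends on how many processors exist.        */
        for (int p = 1; p <= 4; p++)
            printf("%d processor(s): makespan %d\n", p, replay(order, p));
        return 0;
    }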

Parallel Sorting Algorithm by Median-Median (중위수의 중위수에 의한 병렬 분류 알고리즘)

  • Min, Yong-Sik
    • The Journal of the Acoustical Society of Korea
    • /
    • v.14 no.1E
    • /
    • pp.14-21
    • /
    • 1995
  • This paper presents a parallel sorting algorithm suitable for SIMD multiprocessors. The algorithm finds pivots for partitioning the data into ordered subsets. Because it uses probability theory, the data can be evenly distributed among the processors for sorting. For n data elements to be sorted on p processors, when $n{\geq}p^2$, the algorithm is shown to be asymptotically optimal. In practice, sorting 8 million data items on 64 processors achieved a 48.43-fold speedup, whereas PSRS achieved a 44.4-fold speedup. On a variety of shared- and distributed-memory machines, the algorithm achieved better than half-linear speedups.
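
As a rough sketch of the pivot-selection idea only, not the paper's exact probability-based method: each simulated processor sorts its local chunk and reports its median, and the sorted medians then serve as the pivots that bound the buckets of the final partition. Chunk sizes, the use of qsort, and all names are my own choices.

    /* Median-of-chunk-medians pivot selection (illustrative only).         */
    #include <stdio.h>
    #include <stdlib.h>

    #define P 4                        /* simulated processors              */
    #define N 32                       /* total elements (divisible by P)   */

    static int cmp_int(const void *a, const void *b)
    {
        return *(const int *)a - *(const int *)b;
    }

    int main(void)
    {
        int data[N], medians[P];

        srand(7);
        for (int i = 0; i < N; i++)
            data[i] = rand() % 100;

        /* Phase 1: each processor sorts its chunk and takes its median.    */
        for (int p = 0; p < P; p++) {
            int *chunk = data + p * (N / P);
            qsort(chunk, N / P, sizeof(int), cmp_int);
            medians[p] = chunk[(N / P) / 2];
        }

        /* Phase 2: sort the medians; P-1 of them become the pivots that
         * bound the buckets each processor will finally sort.              */
        qsort(medians, P, sizeof(int), cmp_int);
        for (int p = 0; p < P - 1; p++)
            printf("pivot %d: %d\n", p, medians[p]);

        /* Phase 3 (not shown): elements are exchanged so that bucket k holds
         * values between pivots k-1 and k, then each bucket is sorted
         * locally, which yields a globally sorted sequence.                */
        return 0;
    }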

New Transient Request with Loose Ordering for Token Coherence Protocol (토큰 코히런스 프로토콜을 위한 경서열 트렌지언트 요청 처리 방법)

  • Park, Yun Kyung;Kim, Dae Young
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.54 no.10
    • /
    • pp.615-619
    • /
    • 2005
  • The token coherence protocol has many advantages over snooping- and directory-based protocols in terms of latency, bandwidth, and complexity. Token counting easily maintains the correctness of the protocol without the global ordering of requests on which the other dominant cache coherence protocols are based. This lack of global ordering, however, causes starvation, which does not happen in snooping- or directory-based protocols. The token coherence protocol solves this problem by providing an emergency mechanism called the persistent request, which forces the other processors competing for the same shared memory block to give up their tokens and feed the starving processor. However, as the number of processors in a system grows, starvation occurs more frequently; in other words, the situation in which a persistent request is issued becomes too frequent to be treated as an emergency. As the frequency of persistent requests increases, not only does the cost of each persistent request matter, since it is based on broadcasting to all processors, but the increased persistent-request traffic also saturates the bandwidth of the multiprocessor interconnection network. This paper proposes a new request mechanism that defines an order of requests to reduce the occurrence of persistent requests. The ordering mechanism has been designed to be decentralized, since the centralized mechanisms in snooping-based and directory-based protocols are one of the primary reasons why the token coherence protocol has a latency and bandwidth advantage over these two dominant protocols.
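
For readers unfamiliar with token counting, the small C sketch below shows only the invariant the abstract relies on: a block has a fixed number of tokens, holding at least one token permits reading, and holding all of them permits writing (a starving processor collects missing tokens through a persistent request). The proposed ordering mechanism itself is not reproduced here, and the names are mine.

    /* Token-counting permission check (illustrative only).                 */
    #include <stdbool.h>
    #include <stdio.h>

    #define TOTAL_TOKENS 4             /* e.g., one token per processor     */

    struct cache_line {
        int tokens;                    /* tokens currently held locally     */
    };

    static bool may_read(const struct cache_line *l)  { return l->tokens >= 1; }
    static bool may_write(const struct cache_line *l) { return l->tokens == TOTAL_TOKENS; }

    int main(void)
    {
        struct cache_line line = { .tokens = 1 };

        printf("1 token   : read %d, write %d\n", may_read(&line), may_write(&line));

        /* After collecting every token (for example, after a persistent
         * request forced the other holders to hand theirs over):           */
        line.tokens = TOTAL_TOKENS;
        printf("all tokens: read %d, write %d\n", may_read(&line), may_write(&line));
        return 0;
    }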

Efficient Hardware Support: The Lock Mechanism without Retry (하드웨어 지원의 재시도 없는 잠금기법)

  • Kim Mee-Kyung;Hong Chul-Eui
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.10 no.9
    • /
    • pp.1582-1589
    • /
    • 2006
  • A lock mechanism is essential for synchronization on multiprocessor systems. The conventional queuing lock generates two bus transactions: the initial lock-read and its retry. This paper proposes a new locking protocol, called the WPV (Waiting Processor Variable) lock mechanism, which requires only one lock-read bus transaction. The WPV mechanism accesses the shared data in the initial lock-read phase, which is held in the pipelined protocol until the shared data is transferred. The WPV mechanism also uses a cache-state lock mechanism to reduce the locking overhead, and it guarantees FIFO lock operations under multiple lock contentions. In this paper, we also derive analytical models of the WPV lock mechanism as well as of the conventional memory and cache queuing lock mechanisms. The simulation results show that the WPV lock mechanism reduces access time by about 50% compared with the conventional queuing lock mechanism.
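
The abstract does not give enough detail to reproduce WPV itself, so the C sketch below shows the kind of baseline it is compared against: a conventional FIFO queuing (ticket-style) lock in which waiters spin, generating the repeated lock-read traffic the paper aims to eliminate. Thread counts and names are assumptions of mine.

    /* Conventional FIFO queuing (ticket) lock.  Build: cc -pthread lock.c  */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    #define THREADS 4
    #define ITERS   10000

    static atomic_uint next_ticket, now_serving;
    static long shared_counter;            /* datum protected by the lock   */

    static void lock(void)
    {
        /* Take a ticket; the fetch-and-add orders waiters FIFO.            */
        unsigned ticket = atomic_fetch_add(&next_ticket, 1);
        while (atomic_load(&now_serving) != ticket)
            ;                              /* spin: repeated reads of the
                                              lock variable while waiting   */
    }

    static void unlock(void)
    {
        atomic_fetch_add(&now_serving, 1); /* grant to the next ticket      */
    }

    static void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < ITERS; i++) {
            lock();
            shared_counter++;              /* critical section              */
            unlock();
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t[THREADS];

        for (int i = 0; i < THREADS; i++)
            pthread_create(&t[i], NULL, worker, NULL);
        for (int i = 0; i < THREADS; i++)
            pthread_join(t[i], NULL);

        printf("counter = %ld (expected %d)\n", shared_counter, THREADS * ITERS);
        return 0;
    }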

Data Communication Prediction Model in Multiprocessors based on Robust Estimation (로버스트 추정을 이용한 다중 프로세서에서의 데이터 통신 예측 모델)

  • Jun Janghwan;Lee Kangwoo
    • The KIPS Transactions:PartA
    • /
    • v.12A no.3 s.93
    • /
    • pp.243-252
    • /
    • 2005
  • This paper introduces a novel modeling technique for building data communication prediction models in multiprocessors, using least-squares and robust estimation methods. A set of sample communication rates is collected by running the workload programs with a few small input data sets. By applying the estimation methods to these samples, we can build analytic models that precisely estimate communication rates for huge input data sets. The primary advantage is that, since the models depend only on the data set size and not on the specifications of the target systems or workloads, they can be applied to various systems and applications. In addition, because the algorithmic behavioral characteristics of the workloads are reflected in the models, the same approach can be used to model diverse other performance metrics. In this paper, we built models for cache miss rates, which are the main cause of data communication in shared-memory multiprocessor systems. The results show excellent prediction error rates: below $1\%$ for five of the 12 cases, and about $3\%$ for the remaining cases.
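
A minimal C sketch of the fitting step the abstract describes, under the assumption of a simple linear model, rate = a + b * size: a few samples measured with small data sets are fitted by ordinary least squares and then extrapolated to a larger size. The sample values are invented, and the paper's robust estimator, which down-weights outliers, is not reproduced here.

    /* Ordinary least-squares fit of miss rate vs. data set size (sketch).  */
    #include <stdio.h>

    int main(void)
    {
        /* x: data set size (thousands of elements); y: measured miss rate (%) */
        const double x[] = { 1, 2, 4, 8 };
        const double y[] = { 2.1, 2.4, 2.9, 4.0 };
        const int n = 4;

        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx  += x[i];
            sy  += y[i];
            sxx += x[i] * x[i];
            sxy += x[i] * y[i];
        }

        /* Closed-form least-squares estimates of slope and intercept.      */
        double b = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        double a = (sy - b * sx) / n;

        printf("fitted model: rate = %.3f + %.3f * size\n", a, b);
        printf("predicted rate at size 64: %.3f\n", a + b * 64);
        return 0;
    }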

A Reconfigurable Load and Performance Balancing Scheme for Parallel Loops in a Clustered Computing Environment (클러스터 컴퓨팅 환경에서 병렬루프 처리를 위한 재구성 가능한 부하 및 성능 균형 방법)

  • 김태형
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.10 no.1
    • /
    • pp.49-56
    • /
    • 2004
  • Load imbalance is a serious impediment to achieving good performance in parallel processing. Global load balancing schemes cannot adequately balance parallel tasks generated from a single application. Dynamic loop scheduling methods are known to be useful for balancing parallel loops on shared-memory multiprocessor machines. However, their centralized nature causes a bottleneck even for the relatively small number of processors in a network of workstations, because communication overheads differ by orders of magnitude. Moreover, improvements to basic loop scheduling methods have not dealt effectively with irregularly distributed workloads in parallel loops, which commonly occur in applications for a network of workstations. In this paper, we present a new reconfigurable and decentralized balancing method for parallel loops on a network of workstations. Since our method combines traditional load balancing with performance balancing, it minimizes the overall execution time.
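
As background for the dynamic loop scheduling methods the abstract builds on, the C sketch below implements guided self-scheduling, a classic centralized scheme in which each idle worker claims a chunk of about remaining/P iterations. The paper's decentralized, performance-weighted reconfiguration is not reproduced; constants and names are mine.

    /* Guided self-scheduling of a parallel loop.  Build: cc -pthread gss.c */
    #include <pthread.h>
    #include <stdio.h>

    #define WORKERS    4
    #define ITERATIONS 1000

    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    static int next_iter;                  /* first unclaimed iteration     */

    /* Hand out the next chunk; returns its size, or 0 when the loop is done. */
    static int grab_chunk(int *first)
    {
        pthread_mutex_lock(&m);
        int remaining = ITERATIONS - next_iter;
        int chunk = remaining / WORKERS;   /* guided: chunks shrink over time */
        if (chunk < 1 && remaining > 0)
            chunk = 1;
        *first = next_iter;
        next_iter += chunk;
        pthread_mutex_unlock(&m);
        return chunk;
    }

    static void *worker(void *arg)
    {
        long id = (long)arg, done = 0;
        int first, chunk;

        while ((chunk = grab_chunk(&first)) > 0)
            done += chunk;                 /* stand-in for the loop body    */
        printf("worker %ld executed %ld iterations\n", id, done);
        return NULL;
    }

    int main(void)
    {
        pthread_t t[WORKERS];

        for (long i = 0; i < WORKERS; i++)
            pthread_create(&t[i], NULL, worker, (void *)i);
        for (int i = 0; i < WORKERS; i++)
            pthread_join(t[i], NULL);
        return 0;
    }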