• Title/Summary/Keyword: speedup


An Insight of Speedup (속도향상에 대한 고찰)

  • Ki, Ando
    • Electronics and Telecommunications Trends / v.14 no.2 s.56 / pp.53-57 / 1999
  • Speedup is often used to show scalability, but its classical definition fails to explain some real measurements, such as superlinear speedup. This leads to scaled speedup, which scales other system parameters as the number of processors changes. In this paper, scaled speedup and architectural speedup are introduced, and superlinear speedup is explained along with its cause.
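
The abstract does not spell out its definitions, so the following is only a minimal sketch using the standard textbook formulations: classical speedup as a ratio of run times, Amdahl's law for a fixed problem size, and Gustafson's law as one common form of scaled speedup. The function and parameter names are illustrative assumptions, not taken from the paper.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Classical speedup: serial run time divided by parallel run time."""
    return t_serial / t_parallel


def efficiency(s: float, p: int) -> float:
    """Speedup per processor; a value above 1 means superlinear speedup."""
    return s / p


def amdahl_speedup(parallel_fraction: float, p: int) -> float:
    """Fixed-size speedup (Amdahl's law) for a given parallelizable fraction."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / p)


def gustafson_scaled_speedup(parallel_fraction: float, p: int) -> float:
    """Scaled speedup (Gustafson's law): the problem grows with p.

    Here parallel_fraction is measured on the parallel run, which is the
    usual convention for this law (an assumption of this sketch).
    """
    return (1.0 - parallel_fraction) + parallel_fraction * p


if __name__ == "__main__":
    print(amdahl_speedup(0.9, 10))
    print(gustafson_scaled_speedup(0.9, 10))
    print(efficiency(speedup(100.0, 20.0), 4))
```

In these terms, a run that is 5 times faster on 4 processors has an efficiency above 1, which is the superlinear case the classical definition does not account for.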

Parallel solution of linear systems on the CRAY-2 using multi/micro tasking library (CRAY-2에서 멀티/마이크로 태스킹 라이브러리를 이용한 선형시스템의 병렬해법)

  • Ma, Sang-Back
    • The Transactions of the Korea Information Processing Society / v.4 no.11 / pp.2711-2720 / 1997
  • Multitasking and microtasking on the CRAY machine provide another way to improve computational power. Since the CRAY-2 has 4 processors, we can achieve a speedup of up to 4 with properly designed algorithms. In this paper we present two parallelizations of linear system solution on the CRAY-2 with the multitasking and microtasking libraries. One is the LU decomposition of dense matrices and the other is the iterative solution of large sparse linear systems with the preconditioner proposed by Radicati di Brozolo. In the first case we realized a speedup of 1.3 with 2 processors for a matrix of dimension 600 with multitasking, and in the second case a speedup of around 3 with 4 processors for a matrix of dimension 8192 with microtasking. In the first case the speedup is limited because of the nonuniform vector lengths. In the second case the ILU(0) preconditioner with Radicati's technique seems to realize a reasonably high speedup with 4 processors.


Reinforcement learning Speedup method using Q-value Initialization (Q-value Initialization을 이용한 Reinforcement Learning Speedup Method)

  • 최정환
    • Proceedings of the IEEK Conference / 2001.06c / pp.13-16 / 2001
  • In reinforcement learning, Q-learning converges quite slowly to a good policy. This is because searching for the goal state takes a very long time in a large stochastic domain. So I propose a speedup method using Q-value initialization for model-free reinforcement learning. The speedup method learns a naive model of a domain and makes boundaries around the goal state. Using these boundaries, it assigns initial Q-values to the state-action pairs and runs Q-learning with those initial Q-values. The initial Q-values guide the agent to the goal state in the early stages of learning, so that Q-learning updates Q-values efficiently. Therefore it saves exploration time spent searching for the goal state and performs better than Q-learning. I present the Speedup Q-learning algorithm to implement the speedup method. This algorithm is evaluated in a grid-world domain and compared to Q-learning.
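
The idea of seeding Q-values before learning can be sketched on a toy grid world. This is not the paper's Speedup Q-learning algorithm: the learned naive model and goal boundaries are replaced here by a plain Manhattan-distance heuristic, the grid is deterministic rather than stochastic, and all names (`SIZE`, `GOAL`, `initial_q`, `q_learning`) and parameter values are assumptions made for illustration.

```python
import random

SIZE = 5                                   # hypothetical 5x5 grid
GOAL = (4, 4)                              # hypothetical goal cell
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def step(state, action):
    """Deterministic move, clipped at the walls; reward 1 on reaching GOAL."""
    r = min(max(state[0] + action[0], 0), SIZE - 1)
    c = min(max(state[1] + action[1], 0), SIZE - 1)
    nxt = (r, c)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL


def initial_q(gamma, informed):
    """Build a Q-table: zeros, or values that decay with distance to GOAL."""
    q = {}
    for r in range(SIZE):
        for c in range(SIZE):
            for a in range(len(ACTIONS)):
                if informed:
                    nxt, _, _ = step((r, c), ACTIONS[a])
                    dist = abs(GOAL[0] - nxt[0]) + abs(GOAL[1] - nxt[1])
                    q[(r, c), a] = gamma ** dist
                else:
                    q[(r, c), a] = 0.0
    return q


def q_learning(q, episodes=50, alpha=0.5, gamma=0.9, epsilon=0.1,
               seed=0, max_steps=1000):
    """Tabular Q-learning from (0, 0); returns the step count of each episode."""
    rng = random.Random(seed)
    n_actions = len(ACTIONS)
    steps_per_episode = []
    for _ in range(episodes):
        state, steps, done = (0, 0), 0, False
        while not done and steps < max_steps:
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)          # explore
            else:                                     # greedy, random tie-break
                best = max(q[state, b] for b in range(n_actions))
                a = rng.choice([b for b in range(n_actions)
                                if q[state, b] == best])
            nxt, reward, done = step(state, ACTIONS[a])
            future = 0.0 if done else max(q[nxt, b] for b in range(n_actions))
            q[state, a] += alpha * (reward + gamma * future - q[state, a])
            state = nxt
            steps += 1
        steps_per_episode.append(steps)
    return steps_per_episode


if __name__ == "__main__":
    print(sum(q_learning(initial_q(0.9, True))))
    print(sum(q_learning(initial_q(0.9, False))))
```

In this idealized grid the distance-based table already matches the optimal values, so the informed agent walks the shortest path from the first episode; a real domain would only get a rough guide from its initial values.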


A Study on Sorting in A Computer Using The Binary Multi-level Multi-access Protocol

  • Jung Chang-Duk
    • Proceedings of the Korea Intelligent Information System Society Conference / 2006.06a / pp.303-310 / 2006
  • Sorting algorithms have been developed to take advantage of distributed computers. But the speedup of parallel sorting algorithms decreases rapidly as the number of processors increases, due to parallel processing overhead such as context switching time and inter-processor communication cost. In this paper, we propose a parallel sorting method which provides linear speedup over an optimal serial algorithm for a system with a large number of processors. This algorithm may even provide superlinear speedup for a practical system. The algorithm takes advantage of the properties of an interconnection network and its protocol.


SPEEDUP applications in control and optimization of process plant

  • Mushin, D.A.;Ward, P.S.;Pantelides, C.C.;Macchietto, S.
    • Institute of Control, Robotics and Systems: Conference Proceedings / 1989.10a / pp.841-843 / 1989
  • Aspects of modelling, performance monitoring, control and optimisation are discussed, with particular reference to the application of SPEEDUP. A new facility is described which allows SPEEDUP to operate in conjunction with other systems, and several examples are briefly given of its power and flexibility. In particular, its use in on-line applications alongside plant management and distributed control systems is described, as is its use in scheduling/sequencing problems when investigating batch and cyclic problems.


A Parallel Algorithm for Merging Heaps on MasPar Machine (MasPar 머쉰상의 병렬 힙 병합 알고리즘)

  • Min, Yong-Sik
    • The Transactions of the Korea Information Processing Society / v.2 no.4 / pp.554-560 / 1995
  • In this paper, we suggest a parallel algorithm to merge priority queues organized in two heaps, kheap and nheap, of sizes $k$ and $n$, respectively. Employing $\max(2^{i-1}, \ulcorner (m+1)/4 \lrcorner)$ processors, this algorithm requires $O(\log(n/k) \cdot \log n)$ time on an EREW-PRAM, where $i$ is the height of the heap and $m$ is the sum of the sizes $n$ and $k$. Also, when we run it on the MasPar machine, this method achieves a 33.934-fold speedup with 64 processors to merge 8 million data items which consist of two heaps of different sizes. So our parallel algorithm's EPU is close to 1, which is considered an optimal speedup ratio.


Parallel Sorting Algorithm by Median-Median (중위수의 중위수에 의한 병렬 분류 알고리즘)

  • Min, Yong-Sik
    • The Journal of the Acoustical Society of Korea / v.14 no.1E / pp.14-21 / 1995
  • This paper presents a parallel sorting algorithm suitable for the SIMD multiprocessor. The algorithm finds pivots for partitioning the data into ordered subsets. Because it uses probability theory, the data to be sorted can be evenly distributed. For n data elements to be sorted on p processors, when $n{\geq}p^2$, the algorithm is shown to be asymptotically optimal. In practice, sorting 8 million data items on 64 processors achieved a 48.43-fold speedup, while PSRS achieved a 44.4-fold speedup. On a variety of shared and distributed memory machines, the algorithm achieved better than half-linear speedups.
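
Pivot-based partitioning into ordered subsets can be sketched with a sequential simulation in the style of PSRS (parallel sorting by regular sampling), the baseline named in the abstract. This is not the paper's median-of-medians pivot selection, which the abstract does not detail; the function name `psrs_sort` and the sampling positions are assumptions of this sketch.

```python
from bisect import bisect_right


def psrs_sort(data, p):
    """Sequentially simulate a PSRS-style sort for p hypothetical processors."""
    n = len(data)
    if p <= 1 or n < p * p:            # the abstract's n >= p^2 condition
        return sorted(data)

    # Phase 1: each "processor" sorts its own contiguous chunk.
    chunk = (n + p - 1) // p
    local = [sorted(data[i:i + chunk]) for i in range(0, n, chunk)]

    # Phase 2: take p regularly spaced samples per chunk, then pick
    # p - 1 pivots from the sorted sample list.
    samples = sorted(c[j * len(c) // p] for c in local for j in range(p))
    pivots = [samples[k * p] for k in range(1, p)]

    # Phase 3: split every sorted chunk at the pivots, so bucket k only
    # holds values greater than pivot k-1 and at most pivot k.
    buckets = [[] for _ in range(p)]
    for c in local:
        start = 0
        for k, pivot in enumerate(pivots):
            end = bisect_right(c, pivot, start)
            buckets[k].extend(c[start:end])
            start = end
        buckets[-1].extend(c[start:])

    # Phase 4: each processor would merge the runs it received;
    # sorted() stands in for that merge here.
    return [x for b in buckets for x in sorted(b)]


if __name__ == "__main__":
    print(psrs_sort([9, 3, 7, 1, 8, 2, 6, 4, 5, 0, 11, 10, 13, 12, 15, 14], 4))
```

Because the buckets are ordered relative to one another, concatenating the locally sorted buckets yields the fully sorted sequence without a global merge.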


Inelastic vector finite element analysis of RC shells

  • Min, Chang-Shik;Gupta, Ajaya Kumar
    • Structural Engineering and Mechanics / v.4 no.2 / pp.139-148 / 1996
  • Vector algorithms and the relative importance of the four basic modules (computation of element stiffness matrices, assembly of the global stiffness matrix, solution of the system of linear simultaneous equations, and calculation of stresses and strains) of a finite element computer program for inelastic analysis of reinforced concrete shells are presented. Performance of the vector program is compared with a scalar program. For a cooling tower problem, the speedup factor from the scalar to the vector program is 34 for the element stiffness matrices calculation, 25.3 for the assembly of the global stiffness matrix, 27.5 for the equation solver, and 37.8 for stresses, strains and nodal forces computations on a Cray Y-MP. The overall speedup factor is 30.9. When the equation solver alone is vectorized, which is computationally the most intensive part of a finite element program, a speedup factor of only 1.9 is achieved. When the rest of the program is also vectorized, a large additional speedup factor of 15.9 is attained. Therefore, it is very important that all the modules in a nonlinear program are vectorized to gain the full potential of the supercomputers. The vector finite element computer program for inelastic analysis of RC shells with layered elements developed in the present study enabled us to perform mesh convergence studies. The vector program can be used for studying the ultimate behavior of RC shells and as a design tool.
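
The gap between the solver-only figure (1.9) and the solver's own speedup (27.5) is the pattern described by Amdahl's law. The sketch below assumes that simple model, which the abstract does not state, and uses hypothetical helper names; under that assumption the two reported numbers would imply that the solver accounts for roughly half of the scalar run time. The per-module time shares are not given in the abstract, so the overall figure is not reproduced here.

```python
def overall_speedup(fractions, speedups):
    """Amdahl-style combination of per-module speedups.

    fractions: share of the scalar run time spent in each module (sums to 1).
    speedups:  speedup factor of each module (1.0 means left unchanged).
    """
    return 1.0 / sum(f / s for f, s in zip(fractions, speedups))


def fraction_from_partial(overall, module_speedup):
    """Solve 1 / ((1 - f) + f / s) = overall for the module's time share f."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / module_speedup)


if __name__ == "__main__":
    # Reported: vectorizing only the solver (speedup 27.5) gives 1.9 overall.
    f_solver = fraction_from_partial(1.9, 27.5)
    print(round(f_solver, 3))
    # Round trip: the same share reproduces the solver-only figure.
    print(overall_speedup([f_solver, 1.0 - f_solver], [27.5, 1.0]))
```

With about half of the run time left in scalar code, no solver speedup could push the overall factor past roughly 2, which is consistent with the abstract's conclusion that every module needs vectorizing.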

Speedup Analysis Model for High Speed Network based Distributed Parallel Systems (고속 네트웍 기반의 분산병렬시스템에서의 성능 향상 분석 모델)

  • 김화성
    • The Journal of Korean Institute of Communications and Information Sciences / v.26 no.12C / pp.218-224 / 2001
  • The objective of distributed parallel computing is to solve computationally intensive problems, which have several types of parallelism, on a suite of high-performance and parallel machines in a manner that best utilizes the capabilities of each machine. In this paper, we propose a computational model, including a generalized graph representation method of distributed parallel systems, for speedup analysis, and analyze how super-linear speedup is achieved when scheduling programs with diverse embedded parallelism modes onto a distributed heterogeneous supercomputing network environment. The proposed representation method can also be applied to simple homogeneous systems or to heterogeneous systems whose components are heterogeneous only in terms of processor speed. In order to obtain the core speedup, the matching of the parallelism characteristics between tasks and parallel machines should be carefully handled while minimizing the communication overhead.


Atom Number and Bounding Sphere Based Search Speedup Technique for Similar Proteins Screening (원자개수와 경계구에 기반한 유사 단백질 스크리닝을 위한 검색 가속 기법)

  • Lee, Jaeho;Park, JoonYoung
    • Korean Journal of Computational Design and Engineering / v.20 no.4 / pp.321-327 / 2015
  • In protein database search, 3D structural shape comparison for protein screening plays an important role. Protein databases are large and have been growing rapidly, so exhaustive search methods cannot provide satisfactory performance. As a protein is composed of a set of spheres, the similarity calculation between two sets of spheres is very expensive. Thus, a reasonable filtering method could be an answer for speeding up protein screening. In this paper, we suggest a speedup method for protein screening based on atom number and bounding sphere. We also show some experimental results for the validity of our method.
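
A cheap pre-filter of this kind might look like the sketch below: candidates are discarded on atom count and bounding-sphere radius before any expensive sphere-set comparison runs. The abstract gives no thresholds or data layout, so the `(x, y, z, radius)` atom tuples, the centroid-based (non-minimal) bounding sphere, the tolerance values, and the function names are all assumptions made for illustration.

```python
import math


def bounding_radius(atoms):
    """Radius of a bounding sphere centred at the centroid of the atom centres.

    atoms: non-empty list of (x, y, z, radius) tuples. The sphere encloses
    every atom but is not guaranteed to be the minimal one.
    """
    n = len(atoms)
    centre = (sum(a[0] for a in atoms) / n,
              sum(a[1] for a in atoms) / n,
              sum(a[2] for a in atoms) / n)
    return max(math.dist(centre, a[:3]) + a[3] for a in atoms)


def prefilter(query, database, count_tol=0.1, radius_tol=0.1):
    """Return names of database proteins worth a full shape comparison.

    count_tol and radius_tol are hypothetical relative tolerances.
    """
    q_count = len(query)
    q_radius = bounding_radius(query)
    survivors = []
    for name, atoms in database.items():
        if abs(len(atoms) - q_count) > count_tol * q_count:
            continue                       # atom counts differ too much
        if abs(bounding_radius(atoms) - q_radius) > radius_tol * q_radius:
            continue                       # overall size differs too much
        survivors.append(name)
    return survivors


if __name__ == "__main__":
    query = [(0.0, 0.0, 0.0, 1.0), (2.0, 0.0, 0.0, 1.0)]
    database = {
        "same": [(0.0, 0.0, 0.0, 1.0), (2.0, 0.0, 0.0, 1.0)],
        "big": [(float(i), 0.0, 0.0, 1.0) for i in range(10)],
        "stretched": [(0.0, 0.0, 0.0, 1.0), (10.0, 0.0, 0.0, 1.0)],
    }
    print(prefilter(query, database))
```

Both checks cost at most one pass over a protein's atoms, so they are far cheaper than comparing two sets of spheres pairwise; the trade-off is that loose tolerances filter little and tight ones may discard true matches.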