Search | Korea Science

A Study on Shared Memory Optimization for Multi-Processor System (다중 프로세서 시스템에서의 공유 메모리 최적화 연구)

Kim, Jong-Su;Moon, Jong-Wook;Yim, Kang-Bin;Jung, Gi-Hyun;Choi, Kyung-Hee
- Proceedings of the Korea Information Processing Society Conference / 2001.10a / pp.685-688 / 2001
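Loosely coupled multiprocessor systems with high-speed I/O can improve data processing performance and reduce the bottleneck caused by I/O concentration. The shared memory used for data transfer between processors strongly affects system performance, depending on how it is organized and used. This study considers a system model in which shared memory is accessed asynchronously, with notification delivered by interrupts through a mailbox, and in which I/O is carried over Fast Ethernet (IEEE 802.3u). For this model, the optimal amount of shared memory needed to build a multiprocessor system is analyzed experimentally in terms of the bandwidth and burstiness of the I/O data.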

Performance Analysis of A Distributed Shared Memory System Including Minor Performance Factors (군소 성능요인을 고려한 분산공유메모리 시스템 성능의 정밀분석)

박준석;전창호
- Proceedings of the Korean Information Science Society Conference / 2000.10c / pp.671-673 / 2000
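This paper analyzes, through simulation, how hardware components and the execution environment affect the overall performance of a distributed shared memory multiprocessor system. Using PARSEC[1,2], the system is modeled closely on a real execution environment, and a 2D FFT is virtually executed on the model. The simulation results show that minor hardware components, which are usually not considered as performance factors, can have a considerable effect on overall system performance depending on the system configuration. Performance variations due to execution conditions such as loop overhead and code optimization are also analyzed quantitatively.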

Makespan Minimization Problem for A Job - Multiple Machines Using Simulated Annealing (Simulated Annealing을 이용한 한 작업-다중 기계문제에서의 Makespan 최소화)

이동주;황인극;김진호
- Journal of the Korea Academia-Industrial cooperation Society / v.5 no.2 / pp.137-140 / 2004
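With the development of multiprocessor systems, a new scheduling problem has emerged in which a single job must be processed simultaneously by more than one machine. This study addresses this multiprocessor scheduling problem with precedence constraints. The objective is to find a schedule that minimizes the makespan. Optimal solutions to this problem have generally been found with the Branch and Bound method, which has the drawback that the search takes too long. In this study, simulated annealing (SA) is used to obtain an approximate solution close to the optimum in a short time. To measure the performance of SA, its CPU time and approximate solutions are compared, on 40 example problems, with the CPU time and optimal solutions of Kramer's method.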
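The idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the job-order encoding, the list-scheduling decoder, the swap neighborhood, the cooling parameters, and all names (`makespan`, `is_feasible`, `anneal`) are assumptions chosen for illustration. Each job names the set of machines it must occupy at the same time, plus its predecessors.

```python
import math
import random


def makespan(order, duration, machines, preds):
    """Decode a job order into a schedule and return its makespan.

    Each job starts as soon as all predecessors have finished and
    every machine it needs is free (assumed decoding rule).
    """
    machine_free = {}
    finish = {}
    for job in order:
        ready = max((finish[p] for p in preds[job]), default=0)
        start = max([ready] + [machine_free.get(m, 0) for m in machines[job]])
        finish[job] = start + duration[job]
        for m in machines[job]:
            machine_free[m] = finish[job]
    return max(finish.values())


def is_feasible(order, preds):
    """True if every job appears after all of its predecessors."""
    position = {job: i for i, job in enumerate(order)}
    return all(position[p] < position[j] for j in order for p in preds[j])


def anneal(order, duration, machines, preds,
           temp=10.0, cooling=0.95, steps=2000, seed=0):
    """Simulated annealing over precedence-feasible job orders."""
    rng = random.Random(seed)
    current = list(order)
    current_cost = makespan(current, duration, machines, preds)
    best, best_cost = list(current), current_cost
    for _ in range(steps):
        # Neighbor: swap two positions, reject orders that break precedence.
        i, j = rng.sample(range(len(current)), 2)
        candidate = list(current)
        candidate[i], candidate[j] = candidate[j], candidate[i]
        if not is_feasible(candidate, preds):
            continue
        cost = makespan(candidate, duration, machines, preds)
        delta = cost - current_cost
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = list(candidate), cost
        temp = max(temp * cooling, 1e-3)
    return best, best_cost


if __name__ == "__main__":
    # Hypothetical 4-job instance: A and D each need both machines.
    duration = {"A": 3, "B": 2, "C": 4, "D": 1}
    machines = {"A": {1, 2}, "B": {2}, "C": {1}, "D": {1, 2}}
    preds = {"A": [], "B": ["A"], "C": [], "D": ["B", "C"]}
    print(anneal(["C", "A", "B", "D"], duration, machines, preds))
```

Encoding a solution as a job order keeps every candidate easy to check against the precedence constraints, at the cost of rejecting some swaps outright.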

A Worst Case Execution Timing Analysis Technique for Multiple-Issue Processors (다중 이슈 프로세서를 위한 최악 실행시간 분석 기법)

Im, Seong-Su;Han, Jeong-Hui;Kim, Ji-Hong;Min, Sang-Ryeol
- Journal of KIISE:Computer Systems and Theory / v.27 no.10 / pp.848-860 / 2000
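This paper presents a technique for analyzing the worst-case execution time of in-order, multiple-issue processors, which can issue several instructions at once. To analyze how instructions are issued, a data structure called the IDG (Instruction Dependence Graph) is used to represent the dependence relations between instructions. From this data structure, the range of issue distances between instructions is derived and then updated to progressively more accurate ranges during the hierarchical analysis of the program. The worst-case execution time of the program is computed from the issue-distance ranges obtained by analyzing the final IDG for the whole program. Experiments with a timing analyzer implementing the proposed technique show that it accurately analyzes multiple-issue behavior for the multiple-issue processor model used in the paper.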

Improvement in Reconstruction Time Using Multi-Core Processor on Computed Tomography (다중코어 프로세서를 이용한 전산화단층촬영의 재구성 시간 개선)

Chon, Kwon Su
- Journal of the Korean Society of Radiology / v.9 no.7 / pp.487-493 / 2015
Reconstruction in computed tomography requires a long calculation time, and this time increases rapidly as the matrix size is enlarged to improve image quality. Multi-core processors (multi-core CPUs) are now widely used and can reduce calculation time through multi-threading. In this study, the calculation time of the reconstruction process was improved using multiple threads on a multi-core processor. Pthread and OpenMP were used for multi-threading in the convolution and back projection steps, which take most of the time in reconstruction. Pthread and OpenMP showed similar results in speedup and efficiency.
https://doi.org/10.7742/jksr.2015.9.7.487
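Speedup and efficiency, the two metrics named in this abstract, are conventionally defined as the serial time divided by the parallel time, and the speedup divided by the number of workers. A minimal Python sketch follows; the function names and the timings in the usage lines are hypothetical and are not taken from the paper.

```python
def speedup(serial_time, parallel_time):
    """Ratio of single-thread run time to multi-thread run time."""
    return serial_time / parallel_time


def efficiency(serial_time, parallel_time, workers):
    """Speedup per worker; 1.0 means perfect linear scaling."""
    return speedup(serial_time, parallel_time) / workers


# Hypothetical timings: 12.0 s on one thread, 4.0 s on four threads.
print(speedup(12.0, 4.0))        # 3.0
print(efficiency(12.0, 4.0, 4))  # 0.75
```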

Parallelization of Multi-Block Flow Solver with Multi-Block/Multi-Partitioning Method (다중블록/다중영역분할 기법을 이용한 유동해석 코드 병렬화)

Ju, Wan-Don;Lee, Bo-Sung;Lee, Dong-Ho;Hong, Seung-Gyu
- Journal of the Korean Society for Aeronautical & Space Sciences / v.31 no.7 / pp.9-14 / 2003
In this work, a multi-block/multi-partitioning method is proposed for multi-block parallelization. Its advantage is uniform load balance, achieved by subdividing each block on each processor. To compare parallel efficiency between domain decomposition methods, the multi-block/single-partitioning and multi-block/multi-partitioning methods are applied to the flow analysis solver. The multi-block/multi-partitioning method gives better parallel efficiency because of its optimized load balancing. Finally, it was applied to the CFDS code. As a result, the computing speed with sixteen processors is over twelve times that of the sequential solver.
https://doi.org/10.5139/JKSAS.2003.31.7.009

Multiple Targets Detection by using CLEAN Algorithm in Matched Field Processing (정합장처리에서 CLEAN알고리즘을 이용한 다중 표적 탐지)

Lim Tae-Gyun;Lee Sang-Hak;Cha Young-Wook
- Journal of the Korea Institute of Information and Communication Engineering / v.10 no.9 / pp.1545-1550 / 2006
In this paper, we propose a method for applying the CLEAN algorithm to a minimum variance distortionless response (MVDR) processor to estimate the locations of multiple targets distributed in the ocean. The CLEAN algorithm is easy to implement in a linear processor, but not in a nonlinear processor. In the proposed method, the CSDM of a Dirty map is separated into the CSDM of a Clean beam and the CSDM of the Residual, and an individual ambiguity surface (AMS) is then generated. In this way, the CLEAN algorithm can be applied to an MVDR, a nonlinear processor. To solve the ill-conditioning problem in the matrix inversion performed by an MVDR when using the CLEAN algorithm, singular value decomposition (SVD) is carried out and the reciprocals of small eigenvalues are replaced with zero. Experimental results show that the proposed method improves the performance of an MVDR.
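The SVD step in this abstract corresponds to a truncated pseudo-inverse. The following is a sketch in assumed notation, not the paper's own formulation: $R$ denotes the CSDM, $w(\theta)$ a replica (steering) vector, and $\tau$ a hypothetical threshold below which a singular value counts as small. For a Hermitian positive semidefinite $R$, the singular values coincide with the eigenvalues.

```latex
R = U \Sigma V^{H} = \sum_{i} \sigma_i \, u_i v_i^{H},
\qquad
R^{+}_{\tau} = \sum_{i \,:\, \sigma_i > \tau} \frac{1}{\sigma_i} \, v_i u_i^{H}

% Standard MVDR output with the inverse replaced by the truncated pseudo-inverse:
B(\theta) = \frac{1}{w^{H}(\theta) \, R^{+}_{\tau} \, w(\theta)}
```

Dropping the terms with $\sigma_i \le \tau$ is the same as setting their reciprocals to zero, which keeps near-singular directions from dominating the inverse.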

Efficient Processor Allocation based on Join Selectivity in Multiple Hash Joins using Synchronization of Page Execution Time (페이지 실행시간 동기화를 이용한 다중 해쉬 결합에서 결합률에 따른 효율적인 프로세서 할당 기법)

Lee, Gyu-Ok;Hong, Man-Pyo
- Journal of KIISE:Computer Systems and Theory / v.28 no.3 / pp.144-154 / 2001
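Processing the many join operators in a multi-join query efficiently requires an efficient parallel algorithm. For multiple hash join queries, the method based on an allocation tree is currently known to be the best. In this method, however, an unavoidable delay occurs at each node of the allocation tree during the actual join. The delay results from the difference between the cost of reading the outer relation from disk page by page in the tuple-probing phase and the cost of the hash join on a page that has already been read. The 'synchronization of page execution time' technique was proposed to match these execution times as closely as possible, and it reduced the delay in executing a node of the allocation tree. However, the number of processors to allocate in order to minimize the delay, that is, the page execution time synchronization coefficient (k), varies considerably with the join selectivity of the actual join, so a multiple hash join that ignores this difference suffers a large loss in performance. Assuming that the join selectivity can be predicted to some degree before the join, this paper improves the execution performance of multiple hash joins by allocating the optimal number of processors to each node according to join selectivity, so as to minimize the delay during execution. An analytical cost model is built, and its validity is demonstrated through a range of performance comparisons with existing methods.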
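The dependence of k on join selectivity can be illustrated with a toy cost model. This is a simplified sketch under stated assumptions, not the paper's analytical model: it assumes the join work for one page divides evenly over k processors, that the per-page join cost grows linearly with selectivity, and every parameter name and value below is hypothetical.

```python
import math


def sync_coefficient(page_read_time, probe_time, output_time_per_match,
                     tuples_per_page, selectivity):
    """Smallest k for which joining one page on k processors takes
    no longer than reading the next page from disk (assumed model)."""
    matches = selectivity * tuples_per_page
    join_time = probe_time + matches * output_time_per_match
    return max(1, math.ceil(join_time / page_read_time))


# Hypothetical costs in arbitrary time units.
for s in (0.0, 0.5, 1.0):
    print(s, sync_coefficient(2.0, 3.0, 1.0, 10, s))
```

Under these assumptions a higher selectivity raises the per-page join cost, so more processors are needed to keep pace with the disk reads.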

Performance Analysis of A Distributed Shared Memory Multiprocessor System Using PARSEC (PARSEC을 이용한 분산공유메모리 다중프로세서 시스템의 성능분석)

Park, Joon-Seok;Jeon, Chang-Ho
- The Transactions of the Korea Information Processing Society / v.7 no.10 / pp.3049-3054 / 2000
In this paper, the effects of hardware components and runtime environments on the overall performance of a distributed shared memory system are analyzed through simulation. In the simulation, the system is modeled using PARSEC[1,2] to match the real runtime environment closely, and a 2D FFT is virtually executed on it. The simulation results show that minor hardware components, such as the bus interfaces and local bus of a processor, which are usually ignored or neglected when analyzing performance, have significant impacts on the overall system performance. Performance variations caused by runtime conditions such as loop overhead and code optimization are also analyzed quantitatively.

Implementation and Performance Analysis of Efficient Packet Processing Method For DPI (Deep Packet Inspection) System using Dual-Processors (듀얼 프로세서 기반 DPI (Deep Packet Inspection) 엔진을 위한 효율적 패킷 프로세싱 방안 구현 및 성능 분석)

Yang, Joon-Ho;Han, Seung-Jae
- The KIPS Transactions:PartC / v.16C no.4 / pp.417-422 / 2009
Implementing a DPI (Deep Packet Inspection) system on a general-purpose multiprocessor platform is an attractive option in terms of implementation cost, since it does not require high-cost customized hardware. Load balancing has been considered a primary means of achieving high performance in multiprocessor systems. We claim, however, that in DPI system design, simply balancing the load of each processor does not necessarily yield the highest system performance. Instead, we propose a method in which tasks are allocated to processors based on their functions. We implemented the proposed method on a dual-processor Linux system and compared its performance with existing load balancing methods. Under the proposed method, one processor is dedicated to interrupt handling and generic packet processing, while the other is dedicated to DPI processing. According to experimental results, the proposed scheme outperforms the existing schemes by 60%, mainly because of the reduction in cache misses and spin lock occurrences.
https://doi.org/10.3745/KIPSTC.2009.16-C.4.417

Search Result 412, Processing Time 0.022 seconds
