Search | Korea Science

Parallelization of A Load balancing Algorithm for Parallel Computations (병렬계산을 위한 부하분산 알고리즘의 병렬화)

In-Jae Hwang
- Journal of the Institute of Convergence Signal Processing / v.5 no.3 / pp.236-242 / 2004
In this paper, we propose an approach to parallelizing a load balancing algorithm that has been shown to be very effective in distributing workload for parallel computations. Load balancing algorithms are required to execute parallel programs efficiently. As a parallel computation model, we use a dynamically growing tree structure, which is found in many application problems. The load balancing algorithm tries to balance the workload among processors while keeping the communication cost under a certain limit. We show how the load balancing algorithm can be parallelized effectively on mesh and hypercube interconnection networks, and we analyze the time complexity for each case to show that the parallel algorithm reduces the various overheads.

A Study on the Communication Performance Improvement of the Parallel Finite-Different Time-Domain Simulator by using the MPI Persistent Communication (MPI의 지속 통신 메커니즘을 이용한 병렬 유한차분시간영역 전산모사 프로그램의 통신 성능 향상에 관한 연구)

Kim, Huioon;Chun, Kyungwon;Kim, Hyeong-gyu;Hong, Hyunpyo;Chung, Youngjoo
- Annual Conference of KIPS / 2009.04a / pp.942-945 / 2009
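The finite-difference time-domain (FDTD) method is a numerical analysis technique widely used for simulation in fields related to electromagnetic waves. Simulation programs implemented with this method require large amounts of computing resources, so they are often run in parallel computing environments. When a simulation runs in a parallel computing environment, the communication speed between the parallel processes and the network latency create a computational bottleneck that degrades overall performance. In this paper, we therefore use the MPI persistent communication mechanism to speed up synchronization between parallel processes, aiming to improve MPI communication performance in an FDTD simulation program, and we present the results in graphs. We also compare and analyze the performance against that obtained with the existing two-sided and one-sided communication mechanisms, present the advantages and disadvantages of persistent communication for parallel FDTD simulation programs, and discuss its usefulness.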
https://doi.org/10.3745/PKIPS.y2009m04a.942

Design of a Parallel Algorithm for Secure and High-Speed Digital Signatures (안전하고 고속적인 디지탈 서명을 위한 병렬 알고리즘 설계)

Seo, Jang-Won;Moon, Pil-Joo;Bang, Hye-Ja;Jeon, Moon-Seok;Lee, Chul-Hee
- Review of KIISC / v.4 no.2 / pp.23-39 / 1994
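To improve on the execution-speed problems of earlier methods, this paper uses parallel processing and proposes a parallel algorithm for a new high-speed digital signature scheme based on low-degree congruential polynomials, which are known as the most common and fastest of the random number generation methods. The new digital signature scheme uses large primes $p$ and $q$ as the secret key and $n = p^2 q$ as public information. A random number is used when generating a signature, and an inequality is used to verify it. To speed up signature generation, we wrote new parallel algorithms for the preprocessing step and for the computation that constructs the digital signature. We verify the parallel algorithm for the proposed signature scheme, estimate its security level, and compare it with earlier schemes through simulation. This work could be applied to parallel public-key cryptosystems and to parallel algorithms for signal processing, and it could be useful for digital signatures in information services developed for parallel and distributed processing environments, in particular message handling system services and electronic exchange services.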

Parallel Computing Environment based on Windows Operating System (Windows 운영체제 기반의 병렬 계산 환경)

Choe, Jeong Yeol;Sin, Jae Ryeol;Kim, Myeong Ho
- Journal of the Korean Society for Aeronautical & Space Sciences / v.31 no.4 / pp.16-25 / 2003
A parallel computing environment based on the Windows operating system was constructed, and its performance was tested in comparison with Linux-based systems. The Windows 2000 cluster was composed of servers and clients connected by Fast Ethernet, within which two sub-clusters may operate together or separately. The Compaq Visual Fortran compiler and two MPI libraries, MPICH.NT.1.2.2 and NT-MPICHNT.1.2, were installed as computing tools. Parallel computing performance tests were carried out using a two-dimensional preconditioned Navier-Stokes code to examine the dependency on the number of processors, problem size, and MPI library, and the results were compared with those from Linux clusters. The results show that a cluster based on the user-friendly Windows operating system is also useful for parallel computing and performs comparably to the earlier Linux clusters.
https://doi.org/10.5139/JKSAS.2003.31.4.016

High Performance Parallel Computer for Scientific Computations (과학계산전용 병렬처리 컴퓨터 구조)

박규호;정봉준
- The Magazine of the IEIE / v.22 no.9 / pp.14-27 / 1995
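KAICUBE/Hanbit-1 is a parallel computer with a hypercube interconnection network; each node carries an i860 processor and an i82380 DMA controller for communication. It consists of 32 nodes running a 40 MHz CPU clock, and its peak speed is about 2.5 GFLOPS, making it the first giga-class computer in Korea. Inter-node communication, driven by the DMA controller, has a channel bandwidth of about 100 Mbps. Node 0 is connected to a host computer running UNIX, and the host manages the parallel programming environment and the individual nodes. Express is the parallel operating system installed on the host computer; it provides an easy-to-use user environment and, depending on the programming method, the host-node and cubits programming environments. In addition, KAPPA is a high-level parallel programming environment that automatically converts input programs based on existing sequential programs into parallel programs. Scientific computing programs from a range of fields are being run on the machine, and performance measurements have shown excellent performance. Remaining work includes developing a more convenient parallel programming environment and strengthening the network functions so that the machine can be used freely as a general-purpose dedicated compute server.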
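
As a small illustration of the hypercube interconnection mentioned in this abstract: in a hypercube, nodes are numbered in binary and two nodes are linked when their numbers differ in exactly one bit, so 32 nodes form a 5-dimensional cube. The sketch below is generic hypercube arithmetic, not code from the KAICUBE system; the function names are hypothetical.

```python
def hypercube_dimension(num_nodes: int) -> int:
    """Return d such that num_nodes == 2**d (hypothetical helper)."""
    if num_nodes < 1 or num_nodes & (num_nodes - 1):
        raise ValueError("node count must be a power of two")
    return num_nodes.bit_length() - 1


def neighbors(node: int, dim: int) -> list[int]:
    """Neighbors are the node numbers reached by flipping one address bit."""
    return [node ^ (1 << bit) for bit in range(dim)]


def hops(a: int, b: int) -> int:
    """Minimum hop count is the number of differing address bits."""
    return bin(a ^ b).count("1")


dim = hypercube_dimension(32)
print(dim)                # 5
print(neighbors(0, dim))  # [1, 2, 4, 8, 16]
print(hops(0, 31))        # 5
```

With 32 nodes, every node has 5 direct links and no message needs more than 5 hops.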

The development of parallel computation method for the fire-driven-flow in the subway station (도시철도역사에서 화재유동에 대한 병렬계산방법연구)

Jang, Yong-Jun;Lee, Chang-Hyun;Kim, Hag-Beom;Park, Won-Hee
- Proceedings of the KSR Conference / 2008.06a / pp.1809-1815 / 2008
This study simulated the fire-driven flow in an underground station using parallel processing. The fire analysis program FDS (Fire Dynamics Simulator), which uses LES (Large Eddy Simulation), was run on a 6-node parallel cluster, each node fitted with two 3.0 GHz CPUs. The simulation model was based on the Kwangju-geumnan subway station, an underground station, and the total simulated time was set to 600 s. First, the whole underground passage was divided into 1 mesh and into 8 meshes in order to compare computation on a single CPU with computation on multiple CPUs. With a grid size ($15 \times 10^6$) larger than a single CPU can handle, the fire-driven flow from the center of the platform and from the subway itself was analyzed. There was almost no difference between the single-CPU and multi-CPU results. A case with $3 \times 10^6$ grid points was then used to test computing time: the 2-CPU and 7-CPU computations ran about two times and five times faster than the 1-CPU computation, respectively. The study confirmed that the limitations of a single CPU can be overcome by using parallel computation.
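
The reported timings can be read as speedup and parallel efficiency figures. The sketch below uses the standard definitions (speedup = serial time / parallel time, efficiency = speedup / CPU count) and takes the abstract's figures as roughly 2x on 2 CPUs and 5x on 7 CPUs; that reading of the 7-CPU figure is an assumption, and the function name is hypothetical.

```python
def efficiency(speedup: float, num_cpus: int) -> float:
    """Parallel efficiency: the fraction of ideal linear speedup achieved."""
    return speedup / num_cpus


# Speedups relative to 1 CPU, as read from the abstract (assumed values).
reported = {2: 2.0, 7: 5.0}

for cpus, s in reported.items():
    print(f"{cpus} CPUs: speedup {s:.1f}x, efficiency {efficiency(s, cpus):.2f}")
```

Under that reading, efficiency drops from 1.00 on 2 CPUs to about 0.71 on 7 CPUs, the usual pattern as communication between meshes takes a larger share of the run time.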

Computer Vision Platform using PVM (PVM을 이용한 컴퓨터비젼 플랫폼)

;;;;R.S.Ramakrishna
- Proceedings of the Korean Information Science Society Conference / 1998.10c / pp.544-546 / 1998
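Computer vision is computationally demanding work that comprises structured computation (low-level vision) and unstructured computation (high-level vision), and it requires real-time processing. For this reason, this paper focuses on parallel processing of vision tasks and on scheduling schemes for implementing them. The algorithms are implemented on a cluster of low-cost networked workstations running PVM, and the proposed idea is demonstrated with a practical example (eye location from image sequence). Next-generation multimedia environments are expected to use high-performance computing platforms of this kind.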

Improving the Performance of Document Similarity by using GPU Parallelism (GPU 병렬성을 이용한 문서 유사도 계산 성능 개선)

Park, Il-Nam;Bae, Byung-Gurl;Im, Eun-Jin;Kang, Seung-Shik
- The KIPS Transactions:PartB / v.19B no.4 / pp.243-248 / 2012
In information retrieval systems such as vector model implementations and document clustering, document similarity calculation accounts for a large part of overall system performance. In this paper, GPU parallelism is explored to speed up document similarity calculation in a CUDA framework. The proposed method makes the similarity calculation almost 15 times faster than a typical CPU-based framework. It is 5.2 and 3.4 times faster than methods using CUBLAS and Thrust, respectively.
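
The abstract does not say which similarity measure is used. As a minimal CPU-side sketch, assuming cosine similarity over term-count vectors (a common choice in the vector model), the all-pairs calculation looks like this; the documents and function names are made up for illustration.

```python
import math
from collections import Counter


def term_vector(text: str) -> Counter:
    """Bag-of-words term counts for one document."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


docs = ["gpu parallel similarity", "gpu parallel clustering", "fire simulation"]
vectors = [term_vector(d) for d in docs]

# Every (i, j) entry is independent of the others, which is what makes
# this calculation a natural fit for data-parallel hardware such as a GPU.
sims = [[cosine_similarity(u, v) for v in vectors] for u in vectors]
print(sims[0][1], sims[0][2])
```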
https://doi.org/10.3745/KIPSTB.2012.19B.4.243

A Parallel Emulation Scheme for Data-Flow Architecture on Loosely Coupled Multiprocessor Systems (이완 결합형 다중 프로세서 시스템을 사용한 데이터 플로우 컴퓨터 구조의 병렬 에뮬레이션에 관 한 연구)

이용두;채수환
- The Journal of Korean Institute of Communications and Information Sciences / v.18 no.12 / pp.1902-1918 / 1993
Parallel architectures based on the von Neumann computation model are limited as massively parallel architectures by the inherent drawbacks of their architectural features. The data-flow model of computation offers high programmability from the software perspective and high scalability from the hardware perspective. However, practical programming of and experimentation with data-flow architectures are hardly possible because practical data-flow machines are not available. We present a programming environment for performing data-flow computation on conventional parallel machines in general, and on loosely coupled multiprocessor systems in particular. We build an emulator for a tagged-token data-flow architecture on the iPSC/2 hypercube, a loosely coupled multiprocessor system. The emulator is a shallow layer of software executing on an iPSC/2 system, and thus makes the iPSC/2 system work as a data-flow architecture from the programmer's viewpoint. We implement various numerical and non-numerical algorithms in a data-flow assembler language and compare the performance of these programs with that of versions written in conventional C. Consequently, we verify the effectiveness of this emulator-based programming environment for experimenting with data-flow computation on a conventional parallel machine.
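
To make the tagged-token idea concrete, here is a toy single-process sketch, not the emulator described in the abstract. It assumes the usual tagged-token rule: a two-input instruction fires once tokens carrying the same tag have arrived on both of its input ports. The graph, token layout, and function names are all hypothetical.

```python
import operator
from collections import deque

# Hypothetical data-flow graph for (a + b) * (a - b).
# node -> (operation, [(destination node, destination port), ...])
GRAPH = {
    "add": (operator.add, [("mul", 0)]),
    "sub": (operator.sub, [("mul", 1)]),
    "mul": (operator.mul, []),  # no successors: its result is the output
}


def run(graph, initial_tokens):
    """Run until no tokens remain. A token is (tag, node, port, value)."""
    queue = deque(initial_tokens)
    waiting = {}  # matching store: (tag, node) -> {port: value}
    results = {}  # tag -> output value
    while queue:
        tag, node, port, value = queue.popleft()
        slot = waiting.setdefault((tag, node), {})
        slot[port] = value
        if len(slot) < 2:
            continue  # partner token with the same tag has not arrived yet
        del waiting[(tag, node)]
        op, dests = graph[node]
        out = op(slot[0], slot[1])  # fire the instruction
        if not dests:
            results[tag] = out
        for dest_node, dest_port in dests:
            queue.append((tag, dest_node, dest_port, out))
    return results


def inputs(tag, a, b):
    """Initial tokens that feed a and b into the graph under one tag."""
    return [(tag, "add", 0, a), (tag, "add", 1, b),
            (tag, "sub", 0, a), (tag, "sub", 1, b)]


# Two independent activations share the graph; tags keep them apart.
print(run(GRAPH, inputs(0, 5, 3) + inputs(1, 10, 4)))  # {0: 16, 1: 84}
```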

Design of Parallel Rasterizer for 3D Graphics Accelerators (3D 그래픽 가속엔진을 위한 병렬 Rasterizer 설계)

O, In-Heung;Park, Jae-Seong;Kim, Sin-Deok
- Journal of KIISE:Computer Systems and Theory / v.26 no.1 / pp.82-97 / 1999
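3D graphics rendering requires an enormous amount of computation, memory access, and data transfer, because both color and depth information must be computed for every pixel on the screen. Parallel processing techniques are therefore introduced for real-time 3D graphics. However, existing graphics accelerators use a screen-partitioning approach based on image parallelism, which has two main drawbacks. The first is redundant computation for polygons that lie on the boundaries between screen regions, and the second is low PE (Processing Element) utilization. To solve these problems, this paper proposes a graphics accelerator based on object-based rendering (OBR). The goal of the OBR system is to raise performance by removing the unnecessary overhead of screen partitioning and to lower hardware cost by using resources efficiently. Through simulation, we compare the performance of the OBR system with that of PixelFlow, a representative screen-partitioning graphics accelerator. In conclusion, the OBR system rendered more efficiently than the screen-partitioning approach while using fewer hardware resources.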
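
The redundant-computation drawback of screen partitioning can be illustrated with a small count. The sketch below assumes square screen regions and approximates each polygon by its pixel bounding box; it is not how PixelFlow or the proposed OBR system bins polygons, and all names and numbers are made up.

```python
def tiles_touched(bbox, tile_size):
    """Count the screen regions overlapped by an inclusive pixel bounding box.

    bbox is (x_min, y_min, x_max, y_max) in pixel coordinates.
    """
    x_min, y_min, x_max, y_max = bbox
    cols = x_max // tile_size - x_min // tile_size + 1
    rows = y_max // tile_size - y_min // tile_size + 1
    return cols * rows


TILE = 128  # assumed region size in pixels
polygons = [
    (10, 10, 50, 50),      # inside one region
    (120, 40, 140, 60),    # straddles a vertical boundary
    (100, 100, 160, 160),  # straddles a corner shared by four regions
]

work = sum(tiles_touched(p, TILE) for p in polygons)
print(work, work - len(polygons))  # 7 4
```

Three polygons cost seven per-region setups here, so four are repeats caused only by region boundaries.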

Search Result 443, Processing Time 0.029 seconds
