• Title/Abstract/Keywords: Parallel Computing Method


Implementation of Particle Swarm Optimization Method Using CUDA

  • 김조환;김은수;김종욱
    • The Transactions of the Korean Institute of Electrical Engineers / Vol. 58, No. 5 / pp. 1019-1024 / 2009
  • In this paper, particle swarm optimization (PSO) is newly implemented in CUDA (Compute Unified Device Architecture) and applied to function optimization with several benchmark functions. CUDA runs computations not on the CPU but on the GPU (Graphics Processing Unit), which solves complex computing problems using its parallel processing capacity. In addition, CUDA makes it convenient to develop GPU software. Compared with PSO executed on a general-purpose CPU, the CUDA implementation reduces PSO running time by about 38% on average, which suggests that CUDA is a promising framework for real-time optimization and control.
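
The PSO update that such an implementation parallelizes can be sketched as follows. This is a minimal serial Python sketch, not the paper's CUDA code: the sphere benchmark, the parameter values (`w`, `c1`, `c2`), and all function names are assumptions chosen for illustration. In a GPU version, the loop over particles is the part that would typically map to parallel threads.

```python
import random

def sphere(x):
    """Assumed benchmark function: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

def pso(objective, dim=2, n_particles=20, n_iters=100, seed=0,
        w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0):
    """Global-best PSO; parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        # On a GPU, each particle's update could run in its own thread.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

best_x, best_f = pso(sphere)
```

The global best is updated inside the particle loop here; a parallel version would more likely compute all particle updates first and then reduce over the personal bests.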

Efficient Workload Distribution of Photomosaic Using OpenCL into a Heterogeneous Computing Environment

  • 김희곤;사재원;최동휘;김혜련;이성주;정용화;박대희
    • KIPS Transactions on Computer and Communication Systems / Vol. 4, No. 8 / pp. 245-252 / 2015
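  • Parallel processing methods that use performance accelerators have recently been introduced in high-performance computing and mobile computing. The photomosaic application can be parallelized on an accelerator by exploiting its inherent data parallelism. This paper proposes a workload distribution method for running photomosaic in a heterogeneous computing environment consisting of a CPU and a GPU. Specifically, the photomosaic application is parallelized asynchronously so that CPU and GPU resources are used simultaneously, and execution time is measured in CPU-only and GPU-only configurations to predict the optimal workload to assign to each processor. The proposed method is simple but very effective, and it can also be applied to parallelizing other applications in heterogeneous CPU-GPU environments. Experimental results confirm that running with the optimal workload distribution in the heterogeneous environment improves performance by 141% compared with the GPU-only method.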
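
One simple way to turn CPU-only and GPU-only timings into a workload split is to assign work in proportion to each device's measured throughput. The sketch below assumes constant throughput and no transfer or synchronization overhead; the function name and the timing values are hypothetical, and the abstract does not state that the paper uses exactly this formula.

```python
def split_workload(n_items, t_cpu_only, t_gpu_only):
    """Split n_items so that both devices finish at about the same time.

    t_cpu_only / t_gpu_only: measured time for one device to process
    all n_items alone (hypothetical measurements).
    """
    cpu_rate = n_items / t_cpu_only      # items per second on the CPU
    gpu_rate = n_items / t_gpu_only      # items per second on the GPU
    gpu_share = gpu_rate / (cpu_rate + gpu_rate)
    n_gpu = round(n_items * gpu_share)
    n_cpu = n_items - n_gpu
    # Both devices run concurrently, so the slower one sets the total time.
    estimated_time = max(n_cpu / cpu_rate, n_gpu / gpu_rate)
    return n_cpu, n_gpu, estimated_time

# Example with made-up timings: CPU alone takes 30 s, GPU alone takes 10 s.
n_cpu, n_gpu, est = split_workload(1000, 30.0, 10.0)
```

With these made-up timings the GPU is three times faster, so it receives three quarters of the items.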

Optical Look-ahead Carry Full-adder Using Dual-rail Coding

  • Gil Sang Keun
    • Journal of the Optical Society of Korea / Vol. 9, No. 3 / pp. 111-118 / 2005
  • In this paper, a new optical parallel binary arithmetic processor (OPBAP) capable of computing arbitrary n-bit look-ahead carry full addition is proposed and implemented. Conventional Boolean algebra is used to implement the OPBAP with two types of optical logic processor: a space-variant optical logic gate processor (SVOLGP) and a shadow-casting optical logic array processor (SCOLAP). The SVOLGP can perform spatially different logical AND and OR operations simultaneously by using free-space interconnection logic filters, while the SCOLAP can perform any of the 16 possible Boolean logic functions by using a spatial instruction-control filter. A dual-rail encoding method is adopted because the complement of an input is needed in the arithmetic process. An experiment on the OPBAP for 8-bit look-ahead carry full addition is performed. The experimental results show that the proposed OPBAP is capable of optical look-ahead carry full addition at high computing speed regardless of the data length.
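
The logic behind look-ahead carry addition with dual-rail inputs can be sketched in software. In the sketch below, each bit is carried as a pair (value, complement), so XOR can be built from AND and OR alone, and every carry is written as a flat sum of products of the generate and propagate signals rather than rippling from bit to bit. The function names are assumptions, and the code models only the Boolean logic, not the optical hardware.

```python
def dual_rail(bit):
    """Encode a bit as (true rail, complement rail)."""
    return (bit, 1 - bit)

def xor_dual(x, y):
    """XOR from AND/OR only, made possible by the complement rails."""
    true_rail = (x[0] & y[1]) | (x[1] & y[0])
    comp_rail = (x[0] & y[0]) | (x[1] & y[1])
    return (true_rail, comp_rail)

def lookahead_add(a, b, n_bits=8, carry_in=0):
    """Add two n-bit integers; returns (sum mod 2**n_bits, carry out)."""
    A = [dual_rail((a >> i) & 1) for i in range(n_bits)]
    B = [dual_rail((b >> i) & 1) for i in range(n_bits)]
    g = [A[i][0] & B[i][0] for i in range(n_bits)]   # generate
    p = [A[i][0] | B[i][0] for i in range(n_bits)]   # propagate
    carries = [carry_in]
    for i in range(n_bits):
        # c[i+1] = g[i] | p[i]g[i-1] | ... | p[i]...p[0]c[0], all in parallel
        c = 0
        for j in range(i, -1, -1):
            term = g[j]
            for k in range(j + 1, i + 1):
                term &= p[k]
            c |= term
        term = carry_in
        for k in range(i + 1):
            term &= p[k]
        carries.append(c | term)
    total = 0
    for i in range(n_bits):
        s = xor_dual(xor_dual(A[i], B[i]), dual_rail(carries[i]))
        total |= s[0] << i
    return total, carries[n_bits]
```

Because each carry depends only on the inputs, all carries can in principle be evaluated at once, which is why the delay need not grow with the word length.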

Memory Efficient Parallel Ray Casting Algorithm for Unstructured Grid Volume Rendering on Multi-core CPUs

  • 김덕수
    • Journal of KIISE / Vol. 43, No. 3 / pp. 304-313 / 2016
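  • This paper proposes a memory-efficient parallel ray casting algorithm for unstructured grid volume rendering on multi-core CPUs. The work is based on the Bunyk ray casting algorithm and, to reduce Bunyk's high memory consumption, allocates a fixed-size local buffer to each thread. The local buffer stores information about recently visited faces; this information is either reused by other rays or replaced by information about other faces. To raise the utilization of the buffered information, rays with high coherency are grouped into ray groups based on the image plane, and the ray groups are distributed among the threads. Each thread processes its assigned ray groups independently using its local buffer. To further increase buffer utilization, the paper also proposes a hash function based on face numbers. To verify its effectiveness, the proposed algorithm was applied to unstructured grids of different sizes, and it rendered volumes accurately while using only about 6% of the memory that the Bunyk algorithm needs for storing face information. Despite this much smaller memory footprint, it performed comparably to the Bunyk algorithm and up to 22% better on large datasets. These results demonstrate the effectiveness of the approach and its suitability for volume rendering of large data.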
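
The per-thread local buffer described above behaves like a small cache indexed by a hash of the face number. The sketch below is a direct-mapped buffer under assumed details: the class name, the `compute_face` callback, and the modulo hash are all placeholders, since the abstract does not give the paper's actual hash function or buffer layout.

```python
class FaceBuffer:
    """Fixed-size, direct-mapped buffer of face data, one per thread."""

    def __init__(self, size, compute_face):
        self.size = size
        self.keys = [None] * size      # face number stored in each slot
        self.values = [None] * size    # cached face information
        self.compute_face = compute_face
        self.hits = 0
        self.misses = 0

    def get(self, face_id):
        slot = face_id % self.size     # placeholder hash (assumption)
        if self.keys[slot] == face_id:
            self.hits += 1             # reused by a later ray
            return self.values[slot]
        self.misses += 1               # slot is replaced by the new face
        value = self.compute_face(face_id)
        self.keys[slot] = face_id
        self.values[slot] = value
        return value

# Usage: the lambda stands in for the real face computation.
buf = FaceBuffer(4, lambda face_id: face_id * 10)
results = [buf.get(f) for f in (1, 1, 5, 1)]
```

Faces 1 and 5 collide in a 4-slot buffer, so the last lookup of face 1 is a miss. Grouping coherent rays, as the abstract describes, is what keeps such collisions rare in practice.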

PC Cluster Based Parallel Genetic Algorithm-Tabu Search for Service Restoration of Distribution Systems

  • 문경준;이화석;박준호;김형수
    • The Transactions of the Korean Institute of Electrical Engineers A / Vol. 54, No. 8 / pp. 375-387 / 2005
  • This paper presents an application of a parallel Genetic Algorithm-Tabu Search (GA-TS) algorithm to finding an optimal solution for service restoration in distribution systems. The main objective of service restoration is, when a fault or overload occurs, to restore as much load as possible by transferring the de-energized load in the out-of-service area, via network reconfiguration, to appropriate adjacent feeders at minimum operational cost without violating operating constraints. This is a combinatorial optimization problem with many constraints and many local minima, which makes the optimal switch positions difficult to find. This paper develops a parallel GA-TS algorithm for service restoration of distribution systems. In parallel GA-TS, GA operators are executed on each processor. To prevent low-fitness solutions from appearing in the next generation, strings below the average fitness are saved in the tabu list. If the best fitness of the GA does not change for several generations, TS operators are executed on the top 10% of the population to enhance the local search capability. With the migration operation, the best string of each node is transferred to the neighboring node after a predetermined number of iterations. For parallel computing, we developed a PC cluster system consisting of 8 PCs. Each PC has a 2 GHz Pentium IV CPU and is connected to the others through switched Fast Ethernet. To show the validity of the proposed method, the algorithm was tested on a practical distribution system in Korea. The simulation results show that the proposed algorithm is efficient for distribution system service restoration in terms of solution quality, speedup, efficiency, and computation time.
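
Two of the mechanisms named in this abstract, the below-average tabu rule and the migration of each node's best string to its neighbor, can be sketched as follows. The ring topology, the choice to overwrite the neighbor's worst string, the bit-list encoding, and the function names are assumptions made for illustration; the abstract does not specify them.

```python
def update_tabu(population, fitness, tabu):
    """Add every string whose fitness is below the population average."""
    avg = sum(fitness(s) for s in population) / len(population)
    tabu.update(tuple(s) for s in population if fitness(s) < avg)
    return tabu

def migrate_ring(islands, fitness):
    """Send each island's best string to the next island in a ring.

    Assumption: the migrant overwrites the receiving island's worst string.
    """
    # Pick all migrants first so one migration does not affect another.
    bests = [max(pop, key=fitness) for pop in islands]
    for i, best in enumerate(bests):
        target = islands[(i + 1) % len(islands)]
        worst = min(range(len(target)), key=lambda j: fitness(target[j]))
        target[worst] = best[:]
    return islands

# Usage with a toy fitness (number of ones) on three 2-string islands.
islands = [[[1, 1, 1], [0, 0, 0]],
           [[1, 0, 0], [0, 1, 0]],
           [[0, 0, 0], [0, 0, 1]]]
migrate_ring(islands, sum)
tabu = update_tabu([[1, 1, 1], [0, 0, 0], [1, 0, 0]], sum, set())
```

On a PC cluster each island would live on a separate machine, and the migration step would be the only point at which nodes communicate.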

Parallel Genetic Algorithm-Tabu Search Using PC Cluster System for Optimal Reconfiguration of Distribution Systems

  • 문경준;송명기;김형수;김철홍;박준호;이화석
    • The Transactions of the Korean Institute of Electrical Engineers A / Vol. 53, No. 10 / pp. 556-564 / 2004
  • This paper presents an application of a parallel Genetic Algorithm-Tabu Search (GA-TS) algorithm to finding an optimal solution for reconfiguration of distribution systems. The aim of reconfiguration is to determine the switch positions to be opened for loss minimization in radial distribution systems, which is a discrete optimization problem. The problem has many constraints, and the optimal switch positions are very difficult to find because there are many local minima. This paper develops a parallel GA-TS algorithm for reconfiguration of distribution systems. In parallel GA-TS, GA operators are executed on each processor. To prevent low-fitness solutions from appearing in the next generation, strings below the average fitness are saved in the tabu list. If the best fitness of the GA does not change for several generations, TS operators are executed on the top 10% of the population to enhance the local search capability. With the migration operation, the best string of each node is transferred to the neighboring node after a predetermined number of iterations. For parallel computing, we developed a PC cluster system consisting of 8 PCs. Each PC has a 2 GHz Pentium IV CPU and is connected to the others through switched Fast Ethernet. To show the usefulness of the proposed method, the developed algorithm was tested on, and compared using, a distribution system from a reference paper. The simulation results show that the proposed algorithm is efficient and robust for distribution system reconfiguration in terms of solution quality, speedup, efficiency, and computation time.

Application of Parallel Processing System for Free Drop Simulation of IT-related Modules

  • 박영재;이준성;고한옥;장윤석;최재붕;김영진
    • Proceedings of the Korean Society for Precision Engineering Conference / Proceedings of the KSPE 2006 Spring Conference / pp. 405-406 / 2006
  • Recently, flat display modules such as plasma or TFT-LCD panels have employed thin crystallized panels, which are normally vulnerable to high-level transient mechanical energy inputs. As a result, anti-shock performance is one of the most important design specifications for TFT-LCD modules. However, most large display module designs are based on engineers' own experience. In addition, large-scale analysis for evaluating complex material and structural behavior is a topic of interest in diverse engineering and scientific fields, and the use of massively parallel processors has been a recent trend in high-performance computing. The objective of this paper is to introduce a parallel processing system that consists of a general-purpose finite element analysis solver and a parallelized PC cluster. The system is built from thirty-two processing elements, and the finite element program is developed by adopting the hierarchical domain decomposition method. To verify the efficiency of the established system, an impact analysis of thin and complex sub-parts of flat display modules is performed. The results showed good agreement with the corresponding reference solutions; thus, the parallel processing system appears to be a useful tool for complex structural analysis of products such as IT-related modules.


Parallel Genetic Algorithm-Tabu Search Using PC Cluster System for Optimal Reconfiguration of Distribution Systems

  • Mun Kyeong-Jun;Lee Hwa-Seok;Park June-Ho
    • KIEE International Transactions on Power Engineering / Vol. 5A, No. 2 / pp. 116-124 / 2005
  • This paper presents an application of the parallel Genetic Algorithm-Tabu Search (GA-TS) algorithm to searching for an optimal solution for reconfiguration of distribution systems. The aim of reconfiguration is to determine the appropriate switch positions to be opened for loss minimization in radial distribution systems, which is a discrete optimization problem. The problem has many constraints, and the optimal switch positions are very difficult to find because of the numerous local minima. This paper develops a parallel GA-TS algorithm for the reconfiguration of distribution systems. In parallel GA-TS, GA operators are executed on each processor. To prevent low-fitness solutions from appearing in the next generation, strings below the average fitness are saved in the tabu list. If the best fitness of the GA does not change for several generations, TS operators are executed on the top 10% of the population to enhance the local search capability. With the migration operation, the best string of each node is transferred to the neighboring node after a predetermined number of iterations. For parallel computing, we developed a PC cluster system consisting of 8 PCs. Each PC has a 2 GHz Pentium IV CPU and is connected to the others through switched Fast Ethernet. To demonstrate the usefulness of the proposed method, the developed algorithm was tested on, and compared using, a distribution system from a reference paper. The simulation results show that the proposed algorithm is efficient and robust for distribution system reconfiguration in terms of solution quality, speedup, efficiency, and computation time.

A Performance Comparison between Coarray and MPI for Parallel Wave Propagation Modeling and Reverse-time Migration

  • 류동현;김아름;하완수
    • Geophysics and Geophysical Exploration / Vol. 19, No. 3 / pp. 131-135 / 2016
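  • Coarray is a parallel computing feature introduced in the Fortran 2008 standard. With coarrays, parallel computation on distributed-memory systems can be implemented with simple syntax. In this study, coarray and MPI were applied to seismic data processing programs to compare their parallel performance and thereby assess the applicability of coarrays. Computational performance was compared using wave propagation modeling, and point-to-point communication performance was compared using domain decomposition. Parallel performance was also compared using a reverse-time migration program. The results show little difference in computational performance between the coarray and MPI programs, but MPI delivered better communication performance.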
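
The point-to-point communication pattern that domain decomposition requires can be illustrated with a 1D wave equation split into two subdomains. The sketch below runs in a single Python process and only simulates the halo exchange that an MPI send/receive pair or a coarray assignment would perform between processes. The grid, the coefficient `c2`, and all function names are assumptions, and the paper's actual programs are not reproduced here.

```python
def step(u_prev, u_curr, c2):
    """One explicit finite-difference step; both end cells are held at 0."""
    n = len(u_curr)
    u_next = [0.0] * n
    for i in range(1, n - 1):
        u_next[i] = (2 * u_curr[i] - u_prev[i]
                     + c2 * (u_curr[i + 1] - 2 * u_curr[i] + u_curr[i - 1]))
    return u_next

def run_serial(u0, n_steps, c2):
    prev, curr = u0[:], u0[:]
    for _ in range(n_steps):
        prev, curr = curr, step(prev, curr, c2)
    return curr

def exchange_halo(left, right):
    """Stand-in for the point-to-point communication between two ranks."""
    left[-1] = right[1]    # left's ghost cell <- right's first owned cell
    right[0] = left[-2]    # right's ghost cell <- left's last owned cell

def run_decomposed(u0, n_steps, c2, m):
    """Left owns cells 0..m-1, right owns m..n-1; each has one ghost cell."""
    left_curr = u0[:m] + [u0[m]]
    right_curr = [u0[m - 1]] + u0[m:]
    left_prev, right_prev = left_curr[:], right_curr[:]
    for _ in range(n_steps):
        left_next = step(left_prev, left_curr, c2)
        right_next = step(right_prev, right_curr, c2)
        exchange_halo(left_next, right_next)
        left_prev, left_curr = left_curr, left_next
        right_prev, right_curr = right_curr, right_next
    return left_curr[:-1] + right_curr[1:]

u0 = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
serial = run_serial(u0, 10, 0.25)
split = run_decomposed(u0, 10, 0.25, 4)
```

Only one value per neighbor crosses the subdomain boundary each time step, which is why the cost of small point-to-point messages, rather than raw computation, is what separates the two programming models in such a comparison.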

Procedural Fluid Animation using Mirror Image Method

  • Park, Jin-Ho
    • International Journal of Contents / Vol. 7, No. 4 / pp. 1-5 / 2011
  • Physics-based fluid animation schemes incur a large computational cost because of their enormous number of degrees of freedom. Many researchers have tried to reduce the cost of solving the large linear system involved in grid-based schemes, using GPU-based algorithms and advanced numerical analysis methods to solve the system efficiently. Other groups have studied local operation methods such as SPH (Smoothed Particle Hydrodynamics) and LBM (Lattice Boltzmann Method) to improve efficiency. Our method investigates this efficiency problem thoroughly and suggests a novel paradigm for the fluid animation field. Rather than physics-based simulation, we propose a robust boundary handling technique for procedural fluid animation. Our method can be applied to arbitrarily shaped objects and potential fields. Since only local operations are involved in our method, parallel computing can be easily implemented.
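
The abstract does not describe the formulation, so the sketch below shows only the classical method of images for the simplest case: a 2D potential-flow point source next to a flat wall at y = 0. Adding a mirrored source of equal strength below the wall cancels the wall-normal velocity on the wall. The function names and the source model are assumptions, and the paper's handling of arbitrarily shaped objects is not covered.

```python
import math

def source_velocity(px, py, sx, sy, strength=1.0):
    """Velocity at (px, py) induced by a 2D point source at (sx, sy)."""
    dx, dy = px - sx, py - sy
    r2 = dx * dx + dy * dy
    k = strength / (2.0 * math.pi * r2)
    return k * dx, k * dy

def velocity_with_wall(px, py, sx, sy, strength=1.0):
    """Source at (sx, sy) above the wall y = 0, plus its mirror image."""
    u1, v1 = source_velocity(px, py, sx, sy, strength)
    u2, v2 = source_velocity(px, py, sx, -sy, strength)  # mirrored source
    return u1 + u2, v1 + v2

# On the wall, the normal component vanishes, so no fluid crosses it.
u_wall, v_wall = velocity_with_wall(2.0, 0.0, 0.5, 1.0)
```

Each evaluation depends only on the query point and the source positions, so the velocities at different points can be computed independently, which matches the abstract's remark that local operations parallelize easily.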