• Title/Summary/Keyword: Parallel Computing(병렬컴퓨팅)

Search Result 229, Processing Time 0.022 seconds

Parallel Computation of a Flow Field Using FEM and Domain Decomposition Method (영역분할법과 유한요소해석을 이용한 유동장의 병렬계산)

  • Choi Hyounggwon;Kim Beomjun;Kang Sungwoo;Yoo Jung Yul
    • Proceedings of the KSME Conference
    • /
    • 2002.08a
    • /
    • pp.55-58
    • /
    • 2002
  • Parallel finite element code has been recently developed for the analysis of the incompressible Wavier-Stokes equations using domain decomposition method. Metis and MPI libraries are used for the domain partitioning of an unstructured mesh and the data communication between sub-domains, respectively. For unsteady computation of the incompressible Navier-Stokes equations, 4-step splitting method is combined with P1P1 finite element formulation. Smagorinsky and dynamic model are implemented for the simulation of turbulent flows. For the validation performance-estimation of the developed parallel code, three-dimensional Laplace equation has been solved. It has been found that the speed-up of 40 has been obtained from the present parallel code fir the bench mark problem. Lastly, the turbulent flows around the MIRA model and Tiburon model have been solved using 32 processors on IBM SMP cluster and unstructured mesh. The computed drag coefficient agrees better with the existing experiment as the mesh resolution of the region increases, where the variation of pressure is severe.

  • PDF

D-Class Computing Parallel Algorithm the on Grid Computing Environment (그리드 컴퓨팅 환경에서의 D-클래스 계산 병렬 알고리즘)

  • Shin, Chul-Gyu;Han, Jae-Il
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2005.05a
    • /
    • pp.929-932
    • /
    • 2005
  • D-클래스의 계산은 NP-완전 문제로서 그 결과를 개인키, 공개키로 이용하여 보안에 응용될 수 있는 가능성을 가지고 있으나 계산 복잡도로 인해 현재 극히 제한된 크기의 행렬에 대한 D-클래스만이 알려져 있다. 이 문제를 해결하기 위해 D-클래스 계산을 효율적으로 할 수 있는 수식과 알고리즘을 설계 및 구현하였지만, 행렬의 크기가 증가함에 따라 결과를 얻는 것에는 한계가 있다. 이것을 해결하기 위해 많은 컴퓨터를 사용할 수 있는 그리드 컴퓨팅이 필요하다. 본 논문은 그리드 컴퓨팅 환경에서 최적화된 알고리즘 설계 및 구현을 위해 Globus 가 설치된 클러스터를 구축하고, MPICH 를 이용 효율적인 D-클래스의 계산 알고리즘을 설계 및 구현하여 실행 결과에 대해 논한다.

  • PDF

The Development of a MATLAB-based Discrete Event Simulation Framework for the Engagement Simulations of the Weapon Systems (무기체계 교전 시뮬레이션을 위한 매트랩 기반 이산사건시뮬레이션 프레임워크의 개발)

  • Hwang, Kun-Chul;Lee, Min-Gyu;Kim, Jung-Hoon
    • Journal of the Korea Society for Simulation
    • /
    • v.21 no.2
    • /
    • pp.31-39
    • /
    • 2012
  • Simulation Framework is a basic software tool used to develop simulation applications. This paper describes the development of a discrete event simulation framework based on DEVS(Discrete EVent System Specification) formalism, using MATLAB language which is widely used in technical computing and engineering disciplines. The newly developed framework utilizing MATLAB object oriented programming combines the convenience of MATLAB language and the sophisticated architecture of the DEVS formalism. Hence, it supports the productivity, flexibility, extensibility that are required for the simulation application software development of the weapon systems engagement. Moreover, it promises a simulation application the increased the computation speed proportional to the number of CPU of a multi-core processor, providing the batch simulation functionality based on MATLAB parallel computing technology.

Distributed AI Learning-based Proof-of-Work Consensus Algorithm (분산 인공지능 학습 기반 작업증명 합의알고리즘)

  • Won-Boo Chae;Jong-Sou Park
    • The Journal of Bigdata
    • /
    • v.7 no.1
    • /
    • pp.1-14
    • /
    • 2022
  • The proof-of-work consensus algorithm used by most blockchains is causing a massive waste of computing resources in the form of mining. A useful proof-of-work consensus algorithm has been studied to reduce the waste of computing resources in proof-of-work, but there are still resource waste and mining centralization problems when creating blocks. In this paper, the problem of resource waste in block generation was solved by replacing the relatively inefficient computation process for block generation with distributed artificial intelligence model learning. In addition, by providing fair rewards to nodes participating in the learning process, nodes with weak computing power were motivated to participate, and performance similar to the existing centralized AI learning method was maintained. To show the validity of the proposed methodology, we implemented a blockchain network capable of distributed AI learning and experimented with reward distribution through resource verification, and compared the results of the existing centralized learning method and the blockchain distributed AI learning method. In addition, as a future study, the thesis was concluded by suggesting problems and development directions that may occur when expanding the blockchain main network and artificial intelligence model.

Implementation and Performance Evaluation of an Object-Oriented Parallel Programming Environment with Multithreaded Computational Model (다중스레드 계산 모델을 이용한 병렬 객체 지향 프로그래밍 환경의 구현 및 성능 평가)

  • Song, Jong-Hun;Kim, Heung-Hwan;Han, Sang-Yeong
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.5 no.6
    • /
    • pp.708-718
    • /
    • 1999
  • 본 논문에서 제안하는 시스템은 일반적인 병렬 시스템의 하드웨어 구조에서, 다중 스레드 계산 모델을 이용하여 객체 지향 프로그래밍 환경을 구현한 시스템이다. 제안하는 시스템을 효과적으로 구현하기 위하여 컴파일러와 실행 시간 시스템의 측면에서 여러 가지 기법을 제시한다. 컴파일러의 측면에서는 멤버 변수의 접근 분석, 메소드의 병렬성 분석 기법을 제시하고, 실행 시간 시스템에서는 실시간 스레드/메시지 결합, 프레임 공유 기법을 제시한다. 본 논문에서 제안된 프로그래밍 환경은, MPI 메시지 인터페이스를 이용하여 구현하였으며, 벤치마크 프로그램을 실행함으로써 성능 분석을 하였다. 분석의 결과는 실행시간 시스템의 여러 가지 기법들이 성능 향상에 많은 효과가 있음을 보여주며, 이러한 결과는 일반적인 병렬 시스템에서도 적용 가능하다.Abstract In this paper, we suggest an object-oriented programming environment with multithreaded computation model on general parallel processing systems. We developed many methods for our environment to be efficient : in compiler, the analysis of member variable and method parallelism, and in runtime system, thread/message merging and frame sharing. The programming environment is implemented with MPI message interface, and its performance is analyzed with executing benchmark programs. The results show that the developed methods have influence on performance improvement, and this improvement can be applied to general parallel processing systems.

Compression-Based Volume Rendering on Distributed Memory Parallel Computers (분산 메모리 구조를 갖는 병렬 컴퓨터 상에서의 압축 기반 볼륨 렌더링)

  • Koo, Gee-Bum;Park, Sang-Hun;Song, Dong-Sub;Ihm, In-Sung
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.6 no.5
    • /
    • pp.457-467
    • /
    • 2000
  • 본 논문에서는 분산 메모리 구조를 갖는 병렬 컴퓨터 상에서 방대한 크기를 갖는 볼륨 데이터의 효과적인 가시화를 위한 병렬 광선 투사법을 제안한다. 데이터의 압축을 기반으로 하는 본 기법은 다른 프로세서의 메모리로부터 데이터를 읽기보다는 자신의 지역 메모리에 존재하는 압축된 데이터를 빠르게 복원함으로써 병렬 렌더링 성능을 향상시키는 것을 목표로 한다. 본 기법은 객체-순서와 영상-순서 탐색 알고리즘 모두의 정점을 이용하여 성능을 향상시켰다. 즉, 블록 단위의 최대-최소 팔진트리의 탐색과 각 픽셀의 불투명도 값을 동적으로 유지하는 실시간 사진트리를 응용함으로써 객체-공간과 영상-공간 각각의 응집성을 이용하였다. 본 논문에서 제안하는 압축 기반 병렬 볼륨 렌더링 방법은 렌더링 수행 중 발생하는 프로세서간의 통신을 최소화하도록 구현되었는데, 이러한 특징은 프로세서 사이의 상당히 높은 데이터 통신 비용을 감수하여야 하는 PC 및 워크스테이션의 클러스터와 같은 더욱 실용적인 분산 환경에서 매우 유용하다. 본 논문에서는 Cray T3E 병렬 컴퓨터 상에서 Visible Man 데이터를 이용하여 실험을 수행하였다.

  • PDF

Parallel Finite Element Simulation of the Incompressible Navier-stokes Equations (병렬 유한요소 해석기법을 이용한 유동장 해석)

  • Choi H. G.;Kim B. J.;Kang S. W.;Yoo J. Y.
    • 한국전산유체공학회:학술대회논문집
    • /
    • 2002.05a
    • /
    • pp.8-15
    • /
    • 2002
  • For the large scale computation of turbulent flows around an arbitrarily shaped body, a parallel LES (large eddy simulation) code has been recently developed in which domain decomposition method is adopted. METIS and MPI (message Passing interface) libraries are used for domain partitioning and data communication between processors, respectively. For unsteady computation of the incompressible Wavier-Stokes equation, 4-step splitting finite element algorithm [1] is adopted and Smagorinsky or dynamic LES model can be chosen fur the modeling of small eddies in turbulent flows. For the validation and performance-estimation of the parallel code, a three-dimensional laminar flow generated by natural convection inside a cube has been solved. Then, we have solved the turbulent flow around MIRA (Motor Industry Research Association) model at $Re = 2.6\times10^6$, which is based on the model height and inlet free stream velocity, using 32 processors on IBM SMP cluster and compared with the existing experiment.

  • PDF

Implementation of a Parallel Web Crawler for the Odysseus Large-Scale Search Engine (오디세우스 대용량 검색 엔진을 위한 병렬 웹 크롤러의 구현)

  • Shin, Eun-Jeong;Kim, Yi-Reun;Heo, Jun-Seok;Whang, Kyu-Young
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.14 no.6
    • /
    • pp.567-581
    • /
    • 2008
  • As the size of the web is growing explosively, search engines are becoming increasingly important as the primary means to retrieve information from the Internet. A search engine periodically downloads web pages and stores them in the database to provide readers with up-to-date search results. The web crawler is a program that downloads and stores web pages for this purpose. A large-scale search engines uses a parallel web crawler to retrieve the collection of web pages maximizing the download rate. However, the service architecture or experimental analysis of parallel web crawlers has not been fully discussed in the literature. In this paper, we propose an architecture of the parallel web crawler and discuss implementation issues in detail. The proposed parallel web crawler is based on the coordinator/agent model using multiple machines to download web pages in parallel. The coordinator/agent model consists of multiple agent machines to collect web pages and a single coordinator machine to manage them. The parallel web crawler consists of three components: a crawling module for collecting web pages, a converting module for transforming the web pages into a database-friendly format, a ranking module for rating web pages based on their relative importance. We explain each component of the parallel web crawler and implementation methods in detail. Finally, we conduct extensive experiments to analyze the effectiveness of the parallel web crawler. The experimental results clarify the merit of our architecture in that the proposed parallel web crawler is scalable to the number of web pages to crawl and the number of machines used.

A Hierarchical Server Structure for Parallel Location Information Search of Mobile Hosts (이동 호스트의 병렬적 위치 정보 탐색을 위한 서버의 계층 구조)

  • Jeong, Gwang-Sik;Yu, Heon-Chang;Hwang, Jong-Seon
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.28 no.1_2
    • /
    • pp.80-89
    • /
    • 2001
  • The development in the mobile computing systems have arisen new and previously unforeseen problems, such as problems in information management of mobile host, disconnection of mobile host and low bandwidths of wireless communications. Especially, location information management strategy of mobile host results in an increased overhead in mobile computing systems. Due to the mobility of the mobiles host, the changes in the mobile host's address depends on the mobile host's location, and is maintained by mapping physical address on virtual address, Since previously suggested several strategies for mapping method between physical address and virtual address did not tackle the increase of mobile host and distribution of location information, it was not able to support the scalability in mobile computing systems. Thus, to distribute the location inrormation, we propose an advanced n-depth LiST (Location information Search Tree) and the parallel location search and update strategy based on the advanced n-depth LiST. The advanced n-depth LiST is logically a hierarchical structure that clusters the location information server by ring structure and reduces the location information search and update cost by parallel seatch and updated method. The experiment shows that even though the distance of two MHs that communicate with each other is large, due to the strnctural distribution of location information, advanced n-depth LiST results in good performance. Moreover, despite the reduction in the location information search cost, there was no increase in the location information update cost.

  • PDF

Solving the Monkey and Banana Problem Using DNA Computing (DNA 컴퓨팅을 이용한 원숭이와 바나나 문제 해결)

  • 박의준;이인희;장병탁
    • Korean Journal of Cognitive Science
    • /
    • v.14 no.2
    • /
    • pp.15-25
    • /
    • 2003
  • The Monkey and Banana Problem is an example commonly used for illustrating simple problem solving. It can be solved by conventional approaches, but this requires a procedural aspect when inferences are processed, and this fact works as a limitation condition in solving complex problems. However, if we use DNA computing methods which are naturally able to realize massive parallel processing. the Monkey and Banana Problem can be solved effectively without weakening the fundamental aims above. In this paper, we design a method of representing the problem using DNA molecules, and show that various solutions are generated through computer-simulations based on the design. The simulation results are obviously interesting in that these are contrary to the fact that the Prolog program for the Monkey and Banana Problem, which was implemented from the conventional point of view, gives us only one optimal solution. That is, DNA computing overcomes the limitations of conventional approaches.

  • PDF