• Title/Abstract/Keywords: Parallel computing

High Performance Computing: Infrastructure, Application, and Operation

  • Park, Byung-Hoon;Kim, Youngjae;Kim, Byoung-Do;Hong, Taeyoung;Kim, Sungjun;Lee, John K.
    • Journal of Computing Science and Engineering / Vol. 6, No. 4 / pp. 280-286 / 2012
  • Recent decades have witnessed an increasingly indispensable role for high performance computing (HPC) in science, business, and the financial sector, as well as in military and national security areas. To introduce key aspects of HPC to a broader community, an HPC session was organized for the first time at the United States and Korea Conference (UKC) in 2012. This paper summarizes four invited talks, which cover scientific HPC applications, large-scale parallel file systems, the administration and maintenance of supercomputers, and green technology for building power-efficient supercomputers of the next generation.

Direct Numerical Simulation of Active Fiber Composite

  • 백승훈;김승조
    • Korean Society for Composite Materials: Conference Proceedings / Proceedings of the Korean Society for Composite Materials 2003 Spring Conference / pp. 5-9 / 2003
  • The stress and deflection of composite structures with embedded and/or attached Active Fiber Composite (AFC) are numerically investigated at the constituent level by Direct Numerical Simulation (DNS). The DNS approach, which models and simulates the fiber and matrix directly using 3D finite elements, needs to be solved efficiently. To handle this large-scale problem, a parallel program for solving piezoelectric behavior was developed and run in a parallel computing environment. The stress results from the DNS approach are also compared with those from a uniform field model.

Deep Web and MapReduce

  • Tao, Yufei
    • Journal of Computing Science and Engineering / Vol. 7, No. 3 / pp. 147-158 / 2013
  • This invited paper introduces results on Web science and technology obtained during work with the Korea Advanced Institute of Science and Technology. In the first part, we discuss algorithms for exploring the deep Web, which refers to the collection of Web pages that cannot be reached by conventional Web crawlers. In the second part, we discuss sorting algorithms on the MapReduce system, which has become a dominant paradigm for massively parallel computing.
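
To make the idea of sorting on MapReduce concrete, the following is a minimal single-process sketch of one common approach, sample-based range partitioning. It is not taken from the paper: the function name `mapreduce_sort` and the parameters `num_reducers`, `sample_size`, and `seed` are hypothetical, and the map, shuffle, and reduce phases are only simulated with in-memory lists.

```python
import bisect
import random
from collections import defaultdict


def mapreduce_sort(records, num_reducers=4, sample_size=32, seed=0):
    """Sort `records` with a simulated map/shuffle/reduce pipeline.

    Illustrative only: all names and defaults here are assumptions.
    """
    if not records:
        return []
    rng = random.Random(seed)

    # Pre-pass: sample the input and pick num_reducers - 1 split points
    # so that each reducer receives a contiguous key range.
    sample = sorted(rng.sample(records, min(sample_size, len(records))))
    step = len(sample) / num_reducers
    splitters = [sample[int(i * step)] for i in range(1, num_reducers)]

    # Map + shuffle: route each record to the reducer that owns its range.
    buckets = defaultdict(list)
    for rec in records:
        buckets[bisect.bisect_right(splitters, rec)].append(rec)

    # Reduce: each reducer sorts its own range independently.
    # Concatenating the ranges in reducer order gives the global order.
    output = []
    for reducer_id in range(num_reducers):
        output.extend(sorted(buckets[reducer_id]))
    return output
```

On a real cluster each bucket would sit on a different machine, so the quality of the sampled split points determines how evenly the sorting work is balanced across reducers.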

A Comparative Performance Study for Compute Node Sharing

  • Park, Jeho;Lam, Shui F.
    • Journal of Computing Science and Engineering / Vol. 6, No. 4 / pp. 287-293 / 2012
  • We introduce a methodology for studying the application-level performance of time-sharing parallel jobs on a set of compute nodes in high performance clusters and report our findings. We assume that parallel jobs arriving at a cluster must share a set of nodes with the jobs of other users: they compete for processor time in a time-sharing manner and for other limited resources such as memory and I/O in a space-sharing manner. Under this assumption, we developed a methodology to simulate job arrivals at a set of compute nodes and to gather and process performance data to calculate the percentage slowdown of parallel jobs. Our goal is to identify combinations of jobs that minimize the performance degradation caused by resource sharing and contention. Through our experiments, we found a couple of interesting behaviors of overlapped parallel jobs, which may be used to suggest alternative job allocation schemes that reduce the slowdowns that inevitably result from resource sharing on a high performance computing cluster. We suggest three job allocation strategies based on our empirical results and propose further studies of the results using a supercomputing facility at the San Diego Supercomputer Center.
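
The abstract does not define percentage slowdown, so the sketch below assumes the usual definition: the extra elapsed time under sharing, relative to the elapsed time on dedicated nodes. The function name and argument names are hypothetical.

```python
def percentage_slowdown(dedicated_time, shared_time):
    """Return the slowdown of a job, in percent, caused by node sharing.

    Assumed definition (not stated in the abstract):
        100 * (shared_time - dedicated_time) / dedicated_time
    Both times are elapsed wall-clock times in the same unit.
    """
    if dedicated_time <= 0:
        raise ValueError("dedicated_time must be positive")
    return 100.0 * (shared_time - dedicated_time) / dedicated_time
```

Under this definition, a job that takes 100 s alone and 150 s when sharing nodes has a slowdown of 50 percent.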

An Efficient Solution Method to MDO Problems in Sequential and Parallel Computing Environments

  • 이세정
    • Korean Journal of Computational Design and Engineering / Vol. 16, No. 3 / pp. 236-245 / 2011
  • Many researchers have recently studied multi-level formulation strategies for solving MDO problems. These strategies distribute the coupling compatibilities across all disciplines, whereas single-level formulations concentrate all the controls at the system level. In addition, approximation techniques have become remedies for computationally expensive analyses and simulations. This paper compares MDO methods with respect to computing performance in both conventional sequential and modern distributed/parallel processing environments. The comparisons show that the Individual Disciplinary Feasible (IDF) formulation is the most efficient for sequential processing and that IDF with approximation (IDFa) is the most efficient for parallel processing. Results on popular design examples support this finding. The author suggests that design engineers should first choose the IDF formulation to solve MDO problems because it is simple to implement and performs reasonably well. The single drawback of IDF is that it requires more memory for local design variables and coupling variables. Adding inexpensive memory can save engineers the time and effort that complicated multi-level formulations demand, and it frees them from the no-solution difficulties of the Multi-Disciplinary Analysis (MDA) in the Multi-Disciplinary Feasible (MDF) formulation.

COMPUTATIONAL EFFICIENCY OF A MODIFIED SCATTERING KERNEL FOR FULL-COUPLED PHOTON-ELECTRON TRANSPORT PARALLEL COMPUTING WITH UNSTRUCTURED TETRAHEDRAL MESHES

  • Kim, Jong Woon;Hong, Ser Gi;Lee, Young-Ouk
    • Nuclear Engineering and Technology / Vol. 46, No. 2 / pp. 263-272 / 2014
  • Scattering source calculations using the conventional spherical harmonic expansion may require considerable computation time to treat full-coupled three-dimensional photon-electron transport in a highly anisotropic scattering medium, where the scattering cross sections must be expanded with very high order (e.g., $P_7$ or higher) Legendre expansions. In this paper, we introduce a modified scattering kernel approach that avoids the unnecessarily repeated calculations involved in the scattering source calculation, and we use it with parallel computing to reduce the computation time effectively. Its computational efficiency was tested on three-dimensional full-coupled photon-electron transport problems using our computer program, which solves the multi-group discrete ordinates transport equation by the discontinuous finite element method with unstructured tetrahedral meshes for geometrically complicated problems. The numerical tests show that the modified scattering kernel reduces the elapsed time per iteration by a factor of 17 to 42, not only in single-CPU calculations but also in parallel computing with several CPUs.

MPI-GWAS: a supercomputing-aided permutation approach for genome-wide association studies

  • Paik, Hyojung;Cho, Yongseong;Cho, Seong Beom;Kwon, Oh-Kyoung
    • Genomics & Informatics / Vol. 20, No. 1 / pp. 14.1-14.4 / 2022
  • Permutation testing is a robust and popular approach for significance testing in genomic research that has the advantage of reducing inflated type 1 error rates; however, its computational cost is notorious in genome-wide association studies (GWAS). Here, we developed a supercomputing-aided approach to accelerate permutation testing for GWAS, based on the message-passing interface (MPI) on a parallel computing architecture. Our application, called MPI-GWAS, conducts MPI-based permutation testing using a parallel computing approach on our supercomputing system, Nurion (8,305 compute nodes and 563,740 central processing units [CPUs]). For $10^7$ permutations of one locus, MPI-GWAS completed the calculation in 600 s using 2,720 CPU cores. For $10^7$ permutations of ~30,000-50,000 loci in over 7,000 subjects, the total elapsed time was ~4 days on the Nurion supercomputer. Thus, MPI-GWAS makes permutation-based GWAS feasible within a reasonable time by harnessing the power of parallel computing resources.
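
Permutation testing parallelizes naturally because each permutation is independent. The sketch below illustrates only that decomposition and is not the MPI-GWAS implementation: it uses Python threads as stand-ins for MPI ranks, an absolute difference of group means as a stand-in test statistic, and the common `(hits + 1) / (n_perm + 1)` estimator. All function and parameter names are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def mean_diff(values, labels):
    """Absolute difference of group means (stand-in test statistic)."""
    cases = [v for v, lab in zip(values, labels) if lab == 1]
    controls = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def count_extreme(values, labels, observed, n_perm, seed):
    """One worker: count permutations at least as extreme as `observed`."""
    rng = random.Random(seed)  # independent random stream per worker
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mean_diff(values, shuffled) >= observed:
            hits += 1
    return hits


def permutation_p_value(values, labels, n_perm=2000, n_workers=4, seed=0):
    """Split n_perm permutations across workers, then reduce the counts."""
    observed = mean_diff(values, labels)
    share, extra = divmod(n_perm, n_workers)
    counts = [share + (1 if w < extra else 0) for w in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [
            pool.submit(count_extreme, values, labels, observed, c, seed + w)
            for w, c in enumerate(counts)
        ]
        hits = sum(f.result() for f in futures)  # plays the role of a reduce
    return (hits + 1) / (n_perm + 1)
```

Threads in CPython do not speed up this CPU-bound loop; the point is the structure, in which each worker needs only its seed and its share of permutations, and a single sum combines the results. With MPI, the same structure would map to one rank per worker and a reduction of the hit counts.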

A Parallel Computation of Finite Element Analysis on a Transputer System

  • 김근환;최경;정현교;이기식;한송엽
    • The Transactions of the Korean Institute of Electrical Engineers / Vol. 41, No. 7 / pp. 735-741 / 1992
  • This paper presents a parallel algorithm for finite element analysis using a relatively inexpensive transputer-based parallel system. The substructure method, which is highly parallel in nature, is used to improve parallel computing efficiency by splitting the whole structure into substructures. The proposed algorithm is applied to a simple two-dimensional magnetostatic problem. The total computation time decreases as the number of transputers increases, and the computational efficiency improves as the number of internal boundary nodes decreases.
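
The reported dependence on internal boundary nodes is consistent with the standard substructure (static condensation) formulation. The sketch below assumes that the unknowns of one substructure are split into interior unknowns $u_i$ and interface boundary unknowns $u_b$; the notation is illustrative and is not taken from the paper.

```latex
\begin{aligned}
\begin{bmatrix} K_{ii} & K_{ib} \\ K_{bi} & K_{bb} \end{bmatrix}
\begin{bmatrix} u_i \\ u_b \end{bmatrix}
&= \begin{bmatrix} f_i \\ f_b \end{bmatrix}, \\
\left(K_{bb} - K_{bi} K_{ii}^{-1} K_{ib}\right) u_b
&= f_b - K_{bi} K_{ii}^{-1} f_i, \\
u_i &= K_{ii}^{-1}\left(f_i - K_{ib} u_b\right).
\end{aligned}
```

The interior elimination and the final back-substitution involve only one substructure each, so they can run on separate processors. The condensed contributions of all substructures are assembled into a single interface system for $u_b$, which couples the processors; its size grows with the number of internal boundary nodes, so fewer such nodes leave less coupled work.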

Analysis of Programming Techniques for Creating Optimized CUDA Software

  • 김성수;김동헌;우상규;임인성
    • Journal of KIISE: Computing Practices and Letters / Vol. 16, No. 7 / pp. 775-787 / 2010
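  • Unlike general-purpose CPUs, the GPU (Graphics Processing Unit) has evolved as a specialized manycore streaming processor, and its strong parallel processing capability has recently led it to take over the role of the CPU in a growing number of areas. In line with this trend, NVIDIA has released CUDA (Compute Unified Device Architecture), a GPGPU (General Purpose GPU) architecture that provides a more flexible GPU programming environment. In general, developing efficient parallel software with the CUDA API requires an accurate understanding of the characteristics of the various elements of the GPU's computational structure. This paper describes optimization techniques for CUDA programming obtained through a variety of experiments and trial and error, and examines how these techniques affect the efficiency of program execution. In particular, for specific example problems, we experimentally analyze several rules that affect performance, such as effective access to the memory hierarchy, the core activation ratio (occupancy), and latency hiding, and thereby present concrete measures that can be applied to effective CUDA-based parallel programming in the future.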

Design of Web-based Parallel Computing Environment Using Aglet

  • 김윤호
    • Journal of the Korea Computer Industry Society / Vol. 3, No. 2 / pp. 209-216 / 2002
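  • The Web has the potential to be used not merely as a means of delivering and sharing information through a browser, but as an infrastructure for parallel computing in which a vast number of computing resources are connected. Compared with other existing methods, the Web-based approach to parallel computing has many advantages: ease of access for general users, scalability, ease of building a cost-effective parallel system, and the ability to use existing networks. The applet, which embodies the concept of mobile code in the Java language, suggests that a computation-intensive program can be partitioned into independent parallel tasks that are moved to and executed on multiple nodes on the Web. However, because of the restrictions of its security model, a Java applet can run only within a limited scope, and it lacks flexibility in that clients must connect to the hosts that contain the applet. In this paper, we therefore design a Web-based parallel computing environment using the Aglet (Agile applet), which can be regarded as a mobile agent that adds the ability to process tasks autonomously to the applet concept, and we analyze the technologies and structure needed to build such an environment. A simple simulation and analysis are also carried out in comparison with the applet-based approach.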