• Title/Summary/Keyword: Parallel Computing Method

A Study on Improvement of Low-power Memory Architecture in IoT/edge Computing (IoT/에지 컴퓨팅에서 저전력 메모리 아키텍처의 개선 연구)

  • Cho, Doosan
    • Journal of the Korean Society of Industry Convergence / v.24 no.1 / pp.69-77 / 2021
  • Low-cost design methodologies are widely used for IoT devices. In such networked devices, memory is composed of flash memory, SRAM, DRAM, etc., and because the device processes a large amount of data, memory design is an important factor in system performance. Each device therefore selects design factors such as function, performance, and cost according to market demand. The memory architectures available for low-cost IoT devices are very limited, typically a configuration of SRAM, flash memory, and DRAM. To process as much data as possible in the same space, an architecture that supports parallel processing units is usually provided. Such a parallel architecture provides high performance at low cost, but it requires precise software techniques for mapping instructions and data onto the parallel architecture. This paper proposes an instruction/data mapping method to support optimized parallel processing performance. The proposed method optimizes system performance by actively using hardware and software parallelism.

High Performance Hybrid Direct-Iterative Solution Method for Large Scale Structural Analysis Problems

  • Kim, Min-Ki;Kim, Seung-Jo
    • International Journal of Aeronautical and Space Sciences / v.9 no.2 / pp.79-86 / 2008
  • A high-performance direct-iterative hybrid linear solver for large-scale finite element problems is developed. Direct solution methods are robust but difficult to parallelize, whereas iterative solution methods parallelize well but are less robust. Combining the two is therefore desirable to obtain both high parallel efficiency and numerical robustness for large-scale structural analysis problems. The hybrid method presented in this paper is based on FETI-DP (the Finite Element Tearing and Interconnecting Dual-Primal method), which has good parallel scalability and efficiency. It is suitable for second- and fourth-order elliptic finite element problems, including structural analysis problems. We use a hybrid of these two categories of solution method, combining a multifrontal solver with a FETI-DP-based iterative solver. The hybrid solver is implemented in our general structural analysis code, IPSAP.

A Parallel Genetic Algorithm for Solving Deadlock Problem within Multi-Unit Resources Systems

  • Ahmed, Rabie;Saidani, Taoufik;Rababa, Malek
    • International Journal of Computer Science & Network Security / v.21 no.12 / pp.175-182 / 2021
  • Deadlock is a situation in which two or more processes competing for resources each wait for the others to finish, and none ever does. There are two kinds of systems, multi-unit and single-unit resource systems, which differ in the number of instances (or units) of each resource type. The deadlock problem can be modeled as a constrained combinatorial problem that seeks a feasible schedule of processes by which the system avoids entering a deadlock state. Several algorithms and techniques have been introduced to solve the deadlock problem, and metaheuristics are among the most powerful. Genetic algorithms have been effective in solving many optimization problems, including the deadlock problem. In this paper, an improved parallel framework for the genetic algorithm is introduced and adapted to the deadlock problem. The proposed method is implemented in Java and tested on a specific dataset. The experiment shows that the proposed approach can produce optimal solutions in terms of burst time and the number of feasible solutions in each successive generation. Further, the proposed approach enables all types of crossover to work with high performance.

Development of a Monitoring and Verification Tool for Sensor Fusion (센서융합 검증을 위한 실시간 모니터링 및 검증 도구 개발)

  • Kim, Hyunwoo;Shin, Seunghwan;Bae, Sangjin
    • Transactions of the Korean Society of Automotive Engineers / v.22 no.3 / pp.123-129 / 2014
  • SCC (Smart Cruise Control) and AEBS (Autonomous Emergency Braking System) use data from various types of sensors, so it is important to consider the reliability of the sensor data. In this paper, data from radar and vision sensors are fused by applying a Bayesian sensor fusion technique to improve the reliability of the sensor data. The paper then presents a sensor fusion verification tool developed to monitor the acquired sensor data and to verify sensor fusion results efficiently. A parallel computing method was applied to reduce verification time, and a series of simulation results for this method are discussed in detail.
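The abstract above does not say which Bayesian formulation the authors use. As a minimal sketch of the general idea, the following assumes that radar and vision each give an independent Gaussian estimate of the same range, in which case the Bayesian posterior is the precision-weighted combination. The function name `fuse_gaussian` and all numeric readings are hypothetical illustrations, not values from the paper.

```python
def fuse_gaussian(mean_a, var_a, mean_b, var_b):
    """Fuse two independent Gaussian estimates of the same quantity.

    Assumption (not from the paper): both sensors report an unbiased
    estimate with known variance, and their errors are independent.
    """
    if var_a <= 0 or var_b <= 0:
        raise ValueError("variances must be positive")
    # The posterior precision is the sum of the individual precisions.
    precision = 1.0 / var_a + 1.0 / var_b
    fused_var = 1.0 / precision
    # The posterior mean weights each estimate by its precision.
    fused_mean = fused_var * (mean_a / var_a + mean_b / var_b)
    return fused_mean, fused_var


# Hypothetical readings: range to the lead vehicle in metres.
radar_range, radar_var = 50.0, 1.0    # radar assumed more precise
vision_range, vision_var = 52.0, 4.0  # vision assumed noisier

fused_range, fused_var = fuse_gaussian(radar_range, radar_var,
                                       vision_range, vision_var)
print(f"fused range = {fused_range:.2f} m, variance = {fused_var:.2f}")
```

The fused variance is always smaller than either input variance, which is the sense in which fusion improves reliability under these assumptions.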

Fast Hologram Generating of 3D Object with Super Multi-Light Source using Parallel Distributed Computing (병렬 분산 컴퓨팅을 이용한 초다광원 3차원 물체의 홀로그램 고속 생성)

  • Song, Joongseok;Kim, Changseob;Park, Jong-Il
    • Journal of Broadcast Engineering / v.20 no.5 / pp.706-717 / 2015
  • The computer-generated hologram (CGH) method can generate a hologram using only a commonly available personal computer (PC). However, the CGH method requires a huge amount of computation time for a 3D object with a super multi-light source or for a high-definition hologram. Solutions are therefore needed that reduce the computational complexity of the CGH algorithm or increase the computing performance of the hardware. In this paper, we propose a method that generates a digital hologram of a 3D object with a super multi-light source using parallel distributed computing. Traditional methods are limited in how far they can improve CGH performance because they use a single PC. In the proposed method, a server PC efficiently uses the computing power of client PCs, so the CGH of a 3D object with a super multi-light source can be calculated quickly. In the experiments, we verified that the proposed method can generate a 1,536×1,536 digital hologram of a 3D object with 157,771 light sources in 121 ms. We also verified that the proposed method reduces the generation time of a digital hologram in proportion to the number of client PCs.

Real-Time IoT Big-data Processing for Stream Reasoning (스트림-리즈닝을 위한 실시간 사물인터넷 빅-데이터 처리)

  • Yun, Chang Ho;Park, Jong Won;Jung, Hae Sun;Lee, Yong Woo
    • Journal of Internet Computing and Services / v.18 no.3 / pp.1-9 / 2017
  • Smart Cities intelligently manage numerous infrastructures, including Smart-City IoT devices, and provide a variety of smart-city applications to citizens. To provide the information needed by smart-city applications, Smart Cities must intelligently process large-scale streamed big data that are constantly generated by a large number of IoT devices. To provide smart services in a Smart City, the Smart-City Consortium uses stream reasoning, which requires real-time processing of big data. However, real-time processing of large-scale streamed big data in Smart Cities has limitations. In this paper, we introduce part of our research on cloud-computing-based real-time distributed parallel processing for stream reasoning over IoT big data in Smart Cities. The Smart-City Consortium has previously introduced its smart-city middleware. In the research for this paper, we made cloud-computing-based real-time distributed parallel processing available on the cloud computing platform of that middleware. This paper introduces a real-time distributed parallel processing method and system for stream reasoning with IoT big data transmitted from various Smart-City sensors, and evaluates the performance of the system in which the method is implemented.

Parallel Computation on the Three-dimensional Electromagnetic Field by the Graph Partitioning and Multi-frontal Method (그래프 분할 및 다중 프론탈 기법에 의거한 3차원 전자기장의 병렬 해석)

  • Kang, Seung-Hoon;Song, Dong-Hyeon;Choi, JaeWon;Shin, SangJoon
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.50 no.12 / pp.889-898 / 2022
  • In this paper, a parallel computing method for the three-dimensional electromagnetic field is proposed. The electromagnetic scattering analysis is based on the time-harmonic vector wave equation and the finite element method. Edge-based elements and a second-order absorbing boundary condition are used. The elemental numerical integration and the matrix assembly are parallelized by allocating a partitioned finite element subdomain to each processor. The graph partitioning library METIS is employed to generate the subdomains. The large sparse matrix computation is performed by MUMPS, a parallel computing library based on the multi-frontal method. The accuracy of the program is validated by comparison against the Mie-series analytical solution and results from ANSYS HFSS. In addition, scalability is verified by measuring the speed-up as a function of the number of processors used. The analysis is performed for a perfect electric conductor sphere, isotropic and anisotropic dielectric spheres, and a missile configuration. The algorithm of the present program will be applied to the finite element and tearing method, aiming at further improved parallel computing performance.
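Scalability studies such as the one mentioned in the abstract above usually report speed-up and parallel efficiency relative to a single-processor run. The sketch below shows that arithmetic only. The function name `speedup_table` and the timings are hypothetical and are not results from the paper.

```python
def speedup_table(timings):
    """Return (processors, speed-up, efficiency) rows.

    timings: dict mapping processor count -> wall-clock seconds.
    It must contain an entry for 1 processor, used as the baseline.
    """
    base = timings[1]
    rows = []
    for procs in sorted(timings):
        speedup = base / timings[procs]   # S(p) = T(1) / T(p)
        efficiency = speedup / procs      # E(p) = S(p) / p
        rows.append((procs, speedup, efficiency))
    return rows


# Hypothetical wall-clock timings in seconds (illustrative only).
sample_timings = {1: 120.0, 2: 64.0, 4: 35.0, 8: 20.0}

rows = speedup_table(sample_timings)
for procs, speedup, efficiency in rows:
    print(f"{procs:>3} procs  speed-up {speedup:5.2f}  efficiency {efficiency:4.2f}")
```

An efficiency that falls as the processor count grows typically points to communication overhead or load imbalance between subdomains.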

Parallel Contact Treatment and Parallel Performance of Impact Simulation Based on Lagrangian Scheme (Lagrangian 기법에 의한 충돌 해석 시 접촉처리의 병렬화 및 병렬효율 평가)

  • Back, Seung-Hoon;Kim, Seung-Jo;Lee, Min-Hyung
    • Transactions of the Korean Society of Mechanical Engineers A / v.30 no.11 s.254 / pp.1447-1454 / 2006
  • Evaluating the parallel performance of a high-speed impact simulation is not an easy task, because developing a parallel explicit code is difficult and a large number of processors is not easily accessible. In this paper, the parallel performance of a new Lagrangian FEM impact code run on a cluster supercomputer is described for the high-speed range. In the case of a metal sphere impacting an oblique plate, the overall speed-up increases continuously even up to 128 CPUs. Investigation of the elapsed time of each part reveals that most of the inefficiency comes from load imbalance in the contact treatment.

A dynamic analysis algorithm for RC frames using parallel GPU strategies

  • Li, Hongyu;Li, Zuohua;Teng, Jun
    • Computers and Concrete / v.18 no.5 / pp.1019-1039 / 2016
  • In this paper, a parallel algorithm for nonlinear dynamic analysis of three-dimensional (3D) reinforced concrete (RC) frame structures on a graphics processing unit (GPU) platform is proposed. Time integration is performed using the Newmark method for nonlinear implicit dynamic analysis, and parallelization strategies are presented. Correspondingly, a parallel Preconditioned Conjugate Gradient (PCG) solver on the GPU is introduced for the repeated solution of the equilibrium equations at each time step. The RC frames were simulated using a fiber beam model to capture the nonlinear behavior of concrete and reinforcing bars. The parallel finite element program is developed using the Compute Unified Device Architecture (CUDA). The accuracy of the GPU-based parallel program, in both single and double precision, was verified by comparison with ABAQUS. The numerical results demonstrated that the proposed algorithm can take full advantage of the parallel architecture of the GPU and speeds up the computation compared with a CPU.
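For readers unfamiliar with PCG, the following is a minimal serial sketch of the algorithm in plain Python. It is not the authors' GPU implementation: the abstract above does not name the preconditioner, so a Jacobi (diagonal) preconditioner is assumed here, and the function name `pcg_jacobi` and the 2×2 test system are hypothetical. The matrix is assumed to be symmetric positive definite.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [dot(row, x) for row in A]


def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    Uses conjugate gradients with a Jacobi preconditioner
    (an assumption; the paper does not specify its choice).
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    inv_diag = [1.0 / A[i][i] for i in range(n)]  # M^-1 for M = diag(A)
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                # converged on residual norm
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)                   # step length along p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz                        # keeps directions A-conjugate
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x


# Hypothetical small SPD system standing in for one time step's equations.
A = [[4.0, 1.0],
     [1.0, 3.0]]
b = [1.0, 2.0]
x = pcg_jacobi(A, b)
print(x)
```

The matrix-vector product and the vector updates dominate the cost and are independent across entries, which is why this solver maps naturally onto GPU threads.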

A STUDY ON THE EFFICIENCY OF AERODYNAMIC DESIGN OPTIMIZATION IN DISTRIBUTED COMPUTING ENVIRONMENT (분산컴퓨팅 환경에서 공력 설계최적화의 효율성 연구)

  • Kim Y.J.;Jung H.J.;Kim T.S.;Son C.H.;Joh C.Y.
    • Journal of Computational Fluids Engineering / v.11 no.2 s.33 / pp.19-24 / 2006
  • This study evaluates the efficiency of aerodynamic design optimization in a distributed computing environment. The aerodynamic analyses, which account for most of the computational work during design optimization, were divided into several jobs and allocated to PC clients over a network. This is not a parallel process based on domain decomposition within a single analysis, but rather a set of simultaneous distributed analyses run on network-distributed computers. GBOM (Gradient-Based Optimization Method), SAO (Sequential Approximate Optimization), and RSM (Response Surface Method) were implemented to perform design optimization of transonic airfoils and to evaluate their efficiencies. The one-dimensional minimization that follows the direction search in GBOM was found to be an obstacle to improving the efficiency of the design process in the present distributed computing system. SAO was found to be fairly suitable for the distributed computing environment, even though it has the handicap of being a local search. RSM is apparently the most efficient algorithm in the present distributed computing environment, but the additional trial-and-error work needed to make the approximation model reliable reduces its efficiency from a practical point of view.
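The job-level distribution described in the abstract above, with many independent analyses running at once instead of one analysis split by domain decomposition, can be sketched locally with a worker pool. This is only an illustration: local threads stand in for the networked PC clients, and `analyze`, its toy cost function, and the candidate designs are all hypothetical and are not the paper's flow solver or data.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze(design):
    """Hypothetical stand-in for one aerodynamic analysis.

    Returns a toy cost for a (thickness, camber) pair. A real analysis
    would run a flow solver here.
    """
    thickness, camber = design
    return (thickness - 0.12) ** 2 + (camber - 0.02) ** 2


def evaluate_designs(designs, workers=4):
    """Evaluate all designs concurrently, one independent job per design."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns the results in the same order as the inputs.
        return list(pool.map(analyze, designs))


# Hypothetical candidate designs, e.g. sample points for a response surface.
candidates = [(0.10, 0.01), (0.12, 0.02), (0.14, 0.03)]
costs = evaluate_designs(candidates)
best = candidates[costs.index(min(costs))]
print(best)
```

Methods that need many independent evaluations at once, such as building a response surface, keep every worker busy. A one-dimensional line search needs each result before it can choose the next point, which is consistent with the obstacle the abstract reports for GBOM.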