• Title/Summary/Keyword: distributed parallel processing

Analysis of Implementing Mobile Heterogeneous Computing for Image Sequence Processing

  • BAEK, Aram;LEE, Kangwoon;KIM, Jae-Gon;CHOI, Haechul
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.10 / pp.4948-4967 / 2017
  • On mobile devices, image sequences are widely used in multimedia applications such as computer vision, video enhancement, and augmented reality. However, real-time processing on mobile devices remains a challenge because of device constraints and the demand for higher-resolution images. Recently, heterogeneous computing methods that use both a central processing unit (CPU) and a graphics processing unit (GPU) have been studied to accelerate image sequence processing. This paper examines several optimization techniques: parallel processing by the CPU and GPU, distributed processing on the CPU, frame buffer objects, and double buffering for parallel and/or distributed tasks. Using these techniques both individually and in combination, several heterogeneous computing structures were implemented and their effectiveness was analyzed. The experimental results show that heterogeneous computing runs up to 3.5 times faster than CPU-only processing.

Generic Scheduling Method for Distributed Parallel Systems (분산병렬 시스템에서 유전자 알고리즘을 이용한 스케쥴링 방법)

  • Kim, Hwa-Sung
    • The Journal of Korean Institute of Communications and Information Sciences / v.28 no.1B / pp.27-32 / 2003
  • This paper presents the Genetic Algorithm based Task Scheduling (GATS) method for scheduling programs with diverse embedded parallelism types in Distributed Parallel Systems, which consist of a set of loosely coupled parallel and vector machines connected via high-speed networks. Distributed parallel processing tries to solve computationally intensive problems that have several types of parallelism on a suite of high-performance and parallel machines in a manner that best utilizes the capabilities of each machine. When scheduling in distributed parallel systems, matching the parallelism characteristics of tasks and parallel machines, rather than load balancing, should be handled carefully while minimizing communication cost in order to obtain more speedup. This paper proposes the based initialization methods for an initial population and the knowledge-based mutation methods to accommodate parallelism type matching in genetic algorithms.

Parallel Implementation of Distributed Sample Scrambler (분산표본혼화기의 병렬구현)

  • 정헌주;김재형;정성현;박승철
    • Proceedings of the IEEK Conference / 1998.06a / pp.62-65 / 1998
  • This paper presents a method for, and an implementation of, a parallel distributed sample scrambler (DSS) for the cell-based ATM transmission environment. Serial processing requires a very high-speed clock because the processing clock of the serial DSS equals the data transmission speed. In this paper, we develop a method for converting the serial SRG (shift register generator) into an 8-bit parallel realization. The parallel form raises a sample data processing problem that is characteristic of the DSS, so a theory of correction time movement is presented to solve it. We have developed an ASIC using this algorithm and verified it against ITU-T Recommendation I.432.
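The serial-to-parallel conversion mentioned in this abstract can be illustrated with a generic sketch. It assumes a Fibonacci-style shift register whose length `N` and feedback positions `TAPS` are illustrative values chosen here, not the generator used in the paper or in I.432, and it does not model the DSS sample and correction logic. Because one serial clock is a linear map over GF(2), the effect of eight clocks can be precomputed per state bit and combined with XOR.

```python
N = 7            # register length (illustrative assumption)
TAPS = (6, 5)    # state bits XORed to form the feedback (illustrative assumption)
MASK = (1 << N) - 1


def serial_step(state):
    """One clock of the serial SRG: returns (next_state, output_bit)."""
    out = (state >> (N - 1)) & 1
    fb = 0
    for t in TAPS:
        fb ^= (state >> t) & 1
    return ((state << 1) | fb) & MASK, out


def build_parallel_tables(width=8):
    """For each single-bit state, record the state and outputs after `width` clocks."""
    next_cols, out_cols = [], []
    for i in range(N):
        s, outs = 1 << i, 0
        for k in range(width):
            s, o = serial_step(s)
            outs |= o << k          # bit k holds the k-th serial output
        next_cols.append(s)
        out_cols.append(outs)
    return next_cols, out_cols


def parallel_step(state, tables):
    """Advance `width` clocks at once by XOR-combining the precomputed columns."""
    next_cols, out_cols = tables
    nxt = outs = 0
    for i in range(N):
        if (state >> i) & 1:
            nxt ^= next_cols[i]
            outs ^= out_cols[i]
    return nxt, outs
```

Calling `parallel_step(state, build_parallel_tables(8))` yields the same next state and the same eight output bits as eight calls to `serial_step`, which is why the parallel form can run at one eighth of the line clock.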

An Analysis of Existing Studies on Parallel and Distributed Processing of the Rete Algorithm (Rete 알고리즘의 병렬 및 분산 처리에 관한 기존 연구 분석)

  • Kim, Jaehoon
    • The Journal of Korean Institute of Information Technology / v.17 no.7 / pp.31-45 / 2019
  • The core technologies for intelligent services today are deep learning (that is, neural networks) and parallel and distributed processing technologies such as GPU parallel computing and big data. However, for future intelligent services and knowledge-sharing services built on globally shared ontologies, there is a technology better suited than neural networks to representing and reasoning over knowledge: IF-THEN knowledge representation in RIF or SWRL, the standard rule languages of the Semantic Web, which can be inferred efficiently using the Rete algorithm. However, when the number of rules processed by the Rete algorithm on a single computer reaches 100,000, performance becomes very poor, taking several tens of minutes, which is an obvious limitation. Therefore, in this paper, we analyze past and current studies on parallel and distributed processing of the Rete algorithm and examine what aspects should be considered to implement an efficient Rete algorithm.

Energy-efficient Routing in MIMO-based Mobile Ad hoc Networks with Multiplexing and Diversity Gains

  • Shen, Hu;Lv, Shaohe;Wang, Xiaodong;Zhou, Xingming
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.2 / pp.700-713 / 2015
  • It is critical to design energy-efficient routing protocols for battery-limited mobile ad hoc networks, especially those in which energy-consuming MIMO techniques are employed. However, there are several challenges in such a design: first, it is difficult to characterize the energy consumption of a MIMO-based link; second, without a careful design, the broadcast RREP packets, which are used in most energy-efficient routing protocols, could flood the network, and the destination node cannot decide when to reply to the communication request; third, due to node mobility and persistent channel degradation, the selected route paths would break down frequently and hence the protocol overhead is increased further. To address these issues, in this paper, a novel Greedy Energy-Efficient Routing (GEER) protocol is proposed: (a) a generalized energy consumption model for the MIMO-based link, considering the trade-off between multiplexing and diversity gains, is derived to minimize link energy consumption and obtain the optimal transmit model; (b) a simple greedy route discovery algorithm and a novel adaptive reply strategy are adopted to speed up path setup with a reduced establishment overhead; (c) a lightweight route maintenance mechanism is introduced to adaptively rebuild the broken links. Extensive simulation results show that, in comparison with conventional solutions, the proposed GEER protocol can reduce energy consumption by up to 68.74%.

A Fast Transmission of Mobile Agents Using Binomial Trees (바이노미얼 트리를 이용한 이동 에이전트의 빠른 전송)

  • Cho, Soo-Hyun;Kim, Young-Hak
    • The KIPS Transactions:PartA / v.9A no.3 / pp.341-350 / 2002
  • As network environments have improved and Internet use has increased, mobile agent technologies have become widely used in information retrieval, network management, electronic commerce, and parallel/distributed processing. Recently, many researchers have studied parallel/distributed processing based on mobile agents. SPMD is a parallel processing method that transmits a program to all the computers participating in the parallel environment and has each perform the work on different data. Transmitting the program quickly to all the computers is therefore an important factor in reducing total execution time. In this paper, we consider a parallel environment consisting of a mobile agent system and propose a new method that uses binomial trees to transmit mobile agent code quickly to all the computers so that SPMD parallel processing can be performed efficiently. The proposed method is compared with other methods through experimental evaluation on IBM's Aglets and shows considerably better performance. This paper also deals with tolerance of the faults that can occur when transmitting a mobile agent using binomial trees.
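The idea of spreading code over a binomial tree can be sketched as a round-by-round send schedule. This is a generic illustration of binomial-tree dissemination, not the authors' protocol: the host numbering (host 0 initially holds the agent code) and the function name `binomial_broadcast_schedule` are assumptions made here. In each round every host that already holds the code forwards it to one new host, so the number of holders doubles and `n` hosts are covered in about log2(n) rounds rather than the n-1 sequential sends of a single sender.

```python
def binomial_broadcast_schedule(n):
    """Return a list of rounds; each round is a list of (sender, receiver) pairs.

    Hosts are numbered 0..n-1 and host 0 holds the code at the start
    (a numbering assumed for this sketch).
    """
    rounds = []
    have = 1  # hosts 0..have-1 currently hold the code
    while have < n:
        # Every current holder sends to the host `have` positions above it.
        sends = [(src, src + have) for src in range(have) if src + have < n]
        rounds.append(sends)
        have = min(2 * have, n)
    return rounds


if __name__ == "__main__":
    for i, sends in enumerate(binomial_broadcast_schedule(8)):
        print(f"round {i}: {sends}")
```

For eight hosts the schedule has three rounds with one, two, and four transfers, and every host other than host 0 receives the code exactly once.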

Parallel Scrambling Techniques for SDH and ATM Transmissions (SDH와 ATM 전송을 위한 병렬혼화 기법)

  • 김석창;이병기
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.8 / pp.1146-1158 / 1993
  • In this paper, parallel scrambling techniques are considered for practical use in SDH transmission and ATM transmission. In ATM transmission, there are two ways of transmitting ATM cells, SDH-based and cell-based, and the corresponding scrambling techniques differ accordingly. For SDH transmission and SDH-based ATM transmission, FSS (frame synchronous scrambling) is applied to the STM frames, while for cell-based ATM transmission, DSS (distributed sample scrambling) is used on the ATM cell stream. The parallel scrambling techniques are examined for the FSS and the DSS and applied to obtain parallel FSSs for use in SDH and SDH-based ATM transmission, along with a parallel DSS applicable to cell-based ATM transmission. The resulting (8, 4) PSRG (parallel shift register generator) and (8, 16) PSRG based parallel scramblings are directly applicable to STM-1 rate processing of the STM-4 and STM-16 scramblings, respectively. Likewise, the resulting (1, 8) PSRG and double-sampling-double-correction based parallel scrambling techniques can be used in practice for low-rate processing of SDH-based and cell-based ATM signal scrambling, respectively.

Proposition and Evaluation of Parallelism-Independent Scheduling Algorithms for DAGs of Tasks with Non-Uniform Execution Time

  • Kirilka Nikolova;Atusi Maeda;Sowa, Masa-Hiro
    • Proceedings of the IEEK Conference / 2000.07a / pp.289-293 / 2000
  • We propose two new algorithms for parallelism-independent scheduling. The machine code generated by a compiler that uses these algorithms in its scheduling phase is parallelism-independent code, executable in minimum time regardless of the number of processors in the parallel computer. Our new algorithms have the following phases: finding the minimum number of processors on which the program can be executed in minimal time, scheduling by a heuristic algorithm for this predefined number of processors, and serialization of the parallel schedule according to the earliest start time of the tasks. At run time, tasks are taken from the serialized schedule and assigned to the processor that allows the earliest start time of the task. The order of the tasks decided at compile time is not changed at run time regardless of the number of available processors, which means there is no out-of-order issue and execution. The scheduling is done predominantly at compile time, and dynamic scheduling is reduced to allocating the tasks to the processors. We evaluate the proposed algorithms by comparing them in terms of schedule length to the CP/MISF algorithm. For performance evaluation we use both randomly generated DAGs (directed acyclic graphs) and DAGs representing real applications. From a practical point of view, the algorithms we propose can be used for scheduling programs for in-order superscalar processors and shared memory multiprocessor systems. Superscalar processors with any number of functional units can execute the parallelism-independent code in minimum time without the need for dynamic scheduling and out-of-order issue hardware. This means that the use of our algorithms will reduce the complexity of the processor hardware and the run-time overhead related to dynamic scheduling.
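The run-time step described in this abstract, taking tasks in their fixed serialized order and giving each to the processor that allows the earliest start, can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithms: the function name `run_serialized`, the dictionary-based task description, and the omission of communication costs are all choices made here, and the compile-time phases that produce the serialized order are not shown.

```python
def run_serialized(order, duration, preds, num_procs):
    """Assign tasks, in the given fixed order, to the processor with the earliest start.

    order     -- task names in serialized (compile-time) order
    duration  -- dict: task -> execution time
    preds     -- dict: task -> iterable of predecessor tasks
    num_procs -- number of available processors
    Returns (placement, makespan), where placement maps task -> (processor, start).
    """
    proc_free = [0] * num_procs   # time at which each processor becomes idle
    finish = {}
    placement = {}
    for task in order:
        # A task is ready once all of its predecessors have finished.
        ready = max((finish[p] for p in preds.get(task, ())), default=0)
        # Pick the processor on which this task can start earliest.
        best = min(range(num_procs), key=lambda i: max(proc_free[i], ready))
        start = max(proc_free[best], ready)
        finish[task] = start + duration[task]
        proc_free[best] = finish[task]
        placement[task] = (best, start)
    return placement, max(finish.values(), default=0)


if __name__ == "__main__":
    duration = {"a": 2, "b": 3, "c": 1}
    preds = {"c": ("a", "b")}
    for p in (1, 2):
        print(p, run_serialized(["a", "b", "c"], duration, preds, p))
```

The same serialized order `a, b, c` is used for one and for two processors; only the allocation changes, which is the sense in which the order fixed at compile time is independent of the processor count.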

FAST Design for Large-Scale Satellite Image Processing (대용량 위성영상 처리를 위한 FAST 시스템 설계)

  • Lee, Youngrim;Park, Wanyong;Park, Hyunchun;Shin, Daesik
    • Journal of the Korea Institute of Military Science and Technology / v.25 no.4 / pp.372-380 / 2022
  • This study proposes a distributed parallel processing system, called the Fast Analysis System for remote sensing daTa (FAST), for large-scale satellite image processing and analysis. FAST is a system that designs jobs in vertices and sequences, and distributes and processes them simultaneously. FAST manages data based on the Hadoop Distributed File System, controls entire jobs based on Apache Spark, and performs tasks in parallel on multiple slave nodes based on a Docker container design. FAST enables the high-performance processing of progressively accumulated large-volume satellite images. Because each unit task is performed in Docker, existing source code can be reused for designing and implementing unit tasks. Additionally, the system is robust against software/hardware faults. To prove the capability of the proposed system, we performed an experiment to generate ortho-images from the original satellite images, which is a pre-processing step for all image analyses. In the experiment, when FAST was configured with eight slave nodes, the processing of a satellite image took less than 30 sec. Through these results, we proved the suitability and practical applicability of the FAST design.

Parallel Implementation of Nonlinear Analysis Program of PSC Frame Using MPI (MPI를 이용한 PSC 프레임 비선형해석 프로그램의 병렬화)

  • 이재석;최규천
    • Proceedings of the Computational Structural Engineering Institute Conference / 2001.04a / pp.61-68 / 2001
  • A parallel nonlinear analysis program for prestressed concrete frames is migrated to a PC cluster system and to a massively parallel processing system, the CRAY T3E, using MPI. The PC cluster system is configured with Pentium Ⅲ class PCs and Fast Ethernet. The CRAY T3E system is composed of a set of nodes, each containing one Processing Element (PE), a memory subsystem, and its distributed memory interconnect network. Parallel computing algorithms are implemented in the element-wise processing parts, including the calculation of the stiffness matrix and element stresses, the determination of material states, the check of material failure, and the calculation of unbalanced loads. Parallel performance of the migrated program is evaluated through typical numerical examples.
