• Title/Abstract/Keywords: distributed parallel processing

257 search results, processing time 0.026 s

Analysis of Implementing Mobile Heterogeneous Computing for Image Sequence Processing

  • BAEK, Aram;LEE, Kangwoon;KIM, Jae-Gon;CHOI, Haechul
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 11, No. 10
    • /
    • pp.4948-4967
    • /
    • 2017
  • On mobile devices, image sequences are widely used for multimedia applications such as computer vision, video enhancement, and augmented reality. However, real-time processing on mobile devices is still a challenge because of resource constraints and demands for higher-resolution images. Recently, heterogeneous computing methods that utilize both a central processing unit (CPU) and a graphics processing unit (GPU) have been researched to accelerate image sequence processing. This paper deals with various optimization techniques such as parallel processing by the CPU and GPU, distributed processing on the CPU, frame buffer objects, and double buffering for parallel and/or distributed tasks. Using the optimization techniques both individually and in combination, several heterogeneous computing structures were implemented and their effectiveness was analyzed. The experimental results show that heterogeneous computing executes up to 3.5 times faster than CPU-only processing.

A Scheduling Method Using Genetic Algorithms in Distributed Parallel Systems (Generic Scheduling Method for Distributed Parallel Systems)

  • 김화성
    • 한국통신학회논문지
    • /
    • Vol. 28, No. 1B
    • /
    • pp.27-32
    • /
    • 2003
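  • This paper proposes a genetic-algorithm-based task scheduling method (Genetic Algorithm based Task Scheduling, GATS) for efficiently executing programs with various forms of inherent parallelism on distributed parallel systems built on high-speed networks. A distributed parallel system consists of multiple general-purpose, parallel, and vector computers connected through a high-speed network. The goal of distributed parallel processing is to solve computation-intensive problems that exhibit various forms of inherent parallelism by making the most of the differing capabilities of multiple high-performance and parallel computers. To obtain greater speedup through scheduling in such a system, matching the parallel characteristics of tasks to those of the parallel computers must be handled more carefully than load balancing across systems, and the communication overhead caused by task migration must be minimized. This paper proposes an initialization method and a knowledge-based mutation method so that the genetic algorithm operates with these parallel characteristics taken into account.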
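
The idea of a mutation operator that knows about parallelism types can be sketched in Python. This is only an illustration under stated assumptions, not the GATS method itself: the task list, machine list, mismatch penalty, random initialization, and one-point crossover are all hypothetical choices, and communication overhead is not modeled.

```python
import random

# Hypothetical workload: (parallelism type, base cost) per task.
TASKS = [("simd", 8), ("mimd", 6), ("vector", 9),
         ("simd", 4), ("mimd", 7), ("vector", 5)]
# Hypothetical machines, each best suited to one parallelism type.
MACHINES = ["simd", "mimd", "vector"]
MISMATCH_PENALTY = 3  # assumed slowdown factor when types do not match


def makespan(assignment):
    """Finish time of the busiest machine for a task -> machine assignment."""
    load = [0] * len(MACHINES)
    for (kind, cost), m in zip(TASKS, assignment):
        load[m] += cost if MACHINES[m] == kind else cost * MISMATCH_PENALTY
    return max(load)


def matching_machine(task_index, rng):
    """Pick a machine whose type matches the task (the 'knowledge')."""
    kind = TASKS[task_index][0]
    return rng.choice([m for m, mk in enumerate(MACHINES) if mk == kind])


def evolve(generations=50, pop_size=20, seed=0):
    rng = random.Random(seed)
    # Random initial population (the paper proposes its own initialization).
    pop = [[rng.randrange(len(MACHINES)) for _ in TASKS]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=makespan)
        survivors = pop[: pop_size // 2]          # keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(TASKS))
            child = a[:cut] + b[cut:]             # one-point crossover
            t = rng.randrange(len(TASKS))
            child[t] = matching_machine(t, rng)   # knowledge-based mutation
            children.append(child)
        pop = survivors + children
    return min(pop, key=makespan)


best = evolve()
print(best, makespan(best))
```

With these made-up costs, the fully matched assignment `[0, 1, 2, 0, 1, 2]` has a makespan of 14, which is the lower bound for this toy instance.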

Parallel Implementation of Distributed Sample Scrambler

  • 정헌주;김재형;정성현;박승철
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 1998년도 하계종합학술대회논문집
    • /
    • pp.62-65
    • /
    • 1998
  • This paper presents a method for, and an implementation of, the parallel distributed sample scrambler (DSS) in the cell-based ATM transmission environment. Serial processing requires a very high-speed clock because the processing clock of the serial DSS is equal to the data transmission speed. In this paper, we develop a method for converting the serial SRG (shift register generator) to an 8-bit parallel realization. This conversion raises a sample data processing problem that is characteristic of the DSS, so a theory of correction time movement is presented to solve it. We have developed an ASIC using this algorithm and verified it against ITU-T Recommendation I.432.
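
A serial-to-parallel SRG conversion of this kind can be sketched in Python as follows. This is a minimal sketch, not the paper's ASIC design: the generator polynomial x^31 + x^28 + 1, the Fibonacci-style register layout, and every function name are assumptions made for illustration, and the DSS sample conveyance and correction timing are not modeled. Each cell is tracked symbolically over GF(2) through eight serial clocks, which yields XOR equations that produce eight bits in one parallel clock.

```python
N = 31            # register length (assumed polynomial x^31 + x^28 + 1)
TAPS = (30, 27)   # cells XORed to form the feedback (0-based, assumed layout)


def serial_step(state):
    """One clock of the serial SRG: returns (next_state, output_bit)."""
    fb = 0
    for t in TAPS:
        fb ^= (state >> t) & 1
    out = (state >> (N - 1)) & 1
    nxt = ((state << 1) | fb) & ((1 << N) - 1)
    return nxt, out


def parallel_masks(width=8):
    """Run `width` serial clocks symbolically over GF(2).

    Each cell is a bitmask naming the ORIGINAL cells whose XOR it equals,
    so the result is a set of XOR equations usable in a single clock.
    """
    cells = [1 << i for i in range(N)]   # cell i starts as "itself"
    outs = []
    for _ in range(width):
        fb = 0
        for t in TAPS:
            fb ^= cells[t]
        outs.append(cells[N - 1])        # output is taken before the shift
        cells = [fb] + cells[:-1]        # shift: feedback enters at cell 0
    return cells, outs


def parity(x):
    return bin(x).count("1") & 1


def parallel_step(state, cells, outs):
    """Advance `len(outs)` bit times in one step using the XOR equations."""
    nxt = 0
    for i, mask in enumerate(cells):
        nxt |= parity(state & mask) << i
    return nxt, [parity(state & m) for m in outs]


# Check the 8-bit parallel form against eight serial clocks.
cells, outs = parallel_masks(8)
start = 0x1ACFFC1D & ((1 << N) - 1)      # arbitrary non-zero seed
s, serial_bits = start, []
for _ in range(8):
    s, bit = serial_step(s)
    serial_bits.append(bit)
assert parallel_step(start, cells, outs) == (s, serial_bits)
```

The parallel form runs at one eighth of the bit rate, which is the motivation the abstract gives for the conversion.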

An Analysis of Existing Studies on Parallel and Distributed Processing of the Rete Algorithm

  • 김재훈
    • 한국정보기술학회논문지
    • /
    • Vol. 17, No. 7
    • /
    • pp.31-45
    • /
    • 2019
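  • The core technologies of today's intelligent services are deep learning, that is, neural networks, together with parallel and distributed processing technologies such as GPU parallel computing and big data. However, for future intelligent services and knowledge-sharing services built on globally shared ontologies, there is a better method than neural networks for representing and reasoning over knowledge: the IF-THEN knowledge representation of RIF or SWRL, the standard rule languages of the Semantic Web. Such rules can be inferred efficiently using the rete algorithm. However, when a rete algorithm running on a single computer must process 100,000 rules, its performance degrades badly, to tens of minutes, and a clear limit exists. This paper therefore surveys and analyzes research, from past to present, on parallel and distributed processing of the rete algorithm, and uses it to examine which aspects should be considered for an efficient rete implementation.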

Energy-efficient Routing in MIMO-based Mobile Ad hoc Networks with Multiplexing and Diversity Gains

  • Shen, Hu;Lv, Shaohe;Wang, Xiaodong;Zhou, Xingming
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 9, No. 2
    • /
    • pp.700-713
    • /
    • 2015
  • It is critical to design energy-efficient routing protocols for battery-limited mobile ad hoc networks, especially those in which energy-consuming MIMO techniques are employed. However, there are several challenges in such a design: first, it is difficult to characterize the energy consumption of a MIMO-based link; second, without a careful design, the broadcast RREP packets, which are used in most energy-efficient routing protocols, could flood the network, and the destination node cannot decide when to reply to the communication request; third, due to node mobility and persistent channel degradation, the selected route paths would break down frequently and hence the protocol overhead is increased further. To address these issues, in this paper, a novel Greedy Energy-Efficient Routing (GEER) protocol is proposed: (a) a generalized energy consumption model for the MIMO-based link, considering the trade-off between multiplexing and diversity gains, is derived to minimize link energy consumption and obtain the optimal transmit model; (b) a simple greedy route discovery algorithm and a novel adaptive reply strategy are adopted to speed up path setup with a reduced establishment overhead; (c) a lightweight route maintenance mechanism is introduced to adaptively rebuild the broken links. Extensive simulation results show that, in comparison with conventional solutions, the proposed GEER protocol can reduce energy consumption by up to 68.74%.

A Fast Transmission of Mobile Agents Using Binomial Trees

  • 조수현;김영학
    • 정보처리학회논문지A
    • /
    • Vol. 9A, No. 3
    • /
    • pp.341-350
    • /
    • 2002
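  • As network environments improve and Internet use grows rapidly, mobile agent technology is being widely used in information retrieval, network management, e-commerce, and parallel/distributed processing. Recently, many researchers have been studying parallel/distributed processing concepts based on mobile agents. SPMD (Single Program Multiple Data) is a parallel processing method in which one program is sent to every computer participating in the parallel environment and each computer performs work using different data. Transmitting one program to all computers quickly is therefore one of the main factors in reducing total execution time. To perform SPMD parallel processing efficiently in a parallel environment composed of mobile agent systems, this paper proposes a new method that uses a binomial tree to quickly transmit a single mobile agent's code to all computers. The proposed method was compared with other methods through experimental evaluation on IBM's Aglets and showed considerably better performance. The paper also addresses fault tolerance issues that can arise during agent transmission in the binomial tree.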
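
The binomial-tree dissemination pattern can be sketched in Python as below. This is a generic sketch of the standard binomial broadcast schedule, not the paper's Aglets implementation: host numbering from 0 and the function name `binomial_broadcast` are assumptions, and fault tolerance is not modeled. In each round every host that already holds the agent code forwards it to one new host, so the number of holders doubles.

```python
def binomial_broadcast(n):
    """Return the send schedule for n hosts, with host 0 as the origin.

    The result is a list of rounds; each round is a list of
    (sender, receiver) pairs that can proceed concurrently.
    """
    rounds = []
    have = 1                      # hosts 0..have-1 already hold the agent
    while have < n:
        sends = [(src, src + have) for src in range(have) if src + have < n]
        rounds.append(sends)
        have = min(n, have * 2)   # every holder served one new host
    return rounds


for i, sends in enumerate(binomial_broadcast(8), start=1):
    print(f"round {i}: {sends}")
```

For 8 hosts this takes 3 rounds, whereas a single origin sending to each host in turn would need 7 transmissions one after another.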

Parallel Scrambling Techniques for SDH and ATM Transmissions

  • 김석창;이병기
    • 한국통신학회논문지
    • /
    • Vol. 18, No. 8
    • /
    • pp.1146-1158
    • /
    • 1993
  • In this paper, parallel scrambling techniques are considered for practical use in SDH transmission and ATM transmission. In ATM transmission, there are two ways of transmitting ATM cells, the SDH-based and the cell-based, and the corresponding scrambling techniques differ accordingly. For SDH transmission and SDH-based ATM transmission, FSS (frame synchronous scrambling) is applied to the STM frames, while for cell-based ATM transmission, DSS (distributed sample scrambling) is used on the ATM cell stream. The parallel scrambling techniques are examined for the FSS and the DSS, and applied to obtain parallel FSSs for use in SDH and SDH-based ATM transmission, along with a parallel DSS applicable to cell-based ATM transmission. The resulting (8, 4) PSRG (parallel shift register generator) and (8, 16) PSRG based parallel scramblings are directly applicable to STM-1 rate processing of the STM-4 and STM-16 scramblings, respectively. Likewise, the resulting (1, 8) PSRG and double-sampling-double-correction based parallel scrambling techniques can be used in practice for low-rate processing of SDH-based and cell-based ATM signal scrambling, respectively.

Proposition and Evaluation of Parallelism-Independent Scheduling Algorithms for DAGs of Tasks with Non-Uniform Execution Time

  • Kirilka Nikolova;Atusi Maeda;Sowa, Masa-Hiro
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 2000년도 ITC-CSCC -1
    • /
    • pp.289-293
    • /
    • 2000
  • We propose two new algorithms for parallelism-independent scheduling. The machine code generated by a compiler that uses these algorithms in its scheduling phase is parallelism-independent code, executable in minimum time regardless of the number of processors in the parallel computer. Our new algorithms have the following phases: finding the minimum number of processors on which the program can be executed in minimal time, scheduling by a heuristic algorithm for this predefined number of processors, and serializing the parallel schedule according to the earliest start time of the tasks. At run time, tasks are taken from the serialized schedule and assigned to the processor that allows the earliest start time of the task. The order of the tasks decided at compile time is not changed at run time regardless of the number of available processors, which means there is no out-of-order issue and execution. The scheduling is done predominantly at compile time, and dynamic scheduling is reduced to allocating the tasks to the processors. We evaluate the proposed algorithms by comparing them in terms of schedule length to the CP/MISF algorithm. For performance evaluation we use both randomly generated DAGs (directed acyclic graphs) and DAGs representing real applications. From a practical point of view, the algorithms we propose can be successfully used for scheduling programs for in-order superscalar processors and shared-memory multiprocessor systems. Superscalar processors with any number of functional units can execute the parallelism-independent code in minimum time without the need for dynamic scheduling and out-of-order issue hardware. This means that the use of our algorithms will reduce the hardware complexity of the processors and the run-time overhead related to dynamic scheduling.
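
The run-time allocation step described in the abstract can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' algorithms: the task tuple layout, the function name `run_serialized`, the example DAG, and the absence of communication costs are all hypothetical. Tasks are consumed strictly in the serialized order, and each is placed on the processor that lets it start earliest.

```python
def run_serialized(schedule, num_procs):
    """Allocate tasks, in serialized order, to the earliest-start processor.

    `schedule` is a list of (name, duration, predecessors) tuples whose
    order was fixed ahead of time and already respects the dependencies.
    Returns (makespan, placement) with placement[name] = (processor, start).
    """
    free_at = [0] * num_procs     # time at which each processor is idle
    finish = {}                   # finish time of each completed task
    placement = {}
    for name, duration, preds in schedule:
        ready = max((finish[p] for p in preds), default=0)
        # Earliest possible start on each processor; ties go to the lowest id.
        proc = min(range(num_procs), key=lambda p: max(free_at[p], ready))
        start = max(free_at[proc], ready)
        finish[name] = start + duration
        free_at[proc] = finish[name]
        placement[name] = (proc, start)
    return max(finish.values()), placement


# Hypothetical DAG: c depends on a; d depends on b and c.
serialized = [("a", 2, []), ("b", 3, []), ("c", 2, ["a"]), ("d", 1, ["b", "c"])]
for procs in (1, 2, 3):
    print(procs, run_serialized(serialized, procs)[0])
```

The same serialized list is used unchanged for every processor count, which is the property the abstract calls parallelism independence; only the allocation differs.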

FAST Design for Large-Scale Satellite Image Processing

  • 이영림;박완용;박현춘;신대식
    • 한국군사과학기술학회지
    • /
    • Vol. 25, No. 4
    • /
    • pp.372-380
    • /
    • 2022
  • This study proposes a distributed parallel processing system, called the Fast Analysis System for remote sensing daTa (FAST), for large-scale satellite image processing and analysis. FAST is a system that designs jobs as vertices and sequences, and distributes and processes them simultaneously. FAST manages data based on the Hadoop Distributed File System, controls entire jobs based on Apache Spark, and performs tasks in parallel on multiple slave nodes based on a Docker container design. FAST enables high-performance processing of progressively accumulated large-volume satellite images. Because each unit task runs in Docker, existing source code can be reused for designing and implementing unit tasks. Additionally, the system is robust against software and hardware faults. To demonstrate the capability of the proposed system, we performed an experiment that generates ortho-images from the original satellite images, which is a pre-processing step for all image analyses. In the experiment, when FAST was configured with eight slave nodes, processing a satellite image took less than 30 s. These results demonstrate the suitability and practical applicability of the FAST design.

Parallel Implementation of Nonlinear Analysis Program of PSC Frame Using MPI

  • 이재석;최규천
    • 한국전산구조공학회:학술대회논문집
    • /
    • 한국전산구조공학회 2001년도 봄 학술발표회 논문집
    • /
    • pp.61-68
    • /
    • 2001
  • A parallel nonlinear analysis program for prestressed concrete frames is migrated to a PC cluster system and to a massively parallel processing system, the CRAY T3E, using MPI. The PC cluster system is configured with Pentium Ⅲ class PCs and Fast Ethernet. The CRAY T3E system is composed of a set of nodes, each containing one Processing Element (PE), a memory subsystem, and its distributed memory interconnect network. Parallel computing algorithms are implemented in the element-wise processing parts, including the calculation of the stiffness matrix and element stresses, determination of material states, checking of material failure, and calculation of unbalanced loads. The parallel performance of the migrated program is evaluated through typical numerical examples.
