• Title/Summary/Keyword: Parallel Computing(병렬컴퓨팅)

Dynamic Load Balancing Scheme Based on Resource Reservation for Migration of Agents in Pure P2P Network Environments (순수 P2P 네트워크 환경에서 에이전트 이주를 위한 자원 예약 기반 동적 부하 균형 기법)

  • Kim, Kyung-In;Kim, Young-jin;Eom, Young-Ik
    • The KIPS Transactions:PartA / v.11A no.4 / pp.257-266 / 2004
  • Mobile agents are processes that can be autonomously delegated or transferred among the hosts in a network to perform computations on behalf of the user and to cooperate with other agents. Mobile agents are currently used in various fields, such as electronic commerce, mobile communication, parallel processing, information search, and recovery. In a pure P2P network environment, if mobile agents that require computing resources migrate to other peers without considering those peers' resource capacity, the performance of the destination peers may degrade due to a lack of resources. To solve this problem, we propose a resource-reservation-based load balancing scheme that uses an RMA (Resource Management Agent), which monitors the workload information of the peers and decides which agents to migrate and to which destination peers. During the mobile agent migration procedure, if a resource of a specific peer is already reserved, our resource reservation scheme prevents other mobile agents from allocating that resource.
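
The reservation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' RMA implementation: the `Peer` class, the `pick_destination` function, the single scalar `capacity`, and the "most free capacity first" policy are all assumptions made for illustration. The only point it shows is that capacity already reserved for one agent is not offered to another.

```python
class Peer:
    """Hypothetical peer with a single scalar resource capacity."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.reserved = {}  # agent_id -> reserved amount

    def available(self):
        # Capacity that has not been reserved by any agent yet.
        return self.capacity - sum(self.reserved.values())

    def reserve(self, agent_id, amount):
        # Reserve only if the unreserved capacity covers the request.
        if amount <= self.available():
            self.reserved[agent_id] = amount
            return True
        return False


def pick_destination(peers, agent_id, amount):
    """Assumed policy: try the peer with the most free capacity first."""
    for peer in sorted(peers, key=lambda p: p.available(), reverse=True):
        if peer.reserve(agent_id, amount):
            return peer
    return None  # no peer can host the agent; it stays where it is
```

With peers of capacity 4 and 10, a first agent requesting 6 units is placed on the larger peer, and a second agent requesting 6 units is refused because the remaining capacity is already reserved.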

KITTEN: A Multi-thread Virtual Reality System (KITTEN: 다중 스레드 가상현실 시스템)

  • Kim, Dae-Won;Lee, Son-Ou;Whon, Kwang-Yun;Lee, Kwang-Hyung
    • Journal of KIISE:Computing Practices and Letters / v.6 no.3 / pp.275-287 / 2000
  • A virtual reality system must provide participants with natural interaction, sufficient immersion, and, above all, realistic images. To achieve this, it is crucial to provide a fast and uniform rendering speed regardless of the complexity of the virtual world or of the simulation. In this paper, we design and implement a virtual reality system that offers improved rendering performance for complex virtual reality applications. The key idea of the proposed system is to exploit a multi-thread scheme in the system module design and to execute the modules in parallel. With this design approach, rendering, simulation, and interaction can be executed independently. Hence, in applications where the simulation is complex or the scene is very large, the system can provide a faster and more uniform frame rate. The proposed method was tested in various application environments in which scenes and simulations are very complex.

The Priority Heuristics for Concurrent Parsing of JavaScript (자바스크립트 동시 파싱을 위한 우선순위 휴리스틱)

  • Cha, Myungsu;Park, Hyukwoo;Moon, Soo-Mook
    • KIISE Transactions on Computing Practices / v.23 no.8 / pp.510-515 / 2017
  • It is important to reduce the loading time of web applications. Parsing is one of the loading steps that contributes to loading time. To address this issue, an optimization called Concurrent Parsing has been proposed, which performs parsing in parallel by using additional threads. However, Concurrent Parsing has the limitation that it does not consider the priority order of parsing. In this paper, we propose heuristics that exploit parsing priorities to improve Concurrent Parsing. To determine parsing priority, we empirically investigate the sequence of function calls, classify functions into three categories, and extract function call probabilities. A function with a high call probability is given a high priority, and a function with a low call probability is given a low priority. We evaluate these priority heuristics on real web applications and obtain a 2.6% decrease in loading time on average.
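
As a rough illustration of the priority rule in this abstract, the sketch below orders functions for parsing by descending call probability using a heap. The `parse_order` function, the function names, and the probability values are hypothetical; the abstract does not describe the authors' three categories or their data structures, so none of that is modeled here.

```python
import heapq


def parse_order(call_probability):
    """Return function names ordered so the most likely callee is parsed first.

    call_probability: dict mapping a function name to an estimated call
    probability (hypothetical input; higher means parse earlier).
    """
    # heapq is a min-heap, so negate the probability to pop the largest first.
    heap = [(-prob, name) for name, prob in call_probability.items()]
    heapq.heapify(heap)
    order = []
    while heap:
        _, name = heapq.heappop(heap)
        order.append(name)
    return order
```

A background parsing thread could consume this order, so that functions likely to be called soon are parsed before rarely called ones.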

Efficient Hardware Transactional Memory Scheme for Processing Transactions in Multi-core In-Memory Environment (멀티코어 인메모리 환경에서 트랜잭션을 처리하기 위한 효율적인 HTM 기법)

  • Jang, Yeonwoo;Kang, Moonhwan;Yoon, Min;Chang, Jaewoo
    • KIISE Transactions on Computing Practices / v.23 no.8 / pp.466-472 / 2017
  • Hardware Transactional Memory (HTM) has greatly changed the parallel programming paradigm for transaction processing. Since Intel proposed Transactional Synchronization Extensions (TSX), a number of HTM-based studies have been conducted. However, the existing studies support conflict prediction for only a single cause in transaction processing and provide the same standardized TSX environment for all workloads. To solve these problems, we propose an efficient hardware transactional memory scheme for processing transactions in a multi-core in-memory environment. First, the proposed scheme determines whether to use Software Transactional Memory (STM) or serial execution as the fallback path of HTM by using a prediction matrix that collects information on previously executed transactions. Second, the proposed scheme performs efficient transaction processing according to the characteristics of a given workload by providing a retry policy based on machine learning algorithms. Finally, in an experimental performance evaluation using the Stanford Transactional Applications for Multi-Processing (STAMP), the proposed scheme shows 10-20% better performance than the existing schemes.

Load Balancing of Heterogeneous Workstation Cluster based on Relative Load Index (상대적 부하 색인을 기반으로 한 이기종 워크스테이션 클러스터의 부하 균형)

  • Ji, Byoung-Jun;Lee, Kwang-Mo
    • Journal of KIISE:Computing Practices and Letters / v.8 no.2 / pp.183-194 / 2002
  • A cluster of heterogeneous workstations provides cost effectiveness and usability for executing applications in parallel. Load balancing is considered a necessary feature for such a cluster to minimize turnaround time. Previous work proposed static load balancing, which assigns a predetermined weight for the processing capability of each workstation, and dynamic approaches, which execute a benchmark program to obtain the relative processing capability of each workstation. Executing the benchmark program, which has nothing to do with the application being executed, consumes computation time and delays the overall turnaround time. In this paper, we present efficient methods for task distribution and task migration based on the relative load index. We designed and implemented a load balancing system for a cluster of heterogeneous workstations. We compared the turnaround times of our methods with those of the round-robin approach and the load balancing method that uses a benchmark program. The experimental results show that our methods outperform all the other methods that we compared.

Dynamic Directory Table: On-Demand Allocation of Directory Entries for Active Shared Cache Blocks (동적 디렉터리 테이블 : 공유 캐시 블록의 디렉터리 엔트리 동적 할당)

  • Bae, Han Jun;Choi, Lynn
    • Journal of KIISE / v.44 no.12 / pp.1245-1251 / 2017
  • In this study, we present a novel directory architecture that dynamically allocates a directory entry for a cache block on demand at runtime, only when the block is shared by more than one core. Thus, we do not maintain coherence for private blocks, which substantially reduces the number of directory entries. Even for shared blocks, we allocate a directory entry only while the block is actively shared, further reducing the number of directory entries at runtime. For this, we propose a new directory architecture called the dynamic directory table (DDT), which is implemented as a cache of active directory entries. Through detailed simulation on the PARSEC benchmarks, we show that DDT can outperform the expensive full-map directory by a slight margin with only 17.84% of the directory area across a variety of workloads. This is achieved by the faster access and high hit rates of the small directory. In addition, we demonstrate that even smaller DDTs can give comparable or higher performance than recent directory optimization schemes such as SPACE and DGD, with considerably less area.

Performance evaluation and analysis of TILE-Gx36 many-core processor with PARSEC benchmark (PARSEC을 이용한 TILE-Gx36 다중코어 프로세서의 성능 평가 및 분석)

  • Lee, Boseon;Kim, Han-Yee;Yu, Heonchang;Suh, Taeweon
    • The Journal of Korean Association of Computer Education / v.17 no.1 / pp.107-115 / 2014
  • This paper evaluates and analyzes the performance of the TILE-Gx36 (Gx36), a many-core processor. The PARSEC parallel benchmark suite was used to measure performance, and the Core i7 (i7) and Atom were used for comparison. When run with the maximum number of threads that can be executed concurrently on each machine, the Gx36 showed 2.73× lower performance than the Core i7 and 1.93× higher performance than the Atom. The Gx36 has the largest last-level cache (LLC) among the compared processors. Nevertheless, it reported the largest number of LLC misses, which we strongly believe is the major cause of its lower-than-expected performance. Our study suggests that the DDC employed in the Gx36 is not a favorable cache structure for general-purpose high-performance computing. Actual measurements on an off-the-shelf machine provide unbiased data for refining future many-core architectures.

Development of Multiscale Modeling Methods Coupling Molecular Dynamics and Stochastic Rotation Dynamics (분자동역학과 확률회전동역학을 결합한 멀티스케일 모델링 기법 개발)

  • Cha, Kwangho;Jung, Youngkyun
    • KIISE Transactions on Computing Practices / v.20 no.10 / pp.534-542 / 2014
  • Multiscale modeling is a new simulation approach that can handle the different spatial and temporal scales of a system. In this study, as part of multiscale modeling research, we propose a way of combining two different simulation methods, molecular dynamics (MD) and stochastic rotation dynamics (SRD). Our conceptual implementations are based on LAMMPS, a well-known molecular dynamics program. Our multiscale modeling prototype follows the form of a third-party implementation of LAMMPS. It adds MD to SRD in order to simulate the boundary area of the simulation box. Because it is important to guarantee a seamless simulation, we also designed overlap zones and communication zones. The preliminary experimental results show that the proposed scheme works properly and that the execution time is reduced.

Non-Photorealistic Rendering Using CUDA-Based Image Segmentation (CUDA 기반 영상 분할을 사용한 비사실적 렌더링)

  • Yoon, Hyun-Cheol;Park, Jong-Seung
    • KIPS Transactions on Software and Data Engineering / v.4 no.11 / pp.529-536 / 2015
  • When three-dimensional objects and photo images are rendered together, the non-photorealistic rendering results are in visual discord because the two kinds of content have their own independent color distributions. This paper proposes a non-photorealistic rendering (NPR) technique that renders both three-dimensional objects and photo images in styles such as cartoons and sketches. The proposed technique computes the color distribution property of the photo images and reduces the number of colors of both the photo images and the 3D objects. NPR is performed based on the reduced colormaps and edge features. To enhance the natural scene presentation, the image region segmentation process is preferred when extracting and applying colormaps. However, image segmentation requires many computational operations, so non-photorealistic rendering of large frames takes a long time. To speed up the time-consuming segmentation procedure, we use GPGPU for parallel computing on the GPU. As a result, we significantly improve the execution speed of the algorithm.

Implementation of a Cluster VOD Server and an Embedded Client based on Linux (리눅스 기반의 클러스터 VOD서버와 내장형에 클라이언트의 구현)

  • Seo Dongmahn;Bang Cheolseok;Lee Joahyoung;Kim Byounggil;Jung Inbum
    • Journal of KIISE:Computing Practices and Letters / v.10 no.6 / pp.435-447 / 2004
  • For VOD systems, it is important to provide QoS to as many users as possible under limited resources. To analyze QoS issues in a real environment, we implement a clustered VOD server and an embedded client system based on the Linux open source platform. Parallel processing of MPEG data, load balancing across nodes, and VCR-like functions are implemented on the server side. To provide a more user-friendly interface, an ordinary TV is used as the VOD client's terminal, and an embedded board is used to support the VCR functions. In this paper, we measure the performance of the implemented VOD system under various user requirements and evaluate the sources of performance limitations. Based on these analyses, we propose a dynamic admission control method based on the available memory and network bandwidth. The proposed method enhances the utilization of system resources, allowing more media streams to be served with QoS.
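
The admission control idea in this abstract can be sketched as a simple two-resource check. This is an illustration only, not the authors' method: `ServerState`, `admit`, the field names, and the units (MB and Mbps) are assumptions, and the abstract does not state how per-stream memory and bandwidth demands are estimated.

```python
from dataclasses import dataclass


@dataclass
class ServerState:
    """Hypothetical snapshot of one VOD server node's spare resources."""
    free_memory_mb: float
    free_bandwidth_mbps: float


def admit(state, stream_memory_mb, stream_bitrate_mbps):
    """Admit a new stream only if both memory and bandwidth can cover it."""
    fits_memory = stream_memory_mb <= state.free_memory_mb
    fits_bandwidth = stream_bitrate_mbps <= state.free_bandwidth_mbps
    if fits_memory and fits_bandwidth:
        # Account for the admitted stream so later requests see less headroom.
        state.free_memory_mb -= stream_memory_mb
        state.free_bandwidth_mbps -= stream_bitrate_mbps
        return True
    return False  # rejecting protects the QoS of streams already admitted
```

For a node with 64 MB and 10 Mbps free, two streams that each need 16 MB and 4 Mbps are admitted, and a third is rejected because only 2 Mbps of bandwidth remains.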