• Title/Summary/Keyword: multi-thread

Analysis and Application of Performance Improvement of a Real-time Simulation Visualization based on Multi-thread Pipelining Parallel Processing (다중 스레드 파이프라인 병렬처리를 통한 실시간 시뮬레이션 시각화의 성능 향상 해석 및 적용)

  • Lee, Jun Hee;Song, Hee Kang;Kim, Tag Gon
    • Journal of the Korea Society for Simulation / v.26 no.3 / pp.13-22 / 2017
  • This research proposes and applies a pipelined parallel processing technique to speed up the visualization of real-time simulation results. A simulation with real-time visualization generally consists of three processes: executing the simulation model, transmitting the simulation results, and visualizing them. If these processes run serially, the latency from simulation to visualization becomes long, which slows the visualization of real-time simulation data. The main purpose of this research is therefore to maximize performance by applying pipelined parallel processing to real-time simulation visualization. We also show that performance improves further when multi-threading is added to each process. The paper presents a theoretical performance model and simulation results for these techniques, and applies them to an air combat simulation model as a case study. The results show that execution time is greatly reduced compared with the original model.
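The three-process structure described in this abstract can be sketched as a thread-per-stage pipeline. The following is a minimal illustration, not the authors' implementation: the stage functions, the queue sizes, the `SENTINEL` convention, and the toy data are all assumptions, and the two timing helpers use a textbook idealized pipeline formula rather than the paper's own performance model.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker passed between stages


def simulate(n_steps, out_q):
    """Stage 1: run a toy simulation model and emit one result per step."""
    for step in range(n_steps):
        out_q.put({"step": step, "value": step * step})
    out_q.put(SENTINEL)


def transmit(in_q, out_q):
    """Stage 2: forward each result (a stand-in for network transfer)."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            out_q.put(SENTINEL)
            break
        out_q.put(dict(item, transmitted=True))


def visualize(in_q, frames):
    """Stage 3: 'render' each result; here it only records a text frame."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        frames.append(f"step {item['step']}: {item['value']}")


def run_pipeline(n_steps):
    """Run the three stages concurrently, connected by bounded queues."""
    q12, q23 = queue.Queue(maxsize=4), queue.Queue(maxsize=4)
    frames = []
    threads = [
        threading.Thread(target=simulate, args=(n_steps, q12)),
        threading.Thread(target=transmit, args=(q12, q23)),
        threading.Thread(target=visualize, args=(q23, frames)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return frames


def serial_time(stage_times, n_items):
    """Idealized serial cost: every item passes through all stages in turn."""
    return n_items * sum(stage_times)


def pipeline_time(stage_times, n_items):
    """Idealized pipelined cost: fill once, then the slowest stage sets the pace."""
    return sum(stage_times) + (n_items - 1) * max(stage_times)


frames = run_pipeline(5)
```

Under this idealized model, throughput is bounded by the slowest stage, which is one reason adding threads inside an individual stage can help further. In CPython, threads mainly overlap I/O-bound work, so the sketch shows the structure rather than a measured speedup.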

Multi-Threaded Parallel H.264/AVC Decoder for Multi-Core Systems (멀티코어 시스템을 위한 멀티스레드 H.264/AVC 병렬 디코더)

  • Kim, Won-Jin;Cho, Keol;Chung, Ki-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD / v.47 no.11 / pp.43-53 / 2010
  • The wide deployment of high-resolution video services has led to active research on high-speed video processing. In particular, the prevalence of multi-core systems has accelerated research on high-resolution video processing based on the parallelization of multimedia software. In this paper, we propose a novel parallel H.264/AVC decoding scheme for a multi-core platform. Parallel H.264/AVC decoding is challenging not only because parallelization may incur significant synchronization overhead but also because the software may have complicated dependencies. To overcome these issues, we propose a novel approach called Multi-Threaded Parallelization (MTP). In MTP, a separate thread is allocated to each stage in the pipeline to reduce synchronization overhead. In addition, an efficient memory reuse technique is used to reduce the memory requirement. To verify the effectiveness of the proposed approach, we parallelized the FFmpeg H.264/AVC decoder with the proposed technique using OpenMP and carried out experiments on an Intel quad-core platform. The proposed design performs 53% better than the FFmpeg H.264/AVC decoder before parallelization. It also reduces memory usage by 65% and 81% for high-definition (HD) and full high-definition (FHD) video, respectively, compared with a popular existing method called 2Dwave.

Visualization of Basal Body Temperature and Its Frequency Spectrum Analysis Using an Android Platform Smartphone (스마트폰을 활용한 여성의 기초체온 가시화 및 주파수 스펙트럼 분석)

  • Park, Sang-Eun;Kim, Jeong-Hwan;Seo, Eun-Ah;Choi, Heejung;Kim, Kyeong-Seop
    • The Transactions of The Korean Institute of Electrical Engineers / v.63 no.7 / pp.934-939 / 2014
  • Daily recording of basal body temperature is the most useful method of determining the period of ovulation by detecting the rise in temperature. To support this aim, a graphical user interface (GUI) system is designed and implemented to visualize daily basal body temperature variations on an Android smartphone using multi-threaded Java modules. To estimate the occurrence of the ovulation cycle, a new method is proposed that analyzes low-frequency features of the frequency spectrum, including the DC level and the second-largest peak, and interprets these prominent features in terms of the average basal body temperature variations and the menstrual cycle.
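As a rough illustration of the spectral features mentioned in this abstract, the sketch below computes the DC level and the strongest non-DC spectral peak of a daily temperature series. It is a sketch under stated assumptions, not the authors' method: the function names are hypothetical, the temperature data are synthetic, and "second-largest peak" is read here as the largest peak after the DC component.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return |X[k]| / N for k = 0 .. N//2 using a direct O(N^2) DFT."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags


def analyze_temperature(samples):
    """Return (dc_level, cycle_length_in_days) for one-sample-per-day data."""
    mags = dft_magnitudes(samples)
    dc_level = mags[0]  # equals the mean because all temperatures are positive
    # The DC bin is the largest peak; the strongest bin with k >= 1 is taken
    # as the second-largest peak (an assumption about the abstract's wording).
    k_peak = max(range(1, len(mags)), key=lambda k: mags[k])
    return dc_level, len(samples) / k_peak


# Synthetic data, made up for illustration: 112 days with a 28-day cycle.
days = 112
temps = [36.5 + 0.3 * math.sin(2 * math.pi * d / 28) for d in range(days)]
dc, cycle = analyze_temperature(temps)
```

With real recordings the record length is rarely a whole number of cycles, so the peak bin gives only an approximate cycle length.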

Parallel LDPC Decoder for CMMB on CPU and GPU Using OpenCL (OpenCL을 활용한 CPU와 GPU 에서의 CMMB LDPC 복호기 병렬화)

  • Park, Joo-Yul;Hong, Jung-Hyun;Chung, Ki-Seok
    • IEMEK Journal of Embedded Systems and Applications / v.11 no.6 / pp.325-334 / 2016
  • Recently, Open Computing Language (OpenCL) has been proposed as a framework that supports heterogeneous computing platforms. With an OpenCL framework, digital communication systems can support various protocols in a unified computing environment and achieve both high portability and high performance. This paper introduces a parallel software decoder for Low Density Parity Check (LDPC) codes for China Multimedia Mobile Broadcasting (CMMB) on a heterogeneous platform. Each step of LDPC decoding has different parallelization characteristics: steps suitable for task-level parallelization are executed on the CPU, and steps suitable for data-level parallelization are processed by the GPU. To improve the performance of the proposed OpenCL kernels for LDPC decoding, explicit thread scheduling, loop unrolling, and effective data transfer techniques are applied. The proposed LDPC decoder achieves high performance by using heterogeneous multi-core processors in a unified computing framework.

Estimation of Heart Rate Variability with an Android Smart Phone Platform (안드로이드 기반 스마트폰 연동 심박변이도 추정)

  • Kim, Jeong-Hwan;Shin, Seung-Won;Kim, Hyun-Tae;Yoon, Tae-Ho;Kim, Kyeong-Seop;Lee, Jeong-Whan;Eom, Gwang-Moon
    • The Transactions of The Korean Institute of Electrical Engineers / v.61 no.6 / pp.865-871 / 2012
  • In this study, ambulatory electrocardiogram (ECG) signals and heartbeat rhythms are visualized in terms of R-R intervals and heart rate variability (HRV) on an Android platform. To this end, a graphical user interface (GUI) is implemented with multi-threaded Java modules comprising ECG, heartbeat, tachogram, and visualization units. ECG signals are acquired on an Android device by receiving data from an ambulatory ECG sensor system. Finite impulse response (FIR) filters are implemented to eliminate the baseline wander in the ambulatory signals and the DC offset in the R-R interval data. By simulating normal and stressed emotional states of a subject, we show that HRV can be successfully estimated and visualized on an Android smartphone platform.
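The FIR filtering step mentioned in this abstract can be illustrated with a simple high-pass FIR whose coefficients sum to zero, so that a constant (DC) level is rejected. This is only a sketch: the abstract does not give the filter design, so the coefficient choice (a delayed impulse minus a moving average), the tap count, and the function names are assumptions.

```python
def highpass_coeffs(taps):
    """Delayed unit impulse minus a moving average; `taps` should be odd.

    The coefficients sum to zero, so the filter has zero gain at DC.
    """
    mid = taps // 2
    return [(1.0 if k == mid else 0.0) - 1.0 / taps for k in range(taps)]


def fir_filter(signal, coeffs):
    """Causal FIR: y[n] = sum_k coeffs[k] * x[n - k], with x[n] = 0 for n < 0."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * signal[n - k]
        out.append(acc)
    return out


# Made-up input: a constant offset of 5.0, standing in for a DC level.
coeffs = highpass_coeffs(11)
filtered = fir_filter([5.0] * 50, coeffs)
```

Once the filter's 11-sample window is full (from index 10 onward), the constant input is suppressed to approximately zero; the first few outputs are a start-up transient caused by the zero padding.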

Multicore-Aware Code Co-Positioning to Reduce WCET on Dual-Core Processors with Shared Instruction Caches

  • Ding, Yiqiang;Zhang, Wei
    • Journal of Computing Science and Engineering / v.6 no.1 / pp.12-25 / 2012
  • For real-time systems, it is important to obtain an accurate worst-case execution time (WCET). Furthermore, improving the WCET of applications that run on multicore processors is both significant and challenging, because the WCET can be strongly affected by inter-core interference in shared resources such as the shared L2 cache. To solve this problem, we propose an innovative approach that uses code positioning to reduce inter-core L2 cache interference between the different real-time threads that run on a multicore processor, adapting to the goal through different strategies. The worst-case-oriented strategy is designed to reduce the largest WCET among these threads as much as possible. The other two strategies aim to reduce the WCET of each thread by an almost equal percentage or amount. Our experiments indicate that the proposed multicore-aware code positioning approaches not only improve the worst-case performance of the real-time threads but also make good tradeoffs between efficiency and fairness for threads that run on multicore platforms.

HD-Tree: High performance Lock-Free Nearest Neighbor Search KD-Tree (HD-Tree: 고성능 Lock-Free NNS KD-Tree)

  • Lee, Sang-gi;Jung, NaiHoon
    • Journal of Korea Game Society / v.20 no.5 / pp.53-64 / 2020
  • Supporting nearest neighbor search (NNS) in KD-Tree algorithms is essential in multidimensional data applications. In this paper, we propose HD-Tree, a high-performance lock-free KD-Tree that supports NNS when reads and writes occur concurrently. HD-Tree reduces the number of synchronization nodes used in NNS and requires fewer atomic operations during lock-free method execution. Compared with existing algorithms on a multi-core system with 8 cores and 16 threads, HD-Tree improves performance by up to 95% on NNS and 15% on modification in oversubscription situations.

Design of a SIMT architecture GP-GPU Using Tile based on Graphic Pipeline Structure (타일 기반 그래픽 파이프라인 구조를 사용한 SIMT 구조 GP-GPU 설계)

  • Kim, Do-Hyun;Kim, Chi-Yong
    • Journal of IKEEE / v.20 no.1 / pp.75-81 / 2016
  • This paper proposes a tile-based graphics pipeline design to improve graphics application performance on a SIMT-based GP-GPU. The proposed tile-based graphics pipeline avoids unnecessary graphics processing operations and processes the rasterization step in parallel. Massively parallel data processing through the SIMT architecture improves computational performance, thereby improving 3D graphics pipeline performance. The more vertex data a 3D model has, the greater the performance gain. The proposed structure was confirmed to improve processing performance by about 1.18 times to as much as 3 times compared with 'RAMP' and previous studies.

A parametric study of bolt-nut joints by the method of finite element contact analysis (유한 요소 접촉 해석법에 의한 나사 체결부 설계 개선에 관한 연구)

  • 이병채;김영곤
    • Transactions of the Korean Society of Mechanical Engineers / v.13 no.3 / pp.353-361 / 1989
  • A parametric study of load distribution in bolt-nut joints is performed by finite element contact analysis. The contacting surfaces are assumed to be unbonded and frictionless. Multi-body contact analysis is performed in the elastic region under the assumption of an axisymmetric stress state. In the standard configuration, the load acting on the first thread from the fastened plate is much greater than that on the other threads. However, the load distribution is shown to improve when the center of the contact force acting on the nut surface is moved outwards. Such a modification is possible by enlarging the gap between the bolt shank and the fastened plate or by inserting suitable washers. Modifying the shape of the standard nut by making a groove and a step on the nut surface is also suggested, which results in an almost uniform load distribution and a considerable decrease in the maximum stress of the joint.

GPU-Based Acceleration of Quantum-Inspired Evolutionary Algorithm (GPU를 이용한 Quantum-Inspired Evolutionary Algorithm 가속)

  • Ryoo, Ji-Hyun;Park, Han-Min;Choi, Ki-Young
    • Journal of the Institute of Electronics Engineers of Korea SD / v.49 no.8 / pp.1-9 / 2012
  • The Quantum-Inspired Evolutionary Algorithm (QEA) contains sufficient data-level parallelism to be naturally accelerated on GPUs. To reduce execution time efficiently, however, tasks must be mapped carefully to reflect the characteristics of the CPU and the GPU. Furthermore, when deciding which part of the application should run on the GPU, we need to consider the data transfer between CPU and GPU memory spaces as well as the data-level parallelism. In addition, the use of zero-copy host memory, a proper choice of execution configuration, and a thread organization that takes memory coalescing into account are important to further reduce the execution time. With all these techniques, QEA ran 3.69 times faster on average than the multi-threaded CPU implementation for the 0-1 knapsack problem with 30,000 items.