• Title/Summary/Keyword: multithreading

The Efficient Execution of Functional Language Loops on the Multithreaded Architectures (다중스레드 구조에서 함수 언어 루프의 효과적 실행)

  • Ha, Sang-Ho
    • The Transactions of the Korea Information Processing Society / v.7 no.3 / pp.962-970 / 2000
  • Multithreading is attractive because it can tolerate memory and synchronization latency by effectively overlapping communication with computation. While several compiler techniques have been developed to produce multithreaded code from functional language programs, much work remains to implement loops effectively. Executing loops in a multithreaded style usually incurs overheads that can severely reduce the benefit of multithreading. This paper suggests several architecture-level and compiler-level methods that can optimize multithreaded loop execution. We then simulate and analyze them for a matrix multiplication program.

Performance Enhancement of Parallel Prime Sieving with Hybrid Programming and Pipeline Scheduling (혼합형 병렬처리 및 파이프라이닝을 활용한 소수 연산 알고리즘)

  • Ryu, Seung-yo;Kim, Dongseung
    • KIPS Transactions on Computer and Communication Systems / v.4 no.10 / pp.337-342 / 2015
  • We develop a new parallelization method for the Sieve of Eratosthenes algorithm that improves both computation speed and energy efficiency. After proper workload partitioning, pipeline scheduling is applied for better load balancing. The method runs on multicore CPUs using a hybrid parallel programming model that combines message passing and multithreading. Experiments on small-scale clusters and on a PC with a mobile processor show significant improvements in execution time and energy consumption.

The Future of Microprocessor: GHz, SMT and Code Morphing (마이크로프로세서의 미래)

  • 박성배
    • Journal of the Korean Professional Engineers Association / v.33 no.4 / pp.53-58 / 2000
  • Within 10 years, it will be possible to integrate 10 billion transistors on a single-chip microprocessor that will operate far beyond GHz frequencies and execute about 20-200 instructions per clock cycle from widely variable instruction streams by leveraging SMT (Simultaneous Multithreading) technology. It will also be decoupled from legacy x86 binary compatibility by a translation layer such as code morphing technology.

A Concurrent Incremental Evaluation Using Multithreading (멀티쓰레딩을 활용한 병행 점진 평가)

  • Han, Junglan
    • Proceedings of the Korea Information Processing Society Conference / 2009.11a / pp.915-916 / 2009
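  • Incremental evaluation is a method that, when a program is modified during development, re-evaluates only the modified part and the parts affected by it instead of re-evaluating the entire program. In this paper, we express dependencies in terms of attributes that represent the values of variables directly affecting the semantic structure. Rather than processing in parallel on multiple processors, we use multithreading to propose a concurrent incremental evaluation method that performs incremental evaluation concurrently and efficiently in Java, an object-oriented language, and we analyze the efficiency of the method through simulation.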

Study of an In-order SMT Architecture and Grouping Schemes

  • Moon, Byung-In;Kim, Moon-Gyung;Hong, In-Pyo;Kim, Ki-Chang;Lee, Yong-Surk
    • International Journal of Control, Automation, and Systems / v.1 no.3 / pp.339-350 / 2003
  • In this paper, we propose a simultaneous multithreading (SMT) architecture that improves instruction throughput by exploiting instruction level parallelism (ILP) and thread level parallelism (TLP). The proposed architecture issues and completes instructions belonging to the same thread in exact program order. The issue and completion policy greatly reduces the design complexity and hardware cost of our architecture, compared with others that employ out-of-order issue and completion. On the other hand, when the instructions belong to different threads, the issue and completion orders for those instructions may not necessarily be identical to the fetch order. The processor issues instructions simultaneously from multiple threads to functional units by exploiting ILP and TLP, and by dynamic resource sharing. That parallel execution notably improves performance and resource utilization with minimal additional hardware cost over the conventional superscalar processors. This paper proposes an SMT architecture with grouping as well as one without grouping. Without grouping, all threads dynamically and flexibly share most resources. On the other hand, in the SMT architecture with grouping, in which resources and threads are divided into several groups for design simplification, resources are shared only among threads belonging to the same group as those resources. Simulation results show that our processors with four and eight threads improve performance by three or more times over the conventional superscalar processor with comparable execution resources and policies, and that reasonable grouping reduces the design complexity of SMT processors with little negative effect on performance.

Parallelization and Performance Optimization of the Boyer-Moore Algorithm on GPU (Boyer-Moore 알고리즘을 위한 GPU상에서의 병렬 최적화)

  • Jeong, Yosang;Tran, Nhat-Phuong;Lee, Myungho;Nam, Dukyun;Kim, Jik-Soo;Hwang, Soonwook
    • KIISE Transactions on Computing Practices / v.21 no.2 / pp.138-143 / 2015
  • The Boyer-Moore algorithm is a single pattern string matching algorithm that is widely used in various applications such as computer and internet security, and bioinformatics. This algorithm is computationally demanding and requires high-performance parallel processing. In this paper, we propose a parallelization and performance optimization methodology for the BM algorithm on a GPU. Our methodology adopts an algorithmic cascading technique. This results in significant reductions in the mapping overheads for the threads participating in the parallel string matching. It also results in the efficient utilization of the multithreading capability of the GPU which improves the load balancing among threads. Our experimental results show that this approach achieves a 45-times speedup at maximum, in comparison with a serial execution.

Emotion Recognition Implementation with Multimodalities of Face, Voice and EEG

  • Udurume, Miracle;Caliwag, Angela;Lim, Wansu;Kim, Gwigon
    • Journal of information and communication convergence engineering / v.20 no.3 / pp.174-180 / 2022
  • Emotion recognition is an essential component of complete interaction between humans and machines. The issues related to emotion recognition arise because different types of emotions are expressed in several forms, such as visual, sound, and physiological signals. Recent advancements in the field show that combined modalities, such as visual, voice, and electroencephalography signals, lead to better results than single modalities used separately. Previous studies have explored the use of multiple modalities for accurate prediction of emotion; however, the number of studies on real-time implementation is limited because of the difficulty of simultaneously implementing multiple modalities of emotion recognition. In this study, we propose an emotion recognition system for real-time implementation. Our model was built with a multithreading block that implements each modality in a separate thread for continuous synchronization. First, we achieved emotion recognition for each modality separately before enabling the multithreaded system. To verify the correctness of the results, we compared the accuracy of unimodal and multimodal emotion recognition in real time. The experimental results showed real-time user emotion recognition by the proposed model. In addition, the effectiveness of multiple modalities for emotion recognition was observed. Our multimodal model obtained an accuracy of 80.1%, compared with the unimodal models, which obtained accuracies of 70.9%, 54.3%, and 63.1%.

Characterization of a Magnetron Sputtering Cathode by a 3D Particle Model (3차원 입자 모델을 이용한 마그네트론 스퍼터링 음극의 특성 분석)

  • Joo, Jung-Hoon
    • Journal of the Korean institute of surface engineering / v.41 no.5 / pp.205-213 / 2008
  • A 3D particle code is developed to analyze electron behavior in a planar magnetron sputtering cathode in either a balanced or an unbalanced configuration. Three types of collisions are included: electron-neutral elastic, excitation to a metastable state, and ionization. The flight path is calculated by a fourth-order Runge-Kutta method with a time step of 10 ps. The effects of electron starting position, magnetic field intensity, and configuration are analyzed. For more efficient and accurate modeling, a multithreading technique is considered for multicore CPU computers. Under a cold-ion assumption, target erosion profiles are predicted for a flat target surface.

On-Chip Multiprocessor with Simultaneous Multithreading

  • Park, Kyoung;Choi, Sung-Hoon;Chung, Yong-Wha;Hahn, Woo-Jong;Yoon, Suk-Han
    • ETRI Journal / v.22 no.4 / pp.13-24 / 2000
  • As more transistors are integrated onto larger dies, an on-chip multiprocessor will become a promising alternative to the superscalar microprocessor that dominates today's microprocessor marketplace. This paper describes key parts of a new on-chip multiprocessor, called Raptor, which is composed of four 2-way superscalar processor cores and one graphics co-processor. To obtain the performance characteristics of Raptor, a program-driven simulator and its programming environment were developed. The simulation results showed that Raptor can exploit thread-level parallelism effectively and offers a promising architecture for future on-chip multiprocessor designs.

Robust Fuzzy Control of a Class of Nonlinear Descriptor Systems with Time-Varying Delay

  • Yan Wang;Sun, Zeng-Qi;Sun, Fu-Chun
    • International Journal of Control, Automation, and Systems / v.2 no.1 / pp.76-82 / 2004
  • A robust fuzzy controller is designed to stabilize a class of solvable nonlinear descriptor systems with time-varying delay. First, a new modeling and control method for nonlinear descriptor systems is presented with a fuzzy descriptor model. A sufficient condition for the existence of the fuzzy controller is given in terms of a series of LMIs. Then, a less conservative fuzzy controller design approach is obtained based on the fuzzy rules and weights. This method includes the interactions of the different subsystems into one matrix. The effectiveness of the presented approach and the design procedure of the fuzzy controller are illustrated by way of an example.