• Title/Summary/Keyword: multi-core processing

Implementation of a 'Rasterization based on Vector Algorithm' suited for a Multi-thread Shader architecture (Multi-Thread 쉐이더 구조에 적합한 Vector 기반의 Rasterization 알고리즘의 구현)

  • Lee, Ju-Suk;Kim, Woo-Young;Lee, Bo-Haeng;Lee, Kwang-Yeob
    • Journal of the Institute of Electronics Engineers of Korea SD / v.46 no.10 / pp.46-52 / 2009
  • A Multi-Core/Multi-Thread architecture is adopted for the Shader processor to enhance processing performance. The Shader processor is designed to use its processing core IP for multiple purposes, such as Vertex-Shading, Rasterization, and Pixel-Shading. In this paper, we propose a 'Rasterization based on Vector Algorithm' that makes parallel pixel processing possible with a Multi-Core and Multi-Thread architecture on the Shader Core. The proposed algorithm requires only 2% of the operation count of the Scan-Line Algorithm and processes pixels independently.
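
The abstract does not spell out the vector algorithm itself. As a rough illustration of what "processes pixels independently" can mean, the sketch below uses the widely known edge-function (half-space) coverage test, which is an assumption here and not necessarily the authors' method. The names `edge`, `covers`, and `rasterize` are hypothetical.

```python
def edge(ax, ay, bx, by, px, py):
    """Cross product of vectors A->B and A->P; its sign says which side of the edge P is on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def covers(tri, px, py):
    """True if point P is inside or on the triangle. Uses no state from other pixels."""
    (ax, ay), (bx, by), (cx, cy) = tri
    w0 = edge(ax, ay, bx, by, px, py)
    w1 = edge(bx, by, cx, cy, px, py)
    w2 = edge(cx, cy, ax, ay, px, py)
    # Accept either winding order: all non-negative or all non-positive.
    return (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0)


def rasterize(tri, width, height):
    """Test every pixel centre on its own; each test could be given to a separate thread."""
    return [(x, y)
            for y in range(height)
            for x in range(width)
            if covers(tri, x + 0.5, y + 0.5)]
```

Because `covers` depends only on the triangle and one pixel position, the per-pixel tests can be distributed across cores or threads in any order, unlike a scan-line walk that carries state from one pixel to the next.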

A Performance Evaluation of Parallel Color Conversion based on the Thread Number on Multi-core Systems (멀티코어 시스템에서 쓰레드 수에 따른 병렬 색변환 성능 검증)

  • Kim, Cheong Ghil
    • Journal of Satellite, Information and Communications / v.9 no.4 / pp.73-76 / 2014
  • With the increasing popularity of multi-core processors, they have been adopted even in embedded systems. Under these circumstances, many multimedia applications can be parallelized on multi-core platforms because they usually require heavy computation and extensive memory access. This paper proposes an efficient thread-level parallel implementation of color space conversion on multi-core CPUs. Thread-level parallelism has become a very useful parallel processing paradigm, especially on shared-memory computing systems. In this work, it is exploited by allocating different input pixels to each thread for concurrent loop execution. For the performance evaluation, this paper measures the performance improvement of color conversion on multi-core processors by comparing the processing speed of the serial implementation with that of the parallel ones. The results show that the thread-level parallel implementations achieve similar overall ratios of performance improvement regardless of the multi-core processor used.
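
The pixel-partitioning idea described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not name the color spaces involved, so an RGB-to-grayscale conversion with integer BT.601-style weights is assumed, and `rgb_to_gray`, `convert_chunk`, and `parallel_convert` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def rgb_to_gray(pixel):
    """Integer luma approximation; the weights 77 + 150 + 29 sum to 256."""
    r, g, b = pixel
    return (77 * r + 150 * g + 29 * b) >> 8


def convert_chunk(pixels):
    """Serial conversion of one contiguous block of pixels."""
    return [rgb_to_gray(p) for p in pixels]


def parallel_convert(pixels, num_threads=4):
    """Give each worker thread its own contiguous slice of the input pixels."""
    chunk = max(1, -(-len(pixels) // num_threads))  # ceiling division
    chunks = [pixels[i:i + chunk] for i in range(0, len(pixels), chunk)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() returns results in the same order as the input chunks.
        results = list(pool.map(convert_chunk, chunks))
    return [gray for part in results for gray in part]
```

Note that in CPython the global interpreter lock limits the speedup of pure-Python loops like this one, so the sketch shows only how the work is divided; a native implementation would be needed to reproduce timing comparisons of the kind the paper reports.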

Efficient Hybrid Transactional Memory Scheme using Near-optimal Retry Computation and Sophisticated Memory Management in Multi-core Environment

  • Jang, Yeon-Woo;Kang, Moon-Hwan;Chang, Jae-Woo
    • Journal of Information Processing Systems / v.14 no.2 / pp.499-509 / 2018
  • Recently, hybrid transactional memory (HyTM) has gained much interest from researchers because it combines the advantages of hardware transactional memory (HTM) and software transactional memory (STM). To provide concurrency control for transactions, existing HyTM-based studies use a Bloom filter. However, they fail to overcome the false positive errors typical of a Bloom filter. In addition, although the existing studies use a global lock, global lock-based memory allocation is very inefficient in a multi-core environment. In this paper, we propose an efficient hybrid transactional memory scheme that uses near-optimal retry computation and sophisticated memory management to process transactions efficiently in a multi-core environment. First, we propose a near-optimal retry computation algorithm that provides an efficient HTM configuration using machine learning algorithms, according to the characteristics of a given workload. Second, we provide efficient concurrency control for transactions in different environments by using a sophisticated Bloom filter. Third, we propose a memory management scheme optimized for the CPU cache line in order to provide fast transaction processing. Finally, our performance evaluation with the Stanford Transactional Applications for Multi-Processing (STAMP) benchmarks shows that our HyTM scheme achieves up to 2.5 times better performance than the state-of-the-art algorithms.
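
For readers unfamiliar with why a Bloom filter produces false positives, here is a minimal generic sketch. It is a textbook Bloom filter, not the paper's "sophisticated" variant; the class name, the default sizes, and the use of salted SHA-256 as the hash family are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive one bit position per hash by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means "definitely absent"; True means only "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))
```

A key that was added always reports `True`, so there are no false negatives. A key that was never added can still report `True` when other keys happen to have set all of its bit positions, and this becomes more likely as the filter fills up. In a transactional memory setting, such a false positive would appear as a conflict between transactions that do not actually share an address.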

Implementation of IQ/IDCT in H.264/AVC Decoder Using Mobile Multi-Core GPGPU (모바일 멀티 코어 GP-GPU를 이용한 H.264/AVC 디코더 구현)

  • Kim, Dong-Han;Lee, Kwang-Yeob;Jeong, Jun-Mo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2010.10a / pp.321-324 / 2010
  • Much research has been done on multi-core processors, with performance improved through parallelization. Multi-core architectures have also emerged in the mobile environment, but the performance of a mobile CPU is limited. GP-GPU (General-Purpose computing on Graphics Processing Units) can improve performance without adding other dedicated hardware. This paper presents the implementation of the Inverse Quantization, Inverse DCT, and Color Space Conversion modules of an H.264/AVC decoder using a Multi-Core GP-GPU for a mobile environment. The proposed architecture improves performance by approximately 50% when all of its features are used.

The Effects of Runner Core Pin on the Filling Imbalance Occurred in Multi Cavity Injection Mold (다수 캐비티 사출금형에서 러너 코어핀이 충전불균형에 미치는 영향)

  • Kang C. M.;Jeong Y. D.;Han K. T.
    • Proceedings of the Korean Society for Technology of Plasticity Conference / 2005.05a / pp.39-42 / 2005
  • For mass production, an injection mold usually has multiple cavities that are filled through a geometrically balanced runner system. Despite the geometrically balanced runner system, filling imbalances between cavities have always been observed. These filling imbalances are one of the most significant factors affecting the quality of plastic parts molded in a multi-cavity injection mold. Filling imbalances result from a non-symmetrical shear rate distribution within the melt as it flows through the runner system. It has been possible to decrease the filling imbalance by optimizing processing conditions, but this has not completely eliminated the phenomenon during injection molding. This paper presents a solution to these filling imbalances using a 'runner core pin'. The runner core pin developed in this study creates a symmetrical shear distribution within the runner. Using the runner core pin, a remarkable reduction in filling imbalance was confirmed.

Development of New Runner System for Filling Balance in Multi Cavity Injection Mold (다수 캐비티 사출금형에 적용되는 새로운 균형 충전용 러너 시스템 개발)

  • Jeong Y. D.
    • Transactions of Materials Processing / v.15 no.1 s.82 / pp.42-46 / 2006
  • For mass production, an injection mold usually has multiple cavities that are filled through a geometrically balanced runner system. Despite the geometrically balanced runner system, filling imbalances between cavities have always been observed. These filling imbalances are one of the most significant factors affecting the quality of plastic parts. Filling imbalances result from a non-symmetrical shear rate distribution within the melt as it flows through the runner system. It has been possible to decrease the filling imbalance by optimizing processing conditions, but this has not completely eliminated the phenomenon during injection molding. This paper presents a solution to these filling imbalances using a Runner Core pin (RC pin). The Runner Core pin developed in this study creates a symmetrical shear distribution within the runner. Using the Runner Core pin, a remarkable reduction in filling imbalances was confirmed.

TBBench: A Micro-Benchmark Suite for Intel Threading Building Blocks

  • Marowka, Ami
    • Journal of Information Processing Systems / v.8 no.2 / pp.331-346 / 2012
  • Task-based programming is becoming the state-of-the-art method of choice for extracting the desired performance from multi-core chips. It expresses a program in terms of lightweight logical tasks rather than heavyweight threads. Intel Threading Building Blocks (TBB) is a task-based parallel programming paradigm for multi-core processors. The performance gain of this paradigm depends to a great extent on the efficiency of its parallel constructs. The overheads incurred by these constructs determine the ability to create large-scale parallel programs, especially in the case of fine-grain parallelism. This paper presents a study of TBB parallelization overheads. For this purpose, a TBB micro-benchmark suite called TBBench has been developed. We use TBBench to evaluate the parallelization overheads of TBB on different multi-core machines and with different compilers. We report the relative overheads in detail and analyze the results.

Development of Runner System for Filling Balance in Multi Cavity Injection Mold (다수 캐비티 사출금형에서 균형 충전용 러너 시스템 개발)

  • Jeong Y. D.
    • Proceedings of the Korean Society for Technology of Plasticity Conference / 2005.09a / pp.13-16 / 2005
  • For mass production, an injection mold usually has multiple cavities that are filled through a geometrically balanced runner system. Despite the geometrically balanced runner system, filling imbalances between cavities have always been observed. These filling imbalances are one of the most significant factors affecting the quality of plastic parts molded in a multi-cavity injection mold. Filling imbalances result from a non-symmetrical shear rate distribution within the melt as it flows through the runner system. It has been possible to decrease the filling imbalance by optimizing processing conditions, but this has not completely eliminated the phenomenon during injection molding. This paper presents a solution to these filling imbalances using a 'runner core pin'. The runner core pin developed in this study creates a symmetrical shear distribution within the runner. Using the runner core pin, a remarkable reduction in filling imbalance was confirmed.

Implementation and Verification of a Multi-Core Processor including Multimedia Specific Instructions (멀티미디어 전용 명령어를 내장한 멀티코어 프로세서 구현 및 검증)

  • Seo, Jun-Sang;Kim, Jong-Myon
    • IEMEK Journal of Embedded Systems and Applications / v.8 no.1 / pp.17-24 / 2013
  • In this paper, we present a multi-core processor with multimedia-specific instructions for processing multimedia data efficiently in the mobile environment. The multimedia-specific instructions exploit subword-level parallelism (SLP), while the multi-core processor exploits data-level parallelism (DLP). Combined, these forms of parallelism improve the performance of multimedia processing applications. The proposed multi-core processor with multimedia-specific instructions is implemented and tested using the Xilinx ISE 10.1 tool and a SoCMaster3 testbed system that includes a Virtex-4 FPGA. Experimental results using a fire detection algorithm show that the multimedia-specific instructions outperform baseline instructions on the same multi-core architecture in terms of performance (1.2x better), energy efficiency (1.37x better), and area efficiency (1.23x better).
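
Subword-level parallelism means one instruction operates on several small values packed into a single machine word. The sketch below imitates this in software for four 8-bit lanes in a 32-bit word. It is an illustration of the general idea only; the paper's actual instruction set is not described in the abstract, and `pack`, `unpack`, and `add_bytes` are hypothetical names.

```python
LANE_HIGH = 0x80808080  # top bit of each 8-bit lane
LANE_LOW = 0x7F7F7F7F   # low 7 bits of each 8-bit lane


def pack(lanes):
    """Pack four 8-bit values into one 32-bit word, lane 0 in the lowest byte."""
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & 0xFF) << (8 * i)
    return word


def unpack(word):
    """Split a 32-bit word back into its four 8-bit lanes."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def add_bytes(a, b):
    """Lane-wise modulo-256 addition of four packed bytes in one pass."""
    # Adding only the low 7 bits keeps every carry inside its own lane.
    low = (a & LANE_LOW) + (b & LANE_LOW)
    # The top bit of each lane is the XOR of both top bits and the carry into it.
    return (low ^ ((a ^ b) & LANE_HIGH)) & 0xFFFFFFFF
```

For example, `unpack(add_bytes(pack([1, 200, 255, 16]), pack([2, 100, 1, 16])))` gives `[3, 44, 0, 32]`: each lane wraps around on its own, and the overflow in one lane does not spill into its neighbour. A hardware SLP instruction performs the same kind of lane-isolated operation in a single step.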

The Study of Distributed Processing for Graphics Rendering Engine Based on ARINC 653 Multi-Core System (ARINC 653 멀티코어 기반 그래픽스 렌더링 엔진 분산처리방안 연구)

  • Jung, Mukyoung
    • Journal of Aerospace System Engineering / v.13 no.5 / pp.1-8 / 2019
  • Recently, avionics has been migrating from a federated architecture to an integrated modular architecture based on multi-core processors to reduce the number of systems, weight, power consumption, and platform redundancy. The volume of data that must be provided to the pilot through the display device has increased, because a single integrated device performs multiple functions. For this reason, the volume of data processed by the graphics processor within a fixed operation period has also increased. In this paper, we present a multi-core-based rendering engine to perform more graphics processing within a fixed operation period. The proposed method assumes a multi-core-based partitioning operating system using the AMP (Asymmetric Multi-Processing) architecture.