• Title/Summary/Keyword: multi-core CPU


Parallel Processing of Multi-Core Processor and GPUs in Projection Step for Efficient Fluid Simulation (효율적인 유체 시뮬레이션을 위한 투영 단계에서의 멀티 코어 프로세서와 그래픽 프로세서의 병렬처리)

  • Kim, Sun-Tae;Jung, Hwi-Ryong;Hong, Jeong-Mo
    • The Journal of the Korea Contents Association / v.13 no.6 / pp.48-54 / 2013
  • Recent state-of-the-art techniques for fluid simulation in computer graphics rely on heterogeneous parallelization across the CPU and GPU. In this paper, we present a novel CPU-GPU parallel algorithm that solves the projection step of fluid simulation more efficiently than existing sequential CPU-GPU processing. The proposed method allows fluid simulations that demand substantial computational resources to be carried out efficiently.

Performance Analysis on Next-Generation Web Browser at Multicore CPU and GPU (멀티 코어와 GPU가 차세대 웹 브라우저의 성능에 미치는 영향 분석)

  • Hong, Gyeong-Hwan;Kim, Dae-Ho;Shin, Dong-Kun
    • Proceedings of the Korean Information Science Society Conference / 2012.06a / pp.355-357 / 2012
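  • Next-generation web browsers have a multi-threaded structure and render rich graphics based on HTML5 and WebGL, so the performance of multi-core CPUs and GPUs has a large effect on web browser performance. Using the open-source web browser Chromium, this paper compares and analyzes how the performance of web applications running in the browser changes as processor performance changes, and how much each browser operation contributes to that change. The results show that the number of CPU cores strongly affects rendering performance, and that GPU performance largely determines WebGL performance.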

Improvement in Reconstruction Time Using Multi-Core Processor on Computed Tomography (다중코어 프로세서를 이용한 전산화단층촬영의 재구성 시간 개선)

  • Chon, Kwon Su
    • Journal of the Korean Society of Radiology / v.9 no.7 / pp.487-493 / 2015
  • Reconstruction in computed tomography requires a long calculation time, and that time increases rapidly as the matrix size is enlarged to improve image quality. Multi-core processors (multi-core CPUs) are now widely used and reduce calculation time through multi-threading. In this study, the calculation time of the reconstruction process was improved using multi-threading on a multi-core processor. Pthread and OpenMP were used to multi-thread the convolution and back-projection steps, which require much of the time in reconstruction. Pthread and OpenMP showed similar results in speedup and efficiency.
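
The speedup and efficiency figures mentioned in this abstract are usually computed from measured run times. The following is a minimal sketch that assumes the conventional definitions (speedup = serial time / parallel time, efficiency = speedup / number of threads); the function names and the timings in the usage note are hypothetical and are not taken from the paper.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-thread run time to multi-thread run time."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_threads):
    """Speedup divided by the thread count; 1.0 means ideal scaling."""
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    return speedup(t_serial, t_parallel) / n_threads
```

For example, a hypothetical reconstruction that takes 10.0 s on one thread and 4.0 s on four threads has `speedup(10.0, 4.0) == 2.5` and `efficiency(10.0, 4.0, 4) == 0.625`.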

Multi-core Scalable Real-time Flash Storage Simulation (멀티 코어 확장성을 제공하는 실시간 플래시 저장장치 시뮬레이션)

  • Lee, Hyeon-gyu;Min, Sang Lyul;Kim, Kanghee
    • Journal of KIISE / v.44 no.6 / pp.566-572 / 2017
  • As NAND flash storage has become widely used, its simulation methodologies have been studied from various aspects such as performance, reliability, and endurance. As a result, there have been advances in NAND flash storage simulation for both functional modeling and timing modeling. However, in addition to these advances, there is a need to drastically reduce the long simulation time required to evaluate the aging effect on flash storage. This paper proposes a multi-core scalable real-time flash storage simulation method, which can control the simulation speed according to the user's preference. With this method, the simulation can be sped up in proportion to the number of CPU cores provided while guaranteeing the correctness of the simulation result. Using our simulator, implemented as a Linux kernel module, we demonstrate the multi-core scalability and correctness of the proposed method.

SVM-based Energy-Efficient scheduling on Heterogeneous Multi-Core Mobile Devices (비대칭 멀티코어 모바일 단말에서 SVM 기반 저전력 스케줄링 기법)

  • Min-Ho, Han;Young-Bae, Ko;Sung-Hwa, Lim
    • Journal of Korea Society of Industrial Information Systems / v.27 no.6 / pp.69-75 / 2022
  • We propose an energy-efficient scheduling scheme that considers both real-time constraints and energy efficiency on smart mobile devices with a heterogeneous multi-core structure. Recently, high-performance applications such as VR, AR, and 3D games require real-time, high-level processing. The big.LITTLE architecture is applied to smart mobile devices for high performance and high energy efficiency. However, the energy-saving effect is reduced because the LITTLE cores are not properly utilized. This paper proposes a heterogeneous multi-core assignment technique that improves real-time performance and energy efficiency on the big.LITTLE architecture. The proposed method optimizes energy consumption and execution time by predicting the actual task execution time using an SVM (Support Vector Machine). Experiments on an off-the-shelf smartphone show that the proposed method reduces energy consumption while keeping execution time similar to that of legacy schemes.

Asymmetric Load Balancing on Multi-Core CPUs (멀티코어 CPU에서의 비대칭 부하 분산)

  • Kim, Hee-Gon;Lee, Sung-Ju;Chung, Yong-Wha
    • Proceedings of the Korean Information Science Society Conference / 2012.06a / pp.4-6 / 2012
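  • With the recent release of systems equipped with multi-core CPUs, many parallel processing techniques have been proposed. This paper proposes a method for effectively distributing load across cores in applications that consist of a module without data dependencies followed sequentially by a module with data dependencies. Specifically, instead of the conventional approach of distributing the dependency-free module symmetrically across the cores, we show that distributing the load asymmetrically yields a performance improvement that exceeds the upper bound computed from Amdahl's law.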
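
The upper bound that this abstract refers to is given by Amdahl's law: if a fraction f of the work can be parallelized over n cores, the speedup is at most 1 / ((1 - f) + f / n). The sketch below computes only that textbook bound. It does not reproduce the paper's asymmetric distribution method, and the function and parameter names are illustrative assumptions.

```python
def amdahl_speedup_bound(parallel_fraction, n_cores):
    """Upper bound on speedup under Amdahl's law.

    parallel_fraction: share of the run time that can be parallelized (0..1).
    n_cores: number of cores the parallel part is spread over.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)
```

With `parallel_fraction=0.0` the bound is 1.0 regardless of the core count, and with `parallel_fraction=1.0` it equals the core count. Every intermediate value falls between those two extremes.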

The Implementation of Real-time Performance Monitor for Multi-thread Application (멀티스레드 어플리케이션을 위한 실시간 성능모니터의 구현)

  • Kim, Jin-Hyuk;Shin, Kwang-Sik;Yoon, Wan-Oh;Lee, Chang-Ho;Choi, Sang-Bang
    • Journal of the Institute of Electronics Engineers of Korea CI / v.48 no.3 / pp.82-90 / 2011
  • Multi-core systems are becoming more common as microprocessors develop. Because of this shift in the paradigm for performance improvement, conventional single-threaded applications are being replaced with multi-threaded applications. Because multi-threaded applications are complex to develop, performance monitoring tools are used to optimize application performance. Conventional performance monitoring tools focus on performance itself rather than on user friendliness or real-time support. A real-time performance monitor identifies problems while a multi-threaded application is running and also checks the real-time operating status of the application. It can therefore be a more effective tool for finding the cause of a problem than a non-real-time performance monitor that offers only simple performance indicators. In this paper, we propose RMPM (Real-time Multi-core Performance Monitor), a real-time performance monitoring tool for multi-core systems. The observation period is optimized by comparing the overhead incurred by the performance evaluation period against accuracy. Our performance monitor shows not only the system-wide CPU, memory, and network usage but also the distribution of overhead per thread of an application.

Efficient Hybrid Transactional Memory Scheme using Near-optimal Retry Computation and Sophisticated Memory Management in Multi-core Environment

  • Jang, Yeon-Woo;Kang, Moon-Hwan;Chang, Jae-Woo
    • Journal of Information Processing Systems / v.14 no.2 / pp.499-509 / 2018
  • Recently, hybrid transactional memory (HyTM) has gained much interest from researchers because it combines the advantages of hardware transactional memory (HTM) and software transactional memory (STM). To provide concurrency control for transactions, existing HyTM-based studies use a Bloom filter. However, they fail to overcome the false positive errors typical of a Bloom filter. In addition, the existing studies use a global lock, and global lock-based memory allocation is significantly inefficient in a multi-core environment. In this paper, we propose an efficient hybrid transactional memory scheme using near-optimal retry computation and sophisticated memory management in order to process transactions efficiently in a multi-core environment. First, we propose a near-optimal retry computation algorithm that provides an efficient HTM configuration using machine learning algorithms, according to the characteristics of a given workload. Second, we provide efficient concurrency control for transactions in different environments by using a sophisticated Bloom filter. Third, we propose a memory management scheme optimized for the CPU cache line in order to provide fast transaction processing. Finally, our performance evaluation using the Stanford transactional applications for multi-processing (STAMP) benchmarks shows that our HyTM scheme achieves up to 2.5 times better performance than the state-of-the-art algorithms.
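
The false positive behavior mentioned in this abstract is a property of Bloom filters in general: a lookup can report an item that was never added, but it never misses one that was. The sketch below is a plain textbook Bloom filter used only to illustrate that property. It is not the paper's "sophisticated" filter, and the class name, parameters, and SHA-256-based hashing are assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, possible false positives."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        # One byte per bit keeps the sketch simple; a real filter packs bits.
        self.bits = bytearray(n_bits)

    def _positions(self, item):
        # Derive n_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))
```

A filter that is too small for its contents makes the false positive case easy to see: with `n_bits=1`, adding any single item causes `might_contain` to return `True` for every other item as well.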

A Function-characteristic Aware Thread-mapping Strategy for an SEDA-based Message Processor in Multi-core Environments (멀티코어 환경에서 SEDA 기반 메시지 처리기의 수행함수 특성을 고려한 쓰레드 매핑 기법)

  • Kang, Heeeun;Park, Sungyong;Lee, Younjeong;Jee, Seungbae
    • Journal of KIISE / v.44 no.1 / pp.13-20 / 2017
  • A message processor is server software that receives messages in various formats from clients, creates the corresponding threads to process them, and finally delivers the results to the destination. Considering that each function of an SEDA-based message processor has its own characteristics, such as being CPU-bound or IO-bound, this paper proposes a thread-mapping strategy called "FC-TM" (function-characteristic aware thread mapping) that schedules threads to cores based on the function characteristics in multi-core environments. This paper assumes that message-processor functions are static in the sense that they are pre-defined when the message processor is built; therefore, we profile each function in advance and use that information to map each thread to a core in order to maximize throughput. The benchmarking results show that throughput increased by up to 72% compared with previous studies when the ratio of IO-bound functions to CPU-bound functions exceeds a certain percentage.

Parallel Cell-Connectivity Information Extraction Algorithm for Ray-casting on Unstructured Grid Data (비정렬 격자에 대한 광선 투사를 위한 셀 사이 연결정보 추출 병렬처리 알고리즘)

  • Lee, Jihun;Kim, Duksu
    • Journal of the Korea Computer Graphics Society / v.26 no.1 / pp.17-25 / 2020
  • We present a novel multi-core CPU-based parallel algorithm for cell-connectivity information extraction, which is one of the preprocessing steps for volume rendering of unstructured grid data. We first examine the synchronization issues that arise when the prior serial algorithm is parallelized naively. Then, we propose a 3-step parallel algorithm that achieves high parallelization efficiency by removing synchronization in each step. Our 3-step algorithm also improves cache utilization by increasing spatial locality in the duplicated triangle test, which is the core operation of building cell-connectivity information. We further improve the efficiency of our parallel algorithm by employing a memory pool for each thread. To evaluate the benefit of our approach, we implemented our method on a system consisting of two octa-core CPUs and measured its performance. Our method shows continuous performance improvement as threads are added, and it achieves up to 82.9 times higher performance than the prior serial algorithm when thirty-two threads (sixteen physical cores) are used. These results demonstrate the high parallelization efficiency and high cache utilization efficiency of our method. They also validate the suitability of our algorithm for large-scale unstructured data.
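
The "duplicated triangle test" described in this abstract amounts to finding triangular faces that appear in two cells, since two cells are connected exactly when they share a face. The sketch below is a simple serial illustration of that idea. It assumes tetrahedral cells given as tuples of four vertex indices, and it is not the paper's 3-step parallel algorithm; the function name and data layout are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cell_connectivity(cells):
    """Map each cell index to the sorted list of cells sharing a face with it.

    cells: sequence of 4-tuples of vertex indices (tetrahedra, by assumption).
    """
    # Key each triangular face by its sorted vertex indices so that the same
    # face seen from two different cells produces the same dictionary key.
    face_to_cells = defaultdict(list)
    for cell_id, verts in enumerate(cells):
        for face in combinations(sorted(verts), 3):
            face_to_cells[face].append(cell_id)

    # A face listed under two cells is an interior face: those cells touch.
    neighbors = defaultdict(set)
    for sharing in face_to_cells.values():
        if len(sharing) == 2:
            a, b = sharing
            neighbors[a].add(b)
            neighbors[b].add(a)

    return {cell_id: sorted(neighbors[cell_id]) for cell_id in range(len(cells))}
```

In a naive parallel version of this loop, every thread would write to the shared `face_to_cells` dictionary, which is the kind of synchronization issue the abstract says the authors set out to remove.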