• Title/Abstract/Keywords: Multicore Platforms

12 search results (processing time: 0.024 s)

Performance Evaluation of Real-time Linux for an Industrial Real-time Platform

  • Jo, Yong Hwan;Choi, Byoung Wook
    • International journal of advanced smart convergence
    • /
    • Vol. 11, No. 1
    • /
    • pp.28-35
    • /
    • 2022
  • This paper presents a performance evaluation of real-time Linux for industrial real-time platforms. On industrial platforms, multicore processors are popular because of their work-distribution efficiency and cost-effectiveness. Multicore processors, however, are not designed for applications with real-time constraints, and their performance depends on their core configuration. To assess the feasibility of multicore processors for real-time applications, we evaluate a general-purpose processor and a low-power processor, each running real-time Linux with both Xenomai and RT-preempt, taking the multicore configuration into account. Real-time performance is evaluated through scheduling latency, measured under CPU, memory, and network loads to reflect realistic operating conditions. The results show a difference between the low-power and general-purpose processors, but from a developer's point of view the low-power processor is a suitable solution for power-constrained situations.
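
Scheduling latency, the metric named in the abstract above, is the delay between the moment a periodic task should wake and the moment it actually runs. The following is a minimal user-space sketch of that measurement, similar in spirit to tools such as cyclictest. It is not the authors' benchmark; the function name `measure_wakeup_latency` and its parameters are illustrative assumptions, and the Python interpreter adds jitter of its own, so the numbers only illustrate the method.

```python
import time


def measure_wakeup_latency(period_s=0.001, iterations=200):
    """Return per-cycle wake-up latencies in microseconds.

    Hypothetical helper: sleeps until each absolute deadline and records
    how late the thread actually resumed.
    """
    latencies_us = []
    next_wake = time.monotonic() + period_s
    for _ in range(iterations):
        delay = next_wake - time.monotonic()
        if delay > 0:
            time.sleep(delay)          # the scheduler decides when we resume
        now = time.monotonic()
        latencies_us.append((now - next_wake) * 1e6)
        next_wake += period_s          # absolute deadlines avoid drift
    return latencies_us


if __name__ == "__main__":
    samples = measure_wakeup_latency()
    print(f"min={min(samples):.1f} us  max={max(samples):.1f} us  "
          f"avg={sum(samples) / len(samples):.1f} us")
```

Running the script once on an idle machine and once under CPU, memory, or network load gives a rough feel for how load widens the gap between minimum and maximum latency.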

Scratchpad Memory Architectures and Allocation Algorithms for Hard Real-Time Multicore Processors

  • Liu, Yu;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • Vol. 9, No. 2
    • /
    • pp.51-72
    • /
    • 2015
  • Time predictability is crucial in hard real-time and safety-critical systems. Cache memories, while useful for improving average-case memory performance, are not time-predictable, especially when they are shared in multicore processors. To achieve time predictability while minimizing the impact on performance, this paper explores several time-predictable scratchpad memory (SPM) based architectures for multicore processors. To support these architectures, we propose three strategies: a dynamic memory-object-allocation-based partition, a static-allocation-based partition, and a static-allocation-based priority L2 SPM. These strategies retain time predictability while attempting to maximize performance and energy efficiency. The SPM-based multicore architectural design and the related allocation methods thus form a comprehensive solution for hard real-time multicore computing. Our experimental results indicate the strengths and weaknesses of each proposed architecture and allocation method, which offer interesting on-chip memory design options for enabling multicore platforms in hard real-time systems.
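
To illustrate what a static SPM allocation decides, the sketch below uses a common textbook formulation: a 0/1 knapsack that picks the memory objects with the largest total access count that fit in the SPM. This is an assumption for illustration only; the abstract does not state that the paper's strategies use this formulation, and `allocate_spm` and the object tuples are hypothetical.

```python
def allocate_spm(objects, capacity):
    """Choose objects to place in the SPM (illustrative 0/1 knapsack).

    objects:  list of (name, size_bytes, access_count) tuples (hypothetical data)
    capacity: SPM size in bytes
    Returns (total_accesses_served_by_spm, tuple_of_chosen_names).
    """
    # best[c] holds the best (benefit, chosen names) using at most c bytes.
    best = [(0, ())] * (capacity + 1)
    for name, size, accesses in objects:
        # Iterate capacities downward so each object is used at most once.
        for c in range(capacity, size - 1, -1):
            candidate = best[c - size][0] + accesses
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - size][1] + (name,))
    return best[capacity]


if __name__ == "__main__":
    objs = [("a", 4, 10), ("b", 3, 7), ("c", 2, 6)]
    print(allocate_spm(objs, capacity=5))
```

Because the placement is fixed before execution, every access to a chosen object has a known latency, which is the property that makes SPMs attractive for WCET analysis.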

Multicore-Aware Code Co-Positioning to Reduce WCET on Dual-Core Processors with Shared Instruction Caches

  • Ding, Yiqiang;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • Vol. 6, No. 1
    • /
    • pp.12-25
    • /
    • 2012
  • For real-time systems, it is important to obtain an accurate worst-case execution time (WCET). Improving the WCET of applications that run on multicore processors is both significant and challenging, because the WCET can be strongly affected by inter-core interference in shared resources such as a shared L2 cache. To address this problem, we propose an approach that uses code positioning to reduce inter-core L2 cache interference between the real-time threads running on a multicore processor, applying one of three strategies. The worst-case-oriented strategy aims to reduce the largest WCET among these threads as much as possible. The other two strategies aim to reduce the WCET of each thread by an almost equal percentage or an almost equal amount. Our experiments indicate that the proposed multicore-aware code positioning approaches not only improve the worst-case performance of the real-time threads but also make good tradeoffs between efficiency and fairness for threads that run on multicore platforms.

Secure and Energy-Efficient MPEG Encoding using Multicore Platforms

  • 이성주;이은지;홍승우;최한나;정용화
    • 정보보호학회논문지
    • /
    • Vol. 20, No. 3
    • /
    • pp.113-120
    • /
    • 2010
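  • Content and privacy protection has become a major issue for network-based video surveillance systems, which have recently begun to spread. In particular, when a battery-powered video sensor built on an embedded system must perform compression and encryption in real time, satisfying the real-time requirement and energy efficiency at the same time is not easy. This paper proposes a multicore-based solution for compressing and encrypting video surveillance data and evaluates its efficiency in terms of real-time processing and energy consumption. Experiments with MPEG2/AES confirm that, within the range that satisfies the real-time requirement, the proposed multicore-based method can improve energy efficiency by up to 30 times compared with a conventional single-core method.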

Performance Comparison of Parallel Programming Frameworks in Digital Image Transformation

  • Shin, Woochang
    • International Journal of Internet, Broadcasting and Communication
    • /
    • Vol. 11, No. 3
    • /
    • pp.1-7
    • /
    • 2019
  • Previously, parallel computing was mainly used in areas requiring high computing performance, but multicore CPUs and GPUs are now widespread, and the advantages of parallel programming can be obtained even in a PC environment. Various parallel programming frameworks for multicore CPUs, such as OpenMP and PPL, have been released. Nvidia and AMD have developed parallel programming platforms and APIs that let program developers take advantage of the multicore GPUs on their graphics cards. In this paper, we develop digital image transformation programs that run on each of the major parallel programming frameworks and measure their execution times. We analyze the characteristics of each framework by comparing these execution times. We also present a constant K that indicates the ratio of program execution time between different parallel computing environments. Using K, it is possible to roughly predict execution time without implementing a parallel program.
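
The idea of a ratio constant K can be sketched as follows. The abstract does not say how K is computed, so this sketch assumes K is the mean ratio of paired execution times measured in two environments; the function names and the sample timings are hypothetical.

```python
def estimate_k(times_env_a, times_env_b):
    """Mean ratio of execution times, environment B over environment A.

    Assumption: both lists hold times for the same workloads, in the same order.
    """
    ratios = [b / a for a, b in zip(times_env_a, times_env_b)]
    return sum(ratios) / len(ratios)


def predict_time(time_env_a, k):
    """Rough prediction of the time in environment B from a time measured in A."""
    return k * time_env_a


if __name__ == "__main__":
    # Hypothetical timings in seconds for the same two image transformations.
    serial = [2.0, 4.0]
    parallel = [1.0, 2.0]
    k = estimate_k(serial, parallel)
    print(k, predict_time(10.0, k))
```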

Implementation of SDR-based LTE-A PDSCH Decoder for Supporting Multi-Antenna Using Multi-Core DSP

  • 나용;안흥섭;최승원
    • 디지털산업정보학회논문지
    • /
    • Vol. 15, No. 4
    • /
    • pp.85-92
    • /
    • 2019
  • This paper presents an SDR-based Long Term Evolution Advanced (LTE-A) Physical Downlink Shared Channel (PDSCH) decoder using a multicore Digital Signal Processor (DSP). The decoder is implemented on the multicore DSP TMS320C6670, which provides hardware accelerators such as a turbo decoder, a fast Fourier transform coprocessor, and a Bit Rate Coprocessor. The TMS320C6670 is a DSP specialized for base station platforms and is not optimized for mobile terminal platforms. Accordingly, in this paper, the hardware accelerators were reconfigured for terminal use to implement an LTE-A PDSCH decoder that supports multiple antennas, and the functions not provided by the hardware accelerators were implemented through core programming. A multicore pipeline was also implemented to meet the transmission time interval. To confirm the feasibility of the proposed implementation, we verified the real-time decoding capability of the PDSCH decoder using LTE-A Reference Measurement Channel (RMC) waveforms for transmission modes 2 and 3.

Parallelizing H.264 and AES Collectively

  • Kim, Heegon;Lee, Sungju;Chung, Yongwha;Pan, Sung Bum
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 7, No. 9
    • /
    • pp.2326-2337
    • /
    • 2013
  • Many applications can be parallelized by using multicore platforms. We propose a load-balancing technique for parallelizing a whole application whose first module (H.264) is data-independent and whose second module (AES) is data-dependent. Instead of distributing the first module symmetrically over the multicore platform, we distribute the data-independent workload asymmetrically so that the data-dependent workload can start as early as possible. Based on experimental results with a compression/encryption application, we confirm that asymmetric load balancing can provide better performance than typical symmetric load balancing.
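
The benefit of an asymmetric split can be seen in a toy timing model. The sketch below is an assumption-laden illustration, not the paper's algorithm: each core compresses one chunk starting at time 0, encryption of chunk i must wait for encryption of chunk i-1 (a chained dependency), and `h` and `a` are hypothetical per-unit compression and encryption costs. The geometric split sizes each chunk so that its compression ends exactly when the previous chunk's encryption ends.

```python
def makespan(chunks, h, a):
    """Finish time when chunk i is compressed on core i, then encrypted in order."""
    aes_end = 0.0
    for w in chunks:
        compress_end = h * w                           # all cores start at t = 0
        aes_end = max(aes_end, compress_end) + a * w   # wait for the previous chunk
    return aes_end


def symmetric_split(total, cores):
    """Equal share of the compression workload for every core."""
    return [total / cores] * cores


def asymmetric_split(total, cores, h, a):
    """Geometric chunk sizes with ratio 1 + a/h (requires a > 0)."""
    r = 1.0 + a / h
    first = total * (r - 1.0) / (r ** cores - 1.0)
    return [first * r ** i for i in range(cores)]


if __name__ == "__main__":
    h, a, total, cores = 4.0, 1.0, 100.0, 4
    print("symmetric :", makespan(symmetric_split(total, cores), h, a))
    print("asymmetric:", makespan(asymmetric_split(total, cores, h, a), h, a))
```

In this model the symmetric split leaves every core waiting for the serialized encryption stage, while the asymmetric split overlaps encryption of early chunks with compression of later ones.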

Energy-aware EDZL Real-Time Scheduling on Multicore Platforms

  • 한상철
    • 정보과학회 논문지
    • /
    • Vol. 43, No. 3
    • /
    • pp.296-303
    • /
    • 2016
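  • Mobile real-time systems, whose system resources and available power are limited, must not only satisfy timing constraints but also make full use of system resources when the system load is high and reduce energy consumption when the load is low. EDZL (Earliest Deadline until Zero Laxity), a multiprocessor real-time scheduling algorithm, achieves high system utilization, but very little research exists on energy-saving techniques for it. This paper addresses dynamic voltage and frequency scaling (DVFS) for EDZL scheduling on multicore platforms. It proposes a method for determining a uniform speed for full-chip DVFS platforms and a method for determining individual speeds for per-core DVFS platforms. Based on the EDZL schedulability test, the technique is simple yet effective at determining task execution speeds offline. Simulations also show that the proposed technique saves energy effectively.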
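
For readers unfamiliar with EDZL, the sketch below shows only its priority rule: jobs are ordered by earliest deadline, except that any job whose laxity (deadline minus current time minus remaining execution time) has reached zero is promoted to the highest priority. It does not implement the paper's speed-determination method, and `edzl_pick` and the job tuples are hypothetical names chosen for illustration.

```python
def edzl_pick(jobs, now, m):
    """Return the names of the jobs EDZL would run on m cores at time `now`.

    jobs: list of (name, absolute_deadline, remaining_execution_time) tuples
    """
    def priority(job):
        _, deadline, remaining = job
        laxity = deadline - now - remaining
        # Zero-laxity jobs sort first; ties and the rest follow EDF order.
        return (0 if laxity <= 0 else 1, deadline)

    return [job[0] for job in sorted(jobs, key=priority)[:m]]


if __name__ == "__main__":
    # B has a later deadline than A but no slack left, so it runs first.
    print(edzl_pick([("A", 10, 3), ("B", 12, 12)], now=0, m=1))
```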

Improving Multi-DNN Computational Performance of Embedded Multicore Processors through a Global Queue

  • 조호진;김명선
    • 한국정보통신학회논문지
    • /
    • Vol. 24, No. 6
    • /
    • pp.714-721
    • /
    • 2020
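  • DNNs are being applied ever more widely in embedded systems such as robots and autonomous vehicles. Recently, computational complexity has grown substantially in pursuit of higher recognition accuracy, and it is increasingly common for multiple DNNs to be used aperiodically. The ability to process multiple DNNs in an embedded environment has therefore become an important issue, and multicore-based platforms are being released in response. However, most DNN models run as batch processes, so when several DNNs run together on a multicore processor, the variation in execution time among the DNNs can be large and the overall DNN execution time can grow, depending on how the DNNs are assigned to cores. This paper addresses these problems by providing a framework that restructures each DNN layer by layer rather than as a batch and distributes the layers across the cores through a global queue. In experiments, the total DNN execution time decreased by 31%, and the variation in execution time when running multiple identical DNNs decreased by up to 95.1%.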
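
The layer-wise global queue described above can be sketched with a shared work queue and a pool of workers. This is a minimal illustration under stated assumptions, not the paper's framework: each "DNN" is modeled as a non-empty list of plain Python callables, worker threads stand in for cores, and `run_dnns` is a hypothetical name.

```python
import queue
import threading


def run_dnns(dnns, inputs, num_workers=4):
    """Run several layer lists through one global queue; return final outputs."""
    tasks = queue.Queue()
    results = [None] * len(dnns)
    for dnn_id, x in enumerate(inputs):
        tasks.put((dnn_id, 0, x))              # first layer of every DNN

    def worker():
        while True:
            item = tasks.get()
            if item is None:                   # sentinel: shut down
                tasks.task_done()
                return
            dnn_id, layer_idx, x = item
            y = dnns[dnn_id][layer_idx](x)
            if layer_idx + 1 < len(dnns[dnn_id]):
                # Enqueue the next layer before marking this one done,
                # so the queue never looks finished too early.
                tasks.put((dnn_id, layer_idx + 1, y))
            else:
                results[dnn_id] = y
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    tasks.join()                               # every layer task has completed
    for _ in threads:
        tasks.put(None)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    net_a = [lambda x: x + 1, lambda x: x * 2]
    net_b = [lambda x: x - 1]
    print(run_dnns([net_a, net_b], [1, 10]))
```

Because any idle worker can take the next layer of any DNN, no single network is pinned to one core, which is the intuition behind the reduced execution-time variation reported in the abstract.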

On-line Trace Based Automatic Parallelization of Java Programs on Multicore Platforms

  • Sun, Yu;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • Vol. 6, No. 2
    • /
    • pp.105-118
    • /
    • 2012
  • We propose two new approaches that automatically parallelize Java programs at runtime. These approaches, which rely on run-time trace information collected during program execution, dynamically recompile Java bytecode that can be executed in parallel. One approach uses trace information to improve traditional loop parallelization, and the other parallelizes traces instead of loop iterations. We also describe a cost/benefit model that makes intelligent parallelization decisions, as well as a parallel execution environment for running parallelized programs. These techniques are based on Jikes RVM. Our approach is evaluated by parallelizing sequential Java programs, and its performance is compared with that of manually parallelized code. According to the experimental results, our approach has low overhead and achieves speedups competitive with manually parallelized code. Moreover, trace parallelization can exploit parallelism beyond loop iterations.