• Title/Summary/Keyword: multicore

Search Result 143, Processing Time 0.214 seconds

Fault-tolerant Scheduling of Real-time Tasks with Energy Efficiency on Lightly Loaded Multicore Processors

  • Lee, Wan Yeon;Choi, Yun-Seok
    • International journal of advanced smart convergence
    • /
    • v.7 no.3
    • /
    • pp.92-100
    • /
    • 2018
  • In this paper, we propose a fault-tolerant scheduling scheme with energy efficiency for real-time periodic tasks on DVFS-enabled multicore processors. The scheme provides the tolerance of a permanent fault with the primary-backup task model. Also the scheme reduces the energy consumption of real-time tasks with the fully overlapped execution between each primary task and its backup task, whereas most of previous methods tried to minimize the overlapped execution between the two tasks. In order to the leakage energy loss of idle cores, the scheme activates a part of available cores with rarely used cores powered off. Evaluation results show that the proposed scheme saves up to 82% energy consumption of the previous method.

Localized Eigenmodes in a Triangular Multicore Hollow Optical Fiber for Space-division Multiplexing in C+L Band

  • Hong, Seongjin;Oh, Kyunghwan
    • Current Optics and Photonics
    • /
    • v.2 no.3
    • /
    • pp.226-232
    • /
    • 2018
  • We propose a triangular-multicore hollow optical fiber (TMC-HOF) design for uncoupled mode-division and space-division multiplexing. The TMC-HOF has three triangular cores, and each core has three modes: $LP_{01}$ and two split $LP_{11}$ modes. The asymmetric structure of the triangular core can split the $LP_{11}$ modes. Using the proposed structures, nine independent modes can propagate in a fiber. We use a fully vectorial finite-element method to estimate effective index, chromatic dispersion, differential group delay (DGD), and confinement loss by controlling the parameters of the TMC-HOF structure. We confirm that the proposed TMC-HOF shows flattened chromatic dispersion, low DGD, low confinement loss, low core-to-core crosstalk, and low crosstalk between adjacent modes. The proposed TMC-HOF can provide a common platform for MDM and SDM applications.

Performance Comparison of Parallel Programming Frameworks in Digital Image Transformation

  • Shin, Woochang
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.11 no.3
    • /
    • pp.1-7
    • /
    • 2019
  • Previously, parallel computing was mainly used in areas requiring high computing performance, but nowadays, multicore CPUs and GPUs have become widespread, and parallel programming advantages can be obtained even in a PC environment. Various parallel programming frameworks using multicore CPUs such as OpenMP and PPL have been announced. Nvidia and AMD have developed parallel programming platforms and APIs for program developers to take advantage of multicore GPUs on their graphics cards. In this paper, we develop digital image transformation programs that runs on each of the major parallel programming frameworks, and measure the execution time. We analyze the characteristics of each framework through the execution time comparison. Also a constant K indicating the ratio of program execution time between different parallel computing environments is presented. Using this, it is possible to predict rough execution time without implementing a parallel program.

A Study on Effect of Code Distribution and Data Replication for Multicore Computing Architectures

  • Cho, Doosan
    • International Journal of Advanced Culture Technology
    • /
    • v.9 no.4
    • /
    • pp.282-287
    • /
    • 2021
  • A multicore system must be able to take full advantage of the program's instruction and data parallelism. This study introduces the data replication technique as a support technique to maximize the program's instruction and data parallelism. Instruction level parallelism can be limited by data dependency. In this case, if data is replicated to each processor core and used, instruction level parallelism can be used to the maximum. The technique proposed in this study can maximize the performance improvement effect when applied to scientific applications such as matrix multiplication operation.

Minimum-Power Scheduling of Real-Time Parallel Tasks based on Load Balancing for Frequency-Sharing Multicore Processors (주파수 공유형 멀티코어 프로세서를 위한 부하균등화에 기반한 실시간 병렬 작업들의 최소 전력 스케줄링)

  • Lee, Wan Yeon
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.4 no.6
    • /
    • pp.177-184
    • /
    • 2015
  • This paper proposes a minimum-power scheduling scheme of real-time parallel tasks while meeting deadlines of the real-time tasks on DVFS-enabled multicore processors. The proposed scheme first finds a floating number of processing cores to each task so that the computation load of all processing cores would be equalized. Next the scheme translates the found floating number of cores into a natural number of cores while maintaining the computation load of all cores unchanged, and allocates the translated natural number of cores to the execution of each task. The scheme is designed to minimize the power consumption of the frequency-sharing multicore processor operating with the same processing speed at an instant time. Evaluation shows that the scheme saves up to 38% power consumption of the previous method.

Parallel Multithreaded Processing for Data Set Summarization on Multicore CPUs

  • Ordonez, Carlos;Navas, Mario;Garcia-Alvarado, Carlos
    • Journal of Computing Science and Engineering
    • /
    • v.5 no.2
    • /
    • pp.111-120
    • /
    • 2011
  • Data mining algorithms should exploit new hardware technologies to accelerate computations. Such goal is difficult to achieve in database management system (DBMS) due to its complex internal subsystems and because data mining numeric computations of large data sets are difficult to optimize. This paper explores taking advantage of existing multithreaded capabilities of multicore CPUs as well as caching in RAM memory to efficiently compute summaries of a large data set, a fundamental data mining problem. We introduce parallel algorithms working on multiple threads, which overcome the row aggregation processing bottleneck of accessing secondary storage, while maintaining linear time complexity with respect to data set size. Our proposal is based on a combination of table scans and parallel multithreaded processing among multiple cores in the CPU. We introduce several database-style and hardware-level optimizations: caching row blocks of the input table, managing available RAM memory, interleaving I/O and CPU processing, as well as tuning the number of working threads. We experimentally benchmark our algorithms with large data sets on a DBMS running on a computer with a multicore CPU. We show that our algorithms outperform existing DBMS mechanisms in computing aggregations of multidimensional data summaries, especially as dimensionality grows. Furthermore, we show that local memory allocation (RAM block size) does not have a significant impact when the thread management algorithm distributes the workload among a fixed number of threads. Our proposal is unique in the sense that we do not modify or require access to the DBMS source code, but instead, we extend the DBMS with analytic functionality by developing User-Defined Functions.

Multicore DVFS Scheduling Scheme Using Parallel Processing for Reducing Power Consumption of Periodic Real-time Tasks (주기적 실시간 작업들의 전력 소모 감소를 위한 병렬 수행을 활용한 다중코어 DVFS 스케줄링 기법)

  • Pak, Suehee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.19 no.12
    • /
    • pp.1-10
    • /
    • 2014
  • This paper proposes a scheduling scheme that enhances power consumption efficiency of periodic real-time tasks using DVFS and power-shut-down mechanisms while meeting their deadlines on multicore processors. The proposed scheme is suitable for dependent multicore processors in which processing cores have an identical speed at an instant, and resolves the load unbalance of processing cores by exploiting parallel processing because the load unbalance causes inefficient power consumption in previous methods. Also the scheme activates a part of processing cores and turns off the power of unused cores. The number of activated processing cores is determined through mathematical analysis. Evaluation experiments show that the proposed scheme saves up to 77% power consumption of the previous method.

Secure and Energy-Efficient MPEG Encoding using Multicore Platforms (멀티코어를 이용한 안전하고 에너지 효율적인 MPEG 인코딩)

  • Lee, Sung-Ju;Lee, Eun-Ji;Hong, Seung-Woo;Choi, Han-Na;Chung, Yong-Wha
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.20 no.3
    • /
    • pp.113-120
    • /
    • 2010
  • Content security and privacy protection are important issues in emerging network-based video surveillance applications. Especially, satisfying both real-time constraint and energy efficiency with embedded system-based video sensors is challenging since the battery-operated sensors need to compress and protect video content in real-time. In this paper, we propose a multicore-based solution to compress and protect video surveillance data, and evaluate the effectiveness of the solution in terms of both real-time constraint and energy efficiency. Based on the experimental results with MPEG2/AES software, we confirm that the multicore-based solution can improve the energy efficiency of a singlecore-based solution by a factor of 30 under the real-time constraint.

Implementation of LTE Transport Channel on Multicore DSP Software Defined Radio Platform (멀티코어 DSP 기반 소프트웨어 정의 라디오 플랫폼을 활용한 LTE 전송 채널의 구현)

  • Lee, Jin
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.4
    • /
    • pp.508-514
    • /
    • 2020
  • To implement the continuously evolving mobile communication standards such as Long Term Evolution (LTE) and 5G, the Software Defined Radio (SDR) concept provides great flexibility and efficiency. For many years, a high-end Digital Signal Processor (DSP) System on Chip (SoC) has been developed to support multicore and various hardware coprocessors. This paper introduces the implementation of the SDR platform hardware using TI's TCI663x chip. Using the platform, LTE transport channel is implemented by interworking multicore DSP with Bit rate Coprocessor (BCP) and Turbo Decoder Coprocessor (TCP) and the performance is evaluated according to various implementation options. In order to evaluate the performance of the implemented LTE transport channel, LTE base station system was constructed by combining FPGA main board for physical channels, SDR platform board, and RF & Antenna board.

Improving Multi-DNN Computational Performance of Embedded Multicore Processors through a Global Queue (글로벌 큐를 통한 임베디드 멀티코어 프로세서의 멀티 DNN 연산 성능 향상)

  • Cho, Ho-jin;Kim, Myung-sun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.6
    • /
    • pp.714-721
    • /
    • 2020
  • DNN is expanding its use in embedded systems such as robots and autonomous vehicles. For high recognition accuracy, computational complexity is greatly increased, and multiple DNNs are running aperiodically. Therefore, the ability processing multiple DNNs in embedded environments is a crucial issue. Accordingly, multicore based platforms are being released. However, most DNN models are operated in a batch process, and when multiple DNNs are operated in multicore together, the execution time deviation between each DNN may be large and the end-to-end execution time of the whole DNNs could be long depending on how they are allocated to the cores. In this paper, we solve these problems by providing a framework that decompose each DNN into individual layers and then distribute to multicores through a global queue. As a result of the experiment, the total DNN execution time was reduced by 31%, and when operating multiple identical DNNs, the deviation in execution time was reduced by up to 95.1%.