• Title/Summary/Keyword: Multi-core Processors

Search Result 84, Processing Time 0.026 seconds

Implementation of Kernel Module for Shared Memory in Dual Bus System (듀얼 버스 시스템에서의 공유 메모리 커널 모듈 구현)

  • Moon, Ji-Hoon;Oh, Jae-Chul
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.10 no.5
    • /
    • pp.539-548
    • /
    • 2015
  • In this paper, shared memory feature was developed in multi-core system with different OS for different processor-specific bus, while conducting an experiment on shared memory feature between the two processors based on embedded Linux system. For the purpose of developing shared memory in dual bus structure, memory controller was used, while managing shared memory segment through list data structure. For AMP multi-core test, Linux OS was installed in 2 processor cores. In addition, it verified the creation and use of shared memory by using kernel module implemented to test shared memory.

High Performance Message Scattering Algorithm in Multicore Processor (멀티코어 프로세서에서의 효율적인 메시지 스캐터링 지원 기법)

  • Park, Jongsu
    • Journal of Platform Technology
    • /
    • v.10 no.2
    • /
    • pp.3-9
    • /
    • 2022
  • In this paper, to maximize the performance of the scatter communication in multi-core and many-core processors, a technique that considers the communication situation of the processing node is applied to a multi-core processor composed of 32 processing nodes. Since the existing scatter algorithm cannot recognize the communication conditions of the processing nodes, communication is generally performed according to an initially set transmission order. In this case, scatter communication starts only after the communication currently being performed by all processing nodes inside the processor is finished. The scatter communication performance was improved by this technique, and it was confirmed that there was a performance improvement of up to 78.93% compared to the existing algorithm through BFM simulation.

Real-Time Power-Saving Scheduling Based on Genetic Algorithms in Multi-core Hybrid Memory Environments (멀티코어 이기종메모리 환경에서의 유전 알고리즘 기반 실시간 전력 절감 스케줄링)

  • Yoo, Suhyeon;Jo, Yewon;Cho, Kyung-Woon;Bahn, Hyokyung
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.20 no.1
    • /
    • pp.135-140
    • /
    • 2020
  • Recently, due to the rapid diffusion of intelligent systems and IoT technologies, power saving techniques in real-time embedded systems has become important. In this paper, we propose P-GA (Parallel Genetic Algorithm), a scheduling algorithm aims at reducing the power consumption of real-time systems in multi-core hybrid memory environments. P-GA improves the Proportional-Fairness (PF) algorithm devised for multi-core environments by combining the dynamic voltage/frequency scaling of the processor with the nonvolatile memory technologies. Specifically, P-GA applies genetic algorithms for optimizing the voltage and frequency modes of processors and the memory types, thereby minimizing the power consumptions of the task set. Simulation experiments show that the power consumption of P-GA is reduced by 2.85 times compared to the conventional schemes.

Dynamic Scheduling of Network Processes for Multi-Core Systems (멀티 코어 시스템에서 통신 프로세스의 동적 스케줄링)

  • Jang, Hye-Churn;Jin, Hyun-Wook;Kim, Hag-Young
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.15 no.12
    • /
    • pp.968-972
    • /
    • 2009
  • The multi-core processors are being widely exploited by many high-end systems. With significant advances in processor architecture, the network band-width required on the high-end systems is increasing drastically. It is therefore highly desirable to manage multiple cores efficiently to achieve high network band-width with minimum resource requirements. Modern operating systems, however, still have significant design and optimization space to leverage the network performance over multi-core systems. In this paper, we suggest a novel networking process scheduling scheme, which decides the best processor affinity of networking processes based on the processor cache layout, communication intensiveness, and processor loads. The experimental results show that the scheduling scheme implemented in the Linux kernel can improve the network bandwidth and the effectiveness of processor utilization by 20% and 59%, respectively.

Multi-Core Processor for Real-Time Sound Synthesis of Gayageum (가야금의 실시간 음 합성을 위한 멀티코어 프로세서 구현)

  • Choi, Ji-Won;Cho, Sang-Jin;Kim, Cheol-Hong;Kim, Jong-Myon;Chong, Ui-Pil
    • The KIPS Transactions:PartA
    • /
    • v.18A no.1
    • /
    • pp.1-10
    • /
    • 2011
  • Physical modeling has been widely used for sound synthesis since it synthesizes high quality sound which is similar to real-sound for musical instruments. However, physical modeling requires a lot of parameters to synthesize a large number of sounds simultaneously for the musical instrument, preventing its real-time processing. To solve this problem, this paper proposes a single instruction, multiple data (SIMD) based multi-core processor that supports real-time processing of sound synthesis of gayageum which is a representative Korean traditional musical instrument. The proposed SIMD-base multi-core processor consists of 12 processing elements (PE) to control 12 strings of gayageum in which each PE supports modeling of the corresponding string. The proposed SIMD-based multi-core processor can generate synthesized sounds of 12 strings simultaneously after receiving excitation signals and parameters of each string as an input. Experimental results using a sampling reate 44.1 kHz and 16 bits quantization show that synthesis sound using the proposed multi-core processor was very similar to the original sound. In addition, the proposed multi-core processor outperforms commercial processors(TI's TMS320C6416, ARM926EJ-S, ARM1020E) in terms of execution time ($5.6{\sim}11.4{\times}$ better) and energy efficiency (about $553{\sim}1,424{\times}$ better).

Performance Evaluation and Verification of MMX-type Instructions on an Embedded Parallel Processor (임베디드 병렬 프로세서 상에서 MMX타입 명령어의 성능평가 및 검증)

  • Jung, Yong-Bum;Kim, Yong-Min;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.10
    • /
    • pp.11-21
    • /
    • 2011
  • This paper introduces an SIMD(Single Instruction Multiple Data) based parallel processor that efficiently processes massive data inherent in multimedia. In addition, this paper implements MMX(MultiMedia eXtension)-type instructions on the data parallel processor and evaluates and analyzes the performance of the MMX-type instructions. The reference data parallel processor consists of 16 processors each of which has a 32-bit datapath. Experimental results for a JPEG compression application with a 1280x1024 pixel image indicate that MMX-type instructions achieves a 50% performance improvement over the baseline instructions on the same data parallel architecture. In addition, MMX-type instructions achieves 100% and 51% improvements over the baseline instructions in energy efficiency and area efficiency, respectively. These results demonstrate that multimedia specific instructions including MMX-type have potentials for widely used many-core GPU(Graphics Processing Unit) and any types of parallel processors.

Core-aware Cache Replacement Policy for Reconfigurable Last Level Cache (재구성 가능한 라스트 레벨 캐쉬 구조를 위한 코어 인지 캐쉬 교체 기법)

  • Son, Dong-Oh;Choi, Hong-Jun;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.18 no.11
    • /
    • pp.1-12
    • /
    • 2013
  • In multi-core processors, Last Level Cache(LLC) can reduce the speed gap between the memory and the core. For this reason, LLC has big impact on the performance of processors. LLC is composed of shared cache and private cache. In computer architecture community, most researchers have mainly focused on the management techniques for shared cache, while management techniques for private cache have not been widely researched. In conventional private LLC, memory is statically assigned to each core, resulting in serious performance degradation when the workloads are not fairly distributed. To overcome this problem, this paper proposes the replacement policy for managing private cache of LLC efficiently. As proposed core-aware cache replacement policy can reconfigure LLC dynamically, hit rate of LLC is increases drastically. Moreover, proposed policy uses 2-bit saturating counters to improve the performance. According to our simulation results, the proposed method can improve hit rates by 9.23% and reduce the access time by 12.85% compared to the conventional method.

A Study on Parallel AES Cipher Algorithm based on Multi Processor (멀티프로세서 기반의 병렬 AES 암호 알고리즘에 관한 연구)

  • Park, Jung-Oh;Oh, Gi-Oug
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.1
    • /
    • pp.171-181
    • /
    • 2012
  • This paper defines the AES password algorithm used as a symmetric-key-based password algorithm, and proposes the design of parallel password algorithm to utilize the resources of multi-core processor as much as possible. The proposed parallel password algorithm was confirmed for parallel execution of password computation by allocating the password algorithm according to the number of cores, and about 30% of performance increase compared to AES password algorithm. The encryption/decryption performance of the password algorithm was confirmed through binary comparative analysis tool, which confirmed that the binary results were the same for AES password algorithm and proposed parallel password algorithm, and the decrypted binary were also the same. The parallel password algorithm for multi-core environment proposed in this paper can be applied to authentication/payment of financial service in PC, laptop, server, and mobile environment, and can be utilized in the area that required high-speed encryption operation of large-sized data.

A Performance Study of Embedded Multicore Processor Architectures (임베디드 멀티코어 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.13 no.1
    • /
    • pp.163-169
    • /
    • 2013
  • Recently, the importance of embedded system is growing rapidly. In-order to satisfy the real-time constraints of the system, high performance embedded processor is required. Therefore, as in general purpose computer systems, embedded processor should be designed as multicore architecture as well. Using MiBench benchmarks as input, the trace-driven simulation has been performed and analyzed for the 2-core to 16-core embedded processor architectures with different types of cores from simple RISC to in-order and out-of-order superscalar processors, extensively. As a result, the achievable performance is as high as 23 times over the single core embedded RISC processor.

Parallel LDPC Decoding on a Heterogeneous Platform using OpenCL

  • Hong, Jung-Hyun;Park, Joo-Yul;Chung, Ki-Seok
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.10 no.6
    • /
    • pp.2648-2668
    • /
    • 2016
  • Modern mobile devices are equipped with various accelerated processing units to handle computationally intensive applications; therefore, Open Computing Language (OpenCL) has been proposed to fully take advantage of the computational power in heterogeneous systems. This article introduces a parallel software decoder of Low Density Parity Check (LDPC) codes on an embedded heterogeneous platform using an OpenCL framework. The LDPC code is one of the most popular and strongest error correcting codes for mobile communication systems. Each step of LDPC decoding has different parallelization characteristics. In the proposed LDPC decoder, steps suitable for task-level parallelization are executed on the multi-core central processing unit (CPU), and steps suitable for data-level parallelization are processed by the graphics processing unit (GPU). To improve the performance of OpenCL kernels for LDPC decoding operations, explicit thread scheduling, vectorization, and effective data transfer techniques are applied. The proposed LDPC decoder achieves high performance and high power efficiency by using heterogeneous multi-core processors on a unified computing framework.