• Title/Summary/Keyword: multi-core processors

Search Result 84, Processing Time 0.022 seconds

Multiple Signature Comparison of LogTM-SE for Fast Conflict Detection (다중 시그니처 비교를 통한 트랜잭셔널 메모리의 충돌해소 정책의 성능향상)

  • Kim, Deok-Ho;Oh, Doo-Hwan;Ro, Won-W.
    • The KIPS Transactions:PartA
    • /
    • v.18A no.1
    • /
    • pp.19-24
    • /
    • 2011
  • As era of multi-core processors has arrived, transactional memory has been considered as an effective method to achieve easy and fast multi-threaded programming. Various hardware transactional memory systems such as UTM, VTM, FastTM, LogTM, and LogTM-SE, have been introduced in order to implement high-performance multi-core processors. Especially, LogTM-SE has provided study performance with an efficient memory management policy and a practical thread scheduling method through conflict detection based on signatures. However, increasing number of cores on a processor imposes the hardware complexity for signature processing. This causes overall performance degradation due to the heavy workload on signature comparison. In this paper, we propose a new architecture of multiple signature comparison to improve conflict detection of signature based transactional memory systems.

An Efficient Block Cipher Implementation on Many-Core Graphics Processing Units

  • Lee, Sang-Pil;Kim, Deok-Ho;Yi, Jae-Young;Ro, Won-Woo
    • Journal of Information Processing Systems
    • /
    • v.8 no.1
    • /
    • pp.159-174
    • /
    • 2012
  • This paper presents a study on a high-performance design for a block cipher algorithm implemented on modern many-core graphics processing units (GPUs). The recent emergence of VLSI technology makes it feasible to fabricate multiple processing cores on a single chip and enables general-purpose computation on a GPU (GPGPU). The GPU strategy offers significant performance improvements for all-purpose computation and can be used to support a broad variety of applications, including cryptography. We have proposed an efficient implementation of the encryption/decryption operations of a block cipher algorithm, SEED, on off-the-shelf NVIDIA many-core graphics processors. In a thorough experiment, we achieved high performance that is capable of supporting a high network speed of up to 9.5 Gbps on an NVIDIA GTX285 system (which has 240 processing cores). Our implementation provides up to 4.75 times higher performance in terms of encoding and decoding throughput as compared to the Intel 8-core system.

Computation-Communication Overlapping in AES-CCM Using Thread-Level Parallelism on a Multi-Core Processor (멀티코어 프로세서의 쓰레드-수준 병렬성을 활용한 AES-CCM 계산-통신 중첩화)

  • Lee, Eun-Ji;Lee, Sung-Ju;Chung, Yong-Wha;Lee, Myung-Ho;Min, Byoung-Ki
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.8
    • /
    • pp.863-867
    • /
    • 2010
  • Multi-core processors are becoming increasingly popular. As they are widely adopted in embedded systems as well as desktop PC's, many multimedia applications are being parallelized on multi-core platforms. However, it is difficult to parallelize applications with inherent data dependencies such as encryption algorithms for multimedia data. In order to overcome this limit, we propose a technique to overlap computation and communication using an otherwise idle core in this paper. In particular, we interpret the problem of multimedia computation and communication as a pipeline design problem at the application program level, and derive an optimal number of stages in the pipeline.

Implementation and Performance Evaluation of Preempt-RT Based Multi-core Motion Controller for Industrial Robot (산업용 로봇 제어를 위한 Preempt-RT 기반 멀티코어 모션 제어기의 구현 및 성능 평가)

  • Kim, Ikhwan;Ahn, Hyosung;Kim, Taehyoun
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.12 no.1
    • /
    • pp.1-10
    • /
    • 2017
  • Recently, with the ever-increasing complexity of industrial robot systems, it has been greatly attention to adopt a multi-core based motion controller with high cost-performance ratio. In this paper, we propose a software architecture that aims to utilize the computing power of multi-core processors. The key concept of our architecture is to use shared memory for the interplay between threads running on separate processor cores. And then, we have integrated our proposed architecture with an industrial standard compliant IDE for automatic code generation of motion runtime. For the performance evaluation, we constructed a test-bed consisting of a motion controller with Preempt-RT Linux based dual-core industrial PC and a 3-axis industrial robot platform. The experimental results show that the actuation time difference between axes is 10 ns in average and bounded up to 689 ns under $1000{\mu}s$ control period, which can come up with real-time performance for industrial robot.

Collaborative Streamlined On-Chip Software Architecture on Heterogenous Multi-Cores for Low-Power Reactive Control in Automotive Embedded Processors (차량용 임베디드 프로세서에서 저전력 반응적 제어를 위한 이기종 멀티코어 협력적 스트리밍 온-칩 소프트웨어 구조)

  • Jisu, Kwon;Daejin, Park
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.17 no.6
    • /
    • pp.375-382
    • /
    • 2022
  • This paper proposes a multi-core cooperative computing structure considering the heterogeneous features of automotive embedded on-chip software. The automotive embedded software has the heterogeneous execution flow properties for various hardware drives. Software developed with a homogeneous execution flow without considering these properties will incur inefficient overhead due to core latency and load. The proposed method was evaluated on an target board on which a automotive MCU (micro-controller unit) with built-in multi-cores was mounted. We demonstrate an overhead reduction when software including common embedded system tasks, such as ADC sampling, DSP operations, and communication interfaces, are implemented in a heterogeneous execution flow. When we used the proposed method, embedded software was able to take advantage of idle states that occur between heterogeneous tasks to make efficient use of the resources on the board. As a result of the experiments, the power consumption of the board decreased by 42.11% compared to the baseline. Furthermore, the time required to process the same amount of sampling data was reduced by 27.09%. Experimental results validate the efficiency of the proposed multi-core cooperative heterogeneous embedded software execution technique.

A Real-Time Scheduling Technique on Multi-Core Systems for Multimedia Multi-Streaming (다중 멀티미디어 스트리밍을 위한 멀티코어 시스템 기반의 실시간 스케줄링 기법)

  • Park, Sang-Soo
    • Journal of Korea Multimedia Society
    • /
    • v.14 no.11
    • /
    • pp.1478-1490
    • /
    • 2011
  • Recently, multi-core processors have been drawing significant interest from the embedded systems research and industry communities due mainly to their potential for achieving high performance and fault-tolerance at low cost in such products as automobiles and cell phones. To process multimedia data, a scheduling algorithm is required to meet timing constraints of periodic tasks in the system. Though Pfair scheduling algorithm can meet all the timing constraints while achieving 100% utilization on multi-core based system theoretically, however, the algorithm incurs high scheduling overheads including frequent core migrations and system-wide synchronizations. To mitigate the problems, we propose a real-time scheduling algorithm for multi-core based system so that system-wide scheduling is performed only when it is absolutely necessary. Otherwise the proposed algorithm performs scheduling within each core independently. The experimental results by extensive simulations show that the proposed algorithm dramatically reduces the scheduling overheads up to as negligible one when the utilization is under 80%.

Analysis method of application effects of single and multi-core processors in automobile integrated ECU (차량 통합ECU에서 싱글 및 멀티코어 프로세서 효과 분석 방안)

  • Lim, Chan-Woo;Ju, Hyo-Jae
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2017.07a
    • /
    • pp.271-274
    • /
    • 2017
  • 통합 ECU를 구현할 시, 구현단계 이전에 어떤 코어를 적용 시킬지를 판단하여 추후 발생할 수 있는 비용, 성능, 소요 시간 등의 문제점을 미리 대비할 수 있는 방안이 필요하다. 따라서 본 논문에서는 차량 통합ECU에서 싱글코어와 멀티코어 프로세서 중 어떤 프로세서를 적용하는 것이 효율적인 것인지 판단하는 연구를 진행했다. 통합 ECU의 기능이 결정된 경우, 구현이전의 설계 단계에서 기능적 요구 사항들을 고려하여 SW Task를 HW 모델에 적용했을 때의 효과를 분석했다. 분석 된 결과를 토대로 통합 ECU 모델의 싱글 및 멀티 코어 방식에 따른 성능을 비교했다. 본 논문에서 제안한 분석 방안을 적용하면 구현 단계 이전에 통합 ECU에 필요한 프로세서의 성능을 파악할 수 있다. 또한, 통합 ECU 구현 비용 및 개발 소요 시간을 줄일 수 있는 효과를 불러올 수 있을 것으로 예상한다.

  • PDF

An Optimization Tool for Determining Processor Affinity of Networking Processes (통신 프로세스의 프로세서 친화도 결정을 위한 최적화 도구)

  • Cho, Joong-Yeon;Jin, Hyun-Wook
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.2
    • /
    • pp.131-136
    • /
    • 2013
  • Multi-core processors can improve parallelism of application processes and thus can enhance the system throughput. Researchers have recently revealed that the processor affinity is an important factor to determine network I/O performance due to architectural characteristics of multi-core processors; thus, many researchers are trying to suggest a scheme to decide an optimal processor affinity. Existing schemes to dynamically decide the processor affinity are able to transparently adapt for system changes, such as modifications of application and upgrades of hardware, but these have limited access to characteristics of application behavior and run-time information that can be collected heuristically. Thus, these can provide only sub-optimal processor affinity. In this paper, we define meaningful system variables for determining optimal processor affinity and suggest a tool to gather such information. We show that the implemented tool can overcome limitations of existing schemes and can improve network bandwidth.

A Task Scheduling Strategy in a Multi-core Processor for Visual Object Tracking Systems (시각물체 추적 시스템을 위한 멀티코어 프로세서 기반 태스크 스케줄링 방법)

  • Lee, Minchae;Jang, Chulhoon;Sunwoo, Myoungho
    • Transactions of the Korean Society of Automotive Engineers
    • /
    • v.24 no.2
    • /
    • pp.127-136
    • /
    • 2016
  • The camera based object detection systems should satisfy the recognition performance as well as real-time constraints. Particularly, in safety-critical systems such as Autonomous Emergency Braking (AEB), the real-time constraints significantly affects the system performance. Recently, multi-core processors and system-on-chip technologies are widely used to accelerate the object detection algorithm by distributing computational loads. However, due to the advanced hardware, the complexity of system architecture is increased even though additional hardwares improve the real-time performance. The increased complexity also cause difficulty in migration of existing algorithms and development of new algorithms. In this paper, to improve real-time performance and design complexity, a task scheduling strategy is proposed for visual object tracking systems. The real-time performance of the vision algorithm is increased by applying pipelining to task scheduling in a multi-core processor. Finally, the proposed task scheduling algorithm is applied to crosswalk detection and tracking system to prove the effectiveness of the proposed strategy.