• Title/Summary/Keyword: 멀티코어 구조

Search Result 136, Processing Time 0.026 seconds

Dynamic Scheduling of Network Processes for Multi-Core Systems (멀티 코어 시스템에서 통신 프로세스의 동적 스케줄링)

  • Jang, Hye-Churn;Jin, Hyun-Wook;Kim, Hag-Young
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.15 no.12
    • /
    • pp.968-972
    • /
    • 2009
  • The multi-core processors are being widely exploited by many high-end systems. With significant advances in processor architecture, the network band-width required on the high-end systems is increasing drastically. It is therefore highly desirable to manage multiple cores efficiently to achieve high network band-width with minimum resource requirements. Modern operating systems, however, still have significant design and optimization space to leverage the network performance over multi-core systems. In this paper, we suggest a novel networking process scheduling scheme, which decides the best processor affinity of networking processes based on the processor cache layout, communication intensiveness, and processor loads. The experimental results show that the scheduling scheme implemented in the Linux kernel can improve the network bandwidth and the effectiveness of processor utilization by 20% and 59%, respectively.

The Study of Distributed Processing for Graphics Rendering Engine Based on ARINC 653 Multi-Core System (ARINC 653 멀티코어 기반 그래픽스 렌더링 엔진 분산처리방안 연구)

  • Jung, Mukyoung
    • Journal of Aerospace System Engineering
    • /
    • v.13 no.5
    • /
    • pp.1-8
    • /
    • 2019
  • Recently, avionics has been migrating from a federated architecture to an integrated modular architecture based on a multi-core to reduce the number of systems, weight, power consumption, and platform redundancy. The volume of data which must bo provided to the pilot through the display device has increased, because an integrated single device performs multiple functions. For this reason, the volume of data processed by the graphic processor within a fixed operation period has increased. In this paper, we provide a multi-core-based rendering engine in to perform more graphics processing within a fixed operation period. We assume the proposed method uses a multi-core-based partitioning operating system using the AMP (Asymmetric Multi-Processing) architecture.

The Effect of Mesh Interconnection Network on the Performance of Manycore System. (다중코어 시스템의 메쉬구조 상호연결망이 성능에 미치는 영향)

  • Kim, Han-Yee;Kim, Young-Hwan;Suh, Taeweon
    • Annual Conference of KIPS
    • /
    • 2011.11a
    • /
    • pp.116-119
    • /
    • 2011
  • 다중코어(Many-Core) 시스템은 많은 코어들이 상호연결망을 통해서 연결되어있는 시스템으로, 단일코어나 멀티코어 시스템에 비해 보다 많은 병렬 컴퓨팅 자원을 지원한다. Amdahl 의 법칙에 의하면 병렬화되어 처리하는 부분은 이론적으로 프로세서의 개수에 비례하게 가속화 될 수 있지만, 상호연결망에서의 전송 지연을 비롯한 많은 요인에 의해서 성능의 가속화가 저해된다. 특히 캐시 일관성 규약(Cache Coherence Protocol)을 지원하는 대부분의 다중코어 시스템에서는 병렬화를 함에 있어서 캐시 미스로 인해 발생하는 데이터의 전송 지연이 성능에 많은 영향을 미칠 수 있다. 따라서 효과적인 병렬 프로그램을 위해서는 캐시 구조에 대한 이해를 바탕으로 상호연결망에 대한 연구가 필요하다. 본 논문에서는 메쉬(Mesh) 구조의 64 코어 다중코어 시스템인 TilePro64 를 이용하여 상호연결망의 데이터 전송 지연에 따른 프로그램 성능의 민감도를 측정하였다. 결과적으로 코어간 거리(Hop)가 늘어날수록 작업의 수행시간이 평균적으로 4.27%씩 선형적으로 증가하는 관계가 있는 것으로 나타났다.

A Method of Selecting Candidate Core for Shared-Based Tree Multicast Routing Protocol (공유기반 트리 멀티캐스트 라우팅 프로토콜을 위한 후보 코어 선택 방법)

  • Hwang Soon-Hwan;Youn Sung-Dae
    • Journal of Korea Multimedia Society
    • /
    • v.7 no.10
    • /
    • pp.1436-1442
    • /
    • 2004
  • A shared-based tree established by the Core Based Tree multicast routing protocol (CBT), the Protocol Independent Multicast Sparse-Mode(PIM-SM), or the Core-Manager based Multicast Routing(CMMR) is rooted at a center node called core or Rendezvous Point(RP). The routes from the core (or RP) to the members of the multicast group are shortest paths. The costs of the trees constructed based on the core and the packet delays are dependent on the location of the core. The location of the core may affect the cost and performance of the shared-based tree. In this paper, we propose three methods for selecting the set of candidate cores. The three proposed methods, namely, k-minimum average cost, k-maximum degree, k-maximum weight are compared with a method which select the candidate cores randomly. Three performance measures, namely, tree cost, mean packet delay, and maximum packet delay are considered. Our simulation results show that the three proposed methods produce lower tree cost, significantly lower mean packet delay and maximum packet delay than the method which selects the candidate cores randomly.

  • PDF

Performance Evaluation of A Molecular Dynamics Code on Multi-core Systems (멀티 코어 시스템에서의 분자 동역학 코드 성능 분석)

  • Cha, Kwangho
    • Annual Conference of KIPS
    • /
    • 2013.05a
    • /
    • pp.111-113
    • /
    • 2013
  • 멀티 코어 시스템의 보급으로 일반 시스템에서도 프로그램의 병렬 실행이 가능해지고 있다. 본 연구에서는 멀티 코어를 사용하는 단일 시스템에서 분자 동역학 코드인 LAMMPS를 대상으로 병렬 수행 성능을 확인하고 분석하여 효과적인 실행 조건을 살펴보았다. LAMMPS의 구조적인 특징과 공간 분할 방식의 사용으로 인하여 단일 시스템에서도 메시지 전달 방식에 의한 병렬 수행이 보다 효율적임을 확인할 수 있었다.

Performance Improvement of a Real-time Traffic Identification System on a Multi-core CPU Environment (멀티 코어 환경에서 실시간 트래픽 분석 시스템 처리속도 향상)

  • Yoon, Sung-Ho;Park, Jun-Sang;Kim, Myung-Sup
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.37 no.5B
    • /
    • pp.348-356
    • /
    • 2012
  • The application traffic analysis is getting more and more challenging due to the huge amount of traffic from high-speed network link and variety of applications running on wired and wireless Internet devices. Multi-level combination of various analysis methods is desired to achieve high completeness and accuracy of analysis results for a real-time analysis system, while requires much of processing burden on the contrary. This paper proposes a novel architecture for a real-time traffic analysis system which improves the processing performance on multi-core CPU environment. The main contribution of the proposed architecture is an efficient parallel processing mechanism with multiple threads of various analysis methods. The feasibility of the proposed architecture was proved by implementing and deploying it on our campus network.

Optimal Design Space Exploration of Multi-core Architecture for Real-time Lane Detection Algorithm (실시간 차선인식 알고리즘을 위한 최적의 멀티코어 아키텍처 디자인 공간 탐색)

  • Jeong, Inkyu;Kim, Jongmyon
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.7 no.3
    • /
    • pp.339-349
    • /
    • 2017
  • This paper proposes a four-stage algorithm for detecting lanes on a driving car. In the first stage, it extracts region of interests in an image. In the second stage, it employs a median filter to remove noise. In the third stage, a binary algorithm is used to classify two classes of backgrond and foreground of an input image. Finally, an image erosion algorithm is utilized to obtain clear lanes by removing noises and edges remained after the binary process. However, the proposed lane detection algorithm requires high computational time. To address this issue, this paper presents a parallel implementation of a real-time line detection algorithm on a multi-core architecture. In addition, we implement and simulate 8 different processing element (PE) architectures to select an optimal PE architecture for the target application. Experimental results indicate that 40×40 PE architecture show the best performance, energy efficiency and area efficiency.

A Performance Improvement of Linux TCP/IP Stack based on Flow-Level Parallelism in a Multi-Core System (멀티코어 시스템에서 흐름 수준 병렬처리에 기반한 리눅스 TCP/IP 스택의 성능 개선)

  • Kwon, Hui-Ung;Jung, Hyung-Jin;Kwak, Hu-Keun;Kim, Young-Jong;Chung, Kyu-Sik
    • The KIPS Transactions:PartA
    • /
    • v.16A no.2
    • /
    • pp.113-124
    • /
    • 2009
  • With increasing multicore system, much effort has been put on the performance improvement of its application. Because multicore system has multiple processing devices in one system, its processing power increases compared to the single core system. However in many cases the advantages of multicore can not be exploited fully because the existing software and hardware were designed to be suitable for single core. When the existing software runs on multicore, its performance improvement is limited by the bottleneck of sharing resources and the inefficient use of cache memory on multicore. Therefore, according as the number of core increases, it doesn't show performance improvement and shows performance drop in the worst case. In this paper we propose a method of performance improvement of multicore system by applying Flow-Level Parallelism to the existing TCP/IP network application and operating system. The proposed method sets up the execution environment so that each core unit operates independently as much as possible in network application, TCP/IP stack on operating system, device driver, and network interface. Moreover it distributes network traffics to each core unit through L2 switch. The proposed method allows to minimize the sharing of application data, data structure, socket, device driver, and network interface between each core. Also it allows to minimize the competition among cores to take resources and increase the hit ratio of cache. We implemented the proposed methods with 8 core system and performed experiment. Experimental results show that network access speed and bandwidth increase linearly according to the number of core.

Design Technique and Application for Distributed Recovery Block Using the Partitioning Operating System Based on Multi-Core System (멀티코어 기반 파티셔닝 운영체제를 이용한 분산 복구 블록 설계 기법 및 응용)

  • Park, Hansol
    • Journal of IKEEE
    • /
    • v.19 no.3
    • /
    • pp.357-365
    • /
    • 2015
  • Recently, embedded systems such as aircraft and automobilie, are developed as modular architecture instead of federated architecture because of SWaP(Size, Weight and Power) issues. In addition, partition operating system that support multiple logical node based on partition concept were recently appeared. Distributed recovery block is fault tolerance design scheme that applicable to mission critical real-time system to support real-time take over via real-time synchronization between participated nodes. Because of real-time synchronization, single-core based computer is not suitable for partition based distributed recovery block design scheme. Multi-core and AMP(Asymmetric Multi-Processing) based partition architecture is required to apply distributed recovery block design scheme. In this paper, we proposed design scheme of distributed recovery block on the multi-core based supervised-AMP architecture partition operating system. This paper implements flight control simulator for avionics to check feasibility of our design scheme.

Design Considerations on Large-scale Parallel Finite Element Code in Shared Memory Architecture with Multi-Core CPU (멀티코어 CPU를 갖는 공유 메모리 구조의 대규모 병렬 유한요소 코드에 대한 설계 고려 사항)

  • Cho, Jeong-Rae;Cho, Keunhee
    • Journal of the Computational Structural Engineering Institute of Korea
    • /
    • v.30 no.2
    • /
    • pp.127-135
    • /
    • 2017
  • The computing environment has changed rapidly to enable large-scale finite element models to be analyzed at the PC or workstation level, such as multi-core CPU, optimal math kernel library implementing BLAS and LAPACK, and popularization of direct sparse solvers. In this paper, the design considerations on a parallel finite element code for shared memory based multi-core CPU system are proposed; (1) the use of optimized numerical libraries, (2) the use of latest direct sparse solvers, (3) parallelism using OpenMP for computing element stiffness matrices, and (4) assembly techniques using triplets, which is a type of sparse matrix storage. In addition, the parallelization effect is examined on the time-consuming works through a large scale finite element model.