• 제목/요약/키워드: Network Processor[1]

검색결과 145건 처리시간 0.024초

네트워크 프로세서를 이용한 기가비트 패킷 헤데 수집기 (A Gigabit Rate Packet Header Collector using Network Processor)

  • 최판안;최경희;정기현;심재홍
    • 정보처리학회논문지C
    • /
    • 제12C권1호
    • /
    • pp.11-18
    • /
    • 2005
  • 본 논문에서는 기가비트 트래픽에서도 높은 패킷 헤더 수집률(packet header collection ratio)을 보이는 멀티프로세서(multi-processor), 멀티쓰레드(multi-thread)를 채용한 네트워크 프로세서 기반의 패킷 헤더 수집기를 제안한다. 제안 패킷 수집기는 기가비트 트래픽 패킷 헤더를 분리하여 여러 대의 100Mbps MAC 포트로 분산하여 전달할 수 있는 구조를 가지고 있다. 제안된 구조는 고속 트래픽 처리를 위해 독창적인 버퍼관리 기법과 프로세서간 부하 분산 기법을 사용하고 있으며, 풍부한 실험을 퐁해 그 성능을 검증하였다.

40-TFLOPS artificial intelligence processor with function-safe programmable many-cores for ISO26262 ASIL-D

  • Han, Jinho;Choi, Minseok;Kwon, Youngsu
    • ETRI Journal
    • /
    • 제42권4호
    • /
    • pp.468-479
    • /
    • 2020
  • The proposed AI processor architecture has high throughput for accelerating the neural network and reduces the external memory bandwidth required for processing the neural network. For achieving high throughput, the proposed super thread core (STC) includes 128 × 128 nano cores operating at the clock frequency of 1.2 GHz. The function-safe architecture is proposed for a fault-tolerance system such as an electronics system for autonomous cars. The general-purpose processor (GPP) core is integrated with STC for controlling the STC and processing the AI algorithm. It has a self-recovering cache and dynamic lockstep function. The function-safe design has proved the fault performance has ASIL D of ISO26262 standard fault tolerance levels. Therefore, the entire AI processor is fabricated via the 28-nm CMOS process as a prototype chip. Its peak computing performance is 40 TFLOPS at 1.2 GHz with the supply voltage of 1.1 V. The measured energy efficiency is 1.3 TOPS/W. A GPP for control with a function-safe design can have ISO26262 ASIL-D with the single-point fault-tolerance rate of 99.64%.

추계적 모형을 이용한 모니터링 과정의 성능 분석 (Performance Analysis of Monitoring Process using the Stochastic Model)

  • 김제숭
    • 산업경영시스템학회지
    • /
    • 제17권32호
    • /
    • pp.145-154
    • /
    • 1994
  • In this paper, monitoring processor in a circuit switched network is considered. Monitoring processor monitors communication links, and offers a grade of service in each link to controller. Such an information is useful for an effective maintenance of system. Two links with nonsymmetric system Parameters are considered. each link is assumed independent M/M/1/1 type. The Markov process is introduced to compute busy and idle portions of monitoring processor and monitored rate of each link. Inter-idle times and inter-monitoring times of monitoring processor between two links are respectively computed. A recursive formula is introduced to make computational procedure rigorous.

  • PDF

계층구조 Computer Network에서 공정제어를 위한 JOB Scheduling (JOB Scheduling for process Control in Hierarchical Computer Network)

  • 박일
    • 한국통신학회논문지
    • /
    • 제5권1호
    • /
    • pp.83-87
    • /
    • 1980
  • 階層構造 COMPUTER Network을 通한 工程制御로 Processing을 分數하여 Fault tolerance를 極大化시키며 複雜하고 多樣한 變數의 相互關係를 週期的으로 監視制御하는 分散制御 Processor JOB은 그 週期와 實行時間으로 定義할 수 있다. 모든 JOB에 대하여 Tree structure인 關係를 가진 subset들로 구성하여 이에 JOB Scheduling Algorithm을 求하여 본 결과 FCFS(Fist Come/First Service)인 Schedule 보다 Processor의 利用에 있어 Loose Time을 減少시키고 處理 可能時間 確保에 有利하였다.

  • PDF

AB9: A neural processor for inference acceleration

  • Cho, Yong Cheol Peter;Chung, Jaehoon;Yang, Jeongmin;Lyuh, Chun-Gi;Kim, HyunMi;Kim, Chan;Ham, Je-seok;Choi, Minseok;Shin, Kyoungseon;Han, Jinho;Kwon, Youngsu
    • ETRI Journal
    • /
    • 제42권4호
    • /
    • pp.491-504
    • /
    • 2020
  • We present AB9, a neural processor for inference acceleration. AB9 consists of a systolic tensor core (STC) neural network accelerator designed to accelerate artificial intelligence applications by exploiting the data reuse and parallelism characteristics inherent in neural networks while providing fast access to large on-chip memory. Complementing the hardware is an intuitive and user-friendly development environment that includes a simulator and an implementation flow that provides a high degree of programmability with a short development time. Along with a 40-TFLOP STC that includes 32k arithmetic units and over 36 MB of on-chip SRAM, our baseline implementation of AB9 consists of a 1-GHz quad-core setup with other various industry-standard peripheral intellectual properties. The acceleration performance and power efficiency were evaluated using YOLOv2, and the results show that AB9 has superior performance and power efficiency to that of a general-purpose graphics processing unit implementation. AB9 has been taped out in the TSMC 28-nm process with a chip size of 17 × 23 ㎟. Delivery is expected later this year.

Performance Evaluation of Real-time Linux for an Industrial Real-time Platform

  • Jo, Yong Hwan;Choi, Byoung Wook
    • International journal of advanced smart convergence
    • /
    • 제11권1호
    • /
    • pp.28-35
    • /
    • 2022
  • This paper presents a performance evaluation of real-time Linux for industrial real-time platforms. On industrial platforms, multicore processors are popular due to their work distribution efficiency and cost-effectiveness. Multicore processors, however, are not designed for applications with real-time constraints, and their performance capabilities depend on their core configurations. In order to assess the feasibility of a multicore processor for real-time applications, we conduct a performance evaluation of a general processor and a low-power processor to provide an experimental environment of real-time Linux on both Xenomai and RT-preempt considering the multicore configuration. The real-time performance is evaluated through scheduling latency and in an environment with loads on the CPU, memory, and network to consider an actual situation. The results show a difference between a low-power and a general-purpose processor, but from developer's point of view, it shows that the low-power processor is a proper solution to accommodate low power situations.

부분행렬을 사용한 행렬.벡터 연산용 1차원 시스톨릭 어레이 프로세서 설계에 관한 연구 (A Study On Improving the Performance of One Dimensional Systolic Array Processor for Matrix.Vector Operation using Sub-Matrix)

  • 김용성
    • 정보학연구
    • /
    • 제10권3호
    • /
    • pp.33-45
    • /
    • 2007
  • Systolic Array Processor is used for designing the special purpose processor in Digital Signal Processing, Computer Graphics, Neural Network Applications etc., since it has the characteristic of parallelism, pipeline processing and architecture of regularity. But, in case of using general design method, it has intial waiting period as large as No. of PE-1. And if the connected system needs parallel and simultaneous outputs, processor has some problems of the performance, since it generates only one output at each clock in output state. So in this paper, one dimensional Systolic Array Processor that is designed according to the dependance of data and operations using the partitioned sub-matrix is proposed for the purpose of improving the performance. 1-D Systolic Array using 4 partitioned sub-matrix has efficient method in case of considering those two problems.

  • PDF

UltraSPARC(64bit-RISC processor)을 위한 고성능 컴퓨터 리눅스 클러스터링 (HPC(High Performance Computer) Linux Clustering for UltraSPARC(64bit-RISC processor))

  • 김기영;조영록;장종권
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 2003년도 컴퓨터소사이어티 추계학술대회논문집
    • /
    • pp.45-48
    • /
    • 2003
  • We can easily buy network system for high performance micro-processor, progress computer architecture is caused of high bandwidth and low delay time. Coupling PC-based commodity technology with distributed computing methodologies provides an important advance in the development of single-user dedicated systems. Lately Network is joined PC or workstation by computers of high performance and low cost. Than it make intensive that Cluster system is resembled supercomputer. Unix, Linux, BSD, NT(Windows series) can use Cluster system OS(operating system). I'm chosen linux gain low cost, high performance and open technical documentation. This paper is benchmark performance of Beowulf clustering by UltraSPARC-1K(64bit-RISC processor). Benchmark tools use MPI(Message Passing Interface) and NetPIPE. Beowulf is a class of experimental parallel workstations developed to evaluate and characterize the design space of this new operating point in price-performance.

  • PDF

병렬 알고리즘의 가속화를 위한 GP-GPU의 Thread할당 기법 (Thread Distribution Method of GP-GPU for Accelerating Parallel Algorithms)

  • 이관호;김치용
    • 전기전자학회논문지
    • /
    • 제21권1호
    • /
    • pp.92-95
    • /
    • 2017
  • 본 논문에서는 적은 면적의 GP-GPU에서 성능을 향상시키기 위한 방법을 제안한다. 본 논문에서는 superscalar와 같이 과도하게 스케줄링 복잡성을 증가시키지 않는 대신 단순한 코어의 수를 늘려 성능을 극대화 시키는 방법을 제안한다. GP-GPU를 구성하는 Stream Processor의 구조를 단순화한다. 또한, Warp Schedule에서 thread 할당을 어플리케이션에 적합한 방법을 개발하여 성능을 개선한다. 성능을 검증하는 방안으로 neural network의 한 분야인 딥러닝에 대한 스레드 할당방식을 제안한다. Neural Network 알고리즘의 경우 Intel CPU 대비 90%에서 ARM Cortex-A15 4 core 대비 98% 성능 향상을 확인할 수 있었다.

1 kW급 가정용 연료개질기 성능 최적화 (Performance optimization of 1 kW class residential fuel processor)

  • 정운호;구기영;윤왕래
    • 한국신재생에너지학회:학술대회논문집
    • /
    • 한국신재생에너지학회 2009년도 춘계학술대회 논문집
    • /
    • pp.731-734
    • /
    • 2009
  • KIER has been developed a compact and highly efficient fuel processor which is one of the key component of the residential PEM fuel cells system. The fuel processor uses methane steam reforming to convert natural gas to a mixture of water, hydrogen, carbon dioxide, carbon monoxide and unreacted methane. Then carbon monoxide is converted to carbon dioxide in water-gas-shift reactor and preferential oxidation reactor. A start-up time of the fuel processor is about 1h and CO concentration among the final product is maintained less than 5 vol. ppm. To achieve high thermal efficiency of 80% on a LHV basis, an optimal thermal network was designed. Internal heat exchange of the fuel processor is so efficient that the temperature of the reformed gas and the flue gas at the exit of the fuel processor remains less than $100^{\circ}C$. A compact design considering a mixing and distribution of the feed was applied to reduce the reactor volume. The current volume of the fuel processor is 17L with insulation.

  • PDF