• Title/Summary/Keyword: Network Processor[1]


A Gigabit Rate Packet Header Collector using Network Processor (네트워크 프로세서를 이용한 기가비트 패킷 헤데 수집기)

  • Choi Pan-an;Choi Kyung-hee;Jung Gi-hyun;Sim Jae-hong
    • The KIPS Transactions:PartC / v.12C no.1 s.97 / pp.11-18 / 2005
  • This paper proposes a packet header collector, based on a multi-processor, multi-threaded network processor, that achieves high throughput on a gigabit network. The proposed collector separates packets arriving from the gigabit network into headers and payloads and distributes them to multiple 100 Mbit MAC ports. The architecture, which employs a unique buffer management method and a load distribution strategy among multiple processors, is evaluated empirically in depth.
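
The header/payload separation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `split_frame` and `distribute`, the Ethernet/IPv4/TCP/UDP parsing, and the round-robin port assignment are all assumptions made for illustration, and the paper's buffer management and load distribution methods are not reproduced.

```python
import struct

ETH_HEADER_LEN = 14  # destination MAC (6) + source MAC (6) + EtherType (2)


def split_frame(frame):
    """Split an Ethernet frame into (headers, payload).

    Headers cover Ethernet, plus IPv4 and TCP/UDP when they are recognised.
    Hypothetical helper, not taken from the paper.
    """
    if len(frame) < ETH_HEADER_LEN:
        return frame, b""
    ethertype = struct.unpack("!H", frame[12:14])[0]
    cut = ETH_HEADER_LEN
    if ethertype == 0x0800 and len(frame) >= cut + 20:  # IPv4
        ihl = (frame[cut] & 0x0F) * 4   # IPv4 header length in bytes
        proto = frame[cut + 9]          # IPv4 protocol field
        cut += ihl
        if proto == 6 and len(frame) >= cut + 20:       # TCP
            cut += (frame[cut + 12] >> 4) * 4           # TCP data offset
        elif proto == 17 and len(frame) >= cut + 8:     # UDP
            cut += 8
    cut = min(cut, len(frame))
    return frame[:cut], frame[cut:]


def distribute(frames, n_ports):
    """Assign collected headers to output ports round-robin (assumed policy)."""
    ports = [[] for _ in range(n_ports)]
    for i, frame in enumerate(frames):
        header, _payload = split_frame(frame)
        ports[i % n_ports].append(header)
    return ports
```

Round-robin is only the simplest possible policy; a real collector would balance by measured load.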

40-TFLOPS artificial intelligence processor with function-safe programmable many-cores for ISO26262 ASIL-D

  • Han, Jinho;Choi, Minseok;Kwon, Youngsu
    • ETRI Journal / v.42 no.4 / pp.468-479 / 2020
  • The proposed AI processor architecture delivers high throughput for accelerating neural networks and reduces the external memory bandwidth required to process them. To achieve high throughput, the proposed super thread core (STC) includes 128 × 128 nano cores operating at a clock frequency of 1.2 GHz. A function-safe architecture is proposed for fault-tolerant systems such as the electronic systems of autonomous cars. A general-purpose processor (GPP) core is integrated with the STC to control the STC and process the AI algorithm. It has a self-recovering cache and a dynamic lockstep function. The function-safe design is shown to meet ASIL D, one of the fault-tolerance levels of the ISO26262 standard. The entire AI processor is fabricated in a 28-nm CMOS process as a prototype chip. Its peak computing performance is 40 TFLOPS at 1.2 GHz with a supply voltage of 1.1 V. The measured energy efficiency is 1.3 TOPS/W. A GPP for control with a function-safe design can achieve ISO26262 ASIL-D with a single-point fault-tolerance rate of 99.64%.
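
The quoted peak figure can be sanity-checked from the core count and clock frequency given in the abstract. The sketch below assumes that each nano core completes one multiply-accumulate (counted as 2 floating-point operations) per cycle; the abstract does not state this, so treat that factor as an assumption.

```python
NANO_CORES = 128 * 128          # from the abstract
CLOCK_HZ = 1.2e9                # from the abstract
FLOPS_PER_CORE_PER_CYCLE = 2    # assumption: one multiply-accumulate = 2 FLOPs


def peak_tflops(cores, clock_hz, flops_per_cycle):
    """Peak throughput in TFLOPS for an array of identical cores."""
    return cores * clock_hz * flops_per_cycle / 1e12


peak = peak_tflops(NANO_CORES, CLOCK_HZ, FLOPS_PER_CORE_PER_CYCLE)
print(f"{peak:.1f} TFLOPS")  # prints 39.3 TFLOPS
```

Under that assumption the result rounds to the 40 TFLOPS reported in the abstract.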

Performance Analysis of Monitoring Process using the Stochastic Model (추계적 모형을 이용한 모니터링 과정의 성능 분석)

  • 김제숭
    • Journal of Korean Society of Industrial and Systems Engineering / v.17 no.32 / pp.145-154 / 1994
  • In this paper, a monitoring processor in a circuit-switched network is considered. The monitoring processor monitors communication links and reports the grade of service of each link to the controller. Such information is useful for effective maintenance of the system. Two links with nonsymmetric system parameters are considered. Each link is assumed to be an independent M/M/1/1 queue. A Markov process is introduced to compute the busy and idle portions of the monitoring processor and the monitored rate of each link. The inter-idle times and inter-monitoring times of the monitoring processor between the two links are computed respectively. A recursive formula is introduced to make the computational procedure rigorous.
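
As background for the link model named in this abstract, an M/M/1/1 link (one server, no waiting room) is a two-state Markov chain whose steady-state busy probability is λ/(λ+μ). The sketch below computes this for two independent links with different parameters. The numeric rates are illustrative assumptions, and the paper's model of the monitoring processor itself is not reproduced.

```python
def mm11_busy_probability(arrival_rate, service_rate):
    """Steady-state probability that an M/M/1/1 link is busy.

    For the two-state chain (idle <-> busy) this is lambda / (lambda + mu),
    which is also the fraction of arrivals that are blocked.
    """
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("need arrival_rate >= 0 and service_rate > 0")
    return arrival_rate / (arrival_rate + service_rate)


# Illustrative (assumed) parameters for two nonsymmetric links.
p1 = mm11_busy_probability(arrival_rate=2.0, service_rate=3.0)
p2 = mm11_busy_probability(arrival_rate=1.0, service_rate=4.0)

# Because the links are independent, joint probabilities are products.
both_idle = (1 - p1) * (1 - p2)
print(f"link 1 busy: {p1:.2f}, link 2 busy: {p2:.2f}, both idle: {both_idle:.2f}")
```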


JOB Scheduling for process Control in Hierarchical Computer Network (계층구조 Computer Network에서 공정제어를 위한 JOB Scheduling)

  • Park, Yil
    • The Journal of Korean Institute of Communications and Information Sciences / v.5 no.1 / pp.83-87 / 1980
  • A distributed processing job in a hierarchical computer network, which periodically supervises and controls the complex relations between variables in order to raise fault tolerance, can be characterized by its period and its execution time. All jobs may be composed of subsets related in a tree structure. For the job set of a processor, this paper finds a job scheduling algorithm that has less loose time between periods than FCFS.


AB9: A neural processor for inference acceleration

  • Cho, Yong Cheol Peter;Chung, Jaehoon;Yang, Jeongmin;Lyuh, Chun-Gi;Kim, HyunMi;Kim, Chan;Ham, Je-seok;Choi, Minseok;Shin, Kyoungseon;Han, Jinho;Kwon, Youngsu
    • ETRI Journal / v.42 no.4 / pp.491-504 / 2020
  • We present AB9, a neural processor for inference acceleration. AB9 consists of a systolic tensor core (STC) neural network accelerator designed to accelerate artificial intelligence applications by exploiting the data reuse and parallelism characteristics inherent in neural networks while providing fast access to large on-chip memory. Complementing the hardware is an intuitive and user-friendly development environment that includes a simulator and an implementation flow that provides a high degree of programmability with a short development time. Along with a 40-TFLOP STC that includes 32k arithmetic units and over 36 MB of on-chip SRAM, our baseline implementation of AB9 consists of a 1-GHz quad-core setup with other various industry-standard peripheral intellectual properties. The acceleration performance and power efficiency were evaluated using YOLOv2, and the results show that AB9 has superior performance and power efficiency to that of a general-purpose graphics processing unit implementation. AB9 has been taped out in the TSMC 28-nm process with a chip size of 17 × 23 mm². Delivery is expected later this year.

Performance Evaluation of Real-time Linux for an Industrial Real-time Platform

  • Jo, Yong Hwan;Choi, Byoung Wook
    • International journal of advanced smart convergence / v.11 no.1 / pp.28-35 / 2022
  • This paper presents a performance evaluation of real-time Linux for industrial real-time platforms. On industrial platforms, multicore processors are popular due to their work distribution efficiency and cost-effectiveness. Multicore processors, however, are not designed for applications with real-time constraints, and their performance capabilities depend on their core configurations. In order to assess the feasibility of a multicore processor for real-time applications, we conduct a performance evaluation of a general-purpose processor and a low-power processor, providing an experimental environment of real-time Linux on both Xenomai and RT-preempt and considering the multicore configuration. The real-time performance is evaluated through scheduling latency, both unloaded and in an environment with loads on the CPU, memory, and network to reflect an actual situation. The results show a difference between the low-power and the general-purpose processor, but from a developer's point of view, they show that the low-power processor is a suitable solution for low-power situations.
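
Scheduling latency, the metric used in this abstract, is usually taken as the gap between the time a periodic task was due to wake and the time it actually ran. The sketch below measures that gap in the spirit of tools such as cyclictest. It is an ordinary user-space Python loop, not a Xenomai or RT-preempt test and not the authors' benchmark. The function name, period, and iteration count are assumptions.

```python
import time


def measure_wakeup_latency(period_s=0.001, iterations=200):
    """Return per-cycle wake-up latencies in microseconds.

    Each cycle sleeps until an absolute deadline and records how late
    the loop actually resumed relative to that deadline.
    """
    latencies_us = []
    next_wake = time.perf_counter() + period_s
    for _ in range(iterations):
        delay = next_wake - time.perf_counter()
        if delay > 0:
            time.sleep(delay)
        now = time.perf_counter()
        latencies_us.append((now - next_wake) * 1e6)
        next_wake += period_s  # absolute deadlines avoid accumulating drift
    return latencies_us


if __name__ == "__main__":
    samples = measure_wakeup_latency()
    mean = sum(samples) / len(samples)
    print(f"min {min(samples):.1f} us, mean {mean:.1f} us, max {max(samples):.1f} us")
```

Running the loop once on an idle machine and once under CPU, memory, and network load gives a rough feel for the kind of comparison the abstract describes. The maximum is usually the figure of interest for real-time work.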

A Study On Improving the Performance of One Dimensional Systolic Array Processor for Matrix.Vector Operation using Sub-Matrix (부분행렬을 사용한 행렬.벡터 연산용 1차원 시스톨릭 어레이 프로세서 설계에 관한 연구)

  • Kim, Yong-Sung
    • The Journal of Information Technology / v.10 no.3 / pp.33-45 / 2007
  • Systolic array processors are used to design special-purpose processors for digital signal processing, computer graphics, neural network applications, etc., because they offer parallelism, pipelined processing, and a regular architecture. However, with the general design method, the array has an initial waiting period as long as (number of PEs − 1). In addition, if the connected system needs parallel, simultaneous outputs, performance suffers because the array generates only one output per clock in the output state. This paper therefore proposes a one-dimensional systolic array processor, designed according to the dependence of data and operations using partitioned sub-matrices, in order to improve performance. A 1-D systolic array using four partitioned sub-matrices handles both of these problems efficiently.
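
The initial waiting period mentioned in this abstract shows up in a small cycle-level simulation of a conventional 1-D systolic array computing y = A·x. The sketch assumes a textbook design in which each processing element (PE) keeps one output accumulator and the vector elements shift one PE per clock. It does not implement the partitioned sub-matrix scheme proposed in the paper, and `systolic_matvec` is a hypothetical name.

```python
def systolic_matvec(A, x):
    """Simulate a linear systolic array computing y = A @ x.

    PE i accumulates row i of the result. Vector elements enter at PE 0
    and move one PE to the right each cycle, so PE i sees x[j] at cycle i + j.
    Returns (y, cycles).
    """
    n_pe = len(A)
    n_cols = len(x)
    acc = [0] * n_pe
    pipeline = [None] * n_pe            # x value currently held by each PE
    cycles = n_pe + n_cols - 1
    for t in range(cycles):
        # Shift right by one PE and inject the next vector element at PE 0.
        incoming = x[t] if t < n_cols else None
        pipeline = [incoming] + pipeline[:-1]
        for i, xv in enumerate(pipeline):
            if xv is not None:
                j = t - i               # index of the x element now at PE i
                acc[i] += A[i][j] * xv
    return acc, cycles


y, cycles = systolic_matvec([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [1, 0, -1])
print(y, cycles)  # prints [-2, -2, -2] 5
```

In this design the last PE does no useful work until cycle (number of PEs − 1), which is the waiting period the abstract refers to.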


HPC(High Performance Computer) Linux Clustering for UltraSPARC(64bit-RISC processor) (UltraSPARC(64bit-RISC processor)을 위한 고성능 컴퓨터 리눅스 클러스터링)

  • 김기영;조영록;장종권
    • Proceedings of the IEEK Conference / 2003.11b / pp.45-48 / 2003
  • High-performance microprocessors and network systems are now easy to buy, and advances in computer architecture have brought high bandwidth and low latency. Coupling PC-based commodity technology with distributed computing methodologies provides an important advance in the development of single-user dedicated systems. Recently, high-performance, low-cost PCs and workstations have been joined together by networks, which has intensified interest in cluster systems that resemble supercomputers. Unix, Linux, BSD, and NT (Windows series) can be used as the cluster operating system (OS). We chose Linux for its low cost, high performance, and open technical documentation. This paper benchmarks the performance of Beowulf clustering on UltraSPARC-1K (64-bit RISC processor). The benchmark tools are MPI (Message Passing Interface) and NetPIPE. Beowulf is a class of experimental parallel workstations developed to evaluate and characterize the design space of this new operating point in price-performance.


Thread Distribution Method of GP-GPU for Accelerating Parallel Algorithms (병렬 알고리즘의 가속화를 위한 GP-GPU의 Thread할당 기법)

  • Lee, Kwan-Ho;Kim, Chi-Yong
    • Journal of IKEEE / v.21 no.1 / pp.92-95 / 2017
  • In this paper, we propose a way to improve the functional performance of a small-scale GP-GPU. Instead of using a superscalar design, which increases scheduling complexity, we suggest applying simple cores to maximize GP-GPU performance. Our study also demonstrates that a simplified stream processor is one way to achieve functional improvement in a GP-GPU. In addition, we found that developing an optimal thread-assignment method in the warp scheduler for a specific application improves the functional performance of the GP-GPU. To examine GP-GPU functional performance, we propose a thread-assignment method matched to a deep-learning system, a part of neural networks. As a result, we found that the functional index for the neural network algorithm increased to 90% and 98% compared with an Intel CPU and a quad-core ARM Cortex-A15, respectively.

Performance optimization of 1 kW class residential fuel processor (1 kW급 가정용 연료개질기 성능 최적화)

  • Jung, Un-Ho;Koo, Kee-Young;Yoon, Wang-Lai
    • 한국신재생에너지학회:학술대회논문집 / 2009.06a / pp.731-734 / 2009
  • KIER has developed a compact and highly efficient fuel processor, which is one of the key components of a residential PEM fuel cell system. The fuel processor uses methane steam reforming to convert natural gas to a mixture of water, hydrogen, carbon dioxide, carbon monoxide, and unreacted methane. Carbon monoxide is then converted to carbon dioxide in a water-gas-shift reactor and a preferential oxidation reactor. The start-up time of the fuel processor is about 1 h, and the CO concentration in the final product is kept below 5 vol. ppm. To achieve a high thermal efficiency of 80% on an LHV basis, an optimal thermal network was designed. Internal heat exchange in the fuel processor is so efficient that the temperature of the reformed gas and the flue gas at the exit of the fuel processor remains below 100 °C. A compact design that considers the mixing and distribution of the feed was applied to reduce the reactor volume. The current volume of the fuel processor is 17 L with insulation.
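
For reference, the three conversion steps named in this abstract correspond to the following standard overall reactions. These are textbook stoichiometries, not equations taken from the paper, and the actual reactor chemistry includes further side reactions.

```latex
\begin{align*}
\mathrm{CH_4 + H_2O} &\rightleftharpoons \mathrm{CO + 3\,H_2}
  && \text{(methane steam reforming, endothermic)} \\
\mathrm{CO + H_2O} &\rightleftharpoons \mathrm{CO_2 + H_2}
  && \text{(water-gas shift, mildly exothermic)} \\
\mathrm{CO + \tfrac{1}{2}\,O_2} &\rightarrow \mathrm{CO_2}
  && \text{(preferential oxidation of CO, exothermic)}
\end{align*}
```

Because the reforming step absorbs heat while the two clean-up steps release it, recovering heat between the reactors is what makes an internal thermal network like the one described in the abstract worthwhile.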
