• Title/Summary/Keyword: Hardware Performance Counter

Search Result 30, Processing Time 0.035 seconds

On the Conceptual Design of the SIMD Vector Machine Attachable to SISD Machine (SISD 머신에 부착 가능한 SIMD 벡터 머신의 개념적 설계)

  • Cho Young-Il;Ko Young-Woong
    • The KIPS Transactions:PartA
    • /
    • v.12A no.3 s.93
    • /
    • pp.263-272
    • /
    • 2005
  • The addressing mode for data is performed by the software in yon Neumann-concept(SISD) computer a priori without hardware design of an address counter for operands. Therefore, in the addressing mode for the vector the corresponding variables as much as the number of the elements should be specified and used also in the software method. This is because not for operand but only for an instructions, quasi PC(program counter) is designed in hardware physically. A vector has a characteristic of a structural dimension. In this paper we propose to design a hardware unit physically external to the CPU for addressing only the elements of a vector unit with the structure and dimension. Because of the high speed performance for a vector processing it should be designed in the SIMD pipeline mechanics. The proposed mechanics is evaluated through a simulation. Our result shows $12\%$ to $30\%$ performance enhancement over CRAY architecture under the same hardware consideration(processing unit).

Digital Correlator Design for GPS/GLONASS Receiver (GPS/GLONASS 수신기용 디지털 상관기 설계)

  • 조득재;최일홍;박찬식;이상정
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2000.10a
    • /
    • pp.275-275
    • /
    • 2000
  • This paper designs a digital correlator for the integrated GPS/GLONASS receiver consisting of DCO, carrier cycle counter, code generator, code phase counter, mixer, epoch counter, accumulator. It is designed using Verilog-HDL(Verilog-Hardware Description Language) and synthesized using EDA(Electronic Design Automation) tools. The performance of the designed digital correlator is verified by the functional simulation and real satellite tracking experiments.

  • PDF

Performance Improvement of Smart Counter for Uneven Small Grain (지능형 미소비균일체 계수기의 성능개선)

  • Cho, Si-Hyeong;Park, Chan-Won
    • Journal of Industrial Technology
    • /
    • v.29 no.B
    • /
    • pp.127-131
    • /
    • 2009
  • This paper presents the development of smart counting system that is proper for grains with uneven unit weight or shape. This device can detect the small differences of a light beam and count the pulse from wave shape control, when the grain is going on the light screen, which is made by the light beam screen sensor. It can, different from the former conventional device, distinct the uneven grains for counting detect, by using the dedicated hardware and the software algorithm of the light sensor.

  • PDF

Machine Learning-Based Detection of Cache Side Channel Attack Using Performance Counter Monitor of CPU (Performance Counter Monitor를 이용한 머신 러닝 기반 캐시 부채널 공격 탐지)

  • Hwang, Jongbae;Bae, Daehyeon;Ha, Jaecheol
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.30 no.6
    • /
    • pp.1237-1246
    • /
    • 2020
  • Recently, several cache side channel attacks have been proposed to extract secret information by exploiting design flaws of the microarchitecture. The Flush+Reload attack, one of the cache side channel attack, can be applied to malicious application attacks due to its properties of high resolution and low noise. In this paper, we proposed a detection system, which detects the cache-based attacks using the PCM(Performance Counter Monitor) for monitoring CPU cache activity. Especially, we observed the variation of each counter value of PCM in case of two kinds of attacks, Spectre attack and secret recovering attack during AES encryption. As a result, we found that four hardware counters were sensitive to cache side channel attacks. Our detector based on machine learning including SVM(Support Vector Machine), RF(Random Forest) and MLP(Multi Level Perceptron) can detect the cache side channel attacks with high detection accuracy.

Development of Efficient Dynamic Bandwidth Allocation Algorithm for XGPON

  • Han, Man Soo;Yoo, Hark;Lee, Dong Soo
    • ETRI Journal
    • /
    • v.35 no.1
    • /
    • pp.18-26
    • /
    • 2013
  • This paper proposes an efficient bandwidth utilization (EBU) algorithm that utilizes the unused bandwidth in dynamic bandwidth allocation (DBA) of a 10-gigabit-capable passive optical network (XGPON). In EBU, an available byte counter of a queue can be negative and the unused remainder of an available byte counter can be utilized by the other queues. In addition, EBU uses a novel polling scheme to collect the requests of queues as soon as possible. We show through analysis and simulations that EBU improves performance compared to that achieved with existing methods. In addition, we describe the hardware implementation of EBU. Finally we show the test results of the hardware implementation of EBU.

A study on the efficient method of constrained iterative regular expression pattern matching (제약 반복적인 정규표현식 패턴 매칭의 효율적인 방법에 관한 연구)

  • Seo, Byung-Suk
    • Design & Manufacturing
    • /
    • v.16 no.3
    • /
    • pp.34-38
    • /
    • 2022
  • Regular expression pattern matching is widely used in applications such as computer virus vaccine, NIDS and DNA sequencing analysis. Hardware-based pattern matching is used when high-performance processing is required due to time constraints. ReCPU, SMPU, and REMP, which are processor-based regular expression matching processors, have been proposed to solve the problem of the hardware-based method that requires resynthesis whenever a pattern is updated. However, these processor-based regular expression matching processors inefficiently handle repetitive operations of regular expressions. In this paper, we propose a new instruction set to improve the inefficient repetitive operations of ReCPU and SMPU. We propose REMPi, a regular expression matching processor that enables efficient iterative operations based on the REMP instruction set. REMPi improves the inefficient method of processing a particularly short sub-pattern as a repeat operation OR, and enables processing with a single instruction. In addition, by using a down counter and a counter stack, nested iterative operations are also efficiently processed. REMPi was described with Verilog and synthesized on Intel Stratix IV FPGA.

Hardware Implementation of an Intelligent Controller with a DSP and an FPGA for Nonlinear Systems (DSP와 FPGA를 이용한 지능 제어기의 하드웨어 구현)

  • 김성수
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.10 no.10
    • /
    • pp.922-929
    • /
    • 2004
  • In this paper, we develop control hardware such as an FPGA based general purposed intelligent controller with a DSP board to solve nonlinear system control problems. PID control algorithms are implemented in an FPGA and neural network control algorithms are implemented in a BSP board. An FPGA was programmed with VHDL to achieve high performance and flexibility. The additional hardware such as an encoder counter and a PWM generator can be implemented in a single FPGA device. As a result, the noise and power dissipation problems can be minimized and the cost effectiveness can be achieved. To show the performance of the developed controller, it was tested fur nonlinear systems such as a robot hand and an inverted pendulum.

A Study on Scalability of Profiling Method Based on Hardware Performance Counter for Optimal Execution of Supercomputer (슈퍼컴퓨터 최적 실행 지원을 위한 하드웨어 성능 카운터 기반 프로파일링 기법의 확장성 연구)

  • Choi, Jieun;Park, Guenchul;Rho, Seungwoo;Park, Chan-Yeol
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.9 no.10
    • /
    • pp.221-230
    • /
    • 2020
  • Supercomputer that shares limited resources to multiple users needs a way to optimize the execution of application. For this, it is useful for system administrators to get prior information and hint about the applications to be executed. In most high-performance computing system operations, system administrators strive to increase system productivity by receiving information about execution duration and resource requirements from users when executing tasks. They are also using profiling techniques that generates the necessary information using statistics such as system usage to increase system utilization. In a previous study, we have proposed a scheduling optimization technique by developing a hardware performance counter-based profiling technique that enables characterization of applications without further understanding of the source code. In this paper, we constructed a profiling testbed cluster to support optimal execution of the supercomputer and experimented with the scalability of the profiling method to analyze application characteristics in the built cluster environment. Also, we experimented that the profiling method can be utilized in actual scheduling optimization with scalability even if the application class is reduced or the number of nodes for profiling is minimized. Even though the number of nodes used for profiling was reduced to 1/4, the execution time of the application increased by 1.08% compared to profiling using all nodes, and the scheduling optimization performance improved by up to 37% compared to sequential execution. In addition, profiling by reducing the size of the problem resulted in a quarter of the cost of collecting profiling data and a performance improvement of up to 35%.

AndroScope: An Insightful Performance Analyzer for All Software Layers of the Android-Based Systems

  • Cho, Myeongjin;Lee, Ho Jin;Kim, Minseong;Kim, Seon Wook
    • ETRI Journal
    • /
    • v.35 no.2
    • /
    • pp.259-269
    • /
    • 2013
  • Android has become the most popular platform for mobile devices. However, Android still has critical performance issues, such as "application not responding" errors and hiccups resulting from garbage collection. Many phone vendors have tried to resolve the problems by characterizing and improving the performance. However, there are few insightful performance analysis tools for the Android-based systems. This paper presents AndroScope, which is a performance analysis tool for both the Android platform (Dalvik virtual machine, core libraries, Android libraries, and even Linux kernels) and its applications. To the best of our knowledge, this is the first tool to collect and analyze performance data from all the software layers of the Android-based systems. AndroScope offers a trace mechanism to collect such deep and wide performance data as hardware performance counters, time, and memory usage. In addition, the tool includes TraceBridge, which is a middleware for the fast handling of mass logs. Moreover, AndroScope offers an integrated graphical user interface with the Android software development kit to display a great volume of the detailed performance data.

Implementation of an Intelligent Controller with a DSP and an FPGA for Nonlinear Systems

  • Kim, Sung-Su;Jung, Seul
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2003.10a
    • /
    • pp.575-580
    • /
    • 2003
  • In this paper, we develop a control hardware such as an FPGA based general purpose controller with a DSP board to solve nonlinear control problems. PID control algorithms are implemented in an FPGA and neural network control algorithms are implemented in a DSP board. PID controllers implemented on an FPGA was designed by using VHDL to achieve high performance and flexibility. By using high capacity of an FPGA, the additional hardware such as an encoder counter and a PWM generator, can be implemented in a single FPGA device. As a result, the noise and power dissipation problems can be minimized and the cost effectiveness can be achieved. In order to show the performance of the developed controller, it was tested for controlling nonlinear systems such as an inverted pendulum.

  • PDF