• Title/Summary/Keyword: CPU performance

Search Results: 821

Performance Scalability of SPEC CPU2000 Benchmark over CPU Clock Speed (CPU 주파수 속도에 대한 SPEC CPU2000 성능 변화)

  • Yi, Jong-Su;Kim, Jun-Seong
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.42 no.5
    • /
    • pp.1-8
    • /
    • 2005
  • SPEC CPU2000 is a benchmark suite widely used in both industry and academia for measuring the compute-intensive performance of computer systems with various architectures. However, there has been little effort to investigate its characteristics with respect to hardware components. This paper presents the performance scalability of the SPEC CPU2000 benchmark over CPU clock speed. For an Intel x86-based system running at various clock speeds, we measure the performance of the SPEC CPU2000 benchmark and analyze its characteristics from a system perspective. In the experiments, we found that the overall performance of SPEC CPU2000 increases monotonically and linearly as the CPU clock speed increases, and that the scale efficiencies of the SPEC CPU2000 component benchmarks are quite evenly distributed.
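A worked definition may help in reading the reported scale efficiencies. The abstract does not state the exact formula, so the following is an assumed, conventional form: speedup normalized by the clock-speed ratio,

$$ E_{\mathrm{scale}} \;=\; \frac{S}{f/f_0} \;=\; \frac{T_0/T}{f/f_0}, $$

where $T$ is the benchmark execution time at clock frequency $f$ and the subscript $0$ denotes the baseline configuration. $E_{\mathrm{scale}} = 1$ means performance scales perfectly with clock speed, while $E_{\mathrm{scale}} < 1$ indicates components (e.g. main memory) that do not speed up with the core clock.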

A study on performance evaluation of K4 Firewall System with multiple CPUs and security rules (K4 방화벽의 CPU 및 보안규칙의 증가에 따르는 성능평가연구)

  • Park, Dae-Woo;Jun, Moon-Seog
    • The Journal of Society for e-Business Studies
    • /
    • v.7 no.3
    • /
    • pp.203-218
    • /
    • 2002
  • With the development of networks and the growth of Internet services, improving the performance of the K4 Firewall requires hardware with 2 or 4 CPUs instead of 1 CPU. However, performance tests of the K4 Firewall system with 1, 2, and 4 CPUs show no meaningful gain from the additional CPUs. The performance of the K4 Firewall depends on the daemons configured for packet filtering rules, Network Address Translation (NAT), authentication, and proxy services. With security rules applied, measured performance is lower than without security rules by less than 2% for packet filtering, 8%-11% for NAT, and 18%-20% for proxy and authentication services; NAT and proxy services show the largest performance decrease. These results are useful for further research and development of the K4 Firewall System.


Limiting CPU Frequency Scaling Considering Main Memory Accesses (주메모리 접근을 고려한 CPU 주파수 조정 제한)

  • Park, Moonju
    • KIISE Transactions on Computing Practices
    • /
    • v.20 no.9
    • /
    • pp.483-491
    • /
    • 2014
  • Contemporary computer systems exploit DVFS (Dynamic Voltage/Frequency Scaling) technology to balance performance and power consumption. The efficiency of DVFS depends on how much additional performance is obtained for the extra power consumed at an elevated CPU frequency. Especially for memory-bound applications, a higher CPU frequency often does not result in higher performance. In this paper, we present an upper bound on CPU frequency scaling based on memory accesses. Our experiments show that the performance gain from a higher CPU frequency is limited by the number of memory accesses (last-level cache misses) per instruction. Using these results, we derive the CPU frequency upper bound beyond which there is little performance gain. Experimental results show that, for a memory-bound application, applying the frequency upper bound improves the energy efficiency of the application by more than 30%.
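The existence of such a bound can be motivated with a standard memory-aware execution-time model; the paper's exact formulation is not given in the abstract, so this is only a sketch under that assumption:

$$ T(f) \;\approx\; N_{\mathrm{inst}}\left(\frac{\mathrm{CPI}_{\mathrm{core}}}{f} \;+\; \mathrm{MPI}\cdot t_{\mathrm{mem}}\right), $$

where $N_{\mathrm{inst}}$ is the instruction count, $\mathrm{CPI}_{\mathrm{core}}$ the core cycles per instruction, $\mathrm{MPI}$ the last-level cache misses per instruction, and $t_{\mathrm{mem}}$ the main-memory access time, which does not shrink with the CPU clock. As $f$ grows, $T(f)$ approaches the constant memory term, so raising $f$ past the point where $\mathrm{MPI}\cdot t_{\mathrm{mem}}$ dominates costs extra power with little speedup.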

Cooling Performance of Liquid CPU Cooler using Water/PG-based $Al_2O_3$ Nanofluids (물/PG-기반 $Al_2O_3$ 나노유체를 적용한 수냉식 CPU 쿨러의 냉각성능)

  • Park, Y.J.;Kim, K.H.;Lee, S.H.;Jang, S.P.
    • Journal of ILASS-Korea
    • /
    • v.19 no.1
    • /
    • pp.19-24
    • /
    • 2014
  • In this study, the cooling performance of a liquid CPU cooler using water/propylene glycol (PG)-based $Al_2O_3$ nanofluids is experimentally investigated. The water/PG-based $Al_2O_3$ nanofluids are prepared by a two-step method, applying ultrasonic energy for 10 hours, at volume fractions of 0.25% and 0.35%. The thermal conductivity and viscosity of the nanofluids are measured to theoretically predict the thermal performance of the liquid CPU cooler using a performance factor. The performance factor results indicate that the cooling performance of the liquid CPU cooler can be improved with the manufactured nanofluids. To evaluate the cooling performance experimentally, temperature differences between the ambient air and the heater are measured for the base fluid and the nanofluids, respectively. Based on the results, the performance of the liquid CPU cooler using $Al_2O_3$ nanofluids is improved by up to 8.6% at 0.25 vol.%.
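The abstract does not define the performance factor it uses, so the following is only an assumption: a figure of merit commonly applied to nanofluid coolants compares the relative increases in viscosity and thermal conductivity,

$$ C_\mu = \frac{\mu_{nf}/\mu_{bf} - 1}{\phi}, \qquad C_k = \frac{k_{nf}/k_{bf} - 1}{\phi}, $$

where the subscripts $nf$ and $bf$ denote the nanofluid and the base fluid and $\phi$ is the particle volume fraction; for fully developed laminar flow, a nanofluid is generally considered beneficial as a coolant when $C_\mu/C_k \lesssim 4$. The authors' actual performance factor may differ.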

Analysis on the Performance Impact of Partitioned LLC for Heterogeneous Multicore Processors (이종 멀티코어 프로세서에서 분할된 공유 LLC가 성능에 미치는 영향 분석)

  • Moon, Min Goo;Kim, Cheol Hong
    • The Journal of Korean Institute of Next Generation Computing
    • /
    • v.15 no.2
    • /
    • pp.39-49
    • /
    • 2019
  • Recently, CPU-GPU integrated heterogeneous multicore processors have been widely used to improve the performance of computing systems. Heterogeneous multicore processors integrate CPUs and GPUs on a single chip, where the CPUs and GPUs share the LLC (Last Level Cache). This causes a serious cache contention problem inside the processor, resulting in significant performance degradation. In this paper, we propose a partitioned LLC architecture to solve the cache contention problem in heterogeneous multicore processors, and analyze the performance impact of varying the LLC capacity assigned to the CPUs and GPUs, respectively. According to our simulation results, as the LLC capacity assigned to the CPU grows, CPU performance improves by up to 21%. The GPU, however, shows a negligible performance difference as its assigned LLC capacity changes; in other words, the GPU loses little performance when its LLC capacity decreases. Because the performance degradation from reducing the GPU's LLC capacity is much smaller than the performance improvement from increasing the CPU's LLC capacity, the overall performance of heterogeneous multicore processors is expected to improve when a partitioned LLC is applied to the CPUs and GPUs. In addition, if a memory management technique that maximizes the performance of each core is developed in the future, the performance of heterogeneous multicore processors can be improved further.
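The conclusion that partitioning should raise overall performance can be sketched with a weighted-speedup argument; this is an assumed model, not one given in the abstract. If the CPU and GPU contribute to system performance with weights $w_{\mathrm{CPU}}$ and $w_{\mathrm{GPU}}$,

$$ S_{\mathrm{overall}} \;\approx\; w_{\mathrm{CPU}}\,S_{\mathrm{CPU}} + w_{\mathrm{GPU}}\,S_{\mathrm{GPU}}, $$

and the reported results place $S_{\mathrm{CPU}}$ at up to 1.21 while $S_{\mathrm{GPU}} \approx 1$ when LLC capacity is shifted toward the CPU, then the weighted sum exceeds 1 for any nonzero CPU weight.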

Effective CPU overclocking scheme considering energy efficiency (에너지 효율을 고려한 효과적인 CPU 오버클럭킹 방법)

  • Lee, Jun-Hee;Kong, Joon-Ho;Suh, Tae-Weon;Chung, Sung-Woo
    • Journal of the Korea Society of Computer and Information
    • /
    • v.14 no.12
    • /
    • pp.17-24
    • /
    • 2009
  • Recently, green computing has become an important issue in all fields of industry, and energy efficiency cannot be over-emphasized. Microprocessor companies such as Intel Corporation design processors taking both energy efficiency and performance into account. Nevertheless, general computer users often overclock the CPU to enhance application performance, and overclocking is traditionally considered harmful in terms of power consumption. In this paper, we present effective CPU overclocking schemes that raise the CPU frequency while keeping the current CPU supply voltage, for both energy reduction and performance improvement. The proposed schemes achieve both: evaluation results show that they reduce processor execution time by up to 17% and total computer system energy by up to 5%, and that they reduce the Energy Delay Product (EDP) by 22% on average.
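As a rough consistency check on these figures (combining the best-case numbers, which is an assumption, since they need not occur on the same workload):

$$ \frac{\mathrm{EDP}_{\mathrm{new}}}{\mathrm{EDP}_{\mathrm{old}}} \;=\; \frac{E_{\mathrm{new}}}{E_{\mathrm{old}}}\cdot\frac{T_{\mathrm{new}}}{T_{\mathrm{old}}} \;\approx\; 0.95 \times 0.83 \;\approx\; 0.79, $$

i.e. roughly a 21% EDP reduction, in line with the 22% average reported.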

A Performance Study on CPU-GPU Data Transfers of Unified Memory Device (통합메모리 장치에서 CPU-GPU 데이터 전송성능 연구)

  • Kwon, Oh-Kyoung;Gu, Gibeom
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.11 no.5
    • /
    • pp.133-138
    • /
    • 2022
  • Recently, as GPU performance has improved in HPC and artificial intelligence, GPU use has become more common, but GPU programming is still a big obstacle in terms of productivity. In particular, because host memory and GPU memory must be managed separately, research on both convenience and performance is active, and various CPU-GPU memory transfer programming methods have been suggested. Meanwhile, many SoC (System on a Chip) products, such as Apple M1 and NVIDIA Tegra, that bundle the CPU, GPU, and integrated memory into one large silicon package have recently emerged. In this study, we investigate the performance of CPU-GPU data transfers on such integrated-memory devices, which show different characteristics from the conventional environment in which host memory and GPU memory are separated. We compare performance by CPU-GPU data transfer method on NVIDIA SoC chips, which are integrated-memory devices, and on an NVIDIA SXM-based V100 GPU device. As the experimental workload for the performance comparison, we use a two-dimensional matrix transposition example frequently used in HPC applications. We analyze the following performance factors: the difference in GPU kernel performance according to the CPU-GPU memory transfer method for each GPU device, the transfer performance difference between page-locked memory and pageable memory, the overall performance comparison, and the performance comparison by workload size. Through these experiments, we confirm that the NVIDIA Xavier can maximize the benefits of integrated memory in the SoC chip by supporting I/O cache coherency.
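A minimal sketch of the two transfer styles being compared, assuming a square single-precision matrix and a naive transpose kernel; this is illustrative CUDA code, not the authors' benchmark, and timing and error checking are omitted:

    // Illustrative sketch: explicit copies with pinned host memory vs.
    // CUDA unified (managed) memory for a 2D matrix transpose.
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void transpose(const float *in, float *out, int n) {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x < n && y < n)
            out[x * n + y] = in[y * n + x];        // naive, untiled transpose
    }

    int main() {
        const int n = 4096;                        // assumed workload size
        const size_t bytes = (size_t)n * n * sizeof(float);
        dim3 block(16, 16), grid((n + 15) / 16, (n + 15) / 16);

        // Path 1: discrete-GPU style -- explicit H2D/D2H copies with page-locked
        // (pinned) host memory; pageable memory would simply use malloc() instead.
        float *h_in, *h_out, *d_in, *d_out;
        cudaMallocHost(&h_in, bytes);
        cudaMallocHost(&h_out, bytes);
        cudaMalloc(&d_in, bytes);
        cudaMalloc(&d_out, bytes);
        for (size_t i = 0; i < (size_t)n * n; ++i) h_in[i] = (float)i;
        cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);
        transpose<<<grid, block>>>(d_in, d_out, n);
        cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);

        // Path 2: unified (managed) memory -- on integrated-memory SoCs such as
        // Xavier the same physical DRAM backs both views, so no bulk copy is needed.
        float *m_in, *m_out;
        cudaMallocManaged(&m_in, bytes);
        cudaMallocManaged(&m_out, bytes);
        for (size_t i = 0; i < (size_t)n * n; ++i) m_in[i] = (float)i;
        transpose<<<grid, block>>>(m_in, m_out, n);
        cudaDeviceSynchronize();                   // CPU may now read m_out directly

        printf("check: %f %f\n", h_out[1], m_out[1]);
        cudaFreeHost(h_in); cudaFreeHost(h_out);
        cudaFree(d_in); cudaFree(d_out);
        cudaFree(m_in); cudaFree(m_out);
        return 0;
    }

On a discrete GPU such as the V100, the explicit path with pinned memory is typically the faster choice, whereas on an SoC the managed path avoids the bulk copies entirely; that contrast is essentially what the study measures.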

Evaluating Power Consumption and Real-time Performance of Android CPU Governors (안드로이드 CPU 거버너의 전력 소비 및 실시간 성능 평가)

  • Tak, Sungwoo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.20 no.12
    • /
    • pp.2401-2409
    • /
    • 2016
  • Android CPU governors exploit the DVFS (Dynamic Voltage Frequency Scaling) technique. DVFS is a power management technique in which the CPU operating frequency is decreased to allow a corresponding reduction in the CPU supply voltage. Because the power consumed by a CPU is approximately proportional to the square of the CPU supply voltage, a lower CPU operating frequency allows the supply voltage, and hence the CPU power consumption, to be reduced. However, a lower CPU operating frequency increases a task's execution time, which lengthens the task's response time and can cause the task to miss its deadline, ultimately degrading the quality of service provided by the task. In this paper, we evaluate the performance of Android CPU governors in terms of power consumption, task response time, and deadline miss ratio.
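The voltage-squared statement is the standard dynamic power relation; stated explicitly (a textbook model, not specific to this paper):

$$ P_{\mathrm{dyn}} \;\approx\; \alpha\, C\, V_{dd}^{2}\, f, $$

where $\alpha$ is the switching activity factor and $C$ the switched capacitance. Because $V_{dd}$ must roughly track $f$ within a DVFS operating range, lowering the frequency lets a governor also lower the voltage, so power falls much faster than linearly while execution time grows only about linearly in $1/f$ for CPU-bound work.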

Analysis of the CPU/GPU Temperature and Energy Efficiency depending on Executed Applications (응용프로그램 실행에 따른 CPU/GPU의 온도 및 컴퓨터 시스템의 에너지 효율성 분석)

  • Choi, Hong-Jun;Kang, Seung-Gu;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.5
    • /
    • pp.9-19
    • /
    • 2012
  • As the clock frequency increases, CPU performance improves, but power and thermal problems in the CPU become more serious. For this reason, utilizing the GPU to reduce the workload of the CPU has become one of the most popular approaches in recent high-performance computer systems. The GPU is a specialized processor originally designed for graphics processing. Recently, technologies such as CUDA, which make it easier to utilize GPU resources, have become popular, improving the performance of computer systems by utilizing the CPU and GPU simultaneously for various kinds of applications. In this work, we analyze the temperature and energy efficiency of a computer system in which the CPU and the GPU are utilized simultaneously, to identify possible problems in upcoming high-performance computer systems. According to our experimental results, the temperatures of both the CPU and the GPU increase when an application is executed on the GPU, whereas when an application is executed on the CPU, the CPU temperature increases and the GPU temperature remains unchanged. The computer system shows better energy efficiency when utilizing the GPU rather than the CPU, because the throughput of the GPU is much higher than that of the CPU. However, the temperature of the system tends to rise more easily when applications are executed on the GPU, because the GPU consumes more power than the CPU.
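The efficiency claim can be read through a simple relation (a generic model, not one stated in the paper): the energy consumed per unit of work is power divided by throughput,

$$ E_{\mathrm{per\,task}} \;=\; \frac{P}{X}, $$

so even though the GPU draws more power $P$, its much higher throughput $X$ yields a lower energy per task, while that same higher power is what drives the faster temperature rise noted in the results.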

Implementation and Performance Evaluation of the Faddeev-LeVerrier Algorithm using GPGPU (GPGPU를 이용한 파데브-레브리어 알고리즘 구현 및 성능 분석)

  • Park, Yong-Hun;Kim, Cheol-Hong;Kim, Jong-Myon
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.8 no.3
    • /
    • pp.171-178
    • /
    • 2013
  • In this paper, we implement the Faddeev-LeVerrier algorithm using GPGPU (General-Purpose Graphics Processing Unit) to accelerate singular value decomposition. In addition, we compare the performance of the algorithm on the CPU alone and on the CPU plus GPGPU for eleven $n{\times}n$ matrix sizes, where $n$ = 4, 8, 16, 32, 64, 128, 256, 512, 1,024, 2,048, and 4,096. Experimental results indicate that the CPU achieves better performance than the CPU plus GPGPU for $n{\leq}64$ because of the large number of read and write operations between the CPU and the GPGPU. However, the CPU plus GPGPU outperforms the CPU in execution time for $n{\geq}64$, by an increasingly large margin as $n$ grows.
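For reference, the standard Faddeev-LeVerrier recurrence (not restated in the abstract) computes the coefficients of the characteristic polynomial $p_A(\lambda) = \lambda^n + c_1\lambda^{n-1} + \cdots + c_n$ of an $n{\times}n$ matrix $A$ as

$$ M_1 = A,\quad c_1 = -\operatorname{tr}(M_1);\qquad M_k = A\,(M_{k-1} + c_{k-1} I),\quad c_k = -\tfrac{1}{k}\operatorname{tr}(M_k),\quad k = 2,\dots,n. $$

One way to use it for singular values, consistent with the abstract's aim, is to apply it to $A^{\mathsf T}A$, whose eigenvalues are the squared singular values of $A$. Each step is dominated by an $n{\times}n$ matrix multiplication, which is exactly the kind of $O(n^3)$ work a GPGPU accelerates well once $n$ is large enough to amortize the CPU-GPU transfers.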