• Title/Summary/Keyword: CPU time

S/W Watch-Dog method between dual CPU using different OS (이종 OS로 구동되는 Dual CPU 기반에서의 S/W Watch-Dog 기법)

  • You, Young-Eel;Chon, Byoung-Sil
    • Journal of the Institute of Electronics Engineers of Korea TC / v.47 no.2 / pp.34-39 / 2010
  • This paper proposes a software (S/W) watchdog method for dual CPUs running different operating systems. The proposed method distinguishes the status of the channel between the two CPUs from the status of each processor itself. We determine the ideal threshold and priority values for the load test task and evaluate the accuracy of the proposed method. The evaluation shows that the proposed method is more accurate than a general S/W watchdog method when the data rate varies. We therefore confirm that, with the ideal threshold and priority for the load test task, the proposed method provides a highly accurate watchdog function.

Acceleration of Mesh Denoising Using GPU Parallel Processing (GPU의 병렬 처리 기능을 이용한 메쉬 평탄화 가속 방법)

  • Lee, Sang-Gil;Shin, Byeong-Seok
    • Journal of Korea Game Society / v.9 no.2 / pp.135-142 / 2009
  • Mesh denoising removes noise from a mesh by applying various filters. However, these methods are usually slow because the filtering is performed on the CPU. Because a GPU is specialized for floating-point operations and is faster than a CPU, it makes real-time processing of complex operations possible. Mesh denoising is particularly well suited to GPU parallel processing, since it repeats the same operations for each vertex or triangle. In this paper, we propose a mesh denoising algorithm based on bilateral filtering that uses GPU parallel processing to reduce processing time. It finds the neighboring triangles of each vertex to which the bilateral filter is applied and computes the vertex's normal vector. It then performs bilateral filtering to estimate the new vertex position and update the normal vector.
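
The per-vertex step described in this abstract can be illustrated with a minimal CPU sketch of the classic bilateral mesh filter. This is not the authors' GPU implementation: the function name `bilateral_vertex`, the parameters `sigma_c` and `sigma_s`, and the use of neighboring vertex positions as samples are all assumptions made for illustration. Each vertex is moved along its normal by a weighted average of its neighbors' offsets, which is why the operation parallelizes naturally (one thread per vertex).

```python
import math

def bilateral_vertex(v, n, neighbors, sigma_c, sigma_s):
    """Return the filtered position of vertex v with unit normal n.

    neighbors: iterable of 3D points around v (hypothetical input format).
    sigma_c:   spatial falloff; sigma_s: falloff on the offset along n.
    """
    total = 0.0
    norm = 0.0
    for q in neighbors:
        d = [qi - vi for qi, vi in zip(q, v)]
        t = math.sqrt(sum(c * c for c in d))        # distance to the neighbor
        h = sum(ni * di for ni, di in zip(n, d))    # neighbor offset along n
        w = math.exp(-t * t / (2 * sigma_c ** 2)) * math.exp(-h * h / (2 * sigma_s ** 2))
        total += w * h
        norm += w
    if norm == 0.0:
        return tuple(v)                             # no usable neighbors
    shift = total / norm
    return tuple(vi + ni * shift for vi, ni in zip(v, n))

# A spike above a flat ring of neighbors is pulled toward the plane z = 0.
ring = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
smoothed = bilateral_vertex((0, 0, 1), (0, 0, 1), ring, sigma_c=1.0, sigma_s=1.0)
```

Because every vertex reads only its own neighborhood and writes only its own position, the loop body maps directly onto a per-vertex GPU kernel.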

Improved Global Placement Technique to Relieve Routing Congestion (배선 밀집도를 완화하기 위한 개선된 광역배치 기법)

  • Oh, Eun-Kyung;Hur, Sung-Woo
    • Journal of KIISE:Computing Practices and Letters / v.14 no.4 / pp.431-435 / 2008
  • Because the previous work, CDP (Congestion Driven Placement) [1], considers every possible direction in which to move each cell in the nets that contribute most to routing congestion, it consumes a lot of CPU time. In this paper, we propose a faster global placement technique, called ICDGP (Improved Congestion Driven Global Placement), to relieve routing congestion. ICDGP uses the force-directed method to determine target locations for the cells in the nets at congested spots, and considers moving each cell only to its target location. If moving multiple cells simultaneously is judged better than moving them one by one, it moves them simultaneously. Experimental results show that ICDGP produces less congested placements than CDP. In particular, CPU time is reduced by 36% on average.

Development of Network Event Audit Module Using Data Mining (데이터 마이닝을 통한 네트워크 이벤트 감사 모듈 개발)

  • Han, Seak-Jae;Soh, Woo-Young
    • Convergence Security Journal / v.5 no.2 / pp.1-8 / 2005
  • Network event analysis provides useful information about network status that helps protect against attacks. It involves finding sets of frequently occurring packet information, such as IP addresses, and by its nature requires real-time processing. The Apriori algorithm used in data mining can find frequent item sets, but its high CPU and memory usage, and hence low processing speed, make it unsuitable for analyzing network events in real time. This paper develops a network event audit module that applies association rules to network events using a new algorithm instead of Apriori. Test results show that the new algorithm uses drastically less CPU and memory for network event analysis than the existing Apriori algorithm.
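
For orientation, the baseline that this abstract compares against can be sketched as a textbook Apriori pass over packet attributes. This is only the standard algorithm, not the paper's replacement, which the abstract does not describe. The function name `apriori`, the item strings, and the absolute `min_support` count are assumptions chosen for illustration. The repeated support counting over all transactions at each level is what makes the approach expensive in CPU and memory.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(items): support_count} for all frequent item sets."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Full scan of the data for every candidate: the costly part.
        return sum(1 for t in transactions if itemset <= t)

    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s)
        k += 1
        # Join step: merge frequent (k-1)-sets into k-set candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        level = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
            and support(c) >= min_support
        }
    return frequent

# Hypothetical events: each one is a set of packet attributes.
events = [
    {"src=10.0.0.1", "dport=80"},
    {"src=10.0.0.1", "dport=80"},
    {"src=10.0.0.1", "dport=22"},
    {"src=10.0.0.2", "dport=80"},
]
frequent_sets = apriori(events, min_support=2)
```

In this toy input, the pair of `src=10.0.0.1` and `dport=80` appears in two events, so it is reported alongside the two frequent single items.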

Performance Enhancement of GPU Parallelism Algorithm including Memory Loading Time (메모리 로딩 시간을 고려한 GPU 병렬 알고리즘의 성능 개선 방안)

  • Bae, Byunggul;Lee, Jinwoo;Park, II-Nam;Im, Eun-Jin;Kang, Seung-Shik
    • Annual Conference on Human and Language Technology / 2012.10a / pp.119-120 / 2012
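  • The overall performance of a GPU-based parallel algorithm depends on which memory it uses. This paper studies how memory loading time affects overall system performance when parallel processing in the CUDA framework, running in a GPU environment, is used to speed up a document classification system. We measured how much performance improved compared with an existing CPU implementation, and compared realistic execution times by including the time taken to read memory, which previous studies did not consider. The experimental results show that document classification performance improves when the system is implemented using the GPU's texture memory, compared with the computation speed of the CPU implementation.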

Implementation and Performance Evaluation of a Video-Equipped Real-Time Fire Detection Method at Different Resolutions using a GPU (GPU를 이용한 다양한 해상도의 비디오기반 실시간 화재감지 방법 구현 및 성능평가)

  • Shon, Dong-Koo;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information / v.20 no.1 / pp.1-10 / 2015
  • In this paper, we propose an efficient parallel implementation of a widely used, complex four-stage fire detection algorithm on a graphics processing unit (GPU) to improve the algorithm's performance, and we analyze the performance of this implementation. We use videos at seven different resolutions (QVGA, VGA, SVGA, XGA, SXGA+, UXGA, QXGA) as inputs to the four-stage fire detection algorithm, and compare the performance of the GPU-based approach with that of the CPU implementation at each resolution. Experimental results using five different fire videos at seven resolutions indicate that the proposed GPU implementation outperforms the CPU implementation in execution time and takes 25.11 ms per frame for UXGA-resolution video, satisfying the real-time requirement (30 frames per second, 30 fps) of the fire detection algorithm.
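
The real-time claim in this abstract follows from simple arithmetic: at 30 fps each frame has a budget of 1000/30 ms, and 25.11 ms per frame fits inside it. The short sketch below spells this out. The helper names are hypothetical; only the 25.11 ms and 30 fps figures come from the abstract.

```python
def frame_budget_ms(target_fps):
    """Time available per frame, in milliseconds, at the target frame rate."""
    return 1000.0 / target_fps

def achievable_fps(ms_per_frame):
    """Frame rate sustained if every frame takes ms_per_frame milliseconds."""
    return 1000.0 / ms_per_frame

def meets_realtime(ms_per_frame, target_fps):
    """True when the per-frame time fits within the per-frame budget."""
    return ms_per_frame <= frame_budget_ms(target_fps)

budget = frame_budget_ms(30)         # about 33.33 ms per frame
rate = achievable_fps(25.11)         # about 39.82 frames per second
ok = meets_realtime(25.11, 30)       # True
```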

Quadtree-based Terrain Visualization Using Vertex Multiplication (정점증식을 이용한 사진트리 기반 지형 시각화 기법)

  • Lee, Eun-Seok;Shin, Byeong-Seok
    • Journal of the Korea Computer Graphics Society / v.15 no.3 / pp.27-33 / 2009
  • In terrain visualization, the quadtree is the most frequently used data structure for progressive mesh generation. The quadtree provides efficient level-of-detail selection and view frustum culling. However, most applications using quadtrees run on the CPU, since the hierarchical data structure cannot be manipulated in a programmable rendering pipeline. For this reason, quadtree-based methods show lower performance and higher dependency on the CPU than GPU-based methods. We present a quadtree-based terrain-rendering method for GPU execution that uses vertex multiplication. It offers higher performance than previous CPU-based quadtree methods, without loss of image quality.

Analysis of TCP/IP Protocol for Implementing a High-Performance Hybrid TCP/IP Offload Engine (고성능 Hybrid TCP/IP Offload Engine 구현을 위한 TCP/IP 프로토콜 분석)

  • Jang Hankook;Oh Soo-Cheol;Chung Sang-Hwa;Kim Dong Kyue
    • Journal of KIISE:Computer Systems and Theory / v.32 no.6 / pp.296-305 / 2005
  • TCP/IP, the most popular communication protocol, is processed on the host CPU in traditional computer systems, which imposes an enormous load on the host CPU. Recently, TCP/IP Offload Engine (TOE) technology, which processes TCP/IP on a network adapter instead of the host CPU, has become an important way to solve this problem. In this paper, we analyze the structure of the TCP/IP protocol stack in the Linux operating system and identify the important factors that place a heavy load on the host CPU by measuring the time spent in each function of the protocol stack. Based on these analyses, we propose a Hybrid TOE architecture in which the functions that impose a heavy load on the host CPU are implemented in hardware and the other functions are implemented in software.

Implementation of FFT on Massively Parallel GPU for DVB-T Receiver (DVB-T 수신기를 위한 대규모 병렬처리 GPU 기반의 FFT 구현)

  • Lee, Kyu Hyung;Heo, Seo Weon
    • Journal of Broadcast Engineering / v.18 no.2 / pp.204-214 / 2013
  • Recently, various studies have been conducted on implementing signal processing or communication systems in software using the massively parallel processing capability of the GPU. In this work, we focus on reducing the software simulation time of the 2K/8K FFT in DVB-T by using a GPU. We first estimate the CPU processing time of the DVB-T system, which is one of the standards for DTV transmission. We then implement the FFT in software on NVIDIA's massively parallel GPU processor. We apply a stream processing method to reduce the overhead of data transfer between the CPU and GPU, a coalescing method to reduce global memory access time, and a data structure design method to maximize shared memory usage. The results show that the proposed method is approximately 20 to 30 times as fast as the CPU-based FFT processor and approximately 1.8 times as fast as the CUFFT library (version 2.1) provided by NVIDIA when applied to the DVB-T 2K/8K mode FFT.
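
As a point of reference for the computation being accelerated, here is a minimal recursive radix-2 FFT on the CPU. It is a plain textbook sketch, not the paper's GPU kernel, and it does not model the stream, coalescing, or shared-memory techniques the abstract mentions. The function name `fft` is an assumption; the 2K and 8K modes correspond to transform lengths of 2048 and 8192 points.

```python
import cmath

def fft(x):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a nonzero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])              # transform of even-indexed samples
    odd = fft(x[1::2])               # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

# A unit impulse at 2K length has a flat spectrum of ones.
spectrum = fft([1.0] + [0.0] * 2047)
```

The butterflies within one stage are independent of each other, which is the property a massively parallel implementation exploits.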

An Efficient Dynamic Workload Balancing Strategy for High-Performance Computing System (고성능 컴퓨팅 시스템을 위한 효율적인 동적 작업부하 균등화 정책)

  • Lee, Won-Joo;Park, Mal-Soon
    • Journal of the Korea Society of Computer and Information / v.13 no.5 / pp.45-52 / 2008
  • In this paper, we propose an efficient dynamic workload balancing strategy that improves the performance of high-performance computing systems. The key idea is to minimize the execution time of each job and maximize system throughput by using system resources such as CPU and memory effectively. The strategy allocates jobs dynamically by considering the memory size demanded by each job and the workload status of each node. If a node becomes overloaded because of an allocated job, the proposed scheme migrates jobs executing on the overloaded node to free nodes, reducing job waiting time and execution time by balancing the workload across nodes. Through simulation, we show that the proposed dynamic workload balancing strategy, based on CPU and memory, improves the performance of high-performance computing systems compared with previous strategies.
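
The two mechanisms named in this abstract, memory-aware allocation and migration away from overloaded nodes, can be sketched roughly as follows. The abstract does not give the actual policy, so everything here is an assumption for illustration: the `Node` class, the use of job count as the load measure, and the `max_jobs` overload threshold.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    mem_total: int
    jobs: list = field(default_factory=list)   # (job_id, mem_demand) pairs

    @property
    def mem_free(self):
        return self.mem_total - sum(m for _, m in self.jobs)

def allocate(nodes, job_id, mem_demand):
    """Place a job on the least-loaded node that has enough free memory."""
    fits = [n for n in nodes if n.mem_free >= mem_demand]
    if not fits:
        return None                             # no node can host the job now
    target = min(fits, key=lambda n: (len(n.jobs), -n.mem_free))
    target.jobs.append((job_id, mem_demand))
    return target

def rebalance(nodes, max_jobs):
    """Migrate jobs off nodes holding more than max_jobs (assumed threshold)."""
    moved = []
    for node in nodes:
        while len(node.jobs) > max_jobs:
            job_id, mem = node.jobs[-1]
            free = [n for n in nodes
                    if n is not node and len(n.jobs) < max_jobs and n.mem_free >= mem]
            if not free:
                break                           # nowhere to migrate; stay put
            node.jobs.pop()
            dest = min(free, key=lambda n: len(n.jobs))
            dest.jobs.append((job_id, mem))
            moved.append((job_id, node.name, dest.name))
    return moved

a = Node("a", 8, [("j1", 1), ("j2", 1), ("j3", 1)])
b = Node("b", 8)
migrations = rebalance([a, b], max_jobs=2)      # moves j3 from a to b
```

A real scheduler would also weigh CPU utilization and migration cost; this sketch only shows the shape of the decision.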
