• Title/Abstract/Keywords: CPU time

946 results found (processing time: 0.026 s)

Performance Evaluation of Microservers to drive for Cloud Computing Applications

  • 오명훈
    • 한국인터넷방송통신학회논문지, Vol. 23, No. 4, pp. 85-91, 2023
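  • To support the use of KOSMOS, a domestically developed microserver, we present performance evaluation results obtained with CloudSuite, a benchmark suite based on real application services in cloud computing. CloudSuite provides, in container form, several distinct applications that are offered as cloud services, grouped into offline and online applications. Compared with another microserver of similar specifications, KOSMOS performed better across all categories, and compared with Intel Xeon CPU-based servers it also performed better on some offline applications. When running Graph Analytics, a CloudSuite offline benchmark, in a configuration that distributed execution across multiple KOSMOS nodes, execution time was reduced by 30.3% and 72.3% relative to the two Intel Xeon CPU-based comparison servers, respectively.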

High Throughput Parallel KMP Algorithm Considering CPU-GPU Memory Hierarchy

  • 박소은;김대희;이명호;박능수
    • 전기학회논문지, Vol. 67, No. 5, pp. 656-662, 2018
  • Pattern matching algorithms are widely used in application fields such as bioinformatics and intrusion detection. Among the many string matching algorithms, the KMP (Knuth-Morris-Pratt) algorithm is commonly used because of its fast execution time on large texts. However, the processing speed of the KMP algorithm is still limited when the text size increases significantly. In this paper, we propose a high-throughput parallel KMP algorithm that considers the CPU-GPU memory hierarchy, based on OpenCL for GPGPU (General-Purpose computing on Graphics Processing Units). We focus on optimizing the allocation of work-items and work-groups, copying the pattern data and the failure table into local memory, and overlapping data transfer with the string matching operations. The experimental results show that the optimized parallel KMP algorithm is about 3.6 times faster than the non-optimized parallel KMP algorithm.
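
As a rough illustration of the building blocks named in this abstract, the following is a minimal sequential sketch of KMP with its failure table, plus a chunked search that splits the text into overlapping pieces the way a parallel version might hand them to separate work-groups. It is not the paper's OpenCL implementation; the function names and the `chunk_size` parameter are assumptions made for this example.

```python
def build_failure_table(pattern: str) -> list[int]:
    """failure[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    failure = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def kmp_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    failure = build_failure_table(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = failure[k - 1]  # keep going to find overlapping matches
    return matches


def chunked_kmp_search(text: str, pattern: str, chunk_size: int) -> list[int]:
    """Search fixed-size chunks independently (hypothetical partitioning).

    Each chunk is extended by len(pattern) - 1 characters so that a match
    straddling a chunk boundary is found exactly once, by the chunk in
    which it starts.
    """
    overlap = len(pattern) - 1
    matches = []
    for start in range(0, len(text), chunk_size):
        chunk = text[start:start + chunk_size + overlap]
        matches.extend(start + pos for pos in kmp_search(chunk, pattern))
    return matches
```

Because every chunk only reads the shared pattern and failure table, the chunks are independent of each other, which is what makes keeping those two small arrays in fast local memory attractive in a GPU setting.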

FPGA-Based Hardware Accelerator for Feature Extraction in Automatic Speech Recognition

  • Choo, Chang;Chang, Young-Uk;Moon, Il-Young
    • Journal of Information and Communication Convergence Engineering, Vol. 13, No. 3, pp. 145-151, 2015
  • In this paper, we describe a hardware-based scheme for improving the speed of a real-time automatic speech recognition (ASR) system by designing a parallel feature extraction algorithm on a Field-Programmable Gate Array (FPGA). A computationally intensive block in the algorithm is identified and implemented in hardware logic on the FPGA. One such block is the mel-frequency cepstrum coefficient (MFCC) algorithm used in the feature extraction process. We demonstrate that the FPGA platform can perform feature extraction in the speech recognition system more efficiently than general-purpose CPUs, including the ARM processor. The Xilinx Zynq-7000 System on Chip (SoC) platform is used for the MFCC implementation. With this implementation, we confirmed that the FPGA platform is approximately 500× faster than a sequential CPU implementation and 60× faster than a sequential ARM implementation. We thus verified that a parallelized and optimized MFCC architecture on the FPGA platform can significantly improve the execution time of an ASR system compared to the CPU and ARM platforms.

Real-Time Implementation of MPEG-1 Audio decoder on ARM RISC

  • 김선태
    • 대한전자공학회:학술대회논문집, 대한전자공학회 2000 Fall Conference Proceedings (4), pp. 119-122, 2000
  • Recently, many complex DSP (Digital Signal Processing) algorithms have been realized on RISC CPUs because of their good compilation support, low power consumption, and large memory space. However, real-time implementation of multiple DSP algorithms on a RISC processor requires minimal, efficient memory usage and low CPU occupancy. In this thesis, the original floating-point code of an MPEG-1 audio decoder is converted to fixed-point code, and the time-consuming functions are then optimized into efficient assembly code suited to the RISC architecture. As a result, speedups of about 30 times and 3 times are achieved relative to the floating-point and fixed-point versions, respectively, and memory usage is reduced by a factor of 3 to 4.
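
The floating-point to fixed-point conversion mentioned in this abstract can be illustrated with a small sketch. The abstract does not say which number format the author used, so the Q15 format (1 sign bit, 15 fractional bits) and the helper names below are assumptions chosen only to show the general idea.

```python
Q = 15              # number of fractional bits (assumed Q15 format)
SCALE = 1 << Q      # 32768


def to_fixed(x: float) -> int:
    """Quantize a float in [-1, 1) to a Q15 integer, saturating at the limits."""
    v = int(round(x * SCALE))
    return max(-SCALE, min(SCALE - 1, v))


def to_float(v: int) -> float:
    """Convert a Q15 integer back to a float."""
    return v / SCALE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q15 values using integer arithmetic only.

    The raw product has 30 fractional bits, so it is shifted right by 15
    after adding half an LSB for rounding.
    """
    return (a * b + (1 << (Q - 1))) >> Q


half = to_fixed(0.5)
quarter = fixed_mul(half, half)
print(to_float(quarter))
```

On a processor without a floating-point unit, each such multiply becomes an integer multiply and a shift instead of a software-emulated floating-point routine, which is the usual source of the speedup in conversions of this kind.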


Development of Full Coverage Test Framework for NVMe Based Storage

  • Park, Jung Kyu;Kim, Jaeho
    • 한국컴퓨터정보학회논문지, Vol. 22, No. 4, pp. 17-24, 2017
  • In this paper, we propose an efficient dynamic workload balancing strategy that improves the performance of high-performance computing systems. The key idea is to minimize the execution time of each job and maximize system throughput by making effective use of system resources such as CPU and memory. The strategy allocates jobs dynamically by considering the memory demand of each job and the workload status of each node. If a node becomes overloaded by an allocated job, the proposed scheme migrates jobs running on the overloaded node to other free nodes, reducing job waiting and execution times by balancing the workload across nodes. Through simulation, we show that the proposed CPU- and memory-based dynamic workload balancing strategy improves the performance of high-performance computing systems compared to previous strategies.
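
The allocation and migration steps described in this abstract could be sketched roughly as follows. The abstract gives no concrete policy, so everything here is an assumption made for illustration: the `Job` and `Node` classes, the 0.8 overload threshold, the "least-loaded node with enough free memory" rule, and the choice to migrate the smallest job first.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    cpu: float    # CPU demand in cores (hypothetical unit)
    memory: int   # memory demand in MB


@dataclass
class Node:
    name: str
    cpu_capacity: float
    memory_capacity: int
    jobs: list = field(default_factory=list)

    def cpu_load(self) -> float:
        """Fraction of CPU capacity currently in use."""
        return sum(j.cpu for j in self.jobs) / self.cpu_capacity

    def free_memory(self) -> int:
        return self.memory_capacity - sum(j.memory for j in self.jobs)


def allocate(job: Job, nodes: list) -> "Node | None":
    """Place a job on the least-loaded node that has enough free memory."""
    candidates = [n for n in nodes if n.free_memory() >= job.memory]
    if not candidates:
        return None
    target = min(candidates, key=Node.cpu_load)
    target.jobs.append(job)
    return target


def rebalance(nodes: list, threshold: float = 0.8) -> list:
    """Migrate jobs away from nodes whose CPU load exceeds the threshold.

    Returns a list of (job name, source node, target node) tuples.
    """
    migrations = []
    for node in nodes:
        while node.jobs and node.cpu_load() > threshold:
            job = min(node.jobs, key=lambda j: j.cpu)  # cheapest job to move
            targets = [
                n for n in nodes
                if n is not node
                and n.free_memory() >= job.memory
                and (sum(j.cpu for j in n.jobs) + job.cpu) / n.cpu_capacity <= threshold
            ]
            if not targets:
                break  # nowhere to move the job without overloading another node
            target = min(targets, key=Node.cpu_load)
            node.jobs.remove(job)
            target.jobs.append(job)
            migrations.append((job.name, node.name, target.name))
    return migrations
```

A target node is accepted only if it stays under the threshold after receiving the job, so a migration cannot simply shift the overload from one node to another.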

Optimization of Ship Management System

  • 임치산;박수홍
    • 한국전자통신학회논문지, Vol. 8, No. 6, pp. 839-846, 2013
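  • In this study, we designed and developed an optimal programming method for a real-time ship management system. In place of the conventional interrupt-based programming approach, we propose an embedded real-time management system with multitasking and visualization. Data management was developed on an embedded real-time operating system and designed to optimize use of the central processing unit (CPU) through an artificial intelligence approach. Finally, the optimal program model improved data processing while minimizing data loss in the system.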

A Study on the Evaluation of Interpolation Methods in PIV

  • 최장운;조대한;최민선;이영호
    • Journal of Advanced Marine Engineering and Technology, Vol. 20, No. 4, pp. 90-100, 1996
  • To maintain high spatial accuracy and short CPU time when interpolating data from a grid to random positions, or the reverse, in PIV, many proposed techniques are compared and discussed, mainly in terms of interpolation error and computing time. Artificial PIV atmosphere data are furnished by CFD results. First, for interpolation from a grid to random positions, the multiquadric method gives the highest accuracy with the longest CPU time, while Taylor series expansion methods give reasonable accuracy with a lower computational load. Second, the sub-pixel resolution analysis for estimating the coordinates of the maximum correlation coefficients, which is essential in grey-level correlation PIV, reveals that 8-neighbour second-order least-squares interpolation gives the highest accuracy under real flow conditions.


A Study on the Evaluation of Interpolation Methods in PIV

  • 최장운;조대환;최민선;이영호
    • Journal of Advanced Marine Engineering and Technology, Vol. 20, No. 4, pp. 412-412, 1996
  • To maintain high spatial accuracy and short CPU time when interpolating data from a grid to random positions, or the reverse, in PIV, many proposed techniques are compared and discussed, mainly in terms of interpolation error and computing time. Artificial PIV atmosphere data are furnished by CFD results. First, for interpolation from a grid to random positions, the multiquadric method gives the highest accuracy with the longest CPU time, while Taylor series expansion methods give reasonable accuracy with a lower computational load. Second, the sub-pixel resolution analysis for estimating the coordinates of the maximum correlation coefficients, which is essential in grey-level correlation PIV, reveals that 8-neighbour second-order least-squares interpolation gives the highest accuracy under real flow conditions.

Study on the Relationship between Adolescents' Self-esteem and their Sociality -Focusing on the Moderating Effect of Gender -

  • Kim, Kyung-Sook;Lee, Duk-Nam
    • 한국컴퓨터정보학회논문지, Vol. 21, No. 1, pp. 147-153, 2016

Bayesian Regression Modeling for Patent Keyword Analysis

  • Choi, JunHyeog;Jun, SungHae
    • 한국컴퓨터정보학회논문지, Vol. 21, No. 1, pp. 125-129, 2016