• Title/Summary/Keyword: CPU time

GPU Resource Contention Management Technique for Simultaneous GPU Tasks in the Container Environments with Share the GPU (GPU를 공유하는 컨테이너 환경에서 GPU 작업의 동시 실행을 위한 GPU 자원 경쟁 관리기법)

  • Kang, Jihun
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.11 no.10
    • /
    • pp.333-344
    • /
    • 2022
  • In a container-based cloud environment, multiple containers can share a graphics processing unit (GPU), and GPU sharing can minimize the idle time of GPU resources and improve resource utilization. However, unlike CPU or memory, a GPU in a cloud environment cannot logically multiplex its computing resources to provide users with a portion of them in isolated form. In addition, containers occupy GPU resources only while performing GPU operations, and their resource usage is unknown because neither the timing nor the size of each container's GPU operations is known in advance. Because containers can use GPU resources without restriction at any point in time, and because GPU tasks are handled as a black box inside the GPU, managing resource contention is very difficult when multiple containers run GPU tasks simultaneously. In this paper, we propose a container management technique that prevents the performance degradation caused by resource contention when multiple containers execute GPU tasks simultaneously. We also analyze this performance degradation and demonstrate the efficiency of the proposed technique through experiments.

Booting Process Profiling Tool for Baseboard Management Controllers (베이스보드 매니지먼트 컨트롤러를 위한 부팅 과정 프로파일링 도구)

  • Jaeseop Kim;Minho Park;Jiman Hong
    • Smart Media Journal
    • /
    • v.11 no.11
    • /
    • pp.84-91
    • /
    • 2022
  • A baseboard management controller (BMC) supports server monitoring, maintenance, and control functions using various communication interfaces. However, if an unexpected problem occurs during device driver initialization, the BMC may not operate normally. A boot process profiling tool that accurately analyzes the device driver initialization process and lets users check the analysis results is therefore essential. Existing boot process profiling tools do not specifically report the device driver initialization process and results required for BMC boot process analysis, forcing developers to combine several tools to analyze the boot process in detail. In this paper, we propose an integrated profiling tool for the BMC booting process. The proposed tool provides device driver initialization analysis, CPU and memory usage analysis, and kernel version management functions. Users can easily analyze the booting process with the proposed tool, and the analysis results can be used to shorten the booting time. The proposed tool is implemented on a Linux-based BMC and is shown to be more efficient than the existing profiling tool.

Stream-based API composition for stable API Gateway (안정적인 API 게이트웨이를 위한 스트림 기반 API 조합)

  • Dong-il Cho
    • Journal of Internet Computing and Services
    • /
    • v.25 no.1
    • /
    • pp.1-8
    • /
    • 2024
  • In an API gateway, API composition is an essential function that can reduce the number of client calls and prevent over-fetching and under-fetching. API composition based on IMJ (In-Memory Join) consumes a lot of resources, which burdens the performance of the API gateway. In this paper, to address the problems of IMJ-style API composition, we propose SAPIC (Stream-based API Composition), which delivers the data to be composed to the client by streaming. SAPIC calls each MSA API that makes up the client response data and immediately streams the received response data to the client, reducing the resource consumption of the API gateway and providing a faster response time than IMJ. In a comparison experiment with GraphQL, a representative API composition technology, SAPIC recorded a maximum CPU occupancy rate approximately 21 to 70% lower, a maximum heap usage rate approximately 16 to 74% lower, and a throughput 1 to 2.3 times higher than GraphQL.
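
The contrast between in-memory joining and streaming described in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SAPIC implementation: the `Backend` type, both function names, and the sequential call order are hypothetical, and real backends would be network calls rather than local functions.

```python
from typing import Callable, Dict, Iterator, Tuple

# Hypothetical stand-in for a backend (MSA) API call that returns one
# fragment of the client response.
Backend = Callable[[], dict]


def compose_in_memory(backends: Dict[str, Backend]) -> dict:
    """IMJ-style: keep every backend response in memory, then return the join."""
    joined = {}
    for name, call in backends.items():
        joined[name] = call()
    return joined


def compose_streaming(backends: Dict[str, Backend]) -> Iterator[Tuple[str, dict]]:
    """Stream-style: hand each fragment onward as soon as it is received."""
    for name, call in backends.items():
        # Only the current fragment is held by the gateway at this point.
        yield name, call()


if __name__ == "__main__":
    demo = {"user": lambda: {"id": 1}, "orders": lambda: {"count": 2}}
    print(compose_in_memory(demo))
    for name, fragment in compose_streaming(demo):
        print(name, fragment)
```

The generator version never builds the full joined object, which is the property the abstract credits for the lower heap usage.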

Dispersion Analysis of the Waveguide Structures by Using the Compact 2D ADI-FDTD (Compact 2D ADI-FDTD를 이용한 도파관 구조의 분산특성 연구)

  • 어수지;천정남;박현식;김형동
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.39 no.10
    • /
    • pp.38-45
    • /
    • 2002
  • This paper presents a new compact 2D ADI-FDTD (Alternating-Direction Implicit Finite-Difference Time-Domain) method, in which the time step is no longer restricted by the numerical stability condition. The method accelerates the conventional compact 2D FDTD method. To validate the algorithm, we analyzed the dispersion characteristics of a hollow rectangular waveguide and a shielded microstrip line. The results of the proposed method agree very well with those of both the conventional analytic method and the compact 2D FDTD method. The CPU time required for the analysis is greatly reduced compared with the conventional compact 2D FDTD method. The proposed method is valuable as a fast algorithm for studying the dispersion characteristics of waveguide structures.

Implementation of an ASP Upload Component to Comply with RFC 1867 (RFC 1867 규격을 준수하는 ASP 업로드 컴포넌트 설계)

  • Hwang Hyun-Ju;Kang Koo-Hong
    • The Journal of the Korea Contents Association
    • /
    • v.6 no.3
    • /
    • pp.63-74
    • /
    • 2006
  • Recently, many ASP applications have been released that can accept, save, and manipulate files uploaded from a web browser. The files are uploaded via an HTML POST form as specified in RFC 1867. In particular, file transfer over the HTTP port is becoming more important because of current Internet security issues. In this paper, we implement a form-based ASP upload component and explicitly disclose most of its main code. The open source code may thus help in developing new ASP applications that include a file upload function. We also measure the upload time and CPU usage time of the proposed upload component and compare them with those of well-known commercial components, showing that the performance of the proposed component is comparable to that of the commercial ones.

Development of a Realtime Surface Image Velocimeter without Reference Points (참조점이 필요없는 실시간 표면영상유속계 개발)

  • Yu, Kwonkyu;Yoo, Byeongnam
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2015.05a
    • /
    • pp.73-73
    • /
    • 2015
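  • Measuring flood discharge in natural rivers is very difficult and demands considerable cost, time, and effort. The surface image velocimeter, which analyzes images of the river surface, has been proposed as a safer and more economical alternative for discharge measurement. This study develops a real-time surface image velocimeter using an Android-based smartphone. It uses the camera, GPS, orientation sensor, and CPU built into the smartphone to measure the surface velocity of a river on site in real time. First, the smartphone's GPS fixes the location of the measurement site, and the inclinometer (orientation sensor) is used to establish the geometric relationship between the camera and the imaged surface. We built a camera model that takes only the height difference between the water surface and the camera as input and estimates the position of the river water surface from the measured camera tilt. This approach eliminates the need for reference-point calibration, one of the drawbacks of existing surface image velocimeters. The built-in camera records video for a fixed duration (3 seconds), the video is split into frames using JavaCV, an open-source image processing library, and spatio-temporal image analysis of the frames yields an estimate of the two-dimensional velocity field of the river surface. The cross-correlation spatio-temporal analysis method was used for this analysis. All code was written in Java to run on the Android operating system. In a field test on a commercially available Android smartphone, processing 3 seconds of video took about 5 seconds, so velocity could be measured in nearly real time. The velocity measurement error was around 5%, the typical error for image processing.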
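
As a rough illustration of how cross-correlation can turn image data into a velocity, the sketch below finds the pixel shift that best aligns two one-dimensional intensity profiles and converts it to a speed. It is a simplified 1D example, not the spatio-temporal method used in the study; the function names and the `metres_per_pixel` and `frame_interval` parameters are assumptions made for illustration.

```python
from typing import Sequence


def estimate_shift(a: Sequence[float], b: Sequence[float], max_shift: int) -> int:
    """Return the shift s (in samples) that maximizes sum(a[i] * b[i + s])."""
    best_shift, best_score = 0, float("-inf")
    for s in range(-max_shift, max_shift + 1):
        # Correlate only the overlapping part of the two profiles.
        score = sum(a[i] * b[i + s] for i in range(len(a)) if 0 <= i + s < len(b))
        if score > best_score:
            best_shift, best_score = s, score
    return best_shift


def surface_velocity(shift: int, metres_per_pixel: float, frame_interval: float) -> float:
    """Convert a pixel shift between two frames into metres per second."""
    return shift * metres_per_pixel / frame_interval


if __name__ == "__main__":
    frame1 = [0, 0, 1, 3, 1, 0, 0, 0]  # intensity profile in the first frame
    frame2 = [0, 0, 0, 0, 1, 3, 1, 0]  # same pattern, moved downstream
    shift = estimate_shift(frame1, frame2, max_shift=3)
    print(shift, surface_velocity(shift, metres_per_pixel=0.05, frame_interval=0.1))
```

In practice the pixel-to-metre scale would come from the camera model described in the abstract rather than from a constant.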

A Shape-based 3D object retrieval and pose estimation scheme for the mobile environment (모바일 기반의 3 차원 객체 검색과 자세 추정을 위한 외형 기반의 인덱스 구축 및 검색 기법)

  • Tak, Yoon-Sik;Hwang, Eenjun
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2009.04a
    • /
    • pp.395-398
    • /
    • 2009
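  • 3D object retrieval and pose estimation are being studied as important issues in various industries such as medicine and security. Accurate object retrieval and pose estimation require using all available image information of an object, which takes considerable computation time; estimating the exact pose of an object in particular requires high CPU performance and a large memory space. Because of these constraints, 3D object retrieval and pose estimation have been difficult to run on mobile devices, whose hardware performance is relatively low. This paper therefore proposes a shape-based index construction and retrieval scheme for a client-server environment that makes object retrieval and pose estimation effective even on mobile devices. The main features of the proposed scheme are i) object retrieval and candidate pose prediction based on a relatively small number of object images, taking the mobile device's hardware environment into account, and ii) accurate pose estimation on the server based on the mobile device's retrieval results and a large number of object images. Experimental results show that the proposed scheme enables accurate object retrieval and pose estimation within a short time.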

Design of Omok AI using Genetic Algorithm and Game Trees and Their Parallel Processing on the GPU (유전 알고리즘과 게임 트리를 병합한 오목 인공지능 설계 및 GPU 기반 병렬 처리 기법)

  • Ahn, Il-Jun;Park, In-Kyu
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.37 no.2
    • /
    • pp.66-75
    • /
    • 2010
  • This paper proposes an efficient method for designing and implementing the artificial intelligence (AI) of the game 'omok' on the GPU. The proposed AI uses a cooperative structure that combines a min-max game tree with a genetic algorithm. Since the evaluation function is computationally intensive but is evaluated independently for many candidates in the solution space, it is computed on the GPU in a massively parallel way. The NVIDIA CUDA implementation and the experimental results show that it significantly outperforms the CPU: the parallel game tree and the genetic algorithm on the GPU run more than 400 times and 300 times faster, respectively, than on the CPU. In the proposed cooperative AI, a selective search using the genetic algorithm is performed after the full search using the game tree, both to search the solution space more efficiently and to avoid thread overflow. Experimental results show that the proposed algorithm significantly enhances the AI and lets it run within the time limit set by the rules of the game.
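
For readers unfamiliar with the min-max game tree mentioned above, a minimal sequential sketch follows. It is a textbook minimax over a nested-list tree whose leaves are already evaluated scores, not the paper's GPU implementation; the `Tree` type and the example tree are assumptions made for illustration.

```python
from typing import List, Union

# A tree is either a leaf score (int) or a list of subtrees.
Tree = Union[int, List["Tree"]]


def minimax(node: Tree, maximizing: bool) -> int:
    """Return the game value of `node` when both players play optimally."""
    if isinstance(node, int):
        return node  # leaf: already scored by the evaluation function
    # Players alternate, so the child level uses the opposite objective.
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


if __name__ == "__main__":
    tree = [[3, 5], [2, 9]]
    print(minimax(tree, maximizing=True))
```

The leaf scores are independent of one another, which is why the abstract can evaluate many candidates in parallel on the GPU.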

A Study on GPU Computing of Bi-conjugate Gradient Method for Finite Element Analysis of the Incompressible Navier-Stokes Equations (유한요소 비압축성 유동장 해석을 위한 이중공액구배법의 GPU 기반 연산에 대한 연구)

  • Yoon, Jong Seon;Jeon, Byoung Jin;Jung, Hye Dong;Choi, Hyoung Gwon
    • Transactions of the Korean Society of Mechanical Engineers B
    • /
    • v.40 no.9
    • /
    • pp.597-604
    • /
    • 2016
  • A CUDA-based parallel algorithm for the bi-conjugate gradient method was developed for parallel computation of the incompressible Navier-Stokes equations. The governing equations were discretized using the splitting P2P1 finite element method. An asymmetric stenotic flow problem was solved to validate the proposed algorithm, and the parallel performance of the GPU was then examined by measuring elapsed times. The GPU performance for sparse matrix-vector multiplication was also investigated with a matrix from a fluid-structure interaction problem. A kernel was written to compute simultaneously the inner product of each row of the sparse matrix with a vector. In addition, the kernel was optimized using both parallel reduction and memory coalescing. In constructing the kernel, the effect of the warp on the parallel performance of the present CUDA implementation was also examined. The present GPU computation was more than 7 times faster than a single CPU in double precision.
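
The row-wise inner products at the heart of sparse matrix-vector multiplication can be sketched sequentially as below. The abstract does not state the storage format, so the compressed sparse row (CSR) layout and the name `csr_matvec` are assumptions; on a GPU each iteration of the outer loop would be handled by its own thread or warp.

```python
from typing import List, Sequence


def csr_matvec(values: Sequence[float], col_idx: Sequence[int],
               row_ptr: Sequence[int], x: Sequence[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = []
    for row in range(len(row_ptr) - 1):
        start, end = row_ptr[row], row_ptr[row + 1]
        # Inner product of one sparse row with x; rows are independent.
        y.append(sum(values[k] * x[col_idx[k]] for k in range(start, end)))
    return y


if __name__ == "__main__":
    # A = [[1, 0, 2],
    #      [0, 3, 0],
    #      [4, 0, 5]]
    values = [1, 2, 3, 4, 5]
    col_idx = [0, 2, 1, 0, 2]
    row_ptr = [0, 2, 3, 5]
    print(csr_matvec(values, col_idx, row_ptr, [1, 2, 3]))
```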

Parallel Range Query processing on R-tree with Graphics Processing Units (GPU를 이용한 R-tree에서의 범위 질의의 병렬 처리)

  • Yu, Bo-Seon;Kim, Hyun-Duk;Choi, Won-Ik;Kwon, Dong-Seop
    • Journal of Korea Multimedia Society
    • /
    • v.14 no.5
    • /
    • pp.669-680
    • /
    • 2011
  • R-trees are widely used in areas such as geographical information systems, CAD systems, and spatial databases to index multi-dimensional data efficiently. As the data sets used in these areas grow in size and complexity, however, range queries on R-trees need to be faster still to meet area-specific constraints. To address this problem, various research efforts have developed strategies for accelerating query processing on R-trees, either by using a buffer mechanism or by parallelizing query processing across multiple disks and processors. As part of these strategies, approaches that parallelize query processing on R-trees with graphics processing units (GPUs) have been explored. Using GPUs can improve performance through faster calculations and fewer disk accesses, but may incur additional overhead from high memory access latencies and the low data exchange rate between GPUs and the CPU. In this paper, to address these overhead problems and exploit GPUs efficiently, we propose a novel approach that uses a GPU as a buffer to parallelize query processing on R-trees. The buffer algorithm can improve performance by reducing the number of disk accesses and by maximizing coalesced memory access, which minimizes GPU memory access latency. Through extensive performance studies, we observed that the proposed approach achieved up to 5 times higher query performance than the original CPU-based R-trees.
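
A range query on an R-tree descends only into nodes whose minimum bounding rectangle (MBR) overlaps the query window. The sketch below shows that pruning on a hand-built tree; it is a plain sequential illustration, not the GPU buffer algorithm of the paper, and the `Node` class, `Rect` layout, and example data are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def overlaps(a: Rect, b: Rect) -> bool:
    """True if two axis-aligned rectangles intersect (touching counts)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class Node:
    mbr: Rect                                        # bounding box of this subtree
    children: List["Node"] = field(default_factory=list)
    entries: List[Tuple[Rect, str]] = field(default_factory=list)  # leaf data


def range_query(node: Node, query: Rect) -> List[str]:
    """Collect labels of all entries whose rectangle overlaps `query`."""
    if not overlaps(node.mbr, query):
        return []  # prune: nothing below this node can match
    hits = [label for rect, label in node.entries if overlaps(rect, query)]
    for child in node.children:
        hits.extend(range_query(child, query))
    return hits


if __name__ == "__main__":
    leaf1 = Node((0, 0, 2, 2), entries=[((0, 0, 1, 1), "a"), ((1.5, 1.5, 2, 2), "b")])
    leaf2 = Node((5, 5, 8, 8), entries=[((5, 5, 6, 6), "c")])
    root = Node((0, 0, 8, 8), children=[leaf1, leaf2])
    print(range_query(root, (0.5, 0.5, 1.2, 1.2)))
```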