• Title/Summary/Keyword: container orchestration (컨테이너 오케스트레이션)

Technique to Reduce Container Restart for Improving Execution Time of Container Workflow in Kubernetes Environments (쿠버네티스 환경에서 컨테이너 워크플로의 실행 시간 개선을 위한 컨테이너 재시작 감소 기법)

  • Taeshin Kang; Heonchang Yu
    • The Transactions of the Korea Information Processing Society / v.13 no.3 / pp.91-101 / 2024
  • Container virtualization technology ensures the consistency and portability of data-intensive, memory-volatile workflows. Kubernetes serves as the de facto standard for orchestrating these container applications. Cloud users often overprovision container applications to avoid container restarts caused by resource shortages. However, overprovisioning decreases CPU and memory resource utilization. To address this issue, oversubscription of container resources is commonly employed, although excessive oversubscription of memory resources can lead to a cascade of container restarts due to node memory scarcity. Container restarts can reset operations and impose substantial overhead on containers with high memory volatility that include numerous stateful applications. This paper proposes a technique to mitigate container restarts in a memory oversubscription environment based on Kubernetes. The proposed technique identifies containers that are likely to request memory allocation on nodes experiencing high memory usage and temporarily pauses these containers. By significantly reducing the CPU usage of containers, an effect similar to a paused state is achieved. The suspension of the identified containers is released once it is determined that the corresponding node's memory usage has been reduced. When executing a highly memory-volatile workflow in a Kubernetes environment, the proposed method reduced the number of container restarts by an average of 40% and a maximum of 58% compared with running without it. Furthermore, the total execution time of the container workflow decreased by an average of 7% and a maximum of 13% due to the reduced frequency of container restarts.
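
The pause-and-release behavior described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Container` class, the `memory_growing` flag, the `update_pauses` function, and the 0.90 and 0.70 thresholds are all hypothetical names and values chosen for illustration, and the abstract does not state how the paper detects containers likely to request memory or how it throttles CPU.

```python
from dataclasses import dataclass


@dataclass
class Container:
    name: str
    # Hypothetical signal: True if the container is expected to request more memory.
    memory_growing: bool
    paused: bool = False


def update_pauses(node_mem_usage, containers, high=0.90, low=0.70):
    """Update pause flags for one node and return the names of paused containers.

    node_mem_usage is the node's memory usage as a fraction (0.0 to 1.0).
    The high/low thresholds are assumed values, not taken from the paper.
    """
    for c in containers:
        if node_mem_usage >= high and c.memory_growing:
            # Node is under memory pressure: pause containers likely to allocate more.
            c.paused = True
        elif node_mem_usage <= low:
            # Memory pressure has eased: release the pause.
            c.paused = False
        # Between the two thresholds the previous state is kept (hysteresis).
    return [c.name for c in containers if c.paused]
```

Using two thresholds rather than one is a design choice of this sketch: it keeps a container from being paused and released repeatedly when node memory usage hovers near a single cutoff.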

Dynamic Resource Adjustment Operator Based on Autoscaling for Improving Distributed Training Job Performance on Kubernetes (쿠버네티스에서 분산 학습 작업 성능 향상을 위한 오토스케일링 기반 동적 자원 조정 오퍼레이터)

  • Jeong, Jinwon; Yu, Heonchang
    • KIPS Transactions on Computer and Communication Systems / v.11 no.7 / pp.205-216 / 2022
  • One of the many tools used for distributed deep learning training is Kubeflow, which runs on Kubernetes, a container orchestration tool. TensorFlow jobs can be managed using the existing operator provided by Kubeflow. However, for distributed deep learning training jobs based on the parameter server architecture, the scheduling policy used by the existing operator does not consider the task affinity of the distributed training job and does not provide the ability to dynamically allocate or release resources. This can lead to long job completion times and low resource utilization. Therefore, in this paper we propose a new operator that efficiently schedules distributed deep learning training jobs to minimize job completion time and increase resource utilization. We implemented the new operator by modifying the existing operator and conducted experiments to evaluate its performance. The experimental results showed that our scheduling policy reduced the average job completion time by up to 84% and increased average CPU utilization by up to 92%.

Deep Learning OCR based document processing platform and its application in financial domain (금융 특화 딥러닝 광학문자인식 기반 문서 처리 플랫폼 구축 및 금융권 내 활용)

  • Dongyoung Kim; Doohyung Kim; Myungsung Kwak; Hyunsoo Son; Dongwon Sohn; Mingi Lim; Yeji Shin; Hyeonjung Lee; Chandong Park; Mihyang Kim; Dongwon Choi
    • Journal of Intelligence and Information Systems / v.29 no.1 / pp.143-174 / 2023
  • With the development of deep learning technologies, Artificial Intelligence-powered Optical Character Recognition (AI-OCR) has evolved to accurately read multiple languages from various forms of images. For the financial industry, where a large number of diverse documents are processed manually, the potential for using AI-OCR is great. In this study, we present the configuration and design of an AI-OCR modality for use in the financial industry and discuss the platform construction with application cases. Since the use of financial domain data is prohibited under the Personal Information Protection Act, we developed a deep learning-based data generation approach and used it to train the AI-OCR models. The AI-OCR models are trained for image preprocessing, text recognition, and language processing and are configured as a microservice-architected platform to process a broad variety of documents. We have demonstrated the AI-OCR platform by applying it to the financial domain tasks of document sorting, document verification, and typing assistance. The demonstrations confirm increased work efficiency and convenience.