• Title/Summary/Keyword: CPU time


Analysis of Worst Case DMA Response Time in Fixed-Priority Bus Arbitration Protocol (고정우선순위 버스 프로토콜 환경에서 DMA I/O 요구의 최악 응답시간 분석)

  • Hahn, Joo-Sun;Ha, Rhan;Min, Sang-Lyul
    • Proceedings of the Korean Information Science Society Conference / 1999.10c / pp.21-23 / 1999
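  • In a fixed-priority bus protocol in which the CPU is assigned the highest priority, DMA transfers are delayed whenever bus requests from the CPU and a DMA controller collide. This paper proposes a technique for analyzing the worst-case response time of DMA I/O requests in an environment where a CPU and multiple DMA controllers share the system bus. The proposed analysis consists of three steps. In the first step, the worst-case bus request pattern is derived for each CPU task running on the CPU. In the second step, the worst-case bus request patterns of these CPU tasks are combined into the worst-case bus request pattern of the CPU as a whole. In the third and final step, the bus capacity available to the DMA controllers is derived from the CPU's worst-case bus request pattern, and the worst-case response time of DMA I/O requests is computed. Simulations show that, for typical DMA transfer sizes, the proposed technique yields safe response times within a 20% error margin.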


Construction of a CPU Cluster and Implementation of a 3-D Domain Decomposition Parallel FDTD Algorithm (CPU 클러스터 구축 및 3차원 공간분할 병렬 FDTD 알고리즘 구현)

  • Park, Sungmin;Chu, Kwang-Uk;Ju, Saehoon;Park, Yoon-Mi;Kim, Ki-Baek;Jung, Kyung-Young
    • The Journal of Korean Institute of Electromagnetic Engineering and Science / v.25 no.3 / pp.357-364 / 2014
  • In this work, we construct a CPU cluster to implement a parallel finite-difference time-domain (FDTD) algorithm for fast electromagnetic analysis. Compared with a single-processor FDTD counterpart, the parallel FDTD algorithm significantly reduces computation time and can analyze electrically larger structures. It requires communication between neighboring processors, which is performed with the MPI (Message Passing Interface) library, and a 3-D domain decomposition is employed to decrease the communication time between neighboring processors. The speed-up factor of the CPU-cluster-based parallel FDTD algorithm relative to a single-processor FDTD is investigated for the normal mode and the hypermode, and finally an electrically large concrete structure is analyzed with the developed parallel algorithm.
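
The abstract does not describe the decomposition itself, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a regular grid split over a `px × py × pz` process grid, with each subdomain exchanging one layer of cells across every face that has a neighbor. The function names (`split`, `subdomain`, `face_cells`) and the example grid size are hypothetical, and no MPI calls are made.

```python
def split(n, parts, idx):
    """Half-open range [start, stop) of cells owned by part `idx` along one axis."""
    base, extra = divmod(n, parts)
    start = idx * base + min(idx, extra)
    return start, start + base + (1 if idx < extra else 0)


def subdomain(shape, procs, rank):
    """Index ranges (x, y, z) owned by `rank` on a px*py*pz process grid."""
    _, py, pz = procs
    ix, rem = divmod(rank, py * pz)
    iy, iz = divmod(rem, pz)
    return tuple(split(n, p, i) for n, p, i in zip(shape, procs, (ix, iy, iz)))


def face_cells(shape, procs, rank):
    """Cells on the faces that `rank` shares with neighbors (one layer per face)."""
    ranges = subdomain(shape, procs, rank)
    sizes = [stop - start for start, stop in ranges]
    total = 0
    for axis in range(3):
        area = 1
        for other in range(3):
            if other != axis:
                area *= sizes[other]
        start, stop = ranges[axis]
        # A face is exchanged only if it is not on the outer boundary of the grid.
        neighbours = (start > 0) + (stop < shape[axis])
        total += neighbours * area
    return total


# Assumed example: a 120^3 grid on 8 processes, slab (1-D) versus cubic (3-D) split.
shape = (120, 120, 120)
worst_1d = max(face_cells(shape, (8, 1, 1), r) for r in range(8))
worst_3d = max(face_cells(shape, (2, 2, 2), r) for r in range(8))
print(worst_1d, worst_3d)
```

Under these assumptions the busiest process exchanges fewer face cells per time step with the cubic split than with the slab split, which is the usual motivation for decomposing in all three dimensions.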

Twowheeled Motor Vehicle License Plate Recognition Algorithm using CPU based Deep Learning Convolutional Neural Network (CPU 기반의 딥러닝 컨볼루션 신경망을 이용한 이륜 차량 번호판 인식 알고리즘)

  • Kim Jinho
    • Journal of Korea Society of Digital Industry and Information Management / v.19 no.4 / pp.127-136 / 2023
  • Many research results have been reported on enforcing traffic laws against illegal driving of two-wheeled motor vehicles by means of license plate recognition. Deep learning convolutional neural networks can be used for character and word recognition of license plates because they generalize better than traditional backpropagation neural networks. The plates of two-wheeled motor vehicles include interdependent government and city words. If mutually independent word recognizers are implemented and error correction rules are applied to the two word recognition results, efficient license plate recognition results can be derived. A CPU-based convolutional neural network that runs in real time without a library has the advantage of low-cost deployment compared with a GPU-based convolutional neural network that relies on a library. This paper introduces a two-wheeled motor vehicle license plate recognition algorithm using a CPU-based deep learning convolutional neural network. The experimental results show that the proposed plate recognizer achieves a 96.2% success rate in real time on outdoor images of two-wheeled motor vehicles.

VTF: A Timer Hypercall to Support Real-time of Guest Operating Systems (VIT: 게스트 운영체제의 실시간성 지원을 위한 타이머 하이퍼콜)

  • Park, Mi-Ri;Hong, Cheol-Ho;Yoo, See-Hwan;Yoo, Chuck
    • Journal of KIISE:Computer Systems and Theory / v.37 no.1 / pp.35-42 / 2010
  • Guest operating systems running on virtual machines share a variety of resources. Because the CPU is allocated in a time-division manner, a guest OS cannot know the physical time. This is not regarded as a serious problem in server virtualization. However, it becomes critical in embedded systems because it prevents a guest OS from executing real-time tasks when it does not occupy the CPU. In this paper, we propose a hypercall that registers a timer service to give notice of timer requests related to real-time tasks. It enables the hypervisor to schedule a virtual machine that has real-time tasks to execute, and allows the guest OS to obtain the CPU on time to support real-time behavior. We implement the hypercall on Xen-Arm and para-virtualized Linux. We also analyze the real-time performance using the response time of a test application and the frames per second of MPlayer.

Algorithm of Parallel Computation for a multi-CPU controlled Robot Manipulator (복수의 CPU로 제어되는 매니퓰레이터의 병렬계산 알고리즘)

  • Woo, Kwang-Bang;Kim, Hyun-Ki;Choi, Gyoo-Suck
    • Proceedings of the KIEE Conference / 1987.07a / pp.288-292 / 1987
  • The purpose of this paper is to develop a parallel computation algorithm that minimizes the completion time of all subtasks under the series-parallel precedence constraints within each subtask. The developed algorithm was applied to the control of a robot manipulator driven by multiple CPUs to obtain a minimum-time schedule so that real-time control may be achieved. The completion time was minimized by applying the "Variable" Branch and Bound algorithm, developed in this paper, to determine the optimum ordered schedule for each CPU.


Implementation and Performance Evaluation of the Faddev-Leverrier Algorithm using GPGPU (GPGPU를 이용한 파데브-레브리어 알고리즘 구현 및 성능 분석)

  • Park, Yong-Hun;Kim, Cheol-Hong;Kim, Jong-Myon
    • IEMEK Journal of Embedded Systems and Applications / v.8 no.3 / pp.171-178 / 2013
  • In this paper, we implement the Faddeev-Leverrier algorithm using GPGPU (General-Purpose Graphics Processing Unit) to accelerate singular value decomposition. In addition, we compare the performance of the algorithm on a CPU and on a CPU plus GPGPU for eleven $n{\times}n$ matrix sizes, where $n$ = 4, 8, 16, 32, 64, 128, 256, 512, 1,024, 2,048, and 4,096. Experimental results indicate that the CPU achieves better performance than the CPU plus GPGPU for $n{\leq}64$ because of the large number of read and write operations between the CPU and the GPGPU. However, the CPU plus GPGPU outperforms the CPU exponentially in execution time for $n{\geq}64$.
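
For orientation, the textbook Faddeev-LeVerrier recurrence computes the coefficients of the characteristic polynomial $\det(xI - A)$ from repeated matrix products and traces. The sketch below is a plain sequential version with exact rational arithmetic, not the GPGPU implementation evaluated in the paper. How the paper applies the recurrence to singular values is not stated in the abstract, and the helper names `matmul` and `faddeev_leverrier` are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def faddeev_leverrier(a):
    """Coefficients [c_n, ..., c_0] of det(x*I - A), with c_n = 1."""
    n = len(a)
    A = [[Fraction(x) for x in row] for row in a]
    # M_1 = I; afterwards M_{k+1} = A @ M_k + c_{n-k} * I.
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        # c_{n-k} = -trace(A @ M_k) / k
        c = -sum(AM[i][i] for i in range(n)) / k
        coeffs.append(c)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs


# Example: [[1, 2], [3, 4]] has trace 5 and determinant -2,
# so det(x*I - A) = x^2 - 5x - 2.
print(faddeev_leverrier([[1, 2], [3, 4]]))
```

Each iteration is dominated by one matrix product, which is the kind of step a GPU can parallelize. The CPU-GPU transfers around each step are consistent with the abstract's explanation of why small matrices run faster on the CPU alone.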

Enhancement of H.264/AVC Encoding Speed and Reduction of CPU Load through Parallel Programming Based on CUDA (CUDA 기반의 병렬 프로그래밍을 통한 H.264/AVC 부호화 속도 향상 및 CPU 부하 경감)

  • Jang, Eun-Been;Ha, Yun-Su
    • Journal of Advanced Marine Engineering and Technology / v.34 no.6 / pp.858-863 / 2010
  • To enhance encoding speed in video encoding with H.264/AVC, it is very important to reduce the time spent on motion estimation, which takes a large portion of the processing time. Using a graphics processing unit (GPU) as a coprocessor to assist the central processing unit (CPU) in processing massive data is one way to reduce the processing time. In this paper, we present an efficient block-level parallel algorithm for motion estimation (ME) on the Compute Unified Device Architecture (CUDA) platform, which was developed for general-purpose computation on GPUs. Experiments are carried out to verify the effectiveness of the proposed algorithm.
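
To illustrate why motion estimation parallelizes at the block level, here is a minimal sequential sketch of full-search block matching with the sum of absolute differences (SAD) as the cost. It is an assumed textbook formulation, not the paper's CUDA kernel. The function names, the frame layout (lists of rows of integer pixels), and the search range are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, bs):
    """Sum of absolute differences between a block in `cur` and a shifted block in `ref`."""
    total = 0
    for y in range(bs):
        for x in range(bs):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def best_motion_vector(cur, ref, bx, by, bs, search):
    """Full search over [-search, search]^2; returns (cost, dx, dy) of the best match."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose reference block would leave the frame.
            if by + dy < 0 or bx + dx < 0 or by + dy + bs > h or bx + dx + bs > w:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, bs)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Synthetic 8x8 frames: `cur` is `ref` shifted by one pixel in x and y.
ref = [[7 * x + 13 * y for x in range(8)] for y in range(8)]
cur = [[ref[(y + 1) % 8][(x + 1) % 8] for x in range(8)] for y in range(8)]
print(best_motion_vector(cur, ref, bx=2, by=2, bs=2, search=2))
```

The search for one block reads only that block and its reference window, so different blocks can be processed independently. That independence is what a block-level parallel scheme on a GPU can exploit.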

A controller design for direct drive arm robot using 32-Bit (MC 68020) CPU (32비트(MC 68020) CPU를 사용한 직접구동방식 로보트의 제어기 설계)

  • 이주장;윤형우;곽윤근
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 1988.10a / pp.82-85 / 1988
  • This paper describes the construction of a controller for a direct-drive arm robot using a 32-bit CPU (MC 68020). The work draws on the extensive experience of the KIT Robotics Laboratory with a 16-bit CPU controller (MC 68008) and with the WHILE languages. We found that this controller is well suited to direct-drive arm robots, both for self-tuning algorithms and for real-time control.


CPU Temperature on Traffic Processing between Two Servers

  • Lee, Sang-Bock;Kim, Hyun-Soo
    • Journal of the Korean Data and Information Science Society / v.16 no.4 / pp.871-877 / 2005
  • The purpose of this paper is to identify CPU temperatures during traffic processing between two servers. To test this model, this research applies a multi-generator and a resource reservation protocol that produce various types of traffic. The empirical results indicate that a CPU temperature of $56^{\circ}C\mp9^{\circ}C$ is suitable when 250-300 traffics with 10-15kb per packet are supplied. In addition, no jitter delay is observed in these cases.


GPU Based Incremental Connected Component Processing in Dynamic Graphs (동적 그래프에서 GPU 기반의 점진적 연결 요소 처리)

  • Kim, Nam-Young;Choi, Do-Jin;Bok, Kyoung-Soo;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.22 no.6 / pp.56-68 / 2022
  • Recently, as the demand for real-time processing has increased, dynamic graphs that change over time have been actively studied. Connected component processing is one of the algorithms used to analyze dynamic graphs. GPUs are suitable for large-scale graph computation because of their high memory bandwidth and computational performance. However, when the connected components of a dynamic graph are computed on a GPU, the GPU's limited memory causes frequent data exchange between the CPU and the GPU during real graph processing. The proposed scheme uses the Weighted-Quick-Union algorithm to process large-scale graphs on the GPU. It supports fast connected component computation by attaching the component size to the connected component label. It computes the connected components by determining which parts must be recalculated and minimizing the data to be transmitted to the GPU. In addition, we propose a processing structure in which the GPU and the CPU execute asynchronously to reduce the data transfer time between them. We demonstrate the effectiveness of the proposed scheme through a performance evaluation on real datasets.
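
As background for the Weighted-Quick-Union algorithm named in the abstract, the sketch below shows the standard sequential union-find with union by size, in which each root carries the size of its component. It is a CPU-only illustration, not the paper's GPU scheme. The class name is hypothetical, and the path-halving step in `find` is a common optimization added here as an assumption.

```python
class WeightedQuickUnion:
    """Union-find with union by size; each root stores its component size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.count = n  # current number of connected components

    def find(self, v):
        while self.parent[v] != v:
            # Path halving (an added optimization, not named in the abstract).
            self.parent[v] = self.parent[self.parent[v]]
            v = self.parent[v]
        return v

    def union(self, a, b):
        """Merge the components of a and b; returns False if already connected."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra          # attach the smaller tree under the larger
        self.size[ra] += self.size[rb]
        self.count -= 1
        return True


# Edges arriving one at a time are absorbed without recomputing from scratch.
uf = WeightedQuickUnion(6)
for u, v in [(0, 1), (2, 3), (1, 3)]:
    uf.union(u, v)
print(uf.count, uf.size[uf.find(0)])
```

Edge insertions fit this structure naturally because each new edge costs one `union`. Edge deletions are not handled by plain union-find, which may be why the abstract mentions determining the parts that must be recalculated.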