• Title/Summary/Keyword: CPU time

Search Result 518, Processing Time 0.029 seconds

Voronoi Diagram Computation for a Molecule Using Graphics Hardware (그래픽 하드웨어를 이용한 분자용 보로노이 다이어그램 계산)

  • Lee, Jung-Eun;Baek, Nak-Hoon;Kim, Ku-Jin
    • The KIPS Transactions:PartA / v.19A no.4 / pp.169-174 / 2012
  • We present an algorithm that computes a three-dimensional Voronoi diagram for a protein molecule. The molecule is represented as a set of spheres with van der Waals radii. The Voronoi diagram is constructed in 3D space by finding the voxels that contain it. To make the computation feasible, we represent the molecule as a BVH (bounding volume hierarchy) and accelerate the system on modern graphics hardware with CUDA programming support. With the space partitioned into $2^{24}$ voxels, experimental results show a computation time 323 times faster than single-core CPU implementations.
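
The voxel-based construction described in this abstract can be illustrated with a minimal brute-force sketch. It is not the authors' BVH/CUDA method: it assumes the distance from a point to an atom is the distance to the sphere centre minus the radius, and the function names (`nearest_atom`, `label_voxels`, `boundary_voxels`) are hypothetical.

```python
import math

def nearest_atom(point, atoms):
    """Return the index of the atom whose surface is closest to `point`.

    Each atom is a tuple (cx, cy, cz, radius). The distance to a sphere is
    taken as the distance to its centre minus its radius (an assumption).
    """
    best_index, best_dist = -1, math.inf
    for i, (cx, cy, cz, r) in enumerate(atoms):
        d = math.dist(point, (cx, cy, cz)) - r
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index

def label_voxels(atoms, lo, hi, n):
    """Label the centres of the n*n*n voxels of the cube [lo, hi]^3."""
    step = (hi - lo) / n
    labels = {}
    for i in range(n):
        for j in range(n):
            for k in range(n):
                centre = (lo + (i + 0.5) * step,
                          lo + (j + 0.5) * step,
                          lo + (k + 0.5) * step)
                labels[(i, j, k)] = nearest_atom(centre, atoms)
    return labels

def boundary_voxels(labels):
    """Voxels that have a face neighbour with a different label.

    These voxels approximate the Voronoi diagram at the grid resolution.
    """
    result = set()
    for (i, j, k), label in labels.items():
        for di, dj, dk in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            neighbour = (i + di, j + dj, k + dk)
            if neighbour in labels and labels[neighbour] != label:
                result.add((i, j, k))
                result.add(neighbour)
    return result

# Two atoms of equal radius, mirrored about the plane x = 0.
atoms = [(-1.0, 0.0, 0.0, 0.5), (1.0, 0.0, 0.0, 0.5)]
labels = label_voxels(atoms, -2.0, 2.0, 4)
boundary = boundary_voxels(labels)
```

Every voxel is tested against every atom here, which is the cost a bounding volume hierarchy and GPU parallelism are meant to reduce.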

A Rate Regulating Proportional-Share Scheduler for Multimedia Tasks (멀티미디어 태스크를 위한 비율조정 비례지분 스케줄러)

  • Gong, Gi-Seok;Kim, Man-Hui;Jo, Si-Hun;Kim, Cheol-Gi;Lee, Jun-Won
    • Journal of KIISE:Computer Systems and Theory / v.26 no.7 / pp.788-812 / 1999
  • This paper presents a proportional-share CPU scheduler that supports multimedia applications in a general-purpose workstation environment. To this end, we extend the stride scheduler, which was originally designed for conventional tasks. New scheduling parameters are introduced to specify the timing requirements of multimedia applications. With the rate regulator, the scheduling accuracy error is reduced to O(1). Separate task groups are proposed to represent both relative and absolute shares. The proposed scheduler is evaluated in a simulation study, and the results show that it achieves improved accuracy, adaptability, and flexibility.
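
For readers unfamiliar with the stride scheduler that this work extends, the following is a minimal sketch of basic stride scheduling only. The rate regulator, the new scheduling parameters, and the task groups from the abstract are not modeled; `STRIDE1` and `stride_schedule` are illustrative names chosen here.

```python
import heapq

STRIDE1 = 1 << 20  # large constant from which integer strides are derived

def stride_schedule(tickets, quanta):
    """Return the order in which tasks receive `quanta` time slices.

    `tickets` maps a task name to a positive ticket count. A task's stride
    is inversely proportional to its tickets; the task with the smallest
    pass value runs next and its pass is advanced by its stride.
    """
    heap = []
    for name, count in sorted(tickets.items()):
        stride = STRIDE1 // count
        # The initial pass equals the stride; ties are broken by task name.
        heapq.heappush(heap, (stride, name, stride))
    order = []
    for _ in range(quanta):
        pass_value, name, stride = heapq.heappop(heap)
        order.append(name)
        heapq.heappush(heap, (pass_value + stride, name, stride))
    return order

order = stride_schedule({"A": 3, "B": 2, "C": 1}, 6)
```

Over the six quanta in this example, the tasks run in proportion to their 3:2:1 ticket allocation.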

Molecular Dynamics Simulation Design and Implementation for Nozzles and Turbines (노즐과 터빈에 대한 분자동력학 시뮬레이션 설계 및 구현)

  • Kim, Su-Hee
    • The Journal of the Korea institute of electronic communication sciences / v.14 no.1 / pp.147-154 / 2019
  • In this research, a molecular dynamics system was designed and developed to calculate the trajectories of molecules in nozzles and turbine blades. The Lennard-Jones potential model was used to approximate the interaction between a pair of molecules, and Verlet integration was used as the numerical method to integrate Newton's equations of motion. For N molecules, the computational complexity of evaluating the Lennard-Jones potential over all pairs, $O(N^2)$, is reduced to $O(N)$ by using a cutoff radius $r_c$. This was implemented to save CPU time.
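
The two ingredients named in this abstract can be sketched as follows, assuming reduced units (`epsilon = sigma = 1`) and an illustrative cutoff of `r_cut = 2.5`; these defaults and the function names are assumptions, not values from the paper. The sketch shows only the truncated pair interaction and one position-Verlet update. Reaching $O(N)$ in practice also requires a cell list or neighbor list so that distant pairs are never visited, which is not shown.

```python
def lj_potential(r, epsilon=1.0, sigma=1.0, r_cut=2.5):
    """Lennard-Jones pair potential, truncated to zero at the cutoff."""
    if r >= r_cut:
        return 0.0
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def lj_force(r, epsilon=1.0, sigma=1.0, r_cut=2.5):
    """Radial force -dU/dr (positive means repulsive), zero past the cutoff."""
    if r >= r_cut:
        return 0.0
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r

def verlet_step(x, x_prev, accel, dt):
    """One position-Verlet update: x_next = 2*x - x_prev + accel * dt^2."""
    return 2.0 * x - x_prev + accel * dt * dt

# The potential minimum lies at r = 2^(1/6) * sigma, where the force vanishes.
r_min = 2.0 ** (1.0 / 6.0)
u_min = lj_potential(r_min)
f_min = lj_force(r_min)

# Verlet reproduces motion under constant acceleration a: x(t) = a * t^2 / 2.
a, dt = 2.0, 0.1
x_next = verlet_step(0.0, 0.5 * a * dt * dt, a, dt)
```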

Multi-disciplinary Optimization of Composite Sandwich Structure for an Aircraft Wing Skin Using Proper Orthogonal Decomposition (적합직교분해법을 이용한 항공기 날개 스킨 복합재 샌드위치 구조의 다분야 최적화)

  • Park, Chanwoo;Kim, Young Sang
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.47 no.7 / pp.535-540 / 2019
  • The coupling between different models for MDO (Multi-disciplinary Optimization) greatly increases the complexity of the computational framework, while at the same time increasing CPU time and memory usage. To overcome these difficulties, POD (Proper Orthogonal Decomposition) and RBF (Radial Basis Function) are used to solve the optimization problem of determining the thickness of composites and sandwich cores when composite sandwich structures are used as aircraft wing skin materials. POD and RBF are used to construct surrogate models for the wing shape and the load data. Optimization is performed using the objective function and constraint function values which are obtained from the surrogate models.

A Light Weighted Robust Korean Morphological Analyzer for Korean-to-English Mobile Translator (한영 모바일 번역기를 위한 강건하고 경량화된 한국어 형태소 분석기)

  • Yuh, Sang-Hwa
    • Journal of the Korea Society of Computer and Information / v.14 no.2 / pp.191-199 / 2009
  • In this paper we present a lightweight, robust Korean morphological analyzer for mobile devices such as mobile phones, smart phones, and PDA phones. Such devices are poorly suited to natural language interfaces because of their low CPU performance and limited memory. To overcome these difficulties we propose 1) an online analysis that uses a Key Event Handler mechanism and 2) a robust analysis of Korean sentences containing spacing errors, without a correction pre-processing step. We apply the proposed analyzer to a Korean-English mobile translator, which shows a 5.8% reduction in memory usage and a 19.0% improvement in average response time.

Parallel Computation For The Edit Distance Based On The Four-Russians' Algorithm (4-러시안 알고리즘 기반의 편집거리 병렬계산)

  • Kim, Young Ho;Jeong, Ju-Hui;Kang, Dae Woong;Sim, Jeong Seop
    • KIPS Transactions on Computer and Communication Systems / v.2 no.2 / pp.67-74 / 2013
  • Approximate string matching problems have been studied in diverse fields. Recently, fast approximate string matching algorithms have been used to reduce the time and cost of next-generation sequencing. To measure the amount of error between two strings, we use a distance function such as the edit distance. Given two strings $X$ ($|X| = m$) and $Y$ ($|Y| = n$) over an alphabet $\Sigma$, the edit distance between $X$ and $Y$ is the minimum number of edit operations needed to convert $X$ into $Y$. The edit distance between $X$ and $Y$ can be computed using the well-known dynamic programming technique in $O(mn)$ time and space. It can also be computed using the Four-Russians' algorithm, whose preprocessing step runs in $O((3|\Sigma|)^{2t} t^2)$ time and $O((3|\Sigma|)^{2t} t)$ space and whose computation step runs in $O(mn/t)$ time and $O(mn)$ space, where $t$ is the block size. In this paper, we present a parallelized version of the computation step of the Four-Russians' algorithm. Our algorithm computes the edit distance between $X$ and $Y$ in $O(m+n)$ time using $m/t$ threads. We implemented both the sequential version and our parallelized version of the Four-Russians' algorithm using CUDA to compare their execution times. When $t = 1$ and $t = 2$, our algorithm runs about 10 times and 3 times faster than the sequential algorithm, respectively.
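
The dynamic programming baseline mentioned in this abstract, which takes $O(mn)$ time and space, can be sketched as below. This is the textbook recurrence with unit-cost insertion, deletion, and substitution. It is neither the Four-Russians' algorithm nor the parallel version the paper proposes, and the function name `edit_distance` is chosen here for illustration.

```python
def edit_distance(x, y):
    """Edit distance between strings x and y using the full (m+1) x (n+1) table."""
    m, n = len(x), len(y)
    # d[i][j] is the edit distance between x[:i] and y[:j].
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all i characters of x[:i]
    for j in range(n + 1):
        d[0][j] = j  # insert all j characters of y[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            substitution = 0 if x[i - 1] == y[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,                  # deletion
                          d[i][j - 1] + 1,                  # insertion
                          d[i - 1][j - 1] + substitution)   # match or substitution
    return d[m][n]

distance = edit_distance("kitten", "sitting")
```

Each cell depends only on its left, upper, and upper-left neighbours, so the cells on one anti-diagonal are independent of each other. This property is what makes parallel evaluation of the table possible.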

Ubiquitous Workspace Synchronization in a Cloud-based Framework (클라우드 기반 프레임워크에서 유비쿼터스 워크스페이스 동기화)

  • Elijorde, Frank I.;Yang, Hyunho;Lee, Jaewan
    • Journal of Internet Computing and Services / v.14 no.1 / pp.53-62 / 2013
  • It is common among users to have multiple computing devices as well as to access their files or do work at different locations. To achieve file consistency as well as mobility in this scenario, an efficient approach for workspace synchronization should be used. However, file synchronization alone cannot guarantee the mobility of work environment which allows activities to be resumed at any place and time. This paper proposes a ubiquitous synchronization approach which provides cloud-based access to a user's workspace. Efficient synchronization is achieved by combining session monitoring with file system management. Experimental results show that the proposed mechanism outperforms Cloud Master-replica Synchronization in terms of number of I/O operations, CPU utilization, as well as the average and maximum latencies in responding to client requests.

Analysis of Job Scheduling and the Efficiency for Multi-core Mobile GPU (멀티코어형 모바일 GPU의 작업 분배 및 효율성 분석)

  • Lim, Hyojeong;Han, Donggeon;Kim, Hyungshin
    • Journal of the Korea Academia-Industrial cooperation Society / v.15 no.7 / pp.4545-4553 / 2014
  • Mobile GPUs have driven the rapid development of smart phone graphics technology. Most recent smart phones are equipped with a high-performance multi-core GPU. How efficiently a multi-core mobile GPU can be utilized will be a critical issue for improving smart phone performance. However, most current research has focused on single-core mobile GPUs, and studies of multi-core mobile GPUs are rare. In this paper, the job scheduling patterns and the efficiency of a multi-core mobile GPU are analyzed. The profiling results show that, despite the higher number of GPU cores, the total processing time required for certain graphics applications increased. In addition, when the GPU is processing 3D games, a substantial amount of overhead is caused by communication not only between the CPU and GPU but also among the GPU cores. These results confirm that more active research on multi-core mobile GPUs is needed to optimize present mobile GPUs.

Study on Hydroelastic Analysis of LNGC Cargo by Global-Local Analysis Technique (전역-국부 해석기법에 의한 LNG 운반선 화물창의 유탄성 해석에 관한 연구)

  • Park, Seong-Woo;Cho, Jin-Rae
    • Journal of the Computational Structural Engineering Institute of Korea / v.20 no.1 / pp.83-92 / 2007
  • There are many numerical methods for solving large-scale fluid-structure interaction (FSI) problems. However, these methods require a very fine mesh to achieve reasonable numerical accuracy and stability because of the concentrated and volatile hydrodynamic pressure caused by liquid sloshing. Consequently, numerical analysis of the long-period time response with the desired numerical accuracy is highly time-consuming. The aim of this paper is to suggest a new method for analyzing the hydroelastic behavior of the LNGC containment by using a global-local numerical approach. The reliability of the presented method is first examined, and its efficiency is then demonstrated by showing that the long-period local responses of the LNGC containment are obtained with relatively short CPU time.

Performance Enhancement and Evaluation of AES Cryptography using OpenCL on Embedded GPGPU (OpenCL을 이용한 임베디드 GPGPU환경에서의 AES 암호화 성능 개선과 평가)

  • Lee, Minhak;Kang, Woochul
    • KIISE Transactions on Computing Practices / v.22 no.7 / pp.303-309 / 2016
  • Recently, an increasing number of embedded processors, such as ARM Mali, have begun to support GPGPU programming frameworks such as OpenCL. Thus, GPGPU technologies that have been used in PC and server environments are beginning to be applied to embedded systems. However, many embedded systems have different architectural characteristics compared to traditional PCs, and low power consumption and real-time performance are also important performance metrics in these systems. In this paper, we implement a parallel AES cryptographic algorithm for a modern embedded GPU using OpenCL, a standard parallel computing framework, and compare its performance against various baselines. Experimental results show that the parallel GPU AES implementation can reduce the response time to about 1/150 and the energy consumption to approximately 1/290 of those of an OpenMP implementation when 1000KB of input data is applied. Furthermore, an additional 100% performance improvement of the parallel AES algorithm was achieved by exploiting the characteristics of embedded GPUs, such as removing data copies between GPU and host memory. Our results also demonstrate that a higher performance improvement can be achieved with larger input data.