• Title/Summary/Keyword: graphics processor (그래픽 프로세서)


Three-dimensional Wave Propagation Modeling using OpenACC and GPU (OpenACC와 GPU를 이용한 3차원 파동 전파 모델링)

  • Kim, Ahreum;Lee, Jongwoo;Ha, Wansoo
    • Geophysics and Geophysical Exploration / v.20 no.2 / pp.72-77 / 2017
  • We calculated 3D frequency- and Laplace-domain wavefields using time-domain modeling followed by a Fourier or Laplace transform. We adopted OpenACC and a GPU for efficient parallel calculation. OpenACC makes it easy to use GPU accelerators by adding directives to conventional C, C++, and Fortran programs. Accordingly, one does not have to learn a new GPGPU programming language such as CUDA or OpenCL to use a GPU. An OpenACC program allocates GPU memory, transfers data between the host CPU and GPU devices, and performs GPU operations either automatically or following user-defined directives. Through numerical tests, we compared the performance of 3D wave propagation modeling programs using OpenACC and a GPU with that of programs using a single CPU core. Results for a homogeneous model and the SEG/EAGE salt model show that the OpenACC programs are approximately 53 and 30 times faster, respectively, than the single-core CPU programs.
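The idea of obtaining a frequency-domain wavefield from time-domain modeling can be sketched as follows. This is a minimal, CPU-only 1D illustration, not the authors' 3D program: the function name, the grid parameters, and the fixed zero boundaries are all assumptions made for the example. The inner loop over grid points is the kind of loop nest that an OpenACC directive would offload in a C or Fortran implementation.

```python
import cmath


def frequency_wavefield_1d(nx, nt, dx, dt, velocity, src_index, omega, source):
    """Time-step a 1D acoustic wave equation and accumulate, on the fly,
    a discrete Fourier transform of the wavefield at angular frequency omega.

    All names and the 1D setting are illustrative assumptions.
    """
    r2 = (velocity * dt / dx) ** 2  # squared Courant number; keep <= 1 for stability
    prev = [0.0] * nx
    curr = [0.0] * nx
    spectrum = [0j] * nx
    for it in range(nt):
        nxt = [0.0] * nx  # end points stay zero (fixed boundaries)
        for i in range(1, nx - 1):
            laplacian = curr[i - 1] - 2.0 * curr[i] + curr[i + 1]
            nxt[i] = 2.0 * curr[i] - prev[i] + r2 * laplacian
        # Inject the source term at one interior grid point.
        nxt[src_index] += dt * dt * source(it * dt)
        prev, curr = curr, nxt
        # Accumulate u(x, t) * exp(-i * omega * t) * dt for the new time level.
        kernel = cmath.exp(-1j * omega * (it + 1) * dt) * dt
        for i in range(nx):
            spectrum[i] += curr[i] * kernel
    return spectrum
```

Replacing the complex exponential with a real damping factor `exp(-s * t)` would accumulate a Laplace-domain wavefield in the same loop.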

A Study on Processing XML Documents (XML 문서 처리에 관한 연구)

  • Kim, Tae Gwon
    • Journal of KIISE / v.43 no.4 / pp.489-496 / 2016
  • XML can effectively express structured and semi-structured data as well as relational databases. XQuery is a query language for retrieving information from XML documents. In this paper, an XQuery composer is designed and implemented; it provides an API for XQuery processors, through which a suitable processor is registered. The composer immediately shows the query results returned by the processor. Because the composer contains an XQuery parser, it can compose XQuery effectively using a variety of dialog boxes designed around the XQuery grammar. Each dialog box is associated with a clause region, which is the region of the parse tree on which the algebra operates. The composer makes it easy to compose path expressions for an XML document because it displays the element tree from the DTD graphically. Path expressions are composed automatically by marking elements in the structural hierarchy and partially specifying the predicate of an element.

PC Operating Systems Today and Tomorrow (PC 운영체제의 오늘과 내일)

  • 유주진
    • The Magazine of the IEIE / v.19 no.4 / pp.1-7 / 1992
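  • Since IBM first introduced its 16-bit PC in 1981, the roughly ten years of the operating system (OS) market up to today can be summed up as an era of unchallenged dominance by Microsoft (MS). However, OS products, which had shown no notable change until now, have been caught up in signs of major change since the start of the 1990s. The 16-bit PC market built on microprocessors such as the 8086 and 80286 has begun shifting rapidly to a 32-bit market built on the 80386 and 80486, and the PC market, formerly dominated by desktops, has begun to segment into portable computers such as laptops and notebooks and next-generation products such as pen computers and multimedia. As the 32-bit era approaches, the former PC usage environment, in which one person worked on one PC, is shifting to an environment that emphasizes networking and multitasking, and the great success of Windows 3.0 has created demand for a GUI (graphical user interface) environment on the IBM PC as well, making the arrival of a new OS for this purpose inevitable. Moreover, as the centralized computing environment built around mainframes is downsized, the role of the PC in network-based environments is being strongly emphasized, and operating systems for this purpose are emerging as a new field. This wave of change, which has advanced rapidly in just one or two years, has inevitably led to the development of diverse operating systems, and the industry's struggle for supremacy in the next-generation PC market has become fierce enough to resemble a war. In the DOS arena, once the exclusive domain of MS, Digital Research, which recently regrouped through its merger with Novell, has joined the fray and declared a showdown, while IBM and MS, partners for ten years, have entered an uncompromising contest of strength for leadership of the high-performance PC era, championing OS/2 and Windows respectively. The two companies also entered an early battle over the OS market for next-generation products such as pen computers and multimedia, and, taking advantage of the melee between IBM and MS, workstation vendors led by Sun Microsystems and the Unix camp are releasing one OS after another aimed at the high-performance PC market, so the OS war of the 1990s is entering a fog in which nothing can be foreseen. This article examines OS development trends and the outlook for the future through the industry's battle over operating systems for next-generation products, from DOS to the 32-bit era, pen computers, and multimedia.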


Design of a Floating Point Unit for 3D Graphics Geometry Engine (3D 그래픽 Geometry Engine을 위한 부동소수점 연산기의 설계)

  • Kim, Myeong Hwm;Oh, Min Seok;Lee, Kwang Yeob;Kim, Won Jong;Cho, Han Jin
    • Journal of the Institute of Electronics Engineers of Korea SD / v.42 no.10 s.340 / pp.55-64 / 2005
  • In this paper, we designed floating-point units to accelerate geometry processing for real-time 3D graphics. The units support the IEEE-754 single-precision format. On a Xilinx Virtex-II device, we confirmed operation at 100 MHz for the floating-point add/multiply unit, 120 MHz for the NR inverse division unit, 200 MHz for the power unit, and 120 MHz for the inverse square root unit. Using these floating-point units, we also designed a geometry processor and confirmed that it processes 3D graphics data.
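The "NR" in the division unit presumably stands for Newton-Raphson. The sketch below shows how Newton-Raphson iteration yields a reciprocal and an inverse square root using only multiplies and subtracts. It is a software illustration under assumptions, not the paper's hardware design: the function names, the initial estimates, and the iteration counts are chosen for the example, and the abstract does not say that the inverse square root unit uses Newton-Raphson.

```python
import math


def nr_reciprocal(d, iterations=4):
    """Approximate 1/d for d > 0 with the iteration x <- x * (2 - m * x)."""
    m, e = math.frexp(d)  # d = m * 2**e with 0.5 <= m < 1
    x = 48.0 / 17.0 - 32.0 / 17.0 * m  # linear initial estimate of 1/m
    for _ in range(iterations):
        x = x * (2.0 - m * x)  # relative error roughly squares each step
    return math.ldexp(x, -e)  # undo the scaling: 1/d = (1/m) * 2**-e


def nr_inverse_sqrt(v, iterations=8):
    """Approximate 1/sqrt(v) for v > 0 with y <- y * (1.5 - 0.5 * m * y * y)."""
    m, e = math.frexp(v)
    if e % 2:  # make the exponent even so that sqrt(2**e) is exact
        m *= 2.0
        e -= 1
    y = 1.0  # crude initial estimate; valid because m < 2 keeps m * y * y < 3
    for _ in range(iterations):
        y = y * (1.5 - 0.5 * m * y * y)
    return math.ldexp(y, -(e // 2))
```

A hardware unit would typically replace the crude starting values with a small lookup table so that fewer iterations are needed.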

IPC-based Dynamic SM management on GPGPU for Executing AES Algorithm

  • Son, Dong Oh;Choi, Hong Jun;Kim, Cheol Hong
    • Journal of the Korea Society of Computer and Information / v.25 no.2 / pp.11-19 / 2020
  • Modern GPUs can execute general-purpose computation on the graphics processing unit and provide high performance by exploiting their many cores. Running the AES algorithm efficiently requires parallel computational resources. The computational resources of a CPU architecture are insufficient for cryptographic algorithms such as AES, whereas a GPU architecture offers massively parallel computational resources. This paper therefore reduces AES execution time by employing the parallel computational resources of a GPGPU. However, AES cannot fully utilize these resources because it is not well suited to the GPGPU architecture. We propose an IPC-based dynamic SM management technique to execute AES efficiently on a GPGPU. The technique increases or decreases the number of active SMs at run time according to the measured IPC. Simulation results show that the proposed technique improves performance by raising resource utilization compared with a baseline GPGPU architecture; AES performance improves by 41.2% on average.
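The abstract does not describe the control policy itself, so the following is only a hypothetical sketch of what adjusting the number of active SMs from measured IPC could look like. The hill-climbing rule, the function name, and the SM limits are all assumptions for illustration and should not be read as the paper's algorithm.

```python
def next_active_sms(active, prev_ipc, curr_ipc, direction, min_sms=1, max_sms=16):
    """Pick the SM count for the next sampling interval (hypothetical policy).

    direction is +1 or -1, the step applied in the previous interval.
    If IPC fell after that step, the step is reversed; otherwise it is kept.
    """
    if curr_ipc < prev_ipc:
        direction = -direction  # the last change hurt IPC, so go the other way
    new_active = max(min_sms, min(max_sms, active + direction))
    return new_active, direction
```

For example, with 8 active SMs and IPC rising from 1.0 to 1.2 after an increase, the sketch enables a ninth SM; if IPC had fallen instead, it would drop to 7.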

Design and Implementation of the Semantic Query Adapter(SQA) in the Semantic Web Service Environment (시맨틱 웹 서비스 환경에서 시맨틱 질의 어댑터의 설계 및 구현)

  • Jo Myung Hyun;Son Jin Hyun
    • The KIPS Transactions:PartB / v.12B no.2 s.98 / pp.191-202 / 2005
  • Semantic Web Services are a next-generation Web technology that supports Web services on the basis of semantic Web technologies. Until now, research on semantic Web services has focused on semantic Web document management and on inference engines that process semantic queries efficiently. However, realizing a genuine semantic Web environment also requires a semantic query interface through which users and agents can efficiently request semantic information. In this regard, we propose the Semantic Query Adapter (SQA) to provide users with high query transparency, especially when querying complex semantic information. We first design a procedural user query interface based on a graphical view by analyzing DAML-S Profile documents. We then build a module that transforms a user's input query into the corresponding RDQL query. We also propose a multiple semantic query generation procedure as a new method for solving the disjunctive query problem of the RDQL primitives.

Study on LLVM application in Parallel Computing System (병렬 컴퓨팅 시스템에서 LLVM 응용 연구)

  • Cho, Jungseok;Cho, Doosan;Kim, Yongyeon
    • The Journal of the Convergence on Culture Technology / v.5 no.1 / pp.395-399 / 2019
  • To support various parallel computing systems, LLVM IR needs to be extended to handle vectors and matrices more efficiently, and a new algorithm is needed for lowering LLVM IR to machine code. As the IR example shows, RISC instructions are generated naturally because the IR is itself essentially composed of RISC-like instructions, whereas vector instructions are not supported. New IR structures, instruction generation algorithms, and related extensions are needed to support vectors and matrices more robustly. To this end, it is important to map each instruction in the LLVM IR to the appropriate instruction of the target (vector/matrix) architecture, which is the task of the instruction selection algorithm. Efficient mapping requires understanding the meaning of each LLVM IR instruction, comparing it with the meaning and syntax of each target instruction, and selecting the instruction that matches the pattern.

Performance Evaluation of Workstation System within ATM Integrated Service Switching System using Mean Value Analysis Algorithm (MVA 알고리즘을 이용한 ATM 기반 통합 서비스 교환기 내 워크스테이션의 성능 평가)

  • Jang, Seung-Ju;Kim, Gil-Yong;Lee, Jae-Hum;Park, Ho-Jin
    • Journal of KIISE:Computing Practices and Letters / v.6 no.4 / pp.421-429 / 2000
  • At present, ATM integrated switching systems are being developed as a mix of modules: a complex switching system that includes maintenance and operation functions based on B-ISDN/LAN services, and a plug-in module that runs on a workstation computer system. The workstation provides HMI operation system features, including file system management, time management, graphic processing, and a TMN agent function, and it communicates with both the ATM switching module and clients. This architecture places a heavy message communication burden on processes and processors. Message communication consumes system resources such as sockets, message queues, I/O device files, and regular files. In this paper, we therefore propose a new performance model for this system architecture, with which we analyze system bottlenecks and improve system performance. In addition, because many additional features are expected to migrate to the workstation in the future, bottlenecks need to be evaluated in advance so that the system can be redesigned. The performance model is a queueing network model, and the simulation uses the PDQ package and a C program.
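The Mean Value Analysis named in the title can be sketched for the simplest case, a closed single-class queueing network. This is the textbook exact MVA recursion, not the paper's model or the PDQ package's API: the function name, the station layout, and any service demands passed to it are assumptions for illustration.

```python
def mva(service_demands, n_customers, think_time=0.0):
    """Exact MVA for a closed, single-class network of queueing stations.

    service_demands[k] is the total service time one job needs at station k.
    Returns (throughput, residence_times, queue_lengths) at n_customers.
    """
    queue = [0.0] * len(service_demands)
    residence = [0.0] * len(service_demands)
    throughput = 0.0
    for n in range(1, n_customers + 1):
        # An arriving job sees the queue lengths of the network with n - 1 jobs.
        residence = [d * (1.0 + q) for d, q in zip(service_demands, queue)]
        throughput = n / (think_time + sum(residence))
        # Little's law applied to each station.
        queue = [throughput * r for r in residence]
    return throughput, residence, queue
```

The station with the largest service demand ends up with the longest queue as the population grows, which is how such a model points to the bottleneck resource.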


An Efficient 2-dimensional Addressing Mode for Image Processor (영상처리용 프로세서를 위한 효율적인 이차원 어드레스 지정 기법)

  • Go, Yun-Ho;Yun, Byeong-Ju;Kim, Seong-Dae
    • Journal of the Institute of Electronics Engineers of Korea SP / v.38 no.5 / pp.486-497 / 2001
  • In this paper, we propose a new addressing mode that allows a programmable image processor to perform image-processing algorithms effectively. Conventional addressing modes are suitable for one-dimensional data such as voice, whereas the proposed addressing mode takes the two-dimensional characteristics of image data into account. The proposed instruction for two-dimensional addressing requires two operands to specify a pixel and does not require any change to the memory architecture. The proposed two-dimensional addressing mode has the following advantages. The proposed instruction combines the several instructions otherwise needed to load pixel data from external memory into a register. It therefore reduces the required code size, which helps meet the high-performance and low-power requirements of an image processor. In addition, it exploits the inherent two-dimensional characteristics of image data and offers a user-friendly instruction to the assembly programmer. The proposed two-dimensional addressing mode is applicable to DSPs, media processors, graphics devices, and so on. We present both the concept of the two-dimensional addressing mode and an efficient hardware implementation of it.
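What a two-operand pixel load has to compute can be sketched in software as follows. The row-major layout, the function names, and the parameters are assumptions for illustration; the abstract does not give the exact address formula used by the hardware.

```python
def pixel_address(base, x, y, row_stride, bytes_per_pixel=1):
    """Linear address of pixel (x, y) in a row-major image (assumed layout)."""
    return base + y * row_stride + x * bytes_per_pixel


def load_pixel(memory, base, x, y, row_stride):
    """Emulate a single two-operand load: (x, y) in, one pixel value out."""
    return memory[pixel_address(base, x, y, row_stride)]
```

With a conventional one-dimensional addressing mode, the multiply, the two adds, and the load would each be a separate instruction, which is the code-size saving the abstract refers to.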


CUDA-based Parallel Bi-Conjugate Gradient Matrix Solver for BioFET Simulation (BioFET 시뮬레이션을 위한 CUDA 기반 병렬 Bi-CG 행렬 해법)

  • Park, Tae-Jung;Woo, Jun-Myung;Kim, Chang-Hun
    • Journal of the Institute of Electronics Engineers of Korea CI / v.48 no.1 / pp.90-100 / 2011
  • We present a parallel bi-conjugate gradient (Bi-CG) matrix solver for large-scale Bio-FET simulations based on recent graphics processing units (GPUs), which enable large-scale parallel processing at very low cost. The proposed method focuses on solving the Poisson equation in parallel, a task that requires massive computational resources not only in semiconductor simulation but also in other fields, including computational fluid dynamics and heat transfer simulation. Our solver is around 30 times faster than traditional methods based on single-core CPU systems when solving the Poisson equation in a 3D FDM (Finite Difference Method) scheme. The proposed method is implemented and tested in NVIDIA's CUDA (Compute Unified Device Architecture) environment, which enables general-purpose parallel processing on GPUs. Unlike similar GPU-based approaches, which usually apply 32-bit single-precision floating-point arithmetic, we use 64-bit double-precision operations for better convergence. Applications on the CUDA platform are relatively easy to implement but very hard to optimize. In this regard, we also discuss the optimization strategy of the proposed method.
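The Bi-CG iteration itself can be sketched on the CPU as follows. This is a serial, unpreconditioned textbook version applied to a 1D finite-difference Poisson matrix, not the paper's CUDA solver or its 3D scheme; the function names and the test system are assumptions for illustration. Python floats are 64-bit, so the arithmetic is double precision, as in the paper.

```python
def poisson_1d_matvec(v):
    """Apply the 1D finite-difference Poisson matrix tridiag(-1, 2, -1) to v."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def bicg(matvec, matvec_t, b, tol=1e-10, max_iter=200):
    """Solve A x = b given functions that apply A and its transpose."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the initial guess x = 0
    rt = list(r)  # shadow residual
    p, pt = list(r), list(rt)
    rho = dot(rt, r)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        q = matvec(p)
        qt = matvec_t(pt)
        alpha = rho / dot(pt, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rt = [ri - alpha * qi for ri, qi in zip(rt, qt)]
        rho_new = dot(rt, r)
        beta = rho_new / rho
        rho = rho_new
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        pt = [ri + beta * pi for ri, pi in zip(rt, pt)]
    return x
```

The matrix-vector products, dot products, and vector updates in the loop body are the operations a GPU implementation would parallelize. Because the Poisson matrix here is symmetric, the same function serves as both `matvec` and `matvec_t`.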