• Title/Summary/Keyword: OpenCL

Accelerating Fingerprint Enhancement Algorithm on GPGPU using OpenCL (OpenCL을 이용한 GPGPU 기반 지문개선 알고리즘 가속화)

  • Kim, Daehee;Park, Neungsoo
    • The Transactions of The Korean Institute of Electrical Engineers / v.65 no.4 / pp.666-672 / 2016
  • Fingerprints have recently become widely used as a biometric for securing mobile financial applications because of their convenience and high recognition rate. However, applying fingerprint algorithms to finance and security applications requires further improvement in both recognition rate and processing speed. In this paper, we propose a parallel fingerprint enhancement algorithm for general-purpose computing on graphics processing units (GPGPU) using OpenCL. We analyze the parallelism in the fingerprint algorithm and explore the optimization parameters of the parallel version to improve performance. Experimental results show that the parallel fingerprint enhancement algorithm on GPGPUs ran 29.4 to 69.2 times faster than the original algorithm on the host CPUs.

Fingerprint enhancement acceleration using OpenCL (OpenCL을 이용한 지문개선 가속화)

  • Ko, Sunghak;Lee, Chul;Park, Neungsoo
    • Proceedings of the Korea Information Processing Society Conference / 2014.11a / pp.115-117 / 2014
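  • With the recent emergence of heterogeneous parallel computing frameworks such as OpenCL and CUDA, research on heterogeneous parallel processing of computation-intensive algorithms is increasing. In this paper, we parallelize and optimize the computation-heavy fingerprint enhancement algorithm using OpenCL to reduce its computation time. To this end, we parallelized the 2D FFT and filtering algorithms and applied techniques such as loop unrolling and memory access optimization. Experiments confirmed that the fingerprint enhancement algorithm using the improved acceleration technique achieved up to a 25-fold performance improvement over sequential processing on a CPU.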
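
The abstract above does not say how the 2D transform was split across the device, so the following is only a minimal sketch of one common approach, row-column decomposition, written in plain Python rather than OpenCL. The function names `dft_1d` and `dft_2d` are hypothetical, and a naive O(n^2) DFT stands in for a real FFT to keep the example short. The point it illustrates is that every row in the first pass, and every column in the second, is independent and could be handled by a separate work-group.

```python
import cmath

def dft_1d(values):
    """Naive O(n^2) discrete Fourier transform of one row or column."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def dft_2d(image):
    """2D DFT by row-column decomposition: transform rows, then columns."""
    # Pass 1: each row is independent of the others (parallel across rows).
    rows = [dft_1d(row) for row in image]
    # Pass 2: each column of the intermediate result is independent as well.
    cols = [dft_1d(list(col)) for col in zip(*rows)]
    # Transpose back so the result is indexed [row][column] again.
    return [list(row) for row in zip(*cols)]

# An impulse at the origin should transform to a flat spectrum.
impulse = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
spectrum = dft_2d(impulse)
```

On a GPU the two list comprehensions would become two kernel launches, with the transpose between them being the step where memory access patterns tend to matter most.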

A Benchmark of Hardware Acceleration Technology for Real-time Simulation in Smart Farm (CUDA vs OpenCL) (스마트 시설환경 실시간 시뮬레이션을 위한 하드웨어 가속 기술 분석)

  • Min, Jae-Ki;Lee, DongHoon
    • Proceedings of the Korean Society for Agricultural Machinery Conference / 2017.04a / pp.160-160 / 2017
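  • As Korean-style smart farms advance rapidly through automation technology, demand is growing for intelligent observation and analysis of smart facility environments to enable unmanned operation. The time-series data obtainable in a smart facility environment are diverse, including temperature, humidity, illuminance, CO2, soil moisture, and ventilation rate. Although the system boundary is clear, the nature of these attributes makes accurate estimation or prediction in the time and spatial domains difficult. Implementing the intelligent management technologies increasingly applied to facility environments requires fast and accurate quantification of time-series spatial data. Among the various methods attempted to meet this requirement, we experimentally constructed a multi-point measurement matrix to improve spatial resolution. A $3{\times}3{\times}3$ three-dimensional measurement matrix for environmental factors was installed in a multi-span strawberry greenhouse with a cross-section of $50m{\times}100m$. Four environmental factors (temperature, humidity, illuminance, and CO2) were measured at 1 Hz, and, in parallel with the measurements, the environmental factors at unmeasured points were estimated in real time using spatial statistics. To achieve a 50 cm spatial resolution, Kriging interpolation was first applied to the transverse cross-sections and then to the longitudinal cross-sections. On a computer with a 3 GHz processor, analyzing the data acquired in one second took about 15 seconds. This is due to the algorithm's very high time complexity (on the order of $O(n^3)$), which limits its ability to support the management of diverse facility environments. To address the technical challenge of performing computations with high time complexity in real time, we compared the CUDA engine provided by NVIDIA, which has attracted growing interest, with the OpenCL engine, which originated from a proposal by Apple and is provided by the Khronos Group, an open software development consortium. The CUDA engine is a library that offloads only the computation-intensive parts of an analysis program to the GPU (Graphics Processing Unit) to produce results quickly, and it can be used only when the corresponding hardware is available. OpenCL, by contrast, provides a hardware-independent library to overcome CUDA's restriction to specific hardware, and aims to compensate for low hardware performance by combining with clustering technology. In this study, we compared a setup using CUDA 8.0 (https://developer.nvidia.com/cuda-downloads) with a Pascal Titan X (NVIDIA, CA, USA) against a setup using OpenCL 1.2 (https://www.khronos.org/opencl/) with an ODROID-XU4 (Hardkernel, AnYang, Korea) equipped with a Samsung Exynos5422 chip. For a four-dimensional matrix ($100{\times}200{\times}5{\times}4$) corresponding to the 50 cm spatial resolution, quantized for integer indexing, the CUDA engine took about 1 second and the OpenCL engine about 5 seconds. The CUDA setup, however, cost about 10 times more and consumed more than 20 times as much power. We will therefore first pursue optimization of OpenCL-based hardware acceleration to address the technical challenges of introducing real-time simulation of smart facility environments.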
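
The trade-off reported above can be made concrete with a little arithmetic. The sketch below is illustrative only: it normalizes everything to the OpenCL/ODROID-XU4 setup, treats the "more than 20 times" power figure as a lower bound of exactly 20, and assumes that figure refers to average power draw during the computation (the abstract does not say). The dictionary keys and the function name `relative_energy` are hypothetical.

```python
# Figures quoted in the abstract, normalized to the OpenCL/ODROID-XU4 setup.
opencl = {"time_s": 5.0, "relative_power": 1.0, "relative_cost": 1.0}
# Assumption: "more than 20 times" is taken as a lower bound of 20.
cuda = {"time_s": 1.0, "relative_power": 20.0, "relative_cost": 10.0}

def relative_energy(setup):
    """Energy per analysis in arbitrary units: power multiplied by run time."""
    return setup["relative_power"] * setup["time_s"]

# How much faster the CUDA setup finishes one analysis.
speedup = opencl["time_s"] / cuda["time_s"]
# How much more energy the CUDA setup spends on that analysis (lower bound).
energy_ratio = relative_energy(cuda) / relative_energy(opencl)

print(f"CUDA speedup over OpenCL: {speedup:.1f}x")
print(f"CUDA energy per analysis vs OpenCL: at least {energy_ratio:.1f}x")
```

Under these assumptions the faster setup still spends several times more energy per analysis, which is consistent with the authors' decision to prioritize optimizing the OpenCL path.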

A Parallel Implementation of JPEG2000 4K Ultra High Definition Image using OpenCL (OpenCL을 이용한 JPEG2000 4K 초고화질 영상처리의 병렬고속화 구현)

  • Park, Daeseung;Kim, Cheong Ghil
    • Journal of Satellite, Information and Communications / v.10 no.1 / pp.1-5 / 2015
  • With fast-growing multimedia technology and users' strong preference for large screens, the newest video coding standard, HEVC (High Efficiency Video Coding), a high-quality video compression standard, has been introduced. As a result, high-definition image services four times clearer than conventional HD video are becoming popular. JPEG 2000 has also started to support 4K and 8K UHD, which calls for fast processing technology to read and write UHD images. This paper presents a study on fast parallel processing for UHD images. First, JPEG 2000 is reviewed, and a GPU-based parallel implementation is proposed for the color conversion stage of preprocessing. The parallelized algorithm is implemented with OpenCL (Open Computing Language). Simulation results show that the proposed method achieves a 5-fold improvement in processing speed for 4K UHD over a thread-based method.
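
Color conversion parallelizes well because each pixel is transformed independently. The abstract does not state which JPEG 2000 color transform was accelerated, so the sketch below assumes the reversible color transform (RCT) and omits DC level shifting; the function names are hypothetical, and plain Python stands in for an OpenCL kernel. Each call to `rct_forward` corresponds to what a single work-item would compute.

```python
def rct_forward(r, g, b):
    """JPEG 2000 reversible color transform for one integer RGB pixel."""
    y = (r + 2 * g + b) >> 2   # floor((R + 2G + B) / 4)
    cb = b - g
    cr = r - g
    return y, cb, cr

def rct_inverse(y, cb, cr):
    """Exact inverse of rct_forward (integer arithmetic, no rounding loss)."""
    g = y - ((cb + cr) >> 2)   # >> floors, also for negative sums
    r = cr + g
    b = cb + g
    return r, g, b

def convert_image(pixels):
    """Apply the transform to every pixel; no pixel depends on another."""
    return [rct_forward(r, g, b) for (r, g, b) in pixels]

pixels = [(255, 0, 0), (0, 255, 0), (12, 34, 56), (255, 255, 255)]
transformed = convert_image(pixels)
restored = [rct_inverse(y, cb, cr) for (y, cb, cr) in transformed]
```

Because the transform uses only integer additions and shifts, the round trip through `rct_inverse` recovers the original pixels exactly, which is what makes it usable for lossless coding.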

A Study on the Chemical Characteristics for the Leachate of Open(Illegal) Dumping Waste Landfill Mixing with Bentonite (벤토나이트 첨가시 불량폐기물매립지의 침출수에 미치는 화학적 특성에 관한 연구)

  • 이재영;노회정
    • Journal of Korea Soil Environment Society / v.4 no.1 / pp.75-83 / 1999
  • The purpose of this study is to investigate the chemical characteristics of leachate from open (illegal) dumping waste. The open dumping waste was mixed with 0, 5, 10, and 15% bentonite by weight in separate lysimeters. The simulation was evaluated by CODcr, ${NO_3}^-$, ${SO_4}^{2-}$, $Cl^-$ and heavy metals in the leachate. As a result, the waste mixed with bentonite showed a reduction of CODcr in all lysimeters, and heavy metals were hardly detected. The removal rates of ${NO_3}^-$, ${SO_4}^{2-}$, and $Cl^-$ increased with the mixing rate of bentonite.

A Study on GPGPU Performance Improvement Technique on GCN Architecture Using OpenCL API (GCN 아키텍쳐 상에서의 OpenCL을 이용한 GPGPU 성능향상 기법 연구)

  • Woo, DongHee;Kim, YoonHo
    • The Journal of Society for e-Business Studies / v.23 no.1 / pp.37-45 / 2018
  • The systems on which programs run have continuously expanded from conventional single-core and multi-core systems to many-core and heterogeneous systems. However, existing research has focused mostly on parallelizing programs with the CUDA framework and rarely on optimization for AMD GCN-based GPUs. To address this gap, our study focuses on optimization techniques for the GCN architecture in a GPGPU environment and achieves a performance improvement. Specifically, using the techniques we propose, we reduced the computation time of matrix multiplication and convolution algorithms on GPGPU by more than 30% and increased kernel throughput by more than 40%.

Design and Implementation of an Approximate Surface Lens Array System based on OpenCL (OpenCL 기반 근사곡면 렌즈어레이 시스템의 설계 및 구현)

  • Kim, Do-Hyeong;Song, Min-Ho;Jung, Ji-Sung;Kwon, Ki-Chul;Kim, Nam;Kim, Kyung-Ah;Yoo, Kwan-Hee
    • The Journal of the Korea Contents Association / v.14 no.10 / pp.1-9 / 2014
  • Integral images for autostereoscopic 3D displays are generally generated for a flat lens array, but a flat lens array offers only a narrow viewing angle. To compensate for this weakness, curved lens arrays have been proposed; because of technical and cost problems, however, an approximate surface lens array composed of several flat lens arrays is used instead of an ideal curved lens array. In this paper, we constructed an approximate surface lens array of $20{\times}8$ square flat lenses arranged on a sphere of 100 mm radius, and obtained about twice the viewing angle of a flat lens array. In particular, unlike existing research that generates integral images manually, we propose an OpenCL GPU parallel processing algorithm for generating integral images in real time. As a result, we obtained 12 to 20 frames/sec for various 3D volume data with a $15{\times}15$ approximate surface lens array.

Comic Image Normalization using the gradient Radon Transform based on OpenCL implementation (OpenCL 기반의 그래디언트 라돈변환을 이용한 만화영상의 정규화)

  • Kim, Dong-Keun;Jeon, Hyeok-June;Hwang, Chi-Jung
    • The KIPS Transactions:PartB / v.18B no.4 / pp.221-230 / 2011
  • Digital comic images are a popular form of content on the Internet. They are usually scanned from comic books with digital scanners. Without post-processing, they may differ in size and skew and may have margins around the content at the boundary. Normalizing the size of the content while removing skew and margins is an important step in comic image analysis and in applications such as content-based comic image retrieval. In this paper, we propose a method to detect the box frame in comic images by extracting line segments using the gradient Radon transform. The box frame of a comic image is the largest rectangle that contains the content without margins. We use the detected box frame to normalize the size of comic images and to remove skew. In addition, the proposed method is implemented in OpenCL to speed up the detection of line segments. Experimental results show that the proposed method effectively detects the box frame in comic images.

Adaptive Processing Algorithm Allocation on OpenCL-based FPGA-GPU Hybrid Layer for Energy-Efficient Reconfigurable Acceleration of Abnormal ECG Diagnosis (비정상 ECG 진단의 에너지 효율적인 재구성 가능한 가속을 위한 OpenCL 기반 FPGA-GPU 혼합 계층 적응 처리 알고리즘 할당)

  • Lee, Dongkyu;Lee, Seungmin;Park, Daejin
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.10 / pp.1279-1286 / 2021
  • The electrocardiogram (ECG) signal is a good indicator for early diagnosis of heart abnormalities. The reference normal ECG signal differs from person to person, and diagnosis requires a large amount of data. In this paper, we propose an adaptive OpenCL-based FPGA-GPU hybrid-layer platform to accelerate ECG signal diagnosis efficiently. In diagnosing 19,870 ECG signals from the MIT-BIH arrhythmia database on the platform, the FPGA accelerator took 1.15 s, reducing execution time by 89.94% and power consumption by 84.0% compared with software execution. The GPU accelerator took 1.87 s, reducing execution time by 83.56% and power consumption by 62.3% compared with software execution. Although the proposed FPGA-GPU hybrid platform diagnoses more slowly than the FPGA accelerator alone, using the GPU lets it run algorithms flexibly according to the situation.
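
The percentage reductions quoted above can be converted into the more familiar speedup factor, and into the software baseline they imply. The sketch below is only a reading aid under the assumption that "reduced by p%" means the accelerated time equals (1 - p/100) times the software time; the helper names are hypothetical and the baseline is derived here, not reported in the abstract.

```python
def implied_baseline(accel_time_s, reduction_pct):
    """Software time implied by an accelerated time and a percent reduction."""
    return accel_time_s / (1.0 - reduction_pct / 100.0)

def speedup(reduction_pct):
    """Speedup factor equivalent to a percent reduction in execution time."""
    return 1.0 / (1.0 - reduction_pct / 100.0)

# Figures quoted in the abstract.
fpga_baseline = implied_baseline(1.15, 89.94)
gpu_baseline = implied_baseline(1.87, 83.56)

print(f"FPGA: {speedup(89.94):.2f}x speedup, implied baseline {fpga_baseline:.2f} s")
print(f"GPU:  {speedup(83.56):.2f}x speedup, implied baseline {gpu_baseline:.2f} s")
```

Both figures point to a software baseline of roughly 11.4 s, so the two reported reductions are mutually consistent, and they correspond to speedups of roughly 10x for the FPGA and 6x for the GPU.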

Study on the Anode Electrode Reaction in the Metal-Air Cell (금속-공기전지의 Anode전극 반응에 관한 연구)

  • Kim, Yong-Hyuk
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.23 no.12 / pp.1002-1006 / 2010
  • In this study, magnesium (Mg), zinc (Zn) and aluminium (Al) were used as anode electrodes, and NaCl solutions of 2 to 20 wt% were used as electrolytes for the metal-air cell. The open circuit voltage, short circuit current and I-V characteristics were investigated for the different anode materials and electrolyte concentrations. The open circuit voltage, initially about 1.45 V, rises to 1.6 V during the first 10 minutes, indicating that an induction time is needed to activate the catalyst on the air cathode. The short circuit current increases with NaCl concentration, which raises the conductivity of the electrolyte solution, but the open circuit voltage was not affected by the electrolyte concentration. With the 20 wt% NaCl electrolyte, the maximum output power of the magnesium electrode was measured at 177 mW. The power characteristics of the metal-air cell could thus be improved by using magnesium electrode materials in the NaCl electrolyte.