• Title/Summary/Keyword: kernel technique (커널기법)


Blocking artifacts reduction for improving visual quality of highly compressed images (압축영상의 화질향상을 위한 블록킹 현상 제거에 관한 연구)

  • 이주홍;김민구;정제창;최병욱
    • The Journal of Korean Institute of Communications and Information Sciences, v.22 no.8, pp.1677-1690, 1997
  • Block-transform coding is one of the most popular approaches to image compression. For example, the DCT is widely used in international standards such as MPEG-1, MPEG-2, JPEG, and H.261. In block-based transform coding, blocking artifacts may appear along block boundaries, and they can cause severe image degradation, especially when the transform coefficients are coarsely quantized. In this paper, we propose a new method for reducing blocking artifacts in transform-coded images. To each block we add a correction term composed of a linear combination of 28 basis images that are orthonormal on the block boundaries. We select 28 DCT kernel functions whose boundary values are linearly independent and apply the Gram-Schmidt process to those boundary values to obtain the 28 boundary-orthonormal basis images. A threshold on block discontinuity is introduced to improve visual quality by reducing image blurring. We also investigate how many basis images are needed for efficient blocking artifact reduction as the compression ratio changes.

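The boundary-orthonormalization step in the abstract above can be illustrated with a minimal sketch. It assumes 8×8 blocks (whose outer ring has 4·8 − 4 = 28 pixels, matching the 28 basis images), and it picks kernels greedily in raster order of frequency, which is an assumption rather than the authors' selection rule. The function names are hypothetical.

```python
import math

N = 8  # assumed block size; an 8x8 block has 4*8 - 4 = 28 boundary pixels


def dct_basis(u, v):
    """Return the N x N 2-D DCT-II basis image for frequency (u, v)."""
    cu = math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)
    cv = math.sqrt(1.0 / N) if v == 0 else math.sqrt(2.0 / N)
    return [[cu * cv
             * math.cos((2 * x + 1) * u * math.pi / (2 * N))
             * math.cos((2 * y + 1) * v * math.pi / (2 * N))
             for y in range(N)] for x in range(N)]


def boundary_values(img):
    """Flatten the outermost ring of pixels into a vector of length 28."""
    return [img[x][y] for x in range(N) for y in range(N)
            if x in (0, N - 1) or y in (0, N - 1)]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def boundary_orthonormal_basis(tol=1e-9):
    """Gram-Schmidt on boundary values; the same linear steps are applied
    to the full images so each result is orthonormal on the boundary."""
    basis = []  # list of (boundary_vector, full_image) pairs
    for u in range(N):
        for v in range(N):
            img = dct_basis(u, v)
            vec = boundary_values(img)
            # Remove the components along the basis vectors found so far.
            for q_vec, q_img in basis:
                c = dot(vec, q_vec)
                vec = [a - c * b for a, b in zip(vec, q_vec)]
                img = [[a - c * b for a, b in zip(ra, rb)]
                       for ra, rb in zip(img, q_img)]
            norm = math.sqrt(dot(vec, vec))
            if norm > tol:  # skip kernels whose boundary is linearly dependent
                basis.append(([a / norm for a in vec],
                              [[a / norm for a in row] for row in img]))
            if len(basis) == 28:
                return basis
    return basis
```

A correction term for a block would then be a weighted sum of the 28 returned images; how the weights and the discontinuity threshold are chosen is not specified in the abstract and is left out here.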

Face Recognition Evaluation of an Illumination Property of Subspace Based Feature Extractor (부분공간 기반 특징 추출기의 조명 변인에 대한 얼굴인식 성능 분석)

  • Kim, Kwang-Soo;Boo, Deok-Hee;Ahn, Jung-Ho;Kwak, Soo-Yeong;Byun, Hye-Ran
    • Journal of KIISE:Software and Applications, v.34 no.7, pp.681-687, 2007
  • Face recognition has become very popular in recent years for personal information security and user identification. However, face recognition systems are hard to implement because of changes in illumination, pose, and facial expression. In this paper, we consider the illumination changes that cause variation in face appearance: virtual image data is generated and supplied to D-LDA, which was selected as the most suitable feature extractor. We thereby present a recognition system that is less sensitive to illumination. The approach generates virtual training images that account for several illumination directions and for changes in illumination intensity. Experiments on the ORL, Yale University, and Pohang University face databases show that D-LDA is less sensitive to illumination.

Method that determining the Hyperparameter of CNN using HS algorithm (HS 알고리즘을 이용한 CNN의 Hyperparameter 결정 기법)

  • Lee, Woo-Young;Ko, Kwang-Eun;Geem, Zong-Woo;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems, v.27 no.1, pp.22-28, 2017
  • The Convolutional Neural Network (CNN) can be divided into two stages: feature extraction and classification. Hyperparameters of the feature extraction stage, such as kernel size, number of channels, and stride, determine the structure of the CNN and affect its overall performance. In this paper, we propose a method to optimize the hyperparameters of the CNN feature extraction stage using the Parameter-Setting-Free Harmony Search (PSF-HS) algorithm. After fixing the overall structure of the CNN, the hyperparameters were treated as variables and optimized by applying the PSF-HS algorithm. The simulation was conducted in MATLAB, and the CNN was trained and tested on MNIST data. The parameters were updated a total of 500 times, and the most accurate of the CNN structures obtained by the proposed method classified the MNIST data with an accuracy of 99.28%.
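As a rough illustration of searching hyperparameters with harmony search, the sketch below implements the basic algorithm with fixed control parameters, not the parameter-setting-free PSF-HS variant used in the paper. The search space, the `toy_score` objective (a stand-in for training a CNN and measuring validation accuracy), and the values of `hms`, `hmcr`, and `par` are all assumptions made for the example.

```python
import random

# Hypothetical discrete search space for the feature extraction stage.
SPACE = {
    "kernel_size": [3, 5, 7],
    "channels": [8, 16, 32, 64],
    "stride": [1, 2],
}


def toy_score(h):
    """Stand-in objective; a real run would train a CNN and return accuracy."""
    return (-abs(h["kernel_size"] - 5)
            - abs(h["channels"] - 32) / 8
            - abs(h["stride"] - 1))


def harmony_search(score, space, hms=5, hmcr=0.9, par=0.3, iters=500, seed=0):
    """Basic harmony search: hms = memory size, hmcr = memory considering
    rate, par = pitch adjusting rate. Higher score is better."""
    rng = random.Random(seed)
    memory = [{k: rng.choice(v) for k, v in space.items()} for _ in range(hms)]
    for _ in range(iters):
        new = {}
        for key, values in space.items():
            if rng.random() < hmcr:
                # Reuse a value from memory, optionally nudged to a neighbour.
                val = rng.choice(memory)[key]
                if rng.random() < par:
                    i = values.index(val) + rng.choice((-1, 1))
                    val = values[min(max(i, 0), len(values) - 1)]
            else:
                # Otherwise draw a fresh value from the whole range.
                val = rng.choice(values)
            new[key] = val
        # Replace the worst stored harmony if the new one is better.
        worst = min(memory, key=score)
        if score(new) > score(worst):
            memory[memory.index(worst)] = new
    return max(memory, key=score)


best = harmony_search(toy_score, SPACE)
print(best)
```

The 500 iterations mirror the number of updates reported in the abstract; in practice each evaluation is a full training run, so the objective dominates the cost.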

Bearing Faults Identification of an Induction Motor using Acoustic Emission Signals and Histogram Modeling (음향 방출 신호와 히스토그램 모델링을 이용한 유도전동기의 베어링 결함 검출)

  • Jang, Won-Chul;Seo, Jun-Sang;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information, v.19 no.11, pp.17-24, 2014
  • This paper proposes a fault detection method for low-speed rolling element bearings of an induction motor using acoustic emission signals and histogram modeling. The proposed method performs envelope modeling of the histogram of normalized fault signals. It then extracts significant features of each fault using partial autocorrelation coefficients and selects among them using a distance evaluation technique. Finally, using the extracted features as inputs, support vector regression (SVR) classifies the bearing's inner, outer, and roller faults. To obtain optimal classification performance, we evaluate the proposed method while varying an adjustable parameter of the Gaussian radial basis function of the SVR from 0.01 to 1.0 and the number of features from 2 to 150. Experimental results show that the proposed fault identification method, with the adjustable parameter set to 0.64-0.65 and 75 features, achieves 91% classification performance and outperforms conventional fault diagnosis methods.

A Tool for On-the-fly Repairing of Atomicity Violation in GPU Program Execution

  • Lee, Keonpyo;Lee, Seongjin;Jun, Yong-Kee
    • Journal of the Korea Society of Computer and Information, v.26 no.9, pp.1-12, 2021
  • In this paper, we propose a tool called ARCAV (Automatic Recovery of CUDA Atomicity Violation) to automatically repair atomicity violations in GPU (Graphics Processing Unit) programs. ARCAV monitors information about every barrier and memory access so that actual memory writes occur at the end of the barrier region or the program executes the barrier region again. Existing methods only detect atomicity violations in GPU programs and do not repair them, because GPU programs generally do not support the lock and sleep instructions that are necessary for repairing atomicity violations. ARCAV is designed for the GPU execution model. It detects and repairs four patterns of atomicity violations that represent real-world cases. Moreover, ARCAV is independent of memory hierarchy and thread configuration. Our experiments show that the performance of ARCAV is stable regardless of the number of threads or blocks. The overhead of ARCAV is evaluated using four real-world kernels, and its slowdown is 2.1x native execution time on average.

An Optimization Method for Hologram Generation on Multiple GPU-based Parallel Processing (다중 GPU기반 홀로그램 생성을 위한 병렬처리 성능 최적화 기법)

  • Kook, Joongjin
    • Smart Media Journal, v.8 no.2, pp.9-15, 2019
  • Since the computational complexity of hologram generation increases exponentially with the size of the point cloud, parallel processing on multiple GPUs using CUDA and/or OpenCL has recently become popular. A CUDA kernel for parallelization must be organized into threads, blocks, and grids that suit the number of cores and the memory size of the GPU. In addition, in multiple GPU environments, the work must be distributed grid-by-grid, block-by-block, or thread-by-thread according to the number of GPUs. To evaluate the performance of CGH (Computer Generated Hologram) generation, we compared the computational speed in CPU, single GPU, and multi-GPU environments while gradually increasing the number of points in a point cloud from 10 to 1,000,000. We also present a memory structure design and a calculation method required in CUDA-based parallel processing to accelerate the CGH generation operation in multiple GPU environments.
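The sizing arithmetic behind such a configuration can be sketched in a few lines. This is only an illustration of the usual ceiling-division rule for a 1-D launch and of an even split of points across devices; the 256 threads per block and the function names are assumptions, and the paper's actual distribution scheme may differ.

```python
def launch_config(n_items, threads_per_block=256):
    """Return (blocks, threads_per_block) covering n_items with one thread
    each, using ceiling division so the last partial block is included."""
    blocks = (n_items + threads_per_block - 1) // threads_per_block
    return blocks, threads_per_block


def split_across_gpus(n_points, n_gpus):
    """Split point indices into contiguous half-open ranges, one per GPU,
    with sizes differing by at most one."""
    base, extra = divmod(n_points, n_gpus)
    ranges = []
    start = 0
    for gpu in range(n_gpus):
        size = base + (1 if gpu < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Example: 1,000,000 points shared by 4 GPUs, then a per-GPU launch shape.
for gpu, (lo, hi) in enumerate(split_across_gpus(1_000_000, 4)):
    print(gpu, (lo, hi), launch_config(hi - lo))
```

Because the last block may contain more threads than remaining items, a kernel launched this way would need a bounds check on its global index.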

Preserving and Breakup for the Detailed Representation of Liquid Sheets in Particle-Based Fluid Simulations (입자 기반 유체 시뮬레이션에서 디테일한 액체 시트를 표현하기 위한 보존과 분해 기법)

  • Kim, Jong-Hyun
    • Journal of the Korea Computer Graphics Society, v.25 no.1, pp.13-22, 2019
  • In this paper, we propose a new method to improve the detail of the fluid surface by removing liquid sheets that are over-preserved in particle-based water simulation. A variety of anisotropic approaches have been proposed to address surface noise, one of the chronic problems in particle-based fluid simulation. However, no method has been proposed for stably expressing both the preservation and the breakup of a liquid sheet. We propose a new framework that dynamically adds and removes water particles based on an anisotropic kernel and density, so that liquid sheet preservation and breakup are represented simultaneously in particle-based fluid simulations. By removing excessively preserved liquid sheets, the proposed technique represents the characteristics of a fluid sheet breaking up well. As a result, the quality of the liquid sheet was improved without noise.

FPGA-Based Acceleration of Range Doppler Algorithm for Real-Time Synthetic Aperture Radar Imaging (실시간 SAR 영상 생성을 위한 Range Doppler 알고리즘의 FPGA 기반 가속화)

  • Jeong, Dongmin;Lee, Wookyung;Jung, Yunho
    • Journal of IKEEE, v.25 no.4, pp.634-643, 2021
  • In this paper, an FPGA-based acceleration scheme for the range Doppler algorithm (RDA) is proposed for real-time synthetic aperture radar (SAR) imaging. Hardware architectures are presented for a matched filter based on a systolic array and for a high-speed sinc interpolator that compensates for range cell migration (RCM). In addition, the proposed hardware was implemented and accelerated on a Xilinx Alveo FPGA. Experimental results for 4096×4096 SAR imaging showed that the FPGA-based implementation is 2 times faster than a GPU-based design. It was also confirmed that the proposed design can be implemented with 60,247 CLB LUTs, 103,728 CLB registers, 20 block RAM tiles, and 592 DSPs at an operating frequency of 312 MHz.

A Study on the Image Preprosessing model linkage method for usability of Pix2Pix (Pix2Pix의 활용성을 위한 학습이미지 전처리 모델연계방안 연구)

  • Kim, Hyo-Kwan;Hwang, Won-Yong
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology, v.15 no.5, pp.380-386, 2022
  • This paper proposes a method for structuring the preprocessing of training images when color is applied using Pix2Pix, one of the generative adversarial network techniques. The paper focuses on the fact that the prediction result can be damaged depending on the degree of light reflection in the training image. Therefore, image preprocessing and parameters for model optimization were configured before the model was applied. Increasing the image resolution of the training and prediction results requires modifying the model, so this part is designed to be tuned with parameters. In addition, logic that processes only the part where the prediction result is damaged by light reflection is configured together with preprocessing logic that does not distort the prediction result. To improve usability, accuracy was improved through experiments on applying the light reflection tuning filter to the training images of the Pix2Pix model and on the parameter configuration.

Efficient Thread Allocation Method of Convolutional Neural Network based on GPGPU (GPGPU 기반 Convolutional Neural Network의 효율적인 스레드 할당 기법)

  • Kim, Mincheol;Lee, Kwangyeob
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology, v.7 no.10, pp.935-943, 2017
  • CNN (Convolutional Neural Network), which is used for image classification and speech recognition among neural networks that learn from positive data, has been continuously developed toward high-performance structures. However, it is difficult to use in an embedded system with limited resources. We therefore use pre-learned weights and GPGPU (General-Purpose Computing on Graphics Processing Units), which applies the GPU to general-purpose computation, but limitations remain. Since a CNN performs simple, iterative operations, the computation speed in a Single Instruction Multiple Thread (SIMT) based GPGPU varies greatly depending on how threads are allocated and utilized. When Convolution and Pooling operations are performed with threads, some threads are left idle. The proposed method increases the operation speed by using these remaining threads for the following feature map and kernel calculations.