• Title/Summary/Keyword: GAN (Generative Adversarial Network)

Search Result: 176

GAN-based shadow removal using context information

  • Yoon, Hee-jin;Kim, Kang-jik;Chun, Jun-chul
    • Journal of Internet Computing and Services
    • /
    • v.20 no.6
    • /
    • pp.29-36
    • /
    • 2019
  • When dealing with outdoor images in a variety of computer vision applications, the presence of shadow degrades performance. To recover the information occluded by shadow, it is essential to remove the shadow. Many studies therefore adopt a two-step process of shadow detection and removal. While CNN-based shadow detection has improved greatly, shadow removal remains difficult because the occluded region must be restored after the shadow is removed. In this paper, shadow detection is assumed to be given, and a shadow-free image is generated from the original image and the shadow mask. In previous CGAN-based methods, the image created by the generator was judged during adversarial learning only at the level of image patches by the discriminator. In contrast, we propose a novel method using a discriminator that judges both the whole image and local patches at the same time. We use a residual generator to produce high-quality images, and a joint loss that combines reconstruction loss and GAN loss for training stability. We evaluate our approach on the ISTD dataset of single images. The images generated by our approach show sharper and better-restored detail than previous methods.
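The joint loss described in this abstract (reconstruction loss combined with GAN loss, with a discriminator scoring both the whole image and a local patch) can be sketched roughly as follows; the function names, the L1 choice, and the weighting `lam` are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def joint_loss(generated, target, disc_score_global, disc_score_local, lam=100.0):
    """Joint loss sketch: L1 reconstruction + adversarial terms from a
    discriminator that scores both the whole image and a local patch."""
    # L1 reconstruction loss between generated and ground-truth image
    rec = np.mean(np.abs(generated - target))
    # Non-saturating generator loss: push both discriminator scores
    # (probabilities in (0, 1]) toward 1
    eps = 1e-8
    adv = -np.log(disc_score_global + eps) - np.log(disc_score_local + eps)
    return lam * rec + adv
```

In practice both terms would be computed inside the training loop of a deep learning framework; the weighting between reconstruction and adversarial terms is what the abstract credits for training stability.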

A Study on the Synthetic ECG Generation for User Recognition (사용자 인식을 위한 가상 심전도 신호 생성 기술에 관한 연구)

  • Kim, Min Gu;Kim, Jin Su;Pan, Sung Bum
    • Smart Media Journal
    • /
    • v.8 no.4
    • /
    • pp.33-37
    • /
    • 2019
  • Because ECG signals are time-series data acquired over time, it is important to obtain comparative data of the same size as the enrolled data every time. This paper proposes a GAN (Generative Adversarial Networks) model based on an auxiliary classifier to generate synthetic ECG signals that address this data-size mismatch. Cosine similarity and cross-correlation are used to examine the similarity of the synthetic ECG signals. The analysis shows an average cosine similarity of 0.991 and an average Euclidean-distance similarity based on cross-correlation of 0.25. These results indicate that the data-size mismatch can be resolved: the generated synthetic ECG signals are similar to real ECG signals and can supply synthetic data even when the enrolled and comparative data differ in size.
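The two similarity measures named in this abstract can be sketched in a few lines of numpy; these are generic implementations, not the paper's exact evaluation code:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length signals."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def max_cross_correlation(a, b):
    """Peak of the normalized cross-correlation; unlike cosine
    similarity, this is tolerant of a time shift between signals."""
    a = (a - a.mean()) / (a.std() * len(a))
    b = (b - b.mean()) / b.std()
    return float(np.max(np.correlate(a, b, mode="full")))
```

Both measures return 1.0 for identical signals, which is why values near 0.99 in the abstract indicate that the synthetic ECG closely tracks the real one.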

Grammatical Error Correction Using Generative Adversarial Network (적대적 생성 신경망을 이용한 문법 오류 교정)

  • Kwon, Soonchoul;Yu, Hwanjo;Lee, Gary Geunbae
    • Annual Conference on Human and Language Technology
    • /
    • 2019.10a
    • /
    • pp.488-491
    • /
    • 2019
  • Grammatical error correction is the task of taking a grammatically erroneous sentence as input and correcting its errors. Beyond removing the grammatical errors themselves, it is important to generate natural-sounding sentences. This study aims to use a generative adversarial network (GAN) to produce corrections natural enough to be indistinguishable from reference sentences. Experimental results show that GAN-based grammatical error correction achieves a MaxMatch F0.5 score of 0.4942, outperforming the baseline's 0.4462.


A study on artificial intelligence algorithm for imagery through 3D pagoda voxelization (3D 탑 복셀화를 통한 형상화 인공지능 알고리즘에 대한 연구)

  • Beom-Jun Kim;Byong-Kwon Lee
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2023.01a
    • /
    • pp.323-324
    • /
    • 2023
  • In this paper, 3D reconstruction technology, one of various reconstruction AI algorithms, renders the 2D pixels of a physically existing object in 3D form. As accurate 3D information processing is required, data represented as a point cloud can express accurate size and coordinate information for an object. Through a voxelization preprocessing step, which analyzes the pixels of the data and defines what is to be realized in 3D form, 3D shapes were produced using the 3D-GAN reconstruction technique. This paper describes a technique for analyzing a 2D point cloud and reconstructing it in 3D form via a 3D reconstruction algorithm.
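The voxelization preprocessing described in this abstract (mapping point-cloud data onto a fixed 3D occupancy grid before feeding it to a 3D-GAN) can be sketched as follows; the grid resolution and min-max normalization are illustrative assumptions:

```python
import numpy as np

def voxelize(points, grid=32):
    """Map an (N, 3) point cloud into a binary grid x grid x grid
    occupancy volume (a common preprocessing step for 3D-GANs)."""
    pts = np.asarray(points, dtype=float)
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    # Normalize each axis to [0, 1], then scale to voxel indices
    scale = np.where(hi > lo, hi - lo, 1.0)
    idx = ((pts - lo) / scale * (grid - 1)).astype(int)
    vol = np.zeros((grid, grid, grid), dtype=np.uint8)
    vol[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return vol
```

The resulting fixed-size volume is what makes point-cloud data digestible by a convolutional generator or discriminator.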


Fall detection based on GAN and LSTM (적대적 생성 신경망과 장단기 메모리셀을 이용한 낙상 검출)

  • Hyojin Shin;Jiyoung Woo
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2023.01a
    • /
    • pp.21-22
    • /
    • 2023
  • This paper proposes a classification model for distinguishing falls from non-falls. Distinguishing falls from everyday activities makes it possible to detect a fall as it begins and to prevent accidents. Falls occur easily in daily life and can cause serious injuries in the elderly, such as fractures and organ ruptures, so distinguishing fall from non-fall behavior is an important problem for fall prevention. We therefore classified fall and non-fall behavior using sensor data collected in real time across a variety of activities.


Dog-Species Classification through CycleGAN and Standard Data Augmentation

  • Park, Chan;Moon, Nammee
    • Journal of Information Processing Systems
    • /
    • v.19 no.1
    • /
    • pp.67-79
    • /
    • 2023
  • In the image field, data augmentation refers to increasing the amount of data through an editing method such as rotating or cropping a photo. In this study, a generative adversarial network (GAN) image was created using CycleGAN, and various colors of dogs were reflected through data augmentation. In particular, dog data from the Stanford Dogs Dataset and Oxford-IIIT Pet Dataset were used, and 10 breeds of dog, corresponding to 300 images each, were selected. Subsequently, a GAN image was generated using CycleGAN, and four learning groups were established: 2,000 original photos (group I); 2,000 original photos + 1,000 GAN images (group II); 3,000 original photos (group III); and 3,000 original photos + 1,000 GAN images (group IV). The amount of data in each learning group was augmented using existing data augmentation methods such as rotating, cropping, erasing, and distorting. The augmented photo data were used to train the MobileNet_v3_Large, ResNet-152, InceptionResNet_v2, and NASNet_Large frameworks to evaluate the classification accuracy and loss. The top-3 accuracy for each deep neural network model was as follows: MobileNet_v3_Large of 86.4% (group I), 85.4% (group II), 90.4% (group III), and 89.2% (group IV); ResNet-152 of 82.4% (group I), 83.7% (group II), 84.7% (group III), and 84.9% (group IV); InceptionResNet_v2 of 90.7% (group I), 88.4% (group II), 93.3% (group III), and 93.1% (group IV); and NASNet_Large of 85% (group I), 88.1% (group II), 91.8% (group III), and 92% (group IV). The InceptionResNet_v2 model exhibited the highest image classification accuracy, and the NASNet_Large model exhibited the highest increase in the accuracy owing to data augmentation.
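The standard augmentation operations listed in this abstract (rotating, cropping, erasing) can be sketched generically in numpy; these are illustrative implementations, not the paper's exact pipeline:

```python
import numpy as np

def rotate90(img):
    """Rotate an H x W (x C) image by 90 degrees."""
    return np.rot90(img)

def random_crop(img, size, rng):
    """Crop a size x size window at a random position."""
    h, w = img.shape[:2]
    y = rng.integers(0, h - size + 1)
    x = rng.integers(0, w - size + 1)
    return img[y:y + size, x:x + size]

def random_erase(img, size, rng):
    """Zero out a random size x size square (cutout-style erasing)."""
    out = img.copy()
    h, w = out.shape[:2]
    y = rng.integers(0, h - size + 1)
    x = rng.integers(0, w - size + 1)
    out[y:y + size, x:x + size] = 0
    return out
```

The study's point of comparison is that these geometric edits recombine existing pixels, whereas CycleGAN synthesizes genuinely new color variations of the dogs.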

Raindrop Removal and Background Information Recovery in Coastal Wave Video Imagery using Generative Adversarial Networks (적대적생성신경망을 이용한 연안 파랑 비디오 영상에서의 빗방울 제거 및 배경 정보 복원)

  • Huh, Dong;Kim, Jaeil;Kim, Jinah
    • Journal of the Korea Computer Graphics Society
    • /
    • v.25 no.5
    • /
    • pp.1-9
    • /
    • 2019
  • In this paper, we propose a video enhancement method using generative adversarial networks to remove raindrops from coastal wave video imagery distorted during rainfall and to restore the background information in the removed regions. Two experimental models are implemented: the Pix2Pix network, widely used for image-to-image translation, and Attentive GAN, which currently performs well for raindrop removal on single images. The models are trained on a public dataset of paired natural images with and without raindrops, and the trained models are evaluated on raindrop removal and background recovery in rain-distorted coastal wave video imagery. To improve performance, we acquired a paired video dataset with and without raindrops at a real coast and applied transfer learning to the pre-trained models with this new dataset. The fine-tuned models outperform the pre-trained models. Performance is evaluated using the peak signal-to-noise ratio and the structural similarity index, and the Pix2Pix network fine-tuned by transfer learning performs best at reconstructing coastal wave video imagery distorted by raindrops.
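The two evaluation metrics used in this abstract, PSNR and SSIM, can be sketched as follows; the SSIM shown here is a single-window simplification of the usual locally windowed metric:

```python
import numpy as np

def psnr(reference, distorted, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two images."""
    mse = np.mean((reference.astype(float) - distorted.astype(float)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

def ssim_global(x, y, max_val=255.0):
    """Global (single-window) SSIM sketch; the standard metric
    averages this statistic over local sliding windows."""
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Higher is better for both: PSNR is unbounded in dB, while SSIM reaches 1.0 for identical images.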

A Study on Optimization of Classification Performance through Fourier Transform and Image Augmentation (푸리에 변환 및 이미지 증강을 통한 분류 성능 최적화에 관한 연구)

  • Kihyun Kim;Seong-Mok Kim;Yong Soo Kim
    • Journal of Korean Society for Quality Management
    • /
    • v.51 no.1
    • /
    • pp.119-129
    • /
    • 2023
  • Purpose: This study proposes a classification model for implementing condition-based maintenance (CBM) by monitoring the real-time status of a machine using acceleration sensor data collected from a vehicle. Methods: The classification model's performance was improved by applying Fourier transform to convert the acceleration sensor data from the time domain to the frequency domain. Additionally, the Generative Adversarial Network (GAN) algorithm was used to augment images and further enhance the classification model's performance. Results: Experimental results demonstrate that the GAN algorithm can effectively serve as an image augmentation technique to enhance the performance of the classification model. Consequently, the proposed approach yielded a significant improvement in the classification model's accuracy. Conclusion: While this study focused on the effectiveness of the GAN algorithm as an image augmentation method, further research is necessary to compare its performance with other image augmentation techniques. Additionally, it is essential to consider the potential for performance degradation due to class imbalance and conduct follow-up studies to address this issue.
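The Fourier-transform step in this abstract (converting acceleration sensor data from the time domain to the frequency domain) can be sketched with numpy's FFT; the one-sided amplitude scaling used here is a common convention, assumed rather than taken from the paper:

```python
import numpy as np

def to_frequency_domain(signal, fs):
    """Convert a time-domain acceleration signal into a one-sided
    amplitude spectrum via the FFT (fs = sampling rate in Hz)."""
    n = len(signal)
    spectrum = np.abs(np.fft.rfft(signal)) / n * 2.0
    spectrum[0] /= 2.0  # the DC component is not doubled
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, spectrum
```

Spectra like these, rendered as images, are what the classification model and the GAN-based augmentation operate on.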

Assessment and Analysis of Fidelity and Diversity for GAN-based Medical Image Generative Model (GAN 기반 의료영상 생성 모델에 대한 품질 및 다양성 평가 및 분석)

  • Jang, Yoojin;Yoo, Jaejun;Hong, Helen
    • Journal of the Korea Computer Graphics Society
    • /
    • v.28 no.2
    • /
    • pp.11-19
    • /
    • 2022
  • Recently, various studies on medical image generation have been proposed, and it has become crucial to accurately evaluate the quality and diversity of the generated medical images. For this purpose, experts' visual Turing tests, feature-distribution visualization, and quantitative evaluation through IS and FID are used. However, there are few methods for quantitatively evaluating medical images in terms of fidelity and diversity. In this paper, images are generated by training the DCGAN and PGGAN generative models on a chest CT dataset of non-small cell lung cancer patients, and the two models are compared in terms of fidelity and diversity. Performance is quantified through IS and FID, which are one-dimensional score-based evaluation methods, and through Precision and Recall and Improved Precision and Recall, which are two-dimensional score-based methods; the characteristics and limitations of each evaluation method in medical imaging are also analyzed.
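Of the metrics named in this abstract, FID is the easiest to illustrate: it is the Fréchet distance between Gaussians fitted to real and generated features. A one-dimensional sketch follows; the full metric applies the same formula to mean vectors and covariance matrices of deep network features:

```python
import numpy as np

def fid_1d(real_feats, fake_feats):
    """Frechet distance between two 1-D feature distributions, each
    modeled as a Gaussian: (mu1 - mu2)^2 + v1 + v2 - 2*sqrt(v1*v2)."""
    m1, m2 = np.mean(real_feats), np.mean(fake_feats)
    v1, v2 = np.var(real_feats), np.var(fake_feats)
    return (m1 - m2) ** 2 + v1 + v2 - 2.0 * np.sqrt(v1 * v2)
```

Lower is better, and a model compared against itself scores 0; the paper's point is that such one-dimensional scores conflate fidelity and diversity, which the two-dimensional Precision/Recall metrics separate.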

Waste Classification by Fine-Tuning Pre-trained CNN and GAN

  • Alsabei, Amani;Alsayed, Ashwaq;Alzahrani, Manar;Al-Shareef, Sarah
    • International Journal of Computer Science & Network Security
    • /
    • v.21 no.8
    • /
    • pp.65-70
    • /
    • 2021
  • Waste accumulation is becoming a significant challenge in most urban areas and, if it continues unchecked, is poised to have severe repercussions on our environment and health. The massive industrialisation of our cities has been accompanied by commensurate waste creation that has become a bottleneck even for waste management systems. While recycling is a viable solution for waste management, accurately classifying waste material for recycling can be daunting. In this study, transfer learning models were proposed to automatically classify waste into six materials (cardboard, glass, metal, paper, plastic, and trash). The tested pre-trained models were ResNet50, VGG16, InceptionV3, and Xception. Data augmentation was done using a Generative Adversarial Network (GAN) with various image generation percentages. Models based on Xception and VGG16 proved more robust. In contrast, models based on ResNet50 and InceptionV3 were sensitive to the added machine-generated images, their accuracy degrading significantly compared to training with no artificial data.