• Title/Summary/Keyword: Image Dataset (이미지 데이터 셋)


Comparative study of data augmentation methods for fake audio detection (음성위조 탐지에 있어서 데이터 증강 기법의 성능에 관한 비교 연구)

  • KwanYeol Park; Il-Youp Kwak
    • The Korean Journal of Applied Statistics / v.36 no.2 / pp.101-114 / 2023
  • Data augmentation is an effective remedy for model overfitting because it lets the model see the training dataset from various perspectives. In addition to image augmentation techniques such as rotation, cropping, horizontal flip, and vertical flip, occlusion-based methods such as Cutmix and Cutout have been proposed. For models based on speech data, occlusion-based augmentation can be applied after converting the 1D speech signal into a 2D spectrogram; in particular, SpecAugment is an occlusion-based augmentation technique designed for speech spectrograms. In this study, we compare data augmentation techniques applicable to the fake audio detection problem. Using data from the ASVspoof2017 and ASVspoof2019 competitions, held to promote fake audio detection, datasets augmented with the occlusion-based methods Cutout, Cutmix, and SpecAugment were used to train an LCNN model. All three augmentation techniques generally improved model performance: Cutmix performed best on ASVspoof2017, Mixup on ASVspoof2019 LA, and SpecAugment on ASVspoof2019 PA. Increasing the number of masks for SpecAugment also helps improve performance. In conclusion, the appropriate augmentation technique differs depending on the situation and the data.
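As a concrete illustration, SpecAugment-style occlusion masking can be sketched in a few lines of NumPy; the mask counts and widths below are illustrative defaults, not the settings used in the paper:

```python
import numpy as np

def spec_augment(spec, num_freq_masks=2, num_time_masks=2,
                 max_freq_width=8, max_time_width=16, rng=None):
    """Zero out random frequency bands and time spans of a spectrogram.

    Minimal sketch of SpecAugment-style occlusion masking; parameter
    names and defaults are illustrative, not taken from the paper.
    """
    rng = rng or np.random.default_rng(0)
    out = spec.copy()
    n_freq, n_time = out.shape
    for _ in range(num_freq_masks):
        w = int(rng.integers(0, max_freq_width + 1))
        f0 = int(rng.integers(0, n_freq - w + 1))
        out[f0:f0 + w, :] = 0.0   # frequency mask
    for _ in range(num_time_masks):
        w = int(rng.integers(0, max_time_width + 1))
        t0 = int(rng.integers(0, n_time - w + 1))
        out[:, t0:t0 + w] = 0.0   # time mask
    return out

spec = np.ones((64, 128))        # stand-in spectrogram (freq x time)
aug = spec_augment(spec)
```

Increasing `num_freq_masks`/`num_time_masks` mirrors the paper's observation that more masks can help.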

Analysis of facial expression recognition (표정 분류 연구)

  • Son, Nayeong; Cho, Hyunsun; Lee, Sohyun; Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.31 no.5 / pp.539-554 / 2018
  • Effective interaction between user and device is considered an important capability of IoT devices. For some applications, it is necessary to recognize human facial expressions in real time and make accurate judgments in order to respond to situations correctly. Accordingly, much research on facial image analysis has been conducted to build more accurate and faster recognition systems. In this study, we constructed an automatic facial expression recognition system in two steps: a face recognition step and a classification step. We compared various models trained on different feature sets: pixel information, landmark coordinates, Euclidean distances between landmark points, and arctangent angles. We found a fast and efficient prediction model using only 30 principal components of the face landmark information. We applied several prediction models, including linear discriminant analysis (LDA), random forests, support vector machines (SVM), and bagging; the SVM model gave the best result. The LDA model gave the second-best prediction accuracy, but it can fit and predict data faster than SVM and the other methods. Finally, we compared our method to the Microsoft Azure Emotion API and a convolutional neural network (CNN); our method gives a very competitive result.
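The dimension-reduction step described above (keeping 30 principal components of landmark-derived features) can be sketched with a plain SVD-based PCA; the data here is synthetic stand-in landmark features, not the paper's dataset:

```python
import numpy as np

def pca_project(X, n_components=30):
    """Project feature rows onto the top principal components.

    Sketch of the 30-component reduction step; a plain SVD of the
    centered data matrix yields the principal directions in Vt.
    """
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 136))   # e.g. 68 landmarks x (x, y) coordinates
Z = pca_project(X, 30)            # reduced features fed to LDA/SVM/etc.
```

The reduced matrix `Z` would then be passed to any of the classifiers compared in the study.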

Online Multi-Object Tracking by Learning Discriminative Appearance with Fourier Transform and Partial Least Square Analysis

  • Lee, Seong-Ho; Bae, Seung-Hwan
    • Journal of the Korea Society of Computer and Information / v.25 no.2 / pp.49-58 / 2020
  • In this study, we solve the online multi-object tracking problem, which finds object states (i.e., locations and sizes) while preserving their identities across online-provided images and detections. We handle this problem with a tracking-by-detection approach that links (or associates) detections between frames. For more accurate online association, we propose novel online appearance learning with the discrete Fourier transform and partial least squares analysis (PLS). We first transform each object image into a Fourier image in order to extract meaningful features in the frequency domain. We then learn PLS subspaces that can discriminate between the frequency features of different objects. In addition, we incorporate the proposed appearance learning into a recent confidence-based association method, and extensively compare our methods with the state-of-the-art methods on MOT benchmark challenge datasets.
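The first step above, turning an object patch into frequency-domain features, might look like the following sketch (the subsequent PLS subspace learning is omitted; the patch here is synthetic):

```python
import numpy as np

def frequency_features(patch):
    """2-D DFT magnitude of an image patch, flattened into a feature vector.

    Illustrative sketch of the 'Fourier image' step; fftshift moves the
    zero-frequency (DC) bin to the center of the spectrum.
    """
    F = np.fft.fft2(patch)
    mag = np.abs(np.fft.fftshift(F))
    return mag.ravel()

patch = np.zeros((16, 16))
patch[4:12, 4:12] = 1.0          # toy object: a bright 8x8 square
feat = frequency_features(patch)  # 256-dim frequency feature vector
```

In the paper, vectors like `feat` from different objects would be fed to PLS to learn discriminative subspaces.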

Development of Autonomous Vehicle Learning Data Generation System (자율주행 차량의 학습 데이터 자동 생성 시스템 개발)

  • Yoon, Seungje; Jung, Jiwon; Hong, June; Lim, Kyungil; Kim, Jaehwan; Kim, Hyungjoo
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.19 no.5 / pp.162-177 / 2020
  • Perception of the traffic environment based on various sensors in an autonomous driving system has a direct bearing on driving safety. Recently, as deep-neural-network-based perception models have come into use with advances in machine learning and deep neural network technology, training such models requires a high-quality training dataset. However, there are practical difficulties in collecting data on every situation that may occur in self-driving. The performance of the perception model may deteriorate due to differences between overseas and domestic traffic environments, and data on bad weather, in which the sensors cannot operate normally, cannot be guaranteed qualitatively. Therefore, it is necessary to build a virtual road environment in a simulator, rather than on an actual road, to collect the training data. In this paper, a training dataset collection process is proposed that diversifies the weather, illumination, sensor position, and vehicle type and count in a simulator environment reproducing domestic road conditions. To achieve better performance, the authors shifted the image domain closer to real-world imagery and diversified it. Performance evaluation was then conducted on test data collected in an actual road environment, and performance was similar to that of a model trained only on real environmental data.
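The diversification of weather, illumination, sensor position, and vehicle type/count could be driven by a scenario sampler along these lines; the parameter names and value ranges are hypothetical stand-ins, not the paper's actual simulator settings:

```python
import random

def sample_scenario(rng=random.Random(0)):
    """Sample one simulator scenario configuration.

    Hypothetical sketch: each field is a stand-in for one axis the
    paper diversifies (weather, illumination, sensor placement,
    vehicle type and count).
    """
    return {
        "weather": rng.choice(["clear", "rain", "fog", "snow"]),
        "sun_altitude_deg": rng.uniform(-10.0, 90.0),  # below 0 ~ night
        "camera_height_m": rng.uniform(1.2, 2.0),
        "vehicle_types": rng.sample(["sedan", "suv", "bus", "truck"], k=2),
        "vehicle_count": rng.randint(5, 50),
    }

# Each sampled configuration would drive one simulator capture session.
scenarios = [sample_scenario() for _ in range(100)]
```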

Fashion-show Animation Generation using a Single Image to 3D Human Reconstruction Technique (이미지에서 3차원 인물복원 기법을 사용한 패션쇼 애니메이션 생성기법)

  • Ahn, Heejune; Minar, Matiur Rahman
    • Journal of Korea Society of Industrial Information Systems / v.24 no.5 / pp.17-25 / 2019
  • In this paper, we introduce a technique for converting a single human image into a fashion-show animation video clip. Combined with virtual try-on, the technique can help customers confirm a dynamic fitting result, and it offers an ordinary person the interesting experience of being a fashion model. We developed an extended full-body 2D-to-3D inverse modeling technique based on the SMPLify human body inverse modeling method, together with a rigged-model animation method. The 3D shape deformation of the full human figure from the body model was performed by two-part deformation in the image domain and reconstruction using estimated depth information. The resulting animation videos are made publicly available for quality evaluation. We consider this a promising approach for commercial application when supplemented with post-processing technologies such as image segmentation, mapping, and restoration of occluded areas.

Target Image Exchange Model for Object Tracking Based on Siamese Network (샴 네트워크 기반 객체 추적을 위한 표적 이미지 교환 모델)

  • Park, Sung-Jun; Kim, Gyu-Min; Hwang, Seung-Jun; Baek, Joong-Hwan
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.3 / pp.389-395 / 2021
  • In this paper, we propose a target image exchange model to improve the performance of Siamese-network-based object tracking algorithms. A Siamese-network-based tracker follows the object by finding the most similar region in the search image, using only the target image specified in the first frame of the sequence. Because similarity is computed only between the first-frame object and the search image, once tracking fails, errors accumulate and the tracker drifts onto a region other than the tracked object. Therefore, we design a CNN (Convolutional Neural Network)-based model that checks whether tracking is progressing well, and define the target image exchange timing using the score output by the Siamese-network-based tracker. The proposed model was evaluated on the VOT-2018 dataset, achieving an accuracy of 0.611 and a robustness of 22.816.
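The exchange-timing idea, swapping the template when the tracker's similarity score stays low, can be sketched as a simple rule. The threshold and window below are illustrative, and the paper's CNN-based check is replaced here by a plain score test:

```python
def should_exchange_target(scores, threshold=0.5, window=5):
    """Decide whether to swap the tracker's template (target) image.

    Hypothetical rule inspired by the paper's idea: if the Siamese
    similarity score stays below `threshold` for `window` consecutive
    frames, tracking is likely drifting, so exchange the target image.
    Both parameters are illustrative, not from the paper.
    """
    recent = scores[-window:]
    return len(recent) == window and max(recent) < threshold

# Scores sag after frame 2, so the last 5 frames are all below 0.5.
drifting = should_exchange_target([0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.2])
```

In the actual system, a learned CNN verdict would replace this fixed-threshold test.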

A USB classification system using deep neural networks (인공신경망을 이용한 USB 인식 시스템)

  • Woo, Sae-Hyeong; Park, Jisu; Eun, Seongbae; Cha, Shin
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.535-538 / 2022
  • For Plug & Play of IoT devices, we develop a module that recognizes, through image recognition, the type of USB connector, the most common wired interface on IoT devices. Driving an IoT device requires drivers for its communication interface and hardware. The wired interface for connecting to the IoT device is recognized from an image captured with a smartphone camera, identifying the corresponding communication interface. For USB, the most popular wired interface, connector types are classified through artificial-neural-network-based machine learning. To secure a sufficient dataset for the neural networks, USB images are collected from the Internet, and additional image data are obtained through image processing. In addition to convolutional neural networks, recognizers are implemented with various deep neural networks, and their performance is compared and evaluated.


Detecting Foreign Objects in Chest X-Ray Images using Artificial Intelligence (인공 지능을 이용한 흉부 엑스레이 이미지에서의 이물질 검출)

  • Chang-Hwa Han
    • Journal of the Korean Society of Radiology / v.17 no.6 / pp.873-879 / 2023
  • This study explored the use of artificial intelligence (AI) to detect foreign bodies in chest X-ray images. Medical imaging, especially chest X-rays, plays a crucial role in diagnosing diseases such as pneumonia and lung cancer. As the number of imaging tests grows, AI has become an important tool for efficient and fast diagnosis. However, images can contain foreign objects, including everyday items such as buttons and bra wires, which can interfere with accurate readings. In this study, we developed an AI algorithm that accurately identifies these foreign objects, training the YOLOv8 model on a processed version of the National Institutes of Health chest X-ray dataset. The results showed high detection performance, with accuracy, precision, recall, and F1-score all close to 0.91. By addressing the problem that foreign objects in an image can distort reading results, the study highlights the innovative role of AI in radiology and the accuracy-based reliability that is essential for clinical adoption.
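The reported precision, recall, and F1-score follow the standard definitions computed from detection counts; a quick sketch with made-up counts (not the study's actual numbers):

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1 from true/false positive and false
    negative counts; the standard definitions behind the reported
    scores. The counts passed in below are illustrative only."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

precision, recall, f1 = detection_metrics(90, 10, 10)
```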

Multi-scale Feature Compression for VCM (VCM 을 위한 다중 스케일 특징 압축 방법)

  • Han, Heeji; Choi, Minseok; Jung, Soon-heung; Kwak, Sangwoon; Choo, Hyon-Gon; Cheong, Won-Sik; Seo, Jeongil; Choi, Haechul
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.140-142 / 2022
  • With recent advances in neural-network-based technologies, neural networks achieve sufficiently high task performance, and applications for diverse environments such as the Internet of Things, smart cities, and autonomous driving are being actively studied. However, the task diversity and complexity of these networks demand ever more video data, and applications in bandwidth-limited environments need an effective way to transmit it. Accordingly, the international standardization body MPEG is developing the Video Coding for Machines (VCM) standard, a video coding standard suited to consumption by neural-network machines. In this paper, we propose a multi-scale feature compression method for VCM to improve the coding efficiency of neural network features. Evaluated on the validation images of the COCO2017 dataset, the proposed method compressed features to 0.03 times the size of the original image with less than 6.8% loss in task accuracy.
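The downsample-then-quantize idea behind feature compression can be illustrated with a toy compressor; real VCM pipelines use learned transforms and entropy coding, so this is only a sketch with made-up shapes:

```python
import numpy as np

def compress_feature(feat, pool=2, bits=8):
    """Downsample a feature map by average pooling, then uniformly
    quantize to `bits` bits.

    Toy stand-in for feature compression: pooling trades spatial
    resolution for size, quantization trades precision for size.
    """
    c, h, w = feat.shape
    crop = feat[:, :h - h % pool, :w - w % pool]
    pooled = crop.reshape(c, h // pool, pool, w // pool, pool).mean(axis=(2, 4))
    lo, hi = pooled.min(), pooled.max()
    q = np.round((pooled - lo) / (hi - lo + 1e-12) * (2**bits - 1)).astype(np.uint8)
    return q, (lo, hi)   # (lo, hi) needed by the decoder to dequantize

feat = np.random.default_rng(0).normal(size=(8, 32, 32)).astype(np.float32)
q, (lo, hi) = compress_feature(feat)   # 4x fewer elements, 8-bit each
```

Applying this at several pooling factors would yield the multiple scales the paper's method compresses jointly.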


A study on the detection of pedestrians in crosswalks using multi-spectrum (다중스펙트럼을 이용한 횡단보도 보행자 검지에 관한 연구)

  • Kim, Junghun; Choi, Doo-Hyun; Lee, JongSun; Lee, Donghwa
    • Journal of Korea Society of Industrial Information Systems / v.27 no.1 / pp.11-18 / 2022
  • The use of multi-spectral cameras is essential for day-and-night pedestrian detection. In this paper, a color camera and a thermal infrared camera were used to detect pedestrians near a crosswalk around the clock at an intersection with a high risk of traffic accidents. For pedestrian detection, the YOLOv5 object detector was used, and detection performance was improved by using color images and thermal images simultaneously. The proposed system achieved a high performance of 0.940 mAP on the day/night multi-spectral (color and thermal image) pedestrian dataset obtained at the actual crosswalk site.
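One simple way to use color and thermal frames simultaneously is early fusion into a four-channel detector input; this is a sketch of that general idea, and the paper's exact fusion scheme may differ:

```python
import numpy as np

def fuse_color_thermal(color, thermal):
    """Stack an RGB frame and a single-channel thermal frame into a
    4-channel array, one simple early-fusion input for a detector
    whose first convolution accepts 4 channels (illustrative scheme)."""
    assert color.shape[:2] == thermal.shape[:2], "frames must be registered"
    return np.concatenate([color, thermal[..., None]], axis=-1)

color = np.zeros((480, 640, 3), dtype=np.float32)    # stand-in RGB frame
thermal = np.ones((480, 640), dtype=np.float32)      # stand-in thermal frame
fused = fuse_color_thermal(color, thermal)           # (480, 640, 4) input
```

The two cameras must be spatially registered before fusion so corresponding pixels align.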