• Title/Summary/Keyword: image datasets

427 search results

SAR Image Target Detection based on Attention YOLOv4 (어텐션 적용 YOLOv4 기반 SAR 영상 표적 탐지 및 인식)

  • Park, Jongmin;Youk, Geunhyuk;Kim, Munchurl
    • Journal of the Korea Institute of Military Science and Technology / v.25 no.5 / pp.443-461 / 2022
  • Target detection in synthetic aperture radar (SAR) images is critical for military and national defense applications. In this paper, we propose a YOLOv4-Attention architecture that adds attention modules to the YOLOv4 backbone to strengthen its feature extraction ability for high-accuracy SAR target detection. For training and testing our framework, we present new SAR embedding datasets based on the public MSTAR SAR datasets, which represent environments that are poor for target detection: heavy clutter, crowded objects, varied object sizes, proximity to buildings, and low signal-to-clutter ratio. Experiments show that our Attention YOLOv4 architecture outperforms the original YOLOv4 architecture on SAR image target detection in these poor environments.
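
The kind of attention module the abstract refers to can be illustrated with a minimal squeeze-and-excitation-style channel attention sketch; the paper's actual module design is not specified in this abstract, and the function and feature values below are hypothetical:

```python
import math

def channel_attention(feature_map):
    """Reweight channels of a C x H x W feature map by gated global statistics.

    A minimal squeeze-and-excitation-style sketch: each channel is scaled by
    a sigmoid of its global average ("squeeze" then "excite"), so channels
    with a stronger average response are emphasized. Real modules insert a
    small learned MLP between the pooling and the sigmoid.
    """
    out = []
    for channel in feature_map:
        # Squeeze: global average pooling over the spatial dimensions.
        mean = sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        # Excite: gate in (0, 1) via a sigmoid (learned weights omitted).
        gate = 1.0 / (1.0 + math.exp(-mean))
        out.append([[v * gate for v in row] for row in channel])
    return out

# Two 2x2 channels: the second has a stronger average response.
fmap = [[[0.1, 0.2], [0.3, 0.4]],
        [[1.0, 2.0], [3.0, 4.0]]]
weighted = channel_attention(fmap)
```

In a real detector the gate would be learned, but the pooling-and-gating pattern is the core of how such modules complement a backbone's features.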

Manufacture artificial intelligence education kit using Jetson Nano and 3D printer (Jetson Nano와 3D프린터를 이용한 인공지능 교육용 키트 제작)

  • SeongJu Park;NamHo Kim
    • Smart Media Journal / v.11 no.11 / pp.40-48 / 2022
  • In this paper, an educational kit that can be used in AI education was developed to address the difficulties of teaching AI. It shifts learning from theory-centered study to practice-oriented experience through object and person detection in computer vision using CNNs and OpenCV, user image recognition (Your Own Image Recognition) that learns and recognizes specific objects, user object classification (Classification Datasets) and segmentation (Segmentation), and IoT hardware control that engages the learned target through the GPIO of the Jetson Nano AI board. Textbooks that support effective AI learning were also developed and put to use.

Semi-Supervised SAR Image Classification via Adaptive Threshold Selection (선별적인 임계값 선택을 이용한 준지도 학습의 SAR 분류 기술)

  • Jaejun Do;Minjung Yoo;Jaeseok Lee;Hyoi Moon;Sunok Kim
    • Journal of the Korea Institute of Military Science and Technology / v.27 no.3 / pp.319-328 / 2024
  • Semi-supervised learning is an effective way to train a classification model using a small number of labeled samples and a large number of unlabeled samples. We applied semi-supervised learning to synthetic aperture radar (SAR) image classification, where datasets are limited because they are difficult to create. Semi-supervised learning uses a model trained on the small amount of labeled data to generate and learn from pseudo labels, and many previous papers use a single fixed threshold when creating those pseudo labels. In this paper, we present a semi-supervised SAR image classification method that applies a different threshold for each class, instead of one fixed threshold shared by all classes, to improve classification performance with a small number of labeled samples.
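
The per-class thresholding idea can be sketched in a few lines (a hypothetical illustration; the paper's adaptive threshold-selection rule is not given in the abstract, and the probabilities below are made up):

```python
def select_pseudo_labels(probs, thresholds):
    """Keep an unlabeled sample only if its top class probability clears
    that class's own threshold; return (sample_index, pseudo_label) pairs."""
    selected = []
    for i, p in enumerate(probs):
        cls = max(range(len(p)), key=lambda c: p[c])
        if p[cls] >= thresholds[cls]:
            selected.append((i, cls))
    return selected

# Softmax outputs for four unlabeled samples over three classes.
probs = [[0.80, 0.15, 0.05],
         [0.40, 0.55, 0.05],
         [0.10, 0.10, 0.80],
         [0.50, 0.30, 0.20]]

# A single fixed threshold of 0.7 keeps only the very confident samples...
fixed = select_pseudo_labels(probs, [0.7, 0.7, 0.7])
# ...while a lower threshold for the hard class 1 admits sample 1 as well.
per_class = select_pseudo_labels(probs, [0.7, 0.5, 0.7])
```

Lowering the threshold for classes the model is systematically less confident about is what lets per-class selection recover pseudo labels that a shared threshold would discard.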

A Motion Detection Approach based on UAV Image Sequence

  • Cui, Hong-Xia;Wang, Ya-Qi;Zhang, FangFei;Li, TingTing
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.3 / pp.1224-1242 / 2018
  • For motion analysis and compensation, it is essential to conduct motion detection with images. However, motion detection and tracking in low-altitude images obtained from an unmanned aerial system pose many challenges, owing to degraded image quality caused by platform motion, image instability, and illumination fluctuation. This research tackles these challenges by proposing a modified joint transform correlation algorithm that includes two preprocessing strategies. In the spatial domain, a modified fuzzy edge detection method is proposed for preprocessing the input images. In the frequency domain, to eliminate the disturbance of self-correlation terms, the cross-correlation terms are extracted from the joint power spectrum output plane. The effectiveness and accuracy of the algorithm have been tested and evaluated on both simulated and real datasets. The simulation experiments show that the proposed approach derives satisfactory cross-correlation peaks and detects displacement vectors to within 0.03 pixels for image pairs with displacements smaller than 20 pixels, under added motion blur of 0 to 10 pixels and additive Gaussian noise with a variance of 0.002. Moreover, this paper proposes a quantitative analysis approach using tri-image pairs from real datasets, and the experimental results show that sub-pixel detection accuracy is achieved even when the sampling frequency reaches only 50 frames per second.
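
The correlation-peak principle behind joint transform correlation can be sketched in one dimension (the actual method operates on 2-D images through the joint power spectrum; this toy signal and brute-force correlation are illustrative only):

```python
def estimate_shift(ref, moved):
    """Estimate the displacement of `moved` relative to `ref` by locating
    the peak of their circular cross-correlation -- the same peak-finding
    idea a joint transform correlator realizes in the frequency domain."""
    n = len(ref)
    best_lag, best_score = 0, float("-inf")
    for lag in range(n):
        # Correlation score for this candidate displacement.
        score = sum(ref[i] * moved[(i + lag) % n] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

signal = [0, 0, 1, 3, 1, 0, 0, 0]
shifted = signal[-3:] + signal[:-3]   # the same signal shifted right by 3
```

`estimate_shift(signal, shifted)` recovers the 3-sample displacement; in the paper's setting, extracting only the cross-correlation terms keeps this peak from being swamped by the self-correlation terms.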

A Development of Façade Dataset Construction Technology Using Deep Learning-based Automatic Image Labeling (딥러닝 기반 이미지 자동 레이블링을 활용한 건축물 파사드 데이터세트 구축 기술 개발)

  • Gu, Hyeong-Mo;Seo, Ji-Hyo;Choo, Seung-Yeon
    • Journal of the Architectural Institute of Korea Planning & Design / v.35 no.12 / pp.43-53 / 2019
  • The construction industry has made great strides in the past decades by utilizing computer programs, including CAD. However, compared to other manufacturing sectors, its labor productivity is low because of the high proportion of knowledge-based tasks in addition to simple repetitive tasks. Workers' knowledge-based task efficiency should therefore be improved by having computers recognize visual information. A computer needs a large amount of training data, such as the ImageNet project provides, to recognize visual information. This study proposes building façade datasets that are constructed efficiently by quickly collecting façade data from portal-site road views and labeling them automatically using deep learning, as part of constructing image datasets for visual recognition by computers. Using the proposed method, we constructed a dataset for a part of Dongseong-ro, Daegu Metropolitan City, and analyzed its utility and reliability. This confirmed that a computer can extract significant façade information from portal-site road views by recognizing the visual information in building façade images. In addition to verifying the feasibility of constructing architectural image datasets, this study suggests the possibility of securing quantitative and qualitative façade design knowledge by extracting it from any façade in the world.

Comprehensive analysis of deep learning-based target classifiers in small and imbalanced active sonar datasets (소량 및 불균형 능동소나 데이터세트에 대한 딥러닝 기반 표적식별기의 종합적인 분석)

  • Geunhwan Kim;Youngsang Hwang;Sungjin Shin;Juho Kim;Soobok Hwang;Youngmin Choo
    • The Journal of the Acoustical Society of Korea / v.42 no.4 / pp.329-344 / 2023
  • In this study, we comprehensively analyze the generalization performance of various deep learning-based active sonar target classifiers when applied to small and imbalanced active sonar datasets. To generate the datasets, we use data from two oceanic experiments conducted at different times and in different ocean areas. Each sample is a time-frequency domain image extracted from the audio signal of a contact after the detection process. For the comprehensive analysis, we utilize 22 convolutional neural network (CNN) models. The two datasets serve alternately as train/validation data and as test data. To measure the variance in the classifiers' output, the train/validation/test splits are repeated 10 times. Hyperparameters for training are optimized using Bayesian optimization. The results demonstrate that shallow CNN models show superior robustness and generalization performance compared to most of the deep CNN models. The results of this paper can serve as a valuable reference for future research directions in deep learning-based active sonar target classification.
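
The repeated-split evaluation the abstract describes (reporting variance over 10 repetitions) can be sketched as follows, with a toy thresholding "classifier" and made-up 1-D data standing in for the sonar models:

```python
import random

def repeated_split_stats(labels, predict, repeats=10, seed=0):
    """Mean and variance of accuracy over repeated random test splits --
    the kind of repetition used to measure a classifier's output variance."""
    rng = random.Random(seed)
    samples = list(labels.items())
    accs = []
    for _ in range(repeats):
        # Draw a fresh random test split each repetition.
        test = rng.sample(samples, k=max(1, len(samples) // 2))
        correct = sum(predict(x) == y for x, y in test)
        accs.append(correct / len(test))
    mean = sum(accs) / len(accs)
    var = sum((a - mean) ** 2 for a in accs) / len(accs)
    return mean, var

# Hypothetical data: feature x/10 with label 1 when x >= 5.
labels = {x / 10: int(x >= 5) for x in range(10)}
mean_acc, var_acc = repeated_split_stats(labels, lambda x: int(x >= 0.5))
```

Reporting the variance alongside the mean, as the study does, is what reveals whether a model's apparent advantage is robust to the particular split.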

Image Caption Generation using Recurrent Neural Network (Recurrent Neural Network를 이용한 이미지 캡션 생성)

  • Lee, Changki
    • Journal of KIISE / v.43 no.8 / pp.878-882 / 2016
  • Automatically generating captions for an image is a very difficult task because it requires both computer vision and natural language processing technologies. However, the task has many important applications, such as early childhood education, image retrieval, and navigation for the blind. In this paper, we describe a Recurrent Neural Network (RNN) model for generating image captions that takes image features extracted by a Convolutional Neural Network (CNN). We demonstrate that our models produce state-of-the-art results in image caption generation experiments on the Flickr 8K, Flickr 30K, and MS COCO datasets.
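
The decoding side of such a CNN-to-RNN captioner can be sketched as a greedy loop; the `toy_step` function below merely stands in for a trained RNN cell conditioned on CNN image features, and the caption it emits is hypothetical:

```python
def greedy_decode(step, state, bos, eos, max_len=10):
    """Generate a caption token by token: feed the previous token and the
    recurrent state into `step`, take the argmax token, stop at <eos>."""
    tokens, tok = [], bos
    for _ in range(max_len):
        scores, state = step(tok, state)
        tok = max(scores, key=scores.get)
        if tok == eos:
            break
        tokens.append(tok)
    return tokens

# A toy deterministic "RNN" whose state counts emitted words (hypothetical).
CAPTION = ["a", "dog", "on", "grass"]

def toy_step(prev_token, state):
    nxt = CAPTION[state] if state < len(CAPTION) else "<eos>"
    return {nxt: 1.0}, state + 1

print(greedy_decode(toy_step, 0, "<bos>", "<eos>"))  # ['a', 'dog', 'on', 'grass']
```

In a real model the initial state would be derived from the CNN image features, and beam search usually replaces the greedy argmax.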

A Survey on Image Emotion Recognition

  • Zhao, Guangzhe;Yang, Hanting;Tu, Bing;Zhang, Lei
    • Journal of Information Processing Systems / v.17 no.6 / pp.1138-1156 / 2021
  • Emotional semantics are the highest level of semantics that can be extracted from an image. Constructing a system that can automatically recognize emotional semantics in images will be significant for marketing, smart healthcare, and deep human-computer interaction. To understand the direction of image emotion recognition as well as its general research methods, we summarize current development trends and shed light on potential future research. The primary contributions of this paper are as follows. We investigate the color, texture, shape, and contour features used for extracting emotional semantics. We establish two models that map images into emotional space and introduce in detail the various stages of the image emotional semantic recognition framework. We also discuss important datasets and useful applications in the field, such as garment images and image retrieval. We conclude with a brief discussion of future research trends.

Multiple Mixed Modes: Single-Channel Blind Image Separation

  • Tiantian Yin;Yina Guo;Ningning Zhang
    • Journal of Information Processing Systems / v.19 no.6 / pp.858-869 / 2023
  • As one of the pivotal techniques of image restoration, single-channel blind source separation (SCBSS) can recover multiple source images from a single observed image. However, image degradation often results from multiple mixing methods. This paper therefore introduces an innovative SCBSS algorithm that effectively separates source images from a composite image in various mixed modes. The cornerstone of the approach is a novel triple generative adversarial network (TriGAN), designed on dual learning principles, which redefines the discriminator's function to optimize the separation process. Extensive experiments demonstrate that the algorithm distinctly separates source images from composite images in diverse mixed modes and facilitates effective image restoration. Its effectiveness is quantitatively supported by an average peak signal-to-noise ratio exceeding 30 dB and an average structural similarity index surpassing 0.95 across multiple datasets.
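
The peak signal-to-noise ratio the abstract reports (over 30 dB on average) is a standard restoration metric and can be computed as follows; the sample pixel values are made up:

```python
import math

def psnr(reference, restored, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two equal-size images
    (given as flattened pixel lists); higher means closer to the reference."""
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

ref = [100, 120, 130, 140]
close = [101, 119, 131, 139]   # off by 1 per pixel -> MSE = 1
print(round(psnr(ref, close), 2))  # 48.13
```

A separated image scoring above 30 dB, as in the paper, corresponds to an MSE below about 65 on 8-bit pixel values.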

Analysis of JPEG Image Compression Effect on Convolutional Neural Network-Based Cat and Dog Classification

  • Yueming Qu;Qiong Jia;Euee S. Jang
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.11a / pp.112-115 / 2022
  • Deep learning usually has to deal with massive amounts of data, which has greatly limited the development of deep learning technologies. The Convolutional Neural Network (CNN) structure is often used to solve image classification problems, but training a CNN may require a large number of images, which is a heavy burden for existing computer systems. If the image data can be compressed while the computer hardware remains unchanged, it becomes possible to train on more data. However, image compression is usually lossy and discards part of the image information; if the lost information is key information, it may affect learning performance. In this paper, we analyze the effect of image compression on deep learning performance for CNN-based cat and dog classification. From the experimental results, we conclude that compressing the images does not have a significant impact on the accuracy of deep learning.
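
The information loss at issue comes from the quantization step at the heart of JPEG: DCT coefficients are rounded to a step size, and a coarser step discards more detail. A minimal 1-D sketch of that trade-off (JPEG itself uses 8x8 2-D blocks with per-frequency quantization tables):

```python
import math

def dct(x):
    """Type-II DCT, the transform JPEG applies to pixel blocks."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]

def idct(c):
    """Matching inverse (type-III DCT with the 2/n scaling)."""
    n = len(c)
    return [(c[0] / 2 + sum(c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                            for k in range(1, n))) * 2 / n
            for i in range(n)]

def roundtrip_error(block, q):
    """Quantize DCT coefficients with step q (the lossy step of JPEG),
    dequantize, invert, and return the mean absolute pixel error."""
    coeffs = [round(c / q) * q for c in dct(block)]
    return sum(abs(a - b) for a, b in zip(block, idct(coeffs))) / len(block)

block = [52, 55, 61, 66, 70, 61, 64, 73]   # one row of pixel values
# Coarser quantization (higher q, i.e. lower "quality") loses more detail,
# but a classifier may tolerate the loss if the discarded detail is not key.
```

Comparing `roundtrip_error(block, q)` for small and large `q` makes the paper's question concrete: how much of this discarded high-frequency detail does a cat/dog classifier actually need?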
