• Title/Summary/Keyword: Synthetic Dataset

Search Result 107, Processing Time 0.026 seconds

A Study of Development and Application of an Inland Water Body Training Dataset Using Sentinel-1 SAR Images in Korea (Sentinel-1 SAR 영상을 활용한 국내 내륙 수체 학습 데이터셋 구축 및 알고리즘 적용 연구)

  • Eu-Ru Lee;Hyung-Sup Jung
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.6_1
    • /
    • pp.1371-1388
    • /
    • 2023
  • Floods are becoming more severe and frequent due to global warming-induced climate change. Water disasters are rising in Korea due to severe rainfall and wet seasons. This makes preventive climate change measures and efficient water catastrophe responses crucial, and synthetic aperture radar satellite imagery can help. This research created 1,423 water body learning datasets for individual water body regions along the Han and Nakdong waterways to reflect domestic water body properties discovered by Sentinel-1 satellite radar imagery. We created a document with exact data annotation criteria for many situations. After the dataset was processed, U-Net, a deep learning model, analyzed water body detection results. The results from applying the learned model to water body locations not involved in the learning process were studied to validate soil water body monitoring on a national scale. The analysis showed that the created water body area detected water bodies accurately (F1-Score: 0.987, Intersection over Union [IoU]: 0.955). Other domestic water body regions not used for training and evaluation showed similar accuracy (F1-Score: 0.941, IoU: 0.89). Both outcomes showed that the computer accurately spotted water bodies in most areas, however tiny streams and gloomy areas had problems. This work should improve water resource change and disaster damage surveillance. Future studies will likely include more water body attribute datasets. Such databases could help manage and monitor water bodies nationwide and shed light on misclassified regions.

A Re-configuration Scheme for Social Network Based Large-scale SMS Spam (소셜 네트워크 기반 대량의 SMS 스팸 데이터 재구성 기법)

  • Jeong, Sihyun;Noh, Giseop;Oh, Hayoung;Kim, Chong-Kwon
    • Journal of KIISE
    • /
    • v.42 no.6
    • /
    • pp.801-806
    • /
    • 2015
  • The Short Message Service (SMS) is one of the most popular communication tools in the world. As the cost of SMS decreases, SMS spam has been growing largely. Even though there are many existing studies on SMS spam detection, researchers commonly have limitation collecting users' private SMS contents. They need to gather the information related to social network as well as personal SMS due to the intelligent spammers being aware of the social networks. Therefore, this paper proposes the Social network Building Scheme for SMS spam detection (SBSS) algorithm that builds synthetic social network dataset realistically, without the collection of private information. Also, we analyze and categorize the attack types of SMS spam to build more complete and realistic social network dataset including SMS spam.

High-Quality Coarse-to-Fine Fruit Detector for Harvesting Robot in Open Environment

  • Zhang, Li;Ren, YanZhao;Tao, Sha;Jia, Jingdun;Gao, Wanlin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.2
    • /
    • pp.421-441
    • /
    • 2021
  • Fruit detection in orchards is one of the most crucial tasks for designing the visual system of an automated harvesting robot. It is the first and foremost tool employed for tasks such as sorting, grading, harvesting, disease control, and yield estimation, etc. Efficient visual systems are crucial for designing an automated robot. However, conventional fruit detection methods always a trade-off with accuracy, real-time response, and extensibility. Therefore, an improved method is proposed based on coarse-to-fine multitask cascaded convolutional networks (MTCNN) with three aspects to enable the practical application. First, the architecture of Fruit-MTCNN was improved to increase its power to discriminate between objects and their backgrounds. Then, with a few manual labels and operations, synthetic images and labels were generated to increase the diversity and the number of image samples. Further, through the online hard example mining (OHEM) strategy during training, the detector retrained hard examples. Finally, the improved detector was tested for its performance that proved superior in predicted accuracy and retaining good performances on portability with the low time cost. Based on performance, it was concluded that the detector could be applied practically in the actual orchard environment.

Deep Unsupervised Learning for Rain Streak Removal using Time-varying Rain Streak Scene (시간에 따라 변화하는 빗줄기 장면을 이용한 딥러닝 기반 비지도 학습 빗줄기 제거 기법)

  • Cho, Jaehoon;Jang, Hyunsung;Ha, Namkoo;Lee, Seungha;Park, Sungsoon;Sohn, Kwanghoon
    • Journal of Korea Multimedia Society
    • /
    • v.22 no.1
    • /
    • pp.1-9
    • /
    • 2019
  • Single image rain removal is a typical inverse problem which decomposes the image into a background scene and a rain streak. Recent works have witnessed a substantial progress on the task due to the development of convolutional neural network (CNN). However, existing CNN-based approaches train the network with synthetically generated training examples. These data tend to make the network bias to the synthetic scenes. In this paper, we present an unsupervised framework for removing rain streaks from real-world rainy images. We focus on the natural phenomena that static rainy scenes capture a common background but different rain streak. From this observation, we train siamese network with the real rain image pairs, which outputs identical backgrounds from the pairs. To train our network, a real rainy dataset is constructed via web-crawling. We show that our unsupervised framework outperforms the recent CNN-based approaches, which are trained by supervised manner. Experimental results demonstrate that the effectiveness of our framework on both synthetic and real-world datasets, showing improved performance over previous approaches.

Image-to-Image Translation with GAN for Synthetic Data Augmentation in Plant Disease Datasets

  • Nazki, Haseeb;Lee, Jaehwan;Yoon, Sook;Park, Dong Sun
    • Smart Media Journal
    • /
    • v.8 no.2
    • /
    • pp.46-57
    • /
    • 2019
  • In recent research, deep learning-based methods have achieved state-of-the-art performance in various computer vision tasks. However, these methods are commonly supervised, and require huge amounts of annotated data to train. Acquisition of data demands an additional costly effort, particularly for the tasks where it becomes challenging to obtain large amounts of data considering the time constraints and the requirement of professional human diligence. In this paper, we present a data level synthetic sampling solution to learn from small and imbalanced data sets using Generative Adversarial Networks (GANs). The reason for using GANs are the challenges posed in various fields to manage with the small datasets and fluctuating amounts of samples per class. As a result, we present an approach that can improve learning with respect to data distributions, reducing the partiality introduced by class imbalance and hence shifting the classification decision boundary towards more accurate results. Our novel method is demonstrated on a small dataset of 2789 tomato plant disease images, highly corrupted with class imbalance in 9 disease categories. Moreover, we evaluate our results in terms of different metrics and compare the quality of these results for distinct classes.

Study of oversampling algorithms for soil classifications by field velocity resistivity probe

  • Lee, Jong-Sub;Park, Junghee;Kim, Jongchan;Yoon, Hyung-Koo
    • Geomechanics and Engineering
    • /
    • v.30 no.3
    • /
    • pp.247-258
    • /
    • 2022
  • A field velocity resistivity probe (FVRP) can measure compressional waves, shear waves and electrical resistivity in boreholes. The objective of this study is to perform the soil classification through a machine learning technique through elastic wave velocity and electrical resistivity measured by FVRP. Field and laboratory tests are performed, and the measured values are used as input variables to classify silt sand, sand, silty clay, and clay-sand mixture layers. The accuracy of k-nearest neighbors (KNN), naive Bayes (NB), random forest (RF), and support vector machine (SVM), selected to perform classification and optimize the hyperparameters, is evaluated. The accuracies are calculated as 0.76, 0.91, 0.94, and 0.88 for KNN, NB, RF, and SVM algorithms, respectively. To increase the amount of data at each soil layer, the synthetic minority oversampling technique (SMOTE) and conditional tabular generative adversarial network (CTGAN) are applied to overcome imbalance in the dataset. The CTGAN provides improved accuracy in the KNN, NB, RF and SVM algorithms. The results demonstrate that the measured values by FVRP can classify soil layers through three kinds of data with machine learning algorithms.

SAR Recognition of Target Variants Using Channel Attention Network without Dimensionality Reduction (차원축소 없는 채널집중 네트워크를 이용한 SAR 변형표적 식별)

  • Park, Ji-Hoon;Choi, Yeo-Reum;Chae, Dae-Young;Lim, Ho
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.25 no.3
    • /
    • pp.219-230
    • /
    • 2022
  • In implementing a robust automatic target recognition(ATR) system with synthetic aperture radar(SAR) imagery, one of the most important issues is accurate classification of target variants, which are the same targets with different serial numbers, configurations and versions, etc. In this paper, a deep learning network with channel attention modules is proposed to cope with the recognition problem for target variants based on the previous research findings that the channel attention mechanism selectively emphasizes the useful features for target recognition. Different from other existing attention methods, this paper employs the channel attention modules without dimensionality reduction along the channel direction from which direct correspondence between feature map channels can be preserved and the features valuable for recognizing SAR target variants can be effectively derived. Experiments with the public benchmark dataset demonstrate that the proposed scheme is superior to the network with other existing channel attention modules.

Comparative Study of Deep Learning Model for Semantic Segmentation of Water System in SAR Images of KOMPSAT-5 (아리랑 5호 위성 영상에서 수계의 의미론적 분할을 위한 딥러닝 모델의 비교 연구)

  • Kim, Min-Ji;Kim, Seung Kyu;Lee, DoHoon;Gahm, Jin Kyu
    • Journal of Korea Multimedia Society
    • /
    • v.25 no.2
    • /
    • pp.206-214
    • /
    • 2022
  • The way to measure the extent of damage from floods and droughts is to identify changes in the extent of water systems. In order to effectively grasp this at a glance, satellite images are used. KOMPSAT-5 uses Synthetic Aperture Radar (SAR) to capture images regardless of weather conditions such as clouds and rain. In this paper, various deep learning models are applied to perform semantic segmentation of the water system in this SAR image and the performance is compared. The models used are U-net, V-Net, U2-Net, UNet 3+, PSPNet, Deeplab-V3, Deeplab-V3+ and PAN. In addition, performance comparison was performed when the data was augmented by applying elastic deformation to the existing SAR image dataset. As a result, without data augmentation, U-Net was the best with IoU of 97.25% and pixel accuracy of 98.53%. In case of data augmentation, Deeplab-V3 showed IoU of 95.15% and V-Net showed the best pixel accuracy of 96.86%.

Development of relational river data model based on river network for multi-dimensional river information system (다차원 하천정보체계 구축을 위한 하천네트워크 기반 관계형 하천 데이터 모델 개발)

  • Choi, Seungsoo;Kim, Dongsu;You, Hojun
    • Journal of Korea Water Resources Association
    • /
    • v.51 no.4
    • /
    • pp.335-346
    • /
    • 2018
  • A vast amount of riverine spatial dataset have recently become available, which include hydrodynamic and morphological survey data by advanced instrumentations such as ADCP (Acoustic Doppler Current Profiler), transect measurements obtained through building various river basic plans, riverine environmental and ecological data, optical images using UAVs, river facilities like multi-purposed weir and hydrophilic sectors. In this regard, a standardized data model has been subsequently required in order to efficiently store, manage, and share riverine spatial dataset. Given that riverine spatial dataset such as river facility, transect measurement, time-varying observed data should be synthetically managed along specified river network, conventional data model showed a tendency to maintain them individually in a form of separate layer corresponding to each theme, which can miss their spatial relationship, thereby resulting in inefficiency to derive synthetic information. Moreover, the data model had to be significantly modified to ingest newly produced data and hampered efficient searches for specific conditions. To avoid such drawbacks for layer-based data model, this research proposed a relational data model in conjunction with river network which could be a backbone to relate additional spatial dataset such as flowline, river facility, transect measurement and surveyed dataset. The new data model contains flexibility to minimize changes of its structure when it deals with any multi-dimensional river data, and assigned reach code for multiple river segments delineated from a river. To realize the newly developed data model, Seom river was applied, where geographic informations related with national and local rivers are available.

Exploring the Performance of Synthetic Minority Over-sampling Technique (SMOTE) to Predict Good Borrowers in P2P Lending (P2P 대부 우수 대출자 예측을 위한 합성 소수집단 오버샘플링 기법 성과에 관한 탐색적 연구)

  • Costello, Francis Joseph;Lee, Kun Chang
    • Journal of Digital Convergence
    • /
    • v.17 no.9
    • /
    • pp.71-78
    • /
    • 2019
  • This study aims to identify good borrowers within the context of P2P lending. P2P lending is a growing platform that allows individuals to lend and borrow money from each other. Inherent in any loans is credit risk of borrowers and needs to be considered before any lending. Specifically in the context of P2P lending, traditional models fall short and thus this study aimed to rectify this as well as explore the problem of class imbalances seen within credit risk data sets. This study implemented an over-sampling technique known as Synthetic Minority Over-sampling Technique (SMOTE). To test our approach, we implemented five benchmarking classifiers such as support vector machines, logistic regression, k-nearest neighbor, random forest, and deep neural network. The data sample used was retrieved from the publicly available LendingClub dataset. The proposed SMOTE revealed significantly improved results in comparison with the benchmarking classifiers. These results should help actors engaged within P2P lending to make better informed decisions when selecting potential borrowers eliminating the higher risks present in P2P lending.