• Title/Summary/Keyword: Training Datasets

Enhancing Gene Expression Classification of Support Vector Machines with Generative Adversarial Networks

  • Huynh, Phuoc-Hai;Nguyen, Van Hoa;Do, Thanh-Nghi
    • Journal of information and communication convergence engineering / v.17 no.1 / pp.14-20 / 2019
  • Microarray gene expression data are widely used to classify cancers, which helps address problems relating to cancer causes and treatment regimens. However, the sample size of gene expression data is often limited because microarray studies in humans are expensive. We propose enhancing the gene expression classification of support vector machines with generative adversarial networks (GAN-SVMs). A GAN that generates new data from the original training datasets was implemented and used in conjunction with nonlinear SVMs that efficiently classify gene expression data. Numerical test results on 20 low-sample-size, very high-dimensional microarray gene expression datasets from the Kent Ridge Biomedical and Array Expression repositories indicate that the model is more accurate than state-of-the-art classification models.

A Study on the Classification Model of Minhwa Genre Based on Deep Learning (딥러닝 기반 민화 장르 분류 모델 연구)

  • Yoon, Soorim;Lee, Young-Suk
    • Journal of Korea Multimedia Society / v.25 no.10 / pp.1524-1534 / 2022
  • This study proposes a classification model for Minhwa genres based on deep learning object detection. To detect objects unique to Korean traditional painting in Minhwa, we construct custom datasets by labeling images using object keywords in the Minhwa DB. We train YOLOv5 models on the custom datasets and classify images using the object labels predicted by the trained models. The algorithm consists of two classification steps: 1) by painting technique and 2) by Minhwa genre. By classifying paintings found on the Internet with this algorithm, we expect that correct information about Minhwa can be compiled and provided to users.

Performance Improvement of Fuzzy C-Means Clustering Algorithm by Optimized Early Stopping for Inhomogeneous Datasets

  • Chae-Rim Han;Sun-Jin Lee;Il-Gu Lee
    • Journal of information and communication convergence engineering / v.21 no.3 / pp.198-207 / 2023
  • Responding to changes in artificial intelligence models and the data environment is crucial for increasing data-learning accuracy and inference stability of industrial applications. A learning model that is overfitted to specific training data leads to poor learning performance and a deterioration in flexibility. Therefore, an early stopping technique is used to stop learning at an appropriate time. However, this technique does not consider the homogeneity and independence of the data collected by heterogeneous nodes in a differential network environment, thus resulting in low learning accuracy and degradation of system performance. In this study, the generalization performance of neural networks is maximized, whereas the effect of the homogeneity of datasets is minimized by achieving an accuracy of 99.7%. This corresponds to a decrease in delay time by a factor of 2.33 and improvement in performance by a factor of 2.5 compared with the conventional method.
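
As a rough illustration of the conventional early stopping technique the abstract refers to (not the optimized method the paper proposes), the sketch below stops training once the validation loss has failed to improve for a fixed number of epochs. The function name, the `patience` and `min_delta` parameters, and the sample loss values are all assumptions made for this example.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve by more
    than `min_delta` for `patience` consecutive epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # New best validation loss: reset the patience counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    # The stopping rule never triggered, so every epoch was run.
    return len(val_losses) - 1


# Hypothetical validation losses, one per epoch.
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.5]
stop_epoch = early_stopping_epoch(losses, patience=3)
```

Note that a fixed patience can stop before a later improvement (the final 0.5 here is never reached), which is one reason the choice of stopping point matters.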

Human Posture Recognition: Methodology and Implementation

  • Htike, Kyaw Kyaw;Khalifa, Othman O.
    • Journal of Electrical Engineering and Technology / v.10 no.4 / pp.1910-1914 / 2015
  • Human posture recognition is an attractive and challenging topic in computer vision due to its promising applications in personal health care, environmental awareness, human-computer interaction, and surveillance systems. Human posture recognition in video sequences consists of two stages: the first is training and evaluation, and the second is deployment. In the first stage, the system is trained and evaluated using datasets of human postures to ‘teach’ the system to classify human postures for any future inputs. When the training and evaluation process is deemed satisfactory as measured by recognition rates, the trained system is deployed to recognize human postures in any input video sequence. Several classifiers were used in training, including multilayer perceptron feedforward neural networks, self-organizing maps, fuzzy c-means, and k-means. Results show that supervised learning classifiers tend to perform better than unsupervised classifiers for human posture recognition.

Classification of Class-Imbalanced Data: Effect of Over-sampling and Under-sampling of Training Data (계급불균형자료의 분류: 훈련표본 구성방법에 따른 효과)

  • 김지현;정종빈
    • The Korean Journal of Applied Statistics / v.17 no.3 / pp.445-457 / 2004
  • Given class-imbalanced data in a two-class classification problem, we often over-sample and/or under-sample the training data to balance it. We investigate the validity of this practice and study its effect on boosting of classification trees. Experiments on twelve real datasets show that keeping the natural distribution of the training data is the best approach when applying boosting methods to class-imbalanced data.
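
The two sampling practices examined in this abstract can be sketched as follows. This is a minimal illustration of plain random over-sampling and under-sampling, not the authors' experimental procedure; the function names, the 90/10 class sizes, and the seed are assumptions chosen for the example.

```python
import random


def random_oversample(majority, minority, rng):
    """Duplicate randomly chosen minority samples until both classes match in size."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority, minority + extra


def random_undersample(majority, minority, rng):
    """Keep a random subset of the majority class the same size as the minority class."""
    return rng.sample(majority, len(minority)), minority


# Hypothetical imbalanced training set: 90 negatives, 10 positives.
rng = random.Random(0)
majority = [("neg", i) for i in range(90)]
minority = [("pos", i) for i in range(10)]

over_maj, over_min = random_oversample(majority, minority, rng)
under_maj, under_min = random_undersample(majority, minority, rng)
```

Over-sampling keeps every original sample but repeats minority cases, while under-sampling discards majority cases; either way the class ratio seen by the learner no longer matches the natural distribution, which is the practice the abstract questions for boosting.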

Pipeline wall thinning rate prediction model based on machine learning

  • Moon, Seongin;Kim, Kyungmo;Lee, Gyeong-Geun;Yu, Yongkyun;Kim, Dong-Jin
    • Nuclear Engineering and Technology / v.53 no.12 / pp.4060-4066 / 2021
  • Flow-accelerated corrosion (FAC) of carbon steel piping is a significant problem in nuclear power plants. The basic process of FAC is currently understood relatively well; however, the accuracy of prediction models of the wall-thinning rate under an FAC environment is not reliable. Herein, we propose a methodology to construct pipe wall-thinning rate prediction models using artificial neural networks and a convolutional neural network, which is confined to a straight pipe without geometric changes. Furthermore, a methodology to generate training data is proposed to efficiently train the neural network for the development of a machine learning-based FAC prediction model. Consequently, it is concluded that machine learning can be used to construct pipe wall thinning rate prediction models and optimize the number of training datasets for training the machine learning algorithm. The proposed methodology can be applied to efficiently generate a large dataset from an FAC test to develop a wall thinning rate prediction model for a real situation.

Benchmark for Deep Learning based Visual Odometry and Monocular Depth Estimation (딥러닝 기반 영상 주행기록계와 단안 깊이 추정 및 기술을 위한 벤치마크)

  • Choi, Hyukdoo
    • The Journal of Korea Robotics Society / v.14 no.2 / pp.114-121 / 2019
  • This paper presents a new benchmark system for visual odometry (VO) and monocular depth estimation (MDE). As deep learning has become a key technology in computer vision, many researchers are trying to apply deep learning to VO and MDE. Just a couple of years ago, they were studied independently in a supervised way, but now they are coupled and trained together in an unsupervised way. However, before designing fancy models and losses, we have to customize datasets to use them for training and testing. After training, the model has to be compared with the existing models, which is also a huge burden. The benchmark provides a ready-to-use input dataset for VO and MDE research in 'tfrecords' format and an output dataset that includes model checkpoints and inference results of the existing models. It also provides various tools for data formatting, training, and evaluation. In the experiments, the existing models were evaluated to verify the performance reported in the corresponding papers, and we found that the evaluation results were inferior to the reported performance.

Channel modeling based on multilayer artificial neural network in metro tunnel environments

  • Jingyuan Qian;Asad Saleem;Guoxin Zheng
    • ETRI Journal / v.45 no.4 / pp.557-569 / 2023
  • Traditional deterministic channel modeling is accurate in prediction, but due to its complexity, improving computational efficiency remains a challenge. As an alternative approach, we investigated a multilayer artificial neural network (ANN) to predict large-scale and small-scale channel characteristics in metro tunnels. Simulated high-precision training datasets were obtained by combining a measurement campaign with a ray tracing (RT) method in a metro tunnel. Performance on the training data was used to determine the number of hidden layers and neurons of the multilayer ANN. The proposed multilayer ANN performed efficiently (10 s for training; 0.19 ms for prediction) and accurately, with better approximation of the RT data than the single-layer ANN. The root mean square errors (RMSE) of path loss (2.82 dB), root mean square delay spread (0.61 ns), azimuth angle spread (3.06°), and elevation angle spread (1.22°) were impressive. These results demonstrate the superior computing efficiency and model complexity of ANNs.
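
For readers unfamiliar with the error metric quoted in this abstract, the sketch below computes a root mean square error between predicted values and reference values. The function name and the sample numbers are illustrative assumptions and are not taken from the paper.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equal-length sequences of numbers."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical path-loss values in dB: model predictions vs. reference data.
predicted_db = [92.0, 95.5, 101.0, 104.5]
reference_db = [90.0, 96.0, 100.0, 107.0]
error_db = rmse(predicted_db, reference_db)
```

The result carries the same unit as the inputs, which is why the abstract reports RMSE in dB, ns, and degrees for the different channel characteristics.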

Prediction of Venous Trans-Stenotic Pressure Gradient Using Shape Features Derived From Magnetic Resonance Venography in Idiopathic Intracranial Hypertension Patients

  • Chao Ma;Haoyu Zhu;Shikai Liang;Yuzhou Chang;Dapeng Mo;Chuhan Jiang;Yupeng Zhang
    • Korean Journal of Radiology / v.25 no.1 / pp.74-85 / 2024
  • Objective: Idiopathic intracranial hypertension (IIH) is a condition of unknown etiology associated with venous sinus stenosis. This study aimed to develop a magnetic resonance venography (MRV)-based radiomics model for predicting a high trans-stenotic pressure gradient (TPG) in IIH patients diagnosed with venous sinus stenosis. Materials and Methods: This retrospective study included 105 IIH patients (median age [interquartile range], 35 years [27-42 years]; female:male, 82:23) who underwent MRV and catheter venography complemented by venous manometry. Contrast-enhanced MRV was performed on a 1.5 Tesla system, and the images were reconstructed using a standard algorithm. Shape features were derived from MRV images via the PyRadiomics package and selected using the least absolute shrinkage and selection operator (LASSO) method. A radiomics score for predicting high TPG (≥ 8 mmHg) in IIH patients was formulated using multivariable logistic regression; its discrimination performance was assessed using the area under the receiver operating characteristic curve (AUROC). A nomogram was constructed by incorporating the radiomics scores and clinical features. Results: Data from 105 patients were randomly divided into two distinct datasets for model training (n = 73; 50 and 23 with and without high TPG, respectively) and testing (n = 32; 22 and 10 with and without high TPG, respectively). Three informative shape features were identified in the training dataset: least axis length, sphericity, and maximum three-dimensional diameter. The radiomics score for predicting high TPG in IIH patients demonstrated an AUROC of 0.906 (95% confidence interval, 0.836-0.976) in the training dataset and 0.877 (95% confidence interval, 0.755-0.999) in the test dataset. The nomogram showed good calibration. Conclusion: Our study presents the feasibility of a novel model for predicting high TPG in IIH patients using radiomics analysis of noninvasive MRV-based shape features. This information may aid clinicians in identifying patients who may benefit from stenting.
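
The AUROC used above to assess discrimination can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name and the scores are hypothetical and do not come from the study.

```python
def auroc(scores_pos, scores_neg):
    """AUROC by pairwise comparison of positive-case and negative-case scores."""
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical radiomics scores for cases with and without high TPG.
high_tpg_scores = [0.9, 0.4]
low_tpg_scores = [0.5, 0.1]
area = auroc(high_tpg_scores, low_tpg_scores)
```

The pairwise form is quadratic in the number of cases, which is acceptable for cohorts of this size; rank-based formulas give the same value more efficiently on larger data.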

Land Cover Classification Using Sematic Image Segmentation with Deep Learning (딥러닝 기반의 영상분할을 이용한 토지피복분류)

  • Lee, Seonghyeok;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.35 no.2 / pp.279-288 / 2019
  • We evaluated the land cover classification performance of SegNet, which features semantic segmentation of aerial imagery. We selected four semantic classes, i.e., urban, farmland, forest, and water areas, and created 2,000 datasets using aerial images and land cover maps. The datasets were divided at an 8:2 ratio into training (1,600) and validation datasets (400); we evaluated validation accuracy after tuning the hyperparameters. SegNet performance was optimal at a batch size of five with 100,000 iterations. When 200 test datasets were subjected to semantic segmentation using the trained SegNet model, the accuracies were farmland 87.89%, forest 87.18%, water 83.66%, and urban regions 82.67%; the overall accuracy was 85.48%. Thus, deep learning-based semantic segmentation can be used to classify land cover.
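
The 8:2 division described in this abstract can be sketched as a shuffled split. This is only an illustration under assumed details: the abstract does not say how samples were assigned to each set, and the function name, the seed, and the integer placeholders standing in for image/label pairs are hypothetical.

```python
import random


def split_train_val(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of `items` and split it into training and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# 2,000 placeholder sample IDs standing in for image/land-cover-map pairs.
samples = list(range(2000))
train, val = split_train_val(samples)
```

With 2,000 samples and a training fraction of 0.8, this yields the 1,600 training and 400 validation samples quoted in the abstract.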