• Title/Abstract/Keywords: 10-fold cross validation

203 search results (processing time: 0.022 s)

Implementation on the evolutionary machine learning approaches for streamflow forecasting: case study in the Seybous River, Algeria

  • 자크로프 마샵;보첼키아 하미드;스탬바울 마대니;김성원;싱 비제이
    • 한국수자원학회논문집 / Vol. 53, No. 6 / pp. 395-408 / 2020
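  • This study develops and applies three different machine learning approaches (artificial neural network, adaptive neuro-fuzzy system, and wavelet-based neural network), each combined with an evolutionary optimization technique and k-fold cross-validation, to forecast multi-day-ahead streamflow in a river basin in Algeria, North Africa. Based on four statistical indices, root mean squared error (RMSE), Nash-Sutcliffe efficiency (NSE), correlation coefficient (R), and peak flow criteria (PFC), the artificial neural network and the adaptive neuro-fuzzy system performed similarly in both training and testing. For one-day-ahead testing, the wavelet-based neural network yielded RMSE = 8.590 m³/s and PFC = 0.252, better than the artificial neural network (RMSE = 19.120 m³/s, PFC = 0.446) and the adaptive neuro-fuzzy system (RMSE = 18.520 m³/s, PFC = 0.444); its NSE and R values were also superior. The wavelet-based neural network can therefore serve as an efficient tool for multi-day-ahead forecasting in the Seybous River, Algeria.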
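The RMSE and NSE indices named in the abstract above have standard definitions. The following is a minimal Python sketch of those two formulas, assuming observed and simulated flows are supplied as equal-length lists. The function names `rmse` and `nse` and the sample values are illustrative assumptions and are not taken from the paper.

```python
import math


def rmse(observed, simulated):
    """Root mean squared error between observed and simulated series."""
    squared_errors = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared_errors) / len(observed))


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the residual sum of
    squares to the sum of squares of observations about their mean."""
    mean_observed = sum(observed) / len(observed)
    residual_ss = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    total_ss = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - residual_ss / total_ss


# Hypothetical daily flows in m^3/s, for illustration only.
observed_flow = [10.0, 20.0, 30.0, 40.0]
simulated_flow = [12.0, 18.0, 33.0, 39.0]
print(rmse(observed_flow, simulated_flow), nse(observed_flow, simulated_flow))
```

An NSE of 1 indicates a perfect match, while an NSE of 0 means the model does no better than predicting the mean of the observations.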

A novel method for predicting protein subcellular localization based on pseudo amino acid composition

  • Ma, Junwei;Gu, Hong
    • BMB Reports / Vol. 43, No. 10 / pp. 670-676 / 2010
  • In this paper, a novel approach, ELM-PCA, is introduced for the first time to predict protein subcellular localization. First, protein samples are represented by the pseudo amino acid composition (PseAAC). Second, principal component analysis (PCA) is employed to extract essential features. Finally, the Elman Recurrent Neural Network (RNN) is used as a classifier to identify the protein sequences. The results demonstrate that the proposed approach is effective and practical.

Alzheimer progression classification using fMRI data

  • 노주현;양희덕
    • 스마트미디어저널 / Vol. 13, No. 4 / pp. 86-93 / 2024
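  • Advances in functional magnetic resonance imaging (fMRI) have contributed significantly to mapping brain function and to understanding brain networks in the resting state. This paper proposes a CNN-LSTM-based classification model for classifying the progression stage of Alzheimer's disease. First, four preprocessing steps remove noise from the fMRI data before feature extraction. Second, once preprocessing is complete, spatial features are extracted using a U-Net architecture. Third, an LSTM extracts temporal features from the extracted spatial features and performs the final classification. Experiments were conducted by varying the temporal dimension of the data. With 5-fold cross-validation, the model achieved an average accuracy of 96.4%, which shows that the proposed method has high potential for identifying the progression of Alzheimer's disease from fMRI data.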

Object Detection from Mongolian Nomadic Environmental Images

  • Perenleilkhundev, Gantuya;Batdemberel, Mungunshagai;Battulga, Batnyam;Batsuuri, Suvdaa
    • Journal of Multimedia Information System / Vol. 6, No. 4 / pp. 173-178 / 2019
  • Mongolian historical and cultural monuments in settlement areas, including stone inscriptions, stone images, rock drawings, remains of cities, and architecture, are still telling us their stories. These monuments depict the understanding of the world, philosophical and artistic outlook, beliefs, religion, national art, language, culture and traditions of Mongols [1]. Nowadays computer science, especially computer vision, is being applied in other scientific fields. The main problem is how to apply it and which algorithm can detect and classify the objects correctly. In this paper, we propose a method to detect objects in Mongolian nomadic environment images. The method combines binary operations on edge detection results. We identified the best method and parameters among state-of-the-art machine learning algorithms. In the experiments, we evaluate our results with 10-fold cross validation and a 66% split strategy.

A Comparative Study of Local Features in Face-based Video Retrieval

  • Zhou, Juan;Huang, Lan
    • Journal of Computing Science and Engineering / Vol. 11, No. 1 / pp. 24-31 / 2017
  • Face-based video retrieval has become an active and important branch of intelligent video analysis. Face profiling and matching is a fundamental step and is crucial to the effectiveness of video retrieval. Although many algorithms have been developed for processing static face images, their effectiveness in face-based video retrieval is still unknown, simply because videos have different resolutions, faces vary in scale, and different lighting conditions and angles are used. In this paper, we combined content-based and semantic-based image analysis techniques, and systematically evaluated four mainstream local features to represent face images in the video retrieval task: Harris operators, SIFT and SURF descriptors, and eigenfaces. Results of ten independent runs of 10-fold cross-validation on datasets consisting of TED (Technology Entertainment Design) talk videos showed the effectiveness of our approach, where the SIFT descriptors achieved an average F-score of 0.725 in video retrieval and thus were the most effective, while the SURF descriptors were computed in 0.3 seconds per image on average and were the most efficient in most cases.
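The evaluation protocol mentioned above, ten independent runs of 10-fold cross-validation, can be sketched in plain Python as follows. This is an illustrative index generator only, not the authors' code: the function names, the seeding scheme, and the sample size of 25 are assumptions made for the example.

```python
import random


def k_fold_indices(n_samples, k, seed):
    """Shuffle sample indices with a fixed seed and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def repeated_k_fold(n_samples, k=10, repeats=10, base_seed=0):
    """Yield (repeat, fold, train_indices, test_indices) for each split.

    Every repetition reshuffles the data, so the k folds differ between runs.
    """
    for repeat in range(repeats):
        folds = k_fold_indices(n_samples, k, base_seed + repeat)
        for fold, test_idx in enumerate(folds):
            train_idx = [j for i, f in enumerate(folds) if i != fold for j in f]
            yield repeat, fold, train_idx, test_idx


# Hypothetical usage: 25 samples give 10 repeats x 10 folds = 100 splits.
for repeat, fold, train_idx, test_idx in repeated_k_fold(25):
    pass  # fit on train_idx, score on test_idx, then average the scores
```

Averaging the per-fold scores over all repetitions reduces the dependence of the reported figure on any single random partition.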

Evaluation of Hemiplegic Gait Using Accelerometer

  • 이준석;박수지;신항식
    • 전기학회논문지 / Vol. 66, No. 11 / pp. 1634-1640 / 2017
  • The study aims to distinguish hemiplegic gait from normal gait using a simple wearable device and a classification algorithm. To this end, we developed a wearable system equipped with a three-axis accelerometer and a three-axis gyroscope. The developed wearable system was verified in a clinical experiment in which twenty-one normal subjects and twenty-one patients undergoing stroke treatment participated. Based on the measured inertial signal, a random forest algorithm was used to classify hemiplegic gait. Four-fold cross validation was applied to ensure the reliability of the results. To select optimal attributes, we applied the forward search algorithm with 10 repetitions, and the five most frequently selected attributes were chosen as the final attribute set. The results showed 95.2% accuracy in hemiplegic gait and normal gait classification and 77.4% accuracy in hemiplegic-side and normal gait classification.

An Improvement of AdaBoost using Boundary Classifier

  • 이원주;천민규;현창호;박민용
    • 한국지능시스템학회논문지 / Vol. 23, No. 2 / pp. 166-171 / 2013
  • The method proposed in this paper improves the performance of the boosting algorithm in machine learning. The proposed Boundary AdaBoost algorithm compensates for the weak points of a normal binary classifier by using a threshold boundary concept. The proposed boundary is located near the threshold of the binary classifier, and the algorithm improves classification in areas where the normal binary classifier is weak. Thus, the optimal boundary final classifier can decrease error rates classified with more reasonable features. Finally, this paper derives the new algorithm's optimal solution and demonstrates how classifier accuracy can be improved using the proposed Boundary AdaBoost in a simulation experiment of pedestrian detection using 10-fold cross validation.

Hybridized Decision Tree methods for Detecting Generic Attack on Ciphertext

  • Alsariera, Yazan Ahmad
    • International Journal of Computer Science & Network Security / Vol. 21, No. 7 / pp. 56-62 / 2021
  • The surge in generic attacks against ciphertext on computer networks has led to the continuous advancement of mechanisms to protect information integrity and confidentiality. An explicit decision tree machine learning algorithm is reported to classify generic attacks more accurately than some multi-classification algorithms, as the multi-classification method suffers from detection oversight. However, there is a need to improve the accuracy and reduce the false alarm rate. Therefore, this study aims to improve generic attack classification by implementing two hybridized decision tree algorithms, namely Naïve Bayes Decision tree (NBTree) and Logistic Model tree (LMT). The proposed hybridized methods were developed using the 10-fold cross-validation technique to avoid overfitting. The generic attack detector produced a 99.8% accuracy, an FPR score of 0.002 and an MCC score of 0.995. The proposed methods performed better than the existing decision tree method. Similarly, the proposed method outperformed multi-classification methods for detecting generic attacks. Hence, it is recommended to implement a hybridized decision tree method for detecting generic attacks on a computer network.
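The FPR and MCC scores reported above are both computed from the four cells of a binary confusion matrix. The following is a minimal sketch of the standard formulas, assuming the counts are already available. The function names and the example counts are hypothetical and do not reproduce the paper's data.

```python
import math


def false_positive_rate(fp, tn):
    """Fraction of actual negatives that were flagged as positive."""
    return fp / (fp + tn) if (fp + tn) else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient, ranging from -1 to +1.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical confusion-matrix counts, for illustration only.
print(false_positive_rate(fp=10, tn=40))
print(matthews_corrcoef(tp=50, tn=40, fp=10, fn=0))
```

Unlike accuracy, MCC uses all four cells, so it stays informative when attack and benign traffic are heavily imbalanced.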

Application of Text-Classification Based Machine Learning in Predicting Psychiatric Diagnosis

  • 백두현;황민규;이민지;우성일;한상우;이연정;황재욱
    • 생물정신의학 / Vol. 27, No. 1 / pp. 18-26 / 2020
  • Objectives: The aim was to find effective vectorization and classification models to predict a psychiatric diagnosis from text-based medical records. Methods: Electronic medical records (n = 494) of present illness were collected retrospectively from inpatient admission notes with three diagnoses: major depressive disorder, type 1 bipolar disorder, and schizophrenia. Data were split into 400 training records and 94 independent validation records. Data were vectorized by two different models, term frequency-inverse document frequency (TF-IDF) and Doc2vec. Machine learning models for classification, including stochastic gradient descent, logistic regression, support vector classification, and deep learning (DL), were applied to predict the three psychiatric diagnoses. Five-fold cross-validation was used to find an effective model. Metrics such as accuracy, precision, recall, and F1-score were measured for comparison between the models. Results: Five-fold cross-validation on the training data showed that the DL model with Doc2vec was the most effective model to predict the diagnosis (accuracy = 0.87, F1-score = 0.87). However, these metrics dropped on the independent test data set with the final working DL models (accuracy = 0.79, F1-score = 0.79), while logistic regression and support vector machine with Doc2vec showed slightly better performance (accuracy = 0.80, F1-score = 0.80) than the DL models with Doc2vec and the others with TF-IDF. Conclusions: The current results suggest that the vectorization may have more impact on classification performance than the machine learning model. However, the data set had a number of limitations, including small sample size, imbalance among the categories, and limited generalizability. In this regard, research with multiple sites and large samples is needed to improve the machine learning models.
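The comparison metrics listed in the abstract above (accuracy, precision, recall, F1-score) can be computed for a three-class problem as sketched below. The abstract does not state how per-class scores were averaged, so macro averaging is an assumption here, as are the function name and the toy labels.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


# Toy labels for illustration only (MDD, BD, SCZ are shorthand class names).
truth = ["MDD", "MDD", "BD", "BD", "SCZ", "SCZ"]
predicted = ["MDD", "BD", "BD", "BD", "SCZ", "SCZ"]
print(classification_metrics(truth, predicted))
```

With imbalanced categories, which the abstract names as a limitation, macro-averaged scores weight each diagnosis equally and can therefore differ noticeably from accuracy.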

The Prediction Ability of Genomic Selection in the Wheat Core Collection

  • Yuna Kang;Changsoo Kim
    • 한국작물학회:학술대회논문집 / Korean Society of Crop Science 2022 Fall Conference / pp. 235-235 / 2022
  • Genomic selection is a promising tool for plant and animal breeding, which uses genome-wide molecular marker data to capture large and small effect quantitative trait loci and predict the genetic value of selection candidates. Genomic selection has been shown previously to have higher prediction accuracies than conventional marker-assisted selection (MAS) for quantitative traits. In this study, the prediction accuracy of 10 agricultural traits in the wheat core group with 567 points was compared. We used a cross-validation approach to train and validate prediction accuracy to evaluate the effects of training population size and training model. As for the prediction accuracy according to the model, a prediction accuracy of 0.4 or more was obtained for almost all traits with every model except SVN among the 6 models used (GBLUP, LASSO, BayesA, RKHS, SVN, RF). For traits such as days to heading and days to maturity, the prediction accuracy was very high, over 0.8. As for the prediction accuracy according to the training group, the prediction accuracy increased as the number of training groups increased in all traits. It was confirmed that the prediction accuracy differed in the training population according to the genetic composition regardless of the number. All training models were verified through 5-fold cross-validation. To verify the prediction ability of the training population of the wheat core collection, we compared the actual phenotype and genomic estimated breeding value using 35 breeding population. In fact, out of the 10 individuals with the fastest days to heading, 5 individuals were selected through genomic selection, and 6 individuals were selected through genomic selection out of the 10 individuals with the slowest days to heading. Therefore, we confirmed the possibility of selecting individuals according to traits with only the genotype, in a shorter period of time, through genomic selection.
