• Title/Summary/Keyword: SVM algorithm

Filter-Bank Based Regularized Common Spatial Pattern for Classification of Motor Imagery EEG (동작 상상 EEG 분류를 위한 필터 뱅크 기반 정규화 공통 공간 패턴)

  • Park, Sang-Hoon;Kim, Ha-Young;Lee, David;Lee, Sang-Goog
    • Journal of KIISE / v.44 no.6 / pp.587-594 / 2017
  • Recently, Brain-Computer Interface (BCI) systems based on motor imagery electroencephalogram (EEG) signals have received significant attention in various fields, including medicine and engineering. The Common Spatial Pattern (CSP) algorithm is the most commonly used method for extracting features from motor imagery EEG. However, the CSP algorithm has limited applicability in Small-Sample Setting (SSS) situations because it relies on a covariance matrix. In addition, its performance varies greatly depending on the frequency bands used. To address these problems, 4-40 Hz EEG signals are divided using nine filter banks, and Regularized CSP (R-CSP) is applied to each frequency band. Then, the Mutual Information-Based Individual Feature (MIBIF) algorithm is applied to the R-CSP features to select discriminative features. The selected features are then used as inputs to a Least Squares Support Vector Machine (LS-SVM) classifier. The proposed method yielded classification accuracies of 87.5%, 100%, 63.78%, 82.14%, and 86.11% for five subjects ("aa", "al", "av", "aw", and "ay", respectively) on BCI Competition III dataset IVa, using 18 channels in the vicinity of the motor area of the cerebral cortex. The proposed method improved the mean classification accuracy by 16.21%, 10.77%, and 3.32% compared to CSP, R-CSP, and FBCSP, respectively. The proposed method performs particularly well in the SSS situation.
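
The filter-bank and regularized CSP steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the equal-width band layout, the shrinkage weight `gamma`, the shrink-toward-identity form of regularization, and the synthetic data are all assumptions made for illustration, and the band-pass filtering, MIBIF selection, and LS-SVM stages are omitted.

```python
import numpy as np

def filter_bank_edges(low=4.0, high=40.0, n_bands=9):
    """Split [low, high] Hz into equal-width sub-bands (an assumed layout)."""
    width = (high - low) / n_bands
    return [(low + i * width, low + (i + 1) * width) for i in range(n_bands)]

def regularized_cov(trials, gamma=0.1):
    """Average trace-normalized covariance, shrunk toward a scaled identity.

    trials has shape (n_trials, n_channels, n_samples); gamma is a
    hypothetical shrinkage weight in [0, 1].
    """
    covs = []
    for x in trials:
        c = x @ x.T
        covs.append(c / np.trace(c))
    mean_cov = np.mean(covs, axis=0)
    n_ch = mean_cov.shape[0]
    target = (np.trace(mean_cov) / n_ch) * np.eye(n_ch)
    return (1.0 - gamma) * mean_cov + gamma * target

def csp_filters(cov_a, cov_b, n_pairs=1):
    """Return the 2 * n_pairs most discriminative spatial filters (as rows)."""
    d, u = np.linalg.eigh(cov_a + cov_b)
    whiten = np.diag(d ** -0.5) @ u.T  # whitens the composite covariance
    lam, v = np.linalg.eigh(whiten @ cov_a @ whiten.T)  # ascending eigenvalues
    w = v.T @ whiten
    keep = list(range(n_pairs)) + list(range(len(lam) - n_pairs, len(lam)))
    return w[keep], lam[keep]

# Synthetic stand-in data: 20 trials, 4 channels, 100 samples per class.
rng = np.random.default_rng(0)
class_a = rng.standard_normal((20, 4, 100))
class_b = rng.standard_normal((20, 4, 100))
class_a[:, 0, :] *= 3.0  # class A has more variance on channel 0
class_b[:, 3, :] *= 3.0  # class B has more variance on channel 3

bands = filter_bank_edges()
filters, eigvals = csp_filters(regularized_cov(class_a), regularized_cov(class_b))
```

In a full pipeline, each trial would first be band-pass filtered into the nine bands and the two covariance and filter steps repeated per band. Eigenvalues near 0 or 1 mark filters whose output variance differs most between the two classes.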

Predicting Crime Risky Area Using Machine Learning (머신러닝기반 범죄발생 위험지역 예측)

  • HEO, Sun-Young;KIM, Ju-Young;MOON, Tae-Heon
    • Journal of the Korean Association of Geographic Information Studies / v.21 no.4 / pp.64-80 / 2018
  • In Korea, citizens have access only to general information about crime, so it is difficult for them to know how exposed they are to it. If the police could predict high-crime-risk areas, they could respond to crime efficiently despite limited police and enforcement resources. However, no such prediction system exists in Korea, and related research is scarce. Against this background, the ultimate goal of this study is to develop an automated crime prediction system. As a first step, we built a large data set consisting of real local crime information and urban physical and non-physical data. We then developed a crime prediction model using machine learning methods. Finally, we assumed several possible scenarios, calculated the probability of crime, and visualized the results on a map to aid public understanding. Among the factors affecting crime occurrence identified in previous research and case studies, the following data were processed into a big data set for machine learning: real crime information, weather information (temperature, rainfall, wind speed, humidity, sunshine, insolation, snowfall, cloud cover) and local information (average building coverage, average floor area ratio, average building height, number of buildings, average appraised land value, average area of residential buildings, average number of ground floors). Among supervised machine learning algorithms, the decision tree, random forest, and SVM models, which are known to be powerful and accurate in various fields, were used to construct the crime prediction model. The decision tree model, which had the lowest RMSE, was selected as the optimal prediction model. Based on this model, several scenarios were set for theft and violence, the most frequent crimes in the case city J, and the probability of crime was estimated on a $250{\times}250m$ grid.
We found that high-crime-risk areas occur in three patterns in case city J. The probability of crime was divided into three classes and visualized on a map using the $250{\times}250m$ grid. In conclusion, we developed a crime prediction model using a machine learning algorithm and visualized the crime risk areas on a map; the model can be recalculated and the results visualized simultaneously as time and urban conditions change.
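
Two mechanical steps in this abstract, binning locations into a 250 m grid and choosing the model with the lowest RMSE, can be sketched as follows. The coordinate origin, the function names, and every number in the example are hypothetical and are not results from the study.

```python
import math

CELL_SIZE_M = 250  # grid resolution stated in the abstract

def grid_cell(x_m, y_m, origin=(0.0, 0.0), cell=CELL_SIZE_M):
    """Map projected coordinates in metres to a (column, row) cell index."""
    col = math.floor((x_m - origin[0]) / cell)
    row = math.floor((y_m - origin[1]) / cell)
    return col, row

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def select_model(y_true, predictions):
    """Return the model name with the lowest RMSE, plus all scores."""
    scores = {name: rmse(y_true, pred) for name, pred in predictions.items()}
    return min(scores, key=scores.get), scores

# Made-up values for illustration only; they are NOT taken from the paper.
observed = [0.0, 1.0, 0.0, 1.0]
candidate_predictions = {
    "decision_tree": [0.1, 0.9, 0.2, 0.8],
    "random_forest": [0.3, 0.7, 0.4, 0.6],
    "svm": [0.5, 0.5, 0.5, 0.5],
}
best_name, rmse_by_model = select_model(observed, candidate_predictions)
```

With these made-up predictions the decision tree happens to score best, which mirrors the abstract's outcome only by construction.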

A study on the rock mass classification in boreholes for a tunnel design using machine learning algorithms (머신러닝 기법을 활용한 터널 설계 시 시추공 내 암반분류에 관한 연구)

  • Lee, Je-Kyum;Choi, Won-Hyuk;Kim, Yangkyun;Lee, Sean Seungwon
    • Journal of Korean Tunnelling and Underground Space Association / v.23 no.6 / pp.469-484 / 2021
  • In tunnel design, rock mass classification results have a great influence on the construction schedule and budget as well as on tunnel stability. A total of 3,526 tunnels have been constructed in Korea, and the associated design and construction techniques have been continuously developed; however, few studies have addressed how to assess rock mass quality and grade more accurately. As a result, in numerous cases the results differ greatly depending on the inspectors' experience and judgement. This study therefore aims to propose a more reliable rock mass classification (RMR) model using machine learning algorithms, which are becoming increasingly available, based on analyses of various rock and rock mass information collected from boring investigations. To this end, 11 learning parameters (depth, rock type, RQD, electrical resistivity, UCS, Vp, Vs, Young's modulus, unit weight, Poisson's ratio, RMR) were selected from 13 local tunnel cases, 337 training data sets and 60 test data sets were prepared, and 6 machine learning algorithms (DT, SVM, ANN, PCA & ANN, RF, XGBoost) were tested with various hyperparameters for each algorithm. The results show that the mean absolute errors in the RMR value for the five algorithms other than Decision Tree were less than 8, and that the Support Vector Machine model performed best. The applicability of the model established in this study was confirmed, and this prediction model can be applied for more reliable rock mass classification as additional data continue to accumulate.

Improving the Accuracy of Document Classification by Learning Heterogeneity (이질성 학습을 통한 문서 분류의 정확성 향상 기법)

  • Wong, William Xiu Shun;Hyun, Yoonjin;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.21-44 / 2018
  • In recent years, the rapid development of internet technology and the popularization of smart devices have resulted in massive amounts of text data. These text data are produced and distributed through various media platforms such as the World Wide Web, Internet news feeds, microblogs, and social media. However, this enormous amount of easily obtained information lacks organization. This problem has attracted the interest of many researchers seeking to manage this huge amount of information. It also calls for professionals capable of classifying relevant information, and hence text classification was introduced. Text classification is a challenging task in modern data analysis, in which a text document must be assigned to one or more predefined categories or classes. Various techniques are available in the text classification field, such as K-Nearest Neighbor, Naïve Bayes, Support Vector Machine, Decision Tree, and Artificial Neural Network. However, when dealing with huge amounts of text data, model performance and accuracy become a challenge. The performance of a text classification model can vary according to the types of words used in the corpus and the types of features created for classification. Most previous attempts have been based on proposing a new algorithm or modifying an existing one, and this kind of research can be said to have reached its limits for further improvement. In this study, rather than proposing a new algorithm or modifying an existing one, we focus on finding a way to modify the use of data. It is widely known that classifier performance is influenced by the quality of the training data on which the classifier is built. Real-world datasets usually contain noise, and this noisy data can affect the decisions made by classifiers built from them.
In this study, we consider that data from different domains, that is, heterogeneous data, might have noise-like characteristics that can be utilized in the classification process. Machine learning algorithms build classifiers under the assumption that the characteristics of the training data and the target data are the same or very similar. However, in the case of unstructured data such as text, the features are determined by the vocabulary included in the documents. If the viewpoints of the training data and the target data differ, the features of the two may also differ. In this study, we attempt to improve classification accuracy by strengthening the robustness of the document classifier through artificially injecting noise into the process of constructing it. Data coming from various kinds of sources are likely to be formatted differently. This causes difficulties for traditional machine learning algorithms, because they were not developed to recognize different types of data representation at one time and to combine them in the same generalization. Therefore, in order to utilize heterogeneous data in the learning process of the document classifier, we apply semi-supervised learning. However, unlabeled data may degrade the performance of the document classifier. We therefore propose a method called the Rule Selection-Based Ensemble Semi-Supervised Learning Algorithm (RSESLA), which selects only the documents that contribute to improving the accuracy of the classifier. RSESLA creates multiple views by manipulating the features using different types of classification models and different types of heterogeneous data. The most confident classification rules are selected and applied for the final decision making.
In this paper, three different types of real-world data sources were used: news, Twitter, and blogs.
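
The idea of admitting only those unlabeled documents that are likely to help the classifier can be illustrated with a generic agreement-and-confidence filter. This is not RSESLA itself, whose rule selection is not specified in the abstract; the keyword-based "views", the confidence values, and the threshold are all hypothetical stand-ins.

```python
def select_pseudo_labels(unlabeled_docs, classifiers, threshold=0.8):
    """Keep documents on which every view agrees with high confidence.

    Each classifier is a callable returning a (label, confidence) pair.
    """
    accepted = []
    for doc in unlabeled_docs:
        votes = [clf(doc) for clf in classifiers]
        labels = {label for label, _ in votes}
        min_conf = min(conf for _, conf in votes)
        if len(labels) == 1 and min_conf >= threshold:
            accepted.append((doc, labels.pop()))
    return accepted

def keyword_view(keyword, label_if_present, label_if_absent):
    """Toy stand-in for a trained classifier (one 'view' of the data)."""
    def classify(doc):
        if keyword in doc.lower().split():
            return label_if_present, 0.9
        return label_if_absent, 0.6
    return classify

views = [
    keyword_view("match", "sports", "other"),
    keyword_view("goal", "sports", "other"),
]
docs = ["Late goal wins the match", "Stocks fall again", "Great match today"]
pseudo_labeled = select_pseudo_labels(docs, views)
```

Only the first document is accepted: the second is agreed on but with low confidence, and the third splits the two views. Accepted pairs would then be added to the training set before the classifier is refit.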

Research on Classification of Human Emotions Using EEG Signal (뇌파신호를 이용한 감정분류 연구)

  • Zubair, Muhammad;Kim, Jinsul;Yoon, Changwoo
    • Journal of Digital Contents Society / v.19 no.4 / pp.821-827 / 2018
  • Affective computing has gained increasing interest in recent years with the development of potential applications in human-computer interaction (HCI) and healthcare. Although considerable research has been done on human emotion recognition, physiological signals have received less attention than speech and facial expressions. In this paper, electroencephalogram (EEG) signals from different brain regions were investigated using modified wavelet energy features. The mRMR algorithm was employed to minimize redundancy and maximize relevance among the features. EEG recordings from the publicly available "DEAP" database were used to classify four classes of emotions with a multi-class Support Vector Machine. The proposed approach shows strong performance compared to existing algorithms.
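
The mRMR (minimum redundancy, maximum relevance) step can be sketched as a greedy selection that scores each candidate by its mutual information with the labels minus its average mutual information with the features already chosen. The sketch assumes discretized features; the paper's wavelet energy features are continuous and would need binning or a continuous estimator. The feature names and values are invented.

```python
from collections import Counter
from math import log

def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def mrmr(features, labels, k):
    """Greedily pick k feature names by relevance minus mean redundancy."""
    relevance = {f: mutual_information(v, labels) for f, v in features.items()}
    selected, remaining = [], set(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(sorted(remaining), key=score)  # sorted for determinism
        selected.append(best)
        remaining.remove(best)
    return selected

# Invented, already-discretized features for eight trials.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "f1": [0, 0, 0, 0, 1, 1, 1, 0],       # strongly related to the labels
    "f1_copy": [0, 0, 0, 0, 1, 1, 1, 0],  # exact duplicate of f1
    "f2": [0, 0, 1, 0, 1, 0, 1, 1],       # weaker, but not redundant with f1
}
chosen = mrmr(features, labels, k=2)
```

After `f1` is chosen, its duplicate is penalized by the redundancy term, so the weaker but complementary `f2` is selected second.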

A Novel Hyperspectral Microscopic Imaging System for Evaluating Fresh Degree of Pork

  • Xu, Yi;Chen, Quansheng;Liu, Yan;Sun, Xin;Huang, Qiping;Ouyang, Qin;Zhao, Jiewen
    • Food Science of Animal Resources / v.38 no.2 / pp.362-375 / 2018
  • This study proposes a rapid microscopic examination method for evaluating pork freshness using a self-assembled hyperspectral microscopic imaging (HMI) system together with feature extraction and pattern recognition methods. Pork samples were stored for 0 to 5 days, and sample freshness was divided into three levels determined by total volatile basic nitrogen (TVB-N) content. Hyperspectral microscopic images of the samples were acquired by the HMI system and processed in the following steps. First, characteristic hyperspectral microscopic images were extracted using principal component analysis (PCA), and texture features were then selected based on the gray level co-occurrence matrix (GLCM). Next, the dimensionality of the feature data was reduced by Fisher discriminant analysis (FDA) before building the classification model. Finally, compared with the linear discriminant analysis (LDA) model and the support vector machine (SVM) model, the back-propagation artificial neural network (BP-ANN) model achieved the best freshness classification, with 100% accuracy based on the extracted data. The results confirm that the fabricated HMI system combined with multivariate algorithms can accurately evaluate the freshness of pork at the microscopic level, which plays an important role in animal food quality control.
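
The GLCM texture step can be illustrated on a tiny image. The sketch below builds a symmetric, normalized co-occurrence matrix for a single pixel offset and derives three common texture descriptors. The abstract does not say which offsets, gray-level quantization, or descriptors were used, so those choices and the 4x4 patch are assumptions.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric gray level co-occurrence matrix for one offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Contrast, angular second moment, and homogeneity of a GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "angular_second_moment": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }

# Hypothetical 4x4 patch quantized to four gray levels (0..3).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(patch, levels=4)
features = texture_features(p)
```

In practice such descriptors are usually computed for several offsets and angles and concatenated into one feature vector before dimensionality reduction.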

Decision Making Support System for VTSO using Extracted Ships' Tracks (항적모델 추출을 통한 해상교통관제사 의사결정 지원 방안)

  • Kim, Joo-Sung;Jeong, Jung Sik;Jeong, Jae-Yong;Kim, Yun Ha;Choi, Ikhwan;Kim, Jinhan
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / 2015.07a / pp.310-311 / 2015
  • Ships' tracking data are monitored and collected by vessel traffic service centers in real time. In this paper, we aim to support vessel traffic service operators' decision making by extracting ships' tracking patterns and models from these data. A Support Vector Machine algorithm was used for vessel track modeling to handle and process the data sets, and k-fold cross-validation was used to select appropriate parameters. The proposed data processing methods could support vessel traffic service operators' decision making in tasks such as anomaly detection and the calculation of ships' dead reckoning positions.
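
Parameter selection by k-fold cross-validation can be sketched independently of the model being tuned. In the sketch below the SVM is replaced by a deliberately simple nearest-neighbour average over a made-up track, so only the fold construction and the selection loop are meant to carry over. The function names, the candidate values, and the data are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, validation) index lists for k contiguous folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size

def select_parameter(candidates, n_samples, k, fit_and_score):
    """Return the candidate with the lowest mean validation error."""
    mean_error = {}
    for value in candidates:
        errors = [
            fit_and_score(value, train, val)
            for train, val in k_fold_indices(n_samples, k)
        ]
        mean_error[value] = sum(errors) / len(errors)
    return min(mean_error, key=mean_error.get), mean_error

# Made-up track: position as a function of time (stand-in for real AIS data).
times = list(range(12))
positions = [0.5 * t for t in times]

def neighbour_average_error(n_neighbours, train, val):
    """Stand-in model (NOT an SVM): average the nearest training positions."""
    total = 0.0
    for i in val:
        nearest = sorted(train, key=lambda j: (abs(times[j] - times[i]), j))
        used = nearest[:n_neighbours]
        prediction = sum(positions[j] for j in used) / len(used)
        total += abs(prediction - positions[i])
    return total / len(val)

best_value, errors_by_value = select_parameter(
    [1, 2, 3], len(times), 4, neighbour_average_error
)
```

For time-ordered track data, contiguous folds like these avoid mixing neighbouring samples between training and validation; shuffled folds would be an alternative design choice.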

An Analytical Study on Automatic Classification of Domestic Journal articles Based on Machine Learning (기계학습에 기초한 국내 학술지 논문의 자동분류에 관한 연구)

  • Kim, Pan Jun
    • Journal of the Korean Society for Information Management / v.35 no.2 / pp.37-62 / 2018
  • This study examined the factors affecting the performance of machine-learning-based automatic classification of domestic journal articles in the field of LIS. In particular, with respect to the performance of automatically assigning class labels to articles in the "Journal of the Korean Society for Information Management", I investigated the characteristics of the key factors (weighting schemes, training set size, classification algorithms, label assigning methods) through diversified experiments. The results show that it is effective to apply each element appropriately according to the classification environment and the characteristics of the document set, and that fairly good performance can be obtained using a simpler model. In addition, the classification of domestic journals can be regarded as a multi-label classification that assigns more than one category to a specific article. Therefore, considering this environment, I proposed an optimal classification model that uses a simple and fast classification algorithm and a small training set.
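
Two of the factors named in this abstract, the term weighting scheme and the label assigning method, can be illustrated with a small sketch. TF-IDF is used here only as one familiar weighting scheme and a score threshold as one possible multi-label rule. The abstract does not state which schemes or rules were compared, and the tokens, category names, and scores are invented.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Weight each term by relative frequency times log inverse document frequency."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

def assign_labels(scores, threshold=0.5):
    """Multi-label rule: keep every category at or above the threshold.

    Falls back to the single best category when none reaches it.
    """
    labels = sorted(c for c, s in scores.items() if s >= threshold)
    return labels or [max(scores, key=scores.get)]

# Invented tokenized abstracts and category scores.
docs = [
    ["library", "classification", "library"],
    ["classification", "retrieval"],
    ["classification", "metadata"],
]
weights = tf_idf(docs)
labels = assign_labels({"retrieval": 0.7, "classification": 0.6, "metrics": 0.1})
```

A term that occurs in every document, such as "classification" here, receives zero weight, which is why the choice of weighting scheme interacts with how homogeneous the document set is.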

Prediction Model for the Cellular Immortalization and Transformation Potentials of Cell Substrates

  • Lee, Min-Su;Matthews Clayton A.;Chae Min-Ju;Choi, Jung-Yun;Sohn Yeo-Won;Kim, Min-Jung;Lee, Su-Jae;Park, Woong-Yang
    • Genomics & Informatics / v.4 no.4 / pp.161-166 / 2006
  • The establishment of DNA microarray technology has enabled high-throughput analysis and molecular profiling of various types of cancers. Using gene expression data from microarray analysis, we are able to investigate diagnostic applications at the molecular level. The most important step in applying microarray technology to cancer diagnostics is the selection of specific markers from gene expression profiles. To select markers of immortalization and transformation, we used c-myc and $H-ras^{V12}$ oncogene-transfected NIH3T3 cells as our model system. We identified 8751 differentially expressed genes in the immortalization/transformation model by a multivariate permutation F-test (95% confidence, FDR<0.01). Using the support vector machine algorithm, we selected 13 discriminative genes that could be used to predict immortalization and transformation with perfect accuracy. We assayed $H-ras^{V12}$-transfected 'transformed' cells to validate our immortalization/transformation classification system. The selected molecular markers generated valuable additional information for tumor diagnosis, prognosis and therapy development.
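
The permutation idea behind the differential-expression test can be sketched for a single gene. This is a simplified univariate version: the multivariate permutation F-test used in the paper additionally controls false discoveries across all genes, which is not reproduced here. The group names, the expression values, and the number of permutations are hypothetical.

```python
import random

def f_statistic(values, groups):
    """One-way ANOVA F statistic for values split by group label."""
    labels = sorted(set(groups))
    n, k = len(values), len(labels)
    grand_mean = sum(values) / n
    between = within = 0.0
    for g in labels:
        members = [v for v, lab in zip(values, groups) if lab == g]
        mean_g = sum(members) / len(members)
        between += len(members) * (mean_g - grand_mean) ** 2
        within += sum((v - mean_g) ** 2 for v in members)
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))

def permutation_p_value(values, groups, n_perm=2000, seed=0):
    """Share of label shufflings whose F is at least the observed F."""
    rng = random.Random(seed)
    observed = f_statistic(values, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if f_statistic(values, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0

# Invented expression values for one gene across three hypothetical groups.
groups = ["control"] * 3 + ["immortalized"] * 3 + ["transformed"] * 3
expression = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9, 9.0, 9.1, 8.9]
p_value = permutation_p_value(expression, groups)
```

Repeating this over thousands of genes requires a multiple-testing procedure before any gene is called differentially expressed; the per-gene p-value alone is not sufficient.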

A Product Quality Prediction Model Using Real-Time Process Monitoring in Manufacturing Supply Chain (실시간 공정 모니터링을 통한 제품 품질 예측 모델 개발)

  • Oh, YeongGwang;Park, Haeseung;Yoo, Arm;Kim, Namhun;Kim, Younghak;Kim, Dongchul;Choi, JinUk;Yoon, Sung Ho;Yang, HeeJong
    • Journal of Korean Institute of Industrial Engineers / v.39 no.4 / pp.271-277 / 2013
  • Despite the emphasis on quality control in the auto industry, most subcontracting enterprises still lack a systematic in-process quality monitoring system for predicting product and part quality for their customers. Although their manufacturing processes have become increasingly automated and computer-controlled, many uncertain parameters remain, and process control still relies on the empirical work of a few skilled operators and quality experts. In this paper, a real-time product quality monitoring system for the auto-manufacturing industry is presented to provide a systematic method of predicting product quality from real-time production data. The proposed framework consists of a product quality ontology model for complex manufacturing supply chain environments and a real-time quality prediction tool using the support vector machine algorithm, which enables the quality monitoring system to classify product quality patterns from in-process production data. A door trim production example is presented to verify the proposed quality prediction model.
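
To make the classification step concrete, here is a minimal linear SVM trained by sub-gradient descent on the regularized hinge loss. It is a sketch under stated assumptions, not the tool described in the paper: the abstract does not specify the kernel, solver, or process features, so the two standardized "process deviation" features, the learning rate, and the regularization strength are all hypothetical.

```python
def train_linear_svm(samples, labels, lr=0.1, reg=0.01, epochs=100):
    """Fit weights w and bias b by per-sample sub-gradient steps.

    labels must be +1 or -1; reg is the L2 regularization strength.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Inside the margin or misclassified: hinge term is active.
                w = [wi - lr * (reg * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Correct with enough margin: only the regularizer acts.
                w = [wi - lr * reg * wi for wi in w]
    return w, b

def predict(w, b, x):
    """Return +1 (acceptable) or -1 (defective) for one sample."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical standardized process readings; +1 = acceptable, -1 = defective.
readings = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-2, -3)]
quality = [1, 1, 1, -1, -1, -1]
weights, bias = train_linear_svm(readings, quality)
predicted = [predict(weights, bias, x) for x in readings]
```

New in-process readings would be standardized the same way and passed to `predict`. Data that are not linearly separable would call for a kernel SVM from an established library instead.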