• Title/Abstract/Keyword: SVM (Support Vector Machine)


A Method for Prediction of Quality Defects in Manufacturing Using Natural Language Processing and Machine Learning (자연어 처리 및 기계학습을 활용한 제조업 현장의 품질 불량 예측 방법론)

  • Roh, Jeong-Min; Kim, Yongsung
    • Journal of Platform Technology / v.9 no.3 / pp.52-62 / 2021
  • Quality control is critical at manufacturing sites and is key to predicting the risk of quality defects before manufacturing. However, the reliability of manual quality control methods is limited by human and physical constraints, because manufacturing processes vary across industries. These limitations become particularly obvious in domains with numerous manufacturing processes, such as the manufacture of major nuclear equipment. This study proposes a novel method for predicting the risk of quality defects using natural language processing and machine learning. Production data collected over six years at a factory that manufactures main equipment installed in nuclear power plants were used. In the text preprocessing stage, a mapping method was applied to the word dictionary so that domain knowledge could be appropriately reflected, and a hybrid algorithm combining n-grams, Term Frequency-Inverse Document Frequency (TF-IDF), and Singular Value Decomposition (SVD) was constructed for sentence vectorization. Next, in the experiment to classify the risky processes resulting in poor quality, k-fold cross-validation was applied to cases ranging from unigrams to cumulative trigrams. For objective experimental results, Naive Bayes and Support Vector Machine were used as classification algorithms, achieving a maximum accuracy of 0.7685 and a maximum F1-score of 0.8641, which shows that the proposed method is effective. The performance of the proposed method was also compared with the votes of field engineers, and the results revealed that the proposed method outperformed the engineers. Thus, the method can be implemented for quality control at manufacturing sites.
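The hybrid vectorization step described above could be sketched as follows, assuming scikit-learn; the documents, labels, and component count here are placeholders rather than the paper's actual data or settings:

```python
# Minimal sketch of an n-gram TF-IDF + SVD vectorization pipeline feeding an
# SVM classifier; all inputs below are toy placeholders.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

docs = [
    "weld seam deviation noted during fit-up",
    "heat treatment completed within tolerance",
] * 5                      # stand-ins for real production records
labels = [1, 0] * 5        # 1 = process later linked to a quality defect

pipeline = Pipeline([
    # cumulative n-grams: unigram through trigram
    ("tfidf", TfidfVectorizer(ngram_range=(1, 3))),
    # dense low-rank representation via truncated SVD (latent semantic analysis)
    ("svd", TruncatedSVD(n_components=2)),
    ("svm", SVC(kernel="rbf")),
])

# k-fold cross-validation over the vectorized sentences
scores = cross_val_score(pipeline, docs, labels, cv=5, scoring="f1")
```

On real production records, `n_components` would typically be raised to a few hundred.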

Development on Early Warning System about Technology Leakage of Small and Medium Enterprises (중소기업 기술 유출에 대한 조기경보시스템 개발에 대한 연구)

  • Seo, Bong-Goon; Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.23 no.1 / pp.143-159 / 2017
  • Due to the rapid development of IT in recent years, the leakage not only of personal information but also of the key technologies and information that companies hold has become an important issue. The core technology a company possesses is vital to its survival and to sustaining its competitive advantage, and there have recently been many cases of technology infringement. Technology leaks not only cause tremendous financial losses, such as falling stock prices, but also damage corporate reputation and delay corporate development. For SMEs, where core technology is a more important part of the enterprise than in large corporations, preparing for technology leakage is an indispensable condition of survival. As the necessity and importance of Information Security Management (ISM) emerge, enterprises need to check for and prepare against the threat of technology infringement early. Nevertheless, prior studies are dominated by policy alternatives, which account for about 90% of them; as for research methods, literature analysis accounts for 76%, while empirical and statistical analyses account for a relatively low 16%. For this reason, management models and prediction models that prevent technology leakage while fitting the characteristics of SMEs need to be studied. In this study, before the empirical analysis, we drew on prior research on the factors affecting technology leakage to define technology characteristics from the technology-value perspective and organizational characteristics from the technology-control perspective. A total of 12 related variables were selected across the two factors, and the analysis was performed with these variables. We used three years of data from the "Small and Medium Enterprise Technical Statistics Survey" conducted by the Small and Medium Business Administration (SMBA). The data cover 30 industries based on the 2-digit KSIC classification, and the number of companies affected by technology leakage over the three years is 415. From these data we drew a random sample from the same KSIC industry in the same year, preparing 1:1 matched samples of affected companies (n = 415) and unaffected companies (n = 415) for analysis. We conducted an empirical analysis to identify the factors influencing technology leakage and propose an early warning system built through data mining. Specifically, based on the SMBA survey, we classified the factors affecting the technology leakage of SMEs into two groups (technology characteristics and organization characteristics), and we propose a model that signals the possibility of technology infringement using a Support Vector Machine (SVM), one of several data-mining techniques, built on the factors validated through statistical analysis. Unlike previous studies, this study covers cases from various industries over multiple years, and an artificial intelligence model was developed in the process. In addition, since the factors are derived empirically from actual cases of SME technology leakage, the results can suggest to policy makers which companies should be managed from the viewpoint of technology protection. Finally, the early warning model on the possibility of technology leakage proposed in this study is expected to give enterprises and the government an opportunity to prevent technology leakage in advance.
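As a rough illustration of the proposed early-warning idea, the following sketch trains an SVM over 12 factor variables and reads the leakage-class probability as a risk score; the data and features are synthetic stand-ins, not the survey variables used in the paper:

```python
# Hedged sketch of an SVM-based early-warning classifier over tabular
# technology/organization factors; all data below are synthetic placeholders.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(830, 12))   # 12 factor variables, 415 + 415 matched firms
y = np.repeat([1, 0], 415)       # 1 = technology leakage occurred

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
model.fit(X_tr, y_tr)

# "early warning": probability that a firm belongs to the leakage class
risk = model.predict_proba(X_te)[:, 1]
```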

Filter-Bank Based Regularized Common Spatial Pattern for Classification of Motor Imagery EEG (동작 상상 EEG 분류를 위한 필터 뱅크 기반 정규화 공통 공간 패턴)

  • Park, Sang-Hoon; Kim, Ha-Young; Lee, David; Lee, Sang-Goog
    • Journal of KIISE / v.44 no.6 / pp.587-594 / 2017
  • Recently, motor imagery electroencephalogram (EEG) based Brain-Computer Interface (BCI) systems have received a significant amount of attention in various fields, including medicine and engineering. The Common Spatial Pattern (CSP) algorithm is the most commonly used method for extracting features from motor imagery EEG. However, CSP has limited applicability in Small-Sample Setting (SSS) situations because it relies on covariance matrix estimation, and its performance varies widely with the frequency band used. To address these problems, the 4-40 Hz EEG signal is divided using nine filter banks and Regularized CSP (R-CSP) is applied to the individual frequency bands. The Mutual Information-Based Individual Feature (MIBIF) algorithm is then applied to the R-CSP features to select discriminative ones, which are used as inputs to a Least Squares Support Vector Machine (LS-SVM) classifier. The proposed method yielded classification accuracies of 87.5%, 100%, 63.78%, 82.14%, and 86.11% for the five subjects ("aa", "al", "av", "aw", and "ay", respectively) of BCI competition III dataset IVa, using 18 channels in the vicinity of the motor area of the cerebral cortex. It improved the mean classification accuracy by 16.21%, 10.77%, and 3.32% over CSP, R-CSP, and FBCSP, respectively, and performs particularly well in the SSS situation.
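A simplified sketch of the filter-bank front end and a shrinkage-style regularized CSP is shown below (numpy/scipy assumed). It uses identity shrinkage only and is not the paper's exact R-CSP formulation:

```python
# Filter bank + shrinkage-regularized CSP sketch; simplified, illustrative only.
import numpy as np
from scipy.linalg import eigh
from scipy.signal import butter, filtfilt

def bandpass(x, low, high, fs, order=4):
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, x, axis=-1)

def avg_cov(trials):
    # trials: (n_trials, n_channels, n_samples); trace-normalized covariance
    return np.mean([t @ t.T / np.trace(t @ t.T) for t in trials], axis=0)

def regularized_csp(X1, X2, gamma=0.1, n_pairs=2):
    C1, C2 = avg_cov(X1), avg_cov(X2)
    n = C1.shape[0]
    # shrink each class covariance toward the identity (simple R-CSP flavor)
    def shrink(C):
        return (1 - gamma) * C + gamma * np.trace(C) / n * np.eye(n)
    C1, C2 = shrink(C1), shrink(C2)
    # generalized eigendecomposition; extreme eigenvectors give the filters
    w, V = eigh(C1, C1 + C2)
    order = np.argsort(w)
    picks = np.r_[order[:n_pairs], order[-n_pairs:]]
    return V[:, picks].T  # spatial filters, shape (2*n_pairs, n_channels)

def log_var_features(W, trials):
    Z = np.einsum("fc,tcs->tfs", W, trials)
    var = Z.var(axis=-1)
    return np.log(var / var.sum(axis=1, keepdims=True))

# nine 4 Hz-wide bands covering 4-40 Hz, one R-CSP per band
bands = [(4 + 4 * i, 8 + 4 * i) for i in range(9)]
```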

Face Identification Using a Near-Infrared Camera in a Nonrestrictive In-Vehicle Environment (적외선 카메라를 이용한 비제약적 환경에서의 얼굴 인증)

  • Ki, Min Song; Choi, Yeong Woo
    • KIPS Transactions on Software and Data Engineering / v.10 no.3 / pp.99-108 / 2021
  • The driver's face inside a vehicle is subject to unconstrained conditions such as changes in lighting, partial occlusion, and various changes in the driver's state. In this paper, we propose a face identification system for this unconstrained vehicle environment. The proposed method uses a near-infrared (NIR) camera to minimize the changes in facial images caused by illumination changes inside and outside the vehicle. To handle faces exposed to extreme light, normal face images are converted into simulated overexposed images using their mean and variance for training, so that facial classifiers are generated for both normal and extreme illumination conditions. Our method identifies a face by detecting facial landmarks and aggregating the confidence score of each landmark for the final decision. The performance improvement is largest for drivers wearing glasses or sunglasses, owing to the robustness to partial occlusion gained by recognizing each landmark: the driver can still be recognized using the scores of the remaining visible landmarks. We also propose a robust rejection scheme and a new evaluation method that considers the relations between registered and unregistered drivers. Experimental results on our dataset and on the PolyU and ORL datasets demonstrate the effectiveness of the proposed method.
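The mean-and-variance overexposure simulation might look roughly like this sketch; the target statistics are illustrative assumptions, not values from the paper:

```python
# Shift an image's mean/variance toward an overexposed regime for training.
import numpy as np

def simulate_overexposure(img, target_mean=200.0, target_std=30.0):
    # target_mean/target_std are hypothetical parameters; standardize the
    # image, re-scale to the overexposed statistics, and clip to valid range
    img = img.astype(np.float32)
    std = img.std() + 1e-8
    out = (img - img.mean()) / std * target_std + target_mean
    return np.clip(out, 0, 255).astype(np.uint8)
```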

PVC Classification based on QRS Pattern using QS Interval and R Wave Amplitude (QRS 패턴에 의한 QS 간격과 R파의 진폭을 이용한 조기심실수축 분류)

  • Cho, Ik-Sung; Kwon, Hyeog-Soong
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.4 / pp.825-832 / 2014
  • Previous works on arrhythmia detection have mostly used nonlinear methods such as artificial neural networks, fuzzy theory, and support vector machines to increase classification accuracy. Most of these methods require accurate detection of the P-QRS-T points and incur high computational cost and long processing time, while methods with low complexity generally suffer from low sensitivity. It is also difficult to detect PVC accurately because QRS patterns vary with individual differences between persons. Therefore, an efficient algorithm is needed that classifies PVC from the QRS pattern in real time and reduces computational cost by extracting minimal features. In this paper, we propose PVC classification based on the QRS pattern using the QS interval and R-wave amplitude. For this purpose, we detected the R wave, RR interval, and QRS pattern from the noise-free ECG signal obtained through a preprocessing step, and classified PVC in real time from the QS interval and R-wave amplitude. The performance of R-wave detection and PVC classification was evaluated on 9 records of the MIT-BIH arrhythmia database containing over 30 PVCs. The achieved scores indicate an average of 99.02% for R-wave detection and 93.72% for PVC classification.
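As a rough illustration, a rule-based PVC decision over the two features could look like the following; all thresholds here are hypothetical placeholders, not the paper's tuned values:

```python
# Rule-based PVC check from QS interval and R-wave amplitude; thresholds are
# invented for illustration.
def is_pvc(qs_interval_ms, r_amplitude_mv, rr_ratio,
           qs_thresh=120.0, amp_thresh=1.2, rr_thresh=0.9):
    # PVCs typically show a widened QRS complex (long QS interval), an
    # abnormal R-wave amplitude, and a premature (shortened) RR interval
    widened = qs_interval_ms > qs_thresh
    abnormal_amp = r_amplitude_mv > amp_thresh
    premature = rr_ratio < rr_thresh  # current RR / running-average RR
    return widened and abnormal_amp and premature
```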

Arrhythmia Classification based on Binary Coding using QRS Feature Variability (QRS 특징점 변화에 따른 바이너리 코딩 기반의 부정맥 분류)

  • Cho, Ik-Sung; Kwon, Hyeog-Soong
    • Journal of the Korea Institute of Information and Communication Engineering / v.17 no.8 / pp.1947-1954 / 2013
  • Previous works on arrhythmia detection have mostly used nonlinear methods such as artificial neural networks, fuzzy theory, and support vector machines to increase classification accuracy. Most of these methods require accurate detection of the P-QRS-T points and incur high computational cost and long processing time, yet the P and T waves are difficult to detect because of individual differences between persons. Therefore, an efficient algorithm is needed that classifies different arrhythmias in real time and reduces computational cost by extracting minimal features. In this paper, we propose arrhythmia detection based on binary coding using QRS feature variability. For this purpose, we detected the R wave, RR interval, and QRS width from the noise-free ECG signal obtained through a preprocessing step, and classified arrhythmias in real time by converting the variability of each feature relative to its threshold into a binary code. Classification of PVC, PAC, Normal, BBB, and Paced beats was evaluated on 39 records of the MIT-BIH arrhythmia database. The achieved scores indicate averages of 97.18%, 94.14%, 99.83%, 92.77%, and 97.48% for PVC, PAC, Normal, BBB, and Paced beat classification, respectively.
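A hedged sketch of the binary-coding idea is given below; the tolerances and the code-to-beat mapping are invented for illustration and do not reproduce the paper's actual encoding:

```python
# Encode per-beat feature deviations from a running baseline as a 3-bit code,
# then look the code up in a beat-type table; all values are illustrative.
def binary_code(r_amp, rr_int, qrs_width, baseline):
    # each bit is 1 if the feature deviates from its baseline beyond a
    # (hypothetical) tolerance, else 0
    return (
        int(abs(r_amp - baseline["r_amp"]) > 0.3 * baseline["r_amp"]),
        int(abs(rr_int - baseline["rr_int"]) > 0.15 * baseline["rr_int"]),
        int(abs(qrs_width - baseline["qrs_width"]) > 0.2 * baseline["qrs_width"]),
    )

# hypothetical lookup from 3-bit code to beat label
CODE_TO_BEAT = {
    (0, 0, 0): "Normal",
    (1, 1, 1): "PVC",
    (0, 1, 0): "PAC",
    (0, 0, 1): "BBB",
}

def classify_beat(r_amp, rr_int, qrs_width, baseline):
    code = binary_code(r_amp, rr_int, qrs_width, baseline)
    return CODE_TO_BEAT.get(code, "Unclassified")
```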

Experimental Comparison of Network Intrusion Detection Models Solving Imbalanced Data Problem (데이터의 불균형성을 제거한 네트워크 침입 탐지 모델 비교 분석)

  • Lee, Jong-Hwa; Bang, Jiwon; Kim, Jong-Wouk; Choi, Mi-Jung
    • KNOM Review / v.23 no.2 / pp.18-28 / 2020
  • With the development of the virtual community, the benefits that IT provides in fields such as healthcare, industry, communication, and culture are increasing, and the quality of life is improving accordingly. At the same time, various malicious attacks target this developed network environment. Firewalls and intrusion detection systems exist to detect such attacks in advance, but they are limited in detecting malicious attacks that evolve day by day. To address this problem, intrusion detection research using machine learning is being actively conducted, but false positives and false negatives occur due to imbalance in the training dataset. In this paper, the Random Oversampling method is used to solve the imbalance problem of the UNSW-NB15 dataset used for network intrusion detection. Through experiments, we compare and analyze the accuracy, precision, recall, F1-score, training and prediction time, and hardware resource consumption of the models. Building on this study of the Random Oversampling method, we plan to develop more efficient network intrusion detection models using other techniques and high-performance models that can solve the imbalanced data problem.
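A minimal sketch of the oversampling step, assuming the imbalanced-learn package; the synthetic data stands in for the preprocessed UNSW-NB15 features:

```python
# Random oversampling of the minority class before training; the classifier
# choice here is illustrative, not the paper's full model set.
from imblearn.over_sampling import RandomOverSampler
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# synthetic imbalanced data standing in for UNSW-NB15 (95/5 class split)
X, y = make_classification(n_samples=2000, weights=[0.95], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# duplicate minority-class samples until the classes are balanced;
# oversample only the training split to avoid leaking into evaluation
ros = RandomOverSampler(random_state=0)
X_bal, y_bal = ros.fit_resample(X_tr, y_tr)

clf = RandomForestClassifier(random_state=0).fit(X_bal, y_bal)
print(classification_report(y_te, clf.predict(X_te)))
```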

Predicting Crime Risky Area Using Machine Learning (머신러닝기반 범죄발생 위험지역 예측)

  • HEO, Sun-Young; KIM, Ju-Young; MOON, Tae-Heon
    • Journal of the Korean Association of Geographic Information Studies / v.21 no.4 / pp.64-80 / 2018
  • In Korea, citizens have access only to general information about crime, so it is difficult for them to know how much they are exposed to it. If the police could predict crime-risky areas, they could cope with crime efficiently even with insufficient police and enforcement resources. However, there is no such prediction system in Korea, and related research is scarce. Against this background, the final goal of this study is to develop an automated crime prediction system. As a first step, we built a big-data set consisting of local real crime information and urban physical and non-physical data, developed a crime prediction model through machine learning, and finally assumed several possible scenarios, calculated the probability of crime, and visualized the results on a map to aid people's understanding. Among the factors affecting crime occurrence identified in previous and case studies, the following were processed into a machine learning dataset: real crime information, weather information (temperature, rainfall, wind speed, humidity, sunshine, insolation, snowfall, cloud cover), and local information (average building coverage, average floor area ratio, average building height, number of buildings, average appraised land value, average area of residential buildings, average number of ground floors). Among supervised machine learning algorithms, the decision tree, random forest, and SVM models, known to be powerful and accurate in various fields, were used to construct the crime prediction model. The decision tree model, which had the lowest RMSE, was selected as the optimal prediction model. Based on this model, several scenarios were set for theft and violence, the most frequent crimes in the case city J, and the probability of crime was estimated on a 250 m × 250 m grid. We found that high crime-risk areas occur in three patterns in city J. The probability of crime was divided into three classes and visualized on the map by the 250 m × 250 m grid. In conclusion, we developed a crime prediction model using a machine learning algorithm and visualized the crime-risky areas on a map that can recalculate the model and visualize the results as time and urban conditions change.
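The model-selection step could be sketched as follows with scikit-learn; the synthetic regression data stands in for the weather and urban features:

```python
# Fit decision tree, random forest, and SVM regressors and pick the one with
# the lowest RMSE; data and hyperparameters are illustrative placeholders.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=1000, n_features=15, noise=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(random_state=0),
    "svm": SVR(),
}
rmse = {name: mean_squared_error(y_te, m.fit(X_tr, y_tr).predict(X_te)) ** 0.5
        for name, m in models.items()}
best = min(rmse, key=rmse.get)  # lowest RMSE wins, as in the paper's selection
```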

A Study on the Effect of Network Centralities on Recommendation Performance (네트워크 중심성 척도가 추천 성능에 미치는 영향에 대한 연구)

  • Lee, Dongwon
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.23-46 / 2021
  • Collaborative filtering, often used in personalized recommendation, is recognized as a very useful technique for finding similar customers and recommending products to them based on their purchase history. However, the traditional collaborative filtering technique has difficulty calculating similarities for new customers or products, because it computes similarity from direct connections and common features among customers. For this reason, hybrid techniques were designed that also use content-based filtering. In parallel, efforts have been made to solve these problems by applying the structural characteristics of social networks: similarity is calculated indirectly through the similar customers placed between two customers. This means creating a customer network based on purchase data and calculating the similarity between two customers from the features of the network that indirectly connects them. Such similarity can be used to predict whether the target customer will accept a recommendation, and the centrality metrics of the network can be used to calculate it. Different centrality metrics may have different effects on recommendation performance, and this study further examines whether the effect of these metrics varies across recommender algorithms. In addition, recommendation techniques using network analysis can be expected to increase recommendation performance not only for new customers or products but also for existing ones. By treating a customer's purchase of an item as a link between the customer and the item in the network, predicting user acceptance of a recommendation becomes predicting whether a new link will be created. Since classification models fit this binary link-prediction problem, decision tree, k-nearest neighbors (KNN), logistic regression, artificial neural network, and support vector machine (SVM) models were selected for the research. The data for performance evaluation were order data collected from an online shopping mall over four years and two months; the first three years and eight months were used to construct the social network, and the following four months' records were used to train and evaluate the recommender models. Experiments applying the centrality metrics to each model show that the recommendation acceptance rates of the metrics differ significantly across algorithms. We analyzed the four most commonly used centrality metrics: degree centrality, betweenness centrality, closeness centrality, and eigenvector centrality. Eigenvector centrality records the lowest performance in all models except the support vector machine. Closeness centrality and betweenness centrality show similar performance across all models. Degree centrality ranks in the middle across models, while betweenness centrality always ranks higher than degree centrality. Finally, closeness centrality shows distinct performance differences depending on the model: it ranks first with numerically high performance in logistic regression, artificial neural network, and decision tree, but records very low rankings with low performance in the support vector machine and k-nearest neighbors. As the experimental results reveal, in a classification model, network centrality metrics over a subnetwork connecting two nodes can effectively predict the connectivity between those nodes in a social network, and each metric performs differently depending on the classification model type. This implies that choosing appropriate metrics for each algorithm can lead to higher recommendation performance. In general, betweenness centrality can guarantee a high level of performance in any model, and introducing closeness centrality can be considered to obtain higher performance for certain models.
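Computing the four centrality metrics on a purchase network is straightforward with networkx; the toy edge list below stands in for the real order data:

```python
# Build a customer-item purchase graph and compute the four centrality
# metrics studied above; node names and edges are toy placeholders.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("cust_1", "item_a"), ("cust_1", "item_b"),  # customer-item purchase links
    ("cust_2", "item_b"), ("cust_2", "item_c"),
    ("cust_3", "item_c"),
])

centralities = {
    "degree": nx.degree_centrality(G),
    "betweenness": nx.betweenness_centrality(G),
    "closeness": nx.closeness_centrality(G),
    "eigenvector": nx.eigenvector_centrality(G, max_iter=1000),
}
# per-node scores can then be joined into the feature table of a link
# classifier (decision tree, KNN, logistic regression, ANN, or SVM)
```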

Vehicle Detection and Tracking using Billboard Sweep Stereo Matching Algorithm (빌보드 스윕 스테레오 시차정합 알고리즘을 이용한 차량 검출 및 추적)

  • Park, Min Woo; Won, Kwang Hee; Jung, Soon Ki
    • Journal of Korea Multimedia Society / v.16 no.6 / pp.764-781 / 2013
  • In this paper, we propose a highly precise vehicle detection method with a low false alarm rate using billboard sweep stereo matching and multi-stage hypothesis generation. First, we capture stereo images from cameras mounted in front of the vehicle and obtain a disparity map in which the ground plane and background regions are removed using the billboard sweep stereo matching algorithm. We then perform vehicle detection and tracking on the labeled disparity map in three steps. In the learning step, an SVM (support vector machine) classifier is trained on features extracted with Gabor filters. The second step, vehicle detection, performs Sobel edge detection on the left camera image and extracts vehicle candidates using the edge image and the billboard sweep stereo disparity map. The final step is vehicle tracking using template matching in the next frame. A removal process for the tracking regions improves system performance on the vehicle candidate regions in succeeding frames.
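The learning step could be sketched roughly as below, pairing OpenCV Gabor responses with a scikit-learn SVM; the kernel parameters and training patches are illustrative assumptions:

```python
# Gabor filter responses as features for an SVM vehicle classifier; kernel
# parameters and random patches are placeholders, not the paper's settings.
import cv2
import numpy as np
from sklearn.svm import SVC

def gabor_features(patch, thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    # filter a grayscale patch with Gabor kernels at several orientations
    # and summarize each response by its mean and standard deviation
    feats = []
    for theta in thetas:
        kern = cv2.getGaborKernel((21, 21), sigma=4.0, theta=theta,
                                  lambd=10.0, gamma=0.5, psi=0)
        resp = cv2.filter2D(patch, cv2.CV_32F, kern)
        feats += [resp.mean(), resp.std()]
    return np.array(feats)

# toy patches standing in for vehicle / non-vehicle training crops
patches = [np.random.randint(0, 255, (64, 64), np.uint8) for _ in range(20)]
labels = [1] * 10 + [0] * 10
X = np.stack([gabor_features(p.astype(np.float32)) for p in patches])
clf = SVC(kernel="rbf").fit(X, labels)
```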