• Title/Summary/Keyword: Machine Learning SVM

Forensic Decision of Median Filtering by Pixel Value's Gradients of Digital Image (디지털 영상의 픽셀값 경사도에 의한 미디언 필터링 포렌식 판정)

  • RHEE, Kang Hyeon
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.6 / pp.79-84 / 2015
  • In the distribution of digital images, a serious problem is the circulation of images altered by a forger. To address this problem, this paper proposes a median filtering (MF) image forensic decision algorithm that uses a feature vector based on the pixel value's gradients. In the proposed algorithm, AR (autoregressive) coefficients are computed from the pixel-value gradients of the original image, and the 1st- to 6th-order coefficients form a six-dimensional feature vector. A reconstructed image is then produced by solving Poisson's equation with the gradients. From the difference image between the original and its reconstruction, a four-dimensional feature vector (average value, maximum value, and the coordinates i, j of the maximum value) is extracted. The two feature vectors are then combined into a 10-dimensional feature vector, which is used to train an SVM (Support Vector Machine) classifier as an MF detector for altered images. Compared with the MFR (Median Filter Residual) scheme using a feature vector of the same 10 dimensions, the proposed algorithm performs better on unaltered, averaging-filtered ($3{\times}3$), and JPEG (QF=90) images, and worse on Gaussian-filtered ($3{\times}3$) images. Nevertheless, for all measured items the AUC (Area Under Curve) of sensitivity versus 1-specificity approaches 1. Thus, the proposed algorithm is confirmed to be rated 'Excellent (A)'.
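
The abstract above reports AUC over sensitivity and 1-specificity. As a rough illustration of how such a value can be computed from detector scores, here is a minimal pure-Python sketch using the trapezoidal rule. The function name `roc_auc` and the toy scores are assumptions for illustration only; they are not taken from the paper.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve (sensitivity vs. 1 - specificity).

    scores: detector outputs, higher means "more likely median filtered".
    labels: 1 for positive (filtered) samples, 0 for negative samples.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")

    # Sweep the decision threshold from the highest score downwards.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    tp = fp = 0
    prev_tpr = prev_fpr = 0.0
    auc = 0.0
    i = 0
    while i < len(pairs):
        # Samples with tied scores cross the threshold together.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        tpr = tp / pos  # sensitivity
        fpr = fp / neg  # 1 - specificity
        # Trapezoid between the previous and the current ROC point.
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2
        prev_tpr, prev_fpr = tpr, fpr
        i = j
    return auc


# Hypothetical scores: a detector that separates the two classes perfectly.
print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
```

A value near 1 means the detector ranks almost every filtered image above every unfiltered one, while 0.5 corresponds to chance-level ranking.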

The Design of Feature Selection Classifier based on Physiological Signal for Emotion Detection (감성판별을 위한 생체신호기반 특징선택 분류기 설계)

  • Lee, JeeEun;Yoo, Sun K.
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.11 / pp.206-216 / 2013
  • Emotion plays a critical role in daily human life, including learning, action, decision making, and communication. In this paper, an emotion discrimination classifier is designed that reduces system complexity by selecting a reduced set of dominant features from biosignals. Photoplethysmography (PPG), skin temperature, skin conductance, and frontal and parietal electroencephalography (EEG) signals were measured during the viewing of four types of movies associated with the induction of neutral, sad, fear, and joy emotions. A genetic algorithm with a support vector machine (SVM)-based fitness function was designed to determine the dominant features among 24 parameters extracted from the measured biosignals. It achieves a maximum classification accuracy of 96.4%, which is 17% higher than that of SVM alone. The features selected for minimum error are the mean and NN50 of heart rate variability from the PPG signal, the mean of the PPG-induced pulse transit time, the mean of skin resistance, and the ${\delta}$ and ${\beta}$ frequency band powers of parietal EEG. The combination of parietal EEG, PPG, and skin resistance is recommended for high-accuracy instrumentation, while the combined use of PPG and skin conductance (79% accuracy) is affordable for simplified instrumentation.

Adverse Effects on EEGs and Bio-Signals Coupling on Improving Machine Learning-Based Classification Performances

  • SuJin Bak
    • Journal of the Korea Society of Computer and Information / v.28 no.10 / pp.133-153 / 2023
  • In this paper, we propose a novel approach to investigating brain-signal measurement technology using Electroencephalography (EEG). Traditionally, researchers have combined EEG signals with bio-signals (BSs) to enhance the classification performance of emotional states. Our objective was to explore the synergistic effects of coupling EEG and BSs, and determine whether the combination of EEG+BS improves the classification accuracy of emotional states compared to using EEG alone or combining EEG with pseudo-random signals (PS) generated arbitrarily by random generators. Employing four feature extraction methods, we examined four combinations: EEG alone, EEG+BS, EEG+BS+PS, and EEG+PS, utilizing data from two widely used open datasets. Emotional states (task versus rest states) were classified using Support Vector Machine (SVM) and Long Short-Term Memory (LSTM) classifiers. Our results revealed that when using the highest accuracy SVM-FFT, the average error rates of EEG+BS were 4.7% and 6.5% higher than those of EEG+PS and EEG alone, respectively. We also conducted a thorough analysis of EEG+BS by combining numerous PSs. The error rate of EEG+BS+PS displayed a V-shaped curve, initially decreasing due to the deep double descent phenomenon, followed by an increase attributed to the curse of dimensionality. Consequently, our findings suggest that the combination of EEG+BS may not always yield promising classification performance.

Response Modeling for the Marketing Promotion with Weighted Case Based Reasoning Under Imbalanced Data Distribution (불균형 데이터 환경에서 변수가중치를 적용한 사례기반추론 기반의 고객반응 예측)

  • Kim, Eunmi;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.29-45 / 2015
  • Response modeling is a well-known research topic among those seeking better performance in predicting customers' responses to marketing promotions. A customer response model reduces marketing cost by identifying prospective customers in a very large customer database and predicting the purchase intention of the selected customers, whereas a promotion derived from an undifferentiated marketing strategy incurs unnecessary cost. In addition, the big data environment has accelerated the development of response models with data mining techniques such as CBR, neural networks, and support vector machines. CBR is one of the major tools in business because it is known to be simple and robust when applied to response models. CBR remains an attractive technique for data mining applications in business even though it has not shown high performance compared with other machine learning techniques. Thus, many studies have tried to improve CBR and apply it to business data mining with enhanced algorithms or with the support of other techniques such as genetic algorithms, decision trees, and AHP (Analytic Hierarchy Process). Ahn and Kim (2008) used logit, neural networks, and CBR to predict which customers would purchase the items promoted by the marketing department, and optimized the number k of nearest neighbors with a genetic algorithm to improve the performance of the integrated model. Hong and Park (2009) noted that an approach integrating CBR with logit, neural networks, and Support Vector Machine (SVM) predicted customers' responses to marketing promotions better than the individual data mining models such as logit, neural networks, and SVM. This paper presents an approach to predicting customers' responses to marketing promotions with case-based reasoning. The proposed model was developed by applying a different weight to each feature.
We fitted a logit model to a database containing promotion and purchase data for bath soap. The resulting coefficients were then used as the feature weights for CBR. We empirically compared the performance of the proposed weighted CBR model with that of neural networks and a pure CBR model, and found that the proposed weighted CBR model outperformed the pure CBR model. Imbalanced data is a common problem when building data mining models that classify real data, as in bankruptcy prediction, intrusion detection, fraud detection, churn management, and response modeling. Imbalanced data means that the number of instances in one class is remarkably small or large compared with the number of instances in the other classes. A classification model such as a response model has great difficulty learning patterns from such data because it tends to ignore the small classes while classifying the large classes correctly. Sampling is one of the most representative approaches to resolving the problem caused by an imbalanced data distribution. Sampling methods can be categorized into undersampling and oversampling. However, CBR is not sensitive to the data distribution because, unlike machine learning algorithms, it does not learn from data. In this study, we investigated the robustness of the proposed model while varying the ratio of response customers to nonresponse customers for the promotion program, because in the real world response customers are always far fewer than nonresponse customers. We simulated the proposed model 100 times to validate its robustness with different ratios of response customers to nonresponse customers under the imbalanced data distribution. Finally, we found that the proposed CBR-based model outperformed the compared models on the imbalanced data sets.
Our study is expected to improve the performance of response models for promotion programs with CBR under the imbalanced data distributions found in the real world.
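
The weighted CBR idea in this abstract (logit coefficients reused as feature weights for nearest-case retrieval) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the weights are assumed to be the absolute values of already-fitted logit coefficients, retrieval is assumed to be a weighted Euclidean k-nearest-neighbor vote, and the function names and toy case base are hypothetical.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def weighted_cbr_predict(case_base, query, weights, k=3):
    """Predict a label by majority vote over the k most similar stored cases.

    case_base: list of (features, label) pairs, e.g. label 1 = responded.
    weights:   one non-negative weight per feature (assumed here to be
               |logit coefficient|; the paper's exact scheme may differ).
    """
    nearest = sorted(
        case_base, key=lambda case: weighted_distance(case[0], query, weights)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical case base: the second feature is noise unrelated to the label.
cases = [([0.0, 5.0], 0), ([0.1, -5.0], 0), ([1.0, 0.0], 1), ([0.9, 0.1], 1)]
query = [0.05, 0.0]

print(weighted_cbr_predict(cases, query, weights=[1.0, 0.0], k=1))  # weighted CBR
print(weighted_cbr_predict(cases, query, weights=[1.0, 1.0], k=1))  # pure CBR
```

In this toy setup, down-weighting the uninformative feature changes which stored case is retrieved, which is the effect the feature weights are meant to have.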

Analysis of Disaster Safety Situation Classification Algorithm Based on Natural Language Processing Using 119 Calls Data (119 신고 데이터를 이용한 자연어처리 기반 재난안전 상황 분류 알고리즘 분석)

  • Kwon, Su-Jeong;Kang, Yun-Hee;Lee, Yong-Hak;Lee, Min-Ho;Park, Seung-Ho;Kang, Myung-Ju
    • KIPS Transactions on Software and Data Engineering / v.9 no.10 / pp.317-322 / 2020
  • With the development of artificial intelligence, it is being used in disaster response support systems. Disasters can occur anywhere, at any time. When a disaster occurs, reports fall into four types: fire, rescue, emergency, and other calls. The disaster response to a 119 call also differs depending on the type and situation. In this paper, a data set of 1,280 119 calls was tested on 3 classes with the SVM, NB, k-NN, DT, SGD, and RF situation classification algorithms using a training data set. Classification performance ranged from a maximum of 92% to a minimum of 77%. In the future, it will be necessary to secure effective disaster-specific data sets in various fields in order to study disaster response.

A hybrid intrusion detection system based on CBA and OCSVM for unknown threat detection (알려지지 않은 위협 탐지를 위한 CBA와 OCSVM 기반 하이브리드 침입 탐지 시스템)

  • Shin, Gun-Yoon;Kim, Dong-Wook;Yun, Jiyoung;Kim, Sang-Soo;Han, Myung-Mook
    • Journal of Internet Computing and Services / v.22 no.3 / pp.27-35 / 2021
  • With the development of the Internet, various IT technologies such as IoT and cloud computing have been developed, and various systems have been built by countries and companies. Because these systems generate and share vast amounts of data, a variety of systems that can detect threats are needed to protect the critical data they contain, and such systems have been actively studied to date. Typical techniques include anomaly detection and misuse detection, which detect threats that are known or that exhibit behavior different from normal. However, as IT technology advances, so do the technologies that threaten systems and these detection methods. Advanced Persistent Threats (APTs) attack national or corporate systems to steal important information and carry out attacks such as system shutdowns. These threats apply previously unknown malware and attack technologies. Therefore, in this paper, we propose a hybrid intrusion detection system that combines anomaly detection and misuse detection to detect unknown threats. The two detection techniques are applied together to enable the detection of both known and unknown threats, and applying machine learning makes more accurate threat detection possible. For misuse detection, we applied Classification Based on Association rules (CBA) to generate rules for known threats, and for anomaly detection, we used a One-Class SVM (OCSVM) to detect unknown threats. Experiments show that the detection accuracy for unknown threats is about 94%, confirming that unknown threats can be detected.

Decision Making Support System for VTSO using Extracted Ships' Tracks (항적모델 추출을 통한 해상교통관제사 의사결정 지원 방안)

  • Kim, Joo-Sung;Jeong, Jung Sik;Jeong, Jae-Yong;Kim, Yun Ha;Choi, Ikhwan;Kim, Jinhan
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / 2015.07a / pp.310-311 / 2015
  • Ships' tracking data are monitored and collected by vessel traffic service centers in real time. In this paper, we aim to support vessel traffic service operators' decision making by extracting ships' tracking patterns and models from these data. A Support Vector Machine algorithm was used for vessel track modeling to handle and process the data sets, and k-fold cross-validation was used to select the proper parameters. The proposed data processing methods could support vessel traffic service operators' decision making in cases such as anomaly detection and the calculation of ships' dead reckoning positions.
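
The k-fold parameter selection mentioned in this abstract can be outlined as below. This is only a generic sketch: the fold construction, the `evaluate` callback, and the candidate values are assumptions for illustration, and the abstract does not say which SVM parameters were tuned or how the folds were formed.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def select_parameter(candidates, evaluate, n_samples, k=5):
    """Return the candidate with the best mean validation score.

    evaluate(param, train_idx, val_idx) is a caller-supplied (hypothetical)
    function that trains a model with `param` on the training indices and
    returns a score on the validation indices (higher is better).
    """
    def mean_score(param):
        scores = [evaluate(param, tr, va) for tr, va in k_fold_splits(n_samples, k)]
        return sum(scores) / len(scores)

    return max(candidates, key=mean_score)


# Toy stand-in for model training: the score peaks at param == 1.0.
best = select_parameter(
    candidates=[0.1, 1.0, 10.0],
    evaluate=lambda p, train_idx, val_idx: -abs(p - 1.0),
    n_samples=10,
    k=3,
)
print(best)
```

In practice the `evaluate` callback would fit the track model on the training folds and score it on the held-out fold; shuffling the samples before splitting is a common addition when the data are ordered.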

Competitor Extraction based on Machine Learning Methods (기계학습 기반 경쟁자 자동추출 방법)

  • Lee, Chung-Hee;Kim, Hyun-Jin;Ryu, Pum-Mo;Kim, Hyun-Ki;Seo, Young-Hoon
    • Annual Conference on Human and Language Technology / 2012.10a / pp.107-112 / 2012
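  • This paper addresses a method for automatically extracting, as competitors, proper nouns that stand in a competitive relationship in general text; both a rule-based method and a machine learning-based method are proposed and compared. The proposed system targets news articles and aims to extract competitors only when a sentence contains explicit information indicating a competitive relationship. The rule-based competitor extraction system extracts competitors based on clue words indicating that two proper nouns are in competition; 620 competition-expression clue words were collected and used. The machine learning-based competitor extraction system approaches competitor extraction as a binary classification problem of deciding whether each competitor candidate is a competitor. Support Vector Machines were used as the classification algorithm, and the model was trained on five language-independent features that can represent the context surrounding a competitor. For performance evaluation, an evaluation set was built by collecting 623 competitors from news articles for 54 trending hot keywords. For comparison, a baseline system that extracts competitors based on related words was implemented, and it obtained a Recall/Precision/F1 of 0.119/0.214/0.153. In the experiments with the proposed systems, the rule-based system achieved 0.793/0.207/0.328 and the machine learning-based system achieved 0.578/0.730/0.645. The rule-based system had the best Recall at 0.793, an improvement of 67.4% over the baseline system. The machine learning-based system had the best Precision and F1 at 0.730 and 0.645, improvements of 61.6% and 49.2%, respectively, over the baseline system. Since the proposed systems greatly improve on the baseline system in Recall, Precision, and F1, the proposed method is shown to be effective.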
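
The Recall/Precision/F1 triples quoted in this abstract are related by the harmonic mean, F1 = 2PR / (P + R). The short sketch below recomputes F1 from the reported recall and precision values; the function name `f1_score` and the dictionary layout are illustrative choices, not part of the paper.

```python
def f1_score(recall, precision):
    """Harmonic mean of recall and precision (0.0 when both are zero)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (recall, precision, reported F1) as quoted in the abstract.
reported = {
    "baseline": (0.119, 0.214, 0.153),
    "rule-based": (0.793, 0.207, 0.328),
    "machine learning": (0.578, 0.730, 0.645),
}

for name, (recall, precision, f1) in reported.items():
    print(name, round(f1_score(recall, precision), 3), f1)
```

The comparison shows why the rule-based system's high recall does not translate into a high F1: the harmonic mean is pulled toward the lower of the two values, here the precision of 0.207.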

Discriminant analysis of grain flours for rice paper using fluorescence hyperspectral imaging system and chemometric methods

  • Seo, Youngwook;Lee, Ahyeong;Kim, Bal-Geum;Lim, Jongguk
    • Korean Journal of Agricultural Science / v.47 no.3 / pp.633-644 / 2020
  • Rice paper is an element of Vietnamese cuisine that can be used to wrap vegetables and meat. Rice and starch are the main ingredients of rice paper and their mixing ratio is important for quality control. In a commercial factory, assessment of food safety and quantitative supply is a challenging issue. A rapid and non-destructive monitoring system is therefore necessary in commercial production systems to ensure the food safety of rice and starch flour for the rice paper wrap. In this study, fluorescence hyperspectral imaging technology was applied to classify grain flours. Using the 3D hypercube of fluorescence hyperspectral imaging (fHSI, 420 - 730 nm), spectral and spatial data and chemometric methods were applied to detect and classify flours. Eight flours (rice: 4, starch: 4) were prepared and hyperspectral images were acquired in a 5 (L) × 5 (W) × 1.5 (H) cm container. Linear discriminant analysis (LDA), partial least squares discriminant analysis (PLSDA), support vector machine (SVM), classification and regression tree (CART), and random forest (RF) with a few preprocessing methods (multivariate scatter correction [MSC], 1st and 2nd derivatives, and moving average) were applied to classify grain flours, and the accuracy was compared using a confusion matrix (accuracy and kappa coefficient). LDA with moving average showed the highest accuracy at A = 0.9362 (K = 0.9270). A 1D convolutional neural network (CNN) demonstrated a classification result of A = 0.94 and showed improved classification results between mimyeon flour (MF)1 and MF2 of 0.72 and 0.87, respectively. In this study, the potential of non-destructive detection and classification of grain flours using fHSI technology and machine learning methods was demonstrated.
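
The accuracy (A) and kappa coefficient (K) reported in this abstract are both derived from a confusion matrix. The following sketch shows the standard computation of overall accuracy and Cohen's kappa; the function name and the 2 × 2 matrix are hypothetical and do not reproduce the study's data.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Agreement expected by chance from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes)) for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2

    if expected == 1.0:
        # Degenerate case: every sample falls in a single class.
        return observed, 1.0
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class result (not from the paper).
acc, kappa = accuracy_and_kappa([[45, 5], [10, 40]])
print(acc, kappa)
```

Kappa discounts the agreement that would occur by chance, which is why K is slightly lower than A in the reported results.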

A Novel Grasshopper Optimization-based Particle Swarm Algorithm for Effective Spectrum Sensing in Cognitive Radio Networks

  • Ashok, J;Sowmia, KR;Jayashree, K;Priya, Vijay
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.2 / pp.520-541 / 2023
  • In cognitive radio networks (CRNs), spectrum sensing (SS) is of utmost significance. During the training phase, every cognitive radio (CR) user generates a sensing report under various circumstances and, depending on a collective decision, either communicates or remains silent. In the training stage, the fusion centre combines the local decisions made by the CR users by majority vote and then returns a final conclusion to every CR user. Sufficient data about the environment, including the activity of the primary user (PU) and every CR user's response to that activity, is acquired, and sensing classes are created during the training stage. During the classification stage, every CR user compares its most recent sensing report with the previous sensing classes, and distance vectors are generated. The posterior probability of every sensing class is derived from quantitative data, and the sensing report is then classified as signifying either the presence or the absence of the PU. The ISVM technique is used to compute the quantitative variables needed for the posterior probability. Here, the iterations of the SVM are tuned by a novel GO-PSA, which combines the grasshopper optimization algorithm (GOA) and particle swarm optimization (PSO). The novel GO-PSA was developed because it overcomes the problem of computational complexity, returns minimum error, and saves time compared with various state-of-the-art algorithms. The dependability of every CR user is taken into consideration, as these local decisions are then integrated at the fusion centre using an innovative decision combination technique. Depending on the collective decision, the CR users then communicate or remain silent.
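
The majority-vote fusion step described for the training stage can be illustrated with a few lines of Python. This is a bare sketch of plain majority voting only; it does not model the paper's reliability-aware decision combination, and the tie-breaking rule (treat a tie as "PU present") and function names are assumptions made for the example.

```python
def fuse_by_majority(local_decisions):
    """Combine local sensing decisions at the fusion centre by majority vote.

    local_decisions: list of booleans, True = this CR user sensed the PU.
    Returns True if the PU is judged present. Ties resolve to True, an
    assumed conservative choice that protects the primary user.
    """
    if not local_decisions:
        raise ValueError("at least one local decision is required")
    present_votes = sum(1 for d in local_decisions if d)
    return 2 * present_votes >= len(local_decisions)


def may_transmit(local_decisions):
    """CR users communicate only when the fused decision says the PU is absent."""
    return not fuse_by_majority(local_decisions)


print(fuse_by_majority([True, True, False]))  # majority senses the PU
print(may_transmit([False, False, True]))     # majority senses a free channel
```

The fused result is what the fusion centre would return to every CR user, which then either communicates or remains silent.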