• Title/Summary/Keyword: Feature Learning

Semi-supervised Software Defect Prediction Model Based on Tri-training

  • Meng, Fanqi;Cheng, Wenying;Wang, Jingdong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.11 / pp.4028-4042 / 2021
  • To address the difficulty of software defect prediction caused by insufficient labelled defect samples and class imbalance, this paper proposes a semi-supervised software defect prediction model that combines feature normalization, oversampling, and the Tri-training algorithm. First, feature normalization smooths the feature data to eliminate the influence of excessively large or small feature values on the model's classification performance. Second, oversampling expands the data to correct the class imbalance among the labelled samples. Finally, the Tri-training algorithm learns from the training samples and establishes a defect prediction model. The novelty of this model is that it combines feature normalization, oversampling, and Tri-training to address both the shortage of labelled samples and the class imbalance problem. Simulation experiments on the NASA software defect prediction dataset show that the proposed method outperforms four existing supervised and semi-supervised learning methods in terms of Precision, Recall, and F-Measure.
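
The first two steps described in this abstract can be sketched as follows. This is a minimal illustration only: the abstract does not say which normalization or oversampling variant the authors use, so min-max scaling and random duplication of minority samples are assumptions, the function names and toy data are hypothetical, and the Tri-training step itself is not shown.

```python
import random
from collections import Counter

def min_max_normalize(rows):
    """Rescale every feature column to the range [0, 1] (assumed variant)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes are equally large."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

# Toy data (made up): two metrics per module, label 1 = defective.
X = [[10, 200], [20, 400], [30, 600], [40, 800]]
y = [0, 0, 0, 1]
X_norm = min_max_normalize(X)
X_bal, y_bal = random_oversample(X_norm, y)
```

After these two steps every feature lies in [0, 1] and both classes have the same number of labelled samples, which is the state in which a semi-supervised learner would then be trained.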

Design of the 3D Object Recognition System with Hierarchical Feature Learning (계층적 특징 학습을 이용한 3차원 물체 인식 시스템의 설계)

  • Kim, Joohee;Kim, Dongha;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.5 no.1 / pp.13-20 / 2016
  • In this paper, we propose an object recognition system that uses hierarchical feature learning to identify the category, instance name, and several attributes of an object from its color and depth images. In the preprocessing stage, the system transforms the depth images of the object into surface normal vectors, which represent the shape of the object more precisely. In the feature learning stage, it extracts a set of patch features and image features from each pair of color image and surface normal vectors through two-layered learning. The system then trains a set of independent classification models on the labeled feature vectors using the SVM learning algorithm. Through experiments with the UW RGB-D Object Dataset, we verify the performance of the proposed object recognition system.

Comparative Analysis of Dimensionality Reduction Techniques for Advanced Ransomware Detection with Machine Learning (기계학습 기반 랜섬웨어 공격 탐지를 위한 효과적인 특성 추출기법 비교분석)

  • Kim Han Seok;Lee Soo Jin
    • Convergence Security Journal / v.23 no.1 / pp.117-123 / 2023
  • To detect advanced ransomware attacks with machine learning-based models, the classification model must be trained on data with a high-dimensional feature space, in which case the 'curse of dimensionality' is likely to occur. Dimensionality reduction of the features must therefore be performed beforehand to increase the accuracy of the learning model and improve execution speed while avoiding the 'curse of dimensionality'. In this paper, we classified ransomware by applying three machine learning models and two feature extraction techniques to two datasets whose feature spaces differ greatly in dimension. In the experiments, the dimensionality reduction techniques did not significantly improve performance in binary classification, nor in multi-class classification when the feature space was low-dimensional. However, when the dataset had a high-dimensional feature space, LDA (Linear Discriminant Analysis) performed very well.

A Practical Feature Extraction for Improving Accuracy and Speed of IDS Alerts Classification Models Based on Machine Learning (기계학습 기반 IDS 보안이벤트 분류 모델의 정확도 및 신속도 향상을 위한 실용적 feature 추출 연구)

  • Shin, Iksoo;Song, Jungsuk;Choi, Jangwon;Kwon, Taewoong
    • Journal of the Korea Institute of Information Security & Cryptology / v.28 no.2 / pp.385-395 / 2018
  • With the development of the Internet, cyber attacks have become a major threat. To detect them, intrusion detection systems (IDS) have been widely deployed. A critical weakness of IDS, however, is that it generates a large number of false alarms. Machine learning is one of the promising techniques for reducing false alarms in real time, and many machine learning approaches have been applied to this field, although problems remain to be solved before machine learning can be used in practice. So far, researchers have not focused on features: although the features of IDS alerts are important to model performance, the approach to features has been neglected. In this paper, we propose a new feature set that can improve model performance and can be extracted from a single alarm. The new features are motivated by the know-how of security analysts. We trained and tested a model using the new feature set on real IDS alerts. Experimental results indicate that the proposed model achieves better accuracy and a better false positive rate than an SVM model with ordinary features.

Application Consideration of Machine Learning Techniques in Satellite Systems

  • Jin-keun Hong
    • International Journal of Advanced Smart Convergence / v.13 no.2 / pp.48-60 / 2024
  • With the exponential growth of satellite data utilization, machine learning has become pivotal in enhancing innovation and cybersecurity in satellite systems. This paper investigates the role of machine learning techniques in identifying and mitigating vulnerabilities and code smells within satellite software. We explore satellite system architecture and survey applications like vulnerability analysis, source code refactoring, and security flaw detection, emphasizing feature extraction methodologies such as Abstract Syntax Trees (AST) and Control Flow Graphs (CFG). We present practical examples of feature extraction and training models using machine learning techniques like Random Forests, Support Vector Machines, and Gradient Boosting. Additionally, we review open-access satellite datasets and address prevalent code smells through systematic refactoring solutions. By integrating continuous code review and refactoring into satellite software development, this research aims to improve maintainability, scalability, and cybersecurity, providing novel insights for the advancement of satellite software development and security. The value of this paper lies in its focus on addressing the identification of vulnerabilities and resolution of code smells in satellite software. In terms of the authors' contributions, we detail methods for applying machine learning to identify potential vulnerabilities and code smells in satellite software. Furthermore, the study presents techniques for feature extraction and model training, utilizing Abstract Syntax Trees (AST) and Control Flow Graphs (CFG) to extract relevant features for machine learning training. Regarding the results, we discuss the analysis of vulnerabilities, the identification of code smells, maintenance, and security enhancement through practical examples. This underscores the significant improvement in the maintainability and scalability of satellite software through continuous code review and refactoring.

Character Recognition of Vehicle Number Plate using Modular Neural Network (모듈라 신경망을 이용한 자동차 번호판 문자인식)

  • Park, Chang-Seok;Kim, Byeong-Man;Seo, Byung-Hoon;Lee, Kwang-Ho
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.4 / pp.409-415 / 2003
  • Recently, modular learning has become popular and has received much attention in pattern classification. A modular learning method based on the "divide and conquer" strategy can not only solve complex problems but also achieve better learning quality and speed than a single classifier. In the neural network area, several studies have also taken the modular learning approach to improve classification performance. In this paper, we propose a simple modular neural network for character recognition of vehicle number plates and evaluate how its performance depends on the method used to cluster feature vectors when constructing the subnetworks. We implement two clustering methods: one groups similar feature vectors using the K-means clustering algorithm, and the other groups dissimilar feature vectors using our proposed algorithm. The experimental results show that our algorithm achieves much better performance.

Abnormal Vibration Diagnostics Algorithm of Rotating Machinery Using Self-Organizing Feature Map and Learning Vector Quantization (자기조직화특징지도와 학습벡터양자화를 이용한 회전기계의 이상진동진단 알고리듬)

  • 양보석;서상윤;임동수;이수종
    • Journal of KSNVE / v.10 no.2 / pp.331-337 / 2000
  • The need to diagnose rotating machinery, which is widely used in industry, is increasing. Much research has been conducted on processing field vibration signal data to diagnose faults in a given machine. As a pattern recognition tool for such signals, neural networks, usually trained with the back-propagation algorithm, have been used in the diagnosis of rotating machinery. In this paper, the self-organizing feature map (SOFM), an unsupervised learning algorithm, is used for abnormal defect diagnosis of rotating machinery, and learning vector quantization (LVQ), a supervised learning algorithm, is then used to improve the quality of the classifier's decision regions.
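
The supervised refinement step mentioned in this abstract can be illustrated with a single learning vector quantization update. This is a minimal sketch, assuming the basic LVQ1 rule; the abstract does not state which LVQ variant, learning rate, or fault classes the authors use, so the function names, the `lr` value, and the labels below are hypothetical.

```python
def nearest(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda i: sum((p - v) ** 2 for p, v in zip(prototypes[i][0], x)),
    )

def lvq1_step(prototypes, x, label, lr=0.1):
    """One LVQ1 update: pull the winning prototype toward x if its label
    matches the sample's label, push it away otherwise."""
    i = nearest(prototypes, x)
    weights, proto_label = prototypes[i]
    sign = 1.0 if proto_label == label else -1.0
    prototypes[i] = (
        [w + sign * lr * (v - w) for w, v in zip(weights, x)],
        proto_label,
    )
    return i

# Hypothetical prototypes, e.g. taken from a trained map; labels are made up.
prototypes = [([0.0, 0.0], "normal"), ([1.0, 1.0], "unbalance")]
first = lvq1_step(prototypes, [0.2, 0.0], "normal")   # same label: attract
second = lvq1_step(prototypes, [0.8, 1.0], "normal")  # different label: repel
```

Repeating this update over labeled samples moves the prototypes so that the boundaries between their nearest-neighbor regions follow the class boundaries more closely.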

Feature Extraction Using Convolutional Neural Networks for Random Translation (랜덤 변환에 대한 컨볼루션 뉴럴 네트워크를 이용한 특징 추출)

  • Jin, Taeseok
    • Journal of the Korean Society of Industry Convergence / v.23 no.3 / pp.515-521 / 2020
  • Deep learning methods have brought great improvements to research fields such as machine learning, image processing, and computer vision. One of the most frequently used deep learning methods in image processing is the convolutional neural network (CNN). Unlike traditional artificial neural networks, CNNs do not use predefined kernels but instead learn data-specific kernels. This property allows them to be used as feature extractors as well. In this study, we compared the quality of CNN features with that of traditional texture feature extraction methods. Experimental results demonstrate the superiority of the CNN features. Additionally, the recognition process and results of a pioneering CNN on the MNIST database are presented.

Comparing Machine Learning Classifiers for Movie WOM Opinion Mining

  • Kim, Yoosin;Kwon, Do Young;Jeong, Seung Ryul
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.8 / pp.3169-3181 / 2015
  • Online word-of-mouth has become a powerful influence on marketing and sales in business. Opinion mining and sentiment analysis are frequently adopted in market research and business analytics to analyze word-of-mouth content. However, several challenging areas remain: 1) sentiment analysis of Korean word-of-mouth content in the film market, 2) the viability of machine learning models that use only linguistic features, and 3) the effect of the size of the feature set. This study took a sample of 10,000 movie reviews that had been posted with extremely negative or positive ratings on a movie portal site and conducted sentiment analysis with four machine learning algorithms: naïve Bayesian, decision tree, neural network, and support vector machine. We found that the neural network and support vector machine produced better accuracy than naïve Bayesian and decision tree for every feature set size. In addition, their performance improved as the feature set size increased.

Operating Voltage Prediction in Mobile Semiconductor Manufacturing Process Using Machine Learning (기계학습을 활용한 모바일 반도체 제조 공정에서 동작 전압 예측)

  • Inhwan Baek;Seungwoo Jang;Kwangsu Kim
    • Journal of the Semiconductor & Display Technology / v.22 no.1 / pp.124-128 / 2023
  • Semiconductor engineers have long sought to enhance the energy efficiency of mobile semiconductors by reducing their voltage. During the final stages of the semiconductor manufacturing process, the screening and evaluation of voltage is crucial. However, determining the optimal test start voltage presents a significant challenge, as a poor choice can increase testing time. In the semiconductor manufacturing process, a wealth of test element group information is collected. If this information can be used to predict the test voltage, it could reduce testing time and increase the probability of identifying the optimal voltage. To achieve this, this paper explores machine learning techniques, such as linear regression and ensemble models, that can leverage large amounts of information for voltage prediction. The outcomes of these machine learning methods not only demonstrate high consistency but can also be used for feature engineering to enhance accuracy in future processes.
