• Title/Summary/Keyword: Feature learning

DL-ML Fusion Hybrid Model for Malicious Web Site URL Detection Based on URL Lexical Features (악성 URL 탐지를 위한 URL Lexical Feature 기반의 DL-ML Fusion Hybrid 모델)

  • Dae-yeob Kim
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.6 / pp.881-891 / 2023
  • Recently, various studies on malicious URL detection using artificial intelligence have been conducted, and most have shown strong detection performance. However, classical machine learning not only requires a feature analysis process, but the detection performance of the trained model also depends on the data analyst's ability. In this paper, we propose a DL-ML Fusion Hybrid Model for malicious web site URL detection based on URL lexical features. The proposed model combines the automatic feature extraction layer of deep learning with classical machine learning to address the feature engineering issue. For the experiment, 60,000 malicious and normal URLs were collected, and the results showed a performance improvement of up to 23.98%p. In addition, automating feature engineering made it possible to train the model efficiently.
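
To make "URL lexical features" concrete, the following is a minimal sketch of the kind of hand-engineered features that classical machine learning typically needs and that the abstract says the proposed model extracts automatically. The abstract does not list the paper's features; the function name `lexical_features` and every feature chosen here are illustrative assumptions, not the authors' method.

```python
from urllib.parse import urlsplit


def lexical_features(url: str) -> dict:
    """Return a few illustrative lexical features of a URL string.

    The feature set is an assumption for demonstration purposes only;
    it is not taken from the paper.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),                               # total characters
        "host_length": len(host),                             # characters in the host name
        "num_dots": host.count("."),                          # dots in the host name
        "num_digits": sum(ch.isdigit() for ch in url),        # digit characters
        "num_special": sum(not ch.isalnum() for ch in url),   # non-alphanumeric characters
        "has_at_sign": "@" in url,                            # '@' anywhere in the URL
        "path_depth": len([s for s in parts.path.split("/") if s]),  # non-empty path segments
    }


if __name__ == "__main__":
    print(lexical_features("http://example.com/a/b1?x=2"))
```

A vector like this would be fed to a classical classifier; choosing which entries to include is the manual feature engineering step that depends on the analyst.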

Reinforcement Learning Method Based Interactive Feature Selection(IFS) Method for Emotion Recognition (감성 인식을 위한 강화학습 기반 상호작용에 의한 특징선택 방법 개발)

  • Park Chang-Hyun;Sim Kwee-Bo
    • Journal of Institute of Control, Robotics and Systems / v.12 no.7 / pp.666-670 / 2006
  • This paper presents a novel feature selection method for emotion recognition, a task that may involve a large number of original features. Specifically, the emotion recognition in this paper deals with emotional speech signals. Feature selection benefits pattern recognition performance and mitigates 'the curse of dimensionality'. We therefore implemented a simulator called 'IFS', and its results were applied to an emotion recognition system (ERS), which was also implemented for this research. Our feature selection method is based on reinforcement learning, and because it needs responses from a human user, it is called 'Interactive Feature Selection'. By running the IFS, we obtained the 3 best features and applied them to the ERS. Compared with a randomly selected feature set, the 3 best features performed better.

An Optimal Feature Selection Method to Detect Malwares in Real Time Using Machine Learning (기계학습 기반의 실시간 악성코드 탐지를 위한 최적 특징 선택 방법)

  • Joo, Jin-Gul;Jeong, In-Seon;Kang, Seung-Ho
    • Journal of Korea Multimedia Society / v.22 no.2 / pp.203-209 / 2019
  • The performance of a machine learning-based intelligent classifier for detecting malware embedded in multimedia content is highly dependent on the properties of the feature set. In particular, to identify malicious code in real time, the feature set should be as small as possible without reducing accuracy. In this paper, we introduce an optimal feature selection method that achieves both a high detection rate and a minimal feature set size, applied to the feature set provided by PEFeatureExtractor, a well-known feature extraction tool. To evaluate the proposed method, we perform experiments using 32-bit Windows Portable Executable files.

A Study on Reducing Learning Time of Deep-Learning using Network Separation (망 분리를 이용한 딥러닝 학습시간 단축에 대한 연구)

  • Lee, Hee-Yeol;Lee, Seung-Ho
    • Journal of IKEEE / v.25 no.2 / pp.273-279 / 2021
  • In this paper, we propose an algorithm that shortens the learning time by partitioning the deep learning structure and training the parts individually. The proposed algorithm consists of four processes: a network classification starting point setting process, a feature vector extraction process, a feature noise removal process, and a class classification process. First, in the network classification starting point setting process, the point at which the network structure is divided for effective feature vector extraction is set. Second, in the feature vector extraction process, feature vectors are extracted using previously learned weights, without additional learning. Third, in the feature noise removal process, the extracted feature vector is received and the output value of each class is learned to remove noise from the data. Fourth, in the class classification process, the noise-removed feature vector is input to a multi-layer perceptron structure, and the result is output and learned. To evaluate the performance of the proposed algorithm, we experimented with the Extended Yale B face database. In the experiment, the proposed algorithm reduced the time required for a single training pass by 40.7% relative to the existing algorithm. In addition, the number of training iterations needed to reach the target recognition rate was lower than with the existing algorithm. The experimental results confirm that both the single-pass learning time and the total learning time were reduced compared with the existing algorithm.

Using Higher Order Neuron on the Supervised Learning Machine of Kohonen Feature Map (고차 뉴런을 이용한 교사 학습기의 Kohonen Feature Map)

  • Jung, Jong-Soo;Hagiwara, Masafumi
    • The Transactions of the Korean Institute of Electrical Engineers D / v.52 no.5 / pp.277-282 / 2003
  • In this paper, we propose using higher-order neurons in a supervised learning machine based on the Kohonen Feature Map. The architecture of the proposed model adopts higher-order neurons in the input layer of the Kohonen Feature Map used as a supervised learning machine. Because of the higher-order neurons, it is able to estimate boundaries in the input pattern space. However, it suffers from the problem that the number of neuron weights increases because of the higher-order neurons in the input layer. Here, we addressed this problem by adopting second-order neurons among the higher-order neurons. With the higher-order neurons, similar inputs can be mapped onto the Kohonen Feature Map, and the network retains topological mapping. We simulated the proposed model and evaluated its recognition rate on the XOR problem, discrimination of 20 alphabet patterns, the Mirror Symmetry problem, and a numeral pattern problem.
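
As a small illustration of why second-order terms help with boundaries such as XOR, the sketch below appends pairwise products to the input and applies a single linear threshold unit. It is not the paper's Kohonen-based model: the function names, the hand-chosen `WEIGHTS` and `BIAS`, and the threshold unit itself are assumptions made only to show the effect of a second-order input term.

```python
from itertools import combinations


def second_order_expand(x):
    """Append all pairwise products x_i * x_j (i < j) to the input vector."""
    return list(x) + [a * b for a, b in combinations(x, 2)]


def linear_threshold(features, weights, bias):
    """Single linear unit: output 1 if the weighted sum plus bias is positive."""
    total = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if total > 0 else 0


# Hand-chosen (not learned) weights for the features [x1, x2, x1*x2].
WEIGHTS = [1.0, 1.0, -2.0]
BIAS = -0.5


def xor_unit(x1, x2):
    """XOR computed by one linear unit acting on second-order features."""
    return linear_threshold(second_order_expand([x1, x2]), WEIGHTS, BIAS)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_unit(a, b))
```

The expansion also shows the cost mentioned in the abstract: n inputs become n + n(n-1)/2 features, so the number of weights per neuron grows quadratically even when only second-order terms are used.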

Residual Learning Based CNN for Gesture Recognition in Robot Interaction

  • Han, Hua
    • Journal of Information Processing Systems / v.17 no.2 / pp.385-398 / 2021
  • The complexity of deep learning models affects the real-time performance of gesture recognition, thereby limiting the application of gesture recognition algorithms in actual scenarios. Hence, a residual learning neural network based on a deep convolutional neural network is proposed. First, small convolution kernels are used to extract the local details of gesture images. Subsequently, a shallow residual structure is built to share weights, thereby avoiding vanishing or exploding gradients as the network deepens; consequently, model optimisation is simplified. Additional convolutional neural networks are used to accelerate the refinement of deep abstract features based on the spatial importance of the gesture feature distribution. Finally, a fully connected cascade softmax classifier is used to complete the gesture recognition. Compared with the dense connection multiplexing feature information network, the proposed algorithm is optimised in feature multiplexing to avoid performance fluctuations caused by feature redundancy. Experimental results from the ISOGD gesture dataset and Gesture dataset show that the proposed algorithm affords a fast convergence speed and high accuracy.

A Comprehensive Approach for Tamil Handwritten Character Recognition with Feature Selection and Ensemble Learning

  • Manoj K;Iyapparaja M
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.6 / pp.1540-1561 / 2024
  • This research proposes a novel approach for Tamil Handwritten Character Recognition (THCR) that combines feature selection and ensemble learning techniques. The Tamil script is complex and highly variable, requiring a robust and accurate recognition system. Feature selection is used to reduce dimensionality while preserving discriminative features, improving classification performance and reducing computational complexity. Several feature selection methods are compared, and individual classifiers (support vector machines, neural networks, and decision trees) are evaluated through extensive experiments. Ensemble learning techniques such as bagging and boosting are employed to leverage the strengths of multiple classifiers and enhance recognition accuracy. The proposed approach is evaluated on the HP Labs Dataset, achieving 95.56% accuracy using an ensemble learning framework based on support vector machines. The dataset consists of 82,928 samples with 247 distinct classes, contributed by 500 participants from Tamil Nadu. It includes 40,000 characters with 500 user variations. The results surpass or rival existing methods, demonstrating the effectiveness of the approach. The research also offers insights for developing advanced recognition systems for other complex scripts. Future investigations could explore the integration of deep learning techniques and the extension of the proposed approach to other Indic scripts and languages.

Analysis of Feature Extraction Algorithms Based on Deep Learning (Deep Learning을 기반으로 한 Feature Extraction 알고리즘의 분석)

  • Kim, Gyung Tae;Lee, Yong Hwan;Kim, Yeong Seop
    • Journal of the Semiconductor & Display Technology / v.19 no.2 / pp.60-67 / 2020
  • Recently, artificial intelligence related technologies including machine learning are being applied to various fields, and the demand is also increasing. In particular, with the development of AR, VR, and MR technologies related to image processing, the utilization of computer vision based on deep learning has increased. The algorithms for object recognition and detection based on deep learning required for image processing are diversified and advanced. Accordingly, problems that were difficult to solve with the existing methodology were solved more simply and easily by using deep learning. This paper introduces various deep learning-based object recognition and extraction algorithms used to detect and recognize various objects in an image and analyzes the technologies that attract attention.

A Study of Research on Methods of Automated Biomedical Document Classification using Topic Modeling and Deep Learning (토픽모델링과 딥 러닝을 활용한 생의학 문헌 자동 분류 기법 연구)

  • Yuk, JeeHee;Song, Min
    • Journal of the Korean Society for Information Management / v.35 no.2 / pp.63-88 / 2018
  • This research evaluated differences in classification performance across feature selection methods (the LDA topic model and Doc2Vec, which is based on word embedding using deep learning), feature corpus sizes, and classification algorithms. In addition, to find the feature corpus with the highest classification performance, experiments were conducted using feature corpora composed differently according to location within the document and by adjusting the size of the feature corpus. Finally, the experiments using deep learning evaluated training frequency and specifically considered information for context inference. This study constructed a biomedical document dataset, Disease-35083, which consists of biomedical scholarly documents provided by PMC and categorized by disease category. The study verifies which type and size of feature corpus produce the highest performance and also suggests feature corpora that can be extended to specific features, based on their efficiency in training time. Additionally, this research compares deep learning with existing methods and suggests an appropriate method for each classification environment.

Finding the best suited autoencoder for reducing model complexity

  • Ngoc, Kien Mai;Hwang, Myunggwon
    • Smart Media Journal / v.10 no.3 / pp.9-22 / 2021
  • Machine learning models use input data to produce results. Sometimes, the input data is too complicated for the models to learn useful patterns. Therefore, feature engineering is a crucial data preprocessing step for constructing a proper feature set to improve the performance of such models. One of the most efficient methods for automating feature engineering is the autoencoder, which transforms the data from its original space into a latent space. However, certain factors, including the datasets, the machine learning models, and the number of dimensions of the latent space (denoted by k), should be carefully considered when using the autoencoder. In this study, we design a framework to compare two data preprocessing approaches, with and without an autoencoder, and to observe the impact of these factors on the autoencoder. We then conduct experiments using autoencoders with classifiers on popular datasets. The empirical results provide a perspective on the best-suited autoencoder for these factors.