• Title/Summary/Keyword: Feature selection algorithm

A novel classification approach based on Naïve Bayes for Twitter sentiment analysis

  • Song, Junseok;Kim, Kyung Tae;Lee, Byungjun;Kim, Sangyoung;Youn, Hee Yong
    • KSII Transactions on Internet and Information Systems (TIIS)
    • v.11 no.6 / pp.2996-3011 / 2017
  • With the rapid growth of web technology and the spread of smart devices, social networking services (SNS) are widely used. As a result, a huge amount of data is generated from SNS such as Twitter, and sentiment analysis of SNS data is very important for various applications and services. In existing sentiment analysis based on the Naïve Bayes algorithm, the same number of attributes is usually employed to estimate the weight of each class. Moreover, uncountable and meaningless attributes are included. This decreases the accuracy of sentiment analysis. In this paper, two methods are proposed to resolve these issues: one reflects the difference between the numbers of positive and negative words when calculating the weights, and the other eliminates insignificant words in the feature selection step using the Multinomial Naïve Bayes (MNB) algorithm. Performance comparison demonstrates that the proposed scheme significantly increases accuracy compared to the existing Multivariate Bernoulli Naïve Bayes (BNB) algorithm and the MNB scheme.
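
For orientation, a minimal sketch of a plain Multinomial Naïve Bayes classifier with a crude word-filtering step is given below. It is not the weighting scheme proposed in the paper. The function names, the `min_count` threshold, and the toy tweets are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_mnb(docs, labels, min_count=2):
    """Train a multinomial Naive Bayes model with Laplace smoothing.

    Words seen fewer than `min_count` times in the whole corpus are dropped.
    This is only a stand-in for a feature selection step (assumed threshold).
    """
    total = Counter(w for d in docs for w in d.split())
    vocab = {w for w, c in total.items() if c >= min_count}
    classes = sorted(set(labels))
    doc_counts = Counter(labels)
    word_counts = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        word_counts[y].update(w for w in d.split() if w in vocab)

    model = {"vocab": vocab, "classes": classes, "log_prior": {}, "log_like": {}}
    for c in classes:
        model["log_prior"][c] = math.log(doc_counts[c] / len(docs))
        denom = sum(word_counts[c].values()) + len(vocab)
        model["log_like"][c] = {
            w: math.log((word_counts[c][w] + 1) / denom) for w in vocab
        }
    return model


def predict(model, doc):
    """Return the class with the highest log posterior for `doc`."""
    scores = {}
    for c in model["classes"]:
        score = model["log_prior"][c]
        for w in doc.split():
            if w in model["vocab"]:  # words outside the vocabulary are ignored
                score += model["log_like"][c][w]
        scores[c] = score
    return max(scores, key=scores.get)


# Toy data (hypothetical); "today" occurs once and is filtered out.
docs = [
    "good great phone today",
    "great good service",
    "bad awful phone",
    "awful bad service",
]
labels = ["pos", "pos", "neg", "neg"]
model = train_mnb(docs, labels)
print(predict(model, "great phone"))
print(predict(model, "awful service"))
```

A frequency threshold is the simplest possible filter; the paper's own criterion for insignificant words may differ.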

Vision-Based Two-Arm Gesture Recognition by Using Longest Common Subsequence (최대 공통 부열을 이용한 비전 기반의 양팔 제스처 인식)

  • Choi, Cheol-Min;Ahn, Jung-Ho;Byun, Hye-Ran
    • The Journal of Korean Institute of Communications and Information Sciences
    • v.33 no.5C / pp.371-377 / 2008
  • In this paper, we present a framework for vision-based two-arm gesture recognition. To capture the motion information of the hands, we apply a color-based tracking algorithm with an adaptive kernel to each frame. A feature selection algorithm then classifies the motion information into four different phrases. Using this gesture phrase information, we build a gesture model that consists of the symbol probabilities and a symbol sequence learned from the longest common subsequence. Finally, we present a similarity measure for two-arm gesture recognition based on the proposed gesture models. The experimental results show the efficiency of the proposed feature selection method, and the simplicity and robustness of the recognition algorithm.
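
The longest common subsequence mentioned in this abstract can be computed with the standard dynamic-programming recurrence, sketched below. The `lcs_similarity` normalization and the symbol names (`"U"`, `"R"`, `"D"`, `"L"`) are assumptions for illustration, not the similarity measure defined in the paper.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of sequences a and b."""
    m, n = len(a), len(b)
    # dp[i][j] = LCS length of a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])

    # Walk back through the table to recover the subsequence itself.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


def lcs_similarity(a, b):
    """Hypothetical similarity: LCS length divided by the longer length."""
    if not a or not b:
        return 0.0
    return len(longest_common_subsequence(a, b)) / max(len(a), len(b))


# Hypothetical gesture symbol sequences.
observed = ["U", "U", "R", "D", "L"]
template = ["U", "R", "R", "D"]
print(longest_common_subsequence(observed, template))
print(lcs_similarity(observed, template))
```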

Truncated Kernel Projection Machine for Link Prediction

  • Huang, Liang;Li, Ruixuan;Chen, Hong
    • Journal of Computing Science and Engineering
    • v.10 no.2 / pp.58-67 / 2016
  • With the increasing availability of large amounts of complex network data on the Web, link prediction has become a popular data-mining research field. This paper focuses on a link-prediction task that can be formulated as a binary classification problem in complex networks. To solve this problem, a sparse classification algorithm called the "Truncated Kernel Projection Machine", based on empirical feature selection, is proposed. The proposed algorithm is a novel realization of sparse empirical-feature-based learning that differs from regularized kernel projection machines. It is more appealing than previous learning machines because it can be computed efficiently and implemented easily and stably for the link-prediction task. The algorithm is applied to link-prediction tasks in different complex networks and compared with several classification algorithms. The experimental results show that the proposed algorithm outperformed the compared algorithms in several key indices, with fewer test errors and greater stability.

A Study on the Improvement of Multitree Pattern Recognition Algorithm (Multitree 형상 인식 기법의 성능 개선에 관한 연구)

  • 김태성;이정희;김성대
    • The Journal of Korean Institute of Communications and Information Sciences
    • v.14 no.4 / pp.348-359 / 1989
  • The multitree pattern recognition algorithm proposed by [1] and [2] is modified in order to improve its performance. The basic idea of the multitree pattern classification algorithm is that a binary decision tree used to classify an unknown pattern is constructed for each feature, and that at each stage the classification rule decides whether to classify the unknown pattern or to extract the next feature value according to the feature order. The feature ordering needed in the classification procedure is therefore simple, and the number of features used in the classification procedure is small compared with other classification algorithms. Thus the algorithm can be easily applied to real pattern recognition problems even when the number of features and the number of classes are very large. In this paper, the weighting factor assignment scheme in the decision procedure is modified, and various classification rules are proposed by means of the weighting factor. The branch and bound method is also applied to feature subset selection and feature ordering. Several experimental results show that the performance of the multitree pattern classification algorithm is improved by the proposed scheme.

An Algorithm for Automatic Guided Vehicle Scheduling Problems (자동유도운반차 (Automatic Guided Vehicle) 스케쥴링 해법)

  • Park, Yang-Byeong;Jeon, Deok-Bin
    • Journal of Korean Institute of Industrial Engineers
    • v.13 no.1 / pp.11-24 / 1987
  • Automatic Guided Vehicle Systems feature battery-powered driverless vehicles with programming capabilities for path selection and positioning. Vehicles serve the machines in the shop, following a guide path system installed on the shop floor. The basic problem in the system is to determine a fixed set of vehicle routes of minimal total distance (time) while satisfying capacity and distance (time) constraints. In this paper, a heuristic algorithm is presented for scheduling the automatic guided vehicles. The algorithm routes the machines based on their distances and polar coordinate angles, taking into account the structural features of the system. Computational experiments are performed on several test problems in order to evaluate the proposed algorithm. Finally, a framework is described for dealing with the case where supplies from the machines are probabilistic.

Multi-label Feature Selection Using Redundancy and Relevancy based on Regression Optimization

  • Hyunki Lim
    • Journal of the Korea Society of Computer and Information
    • v.29 no.11 / pp.21-30 / 2024
  • High-dimensional data causes difficulties in machine learning due to high time consumption and large memory requirements. In a multi-label environment in particular, the complexity grows with the number of labels. This paper proposes a feature selection method to improve classification performance in multi-label settings. The method considers three types of relationships: between features, between features and labels, and between the labels themselves. To achieve this, a regression-based objective function is designed. This objective function calculates the linear relationships between features and labels and uses mutual information to compute the relationships between features and between labels. By minimizing this objective function, the optimal weights for feature selection are found. To optimize the objective function, a gradient descent method is applied to develop a fast-converging algorithm. The experimental results on six multi-label datasets show that the proposed method outperforms existing multi-label feature selection techniques. The classification performance of the proposed method, averaged over the six datasets, was a Hamming loss of 0.1285, a ranking loss of 0.1811, and a multi-label accuracy of 0.6416. Compared to the AMI (Approximating Mutual Information) algorithm, the performance was better by 0.0148, 0.0435, and 0.0852, respectively.
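
The general idea of ranking features by regression weights found with gradient descent can be sketched as follows. This is a simplified stand-in, not the paper's objective: it minimizes a ridge-regularized least-squares loss only and omits the mutual-information terms for feature-feature and label-label relationships. The function name, the hyperparameters `lam`, `lr`, `steps`, and the toy data are assumptions.

```python
def rank_features(X, Y, lam=0.1, lr=0.1, steps=500):
    """Rank features by the row norms of a regression weight matrix W.

    Minimizes (1/n) * ||XW - Y||^2 + lam * ||W||^2 with plain gradient
    descent. X is n x d (features), Y is n x q (binary labels).
    """
    n, d, q = len(X), len(X[0]), len(Y[0])
    W = [[0.0] * q for _ in range(d)]
    for _ in range(steps):
        # Residual R = XW - Y, computed once per step from the current W.
        R = [
            [sum(X[i][k] * W[k][j] for k in range(d)) - Y[i][j] for j in range(q)]
            for i in range(n)
        ]
        for k in range(d):
            for j in range(q):
                grad = (
                    2.0 * sum(X[i][k] * R[i][j] for i in range(n)) / n
                    + 2.0 * lam * W[k][j]
                )
                W[k][j] -= lr * grad
    # A feature with a large weight row contributes to predicting the labels.
    scores = [sum(w * w for w in row) ** 0.5 for row in W]
    ranking = sorted(range(d), key=lambda k: -scores[k])
    return ranking, scores


# Toy data (hypothetical): feature 0 drives label 0, feature 1 drives
# label 1, and feature 2 is low-amplitude noise.
X = [
    [1, 0, 0.1],
    [0, 1, -0.1],
    [1, 1, 0.1],
    [0, 0, -0.1],
    [1, 0, -0.1],
    [0, 1, 0.1],
]
Y = [[1, 0], [0, 1], [1, 1], [0, 0], [1, 0], [0, 1]]
ranking, scores = rank_features(X, Y)
print(ranking)
```

Selecting the top-ranked rows of W gives a feature subset; how the paper turns its optimized weights into a final selection is not stated in the abstract.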

A deep learning model based on triplet losses for a similar child drawing selection algorithm (Triplet Loss 기반 딥러닝 모델을 통한 유사 아동 그림 선별 알고리즘)

  • Moon, Jiyu;Kim, Min-Jong;Lee, Seong-Oak;Yu, Yonggyun
    • Journal of Korea Society of Industrial Information Systems
    • v.27 no.1 / pp.1-9 / 2022
  • The goal of this paper is to build a deep learning model based on triplet loss for selecting similar children's drawings. To assess the similarity of children's drawings, the distance between feature vectors belonging to the same class should be small, and the distance between feature vectors belonging to different classes should be large. Therefore, a similar child drawing selection algorithm was developed in this study by building a deep learning model that combines triplet loss with a residual network (ResNet), which has an advantage in measuring image similarity regardless of the number of classes. Finally, using this model's selection algorithm, the similarity between a target child drawing and the other drawings can be measured, and drawings with high similarity can be chosen.
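
The triplet loss and the distance-based selection step described here can be illustrated with the short sketch below. It assumes feature vectors have already been produced by some embedding network; the vectors, the `margin` value, and the function names are hypothetical, and the Euclidean (unsquared) distance is one common choice rather than a detail confirmed by the abstract.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on one (anchor, positive, negative) triple.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is.
    """
    return max(
        0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    )


def most_similar(target, gallery, k=1):
    """Return the names of the k gallery items closest to the target."""
    ranked = sorted(gallery, key=lambda name: euclidean(target, gallery[name]))
    return ranked[:k]


# Hypothetical 2-D embeddings of drawings.
gallery = {
    "drawing_a": [0.0, 1.0],
    "drawing_b": [3.0, 4.0],
    "drawing_c": [0.5, 0.5],
}
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
print(most_similar([0.0, 0.0], gallery, k=2))
```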

STK Feature Tracking Using BMA for Fast Feature Displacement Convergence (빠른 피쳐변위수렴을 위한 BMA을 이용한 STK 피쳐 추적)

  • Jin, Kyung-Chan;Cho, Jin-Ho
    • Journal of the Korean Institute of Telematics and Electronics S
    • v.36S no.8 / pp.81-87 / 1999
  • In general, feature detection and tracking algorithms are classified into EBGM using Gabor jets, NCC-R, and the STK algorithm using pixel eigenvalues. Among these algorithms, EBGM and NCC-R detect features with a feature model, whereas the STK algorithm selects features automatically. In this paper, to solve the initialization problem of NR tracking in the STK algorithm, we detected features using the STK algorithm in a modelled feature region and tracked the features with the NR method. To improve the tracking accuracy of the NR method, we proposed the BMA-NR method. Our evaluation showed that the BMA-NR method was superior to NBMA-NR in feature tracking accuracy, since the BMA-NR method was able to solve the local minimum problem caused by the search window size of NR.

Image Set Optimization for Real-Time Video Photomosaics (실시간 비디오 포토 모자이크를 위한 이미지 집합 최적화)

  • Choi, Yoon-Seok;Koo, Bon-Ki
    • 한국HCI학회:학술대회논문집
    • 2009.02a / pp.502-507 / 2009
  • We present a real-time photomosaic method for a small image set optimized by a feature selection method. A photomosaic is an image that is divided into cells (usually rectangular grids), each of which is replaced with another image of appropriate color, shape, and texture pattern. This method needs a large set of tile images with various types of image patterns. However, a large number of photo images incurs a high cost for pattern searching and requires a large space for storing the images. These requirements can cause problems when the method is applied to a real-time domain or to mobile devices with limited resources. Our approach is a genetic feature selection method for building an optimized image set that accelerates pattern searching and minimizes the memory cost.

A Novel Technique for Detection of Repacked Android Application Using Constant Key Point Selection Based Hashing and Limited Binary Pattern Texture Feature Extraction

  • MA Rahim Khan;Manoj Kumar Jain
    • International Journal of Computer Science & Network Security
    • v.23 no.9 / pp.141-149 / 2023
  • Repacked mobile apps constitute about 78% of all Android malware, and they greatly affect the technical ecosystem of Android. Although many methods exist for repacked app detection, most of them suffer from performance issues. In this manuscript, a novel method using Constant Key Point Selection and Limited Binary Pattern (CKPS: LBP) feature extraction-based hashing is proposed for identifying repacked Android applications through visual similarity, which is a notable feature of repacked applications. The experimental results show that the proposed method can effectively detect apps that are visually similar even under double-fold content manipulations. The experimental analysis shows that the proposed CKPS: LBP method detected 1354 similar applications from a repository of 95124 applications, with a computational time of 0.91 seconds within which a user could get the decision on whether the app is repacked. The overall efficiency of the proposed algorithm is 41% greater than the average of the other methods, and the time complexity is reduced by 31%. The collision probability of the hashes was 41% better than the average value of the other state-of-the-art methods.