• Title/Summary/Keyword: metric learning

Human Gait Recognition Based on Spatio-Temporal Deep Convolutional Neural Network for Identification

  • Zhang, Ning;Park, Jin-ho;Lee, Eung-Joo
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.8
    • /
    • pp.927-939
    • /
    • 2020
  • Gait recognition can identify a person from a long distance, which is very important for making monitoring systems more intelligent. Among many human features, gait has the advantages of being remotely observable, robust, and secure. Traditional gait feature extraction, shaped by the development of behavior recognition, relies on manually designed features and cannot meet the needs of fine-grained gait recognition. The emergence of deep convolutional neural networks has freed researchers from complex feature engineering: useful features can be learned automatically from data, and such networks are now widely used. In this paper, we conduct feature metric learning in three-dimensional space by combining the three-dimensional convolutional features of the gait sequence with a Siamese structure. This method captures both spatial and temporal information from continuous, periodic gait sequences and further improves the accuracy and practicality of gait recognition.
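As a rough illustration of the approach described above, the PyTorch sketch below pairs a small 3D convolutional encoder with a Siamese contrastive loss; the layer sizes, margin, and input shape are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch: Siamese 3D-CNN metric learning over gait sequences.
# Architecture sizes, margin, and input shape are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaitEncoder3D(nn.Module):
    """Maps a gait clip (C, T, H, W) to an embedding via 3D convolutions."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),          # global pooling over T, H, W
        )
        self.fc = nn.Linear(32, embed_dim)

    def forward(self, x):
        z = self.features(x).flatten(1)
        return F.normalize(self.fc(z), dim=1)  # unit-length embeddings

def contrastive_loss(z1, z2, same_identity, margin=1.0):
    """Pull same-identity pairs together, push different pairs apart."""
    d = F.pairwise_distance(z1, z2)
    pos = same_identity * d.pow(2)
    neg = (1 - same_identity) * F.relu(margin - d).pow(2)
    return (pos + neg).mean()

# Usage: two gait clips (batch, 1 channel, 16 frames, 64x64 silhouettes).
encoder = GaitEncoder3D()
a = torch.randn(4, 1, 16, 64, 64)
b = torch.randn(4, 1, 16, 64, 64)
label = torch.tensor([1., 0., 1., 0.])       # 1 = same person, 0 = different
loss = contrastive_loss(encoder(a), encoder(b), label)
loss.backward()
```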

A distance metric of nominal attribute based on conditional probability (조건부 확률에 기반한 범주형 자료의 거리 측정)

  • 이재호;우종하;오경환
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2003.09b
    • /
    • pp.53-56
    • /
    • 2003
  • Similarity, or the notion of distance between data items, is an important measure used in many machine learning algorithms. However, when the input data include nominal (categorical) attributes with no defined ordering, measuring the similarity or distance between data items becomes difficult. Non-distance-based algorithms such as C4.5 and CART can operate without distance measurements, but distance-based algorithms cannot be applied effectively because distance information for nominal attributes is missing. This paper proposes a method for measuring the distance between such categorical values that fully considers the characteristics of the data set. The method uses prior information about the data set: a distance measure based on conditional probability is presented, and the distances between attribute values are optimized through error feedback. For a given data set, if two different categorical values show similar distributions over the target attribute, they are assigned a relatively small distance. Learning then proceeds on the basis of these distances, and the errors that occur are fed back to refine them. Experimental results on data sets from the UCI Machine Learning Repository confirm the strong performance of the proposed distance measure.
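The core idea of the abstract, measuring the distance between two categorical values by comparing their class-conditional distributions, can be sketched as follows; the L1 comparison and the toy data are illustrative choices, and the paper's error-feedback optimization is omitted.

```python
# Minimal sketch: a conditional-probability distance for nominal attributes,
# in the spirit of the abstract (values whose class-conditional distributions
# are similar get a small distance). The error-feedback optimization step
# described in the paper is not reproduced here.
from collections import Counter, defaultdict

def conditional_value_distance(values, classes):
    """Return a dict d[(v1, v2)] giving the distance between two
    attribute values, based on P(class | value)."""
    # Estimate P(class | value) from the training column.
    counts = defaultdict(Counter)
    for v, c in zip(values, classes):
        counts[v][c] += 1
    class_labels = sorted(set(classes))
    prob = {
        v: {c: counts[v][c] / sum(counts[v].values()) for c in class_labels}
        for v in counts
    }
    # Distance = L1 difference between the conditional distributions.
    dist = {}
    for v1 in prob:
        for v2 in prob:
            dist[(v1, v2)] = sum(
                abs(prob[v1][c] - prob[v2][c]) for c in class_labels
            )
    return dist

# Usage on a toy nominal attribute (colour) with a binary target.
colour = ["red", "red", "blue", "blue", "green", "green"]
target = ["yes", "yes", "no",   "no",   "yes",   "no"]
d = conditional_value_distance(colour, target)
print(d[("red", "blue")])   # 2.0 -> very different class distributions
print(d[("red", "green")])  # 1.0 -> partially similar
```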

Feature reduction based on distance metric learning for musical genre classification (거리 함수 학습을 활용하여 장르 분류를 위한 특징 셋의 간소화 방법 연구)

  • Jang, Dalwon;Shin, Saim;Lee, JongSeol;Jang, Sei-Jin;Lim, Tae-Beom
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2014.06a
    • /
    • pp.3-4
    • /
    • 2014
  • In music genre classification, the common approach is to collect a variety of features into a feature vector and feed it to a classifier such as a support vector machine (SVM). In this paper, distance metric learning is applied to simplify the feature vector used for genre classification. We select one of several distance metric learning methods and, using feature sets employed in previous work, examine whether the length of the feature set can be reduced without degrading performance relative to the original set. In our experiments, a 168-dimensional feature set was reduced to 10 dimensions with less than a 2% drop in classification accuracy.
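A minimal sketch of metric-learning-based feature reduction is shown below; since the abstract does not name the metric learning method, Neighborhood Components Analysis from scikit-learn stands in for it, and synthetic data mimics the 168-to-10-dimension reduction.

```python
# Minimal sketch: supervised dimensionality reduction with a learned metric.
# NCA is an illustrative stand-in for the unspecified metric learning method,
# and the synthetic data simply mimics the 168-dim -> 10-dim reduction.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=400, n_features=168, n_informative=20,
                           n_classes=5, random_state=0)

# Learn a linear map to 10 dimensions that keeps same-genre points close,
# then classify with k-NN in the reduced space.
reduced = make_pipeline(StandardScaler(),
                        NeighborhoodComponentsAnalysis(n_components=10,
                                                       random_state=0),
                        KNeighborsClassifier(n_neighbors=5))
baseline = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))

print("168-dim k-NN accuracy:", cross_val_score(baseline, X, y, cv=3).mean())
print(" 10-dim k-NN accuracy:", cross_val_score(reduced, X, y, cv=3).mean())
```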

Improving Text Categorization with High Quality Bigrams (고품질 바이그램을 이용한 문서 범주화 성능 향상)

  • Lee, Chan-Do;Tan, Chade-Meng;Wang, Yuan-Fang
    • The KIPS Transactions:PartB
    • /
    • v.9B no.4
    • /
    • pp.415-420
    • /
    • 2002
  • This paper presents an efficient text categorization algorithm that generates high quality bigrams by using the information gain metric, combined with various frequency thresholds. The bigrams, along with unigrams, are then given as features to a Naive Bayes classifier. The experimental results suggest that the bigrams, while small in number, can substantially contribute to improving text categorization. Upon close examination of the results, we conclude that the algorithm is most successful in correctly classifying more positive documents, but may cause more negative documents to be classified incorrectly.
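A hedged sketch of the pipeline described above (unigrams plus selected bigrams fed to a Naive Bayes classifier) is given below; mutual information approximates the information gain criterion, and the tiny toy corpus, frequency threshold, and number of kept n-grams are illustrative rather than the paper's settings.

```python
# Minimal sketch: unigrams plus selected bigrams as features for a Naive Bayes
# text classifier. Mutual information approximates information gain; the toy
# corpus, min_df threshold, and k are illustrative choices.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

docs = [
    "the stock market fell sharply today",
    "interest rates and the stock market",
    "the market rally lifted bank shares",
    "the home team won the final match",
    "a late goal decided the football match",
    "the team celebrated the championship win",
]
labels = [0, 0, 0, 1, 1, 1]   # 0 = finance, 1 = sports

model = make_pipeline(
    # Extract unigrams and bigrams; min_df is the frequency threshold.
    CountVectorizer(ngram_range=(1, 2), min_df=1),
    # Keep only the n-grams most informative about the class label.
    SelectKBest(mutual_info_classif, k=20),
    MultinomialNB(),
)
model.fit(docs, labels)
print(model.predict(["bank shares and the stock market",
                     "the team won the match"]))   # expected: [0, 1]
```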

Applying distance metric learning for improvement of genre classification (장르 분류 성능 향상을 위한 거리함수 학습의 활용)

  • Jang, Dal-Won;Sin, Sa-Im;Lee, Jong-Seol;Jang, Se-Jin;Lim, Tae-Beom
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2013.11a
    • /
    • pp.36-37
    • /
    • 2013
  • In music genre classification, collecting a variety of features and classifying them with a support vector machine (SVM) is the dominant approach. In this paper, distance metric learning is applied to music genre classification in an attempt to improve performance. We select one of several distance metric learning methods and apply it to several commonly used feature sets to see whether an actual performance gain is obtained. In experiments with three feature sets, an improvement was observed only for the feature set that combined two kinds of features, and the result did not surpass the performance of the SVM.
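The comparison described in the abstract, a k-NN classifier with a learned metric against an SVM baseline, can be sketched as follows; NCA again stands in for the unspecified metric learning method, and the data is synthetic rather than audio features.

```python
# Minimal sketch: compare a k-NN classifier using a learned metric against an
# SVM baseline, mirroring the comparison in the abstract.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=60, n_informative=15,
                           n_classes=6, random_state=1)

models = {
    "k-NN (Euclidean)": make_pipeline(StandardScaler(),
                                      KNeighborsClassifier(n_neighbors=5)),
    "k-NN (learned metric)": make_pipeline(StandardScaler(),
                                           NeighborhoodComponentsAnalysis(random_state=1),
                                           KNeighborsClassifier(n_neighbors=5)),
    "SVM (RBF)": make_pipeline(StandardScaler(), SVC()),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=3).mean())
```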

Vision-Based Feature Map-Building and Localization Algorithms for Mobile Robots (주행 로봇을 위한 비젼 기반의 특징지도 작성 및 위치 결정 알고리즘에 관한 연구)

  • Kim, Young-Geun;Choi, Chang-Min;Jin, Sung-Hun;Kim, Hak-Il
    • Proceedings of the KIEE Conference
    • /
    • 2002.07d
    • /
    • pp.2475-2478
    • /
    • 2002
  • This paper considers the problem of exploring an unfamiliar environment in search of recognizable objects or visual landmarks. In order to extract and recognize them automatically, a feature map that records the set of features is built continually during a learning phase. The map contains photometric, geometric, and metric information for each feature. The localization algorithm then determines the position of the robot by extracting features and matching them against the map. These procedures are implemented and tested on an AMR, and preliminary results are presented in this paper.
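A very simplified sketch of a landmark feature map and matching-based localization is shown below; the descriptor format, matching rule, and position estimate are assumptions for illustration and do not reproduce the paper's implementation.

```python
# Minimal sketch: a landmark feature map and a naive matching-based localizer.
# Descriptors, positions, and the matching rule are illustrative assumptions.
import numpy as np

class FeatureMap:
    """Stores landmarks as (descriptor, world position) pairs."""
    def __init__(self):
        self.descriptors = []   # photometric information (feature vectors)
        self.positions = []     # metric information (x, y in the world frame)

    def add(self, descriptor, position):
        self.descriptors.append(np.asarray(descriptor, float))
        self.positions.append(np.asarray(position, float))

    def match(self, descriptor, max_dist=0.5):
        """Return the position of the closest stored landmark, or None."""
        if not self.descriptors:
            return None
        dists = [np.linalg.norm(descriptor - d) for d in self.descriptors]
        i = int(np.argmin(dists))
        return self.positions[i] if dists[i] <= max_dist else None

def localize(feature_map, observations):
    """Crude position estimate: average the map positions of matched features,
    offset by where each feature was observed relative to the robot."""
    estimates = []
    for descriptor, relative_offset in observations:
        pos = feature_map.match(np.asarray(descriptor, float))
        if pos is not None:
            estimates.append(pos - np.asarray(relative_offset, float))
    return np.mean(estimates, axis=0) if estimates else None

# Usage: two landmarks learned during exploration, then a localization query.
fmap = FeatureMap()
fmap.add([1.0, 0.0, 0.2], position=[2.0, 3.0])
fmap.add([0.0, 1.0, 0.8], position=[5.0, 1.0])
obs = [([1.0, 0.05, 0.2], [1.0, 1.0]),   # landmark seen 1 m right, 1 m ahead
       ([0.0, 0.95, 0.8], [4.0, -1.0])]
print(localize(fmap, obs))               # roughly [1.0, 2.0]
```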

ModifiedFAST: A New Optimal Feature Subset Selection Algorithm

  • Nagpal, Arpita;Gaur, Deepti
    • Journal of information and communication convergence engineering
    • /
    • v.13 no.2
    • /
    • pp.113-122
    • /
    • 2015
  • Feature subset selection is a pre-processing step in learning algorithms. In this paper, we propose an efficient feature subset selection algorithm, ModifiedFAST. The algorithm is suitable for text datasets and uses the concept of information gain to remove irrelevant and redundant features. A new optimal value of the symmetric uncertainty threshold used to identify relevant features is found; the thresholds used by previous feature selection algorithms such as FAST, Relief, and CFS were not optimal. We show that the threshold value greatly affects both the percentage of selected features and the classification accuracy. A new unified performance metric that combines accuracy with the number of selected features is proposed and applied in the algorithm. Experiments show that the percentage of features selected by the proposed algorithm is lower than that of existing algorithms on most of the datasets, and its effectiveness at the optimal threshold is statistically validated against the other algorithms.
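The symmetric uncertainty measure and threshold-based relevance filtering mentioned in the abstract can be sketched as follows; the threshold and toy data are illustrative, and the full ModifiedFAST procedure (redundancy removal, unified metric) is not reproduced.

```python
# Minimal sketch: symmetric uncertainty (SU) between a discrete feature and
# the class, and a simple threshold filter for relevant features.
import numpy as np
from collections import Counter

def entropy(values):
    counts = np.array(list(Counter(values).values()), dtype=float)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), in [0, 1]."""
    h_x, h_y = entropy(x), entropy(y)
    h_xy = entropy(list(zip(x, y)))          # joint entropy
    mutual_info = h_x + h_y - h_xy
    denom = h_x + h_y
    return 0.0 if denom == 0 else 2.0 * mutual_info / denom

def select_relevant(features, target, threshold=0.1):
    """Keep indices of features whose SU with the class exceeds the threshold."""
    return [j for j, col in enumerate(features)
            if symmetric_uncertainty(col, target) > threshold]

# Usage: feature 0 predicts the class, feature 1 is independent of it.
target = [0, 0, 1, 1, 0, 1, 0, 1]
f0     = [0, 0, 1, 1, 0, 1, 0, 1]   # identical to the class -> SU = 1
f1     = [0, 1, 0, 1, 0, 0, 1, 1]   # independent of the class -> SU = 0
print(select_relevant([f0, f1], target))   # [0]
```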

A Comparative Study of Phishing Websites Classification Based on Classifier Ensembles

  • Tama, Bayu Adhi;Rhee, Kyung-Hyune
    • Journal of Multimedia Information System
    • /
    • v.5 no.2
    • /
    • pp.99-104
    • /
    • 2018
  • Phishing websites have become a crucial concern in cyber security. Phishing fraudulently deceives users with the aim of obtaining their sensitive information such as bank account details, credit card numbers, usernames, and passwords. The threat has led to huge losses for online retailers, e-business platforms, and financial institutions, to name but a few. One way to build an anti-phishing detection mechanism is to construct a classification algorithm based on machine learning techniques. The objective of this paper is to compare different classifier ensemble approaches, i.e. random forest, rotation forest, gradient boosted machine, and extreme gradient boosting, against single classifiers, i.e. decision tree, classification and regression tree, and credal decision tree, in the case of website phishing. The area under the ROC curve (AUC) is employed as the performance metric, whilst statistical tests are used to assess the significance of performance differences among classifiers. The paper contributes to the existing literature by providing a benchmark of classifier ensembles for web phishing detection.
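A minimal sketch of the evaluation protocol, comparing ensembles against a single decision tree by AUC, is given below; rotation forest and credal decision trees are not available in scikit-learn, so only representative models appear, and a synthetic dataset stands in for the phishing website data.

```python
# Minimal sketch: compare ensemble classifiers against a single decision tree
# using AUC, as in the paper's evaluation protocol. Synthetic data only.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=30, n_informative=10,
                           weights=[0.7, 0.3], random_state=0)

models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name:18s} AUC = {auc:.3f}")
```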

Nearest Neighbor Based Prototype Classification Preserving Class Regions

  • Hwang, Doosung;Kim, Daewon
    • Journal of Information Processing Systems
    • /
    • v.13 no.5
    • /
    • pp.1345-1357
    • /
    • 2017
  • A prototype selection method chooses a small set of training points from the whole set of class data. As the data size increases, the selected prototypes play a significant role in covering class regions and in learning a discriminative rule. This paper discusses methods for selecting prototypes in a classification framework. We formulate prototype selection as a set covering optimization problem in which the sets are constructed from a distance metric and the predefined classes. This formulation lets us attend only to the prototypes of each class, without considering the points of other classes. A training point becomes a prototype based on the number of its neighbors and on whether it has already been preselected. In this setting, we propose a greedy algorithm that chooses the points most relevant to preserving the class-dominant regions. The proposed method is simple to implement, has no parameters to tune, and achieves better or comparable results on both artificial and real-world problems.
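A greedy set-cover style prototype selection in the spirit of the abstract is sketched below; the coverage rule (a candidate covers the same-class points lying closer to it than its nearest point of another class) is an assumption standing in for the paper's class-region construction.

```python
# Minimal sketch: prototype selection as greedy set cover, followed by
# nearest-prototype classification. The coverage rule is illustrative.
import numpy as np

def greedy_prototypes(X, y):
    X, y = np.asarray(X, float), np.asarray(y)
    n = len(y)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Radius of each point's "class region": distance to its nearest enemy.
    enemy_dist = np.array([dist[i][y != y[i]].min() for i in range(n)])
    # cover[i] = set of same-class points that prototype i would cover.
    cover = [set(np.where((y == y[i]) & (dist[i] < enemy_dist[i]))[0])
             for i in range(n)]
    uncovered, prototypes = set(range(n)), []
    while uncovered:
        best = max(range(n), key=lambda i: len(cover[i] & uncovered))
        prototypes.append(best)
        uncovered -= cover[best]
    return prototypes

def nearest_prototype_predict(X, y, prototypes, queries):
    X, queries = np.asarray(X, float), np.asarray(queries, float)
    P = X[prototypes]
    idx = np.argmin(np.linalg.norm(queries[:, None, :] - P[None, :, :], axis=2),
                    axis=1)
    return np.asarray(y)[prototypes][idx]

# Usage on a toy two-class problem.
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 1, 1, 1]
protos = greedy_prototypes(X, y)
print("prototypes:", protos)
print("predictions:", nearest_prototype_predict(X, y, protos,
                                                [[0.5, 0.5], [5.5, 5.5]]))
```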

A Detailed Analysis of Classifier Ensembles for Intrusion Detection in Wireless Network

  • Tama, Bayu Adhi;Rhee, Kyung-Hyune
    • Journal of Information Processing Systems
    • /
    • v.13 no.5
    • /
    • pp.1203-1212
    • /
    • 2017
  • Intrusion detection systems (IDSs) are crucial given the overwhelming increase in attacks on computing infrastructure. They intelligently detect malicious activity and predict future attack patterns through classification analysis using machine learning and data mining techniques. This paper thoroughly evaluates classifier ensembles for IDSs in IEEE 802.11 wireless networks. Two ensemble techniques, i.e. voting and stacking, are employed to combine three base classifiers, i.e. decision tree (DT), random forest (RF), and support vector machine (SVM). We use the area under the ROC curve (AUC) as the performance metric. Finally, we conduct two statistical significance tests to evaluate the performance differences among classifiers.
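The voting and stacking ensembles over DT, RF, and SVM described in the abstract can be sketched with scikit-learn as below; the synthetic data and hyperparameters are placeholders for the IEEE 802.11 intrusion data and the paper's settings.

```python
# Minimal sketch: voting and stacking ensembles over the three base
# classifiers named in the abstract (DT, RF, SVM), evaluated by AUC.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=25, n_informative=8,
                           weights=[0.8, 0.2], random_state=0)

base = [("dt", DecisionTreeClassifier(random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svm", SVC(probability=True, random_state=0))]

ensembles = {
    "soft voting": VotingClassifier(estimators=base, voting="soft"),
    "stacking": StackingClassifier(estimators=base,
                                   final_estimator=LogisticRegression()),
}
for name, model in ensembles.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name:12s} AUC = {auc:.3f}")
```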