• Title/Abstract/Keyword: Local Feature Learning

Discriminative Training of Sequence Taggers via Local Feature Matching

  • Kim, Minyoung
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 14 No. 3
    • /
    • pp.209-215
    • /
    • 2014
  • Sequence tagging is the task of predicting frame-wise labels for a given input sequence and has important applications in diverse domains. Conventional methods such as maximum likelihood (ML) learning match global features between the empirical and model distributions rather than the local features whose mismatch translates directly into frame-wise prediction errors. Recent probabilistic sequence models such as conditional random fields (CRFs) have achieved great success in a variety of situations. In this paper, we introduce a novel discriminative CRF learning algorithm that minimizes local feature mismatches. Unlike the overall data fitting that follows from global feature matching in ML learning, our approach reduces the total error over all frames in a sequence. We also provide an efficient gradient-based learning method via a gradient forward-backward recursion, which has the same computational complexity as ML learning. For several real-world sequence tagging problems, we empirically demonstrate that the proposed learning algorithm achieves significantly more accurate predictions than standard estimators.
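
The contrast between global feature matching (the sequence-level objective of ML training) and local, frame-wise matching can be illustrated with per-frame CRF marginals computed by forward-backward. The snippet below is a minimal numpy sketch of that idea only, not the paper's algorithm; the specific loss form and all variable names are assumptions.

```python
import numpy as np
from scipy.special import logsumexp

def frame_marginals(emissions, transitions):
    """Per-frame marginals p(y_t = k | x) of a linear-chain CRF via forward-backward.
    emissions: (T, K) frame scores, transitions: (K, K) pairwise scores, all in log domain."""
    T, K = emissions.shape
    alpha, beta = np.zeros((T, K)), np.zeros((T, K))
    alpha[0] = emissions[0]
    for t in range(1, T):
        alpha[t] = emissions[t] + logsumexp(alpha[t - 1][:, None] + transitions, axis=0)
    for t in range(T - 2, -1, -1):
        beta[t] = logsumexp(transitions + emissions[t + 1] + beta[t + 1], axis=1)
    log_m = alpha + beta
    return np.exp(log_m - logsumexp(log_m, axis=1, keepdims=True))

def frame_wise_loss(emissions, transitions, labels):
    """Sum over frames of -log p(y_t | x): each local (frame-wise) mistake is penalised
    directly, unlike the single sequence-level negative log-likelihood of ML training."""
    m = frame_marginals(emissions, transitions)
    return -np.log(m[np.arange(len(labels)), labels] + 1e-12).sum()
```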

Bio-Inspired Object Recognition Using Parameterized Metric Learning

  • Li, Xiong;Wang, Bin;Liu, Yuncai
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 7 No. 4
    • /
    • pp.819-833
    • /
    • 2013
  • Computing global features from local features using a bio-inspired framework has shown promising performance. However, for difficult applications with large intra-class variance, a single local feature is inadequate to represent all the attributes of the images. To integrate the complementary abilities of multiple local features, in this paper we extend the bio-inspired framework HMAX to adapt heterogeneous features for global feature extraction. Given multiple global features, we propose an approach, designated parameterized metric learning, for high-dimensional feature fusion. The fusion parameters are obtained by maximizing the canonical correlation with respect to those parameters. Experimental results show that our method achieves significant improvements over the benchmark bio-inspired framework HMAX and other related methods on the Caltech dataset, under varying numbers of training samples and feature elements.
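
As a rough illustration of fusing two sets of global features by maximizing canonical correlation, plain linear CCA from scikit-learn can be used; the paper's parameterized formulation is more general, so the dimensions, component count, and concatenation step below are illustrative assumptions.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 512))   # global features built from local feature type A (assumed dims)
Y = rng.normal(size=(200, 256))   # global features built from local feature type B

cca = CCA(n_components=10)        # number of correlated components is an arbitrary choice
cca.fit(X, Y)
Xc, Yc = cca.transform(X, Y)      # projections into maximally correlated subspaces
fused = np.concatenate([Xc, Yc], axis=1)  # one simple way to fuse the two correlated views
```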

이질적 얼굴인식을 위한 심층 정준상관분석을 이용한 지역적 얼굴 특징 학습 방법 (Local Feature Learning using Deep Canonical Correlation Analysis for Heterogeneous Face Recognition)

  • 최여름;김형일;노용만
    • 한국멀티미디어학회논문지
    • /
    • Vol. 19 No. 5
    • /
    • pp.848-855
    • /
    • 2016
  • Face recognition has received a great deal of attention because of its wide range of real-world applications. In such scenarios, mismatches (so-called heterogeneity) in resolution and illumination between gallery and test face images are inevitable owing to different capture conditions. To deal with this mismatch problem, we propose a local feature learning method using deep canonical correlation analysis (DCCA) for heterogeneous face recognition. With DCCA, we can effectively reduce the mismatch between the gallery and the test face images. Furthermore, the local features learned by DCCA enhance discriminative power by exploiting local facial structure information. Through experiments on two different scenarios (matching near-infrared to visible face images and matching low-resolution to high-resolution face images), we validate the effectiveness of the proposed method in terms of recognition accuracy on publicly available databases.
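
A much-simplified linear stand-in for the idea, paired local facial regions from the two modalities projected into a shared correlated space, is sketched below. The paper uses deep CCA; the plain scikit-learn CCA, the per-region patch layout, and the concatenation step here are assumptions.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def fit_local_cca(nir_patches, vis_patches, n_components=8):
    """Fit one CCA per facial local region, giving each region its own cross-modal
    projection (a linear stand-in for the deep CCA used in the paper).
    nir_patches, vis_patches: lists of (n_pairs, dim) arrays, one array per region."""
    return [CCA(n_components=n_components).fit(n, v)
            for n, v in zip(nir_patches, vis_patches)]

def embed_pair(models, nir_patches, vis_patches):
    """Project every paired region into the shared space and concatenate per modality,
    so gallery (VIS) and probe (NIR) descriptors become directly comparable."""
    nir_feats, vis_feats = [], []
    for m, n, v in zip(models, nir_patches, vis_patches):
        nc, vc = m.transform(n, v)
        nir_feats.append(nc)
        vis_feats.append(vc)
    return np.concatenate(nir_feats, axis=1), np.concatenate(vis_feats, axis=1)
```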

Intra-class Local Descriptor-based Prototypical Network for Few-Shot Learning

  • Huang, Xi-Lang;Choi, Seon Han
    • 한국멀티미디어학회논문지
    • /
    • Vol. 25 No. 1
    • /
    • pp.52-60
    • /
    • 2022
  • Few-shot learning is a sub-area of machine learning that aims to classify target images given only a few labeled training samples. As a representative few-shot learning method, the Prototypical network has received much attention due to its simplicity and promising results. However, the Prototypical network uses the sample mean of the examples from each class as that class's prototype, which easily leads to learning uncharacteristic features in low-data scenarios. In this study, we propose to use local descriptors (i.e., patches along the channel dimension within feature maps) from the same class to explicitly obtain more representative prototypes for the Prototypical network, so that significant intra-class feature information is maintained and classification performance on few-shot learning tasks improves. Experimental results on several benchmark datasets, including mini-ImageNet, CUB-200-2011, and tiered-ImageNet, show that the proposed method learns more discriminative intra-class features from the local descriptors and obtains more generic prototype representations under the few-shot setting.
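
One plausible reading of "prototypes from local descriptors" is sketched below: each spatial position of a feature map is treated as a C-dimensional local descriptor, all support descriptors of a class are pooled into its prototype, and a query is assigned to the nearest prototype. The shapes and the mean-pooling choice are assumptions rather than the paper's exact construction.

```python
import numpy as np

def local_descriptors(feature_maps):
    """Reshape (N, C, H, W) feature maps into (N*H*W, C) local descriptors."""
    n, c, h, w = feature_maps.shape
    return feature_maps.transpose(0, 2, 3, 1).reshape(-1, c)

def class_prototypes(support_maps_per_class):
    """Each class prototype is the mean of all its local descriptors,
    instead of the mean of whole-image embeddings."""
    return np.stack([local_descriptors(m).mean(axis=0) for m in support_maps_per_class])

def classify(query_map, prototypes):
    """Pool the query's local descriptors and assign the nearest prototype."""
    q = local_descriptors(query_map[None]).mean(axis=0)
    return int(np.linalg.norm(prototypes - q, axis=1).argmin())
```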

Feature Selection via Embedded Learning Based on Tangent Space Alignment for Microarray Data

  • Ye, Xiucai;Sakurai, Tetsuya
    • Journal of Computing Science and Engineering
    • /
    • Vol. 11 No. 4
    • /
    • pp.121-129
    • /
    • 2017
  • Feature selection has been widely established as an efficient technique for microarray data analysis. Feature selection aims to search for the most important feature/gene subset of a given dataset according to its relevance to the current target. Unsupervised feature selection is considered to be challenging due to the lack of label information. In this paper, we propose a novel method for unsupervised feature selection, which incorporates embedded learning and $l_{2,1}$-norm sparse regression into a framework to select genes in microarray data analysis. Local tangent space alignment is applied during embedded learning to preserve the local data structure. The $l_{2,1}$-norm sparse regression acts as a constraint to aid in learning the gene weights correlatively, by which the proposed method optimizes for selecting the informative genes which better capture the interesting natural classes of samples. We provide an effective algorithm to solve the optimization problem in our method. Finally, to validate the efficacy of the proposed method, we evaluate the proposed method on real microarray gene expression datasets. The experimental results demonstrate that the proposed method obtains quite promising performance.
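
A compact sketch of the pipeline the abstract describes, local tangent space alignment for the embedding followed by an $l_{2,1}$-regularized regression whose row norms rank the genes, is given below; the iteratively reweighted solver and every hyperparameter are assumptions, not the authors' algorithm.

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

def select_genes(X, n_components=5, gamma=1.0, n_iter=30, top_k=50):
    """X: (n_samples, n_genes). Returns indices of the top-ranked genes."""
    # Embedded learning: local tangent space alignment preserves the local data structure.
    Y = LocallyLinearEmbedding(n_neighbors=12, n_components=n_components,
                               method="ltsa").fit_transform(X)
    n_samples, n_genes = X.shape
    D = np.eye(n_genes)
    # Iteratively reweighted least squares for min ||XW - Y||_F^2 + gamma * ||W||_{2,1}.
    for _ in range(n_iter):
        W = np.linalg.solve(X.T @ X + gamma * D, X.T @ Y)
        row_norms = np.sqrt((W ** 2).sum(axis=1)) + 1e-8
        D = np.diag(1.0 / (2.0 * row_norms))
    # Genes with large weight rows are the informative ones.
    return np.argsort(-row_norms)[:top_k]
```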

퍼지 k-Nearest Neighbors 와 Reconstruction Error 기반 Lazy Classifier 설계 (Design of Lazy Classifier based on Fuzzy k-Nearest Neighbors and Reconstruction Error)

  • 노석범;안태천
    • 한국지능시스템학회논문지
    • /
    • Vol. 20 No. 1
    • /
    • pp.101-108
    • /
    • 2010
  • In this paper, we propose the design of a lazy classifier that uses feature selection based on fuzzy k-NN and reconstruction error. The reconstruction error is an evaluation index of locally linear reconstruction. When a new input is given, fuzzy k-NN defines the local region in which the local classifier is valid and assigns weight values to the data patterns contained in that region. After the local region and the weight values have been defined, feature selection is performed to reduce the dimensionality of the feature space. Once several features with good performance in terms of reconstruction error have been selected, a polynomial-type classifier is determined by the weighted least-squares method. The experimental results compare the proposed classifier with existing classifiers: standard neural networks, support vector machines, linear discriminant analysis, and C4.5 trees.
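
A bare-bones sketch of the lazy pipeline described above: fuzzy k-NN weights define the local region, and a polynomial (here first-order) classifier is fitted by weighted least squares. The reconstruction-error-based feature selection step is omitted, and the fuzzifier value and neighborhood size are assumptions.

```python
import numpy as np

def fuzzy_knn_weights(x, X_train, k=15, m=2.0):
    """Fuzzy k-NN: take the k nearest patterns and weight them by inverse distance
    (m is the fuzzifier; the weights are normalised to sum to one)."""
    d = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] ** (2.0 / (m - 1.0)) + 1e-12)
    return idx, w / w.sum()

def lazy_predict(x, X_train, y_onehot, k=15):
    """Fit a first-order polynomial (linear) classifier by weighted least squares on the
    local region defined by fuzzy k-NN, then classify the query point x."""
    idx, w = fuzzy_knn_weights(x, X_train, k)
    A = np.hstack([np.ones((k, 1)), X_train[idx]])   # design matrix with a bias column
    S = np.diag(np.sqrt(w))                          # sqrt so squared residuals get weight w
    coef, *_ = np.linalg.lstsq(S @ A, S @ y_onehot[idx], rcond=None)
    return int((np.concatenate([[1.0], x]) @ coef).argmax())
```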

전역 및 지역 특징 기반 딥러닝을 이용한 프린터 장치 판별 기술 (Printer Identification Methods Using Global and Local Feature-Based Deep Learning)

  • 이수현;이해연
    • 정보처리학회논문지:소프트웨어 및 데이터공학
    • /
    • Vol. 8 No. 1
    • /
    • pp.37-44
    • /
    • 2019
  • With the advance of digital IT technology, printers and scanners have improved in performance and become cheaper, making them easily accessible to the general public. As a side effect, however, crimes such as the forgery of official and private documents can be committed easily. Therefore, if the printer used to produce a given document can be identified, it helps narrow the scope of an investigation and identify suspects. In this paper, we propose deep learning models for printer identification. We first propose a printer identification model based on a local feature-based convolutional neural network, which is widely used in recent recognition tasks, and then propose a technique that improves convergence speed and accuracy by introducing global feature-based processing into the network model. The performance of the proposed models was compared with existing feature-based printer identification techniques using eight printer devices. As a result, the proposed local feature-based model and global feature-based model achieved high identification accuracies of 97.23% and 99.98%, respectively, outperforming the existing techniques.
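
The two-branch idea, a small CNN over local image patches with image-level global features injected before the classifier, might look roughly like the PyTorch sketch below; the layer sizes, the source of the global feature vector, and the eight-class output are illustrative assumptions only.

```python
import torch
import torch.nn as nn

class PrinterIdNet(nn.Module):
    """Local branch: a small CNN over grayscale patches. A precomputed global feature
    vector (image-level statistics; its construction is not shown here) is concatenated
    with the CNN output before the classifier."""
    def __init__(self, n_printers=8, n_global=16):
        super().__init__()
        self.local = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Sequential(
            nn.Linear(64 + n_global, 128), nn.ReLU(), nn.Linear(128, n_printers),
        )

    def forward(self, patch, global_feat):
        local_feat = self.local(patch).flatten(1)     # (B, 64)
        return self.classifier(torch.cat([local_feat, global_feat], dim=1))

# usage: logits = PrinterIdNet()(torch.randn(4, 1, 64, 64), torch.randn(4, 16))
```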

Residual Learning Based CNN for Gesture Recognition in Robot Interaction

  • Han, Hua
    • Journal of Information Processing Systems
    • /
    • Vol. 17 No. 2
    • /
    • pp.385-398
    • /
    • 2021
  • The complexity of deep learning models affects the real-time performance of gesture recognition, thereby limiting the application of gesture recognition algorithms in practical scenarios. Hence, a residual learning neural network based on a deep convolutional neural network is proposed. First, small convolution kernels are used to extract the local details of gesture images. Subsequently, a shallow residual structure is built to share weights, thereby avoiding vanishing or exploding gradients as the network deepens; consequently, model optimisation becomes easier. Additional convolutional neural networks are used to accelerate the refinement of deep abstract features based on the spatial importance of the gesture feature distribution. Finally, a fully connected cascade softmax classifier is used to complete the gesture recognition. Compared with densely connected networks that multiplex feature information, the proposed algorithm optimises feature reuse to avoid the performance fluctuations caused by feature redundancy. Experimental results on the ISOGD gesture dataset and the Gesture dataset show that the proposed algorithm offers fast convergence and high accuracy.
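
A minimal PyTorch residual block with small 3x3 kernels, of the kind the abstract describes; the channel count, normalization layers, and overall depth of the paper's network are not reproduced here.

```python
import torch
import torch.nn as nn

class SmallKernelResBlock(nn.Module):
    """Two 3x3 convolutions with an identity shortcut, so gradients can bypass the
    convolutions and vanishing/exploding gradients are mitigated as depth grows."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels),
        )
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(x + self.body(x))   # residual connection

# usage: y = SmallKernelResBlock(32)(torch.randn(2, 32, 56, 56))
```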

Human Activity Recognition Based on 3D Residual Dense Network

  • Park, Jin-Ho;Lee, Eung-Joo
    • 한국멀티미디어학회논문지
    • /
    • Vol. 23 No. 12
    • /
    • pp.1540-1551
    • /
    • 2020
  • To address the problem that existing human behavior recognition algorithms cannot fully utilize multi-level spatio-temporal information, a human behavior recognition algorithm based on a dense three-dimensional residual network is proposed. First, the proposed algorithm uses a three-dimensional residual dense block as the basic module of the network; this module extracts hierarchical features of human behavior through densely connected convolutional layers. Second, a local feature aggregation adaptive method is used to learn the local dense features of human behavior. Then, a residual connection module is applied to promote the flow of feature information and reduce the difficulty of training. Finally, multi-layer local feature extraction is realized by cascading multiple three-dimensional residual dense blocks, and a global feature aggregation adaptive method is used to learn the features of all network layers to perform human behavior recognition. Extensive experimental results on the benchmark KTH dataset show that the recognition rate (top-1 accuracy) of the proposed algorithm reaches 93.52%, an improvement of 3.93 percentage points over the three-dimensional convolutional neural network (C3D) algorithm. The proposed framework has good robustness and transfer learning ability, and can effectively handle a variety of video behavior recognition tasks.
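
A rough PyTorch sketch of a three-dimensional residual dense block: densely connected Conv3d layers, 1x1x1 local feature fusion, and a residual connection back to the block input. The growth rate, layer count, and channel sizes are assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class ResidualDenseBlock3D(nn.Module):
    """Densely connected Conv3d layers, 1x1x1 local feature fusion, and a residual
    connection back to the block input."""
    def __init__(self, channels=64, growth=32, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Sequential(nn.Conv3d(channels + i * growth, growth, 3, padding=1), nn.ReLU())
            for i in range(n_layers)
        )
        self.fuse = nn.Conv3d(channels + n_layers * growth, channels, 1)  # local fusion

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))   # dense connections
        return x + self.fuse(torch.cat(feats, dim=1))      # residual connection

# usage: out = ResidualDenseBlock3D()(torch.randn(1, 64, 8, 56, 56))
```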

Improving Transformer with Dynamic Convolution and Shortcut for Video-Text Retrieval

  • Liu, Zhi;Cai, Jincen;Zhang, Mengmeng
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 16 No. 7
    • /
    • pp.2407-2424
    • /
    • 2022
  • Recently, the Transformer has made great progress in video retrieval tasks due to its high representation capability. In a Transformer, the cascaded self-attention modules can capture long-distance feature dependencies, but local feature details are likely to deteriorate. In addition, increasing the depth of the structure is likely to introduce learning bias in the learned features. In this paper, an improved Transformer structure named TransDCS (Transformer with Dynamic Convolution and Shortcut) is proposed. A Multi-head Conv-Self-Attention module is introduced to model local dependencies and improve the efficiency of local feature extraction. Meanwhile, an augmented-shortcuts module based on a dual identity matrix is applied to enhance the propagation of input features and mitigate the learning bias. The proposed model is tested on the MSRVTT, LSMDC and Activity-Net benchmarks, where it surpasses previous solutions for the video-text retrieval task. For example, on the LSMDC benchmark, a gain of about 2.3% in MdR and 6.1% in MnR is obtained over recently proposed multimodal methods.
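
The two ingredients, a convolutional branch inside multi-head self-attention to recover local detail and an extra augmented shortcut alongside the usual identity path, can be sketched as follows. This is a schematic reading of the abstract, not the TransDCS implementation; the depthwise convolution and the linear augmented-shortcut used here are assumptions.

```python
import torch
import torch.nn as nn

class ConvSelfAttention(nn.Module):
    """Multi-head self-attention plus a depthwise temporal convolution branch for local
    dependencies, combined with both the usual identity shortcut and an extra learned
    (augmented) shortcut."""
    def __init__(self, dim=512, heads=8, kernel=3):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.conv = nn.Conv1d(dim, dim, kernel, padding=kernel // 2, groups=dim)
        self.aug_shortcut = nn.Linear(dim, dim)   # augmented shortcut path
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):                         # x: (batch, seq_len, dim)
        h = self.norm(x)
        attn_out, _ = self.attn(h, h, h)
        conv_out = self.conv(h.transpose(1, 2)).transpose(1, 2)
        return x + self.aug_shortcut(x) + attn_out + conv_out

# usage: y = ConvSelfAttention()(torch.randn(2, 20, 512))
```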