• Title/Summary/Keyword: Feature Fusion


Feature information fusion using multiple neural networks and target identification application of FLIR image (다중 신경회로망을 이용한 특징정보 융합과 적외선영상에서의 표적식별에의 응용)

  • 선선구;박현욱
    • Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.4 / pp.266-274 / 2003
  • Distance Fourier descriptors of local target boundaries and feature information fusion using multiple MLPs (multilayer perceptrons) are proposed, and are used to identify non-occluded and partially occluded targets in natural FLIR (forward-looking infrared) images. After segmenting a target, radial Fourier descriptors are defined from the target boundary as global shape features. The boundary is then partitioned into four local boundaries to extract local shape features: in each local boundary, a distance function is defined from the boundary points and the line between the two extreme points, and distance Fourier descriptors are computed from this function as local shape features. The one global feature vector and four local feature vectors serve as input to multiple MLPs, whose outputs determine the final identification result for the target. Experiments show that the proposed method is superior to traditional feature sets in identification performance.
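The global shape feature above can be sketched in a few lines: a minimal, illustrative take on radial Fourier descriptors, where the function name `radial_fourier_descriptors` and the normalization by the DC term are assumptions, not the paper's exact formulation.

```python
import numpy as np

def radial_fourier_descriptors(boundary, n_desc=8):
    """Global shape features: FFT of centroid-to-boundary distances.
    The magnitudes are rotation invariant; dividing by the DC term
    adds scale invariance."""
    boundary = np.asarray(boundary, dtype=float)
    centroid = boundary.mean(axis=0)
    radial = np.linalg.norm(boundary - centroid, axis=1)
    spectrum = np.abs(np.fft.fft(radial))
    return spectrum[1:n_desc + 1] / spectrum[0]  # drop DC, normalize

# Toy boundary: points on a circle (constant radius -> near-zero descriptors).
theta = np.linspace(0, 2 * np.pi, 64, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
desc = radial_fourier_descriptors(circle)
```

A perfect circle yields descriptors that are essentially zero, so any non-zero entries measure departure from circular shape.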

A Multimodal Fusion Method Based on a Rotation Invariant Hierarchical Model for Finger-based Recognition

  • Zhong, Zhen;Gao, Wanlin;Wang, Minjuan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.1 / pp.131-146 / 2021
  • Multimodal biometric recognition has been an active research topic in recent years because of its convenience. Since fingers are highly convenient for users, finger-based personal identification is widely used in practice. Taking Finger-Print (FP), Finger-Vein (FV), and Finger-Knuckle-Print (FKP) as the constituent modalities, their feature representations help improve the universality and reliability of identification. To fuse the multimodal finger features effectively, a new robust representation algorithm based on a hierarchical model is proposed. First, to obtain more robust features, feature maps are generated by Gabor magnitude feature coding and then described by the Local Binary Pattern (LBP). Second, the LGBP-based feature maps are processed hierarchically in a bottom-up manner by variable rectangle and circle granules, respectively. Finally, the intensity of each granule is represented by Local-invariant Gray Features (LGFs), yielding Hierarchical Local-Gabor-based Gray Invariant Features (HLGGIFs). Experimental results reveal that the proposed algorithm is robust to rotation variation of finger pose and achieves a lower Equal Error Rate (EER) on our in-house database.
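As a small illustration of the LBP step in the pipeline above, here is a minimal 3×3 Local Binary Pattern encoder (the function name `lbp_code` and the clockwise bit ordering are assumptions; the Gabor coding and hierarchical granule stages are omitted):

```python
import numpy as np

def lbp_code(patch):
    """8-neighbour LBP code for the centre pixel of a 3x3 patch:
    each neighbour >= centre sets the corresponding bit."""
    center = patch[1, 1]
    # Clockwise neighbour order starting at the top-left corner.
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    return sum((1 << i) for i, n in enumerate(neighbors) if n >= center)

patch = np.array([[9, 1, 9],
                  [1, 5, 1],
                  [9, 1, 9]])
code = lbp_code(patch)
```

The four corner pixels exceed the centre, so bits 0, 2, 4, and 6 are set.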

Dual Attention Based Image Pyramid Network for Object Detection

  • Dong, Xiang;Li, Feng;Bai, Huihui;Zhao, Yao
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.12 / pp.4439-4455 / 2021
  • Compared with two-stage object detection algorithms, one-stage algorithms provide a better trade-off between real-time performance and accuracy. However, these methods treat intermediate features equally, lacking the flexibility to emphasize information that is meaningful for classification and localization. Moreover, they ignore the interaction of contextual information across scales, which is important for detecting medium and small objects. To tackle these problems, we propose an image pyramid network based on a dual attention mechanism (DAIPNet), which builds an image pyramid to enrich spatial information while emphasizing multi-scale informative features for one-stage object detection. Our framework utilizes a pre-trained backbone as the standard detection network, while the designed image pyramid network (IPN) serves as an auxiliary network providing complementary information. The dual attention mechanism comprises the adaptive feature fusion module (AFFM) and the progressive attention fusion module (PAFM). AFFM automatically attends to feature maps of differing importance from the backbone and auxiliary network, while PAFM adaptively learns channel-attentive information during the context transfer process. Furthermore, in the IPN, an image pyramid extracts scale-wise features from downsampled images of different scales, and these features are fused at different stages to enrich scale-wise information and learn more comprehensive feature representations. Experimental results are reported on the MS COCO dataset: our detector with a 300 × 300 input achieves a superior 32.6% mAP on MS COCO test-dev compared with state-of-the-art methods.
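The channel-attentive rescaling that PAFM learns can be sketched with a squeeze-and-excitation style gate. This NumPy toy (the function name and the identity weight matrices are assumptions, not the paper's trained parameters) shows only the gating mechanics:

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation style channel attention: global average
    pool per channel, a two-layer bottleneck with ReLU, a sigmoid gate,
    then channel-wise rescaling of the feature map."""
    squeeze = feat.mean(axis=(1, 2))            # (C,) pooled descriptor
    hidden = np.maximum(w1 @ squeeze, 0)        # ReLU
    gate = 1 / (1 + np.exp(-(w2 @ hidden)))     # sigmoid in (0, 1)
    return feat * gate[:, None, None]           # rescale each channel

# Two-channel toy map: one all-ones channel, one all-minus-ones channel.
feat = np.stack([np.ones((4, 4)), -np.ones((4, 4))])  # (C=2, H=4, W=4)
out = channel_attention(feat, np.eye(2), np.eye(2))
```

With identity weights, the positive channel is gated by sigmoid(1) ≈ 0.73 and the negative one by sigmoid(0) = 0.5.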

Building Detection by Convolutional Neural Network with Infrared Image, LiDAR Data and Characteristic Information Fusion (적외선 영상, 라이다 데이터 및 특성정보 융합 기반의 합성곱 인공신경망을 이용한 건물탐지)

  • Cho, Eun Ji;Lee, Dong-Cheon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.38 no.6 / pp.635-644 / 2020
  • Object recognition, detection, and instance segmentation based on DL (Deep Learning) have been used in various applications, and optical images are mainly used as training data for DL models. The major objective of this paper is object segmentation and building detection by utilizing multimodal datasets as well as optical images to train Detectron2, one of the improved R-CNN (Region-based Convolutional Neural Network) models. For the implementation, infrared aerial images, LiDAR (Light Detection And Ranging) data, edges extracted from the images, and Haralick features, which represent statistical texture information derived from the LiDAR data, were generated. The performance of DL models depends not only on the amount and characteristics of the training data, but also on the fusion method, especially for multimodal data. Segmenting objects and detecting buildings by applying hybrid fusion, a mixed method of early fusion and late fusion, improved the building detection rate by 32.65% compared to training with optical images only. The experiments demonstrated the complementary effect of training on multimodal data with unique characteristics, combined with an appropriate fusion strategy.
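Early versus late fusion, the two ingredients of the hybrid scheme above, can be contrasted in a few lines (the function names and toy shapes are assumptions; the actual fusion in the paper happens inside the Detectron2 pipeline):

```python
import numpy as np

def early_fusion(*modalities):
    """Early fusion: stack the modalities channel-wise so the network
    sees them together from the first layer."""
    return np.concatenate(modalities, axis=0)

def late_fusion(*scores):
    """Late fusion: average per-modality class scores after separate
    networks have each produced a prediction."""
    return np.mean(scores, axis=0)

ir = np.zeros((1, 8, 8))       # one infrared channel (toy)
lidar = np.ones((2, 8, 8))     # two LiDAR-derived channels (toy)
fused_input = early_fusion(ir, lidar)             # (3, 8, 8) input tensor
fused_score = late_fusion([0.2, 0.8], [0.6, 0.4])  # per-class averages
```

Hybrid fusion applies both: some modalities are concatenated at the input, while separately trained branches are combined at the decision stage.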

Emotion Recognition Method based on Feature and Decision Fusion using Speech Signal and Facial Image (음성 신호와 얼굴 영상을 이용한 특징 및 결정 융합 기반 감정 인식 방법)

  • Joo, Jong-Tae;Yang, Hyun-Chang;Sim, Kwee-Bo
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2007.11a / pp.11-14 / 2007
  • Emotion recognition is essential for interaction between humans and computers. In this paper, speech signals and facial images are classified into five emotion patterns (Normal, Happy, Sad, Anger, Surprise) by applying BL (Bayesian Learning) and PCA (Principal Component Analysis). To compensate for the weaknesses of each signal and raise the recognition rate, emotion fusion is performed using a decision fusion method and a feature fusion method. In the decision fusion method, the recognition scores obtained from each recognition system are applied to fuzzy membership functions to fuse the emotions. In the feature fusion method, salient features are selected through SFS (Sequential Forward Selection) and then applied to an MLP (Multilayer Perceptron)-based neural network to perform emotion fusion.
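The decision-fusion idea, combining per-emotion confidences from the speech and face classifiers, can be sketched as a weighted score fusion (a plain weighted sum stands in for the paper's fuzzy membership functions; the function name, weights, and toy scores are assumptions):

```python
def decision_fusion(speech_probs, face_probs, w_speech=0.5):
    """Weighted decision fusion of per-emotion confidence scores from
    the speech and face classifiers; the emotion with the highest
    fused score wins."""
    fused = {e: w_speech * speech_probs[e] + (1 - w_speech) * face_probs[e]
             for e in speech_probs}
    return max(fused, key=fused.get), fused

speech = {"Normal": 0.1, "Happy": 0.6, "Sad": 0.1, "Anger": 0.1, "Surprise": 0.1}
face = {"Normal": 0.2, "Happy": 0.5, "Sad": 0.1, "Anger": 0.1, "Surprise": 0.1}
label, fused = decision_fusion(speech, face)
```

Here both modalities lean toward "Happy", so the fused decision agrees; the benefit of fusion appears when one noisy modality is outvoted by the other.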


Secure Biometric Hashing by Random Fusion of Global and Local Features

  • Ou, Yang;Rhee, Kyung-Hyune
    • Journal of Korea Multimedia Society / v.13 no.6 / pp.875-883 / 2010
  • In this paper, we present a secure biometric hashing scheme for face recognition based on random fusion of global and local features. The Fourier-Mellin transform and the Radon transform are adopted to form specialized representations of the global and local features, respectively, owing to their invariance to geometric operations. The final biometric hash is generated securely as a randomly weighted sum of both feature sets. A fourfold key is involved in our algorithm to ensure the security and privacy of biometric templates. The proposed biometric hash is revocable and can be replaced by using a new key. Moreover, an attacker cannot obtain any information about the original biometric template without knowing the secret key. The experimental results confirm that our scheme achieves satisfactory accuracy in terms of EER.
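The key-dependent random weighting can be sketched as follows: a minimal, illustrative scheme where a key seeds the random projection of both feature sets and the result is binarised (the function name, projection size, and median thresholding are assumptions, not the paper's exact construction):

```python
import numpy as np

def biometric_hash(global_feat, local_feat, key, n_bits=16):
    """Key-seeded random weighted sum of global and local feature
    vectors, binarised against its median to give a revocable hash."""
    rng = np.random.default_rng(key)            # key controls the weights
    w_g = rng.normal(size=(n_bits, len(global_feat)))
    w_l = rng.normal(size=(n_bits, len(local_feat)))
    mixed = w_g @ global_feat + w_l @ local_feat
    return (mixed > np.median(mixed)).astype(int)

g = np.array([0.2, 0.5, 0.1, 0.9])   # toy global features
l = np.array([0.4, 0.3, 0.7])        # toy local features
h1 = biometric_hash(g, l, key=42)
h2 = biometric_hash(g, l, key=42)    # same key -> same hash
h3 = biometric_hash(g, l, key=7)     # a new key revokes the old template
```

Revocability follows because re-keying changes the projection: the same biometric yields a fresh, unrelated hash.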

Extraction of Spatial Characteristics of Cadastral Land Category from RapidEye Satellite Images

  • La, Phu Hien;Huh, Yong;Eo, Yang Dam;Lee, Soo Bong
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.32 no.6 / pp.581-590 / 2014
  • With rapid land development, land categories should be updated on a regular basis; however, manual field surveys have certain limitations. In this study, we extracted a feature vector comprising the per-parcel spectral signature, PIMP (Percent Imperviousness), texture, and VIs (Vegetation Indices) based on RapidEye satellite images and the cadastral map. A total of nine land categories whose feature vectors could be reliably extracted from the images were selected and classified using an SVM (Support Vector Machine). Accuracy assessment comparing the cadastral map and the classification result gave an overall accuracy of 0.74. In the paddy-field category in particular, the producer's accuracy and user's accuracy were highest, at 0.85 and 0.86, respectively.
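Assembling the per-parcel feature vector can be sketched like this: a toy version combining mean spectral values, an NDVI vegetation index, and PIMP (the function name and this reduced feature set are assumptions; the study's full vector also includes texture measures, and classification is done separately with an SVM):

```python
import numpy as np

def parcel_features(red, nir, impervious_mask):
    """Per-parcel feature vector: mean red and NIR reflectance,
    NDVI = (NIR - red) / (NIR + red), and PIMP as the percentage
    of impervious pixels within the parcel."""
    ndvi = (nir.mean() - red.mean()) / (nir.mean() + red.mean())
    pimp = 100.0 * impervious_mask.mean()
    return np.array([red.mean(), nir.mean(), ndvi, pimp])

red = np.full((4, 4), 0.2)       # toy red band for one parcel
nir = np.full((4, 4), 0.6)       # toy near-infrared band
mask = np.zeros((4, 4))
mask[:2] = 1                     # top half of the parcel is impervious
feat = parcel_features(red, nir, mask)
```

Vegetated parcels push NDVI toward 1, while built-up parcels raise PIMP, which is what lets the classifier separate land categories.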

Ensemble convolutional neural networks for automatic fusion recognition of multi-platform radar emitters

  • Zhou, Zhiwen;Huang, Gaoming;Wang, Xuebao
    • ETRI Journal / v.41 no.6 / pp.750-759 / 2019
  • Presently, the extraction of hand-crafted features is still the dominant method in radar emitter recognition. To address the complicated problems of selecting and updating empirical features, we present a novel automatic feature extraction structure based on deep learning. In particular, a convolutional neural network (CNN) is adopted to extract high-level abstract representations from the time-frequency images of emitter signals, so the redundant process of designing discriminative features can be avoided. Furthermore, to address the performance degradation of a single platform, we propose an ensemble learning-based architecture for multi-platform fusion recognition. Experimental results indicate that the proposed algorithms are feasible and effective, outperforming other typical feature extraction and fusion recognition methods in terms of accuracy. Moreover, the proposed structure can be extended to other prevalent ensemble learning alternatives.
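The fusion stage of such an ensemble can be sketched by averaging the per-platform posterior distributions (the function names and toy logits are assumptions; the paper's per-platform models are CNNs rather than fixed logit vectors):

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

def ensemble_fuse(platform_logits):
    """Average the per-platform softmax posteriors, then take the
    argmax as the fused emitter label."""
    probs = np.mean([softmax(l) for l in platform_logits], axis=0)
    return int(np.argmax(probs)), probs

logits_a = np.array([2.0, 1.0, 0.1])   # platform A mildly favours class 0
logits_b = np.array([0.5, 2.5, 0.1])   # platform B strongly favours class 1
label, probs = ensemble_fuse([logits_a, logits_b])
```

Platform B's more confident vote dominates the average, so the fused decision is class 1; this is how the ensemble compensates for a single weak platform.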

Vocal Effort Detection Based on Spectral Information Entropy Feature and Model Fusion

  • Chao, Hao;Lu, Bao-Yun;Liu, Yong-Li;Zhi, Hui-Lai
    • Journal of Information Processing Systems / v.14 no.1 / pp.218-227 / 2018
  • Vocal effort detection is important for both robust speech recognition and speaker recognition. In this paper, a spectral information entropy feature, which carries more salient information about the vocal effort level, is first proposed. Then, a model fusion method based on complementary models is presented to recognize the vocal effort level. Experiments conducted on an isolated-word test set show that spectral information entropy performs best among the three kinds of features, and the recognition accuracy across all vocal effort levels reaches 81.6%, demonstrating the potential of the proposed method.
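A minimal sketch of a spectral entropy feature, assuming the standard definition as the Shannon entropy of a frame's normalised power spectrum (the function name and frame setup are assumptions; the paper's exact feature derivation may differ):

```python
import numpy as np

def spectral_entropy(frame):
    """Shannon entropy (bits) of the normalised power spectrum of one
    frame; flat spectra give high entropy, peaky spectra give low."""
    power = np.abs(np.fft.rfft(frame)) ** 2
    p = power / power.sum()        # treat the spectrum as a distribution
    p = p[p > 0]                   # avoid log(0)
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
noise = rng.normal(size=512)                          # flat spectrum
tone = np.sin(2 * np.pi * 32 * np.arange(512) / 512)  # single spectral line
```

White noise spreads its energy across all bins and so scores much higher entropy than a pure tone, whose energy sits in one bin.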

Multi-Path Feature Fusion Module for Semantic Segmentation (다중 경로 특징점 융합 기반의 의미론적 영상 분할 기법)

  • Park, Sangyong;Heo, Yong Seok
    • Journal of Korea Multimedia Society / v.24 no.1 / pp.1-12 / 2021
  • In this paper, we present a new architecture for semantic segmentation. Semantic segmentation aims at pixel-wise classification, which is important for fully understanding images. Previous semantic segmentation networks use multi-layer features from the encoder to predict the final result. However, these multi-layer features do not cover various receptive fields, which easily leads to inaccurate results at boundaries between different classes and for small objects. To solve this problem, we propose a multi-path feature fusion module that allows the features of each layer to cover various receptive fields by using a set of dilated convolutions with different dilation rates. Various experiments demonstrate that our method outperforms previous methods in terms of mean intersection over union (mIoU).
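The multi-path idea above can be sketched in 1-D: the same kernel is applied at several dilation rates and the paths are summed, so the fused response covers multiple receptive fields at once (the function names, 1-D setting, and sum fusion are illustrative assumptions; the paper works on 2-D feature maps):

```python
import numpy as np

def dilated_conv1d(x, kernel, rate):
    """'Same'-padded 1-D convolution with holes: kernel taps are spaced
    `rate` samples apart, enlarging the receptive field without adding
    parameters."""
    k = len(kernel)
    pad = rate * (k // 2)
    xp = np.pad(x, pad)
    return np.array([sum(kernel[j] * xp[i + j * rate] for j in range(k))
                     for i in range(len(x))])

def multi_path_fusion(x, kernel, rates=(1, 2, 4)):
    """Run the same kernel at several dilation rates and sum the paths."""
    return sum(dilated_conv1d(x, kernel, r) for r in rates)

x = np.zeros(9)
x[4] = 1.0                                   # unit impulse in the middle
fused = multi_path_fusion(x, kernel=[1.0, 1.0, 1.0])
```

The impulse response shows the fused receptive field: every path contributes at the centre, while rates 1, 2, and 4 each spread the response to progressively more distant positions.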