• Title/Abstract/Keywords: Local feature

Shot boundary Frame Detection and Key Frame Detection for Multimedia Retrieval (멀티미디어 검색을 위한 shot 경계 및 대표 프레임 추출)

  • 강대성;김영호
    • Journal of the Institute of Convergence Signal Processing / v.2 no.1 / pp.38-43 / 2001
  • This paper proposes a new feature for shot detection, derived from the DC image constructed from the DCT DC coefficients in an MPEG video stream, together with a characterizing value that reflects the type of video (movie, drama, news, music video, etc.). Key frames are extracted from the frame sequence using the local minima and maxima of the differential of this value. After the original frames (not the DC images) are reconstructed for the key frames, indexing is performed by computing parameters. Key frames similar to the user's query image are then retrieved by comparing these parameters. Experiments show that the proposed methods outperform the conventional method and achieve high retrieval accuracy.
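
The key-frame step described in this abstract (local minima and maxima of the differential of a per-frame value) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `local_extrema`, the use of a simple first difference as the "differential", and the sample values are all assumptions.

```python
def local_extrema(values):
    """Return indices into the first-difference sequence of `values`
    where that sequence has a strict local minimum or maximum.

    Index i refers to the transition between frame i and frame i + 1.
    """
    # First difference of the per-frame characterizing value (assumed form).
    diffs = [b - a for a, b in zip(values, values[1:])]
    extrema = []
    for i in range(1, len(diffs) - 1):
        prev, cur, nxt = diffs[i - 1], diffs[i], diffs[i + 1]
        is_max = cur > prev and cur > nxt
        is_min = cur < prev and cur < nxt
        if is_max or is_min:
            extrema.append(i)
    return extrema


# Hypothetical per-frame values; real values would come from the DC images.
frame_values = [0, 1, 1, 9, 10, 10, 2, 2]
candidates = local_extrema(frame_values)
print(candidates)
```

A practical system would likely smooth the signal or apply a threshold before picking extrema, since raw frame differences are noisy.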

Global Feature Extraction and Recognition from Matrices of Gabor Feature Faces

  • Odoyo, Wilfred O.;Cho, Beom-Joon
    • Journal of information and communication convergence engineering / v.9 no.2 / pp.207-211 / 2011
  • This paper presents a method for facial feature representation and recognition based on the covariance matrices of Gabor-filtered images. Gabor filters are a powerful image-processing tool that responds to different local orientations and wave numbers around points of interest, especially the local features of the face. This property is needed to extract distinctive features around facial components such as the eyebrows, eyes, mouth, and nose. The covariance matrices computed on Gabor-filtered faces are adopted as the feature representation for face recognition. A geodesic distance is used as the matching measure and is preferred over other methods for its global consistency. The geodesic measure takes into consideration the position of the data points in addition to the geometric structure of the given face images. The proposed method is invariant and robust under rotation, pose, and boundary distortion. Tests run on random images and on the publicly available JAFFE and FRAV3D face recognition databases yield high recognition rates.

Enhanced SIFT Descriptor Based on Modified Discrete Gaussian-Hermite Moment

  • Kang, Tae-Koo;Zhang, Huazhen;Kim, Dong W.;Park, Gwi-Tae
    • ETRI Journal / v.34 no.4 / pp.572-582 / 2012
  • The discrete Gaussian-Hermite moment (DGHM) is a global feature representation method that can be applied to square images. We propose a modified DGHM (MDGHM) method and an MDGHM-based scale-invariant feature transform (MDGHM-SIFT) descriptor. In the MDGHM, we devise a movable mask to represent the local features of a non-square image. The complete set of non-square image features is then represented by the summation of all MDGHMs. We also propose applying an accumulated MDGHM using multi-order derivatives to obtain distinguishable feature information in the third stage of SIFT. Finally, we calculate an MDGHM-based magnitude and an MDGHM-based orientation using the accumulated MDGHM. We carry out experiments using the proposed method with six kinds of deformations. The results show that the proposed method can be applied to non-square images without any image truncation and that it significantly outperforms other SIFT algorithms in matching accuracy.

Region Division for Large-scale Image Retrieval

  • Rao, Yunbo;Liu, Wei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.10 / pp.5197-5218 / 2019
  • Large-scale retrieval remains an open problem for visual analysis applications. In this paper, we propose a high-efficiency region-division-based image retrieval approach that fuses low-level local color histogram and texture features. A novel image region division is proposed to roughly mimic the spatial distribution of image color and to address the failure of color histograms to describe spatial information. Furthermore, to optimize our region division retrieval method, an image descriptor combining local color histogram and Gabor texture features with reduced feature dimensions is developed. Moreover, we propose an extended Canberra distance as the image similarity measure to increase the fault tolerance of large-scale image retrieval as a whole. Extensive experimental results on several benchmark image retrieval databases validate the superiority of the proposed approach over many recently proposed color-histogram-based and texture-feature-based algorithms.
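
For orientation, the standard Canberra distance that this abstract extends can be sketched as below. The abstract does not specify the extension, so this shows only the textbook form; the function name `canberra` and the convention of skipping terms whose denominator is zero are assumptions.

```python
def canberra(u, v):
    """Standard Canberra distance between two equal-length feature vectors.

    d(u, v) = sum_i |u_i - v_i| / (|u_i| + |v_i|)
    Terms where both components are zero are skipped (an assumed convention).
    """
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


# Hypothetical 2-bin histograms.
print(canberra([1, 3], [3, 1]))
```

Because each term is normalized by the magnitude of the components, small-valued histogram bins contribute as much as large ones, which is one reason the measure is used for histogram comparison.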

Stroke Width-Based Contrast Feature for Document Image Binarization

  • Van, Le Thi Khue;Lee, Gueesang
    • Journal of Information Processing Systems / v.10 no.1 / pp.55-68 / 2014
  • Automatic segmentation of foreground text from the background in degraded document images is essential for smooth reading of the document content and for machine recognition tasks. In this paper, we present a novel approach to the binarization of degraded document images. The proposed method uses a new local contrast feature extracted based on the stroke width of text. First, pre-processing is carried out for noise removal. Text boundary detection is then performed on the image constructed from the contrast feature. Local estimation follows to extract text from the background. Finally, a refinement procedure is applied to the binarized image as a post-processing step to improve the quality of the final results. Experiments on extracting text from degraded handwritten and machine-printed document images, with comparisons against several well-known binarization algorithms, demonstrate the effectiveness of the proposed method.

Action Recognition with deep network features and dimension reduction

  • Li, Lijun;Dai, Shuling
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.832-854 / 2019
  • Action recognition has been studied in the computer vision field for years. We present an effective approach to recognizing actions that applies dimension reduction as a crucial step to reduce the dimensionality of feature descriptors after feature extraction. We modify the standard Local Fisher Discriminant Analysis (LFDA) with a sparse matrix and a randomized kd-tree, yielding a modified LFDA (mLFDA) method that greatly reduces the required memory and accelerates the standard LFDA. For feature encoding, we propose an encoding method called mix encoding, which combines Fisher vector encoding and locality-constrained linear coding to obtain the final video representations. To add more meaningful features to the action recognition process, a convolutional neural network is combined with mix encoding to produce the deep network feature. Experimental results show that our algorithm is competitive on the KTH, HMDB51, and UCF101 datasets when all these methods are combined.

Attention-based for Multiscale Fusion Underwater Image Enhancement

  • Huang, Zhixiong;Li, Jinjiang;Hua, Zhen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.2 / pp.544-564 / 2022
  • Underwater images often suffer from color distortion, blurring, and low contrast because light propagating underwater is affected by two processes: absorption and scattering. To cope with the poor quality of underwater images, this paper proposes a multiscale fusion underwater image enhancement method based on a channel attention mechanism and the local binary pattern (LBP). The network consists of three modules: feature aggregation, image reconstruction, and LBP enhancement. The feature aggregation module aggregates feature information at different scales of the image, and the image reconstruction module restores the output features to high-quality underwater images. The network also introduces a channel attention mechanism so that it pays more attention to the channels containing important information. The detail information is protected by real-time superposition with feature information. Experimental results demonstrate that the proposed method produces results with correct colors and complete details, and outperforms existing methods in quantitative metrics.

Smoke Detection Method Using Local Binary Pattern Variance in RGB Contrast Image (RGB Contrast 영상에서의 Local Binary Pattern Variance를 이용한 연기검출 방법)

  • Kim, Jung Han;Bae, Sung-Ho
    • Journal of Korea Multimedia Society / v.18 no.10 / pp.1197-1204 / 2015
  • Smoke detection plays an important role in the early detection of fire. In this paper, we propose a method in which Local Binary Pattern Variance (LBPV) feature vectors generated from RGB contrast images are used to detect smoke with a Support Vector Machine (SVM). The proposed method rearranges the block mean values of the R, G, and B channels together with the intensity of those means. It then generates an RGB contrast image that indicates the contrast of each RGB channel, exploiting the achromatic color of smoke. Uniform LBPV, rotation-invariant LBPV, and rotation-invariant uniform LBPV are applied to the RGB contrast images to generate LBP-form feature vectors, which the SVM uses to distinguish smoke from non-smoke areas. Experimental results show that the true positive detection rate is similar and the false positive detection rate is improved, even though the proposed method halves the number of feature vectors compared with the existing method using LBP and LBPV.
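
To make the LBPV idea in this abstract concrete, the sketch below computes a basic 8-neighbor LBP code per pixel and accumulates a histogram weighted by local variance. It is a simplified illustration under stated assumptions: a fixed 3x3 neighborhood, no uniform or rotation-invariant mapping, and no RGB contrast step. The names `lbp_code` and `lbpv_histogram` are hypothetical, not taken from the paper.

```python
# Clockwise 8-neighborhood offsets (row, col), starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Basic LBP: set bit k when neighbor k is >= the center pixel."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbpv_histogram(img):
    """256-bin histogram where each pixel adds the variance of its
    8 neighbors to the bin of its LBP code (simplified LBPV)."""
    hist = [0.0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            neigh = [img[r + dr][c + dc] for dr, dc in OFFSETS]
            mean = sum(neigh) / 8
            var = sum((x - mean) ** 2 for x in neigh) / 8
            hist[lbp_code(img, r, c)] += var
    return hist


# Hypothetical 3x3 patch with one bright corner.
patch = [[9, 1, 1],
         [1, 5, 1],
         [1, 1, 1]]
hist = lbpv_histogram(patch)
```

Weighting by variance lets high-contrast regions contribute more to the histogram than flat ones, which plain LBP histograms ignore.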

Binary Hashing CNN Features for Action Recognition

  • Li, Weisheng;Feng, Chen;Xiao, Bin;Chen, Yanquan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.9 / pp.4412-4428 / 2018
  • The purpose of this work is to solve the problem of representing an entire video using Convolutional Neural Network (CNN) features for human action recognition. Recently, due to insufficient GPU memory, it has been difficult to take the whole video as the input of the CNN for end-to-end learning. A typical method is to use sampled video frames as inputs and corresponding labels as supervision. One major issue of this popular approach is that the local samples may not contain the information indicated by the global labels and sufficient motion information. To address this issue, we propose a binary hashing method to enhance the local feature extractors. First, we extract the local features and aggregate them into global features using maximum/minimum pooling. Second, we use the binary hashing method to capture the motion features. Finally, we concatenate the hashing features with global features using different normalization methods to train the classifier. Experimental results on the JHMDB and MPII-Cooking datasets show that, for these new local features, binary hashing mapping on the sparsely sampled features led to significant performance improvements.
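
The aggregation step in this abstract (pooling local per-frame features into a global feature with maximum/minimum pooling) can be sketched as below. The abstract does not describe the hashing function, so `sign_hash` is only a hypothetical stand-in that binarizes the direction of change between consecutive frames; it should not be read as the authors' method. All names and sample values are assumptions.

```python
def pool_max_min(frames):
    """Aggregate per-frame feature vectors into one global vector by
    concatenating the element-wise maximum and element-wise minimum."""
    columns = list(zip(*frames))
    return [max(col) for col in columns] + [min(col) for col in columns]


def sign_hash(frames):
    """Hypothetical binary motion code: 1 where a feature increases from
    one sampled frame to the next, 0 otherwise."""
    bits = []
    for prev, cur in zip(frames, frames[1:]):
        bits.extend(1 if b > a else 0 for a, b in zip(prev, cur))
    return bits


# Hypothetical 2-dimensional features from three sampled frames.
frames = [[1, 4], [3, 2], [2, 5]]
global_feature = pool_max_min(frames)
motion_bits = sign_hash(frames)
video_descriptor = global_feature + motion_bits
```

The abstract also mentions normalizing the two parts differently before concatenation; that step is omitted here because its details are not given.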

Face Recognition Based on the Combination of Enhanced Local Texture Feature and DBN under Complex Illumination Conditions

  • Li, Chen;Zhao, Shuai;Xiao, Ke;Wang, Yanjie
    • Journal of Information Processing Systems / v.14 no.1 / pp.191-204 / 2018
  • To combat the adverse impact of illumination variation on face recognition, an effective and feasible algorithm is proposed in this paper. First, an enhanced local texture feature is obtained by applying the central symmetric encoding principle to the fused component images acquired from wavelet decomposition. The proposed local texture features are then combined with a Deep Belief Network (DBN) to obtain robust deep features of face images under severe illumination conditions. Extensive experiments with different test schemes are conducted on both the CMU-PIE and Extended Yale-B databases, which contain face images under various illumination conditions. Compared with the DBN alone, LBP combined with DBN, and CSLBP combined with DBN, the proposed method achieves the best recognition rate regardless of the database used, the test scheme adopted, or the illumination condition encountered, especially for face recognition under severe illumination variation.
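
The central symmetric encoding mentioned in this abstract is commonly realized as the center-symmetric LBP (CS-LBP), which compares opposite neighbors rather than each neighbor with the center. The sketch below shows that standard form on a 3x3 neighborhood; the wavelet fusion and DBN stages are not covered, and the function name, threshold parameter, and sample patch are assumptions.

```python
# Clockwise 8-neighborhood offsets; entry i and entry i + 4 are opposite pixels.
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def cs_lbp_code(img, r, c, threshold=0):
    """Center-symmetric LBP: one bit per opposite-neighbor pair, set when
    the first pixel exceeds its opposite by more than `threshold`.
    Produces a 4-bit code (0..15) instead of the 8-bit code of basic LBP."""
    ring = [img[r + dr][c + dc] for dr, dc in RING]
    code = 0
    for bit in range(4):
        if ring[bit] - ring[bit + 4] > threshold:
            code |= 1 << bit
    return code


# Hypothetical patch: brighter along the top and right edges.
patch = [[5, 6, 7],
         [4, 0, 8],
         [3, 2, 1]]
print(cs_lbp_code(patch, 1, 1))
```

The shorter code gives a 16-bin histogram per region, and the center pixel is not used at all, which makes the descriptor less sensitive to noise at the center.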