• Title/Summary/Keyword: Filter-Bank


Performance Improvements for Silence Feature Normalization Method by Using Filter Bank Energy Subtraction (필터 뱅크 에너지 차감을 이용한 묵음 특징 정규화 방법의 성능 향상)

  • Shen, Guanghu;Choi, Sook-Nam;Chung, Hyun-Yeol
    • The Journal of Korean Institute of Communications and Information Sciences / v.35 no.7C / pp.604-610 / 2010
  • This paper proposes FSFN (Filter bank sub-band energy subtraction based CLSFN), a method that improves the recognition performance of the existing CLSFN (Cepstral distance and Log-energy based Silence Feature Normalization). FSFN reduces the energy of noise components in the filter bank sub-band domain during feature extraction. This yields enhanced cepstral features, which in turn improve the accuracy of speech/silence classification. FSFN is therefore expected to outperform the existing CLSFN. Experiments on the Aurora 2.0 DB showed that FSFN improves the average word accuracy by 2% over the conventional CLSFN method, and that FSFN combined with CMVN (Cepstral Mean and Variance Normalization) achieved the best recognition performance among the methods compared.
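
The sub-band energy subtraction step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the subtraction rule, so the per-band noise estimate, the `floor_ratio` parameter, and the function names are all assumptions.

```python
import math

def subtract_noise_energy(frame_energies, noise_energies, floor_ratio=0.01):
    """Subtract a per-band noise energy estimate from one frame.

    floor_ratio is an assumed safeguard: each band keeps at least this
    fraction of its original energy so the later logarithm stays defined.
    """
    cleaned = []
    for energy, noise in zip(frame_energies, noise_energies):
        cleaned.append(max(energy - noise, floor_ratio * energy))
    return cleaned

def log_energies(energies):
    """Log filter bank energies, the usual input to the cepstral transform."""
    return [math.log(e) for e in energies]

# Hypothetical 3-band example; the noise estimate could come from leading
# silence frames (an assumption, not stated in the abstract).
noise = [2.0, 1.0, 0.5]
frame = [10.0, 1.2, 0.4]
cleaned = subtract_noise_energy(frame, noise)
log_cleaned = log_energies(cleaned)
```

The floor keeps a band from going negative when the noise estimate exceeds the observed energy, as in the third band of the example.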

A Image Search Algorithm using Coefficients of The Cosine Transform (여현변환 계수를 이용한 이미지 탐색 알고리즘)

  • Lee, Seok-Han
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.12 no.1 / pp.13-21 / 2019
  • Content-based image retrieval uses features of the information within an image, such as color, texture, and shape, to retrieve data. We present a novel approach for improving retrieval accuracy based on a DCT Filter-Bank. First, we perform the DCT on a given image and generate a Filter-Bank from the DCT coefficients of each color channel. In this step, the DC coefficient and a limited number of AC coefficients are used. Next, a feature vector is obtained from the histogram of the quantized DC coefficients. Then, the AC coefficients in the Filter-Bank are separated into three main groups indicating horizontal, vertical, and diagonal edge directions, according to their spatial-frequency properties. Each directional group creates its histogram after applying the Otsu binarization technique. Finally, we project each histogram onto the horizontal and vertical axes and generate a feature vector for each group. The computed DC and AC feature vector bins are concatenated and used in the similarity checking procedure. We experimented using 1,000 databases, and this approach outperformed the old retrieval method that used color information.
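
The first steps above (block DCT, then splitting AC coefficients by direction) can be illustrated with a small sketch. It assumes a square grayscale block and a simple grouping rule by coefficient position; the group names, block size, and grouping rule are illustrative assumptions, and the quantization, Otsu, and histogram stages of the paper are not reproduced.

```python
import math

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block (direct O(n^4) form)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out

def group_ac_energy(coeffs):
    """Sum squared AC coefficients into three position-based groups."""
    groups = {"varies_along_columns": 0.0, "varies_along_rows": 0.0, "both": 0.0}
    n = len(coeffs)
    for u in range(n):
        for v in range(n):
            if u == 0 and v == 0:
                continue  # DC coefficient is handled separately
            if u == 0:
                key = "varies_along_columns"
            elif v == 0:
                key = "varies_along_rows"
            else:
                key = "both"
            groups[key] += coeffs[u][v] ** 2
    return groups

# A block with a single vertical edge: intensity changes only along columns.
block = [[0.0, 0.0, 255.0, 255.0] for _ in range(4)]
coeffs = dct2(block)
dc = coeffs[0][0]
groups = group_ac_energy(coeffs)
```

For this block all AC energy falls into the `varies_along_columns` group, which is the behavior a directional grouping relies on.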

Development of Automatic Crack Detection using the Gabor Filter for Concrete Structures of Railway Tracks (가버 필터를 사용한 철도 콘크리트 궤도 도상의 자동 균열 감지 개발)

  • Na, Yong-Hyoun;Park, Mi-Yun;Park, Ji-Soo;Park, Sung-Baek;Kwon, Se-Gon
    • Journal of the Society of Disaster Information / v.14 no.4 / pp.458-465 / 2018
  • Purpose: Cracks in concrete track, which affect railway safety, can be detected using image processing techniques. However, because the condition of the concrete track and surface noise obstruct crack detection, a way to remove them effectively is needed. Method: In this study, we propose an image processing method for detecting cracks effectively on Korean railways and verify its performance through experiments. We developed an image acquisition system for capturing railway concrete track, acquired railway concrete track images, randomly selected 2,000 images, and detected cracks in them using the proposed Gabor Filter Bank method. Results: The detection rate was 94% against the actual cracks in railway concrete track images of the same quality and format. Conclusion: The crack detection method using the Gabor Filter Bank was confirmed to be effective for crack images containing noise from Korean railway concrete track. This system is expected to become an automated maintenance system for the existing human-centered railway industry.
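
As a rough illustration of what a Gabor filter bank is, the sketch below builds real-valued Gabor kernels at several orientations using the standard Gabor formula. The kernel size, wavelength, sigma, gamma, and the number of orientations are assumed values; the abstract does not state the parameters used in the study.

```python
import math

def gabor_kernel(size, theta, wavelength=4.0, sigma=2.0, gamma=0.5, psi=0.0):
    """Real Gabor kernel: Gaussian envelope times a cosine carrier.

    size is assumed odd so the kernel has a single center sample.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier runs along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def gabor_bank(size=9, orientations=4):
    """One kernel per orientation, evenly spaced over 180 degrees."""
    return [gabor_kernel(size, k * math.pi / orientations)
            for k in range(orientations)]

bank = gabor_bank()
```

Convolving an image with each kernel and keeping the strongest response per pixel is one common way to highlight thin oriented structures such as cracks, though whether the study combines responses this way is not stated.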

The Design of Optimal Filters in Vector-Quantized Subband Codecs (벡터양자화된 부대역 코덱에서 최적필터의 구현)

  • 지인호
    • The Journal of the Acoustical Society of Korea / v.19 no.1 / pp.97-102 / 2000
  • Subband coding divides the signal frequency band into a set of uncorrelated frequency bands by filtering and then encodes each of these subbands using a bit allocation rationale matched to the signal energy in that subband. The subband signals can be coded using waveform encoding techniques such as PCM, DPCM, and the vector quantizer (VQ) to obtain higher data compression. Most researchers have focused on the error in the quantizer, but not on the overall reconstruction error and its dependence on the filter bank. This paper provides a thorough analysis of subband codecs and further develops optimum filter bank design using the vector quantizer. We compute the mean squared reconstruction error (MSE), which depends on N, the number of entries in each code book; k, the length of each code word; and the filter bank coefficients. We form this MSE measure in terms of the equivalent quantization model and find the optimum FIR filter coefficients for each channel in the M-band structure for a given bit rate, filter length, and input signal correlation model. Specific design examples are worked out for a 4-tap filter in a 2-band paraunitary filter bank structure. These optimum paraunitary filter coefficients are obtained by Monte Carlo simulation. We expect the results of this work to contribute to the study of optimum design of subband codecs using the vector quantizer.
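
To make the "4-tap filter in a 2-band paraunitary filter bank" concrete, the sketch below uses the well-known Daubechies 4-tap filter as a stand-in. It is not the optimum filter found in the paper (those coefficients are not given in the abstract); it only shows the paraunitary conditions and the perfect reconstruction they give when no quantizer is present. Periodic signal extension is an assumption made to keep the example short.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# Daubechies 4-tap low-pass filter (a known paraunitary example).
H = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM, (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
# High-pass filter by alternating flip: g[n] = (-1)^n h[3 - n].
G = [(-1) ** n * H[3 - n] for n in range(4)]

def analyze(x):
    """Filter and downsample by 2 (periodic extension, even length assumed)."""
    n = len(x)
    low = [sum(H[t] * x[(2 * k + t) % n] for t in range(4)) for k in range(n // 2)]
    high = [sum(G[t] * x[(2 * k + t) % n] for t in range(4)) for k in range(n // 2)]
    return low, high

def synthesize(low, high):
    """Inverse of analyze: the transform is orthogonal, so use its transpose."""
    n = 2 * len(low)
    x = [0.0] * n
    for k in range(n // 2):
        for t in range(4):
            x[(2 * k + t) % n] += H[t] * low[k] + G[t] * high[k]
    return x

signal = [1.0, 3.0, -2.0, 0.5, 4.0, 4.0, -1.0, 2.0]
low, high = analyze(signal)
rebuilt = synthesize(low, high)

unit_energy = sum(h * h for h in H)          # paraunitary: equals 1
shift_orthogonality = H[0] * H[2] + H[1] * H[3]  # paraunitary: equals 0
```

Inserting a quantizer between `analyze` and `synthesize` is where the reconstruction error studied in the paper would enter.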


Feature Parameter Extraction and Speech Recognition Using Matrix Factorization (Matrix Factorization을 이용한 음성 특징 파라미터 추출 및 인식)

  • Lee Kwang-Seok;Hur Kang-In
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.7 / pp.1307-1311 / 2006
  • In this paper, we propose a new speech feature parameter that uses matrix factorization to obtain part-based features of the speech spectrum. The proposed parameter represents multi-dimensional feature data as effectively dimension-reduced data through a matrix factorization procedure under the constraint that all matrix elements are non-negative. The reduced feature data presents part-based features of the input data. We verify the usefulness of the NMF (Non-Negative Matrix Factorization) algorithm for speech feature extraction by applying NMF to the Mel-scaled filter bank output to obtain the feature parameter. Recognition experiments confirm that the proposed feature parameter is superior in recognition performance to the generally used MFCC (Mel-Frequency Cepstral Coefficient).
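
A minimal sketch of NMF on a small non-negative matrix follows, using the standard multiplicative update rules for the squared-error objective. The abstract does not say which NMF variant, rank, or iteration count the authors used, so those choices, the toy matrix standing in for filter bank outputs, and all function names are assumptions.

```python
import random

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def squared_error(v, w, h):
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iterations=200, eps=1e-9, seed=0):
    """Factor v (bands x frames) into non-negative w (bands x rank), h (rank x frames)."""
    rng = random.Random(seed)
    w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in v]
    h = [[rng.uniform(0.1, 1.0) for _ in v[0]] for _ in range(rank)]
    initial_error = squared_error(v, w, h)
    for _ in range(iterations):
        # h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(len(h[0]))] for i in range(rank)]
        # w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(rank)] for i in range(len(w))]
    return w, h, initial_error

# Toy stand-in for filter bank output: 4 bands x 3 frames, built from 2 parts.
v = [[1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [0.0, 1.0, 3.0], [0.0, 2.0, 6.0]]
w, h, initial_error = nmf(v, rank=2)
final_error = squared_error(v, w, h)
```

In this layout the columns of `h` are the reduced per-frame features and the columns of `w` are the non-negative spectral parts.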

A Study on the Real Time Recognition of Korean Isolated Words with Filter Bank Output (필터뱅크 출력을 이용한 실시간 격리 단어 인식에 관한 연구)

  • Kim, Kye-Kook;Lee, Jong-Arc;Kahng, Seong-Jin
    • The Journal of the Acoustical Society of Korea / v.10 no.3 / pp.5-12 / 1991
  • In this paper, 10 Korean city names were recognized. Each name was articulated 5 times by each of 10 male speakers. Filter bank outputs were extracted from the total of 500 words and used as feature parameters. The filter bank consisted of 15 channels with 1/3 octave spacing from 200 Hz, built with RC active circuits. Reference templates were created by a clustering algorithm. The DTW algorithm was used to measure the similarity between the reference templates and the input words. The Euclidean and Chebyshev distance equations were used to compare the recognition results by distance calculation method; the error rates were 16.4% and 15.0%, respectively.
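
The channel layout and the template matching described above can be sketched as follows. The sketch assumes the first channel is centered exactly at 200 Hz and uses the basic DTW recursion without slope constraints; the paper's exact filter centers and DTW variant are not given in the abstract, and the tiny 2-dimensional feature sequences are made up for illustration.

```python
import math

def third_octave_centers(f_start=200.0, channels=15):
    """Center frequencies spaced 1/3 octave apart (ratio 2^(1/3) per channel)."""
    return [f_start * 2 ** (k / 3) for k in range(channels)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def dtw(seq_a, seq_b, dist):
    """Accumulated distance along the best time alignment of two sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

centers = third_octave_centers()

# Hypothetical feature sequences (frames of filter bank outputs).
template = [[0.0, 0.0], [1.0, 2.0], [3.0, 1.0]]
slow_utterance = [[0.0, 0.0], [0.0, 0.0], [1.0, 2.0], [3.0, 1.0], [3.0, 1.0]]
other_word = [[2.0, 2.0], [2.0, 0.0], [0.0, 3.0]]

same_euclid = dtw(template, slow_utterance, euclidean)
diff_euclid = dtw(template, other_word, euclidean)
diff_cheby = dtw(template, other_word, chebyshev)
```

A time-stretched repetition of the template aligns at zero cost, while a different sequence does not, which is the property DTW-based isolated word recognition depends on.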


Audio Event Classification Using Deep Neural Networks (깊은 신경망을 이용한 오디오 이벤트 분류)

  • Lim, Minkyu;Lee, Donghyun;Kim, Kwang-Ho;Kim, Ji-Hwan
    • Phonetics and Speech Sciences / v.7 no.4 / pp.27-33 / 2015
  • This paper proposes an audio event classification method using Deep Neural Networks (DNN). The proposed method applies a Feed Forward Neural Network (FFNN) to generate event probabilities of ten audio events (dog barks, engine idling, and so on) for each frame. For each frame, the mel scale filter bank features of its consecutive frames are used as the input vector of the FFNN. These event probabilities are accumulated per event, and the classification result is the event with the highest accumulated probability. For the same dataset, the best accuracy reported in previous studies was about 70%, obtained with a Support Vector Machine (SVM). The proposed method achieves a best accuracy of 79.23% on the UrbanSound8K dataset when 80 mel scale filter bank features from each of 7 consecutive frames (560 in total) are used as the input vector for an FFNN with two hidden layers and 2,000 neurons per hidden layer. In this configuration, the rectified linear unit is suggested as the activation function.
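
The input construction (80 features from each of 7 frames, 560 values in total) and the clip-level decision can be sketched as below. The network itself is omitted. Centering the 7-frame window on the current frame, clamping at clip edges, and accumulating by plain summation are assumptions; the abstract only says "consecutive frames" and "accumulated".

```python
def stack_context(frames, index, context=7):
    """Concatenate the features of `context` frames around frames[index]."""
    half = context // 2
    stacked = []
    for offset in range(-half, half + 1):
        # Clamp indices at the clip boundaries (assumed edge handling).
        j = min(max(index + offset, 0), len(frames) - 1)
        stacked.extend(frames[j])
    return stacked

def classify_clip(frame_probabilities):
    """Sum per-frame event probabilities and return the winning event index."""
    totals = [sum(column) for column in zip(*frame_probabilities)]
    return max(range(len(totals)), key=totals.__getitem__)

# Hypothetical clip: 10 frames of 80 filter bank features each.
frames = [[float(t)] * 80 for t in range(10)]
network_input = stack_context(frames, index=5)

# Hypothetical per-frame outputs for a 2-event problem.
probabilities = [[0.6, 0.4], [0.1, 0.9], [0.4, 0.6]]
decision = classify_clip(probabilities)
```

Accumulating over frames lets a few confident frames outweigh a frame-level majority, as in the example where the first frame favors event 0 but the clip is assigned to event 1.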

A Simplified Zero-Forcing Receiver for Multi-User Uplink Systems Based on CB-OSFB Modulation

  • Bian, Xin;Tian, Jinfeng;Wang, Hong;Li, Mingqi;Song, Rongfang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.5 / pp.2275-2293 / 2020
  • This paper focuses on simplified receiver design for multi-user circular block oversampled filter bank (CB-OSFB) uplink systems. Through application of the discrete Fourier transform (DFT), the special banded structure and circular properties of each user's modulation matrix in the frequency domain are derived. By exploiting the newly derived properties, a simplified zero-forcing (ZF) receiver is proposed for multi-user CB-OSFB uplink systems in multipath channels. In the proposed receiver, the inversion of the large-dimension multi-user equivalent channel matrix is transformed into DFTs and smaller matrix inversions. Simulations show that the proposed ZF receiver dramatically reduces the computational complexity while achieving almost the same symbol error rate as the traditional ZF receiver.
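
The general idea of trading a matrix inversion for DFTs can be shown on a much simpler case than the paper's: a single-user, noise-free system whose channel matrix is plainly circulant. A circulant matrix is diagonalized by the DFT, so zero-forcing reduces to a per-bin division. The CB-OSFB modulation matrix has a more involved banded structure that this sketch does not model; the channel taps and symbols below are made up.

```python
import cmath

def dft(x, inverse=False):
    """Direct O(n^2) DFT; the inverse includes the 1/n scaling."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t in range(n))
        out.append(total / n if inverse else total)
    return out

def circulant_multiply(first_column, x):
    """y = C x, where C is the circulant matrix with the given first column."""
    n = len(x)
    return [sum(first_column[(i - j) % n] * x[j] for j in range(n))
            for i in range(n)]

def zf_equalize(first_column, y):
    """Solve C x = y with three DFTs instead of an explicit matrix inverse."""
    c_freq = dft(first_column)
    y_freq = dft(y)
    return dft([yk / ck for yk, ck in zip(y_freq, c_freq)], inverse=True)

channel = [1.0, 0.5, 0.2, 0.0]      # hypothetical circular channel taps
symbols = [1.0, -1.0, 1.0, 1.0]     # hypothetical transmitted symbols
received = circulant_multiply(channel, symbols)
estimate = zf_equalize(channel, received)
```

The division is only valid when no DFT bin of the channel is zero, which holds for the taps chosen here.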

The Recognition of Korean Single vowels by Use of the Diffusion Filter Bank as a Pre-processor (확산필터뱅크를 전처리기로 사용한 한국어 단모음인식)

  • Huh, Man-Tak;Kim, Jae-Chang
    • The Journal of the Acoustical Society of Korea / v.16 no.1 / pp.81-87 / 1997
  • This paper presents a new pre-processing method for recognizing single vowels by use of the spectrum envelope. We use a new method of extracting the spectrum envelope with the diffusion filter bank. By dividing the analysis band of the diffusion filter bank into subbands, we decreased the number of diffusion processes, and by increasing the number of differences, we obtained higher selectivity. As a result, we reduced the total processing time and improved discrimination. An average recognition rate of 88.3% for single vowels of natural voice, obtained through computer simulation, confirms that the method is useful for speech recognition that uses spectrum analysis of voice signals with many frequency components.


A New Method of Fingerprint Image Processing Based on a Directional Filter Bank (방향성필터뱅크 기반의 새로운 지문영상의 처리 방법)

  • Oh, Sang-Keun;Lee, Joon-Jae;Park, Kil-Houm
    • The Journal of Korean Institute of Communications and Information Sciences / v.27 no.8A / pp.796-804 / 2002
  • This paper presents a new algorithm for fingerprint image analysis and processing using a directional filter bank (DFB). The directional components of ridges are very important in the pre-processing steps of fingerprint image processing, such as image enhancement by directional filtering followed by estimating the directional image of ridge patterns. The DFB analyzes the input image into directional subband images and synthesizes them into a perfectly reconstructed image. The proposed algorithm decomposes the fingerprint image into directional subband images and performs directional map generation, foreground segmentation, singular point extraction, and image enhancement based on a local directional energy estimate.