• Title/Summary/Keyword: Feature normalization

Spectral Normalization for Speaker-Invariant Feature Extraction (화자 불변 특징추출을 위한 스펙트럼 정규화)

  • 오광철
    • Proceedings of the Acoustical Society of Korea Conference / 1993.06a / pp.238-241 / 1993
  • We present a new method to normalize spectral variations of different speakers based on physiological studies of hearing. The proposed method uses the cochlear frequency map to warp the input speech spectra by interpolation or decimation. Using this normalization method, we can obtain much improved recognition results for speaker independent speech recognition.
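The warping step described in this abstract can be sketched as follows. The abstract does not say which cochlear frequency map was used, so this sketch assumes the Greenwood function with commonly quoted human constants; the function names (`place_to_hz`, `hz_to_place`, `warp_to_cochlear_axis`) are hypothetical. It shows only the resampling of a linear-frequency spectrum onto a cochlear axis by interpolation (or decimation, when fewer output points than input bins are requested), not the full recognition system.

```python
import bisect
import math

# Assumed Greenwood constants for the human cochlea (not taken from the paper).
GREENWOOD_A, GREENWOOD_ALPHA, GREENWOOD_K = 165.4, 2.1, 0.88

def place_to_hz(x):
    """Greenwood map: relative cochlear position x -> frequency in Hz."""
    return GREENWOOD_A * (10 ** (GREENWOOD_ALPHA * x) - GREENWOOD_K)

def hz_to_place(f):
    """Inverse Greenwood map: frequency in Hz -> relative cochlear position."""
    return math.log10(f / GREENWOOD_A + GREENWOOD_K) / GREENWOOD_ALPHA

def interp(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x; xs must be increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_right(xs, x)
    t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + t * (ys[i] - ys[i - 1])

def warp_to_cochlear_axis(freqs_hz, spectrum, n_out):
    """Resample a spectrum at n_out points equally spaced in cochlear position."""
    lo, hi = hz_to_place(freqs_hz[0]), hz_to_place(freqs_hz[-1])
    places = [lo + (hi - lo) * k / (n_out - 1) for k in range(n_out)]
    return [interp(place_to_hz(p), freqs_hz, spectrum) for p in places]
```

Calling `warp_to_cochlear_axis(freqs, spectrum, 16)` on a 40-bin spectrum decimates it to 16 points that are denser at low frequencies, which is the general effect of a cochlear-style axis.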

An Amplitude Warping Approach to Intra-Speaker Normalization for Speech Recognition (음성인식에서 화자 내 정규화를 위한 진폭 변경 방법)

  • Kim Dong-Hyun;Hong Kwang-Seok
    • Journal of Internet Computing and Services / v.4 no.3 / pp.9-14 / 2003
  • Vocal tract normalization is a successful method for improving the accuracy of inter-speaker normalization. In this paper, we present an intra-speaker warping factor estimation based on pitch-altered utterances. The feature space distributions of untransformed speech from a single speaker's pitch-altered utterances vary because of acoustic differences in the speech produced by the glottis and vocal tract. Utterance variation is of two types: frequency variation and amplitude variation. Among inter-speaker normalization methods, vocal tract normalization is a frequency normalization. We therefore have to consider amplitude variation as well, and it may be possible to determine the amplitude warping factor by calculating the inverse ratio of input to reference pitch. In the recognition results, the error rate is reduced by 0.4% to 2.3% for digit and word decoding.
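As a rough illustration of the "inverse ratio of input to reference pitch", the sketch below computes the factor as reference pitch divided by input pitch. The function names are hypothetical, and applying the factor as a plain multiplicative scale on amplitude values is an assumption; the abstract does not state how the factor is applied.

```python
def amplitude_warping_factor(input_pitch_hz, reference_pitch_hz):
    """Inverse of the input-to-reference pitch ratio, i.e. reference / input."""
    if input_pitch_hz <= 0 or reference_pitch_hz <= 0:
        raise ValueError("pitch values must be positive")
    return reference_pitch_hz / input_pitch_hz

def warp_amplitude(values, factor):
    """Scale each amplitude value by the factor (an assumed way to apply it)."""
    return [factor * v for v in values]
```

For example, an utterance spoken at twice the reference pitch (200 Hz against 100 Hz) would get a factor of 0.5 under this reading.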

Speaker Change Detection by Normalization of Phonetic Characteristics (음소 특성 정규화를 통한 화자 변화 검출)

  • Kim Hyung Soon;Park Hae Young;Park Sun Young
    • MALSORI / no.47 / pp.97-107 / 2003
  • Speaker change detection automatically detects the point in time at which the speaker changes. Because the feature parameters used for speaker change detection depend not only on speaker characteristics but also on phonetic characteristics, the spoken content captured in the feature parameters inevitably degrades detection performance. To alleviate this problem, this paper proposes a method that normalizes phonetic variations in speech feature parameters in order to emphasize changes due to speaker characteristics. Experimental results show that the proposed method improves the performance of speaker change detection.

A Study on Appearance-Based Facial Expression Recognition Using Active Shape Model (Active Shape Model을 이용한 외형기반 얼굴표정인식에 관한 연구)

  • Kim, Dong-Ju;Shin, Jeong-Hoon
    • KIPS Transactions on Software and Data Engineering / v.5 no.1 / pp.43-50 / 2016
  • This paper introduces an appearance-based facial expression recognition method that uses ASM landmarks to acquire a detailed face region. In particular, an EHMM-based algorithm and an SVM classifier with histogram features are employed for appearance-based facial expression recognition, and the proposed method was evaluated on the CK and JAFFE facial expression databases. In addition, its performance was compared with that of a distance-based face normalization method and of a geometric feature-based facial expression approach that employs geometric features of ASM landmarks and the SVM algorithm. As a result, the proposed method using ASM-based face normalization showed performance improvements of 6.39% and 7.98% over the previous distance-based face normalization method on the CK and JAFFE databases, respectively. The proposed method also outperformed the geometric feature-based facial expression approach, confirming its effectiveness.

Cepstral Normalization Combined with CSFN for Noisy Speech Recognition (켑스트럼 정규화와 켑스트럼 거리기반 묵음특징정규화 방법을 이용한 잡음음성 인식)

  • Choi, Sook-Nam;Shen, Guang-Hu;Chung, Hyun-Yeol
    • Journal of Korea Multimedia Society / v.14 no.10 / pp.1221-1228 / 2011
  • Speech recognition systems work well in typical indoor environments. However, recognition performance decreases dramatically when the system is used in real environments because of various kinds of noise. In this paper, we propose CSFN-CMVN to improve the recognition performance of the existing CSFN (cepstral distance based SFN). CSFN-CMVN combines cepstral normalization with CSFN, which normalizes silence features using the cepstral Euclidean distance to classify speech and silence. In tests on the Aurora 2.0 DB, the proposed CSFN-CMVN improves average word accuracy by about 7% across all test sets compared with the typical silence feature normalization SFN-I. It also improves accuracy by 6% and 5% compared with the conventional SFN-II and CSFN, respectively, showing the effectiveness of the proposed method.
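The cepstral normalization half of CSFN-CMVN can be illustrated with a minimal sketch of per-utterance cepstral mean and variance normalization (CMVN). This is the textbook form of CMVN, not the authors' implementation, and it leaves out the CSFN speech/silence step entirely; the function name `cmvn` and the `eps` guard are assumptions made for the example.

```python
import math

def cmvn(frames, eps=1e-10):
    """Normalize each cepstral coefficient to zero mean and unit variance.

    frames: list of equal-length feature vectors, one per frame of an utterance.
    eps: small constant (assumed) that avoids division by zero.
    """
    n = len(frames)
    dim = len(frames[0])
    # Per-coefficient mean over all frames of the utterance.
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Per-coefficient (population) standard deviation.
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dim)]
        for f in frames
    ]
```

Normalizing per utterance removes stationary channel offsets (the mean) and rescales the dynamic range (the variance), which is why CMVN is commonly paired with other noise-robust front-end steps.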

Voice Activity Detection in Noisy Environment using Speech Energy Maximization and Silence Feature Normalization (음성 에너지 최대화와 묵음 특징 정규화를 이용한 잡음 환경에 강인한 음성 검출)

  • Ahn, Chan-Shik;Choi, Ki-Ho
    • Journal of Digital Convergence / v.11 no.6 / pp.169-174 / 2013
  • In speech recognition, performance degrades because of the mismatch between the model training and recognition environments. Silence feature normalization is used as a way to reduce this environmental mismatch. With existing silence feature normalization methods, the energy level of the silence interval rises at low signal-to-noise ratios, so the accuracy of voice/non-voice classification falls and recognition performance is degraded. This paper proposes a robust voice detection method for noisy environments that uses silence feature normalization and speech energy maximization. At high signal-to-noise ratios, the proposed method maximizes the speech energy so that the features are less affected by noise. At low signal-to-noise ratios, it uses the cepstral feature distribution of voice/non-voice characteristics and improves recognition performance. In recognition experiments, recognition performance improved compared to the conventional method.

A MA-plot-based Feature Selection by MRMR in SVM-RFE in RNA-Sequencing Data

  • Kim, Chayoung
    • The Journal of Korean Institute of Information Technology / v.16 no.12 / pp.25-30 / 2018
  • A method for constructing the Gene Regulatory Network (GRN) from RNA-Sequencing (RNA-Seq) data is lacking and urgently needed, because GRN in Big-Data has attracted substantial attention as a way of describing the interactions among relevant featured genes and their regulation. We propose a new computational method for selecting comparative feature patterns that applies a minimum-redundancy maximum-relevancy (MRMR) filter within support vector machine-recursive feature elimination (SVM-RFE), with intensity-dependent normalization (DEGSEQ) as a preprocessor to emphasize equal preciseness in RNA-Seq Big-Data. We found that the proposed algorithm may be more scalable and convenient because all of its libraries are available as R packages, and that it may improve both time consumption in Big-Data and the minimum-redundancy maximum-relevancy of the selected set of feature patterns.
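The MA-plot named in the title rests on two per-gene quantities: M, the log2 ratio of expression between two samples, and A, the mean log2 intensity. The sketch below computes them in the standard way. The paper itself works in R, so this Python version, the function name `ma_values`, and the `pseudocount` that guards against zero counts are assumptions for illustration only.

```python
import math

def ma_values(counts_a, counts_b, pseudocount=1.0):
    """Return one (M, A) pair per gene for two samples of read counts.

    M = log2(a) - log2(b)          (log ratio)
    A = (log2(a) + log2(b)) / 2    (mean log intensity)
    pseudocount is added to every count (assumed) so that zeros are defined.
    """
    pairs = []
    for a, b in zip(counts_a, counts_b):
        log_a = math.log2(a + pseudocount)
        log_b = math.log2(b + pseudocount)
        pairs.append((log_a - log_b, 0.5 * (log_a + log_b)))
    return pairs
```

Intensity-dependent normalization then adjusts M as a function of A, so that genes at different expression levels are compared on an equal footing.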

The Bi-level Image Mapping Using Density Information in Character Patterns (문자패턴에서의 밀도정보를 이용한 이진영상 매핑)

  • 김봉석;강선미;양정윤;양윤모;김덕진
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.8 / pp.8-15 / 1993
  • This paper describes a character normalization method used in the character recognition process. Line and dot densities are computed on the input character image, and the image is then mapped to the destination image. Recognition is performed using overlap-partitioning of the character image and extraction of four directional feature primitives. The validity of the proposed nonlinear normalization algorithm was verified by an increase in the recognition rate.

Comparison of the Dynamic Time Warping Algorithm for Spoken Korean Isolated Digits Recognition (한국어 단독 숫자음 인식을 위한 DTW 알고리즘의 비교)

  • 홍진우;김순협
    • The Journal of the Acoustical Society of Korea / v.3 no.1 / pp.25-35 / 1984
  • This paper analyzes Dynamic Time Warping (DTW) algorithms for time normalization of speech patterns and discusses the Dynamic Programming (DP) algorithm for recognition of spoken Korean isolated digits. In DP matching, the feature vectors of the reference and test patterns consist of the first three formant frequencies, extracted by a power spectrum density estimation algorithm based on the ARMA model. The major differences among the various DTW algorithms are the global path constraints, the local continuity constraints on the path, and the distance weighting/normalization used to give the overall minimum distance. The performance criteria used to evaluate these DP algorithms are memory requirement, speed of implementation, and recognition accuracy.
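For orientation, here is a minimal sketch of the basic DP recurrence shared by DTW variants. It uses the simplest choices (no global path constraint, a symmetric three-way local step, no distance weighting or normalization), so it is not one of the specific algorithms compared in the paper. The function name `dtw_distance` is hypothetical, and Euclidean distance between feature vectors is an assumption.

```python
import math

def dtw_distance(ref, test):
    """Accumulated DTW distance between two sequences of feature vectors.

    ref, test: lists of equal-dimension vectors, e.g. (F1, F2, F3) formants.
    """
    n, m = len(ref), len(test)
    inf = float("inf")
    # cost[i][j] = minimum accumulated distance aligning ref[:i] with test[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(ref[i - 1], test[j - 1])  # local Euclidean distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # ref advances, test frame repeated
                cost[i][j - 1],      # test advances, ref frame repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

The full cost table needs O(n·m) memory and time, which is the baseline against which path constraints trade recognition accuracy for memory and speed.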

A Robust Method for Speech Replay Attack Detection

  • Lin, Lang;Wang, Rangding;Yan, Diqun;Dong, Li
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.1 / pp.168-182 / 2020
  • Spoofing attacks, especially replay attacks, pose great security challenges to automatic speaker verification (ASV) systems. Current work on replay attack detection focuses primarily on either developing new features or improving classifier performance, ignoring the effects of feature variability, e.g., channel variability. In this paper, we first establish a mathematical model for replay speech and introduce a method for eliminating the negative interference of the channel. We then propose a novel feature for detecting replay attacks. To further boost detection performance, four post-processing methods using normalization techniques are investigated. We evaluate the proposed method on the ASVspoof 2017 dataset. The experimental results show that our approach outperforms the competing methods in terms of detection accuracy. More interestingly, we find that the proposed normalization strategy can also improve the performance of existing algorithms.