• Title/Summary/Keyword: Speech rate

A Study on Real Time Pitch Alteration of Speech Signal (음성신호의 실시간 피치변경에 관한 연구)

  • 김종국;박형빈;배명진
    • The Journal of the Acoustical Society of Korea / v.23 no.1 / pp.82-89 / 2004
  • This paper describes how to reduce the effect of the occupation threshold by controlling the transformation of the mixture components of HMM parameters in a hierarchical tree structure, which prevents over-adaptation. To reduce correlations between data elements and to remove elements with low variance, we employ PCA (principal component analysis) and ICA (independent component analysis), which give as good a representation as possible and reduce the effect of over-adaptation. When the occupation threshold is lowered and the number of transformation functions is increased, the ordinary MLLR adaptation algorithm yields a lower recognition rate than SI models, whereas the proposed MLLR adaptation algorithm improves the word recognition rate by more than 2% over the SI models.

On the Use of a Parallel-Branch Subunit Model in Continuous HMM for Improved Word Recognition (연속분포 HMM에서 평행분기 음성단위를 사용한 단어인식율 향상연구)

  • Park, Yong-Kyuo;Un, Chong-Kwan
    • The Journal of the Acoustical Society of Korea / v.14 no.2E / pp.25-32 / 1995
  • In this paper, we propose a parallel-branch subunit model for improved word recognition. The model is obtained by splitting off each subunit branch based on the mixture components of a continuous hidden Markov model (continuous HMM). According to simulation results, the proposed model yields a higher recognition rate than the single-branch subunit model or the parallel-branch subunit model proposed by Rabiner et al. [1]. We show that a proper combination of the number of mixture components and the number of branches for each subunit increases the recognition rate. To evaluate the recognition performance of the proposed algorithms, we used speech material with a vocabulary of 1036 Korean words.

Design of Low Bit Rate VSELP Codebook for the Korean Speech (한국어 음성에 있어서 저전송률을 갖는 개선된 VSELP코드북 설계)

  • 김형종;한승조
    • Journal of the Korea Institute of Information and Communication Engineering / v.3 no.3 / pp.607-616 / 1999
  • This paper proposes an improved 4.8kbps VSELP coder that maintains good quality over a band-limited channel. In most cases, it is difficult to maintain good quality at a low bit rate. Many methods have been proposed to solve this problem, but they are not well suited to the structure of the Korean language because they were designed for the structure of foreign languages. In the experiment, we use noiseless Korean voice data. We show that the proposed 4.8kbps VSELP does not outperform the 8kbps VSELP in SEGWSNR (Segmentally Weighted SNR), but it is superior to the 8kbps VSELP in the MOS (Mean Opinion Score) test.

A Study on APC-MPC in 8kbps of Convergence System (융복합 시스템의 8kbps에 있어서 APC-MPC에 관한 연구)

  • Lee, See-Woo
    • Journal of Digital Convergence / v.13 no.7 / pp.177-182 / 2015
  • In MPC (Multi-Pulse Coding) that uses voiced and unvoiced excitation sources, the speech waveform can be distorted. This is caused by normalization of the synthesized voiced speech waveform during restoration. To solve this problem, this paper presents APC-MPC, which applies amplitude-position compensation to the multi-pulses in each pitch interval in order to reduce distortion of the synthesized waveform. I implemented APC-MPC in a coding system and evaluated its SNRseg under the 8kbps coding condition of a convergence system. As a result, the SNRseg of APC-MPC was 13.9dB for female voice and 14.3dB for male voice. I therefore expect that this method can be applied to cellular phones and smartphones that use a low-bit-rate excitation source.
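
The SNRseg figure quoted in this abstract is a frame-averaged signal-to-noise ratio. The abstract does not give the frame length or any clamping rules, so the following is only a minimal sketch of the common textbook definition; the function name `segmental_snr` and the 160-sample frame (20 ms at an assumed 8 kHz sampling rate) are illustrative assumptions, not the author's settings.

```python
import math

def segmental_snr(reference, decoded, frame_len=160):
    """Average the per-frame SNR (in dB) between a reference and a decoded signal.

    frame_len=160 is an assumed value (20 ms at 8 kHz), not taken from the paper.
    """
    snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal_energy = sum(r * r for r in ref)
        error_energy = sum((r - q) ** 2 for r, q in zip(ref, dec))
        if signal_energy == 0.0 or error_energy == 0.0:
            continue  # skip silent or perfectly reconstructed frames
        snrs.append(10.0 * math.log10(signal_energy / error_energy))
    if not snrs:
        raise ValueError("no frames with measurable SNR")
    return sum(snrs) / len(snrs)

# Toy check: a 200 Hz tone and a copy attenuated to 90% of its amplitude.
reference = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(1600)]
decoded = [0.9 * r for r in reference]
snr_db = segmental_snr(reference, decoded)
```

In this toy case the error is 10% of the signal amplitude in every frame, so each frame contributes 20 dB. Real evaluations often also clamp per-frame values to a fixed range, which this sketch omits.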

Decision Tree State Tying Modeling Using Parameter Estimation of Bayesian Method (Bayesian 기법의 모수 추정을 이용한 결정트리 상태 공유 모델링)

  • Oh, SangYeob
    • Journal of Digital Convergence / v.13 no.1 / pp.243-248 / 2015
  • When a recognition model that was not defined at configuration time is added after the models have been built, the clustered models generated for it are insufficient, which degrades the recognition rate. To address this, we propose decision tree state tying modeling using parameter estimation by the Bayesian method. The proposed parameter estimation method uses the Bayesian method to search the models obtained from the decision-tree-based state tying results and determines the recognition model according to the maximum probability method. According to our experiments on simulation data generated by adding noise to clean speech, the proposed clustering method reduced the error rate by 1.29% compared with the baseline model, which is slightly better performance than the existing approach.

Fast Implementation Algorithms for EVRC (EVRC의 고속 구현 알고리듬)

  • 정성교;최용수;김남건;윤대희
    • The Journal of the Acoustical Society of Korea / v.20 no.1 / pp.43-49 / 2001
  • EVRC (Enhanced Variable Rate Codec) has been adopted as a standard coder for the CDMA digital cellular system in North America and Korea, and is known to provide good call quality at 8kbps. In this paper, fast implementation algorithms for the EVRC encoder are proposed. The proposed algorithms are based on an efficient pitch detection scheme and a fast fixed codebook search algorithm. In the codebook search, computational complexity is reduced to 70% of that of the original EVRC by limiting the number of pulse position combinations and by using a truncated impulse response. The proposed algorithms make it possible to implement the EVRC with a much smaller computational load. In addition, informal subjective tests confirmed that the speech quality of the proposed method was indistinguishable from that of the original EVRC.

Deterministic Function Variable Step Size LMS Algorithm (결정함수 가변스텝 LMS 알고리즘)

  • Woo, Hong-Chae
    • Journal of the Institute of Convergence Signal Processing / v.12 no.2 / pp.128-132 / 2011
  • Least mean square (LMS) adaptive algorithms have played an important role in radar, sonar, speech processing, and mobile communication. In mobile communication, the convergence rate of an LMS algorithm is quite important. However, LMS algorithms suffer from slow and non-uniform convergence. To overcome these shortcomings, various variable step size LMS adaptive algorithms have been studied in recent years. Most of these recent algorithms use complex variable step methods to achieve rapid convergence, but such methods have high computational complexity, which erodes the main merits of an LMS algorithm: simplicity and robustness. The proposed deterministic variable step LMS algorithm is based on a simple deterministic function for the step update, so that the algorithm remains simple while fast convergence is still maintained.
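
The abstract does not state which deterministic function drives the step update, so the following is only a sketch of the general idea: an ordinary LMS filter whose step size depends on the iteration index alone. The exponentially decaying schedule in `deterministic_step`, its parameter values, and the system-identification test signal are all assumptions chosen for illustration, not the author's function.

```python
import math
import random

def deterministic_step(n, mu_max=0.05, mu_min=0.005, tau=200.0):
    """Hypothetical schedule: large steps early, small steps later.

    The step depends only on the iteration index n, never on the error signal.
    """
    return mu_min + (mu_max - mu_min) * math.exp(-n / tau)

def vss_lms(x, d, num_taps, step_fn=deterministic_step):
    """LMS adaptive filter with a step size given by step_fn(n)."""
    w = [0.0] * num_taps
    errors = []
    for n in range(num_taps - 1, len(x)):
        u = [x[n - k] for k in range(num_taps)]          # most recent sample first
        e = d[n] - sum(wk * uk for wk, uk in zip(w, u))  # a priori error
        mu = step_fn(n)
        w = [wk + mu * e * uk for wk, uk in zip(w, u)]   # standard LMS update
        errors.append(e)
    return w, errors

# System identification: recover an assumed 4-tap channel from noisy observations.
random.seed(0)
h_true = [0.8, -0.4, 0.2, 0.1]
x = [random.gauss(0.0, 1.0) for _ in range(5000)]
d = [
    sum(h_true[k] * x[n - k] for k in range(len(h_true)) if n - k >= 0)
    + random.gauss(0.0, 0.01)
    for n in range(len(x))
]
w_est, errors = vss_lms(x, d, num_taps=4)
```

Because the schedule needs only one exponential per sample, the per-iteration cost stays close to that of fixed-step LMS, which is the trade-off the abstract emphasizes.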

A Study on the Channel Normalized Pitch Synchronous Cepstrum for Speaker Recognition (채널에 강인한 화자 인식을 위한 채널 정규화 피치 동기 켑스트럼에 관한 연구)

  • 김유진;정재호
    • The Journal of the Acoustical Society of Korea / v.23 no.1 / pp.61-74 / 2004
  • In this paper, a contort- and speaker-dependent cepstrum extraction method and a channel normalization method that minimizes the loss of speaker characteristics in the cepstrum are proposed for a speaker recognition system that is robust to channel effects. The proposed extraction method creates a cepstrum based on pitch synchronous analysis using the inherent pitch of the speaker. The resulting cepstrum, called the "pitch synchronous cepstrum" (PSC), therefore represents the impulse response of the vocal tract more accurately in voiced speech. The PSC can also compensate for channel distortion because the pitch is more robust in a channel environment than the spectrum of speech. The proposed channel normalization method, the "formant-broadened pitch synchronous CMS" (FBPSCMS), applies the Formant-Broadened CMS to the PSC and improves the accuracy of the intraframe processing. We compared text-independent closed-set speaker identification on 56 females and 112 males using the TIMIT and NTIMIT databases, respectively. The results show that the pitch synchronous cepstrum improves the error reduction rate by up to 7.7% in comparison with the conventional short-time cepstrum, and that the error rates of the FBPSCMS are more stable and lower than those of pole-filtered CMS.
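
The FBPSCMS described in this abstract builds on cepstral mean subtraction (CMS). The abstract gives no implementation details for the formant-broadened or pitch-synchronous parts, so the sketch below shows only plain CMS, which is a simpler technique than the one proposed. The function name and the toy three-coefficient frames are illustrative assumptions.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over all frames (plain CMS).

    frames is a list of equal-length cepstral vectors, one per analysis frame.
    """
    num_frames = len(frames)
    dim = len(frames[0])
    mean = [sum(frame[i] for frame in frames) / num_frames for i in range(dim)]
    return [[frame[i] - mean[i] for i in range(dim)] for frame in frames]

# A stationary channel adds the same offset to every cepstral frame,
# so CMS maps the clean and the channel-distorted frames to the same result.
clean = [[1.0, 0.5, -0.2], [0.8, 0.1, 0.3], [1.4, -0.3, 0.1]]
channel = [0.3, -0.1, 0.05]  # assumed constant channel cepstrum
distorted = [[c + h for c, h in zip(frame, channel)] for frame in clean]

norm_clean = cepstral_mean_subtraction(clean)
norm_distorted = cepstral_mean_subtraction(distorted)
```

The same subtraction also removes the speaker's own long-term cepstral average, which is the loss of speaker characteristics that the abstract's normalization method aims to limit.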

Voice Recognition Performance Improvement using the Convergence of Bayesian method and Selective Speech Feature (베이시안 기법과 선택적 음성특징 추출을 융합한 음성 인식 성능 향상)

  • Hwang, Jae-Chun
    • Journal of the Korea Convergence Society / v.7 no.6 / pp.7-11 / 2016
  • Voice recognition systems that operate in environments with white noise do not recognize speech correctly when the voice is mixed with variable noise. Therefore, in this paper, we propose a method that combines the Bayesian technique with selective voice feature extraction for effective voice recognition. We use bank frequency response coefficients for selective voice extraction: variables observed for every possible combination of two observations are used for this purpose, and speech features are extracted selectively from a voice signal containing noise information according to the energy ratio at the output. This provides noise elimination, and the recognition rate improves when it is combined with voice recognition by the Bayesian method. The results confirmed that the recognition rate is 2.3% higher than that of the HMM and CHMM methods in vocabulary recognition.

Effects of Neonatal Hearing Screening Program (NHSP) Information on Parental Satisfaction (신생아 청각선별검사 프로그램에 관한 정보제공이 부모 만족도에 미치는 영향)

  • Ahn, Hyun-Sook;Cho, Soo-Jin
    • Phonetics and Speech Sciences / v.1 no.2 / pp.51-59 / 2009
  • This study was designed to investigate the effects of neonatal hearing screening program (NHSP) information on parental satisfaction, using the Parent Satisfaction Questionnaire with Neonatal Hearing Screening Program (PSQ-NHSP) by Mazlan et al. (2006). The PSQ-NHSP consists of four aspects: information, personnel in charge of the hearing test, appointment activity, and overall satisfaction with the neonatal hearing screening program. A total of 106 parents (50 in the experimental group and 56 in the control group) participated in this study at one general hospital and two delivery clinics. The fifty parents in the experimental group received information and counseling with educational materials before filling out the PSQ-NHSP, but the fifty-six parents in the control group did not receive any counseling or educational materials before completing the PSQ-NHSP. The PSQ-NHSP demonstrated excellent internal consistency reliability ($\sigma = 0.914$). The results of the study were as follows. First, the overall satisfaction ($3.77 \pm 0.81$) and personnel in charge of hearing test ($3.52 \pm 0.79$) aspects showed higher rates of satisfaction than the appointment activity aspect ($3.51 \pm 0.80$) for total subjects. Second, the overall parental satisfaction rate of the experimental group ($4.15 \pm 0.50$) was significantly higher than that of the control group ($3.09 \pm 0.53$) in all items. Lastly, thirty-two participants (30%) made at least one comment in response to the open-set items. A total of 29 comments were related to satisfaction with participating in the NHSP and 11 comments were related to dissatisfaction. In conclusion, to improve parental satisfaction it is important to provide parents with education and information about the NHSP before the test. In addition, the PSQ-NHSP was found to be a useful instrument for identifying the benefits and shortfalls of the NHSP.