• Title/Summary/Keyword: Speech rate

A DCT Adaptive Subband Filter Algorithm Using Wavelet Transform (웨이브렛 변환을 이용한 DCT 적응 서브 밴드 필터 알고리즘)

  • Kim, Seon-Woong;Kim, Sung-Hwan
    • The Journal of the Acoustical Society of Korea / v.15 no.1 / pp.46-53 / 1996
  • The adaptive LMS algorithm has been used in many application areas because of its low complexity. In this paper, the input signal is decomposed into subbands of arbitrary bandwidth. The dynamic range within each subband is reduced, so independent filtering in each subband converges faster than the full-band system. DCT transform-domain LMS adaptive filtering whitens the input signal in each band, which further speeds convergence by decreasing the eigenvalue spread. Finally, the filtered subband signals are synthesized so that the output signal contains the full range of frequency components. In this procedure, the wavelet filter bank guarantees perfect reconstruction of the signal without any inter-spectral interference. In simulations with speech signals corrupted by additive white Gaussian noise, the proposed algorithm performs better than the conventional NLMS algorithm at high SNR.
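
The conventional NLMS algorithm that this abstract uses as its baseline can be sketched as follows. This is a minimal full-band illustration only, not the paper's DCT/wavelet subband method; the function name `nlms`, the parameters `num_taps`, `mu`, and `eps`, and the simulated system `true_h` are all assumptions made for the example.

```python
import random

def nlms(x, d, num_taps=4, mu=0.5, eps=1e-8):
    """Normalized LMS: adapt FIR weights w so the filter output tracks d."""
    w = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        # Most recent num_taps input samples, newest first (zero-padded at start).
        window = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * xk for wk, xk in zip(w, window))   # filter output
        e = d[n] - y                                    # estimation error
        power = sum(xk * xk for xk in window) + eps     # input power normalizer
        w = [wk + mu * e * xk / power for wk, xk in zip(w, window)]
        errors.append(e)
    return w, errors

# Hypothetical system-identification setup: white input through a known FIR system.
random.seed(0)
true_h = [0.5, -0.3, 0.2, 0.1]
x = [random.gauss(0.0, 1.0) for _ in range(2000)]
d = [sum(true_h[k] * x[n - k] for k in range(4) if n - k >= 0)
     for n in range(len(x))]
w, errors = nlms(x, d)
```

Dividing the update by the input power is what distinguishes NLMS from plain LMS: the effective step size no longer depends on the signal level. Convergence still slows for strongly correlated inputs such as speech, which is the problem the eigenvalue-spread argument in the abstract addresses.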

Decision Tree for Likely phoneme model schema support (유사 음소 모델 스키마 지원을 위한 결정 트리)

  • Oh, Sang-Yeob
    • Journal of Digital Convergence / v.11 no.10 / pp.367-372 / 2013
  • In speech recognition systems, problems with phonemes during model training force the stored models to be regenerated, which costs additional time and expense. In this paper, we propose a likely phoneme model schema method that uses decision tree clustering. The proposed system applies decision tree clustering to model generation to obtain a robust and accurate sound model, which reduces the regeneration process and supports retrieval of phoneme units in the probability model. The proposed system also provides additional likely phoneme models and configures a robust, accurate sound model. In the performance evaluation, the system achieved a vocabulary-dependent recognition rate of 98.3% and a vocabulary-independent recognition rate of 98.4%.

An Enhancement of Learning Speed of the Error - Backpropagation Algorithm (오류 역전도 알고리즘의 학습속도 향상기법)

  • Shim, Bum-Sik;Jung, Eui-Yong;Yoon, Chung-Hwa;Kang, Kyung-Sik
    • The Transactions of the Korea Information Processing Society / v.4 no.7 / pp.1759-1769 / 1997
  • The Error BackPropagation (EBP) algorithm for multi-layered neural networks is widely used in areas such as associative memory, speech recognition, pattern recognition, and robotics. Nevertheless, many researchers have continued to publish improvements over the original EBP algorithm. The main reason for this research activity is that EBP is exceedingly slow when the number of neurons and the size of the training set are large. In this study, we developed new methods for accelerating learning that use a variable learning rate, a variable momentum rate, and a variable slope for the sigmoid function. During the learning process, these parameters are adjusted continuously according to the total error of the network, and these methods were shown to significantly reduce learning time compared with the original EBP. To show the efficiency of the proposed methods, we first used binary data produced by a random number generator and showed large improvements in terms of the number of epochs. We also applied our methods to the binary-valued Monk's data, 4-, 5-, 6-, and 7-bit parity checkers, and the real-valued Iris data, which are well-known benchmark training sets for machine learning.
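
The idea of adjusting the learning rate according to the total error can be sketched with a simple accept/reject heuristic. The abstract does not give the paper's actual update rules, so everything below is an assumption: a single sigmoid unit trained on the OR function, a learning rate that grows by `grow` when the total error falls and shrinks by `shrink` when it rises, and the names `train` and `total_error`. Variable momentum and variable sigmoid slope are not shown.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function (avoids overflow for large |z|).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def total_error(w, b, data):
    # Half the sum of squared errors over the whole training set.
    return 0.5 * sum((t - sigmoid(w[0] * x0 + w[1] * x1 + b)) ** 2
                     for (x0, x1), t in data)

def train(data, epochs=300, lr=0.5, grow=1.05, shrink=0.5):
    w, b = [0.0, 0.0], 0.0
    err = total_error(w, b, data)
    history = [err]
    for _ in range(epochs):
        # Batch gradient of the total error for one sigmoid unit.
        gw, gb = [0.0, 0.0], 0.0
        for (x0, x1), t in data:
            y = sigmoid(w[0] * x0 + w[1] * x1 + b)
            delta = (y - t) * y * (1.0 - y)
            gw[0] += delta * x0
            gw[1] += delta * x1
            gb += delta
        new_w = [w[0] - lr * gw[0], w[1] - lr * gw[1]]
        new_b = b - lr * gb
        new_err = total_error(new_w, new_b, data)
        if new_err < err:
            # Error fell: keep the step and raise the learning rate (assumed rule).
            w, b, err = new_w, new_b, new_err
            lr *= grow
        else:
            # Error rose: discard the step and lower the learning rate.
            lr *= shrink
        history.append(err)
    return w, b, history

or_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w, b, history = train(or_data)
```

Because steps that raise the error are discarded, `history` never increases, which makes the effect of the adaptive rate easy to inspect epoch by epoch.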

A study on the lip shape recognition algorithm using 3-D Model (3차원 모델을 이용한 입모양 인식 알고리즘에 관한 연구)

  • 남기환;배철수
    • Journal of the Korea Institute of Information and Communication Engineering / v.6 no.5 / pp.783-788 / 2002
  • Recently, research and development in communication systems has moved toward using voice data together with face images of the speaker, which provides a higher recognition rate than voice data alone. We therefore present a lipreading method for speech image sequences that uses a 3-D facial shape model. The method uses feature information from the face image, such as the degree of lip opening, the movement of the jaw, and the protrusion height of the lips. First, we fit the 3-D face model to the image sequence of the speaking face. Then, to obtain the feature information, we compute the amount of variation in the fitted 3-D shape model across the image sequence and use it as the recognition parameters. We use intensity gradient values obtained from the variation in the 3-D feature points to separate recognition units in the sequential images. Finally, we use a discrete HMM algorithm in the recognition process, based on multiple observation sequences that fully account for the variation of the 3-D feature points. In recognition experiments with 8 Korean vowels and 2 Korean consonants, we obtained a recognition rate of about 80% for the plosives and vowels.

The Communication Repair Strategy Characteristics According to Communication Breakdown of Elderly Man With Alzheimer's Dementia (알츠하이머 치매 노인의 의사소통 단절에 따른 의사소통 회복전략 특성)

  • Kim, Sun-Young;Park, Hee-June
    • Therapeutic Science for Rehabilitation / v.8 no.4 / pp.53-63 / 2019
  • Objective: Successful communication requires the use of various communication recovery strategies when communication breakdowns occur; however, communication problems increase in older people with dementia because such strategies are used inadequately. The purpose of this study was to investigate differences in recovery strategies between older adults with dementia and typical older adults in conversational discourse. Method: The subjects were 8 older adults with Alzheimer's dementia (AD) and 10 typical older adults. Conversational discourse tasks were conducted face-to-face with the subjects. Communication breakdowns and communication recovery strategies were analyzed based on 200 utterances collected in the conversational discourse. Result: First, the AD group had more communication breakdowns than the control group, but the recovery rate did not differ between the groups. Second, in the AD group, the nonspecific recovery strategy and the clarification request strategy were used as expression strategies. The recovery rate after using an expression strategy was more than 90% for the explanation strategy, combined strategy, nonspecific repair strategy, and repetition confirmation strategy. Among response strategies, the paraphrase strategy and combined strategies were used frequently, and the recovery rate after using a response strategy was 100% for the simplification strategy, repeat strategy, and gesture strategy. Conclusion: The AD group showed more breakdowns by the research subjects and by the researchers than the control group, and it showed the ability to use various expression and response strategies, although the repair rate differed among communication repair strategies. The AD group used the nonspecific repair strategy most often among expression strategies and the paraphrase strategy most often among response strategies. This differs from the characteristics of typical older adults. Therefore, these repair strategies should be utilized in the rehabilitation of older adults with AD.

Wavelet-based Pitch Detector for 2.4 kbps Harmonic-CELP Coder (2.4 kbps 하모닉-CELP 코더를 위한 웨이블렛 피치 검출기)

  • 방상운;이인성;권오주
    • The Journal of the Acoustical Society of Korea / v.22 no.8 / pp.717-726 / 2003
  • This paper presents methods for designing a wavelet-based pitch detector for a 2.4 kbps Harmonic-CELP coder and for achieving effective waveform interpolation by determining the window shape in the transition region. A waveform interpolation coder encodes one pitch-period-sized segment of speech (a prototype segment) for each frame and generates a smooth waveform by interpolating between the prototype segments for voiced frames. However, harmonic synthesis of the prototype waveforms between the previous and current frames produces not only waveform errors but also discontinuities at the frame boundary when pitch halving or doubling occurs. In addition, because the waveform interpolation coder synthesizes the excitation waveform in the transition region by overlap-add with a triangular window, Harmonic-CELP fails to model instantaneously increasing speech, and the synthesized waveform increases linearly. First, to detect the pitch period precisely, we use a hybrid first-stage pitch detector and increase the precision with a second-stage ACF pitch detector. Next, to modify the excitation window, we detect the onset and offset of the frame using GCI. As a result, pitch doubling is removed, the pitch error rate decreases by 5.4% compared with ACF and by 2.66% compared with the wavelet detector, and the MOS improves by 0.13 in the transition region.
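
The ACF baseline mentioned in this abstract, pitch detection by autocorrelation, can be sketched as follows. This shows only a plain autocorrelation estimator, not the paper's hybrid wavelet detector; the function name `acf_pitch`, the search range `f_min` to `f_max`, and the synthetic 100 Hz test tone are assumptions made for the example.

```python
import math

def acf_pitch(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate pitch (Hz) as the lag that maximizes the autocorrelation."""
    min_lag = int(sample_rate / f_max)   # shortest period considered
    max_lag = int(sample_rate / f_min)   # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Synthetic voiced frame: a 100 Hz sinusoid sampled at 8 kHz (100 ms long).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(800)]
pitch = acf_pitch(frame, sample_rate)
```

Restricting the lag search to a plausible pitch range is what keeps this simple estimator from locking onto a multiple of the true period on a clean tone. On real speech the peak at twice the period can still win, which is the pitch halving and doubling problem the abstract describes.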

Improved ErtPS Scheduling Algorithm for AMR Speech Codec with CNG Mode in IEEE 802.16e Systems (IEEE 802.16e 시스템에서의 CNG 모드 AMR 음성 코덱을 위한 개선된 ErtPS 스케줄링 알고리즘)

  • Woo, Hyun-Je;Kim, Joo-Young;Lee, Mee-Jeong
    • The KIPS Transactions:PartC / v.16C no.5 / pp.661-668 / 2009
  • The Extended real-time Polling Service (ErtPS) was proposed to support the QoS of VoIP services with silence suppression, which generate variable-size data packets in IEEE 802.16e systems. If silence is suppressed, VoIP should support Comfort Noise Generation (CNG), which generates comfort noise for the receiver's hearing to indicate the status of the connection to the user. In the silent period, CNG mode generates data at a lower bit rate and at longer packet transmission intervals than in a talk-spurt. Therefore, if ErtPS, which is designed to support service flows that generate data packets periodically, is applied to the silent period, uplink resources are used inefficiently. In this paper, we propose the Improved ErtPS algorithm for efficient resource utilization during the silent period of VoIP traffic that supports CNG. In the proposed algorithm, the user informs the base station of changes in voice status, and the base station allocates bandwidth at the appropriate interval according to that status. The Improved ErtPS uses the Channel Quality Information Channel (CQICH), an uplink subchannel that periodically delivers channel quality information to the base station in 802.16e systems. We evaluated the performance of the proposed algorithm using the OPNET simulator and confirmed that it improves uplink bandwidth utilization and packet transmission latency.

An Epidemiologic Study of Symptoms of Temporomandibular Disorders in Korean College Students (경기도 지역 대학생의 측두하악장애증상에 관한 역학적 연구)

  • Park, Hye-Sook
    • Journal of Oral Medicine and Pain / v.32 no.1 / pp.91-104 / 2007
  • An epidemiologic investigation was carried out to determine the prevalence of symptoms of temporomandibular disorders in college students aged 19-31 years. A total of 460 students were surveyed with a questionnaire from September to December 2006. The results were as follows: 1. The prevalence of symptoms of temporomandibular disorders was 80.6%. 2. The most frequently reported symptom was headache, followed by joint sound, with no distinct difference between men and women. 3. While the rate of occurrence of acute malocclusion decreased with age in men, that of TMJ pain during chewing or speech increased with age in women. 4. Symptoms including TMJ pain during mouth opening, chewing, or speech, TMJ fatigue, and acute malocclusion occurred significantly more frequently in women than in men. Contributing factors including resting the cheeks on the hands, stressful state, gum chewing, insomnia, and clenching occurred significantly more frequently in women than in men. 5. There was a highly significant relationship between symptoms and contributing factors including resting the cheeks on the hands, stressful state, unilateral chewing, insomnia, and clenching. 6. There was a highly significant relationship between symptoms and general personality.

Effect of Articulation Abilities on the Articulator Strength Training by IOPI of Spasticity Dysarthric Speech (IOPI를 활용한 조음기관 훈련 프로그램이 경직형 마비말장애의 조음 능력에 미치는 영향)

  • Lee, Jang-Shin;Lee, Ji-Yun;Kim, Sun-Hee
    • Therapeutic Science for Rehabilitation / v.9 no.1 / pp.91-99 / 2020
  • Objective: The purpose of this study was to investigate the effects of the IOPI articulator strength training program on articulator (tongue and lip) muscle strength, the articulation accuracy of /l, s, ʨ/, and the number, regularity, and accuracy of articulations in the alternate motion rates and sequential motion rates in patients with spastic dysarthria. Methods: Three patients with spastic dysarthria living in Jeju, Korea, were included in this study. A single-subject design was used to study changes in articulator (tongue and lip) muscle strength, the articulation accuracy of /ㄹ, ㅅ, ㅈ/, and the number, regularity, and accuracy of articulations in the alternate motion rates and sequential motion rates. Results: After the articulator strength training program was conducted with the patients with spastic dysarthria, there were positive changes in articulator (tongue and lip) muscle strength, the articulation accuracy of /ㄹ, ㅅ, ㅈ/, and the number, regularity, and accuracy of articulations in the alternate motion rates and sequential motion rates. Conclusion: Our findings suggest that the IOPI articulator strength training program could be very useful for children with cerebral palsy, the most representative group, if it is conducted with patients with various subtypes of dysarthria and linked with articulatory function training with the IOPI at home.

Speech Recognition Using Noise Robust Features and Spectral Subtraction (잡음에 강한 특징 벡터 및 스펙트럼 차감법을 이용한 음성 인식)

  • Shin, Won-Ho;Yang, Tae-Young;Kim, Weon-Goo;Youn, Dae-Hee;Seo, Young-Joo
    • The Journal of the Acoustical Society of Korea / v.15 no.5 / pp.38-43 / 1996
  • This paper compares the recognition performance of feature vectors known to be robust to environmental noise. In addition, the spectral subtraction technique is combined with the noise-robust features to obtain further performance enhancement. Experiments are carried out using SMC (Short-time Modified Coherence) analysis, root cepstral analysis, LDA (Linear Discriminant Analysis), PLP (Perceptual Linear Prediction), and RASTA (RelAtive SpecTrAl) processing. An isolated word recognition system is built using a semi-continuous HMM. Experiments in noisy environments are carried out with two types of noise, exhibition hall and computer room, at SNRs of 0, 10, and 20 dB. The experimental results show that SMC and the root-based mel cepstrum (root_mel cepstrum) improve recognition by 9.86% and 12.68% at 10 dB compared with the LPCC (Linear Prediction Cepstral Coefficient). When combined with spectral subtraction, the mel cepstrum and root_mel cepstrum reach recognition rates of 94.91% and 94.28% at 10 dB, enhancements of 16.7% and 8.4%, respectively.
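
The spectral subtraction step named in this abstract can be sketched on magnitude spectra as follows. This is a generic magnitude-domain form with a spectral floor, not necessarily the variant used in the paper; the function names, the `alpha` and `floor` parameters, and the toy spectra are assumptions made for the example, and the FFT analysis and resynthesis stages are omitted.

```python
def estimate_noise(noise_frames):
    """Average the magnitude spectra of frames assumed to contain only noise."""
    num_frames = len(noise_frames)
    num_bins = len(noise_frames[0])
    return [sum(frame[k] for frame in noise_frames) / num_frames
            for k in range(num_bins)]

def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Subtract the noise magnitude per bin, never dropping below a small floor."""
    return [max(m - alpha * n, floor * m) for m, n in zip(noisy_mag, noise_mag)]

# Toy magnitude spectra with 4 frequency bins (hypothetical values).
noise_frames = [[0.2, 0.4, 0.3, 0.3],
                [0.4, 0.2, 0.3, 0.3]]
noise_mag = estimate_noise(noise_frames)   # roughly 0.3 in every bin
noisy_mag = [1.0, 0.5, 2.0, 0.2]
clean_mag = spectral_subtract(noisy_mag, noise_mag)
```

The floor keeps bins where the noise estimate exceeds the observed magnitude from going negative. In the toy data the last bin is clamped to `floor * 0.2` instead of becoming -0.1.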
