Integrated Search | Korea Science

음성신호의 발성율과 PSOLA기법을 적용한 음성 보코더 전송률 개선에 관한 연구 (Improvement of Bit Rate applying the Speaking Rate and PSOLA Technique of Speech in CELP Vocoder)

장경아;서지호;배명진
- 대한전자공학회:학술대회논문집 / 대한전자공학회 2003년도 신호처리소사이어티 추계학술대회 논문집 / pp.45-48 / 2003
In general, speech coding methods are classified into three categories: waveform coding, source coding, and hybrid coding. Fast speech can be encoded with less information than slow speech. With respect to speaking rate, the low-frequency band matters more to the listener than the high-frequency band. Speech vocoding techniques are evolving toward lower bit rates, lower complexity, and higher sound quality. CELP-type vocoders deliver very good sound quality at a low bit rate, but they do not take the speaking rate into account. When the speaking rate is considered and each frame is encoded according to it, the bit rate can be reduced below that of a conventional vocoder. We propose a technique that estimates the speaking rate and applies the PSOLA technique to frames with a slow speaking rate. In simulation, the bit rate was reduced by about 300 bps.

A Study on Measuring the Speaking Rate of Speaking Signal by Using Line Spectrum Pair Coefficients

Jang, Kyung-A;Bae, Myung-Jin
- The Journal of the Acoustical Society of Korea / Vol. 20, No. 3E / pp.18-24 / 2001
Speaking rate represents how many phonemes a speech signal contains in a given time. It varies with the speaker and with the characteristics of each phoneme. Current speech recognition systems need preprocessing to remove the effect of speaking-rate variation before recognition, so if the speaking rate can be estimated in advance, recognition performance can be improved. A conventional speech vocoder, however, determines its transmission rate by analyzing fixed-length periods regardless of how quickly the phonemes change; if the speaking rate can be estimated in advance, it is also very useful information for speech coding. It can improve the sound quality of a vocoder as well as enable a variable transmission rate. In this paper, we propose a method for representing the speaking rate as a parameter in a speech vocoder. To estimate the speaking rate, the variation of phonemes is estimated using Line Spectrum Pairs (LSP). Comparing the speaking rate obtained by the proposed algorithm with that obtained manually by visual inspection, the error between the two methods is 5.38% for fast utterances and 1.78% for slow utterances, and the agreement between the two methods is 98% for slow utterances and 94% for fast utterances at 30 dB SNR and 10 dB SNR, respectively.

한국어 발화 속도의 연령별 증가에 관한 연구 －만 $3{\sim}8$ 세 아동을 대상으로－ (Increase in Speaking Rate by $3{\sim}8$-year-old Korean Children)

김태경;장경희;이필영
- 음성과학 / Vol. 13, No. 3 / pp.83-95 / 2006
This study attempts to suggest a criterion for Korean language development. For this purpose, we investigated the speaking rates of spontaneous utterances produced by 144 children aged 3 to 8. We analyzed each subject's speaking rate and its relationship to the speaker's age, gender, and utterance length. To determine the relative contributions of these variables to speaking rate, a multiple regression was conducted. The results can be summarized as follows: (1) The mean and maximum values of the speaking rate increased with age. (2) A statistically significant increase in speaking rate appeared at two-year intervals. (3) There was no significant difference in speaking rate between the male and female groups. (4) The multiple regression analysis showed that, along with the speaker's age, utterance length (the mean number of syllables per utterance) is also important in estimating speaking rate.

LSP 파라미터를 이용한 발성측정법 (On a Study of Measurement Method of Utterance Velocity for the Reduction of Transmission Rate in CELP Vocoder.)

장경아;배명진
- 대한전자공학회:학술대회논문집 / 대한전자공학회 2000년도 추계종합학술대회 논문집(4) / pp.199-202 / 2000
Speaking rate varies with the situation and the habits of the speaker, and it has been studied extensively in speaker recognition. In speech recognition, speaking rate is an important consideration when recognizing speakers, and it is measured using large speech databases and complicated estimation procedures for the sake of accuracy. A conventional vocoder encodes and transmits the speech signal without regard to speaking rate, so applying the speaking rate to a vocoder requires a simpler algorithm with less computation than the conventional speaking-rate methods used in speech recognition. We propose a speaking-rate algorithm that uses a simple parameter based on the Line Spectrum Pair (LSP). The proposed method measures the speaking rate from the LSP information in speech. We measured the rate of phoneme variation for utterances spoken at different speeds. The results show a distinct difference in the phoneme variation rate between fast and slow utterances: the rate is 42.8% higher for fast utterances than for slow ones.

Asymmetric effects of speaking rate on the vowel/consonant ratio conditioned by coda voicing in English

Ko, Eon-Suk
- 말소리와 음성과학 / Vol. 10, No. 2 / pp.45-50 / 2018
The vowel/consonant ratio is a well-known cue for the voicing of postvocalic consonants. This study investigates how this ratio changes as a function of speaking rate. Seven speakers of North American English read sentences containing target monosyllabic words that contrasted in coda voicing at three different speaking rates. Duration measures were taken for the voice onset time (VOT) of the onset consonant, the vowel, and the coda. The results show that the durations of the onset VOT and vowel are longer before voiced codas, and that the durations of all segments increase monotonically as speaking rate decreases. Importantly, the vowel/consonant ratio, a primary acoustic cue for coda voicing, was found to pattern asymmetrically for voiced and voiceless codas; it increases for voiced codas but decreases for voiceless codas with the decrease in speaking rate. This finding suggests that there is no stable ratio in the duration of preconsonantal vowels that is maintained in different speaking styles.
https://doi.org/10.13064/KSSS.2018.10.2.045

모음길이 비율에 따른 발화속도 보상을 이용한 한국어 음성인식 성능향상 (An Improvement of Korean Speech Recognition Using a Compensation of the Speaking Rate by the Ratio of a Vowel length)

박준배;김태준;최성용;이정현
- 대한전자공학회:학술대회논문집 / 대한전자공학회 2003년도 컴퓨터소사이어티 추계학술대회논문집 / pp.195-198 / 2003
The accuracy of an automatic speech recognition system depends on background noise and on speaker variability such as sex, intonation, and speaking rate. In particular, inter-speaker and intra-speaker variation in speaking rate is a serious cause of misrecognition. In this paper, we propose a method that compensates for speaking rate using the ratio of each vowel's length in a phrase. First, the number of feature vectors in a phrase is estimated from the speaking-rate information. Second, the estimated number of feature vectors is allocated to each syllable of the phrase according to the ratio of its vowel length. Finally, feature vectors are extracted using the number assigned to each syllable in the phrase. As a result, the accuracy of automatic speech recognition improved with the proposed speaking-rate compensation method.

동어반복증을 동반한 파킨슨병 환자의 말속도 연구 (A study of speaking rate on Parkinson's disease with palilalia)

김선우
- 말소리와 음성과학 / Vol. 8, No. 3 / pp.61-66 / 2016
The purpose of this study is to examine the speaking rate (overall speaking rate and articulatory rate) of Parkinson's disease patients with palilalia (PDP). Palilalia is traditionally characterized not only by compulsive repetitions of words and phrases but also by an increased rate of speech as judged by auditory perception. Since Souques (1908) first characterized palilalia as a fast speech rate from the perspective of auditory perception, few studies have evaluated PDP speech using acoustic methods. To compare the speech rate of PDP and normal subjects, we included five PDP and eight control subjects (aged over 55), and the data were acquired in a reading task (a standardized Korean paragraph). The difference in the median overall speaking rate was not statistically significant between the PDP group (median 5.25, IQR 1.30) and the normal group (median 4.76, IQR 0.71). The PDP group, however, had a significantly higher articulatory rate in syllables per second (median 6.60, IQR 1.04) than normal subjects (median 5.60, IQR 0.52). The results indicated no differences between the two groups in pauses over 250 msec or in disfluency duration. To provide useful insight into PDP speech, multiple levels of analysis should be employed.
https://doi.org/10.13064/KSSS.2016.8.3.061

시간축 변환을 이용한 음성 인식기의 성능 향상에 관한 연구 (Study on the Improvement of Speech Recognizer by Using Time Scale Modification)

이기승
- 한국음향학회지 / Vol. 23, No. 6 / pp.462-472 / 2004
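This paper proposes a technique to compensate for the performance degradation of automatic speech recognizers caused by variation in speaking rate. Before proposing the new technique, we first quantitatively analyzed how the performance of a conventional hidden Markov model based speech recognizer changes with speaking rate. This analysis revealed a significant performance degradation that depends on speaking rate, and we introduced a variable that quantitatively represents the speaking rate of a given utterance. To bring the speaking rate closer to that of the speech used in training, we applied time-scale modification to the speech signal. Finally, we proposed a technique that applies time-scale modification selectively according to the speaking rate, compensating for the degradation in recognition performance caused by speaking-rate variation. In speech recognition experiments using 10-digit mobile phone numbers, the proposed technique reduced the error rate by 15.5% for fast speech.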

발화속도에 따른 한국어 모음의 음향적 특성 (Effects of Speaking Rate on Korean Vowels)

이숙향;고현주;한양구;김종진
- 한국음향학회지 / Vol. 22, No. 1 / pp.14-22 / 2003
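This study concerns the acoustic characteristics of Korean vowels as a function of speaking rate. It examines the duration and formant characteristics of monophthongs and of the glide and monophthong components of diphthongs at normal, slow, and fast speaking rates. Duration generally tended to shorten as the speaking rate increased. Formants, by contrast, showed no large difference for monophthongs or for the monophthong component of diphthongs, while for the glide component of diphthongs the results differed depending on the type of glide.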

다화자잡음이 말더듬의 비율과 말속도에 미치는 영향 (The Noise Effect on Stuttering and Overall Speech Rate: Multi-talker Babble Noise)

박진;정인기
- 말소리와 음성과학 / Vol. 4, No. 2 / pp.121-126 / 2012
This study examines how the frequency of stuttering changes when adults who stutter are exposed to one type of background noise, multi-talker babble noise. Eight American English-speaking adults who stutter participated. Each subject read sentences aloud under three speaking conditions: typical solo reading (TSR), typical choral reading (TCR), and multi-talker babble noise reading (BNR). Speech fluency was computed as the percentage of syllables stuttered (%SS), and speaking rate was also assessed, as a measure of vocal change, to examine whether it changed significantly across the speaking conditions. The study found that participants read more fluently during both BNR and TCR than during TSR. It also found that participants showed no significant change in speaking rate across the three speaking conditions. The effect of multi-talker babble noise on the frequency of stuttering is discussed, along with further speculation about it.
https://doi.org/10.13064/KSSS.2012.4.2.121

Search results: 117 items (processing time: 0.02 s)