Search | Korea Science

A Study On Male-To-Female Voice Conversion (남녀 음성 변환 기술연구)

Choi Jung-Kyu;Kim Jae-Min;Han Min-Su
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.115-118 / 2000
Voice conversion technology is essential for TTS systems because constructing a speech database takes much effort. In this paper, male-to-female voice conversion in a Korean LPC TTS system is studied. In general, the parameters for voice color conversion are categorized into acoustic and prosodic parameters. This paper adopts LSF (Line Spectral Frequency) as the acoustic parameter, and pitch period and duration as the prosodic parameters. For the conversion, the pitch period is halved, the duration is shortened by $25\%$, and the LSFs are shifted linearly. The synthesized speech is then post-filtered by a bandpass filter. The proposed algorithm is simpler than other algorithms, for example VQ-based and neural-network-based methods, and it does not require estimating formant information. The MOS (Mean Opinion Score) test shows 2.25 for naturalness and 3.2 for female closeness. In conclusion, with the proposed algorithm a male-to-female voice conversion system can be implemented simply with relatively successful results.
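
The three parameter changes named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `convert_frame`, the additive form of the LSF shift, and the default shift of 0.1 rad are assumptions, since the abstract only says the LSFs are "shifted linearly". The bandpass post-filter is omitted.

```python
import math


def convert_frame(pitch_period, duration, lsfs, lsf_shift=0.1):
    """Apply the abstract's three male-to-female changes to one frame.

    pitch_period and duration are in arbitrary time units; lsfs are line
    spectral frequencies in radians, in ascending order. lsf_shift is a
    hypothetical constant offset (the abstract gives no value).
    """
    new_pitch_period = pitch_period * 0.5   # pitch period halved
    new_duration = duration * 0.75          # duration shortened by 25%

    # Assumed additive shift; keep the LSFs strictly inside (0, pi) and
    # in ascending order so they still describe a stable LPC filter.
    eps = 1e-3
    shifted = sorted(
        min(max(w + lsf_shift, eps), math.pi - eps) for w in lsfs
    )
    return new_pitch_period, new_duration, shifted


period, duration, lsfs = convert_frame(8.0, 100.0, [0.3, 0.9, 1.5])
print(period, duration, lsfs)
```

Halving the pitch period doubles the fundamental frequency, which is why this single change accounts for most of the perceived shift toward a female voice.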

Acoustic correlates of L2 English stress - Comparison of Japanese English and Korean English

Konishi, Takayuki;Yun, Jihyeon;Kondo, Mariko
- Phonetics and Speech Sciences / v.10 no.1 / pp.9-14 / 2018
This study compared the relative contributions of intensity, F0, duration, and vowel spectra to L2 English lexical stress as produced by Japanese and Korean learners of English. Recordings of Japanese, Korean, and native English speakers reading eighteen two- to four-syllable words in a carrier sentence were analyzed using multiple regression to investigate the influence of each acoustic correlate in determining whether a vowel was stressed. The relative contribution of each correlate was calculated by converting the coefficients to percentages. The Japanese learner group showed transfer of L1 phonology to L2 lexical prosody and relied mostly on F0 and duration in manifesting L2 English stress. This is consistent with the results of previous studies. However, advanced Japanese speakers in the group showed less reliance on F0 and more use of intensity, which is another parameter used in native English stress accents. On the other hand, F0 had little influence on L2 English stress for the Korean learners, probably due to the transfer of the Korean intonation pattern to L2 English prosody. Hence, this study shows that L1 transfer happens at the prosodic level for Japanese learners of English and at the intonational level for Korean learners.
https://doi.org/10.13064/KSSS.2018.10.1.009
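
The step of converting regression coefficients to percentages can be illustrated with a short sketch. It assumes that each correlate's share is the absolute value of its coefficient divided by the sum of absolute values, which the abstract does not spell out. The function name and the coefficient values are made up for illustration and are not results from the study.

```python
def relative_contributions(coefficients):
    """Convert regression coefficients to percentage contributions.

    Assumption: contribution = |coefficient| / sum of |coefficients|.
    """
    magnitudes = {name: abs(value) for name, value in coefficients.items()}
    total = sum(magnitudes.values())
    if total == 0:
        raise ValueError("all coefficients are zero")
    return {name: 100.0 * m / total for name, m in magnitudes.items()}


# Hypothetical coefficients, chosen only to show the arithmetic.
example = {"intensity": 2.0, "F0": -5.0, "duration": 2.0, "vowel_spectra": 1.0}
print(relative_contributions(example))
```

Such percentages are only comparable across correlates if the predictors were put on a common scale (for example, standardized) before the regression.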

The Effect of Vocal Function Exercise on Voice Improvement in Patients with Vocal Nodules (성대 기능 훈련이 성대결절 환자의 음성개선에 미치는 효과)

Lim, Hye-Jin;Kim, Jeong-Kyu;Kwon, Do-Ha;Park, Jun-Young
- Phonetics and Speech Sciences / v.1 no.2 / pp.37-42 / 2009
The purpose of the present study was to determine the effect of the management program known as vocal function exercise (VFE) on voice quality. The typical VFE protocol was modified for patients with vocal nodules by controlling voice intensity and relaxing the vocal folds, in order to address hyperfunctional problems in VFE. Eight female subjects aged between 28 and 54 who had been diagnosed with vocal nodules took part in the study. The patients performed VFEs once a week for eight weeks. The exercises consisted of voice hygiene, respiratory training, phonation training, and glide training. The subjects' voices were analyzed before and after therapy in terms of acoustic parameters, maximum phonation time (MPT), GRBAS, and the voice handicap index (VHI). Among the acoustic parameters, fundamental frequency ($F_0$) increased significantly, shimmer decreased markedly, and the noise-to-harmonic ratio (NHR) decreased clearly. In addition, MPT increased significantly. The GRBAS scale indicated significant improvement in grade, roughness, and strain. VHI indicated significant improvement in the emotional subscale. In conclusion, VFE was effective in improving voice quality for patients with vocal nodules.

Voice Tremor in Parkinsonism : A Preliminary Study for Differential Diagnosis (파킨슨증의 음성진전 : 감별진단을 위한 예비연구)

Choi, Seong-Hee;Kim, Hyang-Hee;Lee, Won-Yong;Choi, Hong-Shik
- Speech Sciences / v.12 no.3 / pp.19-33 / 2005
Tremor is a main feature of parkinsonism. Voice tremor may be the first, a later, or the only symptom of a neurological disease, and its frequency, amplitude, and regularity may differ among diseases of different neural subsystems. Differential diagnosis between idiopathic Parkinson's disease (IPD) and multiple system atrophy (MSA) has been difficult. This study included three groups: (1) 6 IPD patients; (2) 6 MSA patients; and (3) 20 age- and sex-matched normal controls. The MDVP (Multidimensional Voice Program) was used to analyze sustained /a/ phonation. The results were as follows: (1) the frequency perturbation parameters (jitter, sPPQ, vF0) and the tremor parameter FTRI of the two patient groups were statistically different from those of the controls (p < .01); (2) short-term and long-term f0 and amplitude perturbation measures were higher in MSA than in IPD; (3) however, no acoustic parameter differed significantly between IPD and MSA except the rate of frequency tremor, which was 4 to 5 Hz in IPD and 5 to 11 Hz in MSA; and (4) the regularity pattern of voice tremor in the histograms indicated that amplitude was irregular in IPD, while both f0 and amplitude were irregular in MSA. In conclusion, f0, the rate of frequency tremor, and the pattern of f0 regularity may be predictors for differential diagnosis. These findings might indicate that voice tremor in parkinsonism results from modulation of f0.

A Study on Speaker Identification Parameter Using Difference and Correlation Coefficient of Digit_sound Spectrum (숫자음의 스펙트럼 차이값과 상관계수를 이용한 화자인증 파라미터 연구)

Lee, Hoo-Dong;Kang, Sun-Mee;Chang, Moon-Soo;Yang, Byung-Gon
- Speech Sciences / v.11 no.3 / pp.131-142 / 2004
A speaker identification system basically functions by comparing the spectral energy of an individual's production model with that of an input signal. This study aimed to develop a new speaker identification system based on two parameters derived from the spectral energy of numeric sounds: the difference sum and the correlation coefficient. A narrow-band spectrogram yielded more stable spectral energy across time than a wide-band one. In this paper, we collected empirical data from four male speakers and tested the speaker identification system. The subjects produced 18 combinations of three-digit numeric sounds ten times each. Five productions of each three-digit number were statistically averaged to make a model for each speaker. Then, the remaining five productions were tested on the system. Results showed that when the threshold for the absolute difference sum was set to 1200, none of the speakers could pass the system, while all of them could pass when it was set to 2800. The minimum correlation coefficient that allowed all speakers to pass was 0.82, while a coefficient of 0.95 rejected all. Thus, both threshold levels can be adjusted to the needs of a speaker identification system, which is a desirable topic for further study.
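
The two parameters described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' system: spectra are represented as plain lists of energy values, the model is an element-wise mean of the training productions, and requiring both thresholds to be met at once is an assumption (the abstract reports the two thresholds separately). All function names are hypothetical.

```python
import math


def average_model(productions):
    """Element-wise mean of several spectral-energy vectors."""
    n = len(productions)
    return [sum(values) / n for values in zip(*productions)]


def abs_difference_sum(model, sample):
    """Sum of absolute differences between model and input spectrum."""
    return sum(abs(m - s) for m, s in zip(model, sample))


def correlation(model, sample):
    """Pearson correlation coefficient between two spectra."""
    n = len(model)
    mean_m = sum(model) / n
    mean_s = sum(sample) / n
    cov = sum((m - mean_m) * (s - mean_s) for m, s in zip(model, sample))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_s = sum((s - mean_s) ** 2 for s in sample)
    if var_m == 0 or var_s == 0:
        return 0.0  # flat spectrum: correlation is undefined, treat as no match
    return cov / math.sqrt(var_m * var_s)


def accept(model, sample, max_diff_sum, min_corr):
    """Assumed decision rule: pass only if both thresholds are satisfied."""
    return (abs_difference_sum(model, sample) <= max_diff_sum
            and correlation(model, sample) >= min_corr)
```

In the abstract's terms, `max_diff_sum` would be tuned somewhere between 1200 and 2800 and `min_corr` between 0.82 and 0.95, trading false rejections against false acceptances.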

Development of Electrical Stimulator for Auditory Stimulation (청각 자극용 전기자극기 개발)

Heo, Seung-Deok;Jung, Dong-Keun;Kim, Lee-Suk;Kim, Gwang-Nyeon;Kang, Myung-Koo;Kim, Jae-Ryong;Kim, Gi-Ryon
- Speech Sciences / v.11 no.3 / pp.201-211 / 2004
This paper introduces the development of an electrical stimulator for auditory stimulation. The electrical stimulator is useful in neurotological diagnosis, audiological evaluation, candidate selection for cochlear implantation, optimal device selection, and decisions on MAP strategy for persons with severe-to-profound hearing impairment. The development was based on the sound parameters of auditory brainstem responses and on auditory electrophysiological characteristics such as effective firing of the auditory nerve and recording of evoked potentials during the refractory period of the neuron. In addition, the pulse parameters can be adjusted by programming, allowing more varied electrical stimulation in evoked response audiometry. Using the stimulator, an electrical square pulse was applied to the promontory, and electrically evoked auditory brainstem responses and electrically evoked middle latency responses were successfully recorded in cats.

A Study of Acoustic Analysis in the Chinese' Korean Language Learners (중국인 한국어 학습자 음성의 음향학적 특성 연구)

Kim, Hyun-Ji;You, Jae-Yeon
- Phonetics and Speech Sciences / v.2 no.3 / pp.75-80 / 2010
The present research investigated voice characteristics across genders and nationalities by measuring the acoustic parameter values of Korean and Chinese students. Sound Forge was used to collect voice samples, and Praat was used to measure and analyze jitter, shimmer, NHR, $sF_0$, and pitch range. The results of this research are as follows. First, during prolongation of the vowels, there was no significant difference in $F_0$ between Korean and Chinese males or between Korean and Chinese females, although Korean males and females had higher $F_0$ values than Chinese males and females. Second, during sentence reading, there was no significant difference in $sF_0$ between Korean and Chinese males, but there was a significant difference between the female groups. Third, during sentence reading, the pitch range of Korean males was significantly narrower than that of Korean and Chinese females, who had a wider pitch range. Fourth, jitter in the five vowels /a, i, u, e, o/ was higher in Chinese than in Korean subjects, and in the vowels /a, e, u/ it was significantly higher in females than in males. Fifth, shimmer in the vowels /a, e, u/ was significantly higher in Chinese than in Korean subjects. Finally, NHR in the vowels /a, u, o/ was significantly higher in Chinese than in Korean subjects.
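
For readers unfamiliar with the perturbation measures, the sketch below shows the usual "local" definition of jitter and shimmer: the mean absolute difference between consecutive values divided by the mean value, expressed as a percentage. Applied to glottal period lengths it gives local jitter, and applied to per-period peak amplitudes it gives local shimmer. The abstract does not say which Praat variant was used, so this choice is an assumption, and the function name is hypothetical.

```python
def local_perturbation(values):
    """Mean absolute consecutive difference over the mean, in percent.

    Pass glottal period lengths for local jitter, or per-period peak
    amplitudes for local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_abs_diff / mean_value


# Made-up period lengths in milliseconds, for illustration only.
periods_ms = [9.0, 11.0, 9.0, 11.0]
print(local_perturbation(periods_ms))
```

A perfectly periodic voice gives 0 on both measures, so higher values indicate more cycle-to-cycle irregularity.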

A Codeword Tying Algorithm in Speech Recognition based on Discrete Hidden Markov Model (이산분포 HMM을 이용한 음성인식에서의 코드워드 Tying 알고리즘)

Kim, Do-Yeong;Kim, Nam-Soo;Un, Chong-Kwan
- The Journal of the Acoustical Society of Korea / v.13 no.3 / pp.63-70 / 1994
In this paper, we propose a new codeword tying algorithm based on a tree-structured classifier. The proposed algorithm, which can be viewed as a kind of soft decision using the statistical properties between codewords and states, has the advantage of fast construction and guarantees a unique optimal solution. It can also be applied easily to any speech recognition system based on the discrete hidden Markov model (HMM). Experimental results on speaker-independent isolated word recognition show an error reduction of $6\%$ for a codebook of size 256 and $9\%$ for size 512, as well as an HMM parameter reduction of about $20\%$.

A Study on Performance of Speech Recognition & Acoustic Parameter in Car Environment (자동차 주행 환경에서의 음성 인식 성능 및 음향 특성의 검토)

Lee Kwang-Hyun;Choi Dae-Lim;Kim Young-Il;Kim Bong-Wan;Lee Yong-Ju
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.269-272 / 2004
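The acoustic environment inside a moving car rarely provides normal transmission characteristics for speech because of various noise sources and structural factors. This problem stems from channel distortion between the sound source and the speech input device (microphone), and it seriously degrades speech recognition performance in real driving environments. In this paper, we analyze intelligibility as a function of per-channel speech distortion at different driving-noise levels using the Speech Transmission Index (STI), and compare the results with speech recognition rates. We also examine the correlation between the intelligibility measure and speech recognition performance across sound pickup patterns, and on that basis consider the optimal microphone position in a single-channel environment. The experiments show that, even in noisy driving conditions, the intelligibility measure and the recognition rate are highly correlated, and that the pattern of performance differences among channels is similar across driving environments.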

HMM-based Speech Recognition using DMS Model and Double Spectral Feature (DMS 모델과 이중 스펙트럼 특징을 이용한 HMM에 의한 음성 인식)

Ann Tae-Ock
- Journal of the Korea Academia-Industrial cooperation Society / v.7 no.4 / pp.649-655 / 2006
This paper proposes an HMM-based method for speaker-independent speech recognition that uses a DMSVQ (Dynamic Multi-Section Vector Quantization) codebook built from the DMS model together with a double spectral feature. The LPC cepstrum is used as the instantaneous spectral feature, and the regression coefficient of the LPC cepstrum is used as the dynamic spectral feature. Each of the two spectral features is quantized with its own VQ codebook. The HMM using the DMS model takes the instantaneous and dynamic spectral features as input. For comparison, recognition experiments with various conventional methods were carried out under the same data and conditions. The experimental results show that the proposed method is superior to the conventional recognition methods.
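
The dynamic feature mentioned above, a regression coefficient of the cepstrum over time, is commonly computed with the standard delta formula $d_t = \sum_{k=1}^{K} k\,(c_{t+k} - c_{t-k}) \,/\, (2\sum_{k=1}^{K} k^2)$. The sketch below implements that formula. The window size `K = 2`, the edge handling by repeating the first and last frames, and the function name are assumptions; the abstract does not give the paper's exact settings.

```python
def delta_features(frames, window=2):
    """Regression (delta) coefficients of a cepstral sequence.

    frames: list of cepstral vectors (lists of floats), one per time frame.
    window: number of frames K used on each side (assumed, not from the paper).
    Frames beyond the ends are replaced by the first or last frame.
    """
    if not frames:
        return []
    denom = 2 * sum(k * k for k in range(1, window + 1))
    last = len(frames) - 1
    dim = len(frames[0])
    deltas = []
    for t in range(len(frames)):
        acc = [0.0] * dim
        for k in range(1, window + 1):
            ahead = frames[min(t + k, last)]
            behind = frames[max(t - k, 0)]
            for i in range(dim):
                acc[i] += k * (ahead[i] - behind[i])
        deltas.append([a / denom for a in acc])
    return deltas
```

In a double-feature setup like the one described, the original cepstral vectors and these delta vectors would each be quantized with a separate codebook before being passed to the HMM.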
