Search | Korea Science

Analysis of Voice Color Similarity for the development of HMM Based Emotional Text to Speech Synthesis (HMM 기반 감정 음성 합성기 개발을 위한 감정 음성 데이터의 음색 유사도 분석)

Min, So-Yeon;Na, Deok-Su
- Journal of the Korea Academia-Industrial cooperation Society / v.15 no.9 / pp.5763-5768 / 2014
Maintaining voice color is important when a single synthesizer combines a normal voice, in which no emotion is expressed, with various emotional voices. When a synthesizer is developed from recordings in which emotion is expressed excessively, the voice color cannot be maintained and each synthetic voice can sound like a different speaker. In this paper, speech data were recorded and the change in voice color was analyzed in order to develop an emotional HMM-based speech synthesizer. To build a speech synthesizer, a voice is recorded and a database is constructed. The recording process is very important, particularly for an emotional speech synthesizer, and monitoring is needed because it is quite difficult to define an emotion and hold it at a particular level. The synthesizer used a normal voice and three emotional voices (Happiness, Sadness, Anger), and each emotional voice had two levels, High and Low. To analyze the voice color of the normal and emotional voices, the average spectrum, obtained by accumulating the spectra of vowels, was used, and the F1 (first formant) calculated from the average spectrum was compared. The voice similarity of the Low-level emotional data was higher than that of the High-level emotional data, and recording can be monitored with the proposed method through changes in voice similarity.
https://doi.org/10.5762/KAIS.2014.15.9.5763

COMPARISON OF SPEECH PATTERNS ACCORDING TO THE DEGREE OF SURGICAL SETBACK IN MANDIBULAR PROGNATHIC PATIENTS (하악골 전돌증 수술 후 하악골 이동량에 따른 발음 양상에 관한 비교 연구)

Shin, Ki-Young;Lee, Dong-Keun;Oh, Seung-Hwan;Sung, Hun-Mo;Lee, Suk-Hang
- Maxillofacial Plastic and Reconstructive Surgery / v.23 no.1 / pp.48-58 / 2001
After performing mandibular setback surgery, we found some changes in the patterns and organs of speech. This study was undertaken to investigate the aspect and degree of change in speech patterns according to the amount of surgical setback in mandibular prognathic patients. Thirteen patients with skeletal Class III malocclusion were studied preoperatively and postoperatively over 6 months. They had undergone mandibular setback surgery via bilateral sagittal split ramus osteotomy (BSSRO). We split the patients into two groups: Group 1 included patients whose mandibular setback was 6 mm or less, and Group 2 those whose setback was above 6 mm. The control group consisted of two adults with normal speech patterns. A phonetician performed narrow phonetic transcriptions of tape-recorded words and sentences produced by each of the patients, and the acoustic characteristics of the plosives, fricatives, and flaps were analyzed with a phonetic computer program (Computerized Speech Lab (CSL) Model 4300B (USA)). The results are as follows: 1. Generally, patients showed a longer closure duration of plosives, a shorter VOT (voice onset time), and a higher ratio of closure duration to VOT. 2. Patients showed a diffuse distribution of frication noise energy in fricatives more frequently than the control group. 3. In fricatives, the frequency of the compact form was higher in Group 1 than in Group 2. 4. Generally, the short closure duration for /ㄹ/ was not realized in the patients' flaps. Instead, the flap was realized as a fricative, a sonorant with a vowel-like formant structure, or a trill-type consonant. 5. The abnormality of the patients' articulation was reduced, but the adaptation of their articulation after surgery was not perfect, and the degree of adaptation differed according to the degree of surgical setback.

The Comparative Study of Effect on Speech before and after Orthognathic Surgery of Patients (악교정 환자의 악교정 수술전후 발음양상에 대한 비교연구)

Kwon, Kyung-Hwan;Kim, Soo-Nam;Lee, Dong-Keun;Cho, Yong-Min;Lee, Suk-Hyang
- Maxillofacial Plastic and Reconstructive Surgery / v.22 no.2 / pp.191-205 / 2000
This study was undertaken to determine the effects of orthognathic surgery on speech. The hypothesis stated herein is that functional behaviors of the dentofacial complex, such as speech production, may be adversely affected by deviations of a structural nature (especially Class III malocclusion). Twenty adults with Class III malocclusion (13 female and 7 male) were studied using preoperative, immediate postoperative, and either 6- or 12-month postoperative lateral cephalograms. They had mandibular prognathism and had undergone mandibular setback surgery. The position of the tongue, soft palate (uvula), and hyoid bone, the respiratory tract width, and the pharyngeal depth were assessed on lateral cephalograms with 23 cephalometric variables. ANOVA, paired t-tests, and Pearson's product-moment correlation coefficient tests were used to evaluate the operative changes in all cephalometric parameters. Experienced speech and language pathologists performed narrow phonetic transcriptions of tape-recorded words and sentences produced by each of the nine patients, and the recordings were analyzed with a phonetic computer program (Computerized Speech Lab (CSL) Model 4300BI (U.S.A.)). These judges also recorded their ratings of each patient's overall consonants, hypernasality, hyponasality, and articulation proficiency. The results obtained are as follows: 1. There were significant changes in the distance from the posterior pharyngeal wall to the tongue (TI-TW2, TS-TW3) at 6 months postoperatively (p<0.01 and p<0.05, respectively). 2. The posterior tongue points (TI, TS, PPT) moved posteriorly after surgery and remained in their changed position at 6 months postoperatively (p<0.05). The displacement of the tongue was correlated with the amount of mandibular setback (p<0.05). The hyoid bone moved posteriorly and superiorly in the immediate postoperative period; this change was significant (p<0.05), but the hyoid bone returned to its original position during the follow-up period (p>0.05). 3. The soft palate was displaced posteriorly and superiorly in the immediate postoperative period and remained in its changed position at 6 months postoperatively (p<0.05). An increase in the ANS-PNS-SPT angle and a narrowing of the PPU-PPPo distance were observed after surgery and persisted at 6 months postoperatively (p<0.05). 4. There were significant changes in the formant values and the vowel square diagram after orthognathic surgery and the follow-up period. There were significant changes in the /ㅅ/ sound and in posterior tongue sounds. 5. The posterior movement of the tongue and the posterosuperior movement of the soft palate were correlated with the amount of mandibular setback after orthognathic surgery. On the vowel square diagram, the author found that the place of articulation after the operation moved downward, backward, and upward. 6. In assessing speech abnormalities, dental occlusion should be considered as a contributing factor. The vast majority of subjects with preoperative misarticulations eliminated or reduced their errors following orthognathic surgery. There was a significant difference in speech between the preoperative and postoperative states, indicating improvement.

Perception of native Korean Speakers on English and German

Kang, Hyun-Sook;Koo, So-Ryeong;Lee, Sook-hyang
- Proceedings of the KSPS conference / 2000.07a / pp.86-87 / 2000
In this paper, we discuss why two different surface forms appear in loanwords for English and German /ʃ/. In Korean, a vowel is inserted into loanwords if a consonant cannot be properly syllabified. Therefore, /ʃ/ in some positions of loanwords triggers vowel insertion. Interestingly, /ʃ/s in the onset cluster of English and German words were borrowed into Korean as /ʃu/ with the inserted vowel [u], whereas /ʃ/s in the coda position of English and German words were borrowed as /ʃi/ with the inserted vowel [i]. For example, 'shrimp' is adopted as [ʃurimphi] whereas 'rush' is adopted as [raʃi]. In this paper, we attempt to find the phonetic reason for the distribution of the surface forms of /ʃ/. We assume that since the formant frequency of [i] is higher than that of [u], the peak frequency of /ʃ/ with the surface form [ʃi] in loanwords may be higher than that of /ʃ/ with the surface form [ʃu]. We also assume that duration may be another factor in the distribution of [ʃi] and [ʃu]. Since /ʃ/ and /u/ use lip rounding whereas /i/ does not, the duration of [ʃi] might be longer than that of [ʃu]. German supports our assumption. /ʃ/ in the onset cluster is longer than /ʃ/ in the coda position. It also has a higher peak frequency than /ʃ/ in the coda position. In loanwords, /ʃ/ in the onset cluster is borrowed as [ʃu], as in Spiegel, whereas /ʃ/ in the coda position is borrowed as [ʃi], as in Bosch. English, however, does not support our assumption. The peak frequency of [ʃ] depends on the preceding vowel, not on its position in the syllable structure. If the preceding vowel is front, then the peak frequency of the following /ʃ/ is high, but if the preceding vowel is back, then the peak frequency of the following /ʃ/ is low. The peak frequency of /ʃ/ in the onset cluster seems to be in between. As we assumed, however, the duration of /ʃ/ in the coda position is longer than that of /ʃ/ in the onset cluster. Given the mixed results, we question whether Koreans really hear two different sounds for /ʃ/ in English words. As a future experiment, we would like to perform a perception test for /ʃ/ in English words.

Effects of Butorphanol on Behavior after Intestinal Anastomosis in Dogs (Butorphanol의 투여가 장문합술 후 개의 행동에 미치는 영향)

Koo Ja-min;Lee Hee-chun;Chang Hong-hee;Seong Yong-jeung;Lee Hyo-jong;Yeon Seong-chan
- Journal of Veterinary Clinics / v.22 no.1 / pp.6-15 / 2005
This study was performed to investigate non-invasive behavioral pain assessment of dogs after surgery and the analgesic effects of butorphanol after intestinal anastomosis in dogs. Five dogs in the Control Group were anesthetized but did not undergo surgery. Five dogs in the Analgesic Group underwent intestinal anastomosis and were treated with butorphanol. Five dogs in the Non-analgesic Group also underwent intestinal anastomosis, without analgesic treatment. The dogs in the Analgesic Group received butorphanol (0.4 mg/kg, IM) before and immediately after the operation, while dogs in the Control and Non-analgesic Groups received isovolumetric doses of sterile saline. The behavior of the dogs was videotaped for 400 min after anesthesia, during which time a researcher interacted with each dog once every 80 min. At each interaction, the researcher recorded a behavioral pain score using the University of Melbourne Pain Scale. Interactive and non-interactive behaviors were observed and quantified by a single observer using the focal continuous sampling method. Vocalizations were recorded during the 400 min after anesthesia, and the call duration, intensity, pitch, and first to fourth formants were analyzed. Surgery increased the pain score. During interactions with the researcher, greeting behaviors decreased after surgery. Differences between the group given the analgesic and the group given a placebo were readily detected using quantitative behavioral measurements and vocalization. The difference between the group given butorphanol and the group given the placebo was significant (p<0.05).

Effective Feature Vector for Isolated-Word Recognizer using Vocal Cord Signal (성대신호 기반의 명령어인식기를 위한 특징벡터 연구)

Jung, Young-Giu;Han, Mun-Sung;Lee, Sang-Jo
- Journal of KIISE:Software and Applications / v.34 no.3 / pp.226-234 / 2007
In this paper, we develop a speech recognition system using a throat microphone. The use of this kind of microphone minimizes the impact of environmental noise. However, because of the absence of high frequencies and the partial loss of formant frequencies, previous systems developed with such devices have shown a lower recognition rate than systems that use standard microphone signals. This problem has led researchers to use throat microphone signals as supplementary data sources supporting standard microphone signals. In this paper, we present a high-performance ASR system that we developed using only a throat microphone by taking advantage of Korean Phonological Feature Theory and a detailed throat signal analysis. Analyzing the spectrum and the FFT of the throat microphone signal, we find that the conventional MFCC feature vector, which uses a critical pass filter, does not characterize throat microphone signals well. We also describe the conditions that make a feature extraction algorithm best suited for throat microphone signal analysis. The conditions are (1) a sensitive band-pass filter and (2) the use of a feature vector that is suitable for voice/non-voice classification. We experimentally show that the ZCPA algorithm, designed to meet these conditions, improves the recognizer's performance by approximately 16%. We also find that an additional noise-canceling algorithm such as RASTA yields a further 2% improvement in performance.

A Study on Acoustical Properties of Soprano's Singing (소프라노의 성악 발성에 대한 음향학적 특징 연구)

임동철;문소연;이행세
- The Journal of the Acoustical Society of Korea / v.19 no.5 / pp.60-64 / 2000
This paper studies the relation between the fundamental frequency (F0) and the formants of simple vowels in the Korean language sung by sopranos. It is known that, in soprano singing, the F0 of a vowel affects its formants. For this reason, the formants of simple vowels sung by sopranos must be considered across the whole soprano singing range. We recorded the five simple vowel sounds /a/, /e/, /i/, /o/, and /u/ sung by five professional sopranos from A3 (220.0 Hz) to A5 (880.0 Hz) in the major scale and compared the formants of the sung vowels with those of spoken vowels. We observed that F1 and F2 of sung vowels were stable at low F0 (lower than B4), but at high F0 (higher than B4), F1 and F2 lost their stability. In the case of /a/, /o/, and /u/, the slope of the F1-F2 graph was about 2.6, and those of the F0-F2 and F0-F1 graphs were 2.2-2.5 and 0.7-1.0, respectively. As F0 increased, the F1 and F2 of the sung vowels /a/, /e/, /i/, /o/, and /u/ became almost the same. At A5, the F1 and F2 of the five sung vowels had the same values. These results suggest that the relation between F0 and the formants can be used to synthesize sopranos' sung vowels.
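The note names in this abstract can be related to frequencies with the standard equal-temperament formula. The sketch below assumes A4 = 440 Hz tuning, which is consistent with the stated A3 (220.0 Hz) and A5 (880.0 Hz) but is not given in the abstract; the function name `note_frequency` is hypothetical.

```python
# Equal-temperament pitch relative to A4 (assumed to be 440 Hz; the abstract
# only states A3 = 220.0 Hz and A5 = 880.0 Hz).
SEMITONES_FROM_C = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_frequency(name: str, a4_hz: float = 440.0) -> float:
    """Return the frequency in Hz of a natural note such as 'A3' or 'B4'."""
    letter, octave = name[0], int(name[1:])
    # Signed semitone distance from A4.
    semitones = (octave - 4) * 12 + SEMITONES_FROM_C[letter] - SEMITONES_FROM_C["A"]
    return a4_hz * 2.0 ** (semitones / 12.0)


# Range limits from the abstract, plus the B4 boundary it mentions.
for note in ("A3", "B4", "A5"):
    print(f"{note}: {note_frequency(note):.1f} Hz")
```

Under this assumption, the B4 boundary between the stable and unstable formant regions falls a little below 500 Hz.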

Perceptual cues for /o/ and /u/ in Seoul Korean (서울말 /ㅗ/와 /ㅜ/의 지각특성)

Byun, Hi-Gyung
- Phonetics and Speech Sciences / v.12 no.3 / pp.1-14 / 2020
Previous studies have confirmed that /o/ and /u/ in Seoul Korean are undergoing a merger in the F1/F2 space, especially for female speakers. It has been reported that female speakers use phonation (H1-H2) differences as a substitute parameter for formants to distinguish /o/ from /u/. This study aimed to explore whether H1-H2 values are being used as perceptual cues for /o/-/u/. A perception test was conducted with 35 college students using /o/ and /u/ tokens spoken by 41 females, which overlap considerably in the vowel space. An acoustic analysis of the 182 stimuli was also conducted to see whether there is any correspondence between production and perception. The identification rate was 89% on average, 86% for /o/, and 91% for /u/. The results confirmed that in production, when /o/ and /u/ cannot be distinguished in the F1/F2 space because they are too close, H1-H2 differences contribute significantly to the separation of the two vowels. In perception, however, this was not the case. H1-H2 values were not significantly involved in the identification process, and the formants (especially F2) were still the dominant cues. The study also showed that even though H1-H2 differences are apparent in females' production, males do not use H1-H2 in their production, and neither females nor males use H1-H2 in their perception. It is presumed that H1-H2 has not yet developed into a perceptual cue for /o/ and /u/.
https://doi.org/10.13064/KSSS.2020.12.3.001

A comparison of Korean vowel formants in conditions of chanting and reading utterances (챈트 및 읽기 발화조건에 따른 한국어 모음 포먼트 비교)

Park, Jihye;Seong, Cheoljae
- Phonetics and Speech Sciences / v.12 no.3 / pp.85-94 / 2020
Vowel articulation appears to be difficult for people with speech disorders. A chant method that properly reflects the characteristics of language could be an effective way of addressing these difficulties. The purpose of this study was to find out whether the chant method is effective as a means of enhancing vowel articulation. The subjects were 60 normal adults (30 males and 30 females) in their 20s and 30s whose native language is Korean. Eight utterance conditions, including chanting and reading conditions, were recorded and their acoustic data were analyzed. The analysis of the formant-related acoustic variables confirmed that, with the chant method, the F1 and F2 values of the vowels increase and the center of gravity of the vowel triangle moves forward and downward to a statistically significant degree, in both the word and the phrase context. The results also showed that accent is the most influential musical factor in chant. There was no significant difference between the four repeated tokens, which increases the reliability of the results. In other words, chanting is an effective way to shift the center of gravity of the vowel triangle, which suggests that it can help to improve speech intelligibility by forming a desirable place of articulation.
https://doi.org/10.13064/KSSS.2020.12.3.085
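The "center of gravity of the vowel triangle" in the abstract above can be pictured as the mean (F1, F2) of the corner vowels. The following is a minimal sketch under that assumption; the abstract does not give the study's exact definition, the function name is hypothetical, and the formant values are made up purely for illustration.

```python
from typing import Dict, Tuple

Formants = Tuple[float, float]  # (F1, F2) in Hz


def vowel_triangle_centroid(corners: Dict[str, Formants]) -> Formants:
    """Return the mean (F1, F2) over the corner vowels of a vowel triangle."""
    n = len(corners)
    f1 = sum(f[0] for f in corners.values()) / n
    f2 = sum(f[1] for f in corners.values()) / n
    return (f1, f2)


# Hypothetical formant values, NOT taken from the study.
reading = {"a": (800.0, 1300.0), "i": (300.0, 2300.0), "u": (350.0, 800.0)}
chanting = {"a": (860.0, 1360.0), "i": (330.0, 2390.0), "u": (370.0, 850.0)}

r = vowel_triangle_centroid(reading)
c = vowel_triangle_centroid(chanting)
# A higher F1 corresponds to a lower tongue position and a higher F2 to a more
# front position, so positive shifts here mean "lowered" and "forwarded".
print(f"F1 shift: {c[0] - r[0]:+.1f} Hz, F2 shift: {c[1] - r[1]:+.1f} Hz")
```

With real data, the two dictionaries would hold per-condition mean formants, and the sign of each shift would indicate the direction in which the triangle's center moves.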

A STUDY OF THE INFLUENCE ON PHONATION WHEN MAXILLARY ANTERIOR TEETH ARE MISSING (상악 전치부 결손이 발음에 미치는 영향에 관한 연구)

Roh Chang-Sup;Choi Dae-Gyun;Woo Yi-Hyung;Choi Boo-Byung
- The Journal of Korean Academy of Prosthodontics / v.30 no.3 / pp.338-360 / 1992
This study was performed to investigate the phonetic alterations that occur when the upper anterior teeth are missing. To compare phonation before and after insertion of a temporary prosthesis, six subjects who had lost their upper anterior teeth were selected (2 male, 4 female). The tested sounds (/ga(가), na(나), da(다), ra(라), sa(사), ja(자), cha(차), ta(타), pa(파), ha(하), gi(기), ni(니), di(디), ri(리), si(시), ji(지), chi(치), ti(티), pi(피), hi(히), seu(스), se(세), so(소), su(수)/) were entered into an IBM AT with and without the temporary prosthesis. They were analyzed in terms of formants, consonant durations, and energy level changes with an LSI speech workstation program. During the pronunciation of the tested sounds (with and without the temporary prosthesis), mandibular movements were recorded with a Mandibular Kinesiogram and analyzed. The findings led to the following conclusions: 1. Objective differences could not be found. However, subjective improvement could be noticed in every informant. 2. There were no persistent correlations in the formant changes, and the phonetic changes varied in every informant. 3. There were various changes in the consonant durations in every informant. By and large, those of /si(시), ji(지), chi(치), pi(피), hi(히)/ were longer than those of the other tested sounds. After insertion of the prosthesis, durations were shorter. Consonants with /i(ㅣ)/ were longer than those with /a(ㅏ)/, with or without the prosthesis. 4. With and without the temporary prosthesis, mandibular movements varied in the frontal view. Mandibular movements showed lateral deviations, and mandibular positions with /si(시), ji(지), ti(티), seu(스), hi(히)/ were nearer to the mandibular rest position. 5. The kind of temporary prosthesis and the condition of the missing teeth influenced every informant differently, so there was no correlation between informants. 6. Energy levels increased in all tested sounds with a fixed temporary prosthesis, whereas there were no differences between before and after insertion of a removable temporary prosthesis. However, sibilant sounds and consonants with /i(ㅣ)/ showed a slightly increased energy level.
