Integrated Search | Korea Science

A Study on the Sound Effect for Improving Customer's Speech Recognition in the TTS-based Shop Music Broadcasting Service

강선미;김현득;장문수
- Phonetics and Speech Sciences / Vol. 1, No. 4 / pp. 105-109 / 2009
This paper describes a method for delivering intelligible voice announcements using TTS (Text-To-Speech) technology in a shop music broadcasting service. Offering a high-quality TTS sound service for each shop is very expensive. According to a report on architectural acoustics, room acoustic indexes such as reverberation time and early decay time are closely connected with the subjective perception of acoustics. Building on this result, applying a sound effect to speech files generated by TTS should help customers recognize voice announcements better. A listening comprehension test showed better results on almost all parameters when a reverb effect was applied to the TTS sound.
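
As a rough illustration of how a reverb effect relates to reverberation time, the sketch below applies a single feedback comb filter to a signal. It is not the processing used in the paper: the comb-filter structure, the 50 ms delay, the 16 kHz sample rate, and the 0.5 s reverberation time are all assumptions chosen for the example.

```python
def comb_gain(delay_samples, sample_rate, rt60_seconds):
    """Feedback gain for which the echo train decays by 60 dB in rt60_seconds."""
    return 10 ** (-3.0 * delay_samples / (sample_rate * rt60_seconds))


def comb_reverb(signal, delay_samples, gain):
    """Feedback comb filter: y[n] = x[n] + gain * y[n - delay_samples]."""
    out = []
    for n, x in enumerate(signal):
        fed_back = out[n - delay_samples] if n >= delay_samples else 0.0
        out.append(x + gain * fed_back)
    return out


# Assumed example values, not taken from the paper.
sample_rate = 16000          # samples per second
delay = 800                  # 50 ms echo spacing
gain = comb_gain(delay, sample_rate, rt60_seconds=0.5)

# Feed a unit impulse through the filter to inspect the decay.
impulse = [1.0] + [0.0] * (sample_rate // 2)
wet = comb_reverb(impulse, delay, gain)
```

With these values the echo at 0.5 s (sample 8000) has an amplitude of about 0.001, that is, 60 dB below the direct sound, which is what a reverberation time of 0.5 s means. Practical reverbs combine several such filters with different delays.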

Enhanced Spectral Envelope Coding Scheme Using Inter-frame Correlation for G.729.1

조근석;성종모;한민수;김영일;정상배
- Phonetics and Speech Sciences / Vol. 1, No. 4 / pp. 97-103 / 2009
This paper describes a new algorithm for encoding the spectral envelope in the time domain alias cancellation (TDAC) part of G.729.1. The TDAC part encodes the spectral envelope and the modified discrete cosine transform (MDCT) coefficients of the weighted code-excited linear predictive (CELP) coding error in the lower band and of the input signal in the higher band. To reduce the bits allocated to spectral envelope coding, a new algorithm that exploits sub-band correlation between adjacent frames is proposed. In addition, to improve the quality of the decoded signals, two bit allocation strategies that reuse the bits saved by the proposed algorithm are proposed. The performance of the proposed algorithm is evaluated in terms of objective quality and bit reduction rate. Experimental results show that the proposed algorithm significantly improves sound quality.
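
The general idea of saving bits through inter-frame correlation can be sketched as below. This is a generic differential-coding illustration, not the algorithm proposed in the paper and not the G.729.1 bitstream format: the flag-bit scheme, the function names, and the bit widths `full_bits=5` and `diff_bits=2` are hypothetical.

```python
def encode_envelope(prev_indices, curr_indices, full_bits=5, diff_bits=2):
    """Encode one frame of sub-band envelope indices against the previous frame.

    Each sub-band spends 1 flag bit, followed by either a short signed
    difference (when the index barely changed) or the full index.
    Returns (codes, total_bits).
    """
    lo = -(1 << (diff_bits - 1))
    hi = (1 << (diff_bits - 1)) - 1
    codes, total_bits = [], 0
    for prev, curr in zip(prev_indices, curr_indices):
        delta = curr - prev
        if lo <= delta <= hi:
            codes.append(("diff", delta))
            total_bits += 1 + diff_bits
        else:
            codes.append(("full", curr))
            total_bits += 1 + full_bits
    return codes, total_bits


def decode_envelope(prev_indices, codes):
    """Invert encode_envelope given the previously decoded frame."""
    return [prev + value if kind == "diff" else value
            for prev, (kind, value) in zip(prev_indices, codes)]


# Made-up envelope indices for two adjacent frames.
prev_frame = [10, 12, 9, 7]
curr_frame = [11, 12, 15, 6]
codes, used_bits = encode_envelope(prev_frame, curr_frame)
fixed_bits = 5 * len(curr_frame)
```

In this toy case three of the four sub-bands change by at most one step, so the frame costs 15 bits instead of 20 with fixed 5-bit indices. Bits saved in this way are the kind of budget that a bit allocation strategy could then spend elsewhere.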

Generational Differences in the Perception of Korean Stops

Kang, Kyoung-Ho
- Phonetics and Speech Sciences / Vol. 2, No. 3 / pp. 3-10 / 2010
This study supports the proposal that a sound change is occurring in Korean stops, using identification experiments on Korean stops. The perceptual weights of the acoustic correlates of the Korean stop manner contrast [VOT (Voice Onset Time), H1-H2 (amplitude difference between the first and second harmonics), and F0 (fundamental frequency)] were examined with re-synthesized /tʰa/, /ta/, and /t*a/ syllables for younger and older Seoul speakers of Korean. For the identification of the aspirated and lenis stops, the F0 cue weight relative to VOT was greater for the younger listeners than for the older listeners. For the H1-H2 cue weight, the two listener groups were more or less the same. These findings parallel the production differences found in the author's earlier work. Combined with the production differences, these perception differences between younger and older generations of Seoul speakers suggest that there are generational differences in the phonetic targets of Korean aspirated and lenis stops and that such differences are realized in the perception of the stops.

Korean Speakers' Pronunciation and Pronunciation Training of English Stops

김지은
- Phonetics and Speech Sciences / Vol. 2, No. 3 / pp. 29-36 / 2010
The purposes of this study are (1) to see whether a language transfer effect is found in Korean speakers' pronunciation of English stops and to correct any resulting errors, and (2) to investigate the effectiveness of mimicry training and Speech Analyzer training on subjects' pronunciation of English stops. To this end, the VOT values of English stops produced by 20 Korean speakers were measured using Speech Analyzer, and their post-training production was compared with their pre-training production. The results show that Korean speakers have no difficulty correcting pronunciation errors in English voiceless and voiced stops, which indicates that the expected language transfer effect was not observed. In addition, the results of the pronunciation training show that training with Speech Analyzer is more effective than mimicry training.

Perception of English High Vowels by Korean Speakers of English

Lee, Ji-Yeon
- Phonetics and Speech Sciences / Vol. 1, No. 4 / pp. 39-46 / 2009
This study compares the perception of English high tense and lax vowels (/i, ɪ, u, ʊ/) by English speakers and Korean speakers of English. The four vowels were produced in /hVd/ context by a native speaker of English, and each word's vowel duration was manipulated to range from 170 ms to 290 ms in 30 ms increments. Two English speakers and six Korean speakers of English were asked to listen to pairs of tense and lax vowel words with manipulated vowel durations and to identify the pair by choosing either heed-hid or hid-heed for front vowels and either who'd-hood or hood-who'd for back vowels. The results show that English speakers distinguished tense vowels from lax vowels with 100% accuracy regardless of the different durations, compared to 62% accuracy for Korean speakers of English. Most errors occurred for lengthened lax vowels and shortened tense vowels. The results of this study demonstrate that Korean speakers mainly rely on vowel duration as a cue to discriminate the tense and lax vowels. The theoretical and pedagogical implications of this finding are discussed.
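
For reference, the duration range stated in the abstract implies five duration steps per word. A short enumeration, with illustrative variable names:

```python
# Vowel duration continuum: 170 ms to 290 ms in 30 ms increments.
start_ms, stop_ms, step_ms = 170, 290, 30
durations_ms = list(range(start_ms, stop_ms + 1, step_ms))
print(durations_ms)  # [170, 200, 230, 260, 290]
```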

Sensitivity to Phrase-initial Tone and Laryngeal Feature Identification of Foreign Learners of Korean

Lee, Hye-Sook
- Phonetics and Speech Sciences / Vol. 2, No. 3 / pp. 91-99 / 2010
This paper reports on an identification test in which KFL (Korean as a Foreign Language) learners identified the Korean three-way laryngeal contrast in phrase-initial position while the phrase-initial tone was systematically manipulated. Heritage learners turn out to have some sensitivity to phrase-initial tone and show a plain-aspirated alternation in their identification according to the phrase-initial tone, as native speakers do, whereas non-heritage students do not show such tone sensitivity. However, after weekly prosody training, second-year non-heritage students showed a significant improvement in their performance. This paper shows that the phrase-initial tone plays a critical role in distinguishing the laryngeal features of Korean obstruents, and suggests that prosody, including the tone-segment correlation, should be incorporated into the KFL curriculum.

Confidence Measure of Forensic Speaker Identification System According to Pitch Variances

김민석;김경화;양일호;유하진
- Phonetics and Speech Sciences / Vol. 2, No. 3 / pp. 135-139 / 2010
Forensic speaker identification requires high accuracy and reliability. However, current speaker identification technology does not yet meet this demand. Evaluating the confidence of identification results is therefore one of the key issues in forensic speaker identification. In this paper, we propose a new confidence measure for forensic speaker identification systems. It is based on the pitch differences between the registered utterances of the identified speaker and the test utterance. In the experiments, we evaluate this confidence measure on speaker identification tasks in various environments. The results show that the proposed measure is a good indicator of whether an identification result is reliable.

A Study of Production Difficulties of English Bilabial Stops and Labiodental Fricatives by Korean Learners of English

구희산
- Phonetics and Speech Sciences / Vol. 1, No. 4 / pp. 11-15 / 2009
The aim of this study was to identify the production difficulties of Korean learners of English in their articulation of the English bilabial stops /p, b/ and labiodental fricatives /f, v/. Sixty nonsense syllables and twelve words were produced three times by nine graduate students. Test scores were taken from the score board produced by FluSpeak, a speech training software program designed for English pronunciation practice and improvement. Results show that 1) the subjects had lower scores in producing /p, b/ than /f, v/ in all positions, and 2) subjects had lower scores in medial (inter-vocalic) position than in initial (pre-vocalic) position and in final (post-vocalic) position when they produced /p/, /b/, /f/, and /v/. The results suggest that, on the whole, Korean learners of English have much difficulty in producing /b/ and that they also have more articulatory problems in intervocalic position than in the other positions when they produce these bilabial stops and labiodental fricatives.

Lexical Encoding of L2 Suprasegmentals: Evidence from Korean Learners' Acquisition of Japanese Vowel Length Distinctions

Han, Jeong-Im
- Phonetics and Speech Sciences / Vol. 1, No. 4 / pp. 17-27 / 2009
Despite many studies on the production and perception of L2 phonemes, studies on how such phonemes are encoded lexically remain scarce. The aim of this study is to examine whether L2 learners have a perceptual problem with L2 suprasegmentals that are not present in their L1, or whether they are able to perceive them but not able to encode them in their lexicon. Specifically, Korean learners were tested to see if they could discriminate vowel length differences in Japanese at the psychoacoustic level through a simple AX discrimination task. Then, a speeded lexical decision task with high phonetic variability was conducted to see whether they could use such contrasts lexically. The results showed that Korean learners of Japanese have no difficulty discriminating the Japanese vowel length contrast, but they are unable to encode this contrast in their phonological representation, even with long L2 exposure.

Performance Improvement of Robust Speaker Verification According to Various Standard Deviations of a Reference Distribution in Histogram Transformation

권철홍
- Phonetics and Speech Sciences / Vol. 2, No. 3 / pp. 127-134 / 2010
Additive noise and channel mismatch strongly degrade the performance of speaker verification systems, as they distort the features of speech. In this paper, a histogram transformation technique is presented to improve the robustness of text-independent speaker verification systems. The technique transforms the features extracted from speech so that their histogram conforms to a reference distribution. The effect of different standard deviations for the reference distribution is investigated. Experimental results indicate that, in channel-mismatched environments, the proposed technique offers significant improvements over existing techniques. We also verify the performance improvement of the proposed method statistically.
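
A minimal sketch of a histogram transformation of this kind is given below, assuming a Gaussian reference distribution and a rank-based estimate of the empirical CDF. The function name, the mid-rank CDF estimate, and the choice of a Gaussian reference are assumptions for illustration; the paper's exact procedure may differ.

```python
from statistics import NormalDist


def histogram_transform(features, ref_std=1.0, ref_mean=0.0):
    """Map one feature dimension onto a Gaussian reference distribution.

    Each value is replaced by the reference quantile at its empirical
    CDF position, so the output histogram follows the reference shape.
    """
    n = len(features)
    reference = NormalDist(mu=ref_mean, sigma=ref_std)
    # Indices of the values in ascending order; ties keep input order.
    order = sorted(range(n), key=lambda i: features[i])
    transformed = [0.0] * n
    for rank, idx in enumerate(order):
        # Mid-rank estimate keeps the CDF strictly inside (0, 1).
        cdf = (rank + 0.5) / n
        transformed[idx] = reference.inv_cdf(cdf)
    return transformed


# Toy feature track (made-up values), transformed with two reference widths.
track = [5.0, 1.0, 3.0]
narrow = histogram_transform(track, ref_std=1.0)
wide = histogram_transform(track, ref_std=2.0)
```

The transform preserves the ordering of the values while discarding their original scale and offset, which is why it can reduce the effect of channel mismatch. Changing `ref_std` only rescales the output here, so in this sketch the standard deviation acts as a spread parameter for the transformed features.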

