• Title/Summary/Keyword: Parts of Speech

Search Results: 135

An Efficient Transcoding Algorithm For G.723.1 and EVRC Speech Coders (G.723.1 음성부호화기와 EVRC 음성부호화기의 상호 부호화 알고리듬)

  • 김경태;정성교;윤성완;박영철;윤대희;최용수;강태익
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.28 no.5C
    • /
    • pp.548-554
    • /
    • 2003
  • Interoperability is one of the most important factors for the successful integration of speech networks. To enable communication between endpoints employing different speech coders, the decoder and encoder of each endpoint coder must be placed in tandem. However, tandem coding often causes problems such as degraded speech quality, high computational load, and additional transmission delay. In this paper, we propose an efficient transcoding algorithm that provides interoperability to networks employing the ITU-T G.723.1 [1] and TIA IS-127 EVRC [2] speech coders. The proposed transcoding algorithm is composed of four parts: LSP conversion, open-loop pitch conversion, fast adaptive codebook search, and fast fixed codebook search. Subjective and objective quality evaluations confirmed that the speech quality produced by the proposed transcoding algorithm is equivalent to, or better than, that of tandem coding, while it has a shorter processing delay and lower computational complexity, as verified by an implementation on the TMS320C62x.
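
As a rough illustration of the LSP conversion stage, the sketch below maps decoded 10th-order LSP vectors from the source coder's frame grid onto the target coder's grid by linear interpolation in the LSP domain (G.723.1 uses 30 ms frames, EVRC 20 ms). This is a minimal sketch under that interpolation assumption, not the paper's exact conversion rule; the function name and the random example data are illustrative.

```python
import numpy as np

def convert_lsp_frames(src_lsps, src_frame_ms=30.0, dst_frame_ms=20.0):
    """Map a sequence of 10th-order LSP vectors from the source coder's
    frame grid (e.g. G.723.1, 30 ms) onto the target coder's grid
    (e.g. EVRC, 20 ms) by linear interpolation in the LSP domain.

    src_lsps : (n_frames, 10) array of LSP values in radians.
    Returns an (m_frames, 10) array on the target grid.
    """
    src_lsps = np.asarray(src_lsps, dtype=float)
    n_src = src_lsps.shape[0]
    # Frame-centre times for the source and target grids.
    t_src = (np.arange(n_src) + 0.5) * src_frame_ms
    total_ms = n_src * src_frame_ms
    n_dst = int(total_ms // dst_frame_ms)
    t_dst = (np.arange(n_dst) + 0.5) * dst_frame_ms
    # Interpolate each of the 10 LSP coefficients independently.
    dst = np.empty((n_dst, src_lsps.shape[1]))
    for k in range(src_lsps.shape[1]):
        dst[:, k] = np.interp(t_dst, t_src, src_lsps[:, k])
    # Keep LSPs ordered and strictly inside (0, pi) for filter stability.
    dst = np.clip(dst, 1e-3, np.pi - 1e-3)
    dst.sort(axis=1)
    return dst

# Example: three G.723.1 frames (90 ms) become four EVRC frames.
lsps_g7231 = np.sort(np.random.uniform(0.1, 3.0, size=(3, 10)), axis=1)
lsps_evrc = convert_lsp_frames(lsps_g7231)
print(lsps_evrc.shape)
```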

A Survey on Participants' Satisfaction of Vocal Hygiene Education: A Preliminary Study (음성위생교육 만족도에 대한 예비 연구)

  • Yoon, Ji Hye;Kim, Sun Woo
    • Phonetics and Speech Sciences
    • /
    • v.5 no.3
    • /
    • pp.83-93
    • /
    • 2013
  • Vocal hygiene education is an indirect training approach that aims to improve vocal function by educating patients about all facets of optimal vocal health. Participants' satisfaction levels may be an important component of this indirect therapy for voice disorders. The authors investigated satisfaction with vocal hygiene education in 51 patients with voice problems. We classified the participants' voice disorders into three etiological categories (subgroups): organic, neurogenic, and functional. The survey consisted of three parts: 1) the conditions under which vocal hygiene education was provided, 2) the degree of satisfaction with the present education, and 3) requests for future education. Participants responded to each item on a five-point Likert scale (1 being not at all and 5 being extremely) and also wrote down personal comments for improvement. Participants scored the vocal hygiene education offered by the speech-language pathologists between '3' and '4'. Specifically, the participants were highly satisfied with the specific and comprehensible explanation/instruction given by their speech-language pathologists, but less satisfied with the tuition fee for the therapy sessions. Vocal hygiene education is offered individually to people in a clinical setting. Our results support the notion that vocal hygiene education can be an integral aspect of the treatment of voice problems in most cases.

On Hilbert's 'Mathematical Problems' (힐베르트의 '수학 문제'에 관하여)

  • 한경혜
    • Journal for History of Mathematics
    • /
    • v.16 no.4
    • /
    • pp.33-44
    • /
    • 2003
  • This article presents Hilbert's Paris address within the context of his career and the mathematics of his day. The text of Hilbert's speech is considered in some detail, particularly those parts of it that reflect his vision of mathematics most clearly. Finally, it is argued that the standard view of Hilbert as an advocate of a formalist approach to mathematics has its limitations.

VOICE SOURCE ESTIMATION USING SEQUENTIAL SVD AND EXTRACTION OF COMPOSITE SOURCE PARAMETERS USING EM ALGORITHM

  • Hong, Sung-Hoon;Choi, Hong-Sub;Ann, Sou-Guil
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1994.06a
    • /
    • pp.893-898
    • /
    • 1994
  • In this paper, the influence of voice source estimation and modeling on speech synthesis and coding is examined, and new estimation and modeling techniques are proposed and verified by computer simulation. It is known that existing speech synthesizers produce speech that sounds dull and unnatural. These problems arise from the fact that existing estimation and modeling techniques cannot provide sufficiently accurate voice source parameters. Therefore, in this paper we propose a new voice source estimation algorithm and a modeling technique that can represent a variety of source characteristics. First, we divide the speech samples within one pitch period into four parts having different characteristics. Second, the vocal-tract parameters and voice source waveforms are estimated in each region separately using sequential SVD. Third, we propose the composite source model, a new voice source model represented by a weighted sum of pre-defined basis functions. Finally, the weights and time-shift parameters of the proposed composite source model are estimated using the EM (estimate-maximize) algorithm. Experimental results indicate that the proposed estimation and modeling methods can estimate more accurate voice source waveforms and represent various source characteristics.
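
The composite source model described above represents one pitch period of the voice source as a weighted sum of pre-defined basis functions. The sketch below illustrates that idea with Gaussian pulses as a hypothetical basis and a least-squares solve for the weights; the paper estimates both weights and time shifts with the EM algorithm, so this fixed-shift, least-squares version is only a simplified illustration.

```python
import numpy as np

def gaussian_pulse(t, center, width):
    """One hypothetical basis function: a Gaussian pulse (an assumption,
    not the paper's actual basis set)."""
    return np.exp(-0.5 * ((t - center) / width) ** 2)

def fit_composite_source(residual, n_basis=8, width=8.0):
    """Approximate one pitch period of the voice-source residual as a
    weighted sum of basis pulses.  The paper estimates both weights and
    time shifts with EM; here the shifts are fixed on a uniform grid and
    only the weights are solved by least squares, as a simplification."""
    t = np.arange(len(residual), dtype=float)
    centers = np.linspace(0, len(residual) - 1, n_basis)
    B = np.stack([gaussian_pulse(t, c, width) for c in centers], axis=1)
    weights, *_ = np.linalg.lstsq(B, residual, rcond=None)
    return weights, B @ weights  # weights and the reconstructed source

# Example with a synthetic one-pitch-period residual (80 samples).
t = np.arange(80, dtype=float)
residual = (np.exp(-0.5 * ((t - 20.0) / 5.0) ** 2)
            - 0.3 * np.exp(-0.5 * ((t - 50.0) / 10.0) ** 2))
w, approx = fit_composite_source(residual)
print(np.round(np.linalg.norm(residual - approx), 4))
```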

Energy-Dependent Preemphasis for Speech Signal Preprocessing (음성신호 전처리를 위한 에너지 의존 프리엠퍼시스)

  • Kim, Dong-Jun;Park, Sang-Hui
    • The Journal of the Acoustical Society of Korea
    • /
    • v.16 no.3
    • /
    • pp.18-25
    • /
    • 1997
  • This study describes a modified preemphasis formula, which we call energy-dependent preemphasis (EDP). It uses the normalized short-term energy of the speech signal, under the assumption that the source characteristics of the glottal pulses and the radiation characteristics of the lips are approximately proportional to the energy of the speech signal. Using this method, speech analyses such as AR spectrum estimation and formant detection are performed on the nonstationary onset portions of five Korean monophthongs, and the results are compared with two conventional preemphasis methods. We found that the proposed preemphasis gave enhanced spectral shapes and more accurate formant frequencies, and avoided the merging of two adjacent formants.
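
The abstract does not give the exact EDP formula, so the sketch below only illustrates the general idea: the coefficient of the usual first-order preemphasis filter y[n] = x[n] - a·x[n-1] is scaled by the normalized short-term energy of the current frame. The scaling rule, base coefficient, and frame length are assumptions for illustration.

```python
import numpy as np

def energy_dependent_preemphasis(x, base_alpha=0.95, frame_len=160):
    """Sketch of an energy-dependent preemphasis.  The exact EDP formula is
    not given in the abstract; here the usual first-order coefficient is
    simply scaled by the normalized short-term energy of each frame
    (an assumption for illustration):
        y[n] = x[n] - alpha_f * x[n-1],  alpha_f = base_alpha * E_norm(frame)
    """
    x = np.asarray(x, dtype=float)
    # Short-term energy per frame, normalized to [0, 1].
    n_frames = int(np.ceil(len(x) / frame_len))
    energy = np.array([np.sum(x[i * frame_len:(i + 1) * frame_len] ** 2)
                       for i in range(n_frames)])
    e_norm = energy / (energy.max() + 1e-12)
    y = np.empty_like(x)
    y[0] = x[0]
    for n in range(1, len(x)):
        alpha = base_alpha * e_norm[n // frame_len]
        y[n] = x[n] - alpha * x[n - 1]
    return y

# Example: a 1 s, 16 kHz synthetic signal with rising amplitude.
sr = 16000
t = np.arange(sr) / sr
signal = np.linspace(0.1, 1.0, sr) * np.sin(2 * np.pi * 220 * t)
emphasized = energy_dependent_preemphasis(signal)
print(emphasized.shape)
```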

Speech/Mixed Content Signal Classification Based on GMM Using MFCC (MFCC를 이용한 GMM 기반의 음성/혼합 신호 분류)

  • Kim, Ji-Eun;Lee, In-Sung
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.2
    • /
    • pp.185-192
    • /
    • 2013
  • In this paper, we propose a method to improve the performance of speech/mixed-content signal classification using MFCCs and the GMM probability model used in the MPEG USAC (Unified Speech and Audio Coding) standard. For effective pattern recognition, the Gaussian mixture model (GMM) is used, and the expectation-maximization (EM) algorithm is used to extract the optimal GMM parameters. The proposed classification algorithm is divided into two significant parts: the first extracts the optimal parameters for the GMM, and the second distinguishes between speech and mixed-content signals using MFCC feature parameters. The proposed classification algorithm shows better performance than the conventionally implemented USAC scheme.
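
A minimal sketch of the two-stage scheme described above, assuming librosa for MFCC extraction and scikit-learn's GaussianMixture (fitted with EM) for the probability models; the class file lists, MFCC order, and mixture count are placeholders rather than the paper's settings.

```python
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_features(path, sr=16000, n_mfcc=13):
    """Frame-level MFCC feature matrix (n_frames x n_mfcc) for one file."""
    y, sr = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

def train_gmms(speech_files, mixed_files, n_components=16):
    """Stage 1: fit one GMM per class with the EM algorithm (sklearn default)."""
    speech_feats = np.vstack([mfcc_features(f) for f in speech_files])
    mixed_feats = np.vstack([mfcc_features(f) for f in mixed_files])
    gmm_speech = GaussianMixture(n_components, covariance_type="diag").fit(speech_feats)
    gmm_mixed = GaussianMixture(n_components, covariance_type="diag").fit(mixed_feats)
    return gmm_speech, gmm_mixed

def classify(path, gmm_speech, gmm_mixed):
    """Stage 2: label a signal by comparing average per-frame log-likelihoods."""
    feats = mfcc_features(path)
    ll_speech = gmm_speech.score(feats)   # mean log-likelihood per frame
    ll_mixed = gmm_mixed.score(feats)
    return "speech" if ll_speech > ll_mixed else "mixed"

# Hypothetical usage (file names are placeholders):
# gmm_s, gmm_m = train_gmms(["sp1.wav", "sp2.wav"], ["mx1.wav", "mx2.wav"])
# print(classify("test.wav", gmm_s, gmm_m))
```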

A Syllabic Segmentation Method for the Korean Continuous Speech (우리말 연속음성의 음절 분할법)

  • 한학용;고시영;허강인
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.3
    • /
    • pp.70-75
    • /
    • 2001
  • This paper proposes a syllabic segmentation method for Korean continuous speech. The method consists of three major steps: (1) labeling vowel, consonant, and silence units and forming a token sequence from the speech data using time-domain segmental parameters (pitch, energy, ZCR, and PVR); (2) scanning the token sequence against the structure of Korean syllables using a parser designed as a finite state automaton; and (3) re-segmenting syllable parts that contain two or more syllables using pseudo-syllable nucleus information. Experimental evaluation of the proposed method yields segmentation rates of 73.5% for continuous words and 85.9% for sentence units.
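
The sketch below illustrates step (2) only: a small finite state automaton that scans a pre-labeled token sequence ('C' consonant, 'V' vowel, 'S' silence) and groups tokens into (C)V(C) syllables. The token labels and the greedy coda attachment are simplifications; in the paper, ambiguous multi-syllable stretches are handled by the re-segmentation of step (3).

```python
# Minimal FSA sketch for step (2).  Token labeling itself (step 1, from
# pitch/energy/ZCR/PVR) is assumed to have been done already.

def segment_syllables(tokens):
    """Return a list of syllables, each a list of tokens."""
    syllables, current, state = [], [], "START"
    for tok in tokens:
        if tok == "S":                      # silence always closes a syllable
            if current:
                syllables.append(current)
            current, state = [], "START"
        elif state == "START":
            current, state = [tok], ("ONSET" if tok == "C" else "NUCLEUS")
        elif state == "ONSET":              # expect a vowel nucleus next
            if tok == "V":
                current.append(tok); state = "NUCLEUS"
            else:                           # CC: start a new syllable
                syllables.append(current); current, state = [tok], "ONSET"
        elif state == "NUCLEUS":
            if tok == "C":                  # possible coda
                current.append(tok); state = "CODA"
            else:                           # VV: new syllable nucleus
                syllables.append(current); current, state = [tok], "NUCLEUS"
        elif state == "CODA":               # syllable is full; start a new one
            syllables.append(current)
            current, state = [tok], ("ONSET" if tok == "C" else "NUCLEUS")
    if current:
        syllables.append(current)
    return syllables

print(segment_syllables(list("CVSCVCSCV")))
# -> [['C', 'V'], ['C', 'V', 'C'], ['C', 'V']]
```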

The Vowel System of American English and Its Regional Variation (미국 영어 모음 체계의 몇 가지 지역 방언적 차이)

  • Oh, Eun-Jin
    • Speech Sciences
    • /
    • v.13 no.4
    • /
    • pp.69-87
    • /
    • 2006
  • This study aims to describe the vowel system of present-day American English and to discuss some of its phonetic variation due to regional differences. Fifteen speakers of American English from various regions of the United States produced the monophthongs of English. Vowel duration and the frequencies of the first and second formants were measured. The results indicate that the distinction between the vowels [ɔ] and [ɑ] has merged in most parts of the U.S., except for some speakers from the eastern and southeastern parts of the country, resulting in a general loss of the phonemic distinction between the two vowels. This merger can be interpreted as the result of the relatively small functional load of the [ɔ]-[ɑ] contrast and the smaller back-vowel space compared with the front-vowel space. The study also shows that the F2 frequencies of the high back vowel [u] were extremely high in most of the speakers from the eastern region of the U.S., resulting in an overall reduction of their acoustic space for high vowels. From the viewpoint of the Adaptive Dispersion Theory proposed by Liljencrants & Lindblom (1972) and Lindblom (1986), the high back vowel [u] appears to have been fronted in order to satisfy economy of articulatory gesture to some extent without blurring the contrast between [i] and [u] in the high vowel region.
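
One simple way to quantify such a merger from measured formants is the Euclidean distance between the two vowels' mean (F1, F2) positions for each speaker, as sketched below; the numbers and the 100 Hz threshold are purely illustrative, not the study's data or criterion.

```python
import numpy as np

def vowel_distance(f1_a, f2_a, f1_b, f2_b):
    """Euclidean distance (Hz) between two vowels' mean (F1, F2) positions."""
    a = np.array([np.mean(f1_a), np.mean(f2_a)])
    b = np.array([np.mean(f1_b), np.mean(f2_b)])
    return float(np.linalg.norm(a - b))

# Illustrative numbers only (not the paper's measurements): repeated tokens
# of [ɑ] and [ɔ] from one hypothetical speaker.
f1_ah, f2_ah = [730, 710, 745], [1090, 1110, 1100]   # [ɑ]
f1_aw, f2_aw = [740, 725, 735], [1105, 1095, 1120]   # [ɔ]
d = vowel_distance(f1_ah, f2_ah, f1_aw, f2_aw)
print(f"distance = {d:.1f} Hz", "-> likely merged" if d < 100 else "-> distinct")
```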

An Implementation of the Real Time Speech Recognition for the Automatic Switching System (자동 교환 시스템을 위한 실시간 음성 인식 구현)

  • 박익현;이재성;김현아;함정표;유승균;강해익;박성현
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.4
    • /
    • pp.31-36
    • /
    • 2000
  • This paper describes the implementation and evaluation of a speech recognition automatic exchange system. The system provides exchange service based on speech recognition technology to government and public offices, companies, and educational institutions that consist of a large number of members and departments. The recognizer is a speaker-independent, isolated-word, flexible-vocabulary recognizer based on an SCHMM (Semi-Continuous Hidden Markov Model). For real-time operation, it is implemented on a Texas Instruments TMS320C32 DSP. The system's operating terminal, which supports diagnostics of the speech recognition DSP and selection among alternative recognition candidates, makes operation easy. In the experiment, eight speakers pronounced words from a 1,300-word vocabulary related to the automatic exchange system over the wired telephone network, and the recognition system achieved 91.5% word accuracy.

Text-Independent Speaker Identification System Using Speaker Decision Network Based on Delayed Summing (지연누적에 기반한 화자결정회로망이 도입된 구문독립 화자인식시스템)

  • 이종은;최진영
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.8 no.2
    • /
    • pp.82-95
    • /
    • 1998
  • In this paper, we propose a text-independent speaker identification system whose classifier is composed of two parts: one that calculates the degree of likeness of each speech frame to each speaker, and one that selects the most probable speaker over the entire speech duration. The first part is realized using an RBFN that is self-organized through learning, and in the second part the speaker is determined using a combination of MAXNET and delayed summing. We use features from the linear speech production model together with features from fractal geometry. Closed-set speaker identification experiments on a homogeneous group of 13 male speakers show that the proposed techniques can achieve an identification rate of 100% as the number of delays increases.
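
A minimal sketch of the decision stage, assuming the per-frame likeness scores (produced by the RBFN in the paper) are already available as a matrix: delayed summing accumulates the scores over a growing number of frames, and a MAXNET-style winner-take-all picks the speaker.

```python
import numpy as np

def decide_speaker(frame_scores, n_delays=None):
    """Speaker decision by delayed summing followed by a winner-take-all
    (MAXNET-style argmax).

    frame_scores : (n_frames, n_speakers) array of per-frame likeness
                   scores (in the paper these come from the RBFN; here
                   they are simply given as input).
    n_delays     : how many frames to accumulate before deciding
                   (defaults to all frames).
    """
    scores = np.asarray(frame_scores, dtype=float)
    if n_delays is None:
        n_delays = scores.shape[0]
    summed = scores[:n_delays].sum(axis=0)   # delayed summing
    return int(np.argmax(summed))            # winner-take-all

# Toy example: 3 speakers, 200 frames; speaker 1 is slightly favoured on
# average, so accumulating more frames makes the decision more stable.
rng = np.random.default_rng(0)
scores = rng.normal(0.0, 1.0, size=(200, 3))
scores[:, 1] += 0.15
for k in (10, 50, 200):
    print(k, "frames ->", decide_speaker(scores, n_delays=k))
```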
