• Title/Abstract/Keywords: phonetic data

200 search results (processing time: 0.022 s)

Perception Ability of Synthetic Vowels in Cochlear Implanted Children

  • 허명진
    • 대한음성학회지:말소리 / No. 64 / pp. 1-14 / 2007
  • The purpose of this study was to examine how auditory perception changes with formant manipulation in profoundly hearing-impaired children with cochlear implants. The subjects were 10 children after 15 months of experience with the implant; their mean chronological age was 8.4 years (standard deviation 2.9 years). Auditory perception ability was assessed using acoustic-synthetic vowels. Each acoustic-synthetic vowel was produced by combining F1, F2, and F3, yielding 42 synthetic sounds generated with the Speech GUI (Graphic User Interface) program. The data were analyzed with clustering analysis and on-line analytical processing to evaluate perception of the acoustic-synthetic vowels. The results showed that the cochlear implanted children's auditory perception scores were higher for F2 synthetic vowels than for F1 synthetic vowels. The children were also found to perceive differences between vowels in terms of the distance rates between F1 and F2 in specific vowels.


An Acoustic Study of Phonation Types in Vowels Following Consonant Clusters in Korean

  • 박한상
    • 대한음성학회지:말소리 / No. 64 / pp. 53-76 / 2007
  • This study investigates phonation types of Korean obstruents as reflected in the vowels immediately following singletons or geminates in intervocalic position. F0, H1-H2, and spectral tilt were measured from the 20 ms segment at the vowel onset for tokens of /paCa/ and /paCCa/, where the Cs share the same manner and place of articulation. The results showed a remarkable change in the values of F0, H1-H2, and spectral tilt as the preceding obstruents shift from lenis singletons to lenis geminates, which suggests that the spectral characteristics of vowels following lenis geminates do not differ from those of vowels following fortis singletons or geminates. Notably, this study adds data about the spectral characteristics of Korean phonation types.


Speech/Music Discrimination Using Mel-Cepstrum Modulation Energy

  • 김봉완;최대림;이용주
    • 대한음성학회지:말소리 / No. 64 / pp. 89-103 / 2007
  • In this paper, we introduce mel-cepstrum modulation energy (MCME) as a feature for discriminating speech from music. MCME is a mel-cepstrum domain extension of modulation energy (ME): MCME is extracted from the time trajectory of mel-frequency cepstral coefficients, while ME is based on the spectrum. As cepstral coefficients are mutually uncorrelated, we expect MCME to perform better than ME. To find the best modulation frequency for MCME, we perform experiments with modulation frequencies from 4 Hz to 20 Hz. To show the effectiveness of the proposed feature, we compare its discrimination accuracy with the results obtained from ME and the cepstral flux.

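The abstract above does not give the MCME formula, so the following is only a minimal sketch under stated assumptions: the modulation energy of one cepstral coefficient is taken to be the squared magnitude of the Fourier transform of its mean-removed time trajectory at a single modulation frequency, divided by the number of frames, and MCME is taken to be the sum over coefficients. The frame rate of 100 frames/s and the function names `modulation_energy` and `mcme` are hypothetical.

```python
import cmath
import math


def modulation_energy(trajectory, mod_freq_hz, frame_rate_hz=100.0):
    """Energy of one coefficient trajectory at one modulation frequency.

    Assumption (not from the abstract): squared magnitude of the Fourier
    transform of the mean-removed trajectory, divided by the frame count.
    """
    n = len(trajectory)
    mean = sum(trajectory) / n
    acc = 0j
    for t, value in enumerate(trajectory):
        phase = -2.0 * math.pi * mod_freq_hz * t / frame_rate_hz
        acc += (value - mean) * cmath.exp(1j * phase)
    return abs(acc) ** 2 / n


def mcme(cepstral_frames, mod_freq_hz, frame_rate_hz=100.0):
    """Sum the modulation energy over all cepstral coefficient trajectories."""
    num_coeffs = len(cepstral_frames[0])
    total = 0.0
    for c in range(num_coeffs):
        trajectory = [frame[c] for frame in cepstral_frames]
        total += modulation_energy(trajectory, mod_freq_hz, frame_rate_hz)
    return total


# Toy "cepstral" frames: coefficient 0 is modulated at 4 Hz, coefficient 1 at 16 Hz.
frames = [
    [math.sin(2 * math.pi * 4 * t / 100.0),
     0.5 * math.sin(2 * math.pi * 16 * t / 100.0)]
    for t in range(100)
]
energy_4hz = mcme(frames, 4.0)
energy_10hz = mcme(frames, 10.0)
```

On this toy input the energy at 4 Hz is large and the energy at 10 Hz is near zero, which illustrates why the choice of modulation frequency (4 Hz to 20 Hz in the abstract) matters for the feature.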

Effects of Vowel Differences on Laryngeal DDK

  • 한지연;이옥분
    • 대한음성학회지:말소리 / Vol. 68 / pp. 1-15 / 2008
  • This study investigated the vowel effect on laryngeal DDK (L-DDK) in terms of rate, regularity, and range. Thirteen normal speakers participated in this experiment. Speakers were asked to repeat the vowels /a, e, i, o, u/ for vocal fold adduction DDK, and /ha, he, hi, ho, hu/ for vocal fold abduction DDK. Acoustic data were analyzed with Motor Speech Profile. There were 6 parameters: DDKavp and DDKavr for the rate of L-DDK, DDKcvp and DDKjit for the regularity of L-DDK, and DDKavi and DDKcvi for the range of L-DDK. Results of MANOVA and Friedman analysis showed no significant vowel effect on the rate and regularity of L-DDK. MANOVA revealed significant effects of vowels and vocal fold ab/adduction on the range of L-DDK. DDK peak intensity (DDKavi) in vowel /i/ production was lower than in vowels /a, e, o, u/. Variation of DDK peak intensity (DDKcvi) was significantly greater for /ha/ than for /a/ production. The implications of these findings for voice and speech pathology are discussed.


Implementation and Evaluation of an HMM-Based Speech Synthesis System for the Tagalog Language

  • ;김경태;김종진
    • 대한음성학회지:말소리 / Vol. 68 / pp. 49-63 / 2008
  • This paper describes the development and assessment of a hidden Markov model (HMM) based speech synthesis system for Tagalog, the most widely spoken indigenous language of the Philippines. Several aspects of the design process are discussed. To build the synthesizer, a speech database is recorded and phonetically segmented. The constructed speech corpus contains approximately 89 minutes of Tagalog speech organized in 596 spoken utterances. Furthermore, contextual information is determined. The quality of the synthesized speech is assessed by subjective tests with 25 native Tagalog speakers as respondents. Experimental results show that the new system obtains a MOS of 3.29, which indicates that it can produce highly intelligible neutral Tagalog speech with stable quality even when a small amount of speech data is used for HMM training.


Real-Time Implementation of AMR Speech Codec Using TMS320VC5510 DSP

  • 김준;배건성
    • 대한음성학회지:말소리 / No. 65 / pp. 143-152 / 2008
  • This paper focuses on the real-time implementation of an adaptive multi-rate (AMR) speech codec, a standard speech codec of IMT-2000, using the TMS320VC5510. The TMS320VC55x series is a family of 16-bit fixed-point digital signal processors (DSPs) from Texas Instruments (TI) with low power consumption, intended for mobile communications. After analyzing the AMR algorithm and source code as well as the structure and I/O of the TMS320VC55x, we optimize the programs for real-time implementation. The implemented AMR speech codec uses 55.2 kbyte of program memory and 98.3 kbyte of data memory, and it requires 709,878 clocks, i.e. about 3.5 ms, to process a 20 ms frame of speech signal.

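The cycle count and the processing time quoted in the abstract above can be related with a short calculation. This is a sketch only: the 200 MHz clock rate is an assumption inferred from the two figures in the abstract (709,878 clocks in about 3.5 ms), not a value the abstract states.

```python
CYCLES_PER_FRAME = 709_878   # from the abstract
FRAME_DURATION_S = 0.020     # one 20 ms speech frame, from the abstract
CLOCK_HZ = 200e6             # ASSUMPTION: 200 MHz DSP clock, not stated in the abstract

# Time needed to encode and decode one frame at the assumed clock rate.
processing_time_s = CYCLES_PER_FRAME / CLOCK_HZ

# Fraction of each frame period spent on processing; below 1.0 means real time.
real_time_factor = processing_time_s / FRAME_DURATION_S

print(f"processing time per frame: {processing_time_s * 1e3:.2f} ms")
print(f"real-time factor: {real_time_factor:.3f}")
```

Under that assumption the codec uses well under a fifth of each frame period, which is consistent with the abstract's claim of real-time operation.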

Acoustic Qualities of Phonation in Hearing-impaired Male Adults

  • 서경희
    • 대한음성학회지:말소리 / No. 65 / pp. 37-49 / 2008
  • The purposes of this experiment were to compare and analyze several voice parameters of hearing-impaired male adults and to provide basic data for speech intervention for the hearing impaired. Four sustained vowels (/a/, /i/, /ə/, /u/) were analyzed for fundamental frequency (F0), jitter percent, shimmer percent, and noise-to-harmonic ratio (NHR) in deaf young male adults who use sign language (N=5, aged 16-20) and in normal-hearing young male adults (N=10, aged 18-20), using MDVP (Multi-Dimensional Voice Program) in CSL. F0, jitter, and shimmer in the deaf group were significantly higher than those in the normal-hearing group. The average F0 was 151 Hz, which was lower than the results of previous studies, and there were no significant differences among the sustained vowels. In both groups, the values of the voice parameters were stable on /a/ or /ə/, that is, close to the standard scores.


Performance of Vocal Tract Area Estimation from Deaf and Normal Children's Speech

  • 김세환;김남;권오욱
    • 대한음성학회지:말소리 / No. 56 / pp. 159-172 / 2005
  • This paper analyzes the vocal tract area estimation algorithm used as part of a speech analysis program that helps deaf children correct their pronunciation by comparing their vocal tract shape with normal children's. Assuming that the vocal tract is a concatenation of cylindrical tubes with different cross sections, we compute the relative vocal tract area of each tube using the reflection coefficients obtained from linear predictive coding. Then, we obtain the absolute vocal tract area by computing the height of the lip opening with a formula modified for children's speech. Using speech data for five Korean vowels (/a/, /e/, /i/, /o/, and /u/), we investigate the effects of the sampling frequency, frame size, and model order on the estimated vocal tract shape. We compare the vocal tract shapes obtained from deaf and normal children's speech.

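The relative-area step described in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it uses the textbook Levinson-Durbin recursion to get reflection coefficients from the autocorrelation of a frame, then applies the lossless-tube relation between adjacent areas. The sign convention and tube ordering in `relative_areas` are assumptions (conventions differ between references), the toy signal is synthetic, and the absolute-area step based on lip opening is omitted because the abstract does not give its formula.

```python
import math


def autocorrelation(signal, max_lag):
    """Autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(signal)
    return [sum(signal[t] * signal[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]


def reflection_coefficients(r, order):
    """Levinson-Durbin recursion; returns k_1..k_order."""
    a = [1.0] + [0.0] * order   # prediction-error filter coefficients
    error = r[0]                # prediction error energy
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error
        ks.append(k)
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        error *= 1.0 - k * k
    return ks


def relative_areas(ks, first_area=1.0):
    """Relative tube areas; ASSUMED convention A[i+1] = A[i] * (1 - k) / (1 + k)."""
    areas = [first_area]
    for k in ks:
        areas.append(areas[-1] * (1.0 - k) / (1.0 + k))
    return areas


# Synthetic frame: two damped resonances standing in for a vowel segment.
frame = [math.exp(-0.01 * t) * math.sin(0.3 * t)
         + 0.5 * math.exp(-0.02 * t) * math.sin(1.1 * t)
         for t in range(200)]

MODEL_ORDER = 6  # hypothetical choice; the paper studies the effect of this value
ks = reflection_coefficients(autocorrelation(frame, MODEL_ORDER), MODEL_ORDER)
areas = relative_areas(ks)
```

The model order fixes the number of tube sections, which is one reason the estimated shape depends on it, as the abstract investigates.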

Wideband Speech Reconstruction Using Modular Neural Networks

  • 우동헌;고참한;강현민;정진희;김유신;김형순
    • 대한음성학회지:말소리 / No. 48 / pp. 93-105 / 2003
  • Since the telephone channel has band-limited frequency characteristics, speech signals transmitted over it show degraded quality. In this paper, we propose an algorithm that uses neural networks to reconstruct wideband speech from its narrowband version. Although a single neural network is a good tool for direct mapping, it is difficult to train on vast and complicated data. To alleviate this problem, we modularize the neural networks based on appropriate clustering of the acoustic space. We also introduce fuzzy computing to compensate for probable misclassification at the cluster boundaries. In our simulation, the proposed algorithm showed improved performance over the single neural network and the conventional codebook mapping method in both objective and subjective evaluations.


Speaker Identification Using Greedy Kernel PCA

  • 김민석;양일호;유하진
    • 대한음성학회지:말소리 / No. 66 / pp. 105-116 / 2008
  • In this research, we propose a speaker identification system using a kernel method, which is expected to model the non-linearity of speech features well. We have previously used principal component analysis (PCA) successfully and have extended it to kernel PCA, which is used for many pattern recognition tasks such as face recognition. However, we cannot use kernel PCA for speaker identification directly because the storage required for the kernel matrix grows quadratically, and the computational cost grows linearly (computing eigenvectors of an $l \times l$ matrix) with the number of training vectors $l$. Therefore, we use greedy kernel PCA, which can approximate kernel PCA with a small representation error. In the experiments, we compare the accuracy of greedy kernel PCA with the baseline Gaussian mixture models using MFCCs and PCA. The results with limited enrollment data show that greedy kernel PCA outperforms the conventional methods.

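The storage problem mentioned in the abstract above comes from the kernel matrix having one entry per pair of training vectors. The sketch below only builds and centers that matrix to make the quadratic growth concrete; it does not implement greedy kernel PCA or the eigendecomposition. The Gaussian kernel, the `gamma` value, and the toy two-dimensional vectors are assumptions chosen for illustration.

```python
import math


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two vectors; gamma is a hypothetical setting."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def kernel_matrix(vectors, gamma=0.5):
    """Full l x l kernel matrix over l training vectors."""
    return [[gaussian_kernel(x, y, gamma) for y in vectors] for x in vectors]


def center_kernel_matrix(k):
    """Center the kernel matrix in feature space (the usual step before kernel PCA)."""
    l = len(k)
    row_means = [sum(row) / l for row in k]
    total_mean = sum(row_means) / l
    return [[k[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(l)]
            for i in range(l)]


def toy_vectors(l):
    """Deterministic stand-ins for speech feature vectors."""
    return [[math.sin(0.7 * i), math.cos(1.3 * i)] for i in range(l)]


small = kernel_matrix(toy_vectors(10))
large = kernel_matrix(toy_vectors(20))

# Doubling the number of training vectors quadruples the number of stored entries.
entries_small = sum(len(row) for row in small)
entries_large = sum(len(row) for row in large)

centered = center_kernel_matrix(small)
```

With realistic numbers of speech frames the full matrix quickly becomes impractical to store, which is the motivation the abstract gives for approximating kernel PCA with a greedy variant.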