• Title/Summary/Keyword: speech parameter

The Prosodic Changes of Korean English Learners in Robot Assisted Learning (로봇보조언어교육을 통한 초등 영어 학습자의 운율 변화)

  • In, Jiyoung;Han, JeongHye
    • Journal of The Korean Association of Information Education / v.20 no.4 / pp.323-332 / 2016
  • In RALL (Robot Assisted Language Learning), a robot's recognition and diagnosis of the learner's pronunciation, together with the robot's own speech, are the most important interactions. This study verifies the effectiveness of robot TTS (Text-to-Speech) technology in helping Korean learners of English acquire a native-like accent by correcting the prosodic errors they commonly make. Two prosodic variables, F0 range and speaking rate, were measured for 4th-grade child learners and analyzed for changes in accent. From an acoustic-phonetic viewpoint, we compared whether a robot with currently available TTS technology was effective for 4th graders and 1st graders who had not received formal English instruction from a native speaker. Both groups, after repeating the TTS output in RALL, responded in speaking rate rather than in F0 range.
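The two prosodic variables named in this abstract can be illustrated with a minimal sketch. The function names, the semitone scale, the convention that 0 marks an unvoiced frame, and the sample values are all assumptions for illustration; the abstract does not state how the authors computed either measure.

```python
import math

def f0_range_semitones(f0_hz):
    """F0 range in semitones between the lowest and highest voiced F0 values."""
    voiced = [f for f in f0_hz if f > 0]  # assumption: 0 marks an unvoiced frame
    return 12 * math.log2(max(voiced) / min(voiced))

def speaking_rate(num_syllables, duration_s):
    """Speaking rate in syllables per second."""
    return num_syllables / duration_s

# Hypothetical F0 track in Hz and a hypothetical 12-syllable, 4-second utterance.
f0_track = [0.0, 200.0, 220.0, 400.0, 0.0, 250.0]
print(f0_range_semitones(f0_track))  # 200 Hz to 400 Hz spans one octave
print(speaking_rate(12, 4.0))
```

A wider F0 range indicates more pitch movement, while the rate captures tempo; the abstract reports that learners changed in the latter rather than the former.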

The Analysis of Voice after Vertical Partial Laryngectomy with Mucosal Flap and Fat Graft Reconstruction (수직후두부분절제술 및 점막 피판과 지방 이식을 통한 성대 재건술 후의 음성분석)

  • Chu, Hyung-Ro;Choi, In-Ja;Kim, Jin-Hwan;Ahn, Hwoe-Young;Rho, Young-Soo
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.18 no.2 / pp.134-137 / 2007
  • Background and Objectives: The goals of laryngeal reconstruction have been prevention of aspiration, production of a functional voice, and maintenance of an adequate airway for decannulation. It is generally believed that reconstruction of the glottic region after vertical partial laryngectomy (VPL) can improve laryngeal function. The objective of this study is to evaluate voice function after VPL with mucosal flap and fat graft reconstruction. Materials and Methods: From 1994 to 2006, 13 patients were treated with VPL with mucosal flap and fat graft reconstruction. Voice characteristics and acoustic and aerodynamic parameters were measured in these 13 patients. Acoustic analysis was carried out using the Computerized Speech Lab (CSL) and aerodynamic analysis was carried out using the Aerophon II, 3 months and 12 months after surgery. Results: The GRBAS scale, jitter, shimmer, and NHR improved over time after surgery. However, maximum phonation time was shortened after surgery, and there was no significant difference in mean flow rate between before and after surgery. Conclusion: Voice function after VPL with mucosal flap and fat graft reconstruction was satisfactory. This can be an excellent reconstruction method after vertical partial laryngectomy.

Word Verification using Similar Word Information and State-Weights of HMM using Genetic Algorithm (유사단어 정보와 유전자 알고리듬을 이용한 HMM의 상태하중값을 사용한 단어의 검증)

  • Kim, Gwang-Tae;Baek, Chang-Heum;Hong, Jae-Geun
    • Journal of the Institute of Electronics Engineers of Korea SP / v.38 no.1 / pp.97-103 / 2001
  • The Hidden Markov Model (HMM) is the most widely used method in speech recognition. In general, HMM parameters are trained to maximize the likelihood (ML) of the training data. Although the ML method performs well, it does not take discrimination from other words into account. To address this problem, a word verification method is proposed that re-recognizes the recognized word and its similar word using a discriminative function of the two words. To find the similar word, the probability of the other words given the HMM is calculated, and the word with the highest probability is selected as the similar word of the model. To achieve discrimination for each word, a weight for each state is appended to the HMM parameters. The weights are calculated by a genetic algorithm. The verifier improved the discrimination of each word and reduced the errors caused by similar words. As a result of verification, the total error is reduced by about 22%.
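A minimal sketch of the verification idea described here, under stated assumptions: each word's score is a weighted sum of per-state log-likelihoods, and the discriminative function is the difference between the recognized word's score and the similar word's score. The function names, the linear weighting, the threshold, and all numbers are hypothetical; the abstract does not give the exact discriminative function, and the genetic algorithm that would tune the weights is not implemented.

```python
def weighted_score(state_loglik, state_weights):
    """Weighted sum of per-state log-likelihoods for one word model."""
    return sum(w * ll for w, ll in zip(state_weights, state_loglik))

def verify(recognized_ll, similar_ll, w_recognized, w_similar, threshold=0.0):
    """Accept the recognized word if it beats its similar word by more than threshold."""
    d = weighted_score(recognized_ll, w_recognized) - weighted_score(similar_ll, w_similar)
    return d > threshold, d

# Hypothetical per-state log-likelihoods along the best path of each model.
recognized_ll = [-10.0, -12.0, -8.0]
similar_ll = [-11.0, -11.0, -9.0]

# Unit weights reproduce the plain ML comparison.
print(verify(recognized_ll, similar_ll, [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]))

# Emphasizing the middle state (hand-picked here, GA-tuned in the paper) removes the margin.
print(verify(recognized_ll, similar_ll, [1.0, 2.0, 1.0], [1.0, 2.0, 1.0]))
```

The second call shows how state weights can change the decision when the two words differ mainly in one state.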

The Effect of Artecoll Injection for the Patients with Unilateral Vocal Cord Paralysis (일측성 성대마비 환자에서 Artecoll을 이용한 성대주입술의 효과 및 안전성)

  • Oh Jae-Won;Lee Seung-Won;Kim Min-Beom;Yun Young-Sun;Kim Kwan-Min;Son Young-Ik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.16 no.2 / pp.129-134 / 2005
  • Background and Objectives: Artecoll(R) is an injectable soft tissue filler, which is a suspension of polymethylmethacrylate microspheres in 3.5% bovine collagen solution. The authors aimed to determine the clinical usefulness of Artecoll(R) as an injection material into the vocal fold to correct the glottal insufficiency caused by unilateral vocal cord paralysis. Materials and Methods: Forty-one consecutive patients with unilateral vocal cord paralysis received percutaneous Artecoll injections under local anesthesia. Acoustic, aerodynamic, and stroboscopic analyses were performed prospectively before injection and at 1 week and 3 months after injection. Perceptual GRBAS grading by speech language pathologists and subjective ratings of hoarseness and aspiration by the patients were also obtained. Results: The aerodynamic parameter (maximal phonation time) was significantly improved after the injection (p<0.05). Acoustic parameters (jitter and shimmer) were improved at the 3-month follow-up. GRBAS grading and the patients' own subjective scaling of hoarseness and aspiration also showed significant improvement (p<0.05). No significant early or delayed side effects were observed. Conclusion: Vocal fold injection with Artecoll is a convenient, safe, and useful method of temporarily correcting glottal insufficiency. Further long-term follow-up studies are needed to establish the usefulness and safety of Artecoll injection laryngoplasty.

Voice Personality Transformation Using a Probabilistic Method (확률적 방법을 이용한 음성 개성 변환)

  • Lee Ki-Seung
    • The Journal of the Acoustical Society of Korea / v.24 no.3 / pp.150-159 / 2005
  • This paper addresses a voice personality transformation algorithm that makes one person's voice sound like another person's voice. In the proposed method, a person's voice is represented by LPC cepstrum, pitch period, and speaking rate, and appropriate transformation rules are constructed for each parameter. A Gaussian Mixture Model (GMM) is used to model one speaker's LPC cepstra, and conditional probability is used to model the relationship between the two speakers' LPC cepstra. To obtain the parameters of each probabilistic model, a Maximum Likelihood (ML) estimation method is employed. The transformed LPC cepstra are obtained by using a Minimum Mean Square Error (MMSE) criterion. Pitch period and speaking rate are used as the parameters for prosody transformation, which is implemented by using the ratio of the average values. The proposed method outperforms the previous VQ-based method in objective measures, including average cepstrum distance reduction ratio and likelihood increasing ratio. In subjective tests, we obtained almost the same correct identification ratio as the previous method, and we also confirmed that high-quality transformed speech is obtained, which is due to the spectral contours evolving smoothly over time.
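The prosody part of this method, scaling by the ratio of average values, can be sketched as follows. The function name, the use of an arithmetic mean, and the sample pitch periods are assumptions for illustration; the GMM-based spectral conversion is not shown.

```python
from statistics import mean

def ratio_transform(values, source_training, target_training):
    """Scale source-speaker values by the ratio of target to source averages."""
    ratio = mean(target_training) / mean(source_training)
    return [v * ratio for v in values]

# Hypothetical pitch periods in milliseconds collected from each speaker.
source_training = [8.0, 10.0, 12.0]  # average 10 ms
target_training = [4.0, 5.0, 6.0]    # average 5 ms

# Pitch periods of a new source utterance, mapped toward the target speaker.
print(ratio_transform([8.0, 10.0], source_training, target_training))
```

The same function could be applied to speaking rate, since the abstract describes both prosodic parameters as being transformed through average-value ratios.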

A Study on Frequency-Time Plane Analysis of Wavelet (웨이브렛의 주파수-시간 평면 해석에 관한 연구)

  • Bae, Sang-Bum;Ryu, Ji-Goo;Kim, Nam-Ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.451-454 / 2005
  • Recently, many signal analysis methods have been proposed; representative ones are the Fourier transform and the wavelet transform. The Fourier transform represents a signal as a combination of cosines and sines in the frequency domain. However, it provides no information about when a particular frequency occurs in the signal and depends only on the signal's global features. To address this, the wavelet transform, which is capable of multiresolution analysis, has been applied to many fields such as speech processing, image processing, and computer vision. The wavelet transform uses a window that changes with the scale parameter and thus provides time-frequency localization. In this paper, we propose a new approach using a wavelet of cosine and sine type and analyze the features of a signal at a limited point of the frequency-time plane.
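The time localization discussed here can be illustrated with a small sketch: a cosine and sine pair under a Gaussian window is correlated with the signal around one chosen time, so a tone that appears only late in the signal produces a response only at the late time point. The Gaussian window, the function name, and all parameter values are assumptions; the abstract does not specify the wavelet the authors constructed.

```python
import math

def local_coefficient(signal, fs, freq, center, width):
    """Magnitude of the correlation between the signal and a Gaussian-windowed
    cosine/sine pair at frequency `freq` (Hz), centered at `center` seconds."""
    re = im = 0.0
    for n, x in enumerate(signal):
        t = n / fs - center
        g = math.exp(-0.5 * (t / width) ** 2)  # Gaussian window (assumption)
        re += x * g * math.cos(2 * math.pi * freq * t)
        im += x * g * math.sin(2 * math.pi * freq * t)
    return math.hypot(re, im)

fs = 1000  # samples per second
# A 1-second signal: silence in the first half, a 50 Hz tone in the second half.
signal = [0.0] * 500 + [math.sin(2 * math.pi * 50 * n / fs) for n in range(500, 1000)]

early = local_coefficient(signal, fs, 50, 0.25, 0.02)
late = local_coefficient(signal, fs, 50, 0.75, 0.02)
print(early < late)  # the tone is detected only around the late time point
```

A plain Fourier magnitude at 50 Hz would be the same wherever the tone sat within the signal, which is the limitation the abstract points out.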

Comparative evaluation of nasal and alveolar changes in complete unilateral cleft lip and palate patients using intraoral and extraoral nasoalveolar molding techniques: randomized controlled trial

  • Kalaskar, Ritesh;Bhaje, Priyanka;Sharma, Priyanka;Balasubramanian, Shruti;Ninawe, Nupur;Ijalkar, Rajesh
    • Journal of the Korean Association of Oral and Maxillofacial Surgeons / v.47 no.4 / pp.257-268 / 2021
  • Objectives: Cleft lip and palate is a common congenital anomaly that impairs the aesthetics, speech, hearing, and psychological and social life of an individual. To achieve good aesthetic outcomes, presurgical nasoalveolar molding (NAM) has become important. Currently, the intraoral NAM technique is widely practiced. Numerous modifications have been made to intraoral NAM techniques, but the original problem of compliance leading to discontinuation of treatment remains unsolved. Therefore, the present study compared an extraoral NAM technique with the intraoral NAM technique. Materials and Methods: Twenty infants with complete unilateral cleft lip and palate were included and divided into two equal groups. Group A received the intraoral NAM technique, and Group B received the extraoral NAM technique. Pre- and postoperative extraoral and intraoral measurements were recorded. Results: Groups A and B did not differ significantly in any extraoral or intraoral parameter. Conclusion: The extraoral NAM technique is as effective as the intraoral NAM technique in achieving significant nasal and alveolar changes in complete unilateral cleft lip and palate patients. Additionally, it reduces the need for frequent hospital visits for activation and the stress associated with the insertion and removal of the intraoral NAM plate, thereby improving compliance.

Noise Statistics Estimation Using Target-to-Noise Contribution Ratio for Parameterized Multichannel Wiener Filter (변수내장형 다채널 위너필터를 위한 목적신호대잡음 기여비를 이용한 잡음추정기법)

  • Hong, Jungpyo
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.12 / pp.1926-1933 / 2022
  • The parameterized multichannel Wiener filter (PMWF) is a linear filter that can control the trade-off between residual noise and signal distortion using an embedded parameter. To apply the PMWF to noisy inputs, accurate noise estimation is important, and multichannel minima-controlled recursive averaging (MMCRA) is widely used. However, with the MMCRA, the accuracy of noise estimation decreases when directional interference is present in the array inputs. Consequently, the performance of the PMWF is degraded. Therefore, we propose a noise power spectral density (PSD) estimation method for the PMWF in this paper. The proposed method is based on a consecutive process of eigenvalue decomposition of the noisy input PSD, estimation of the target component contribution using directional information, and exponential weighting for improved estimation of the target contribution. For evaluation, four objective measures were compared against the MMCRA, and the results verify that the PMWF with the proposed noise estimation method improves performance in environments where directional interference exists.
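For orientation, one commonly cited form of the PMWF is given below. It is offered as a hedged illustration rather than the exact expression used in this paper, and the symbols are assumptions: $\mathbf{\Phi}_{x}$ is the PSD matrix of the target speech, $\mathbf{\Phi}_{v}$ is the noise PSD matrix that the estimator must supply, $\mathbf{u}_{1}$ selects the reference microphone, and $\beta \ge 0$ is the embedded trade-off parameter.

```latex
\mathbf{h}_{\beta}
  = \frac{\mathbf{\Phi}_{v}^{-1}\,\mathbf{\Phi}_{x}}
         {\beta + \operatorname{tr}\!\left(\mathbf{\Phi}_{v}^{-1}\,\mathbf{\Phi}_{x}\right)}\,
    \mathbf{u}_{1}
```

In this form, $\beta = 0$ gives the distortionless (MVDR) solution and $\beta = 1$ gives the multichannel Wiener filter; larger $\beta$ removes more noise at the cost of more speech distortion. Because $\mathbf{\Phi}_{v}$ enters through its inverse, errors in the noise PSD estimate propagate directly into the filter, which is why the estimation accuracy discussed in the abstract matters.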

Performance Improvement of Speaker Recognition by MCE-based Score Combination of Multiple Feature Parameters (MCE기반의 다중 특징 파라미터 스코어의 결합을 통한 화자인식 성능 향상)

  • Kang, Ji Hoon;Kim, Bo Ram;Kim, Kyu Young;Lee, Sang Hoon
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.6 / pp.679-686 / 2020
  • In this paper, an enhanced feature extraction method for vocal source signals and a score combination method using MCE-based weight estimation for the scores of multiple feature vectors are proposed to improve the performance of speaker recognition systems. The proposed feature vector is composed of perceptual linear predictive cepstral coefficients, skewness, and kurtosis extracted from lowpass-filtered glottal flow signals; the filtering eliminates the flat spectrum region, which carries no meaningful information. The proposed feature was used to improve the conventional speaker recognition system, which uses mel-frequency cepstral coefficients and perceptual linear predictive cepstral coefficients extracted from the speech signals together with Gaussian mixture models. In addition, to increase the reliability of the estimated scores, instead of estimating the weight using the probability distribution of the conventional score, the scores evaluated with the conventional vocal tract features and with the proposed feature are fused by the MCE-based score combination method to find the optimal speaker. The experimental results showed that the proposed feature vectors contain valid information for recognizing the speaker. In addition, when speaker recognition is performed by combining the MCE-based multiple feature parameter scores, the recognition system outperformed the conventional one, particularly when the number of Gaussian mixtures is low.
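The score fusion step can be sketched as a weighted sum of per-feature scores for each enrolled speaker. This is a minimal illustration under stated assumptions: the feature names, scores, and weights are hypothetical and hand-picked, and the MCE training that the paper uses to estimate the weights is not implemented.

```python
def combine_scores(scores_by_feature, weights):
    """Weighted sum of per-feature scores for each speaker."""
    speakers = next(iter(scores_by_feature.values())).keys()
    return {
        spk: sum(weights[feat] * scores_by_feature[feat][spk] for feat in weights)
        for spk in speakers
    }

def best_speaker(combined):
    """Speaker with the highest combined score."""
    return max(combined, key=combined.get)

# Hypothetical log-likelihood scores from two feature streams for speakers A and B.
scores = {
    "vocal_tract": {"A": -41.0, "B": -40.0},    # alone, this stream prefers B
    "glottal_source": {"A": -30.0, "B": -35.0},  # alone, this stream prefers A
}
weights = {"vocal_tract": 0.5, "glottal_source": 0.5}  # hand-picked, not MCE-trained

combined = combine_scores(scores, weights)
print(combined, best_speaker(combined))
```

With these numbers the fused decision differs from the vocal tract stream alone, which is the kind of correction the combination is meant to provide.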

Changes in Acoustic Parameters According to Intensity Increase in Voice Assessment (음성질환자의 음성검사 시 강도 증가에 따른 음향학적 지표의 변화)

  • Nam, Do-Hyun;Rheem, Sung-Sue;Yun, Bo-Ram;Cho, Sun-A;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.22 no.2 / pp.143-150 / 2011
  • Background and Objectives: Acoustic analysis is widely used clinically as a tool for voice assessment before and after surgery or voice treatment. However, in clinical situations, acoustic parameters vary according to how the assessment is made. Thus, with voice-disease patients as subjects, we investigated what influence an increase in intensity exerts on acoustic parameters and how to reduce the variation caused by the assessment method. Material and Method: At the voice clinic of the department of otorhinolaryngology in Gangnam Severance Hospital, with 30 female voice-disease patients (40.6 years old on average) and 23 male voice-disease patients (40.1 years old on average) as subjects, using the Dr Speech vocal-assessment program, we statistically tested the significance of the difference in each acoustic parameter between the "Ah" vowel produced with a normal voice and the "Ah" vowel produced with a loud voice. Results: Acoustic parameters that showed a statistically significant difference with increased intensity were Jitter, SD F0, and NNE for females, and Jitter, SD F0, HNR, SNR, and NNE for males. Voice quality estimates showed a statistically significant difference with increased intensity in female hoarse voice, female breathy voice, and male breathy voice. Conclusion: In this research, acoustic analysis, which is generally used for voice assessment before and after surgery or voice treatment, showed a tendency for acoustic parameters to improve with increased intensity, except in cases where the voice disease was severe. Thus, to raise the reliability of voice assessment, a range of intensity needs to be specified. This should be a topic for future research.
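Jitter and shimmer, which recur across the voice studies in this list, are cycle-to-cycle perturbation measures. The sketch below uses one common "local" definition, the mean absolute difference between consecutive cycles divided by the mean value, expressed in percent. The function name and sample values are assumptions, and analysis programs such as Dr Speech or CSL may use different formulas.

```python
from statistics import mean

def local_perturbation_percent(values):
    """Mean absolute difference between consecutive cycles, relative to the mean, in percent."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)

# Hypothetical glottal cycle measurements from a sustained "Ah" vowel.
periods_ms = [5.0, 5.1, 4.9, 5.0]        # cycle lengths, used for jitter
peak_amplitudes = [1.0, 0.9, 1.1, 1.0]   # cycle peak amplitudes, used for shimmer

jitter = local_perturbation_percent(periods_ms)
shimmer = local_perturbation_percent(peak_amplitudes)
print(round(jitter, 2), round(shimmer, 2))
```

Lower values indicate a steadier voice, which is the sense in which the abstracts describe these parameters as improving.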
