• Title/Summary/Keyword: Phonetics

Monophthong Analysis on a Large-scale Speech Corpus of Read-Style Korean (한국어 대용량발화말뭉치의 단모음분석)

  • Yoon, Tae-Jin;Kang, Yoonjung
    • Phonetics and Speech Sciences / v.6 no.3 / pp.139-145 / 2014
  • The paper describes methods for conducting vowel analysis on a large-scale corpus with the aid of forced alignment and optimal formant ceiling methods. The 'Read Style Corpus of Standard Korean' is used to build the forced alignment system, and a subset of the corpus is used for processing and extracting features for vowel analysis based on the optimal formant ceiling. The results of the vowel analysis are reliable and comparable to those obtained using traditional analytical methods. The findings indicate that the methods adopted can be extended to more fine-grained analysis without time-consuming manual labeling and without losing accuracy or reliability.

Frequency Bin Alignment Using Covariance of Power Ratio of Separated Signals in Multi-channel FD-ICA (다채널 주파수영역 독립성분분석에서 분리된 신호 전력비의 공분산을 이용한 주파수 빈 정렬)

  • Quan, Xingri;Bae, Keunsung
    • Phonetics and Speech Sciences / v.6 no.3 / pp.149-153 / 2014
  • In frequency-domain ICA (FD-ICA), the frequency bin permutation problem degrades the quality of the separated signals. In this paper, we propose a new algorithm that solves the frequency bin permutation problem using the covariance of the power ratio of separated signals in multi-channel FD-ICA. It exploits the continuity of the spectrum of speech signals, using the power ratio of adjacent frequency bins to check whether a frequency bin permutation has occurred in the separated signal. Experimental results show that the proposed method can fix the frequency bin permutation problem in multi-channel FD-ICA.
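
The idea of checking adjacent bins can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes only two separated sources, takes per-frame power values as input, and uses a simple keep-or-swap decision based on the covariance of power ratios between neighbouring bins. The function names (`power_ratios`, `covariance`, `align_bins`) and the data layout `bins[k][i][t]` are hypothetical.

```python
def power_ratios(powers):
    """powers[i][t] is the power of separated source i at frame t in one bin.
    Returns each source's share of the total power per frame."""
    n_frames = len(powers[0])
    totals = [sum(src[t] for src in powers) for t in range(n_frames)]
    return [
        [src[t] / totals[t] if totals[t] > 0 else 0.0 for t in range(n_frames)]
        for src in powers
    ]


def covariance(x, y):
    """Population covariance of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def align_bins(bins):
    """bins[k][i][t]: power in frequency bin k, source i, frame t (two sources).
    Walks upward from bin 0 and swaps the two outputs of a bin whenever the
    swapped order covaries better with the already-aligned bin below it.
    Returns (aligned_bins, swapped_flags)."""
    aligned = [bins[0]]
    swapped = [False]
    for k in range(1, len(bins)):
        prev = power_ratios(aligned[-1])
        cur = power_ratios(bins[k])
        keep_score = covariance(cur[0], prev[0]) + covariance(cur[1], prev[1])
        swap_score = covariance(cur[1], prev[0]) + covariance(cur[0], prev[1])
        if swap_score > keep_score:
            aligned.append([bins[k][1], bins[k][0]])
            swapped.append(True)
        else:
            aligned.append(bins[k])
            swapped.append(False)
    return aligned, swapped


# Toy data (assumed): two sources with alternating activity, middle bin permuted.
src_a = [4.0, 1.0, 4.0, 1.0]
src_b = [1.0, 4.0, 1.0, 4.0]
aligned, swapped = align_bins([[src_a, src_b], [src_b, src_a], [src_a, src_b]])
print(swapped)
```

With more than two sources, the keep-or-swap test would have to be replaced by a search over permutations, which this sketch does not attempt.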

Production of alveolar flaps in American English by native Korean speakers (한국어 모국어 화자의 미국 영어 치경 탄설음 조음)

  • Oh, Eunjin
    • Phonetics and Speech Sciences / v.8 no.3 / pp.21-29 / 2016
  • This study examined how native Korean speakers realize the acoustic characteristics of /d, t/ flaps in American English. Fourteen subjects, who had lived in foreign countries for less than one year, read words containing the alveolar stops in flapping environments. /d/ (91%) was flapped more frequently than /t/ (42%). The closure durations for /d/ flaps were significantly longer than those for /t/ flaps, while the durations of the preceding vowels did not differ significantly between /d/ and /t/ flaps. Female learners demonstrated a higher percentage of /t/ flapping than their male counterparts. Differences in flap patterns were observed among individual learners.

Development of articulatory estimation model using deep neural network (심층신경망을 이용한 조음 예측 모형 개발)

  • You, Heejo;Yang, Hyungwon;Kang, Jaekoo;Cho, Youngsun;Hwang, Sung Hah;Hong, Yeonjung;Cho, Yejin;Kim, Seohyun;Nam, Hosung
    • Phonetics and Speech Sciences / v.8 no.3 / pp.31-38 / 2016
  • Despite its importance, speech inversion (acoustic-to-articulatory mapping) is not a trivial problem because of its highly non-linear and non-unique nature. This study aimed to investigate the performance of a Deep Neural Network (DNN) compared to that of a traditional Artificial Neural Network (ANN) in addressing the problem. The Wisconsin X-ray Microbeam Database was employed, with the acoustic signal as the input and the articulatory pellet information as the output of the models. Results showed that the performance of the ANN deteriorated as the number of hidden layers increased. In contrast, the DNN showed lower and more stable RMS even with up to 10 hidden layers, suggesting that a DNN is capable of learning the acoustic-articulatory inversion mapping more efficiently than an ANN.

Aerodynamic features in patients with vocal polyps before & after laryngomicrosurgery (성대용종 환자의 후두미세수술 전후 공기역학 변수 변화)

  • Kang, Young Ae;Chang, Jae Won;Koo, Bon Seok
    • Phonetics and Speech Sciences / v.8 no.3 / pp.39-49 / 2016
  • The present study examined the change in aerodynamic features after laryngomicrosurgery in patients with vocal polyps. Aerodynamic evaluation was performed in thirty-nine patients (15 males and 24 females) one week before surgery and four weeks after surgery. Evaluation protocols for vital capacity, maximum sustained phonation (MXPH), and voicing efficiency (VOFT) were used to collect 29 phonatory aerodynamic measures, with patients phonating at a comfortable pitch and loudness. Statistically significant changes were found for phonation time and airflow values in the MXPH protocol, and for airflow values, subglottal pressure values, and acoustic resistance values in the VOFT protocol. Although phonation time increased in both male and female patients, gender-dependent changes were found in airflow measurements. Men's phonation time increased with no difference in airflow rate, whereas women's phonation time increased with a decreased airflow rate and lower subglottal pressure. The changes in aerodynamic features may be affected by women's self-perceived change in vocal attitude, namely a reduced sense of vocal effort after surgery.

Effects of stuttering severity on articulation rate in fluent and dysfluent utterances of preschool children who stutter (취학 전 말더듬 아동의 말더듬 중증도에 따른 발화 형태 별 조음속도 비교)

  • Chon, HeeCheong;Lee, SooBok
    • Phonetics and Speech Sciences / v.8 no.3 / pp.79-90 / 2016
  • The purpose of this study was to investigate the effects of stuttering severity on articulation rate measured from different types of utterances in preschool children who stutter. Participants were 40 boys who stutter (CWS) and 10 age-matched boys who do not stutter (CWNS). The CWS were sub-grouped by stuttering severity: 15 mild, 13 moderate, and 12 severe. Utterances were categorized as "overall utterance", comprising all utterances the children spoke, and "fluent utterance", comprising those without any disfluencies. For the CWS, utterances containing abnormal disfluencies were categorized as "SLD utterance". The results revealed no significant difference among groups in any type of utterance. There were significant positive correlations in articulation rate between utterance types. Stuttering severity was not a factor characterizing the articulation rate of any type of utterance. The findings also suggest that articulation rate may not predict speech motor control ability in preschool CWS.

Korean stop pronunciation and current sound change: Focused on VOT and f0 in different pronunciation types (한국어 폐쇄음 발음과 최근의 발음 변이: 발화 형태별 VOT와 f0를 중심으로)

  • Kim, Ji-Eun
    • Phonetics and Speech Sciences / v.9 no.3 / pp.41-47 / 2017
  • The purpose of this study is to examine how speakers use VOT and f0 to distinguish tense, lax, and aspirated stops in isolated sentence reading and paragraph reading. To this end, 20 males aged 20-25 were asked to read (1) isolated sentences, (2) information-oriented texts, and (3) emotionally expressive texts, and the VOT and f0 of the stops were then measured. The main results are as follows. In isolated sentence reading, lax and aspirated stops were distinguished by both VOT and f0, but when the Korean men read the texts, VOT was not a cue for distinguishing lax from aspirated stops. In general, the VOT differences between lax and aspirated stops were smaller for information-oriented and emotionally expressive texts than for isolated sentence reading. In paragraph reading, which induces more natural utterances, the dependence on f0 is greater for the distinction between lax and aspirated stops.

Correlation analysis of linguistic factors in non-native Korean speech and proficiency evaluation (비원어민 한국어 말하기 숙련도 평가와 평가항목의 상관관계)

  • Yang, Seung Hee;Chung, Minhwa
    • Phonetics and Speech Sciences / v.9 no.3 / pp.49-56 / 2017
  • Much research attention has been directed to identifying how native speakers perceive non-native speakers' oral proficiency. To investigate the generalizability of previous findings, this study examined segmental, phonological, accentual, and temporal correlates of native speakers' evaluations of the proficiency of L2 Korean speech produced by learners of various levels and nationalities. Our experimental results show that proficiency ratings by native speakers correlate significantly not only with rate of speech but also with segmental accuracy. Segmental errors have the highest correlation with the proficiency of L2 Korean speech. We further verified this finding across substitution, deletion, and insertion error rates. Although phonological accuracy was expected to be highly correlated with the proficiency score, it was the least influential measure. Another new finding of this study is that the role of pitch and accent has so far been underemphasized in studies of non-native Korean speech perception. This work will serve as the groundwork for developing an automatic assessment module in a Korean CAPT system.

Correlation analysis of voice characteristics and speech feature parameters, and classification modeling using SVM algorithm (목소리 특성과 음성 특징 파라미터의 상관관계와 SVM을 이용한 특성 분류 모델링)

  • Park, Tae Sung;Kwon, Chul Hong
    • Phonetics and Speech Sciences / v.9 no.4 / pp.91-97 / 2017
  • This study categorizes several voice characteristics by subjective listening assessment and investigates the correlation between voice characteristics and speech feature parameters. A model was developed to classify voice characteristics into the defined categories using an SVM algorithm. To do this, we extracted various speech feature parameters from a speech database of men in their 20s and used ANOVA to derive the parameters that were significantly correlated with voice characteristics. These derived parameters were then applied to the proposed SVM model. The experimental results showed that it is possible to obtain speech feature parameters significantly correlated with the voice characteristics, and that the proposed model achieves an average classification accuracy of 88.5%.

A study on the release burst spectra of the voiceless plosives from the English and Korean spontaneous speech corpus (영어와 한국어 자연발화 코퍼스에서의 무성 폐쇄음 개방 파열 스펙트럼 연구)

  • Hwang, Sunmi;Yoon, Kyuchul
    • Phonetics and Speech Sciences / v.9 no.4 / pp.27-34 / 2017
  • The purpose of this work is to examine the English and Korean voiceless plosives from the Buckeye[15] and Seoul[16] corpora in terms of their static spectral characteristics. The plosives were automatically extracted by a Praat script. To estimate the percentage of correctly classified plosives, discriminant analyses were performed, trained on four spectral moments, i.e. the center of gravity, variance, skewness, and kurtosis, as suggested in [6]. Another set of discriminant analyses was performed based on the spectral tilts. In the last set of analyses, the spectral moments and tilts were both used in training. Results showed that the correct classification rate did not exceed about 65% even in the best case, which suggests that phonetic cues other than the release burst, including dynamic spectral aspects and vowel-onset cues, would be necessary.
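
For readers unfamiliar with the four spectral moments named in this abstract, the following is a minimal sketch of how they can be computed from a power spectrum. It is not the Praat script used in the study. It assumes the spectrum is given as parallel lists of bin frequencies (Hz) and power values, treats the normalized power as a probability distribution over frequency, and reports kurtosis as excess kurtosis (minus 3). The function name `spectral_moments` is hypothetical.

```python
def spectral_moments(freqs, power):
    """Return (center_of_gravity, variance, skewness, excess_kurtosis) of a
    power spectrum, using normalized power as weights over frequency."""
    total = sum(power)
    if total <= 0:
        raise ValueError("power spectrum must have positive total energy")
    weights = [p / total for p in power]

    # First moment: power-weighted mean frequency (center of gravity).
    cog = sum(f * w for f, w in zip(freqs, weights))

    def central_moment(order):
        return sum(((f - cog) ** order) * w for f, w in zip(freqs, weights))

    variance = central_moment(2)
    if variance == 0:
        # All energy in a single bin: higher moments are undefined, report 0.
        return cog, 0.0, 0.0, 0.0

    skewness = central_moment(3) / variance ** 1.5
    kurtosis = central_moment(4) / variance ** 2 - 3.0  # excess kurtosis (assumed convention)
    return cog, variance, skewness, kurtosis


# Toy symmetric spectrum (assumed values, not data from the paper).
cog, variance, skewness, kurtosis = spectral_moments(
    [1000.0, 2000.0, 3000.0], [1.0, 2.0, 1.0]
)
print(cog, variance, skewness, kurtosis)
```

The resulting four values per burst could then serve as the feature vector for a discriminant analysis. The weighting (power versus magnitude) and the kurtosis convention vary between tools, so they should be checked against whatever software is actually used.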