• Title/Summary/Keyword: fundamental frequency of speech


Personal Credit Evaluation System through Telephone Voice Analysis: By Support Vector Machine

  • Park, Hyungwoo
    • Journal of Internet Computing and Services / v.19 no.6 / pp.63-72 / 2018
  • The human voice is one of the easiest means of transmitting information between people. Voice characteristics vary from person to person and include the speed of speech, the form and function of the vocal organs, pitch, speech habits, and gender. The human voice is a key element of human communication. In the era of the Fourth Industrial Revolution, voice is also a major means of communication between humans, between humans and machines, and between machines. For that reason, people try to communicate their intentions to others clearly, and in the process the voice carries various additional information along with the linguistic information, such as emotional state, health status, degree of trust, the presence of a lie, and changes due to drinking. This linguistic and non-linguistic information appears in various parameters obtained through voice analysis and can therefore be used to evaluate an individual's creditworthiness. In particular, it can be obtained by analyzing the relationship between the characteristics of the fundamental frequency (basic tonality) of the vocal cords and the characteristics of the resonance frequency of the vocal tract. Previous research studied the necessity of various methods of credit evaluation and the change in voice characteristics according to the change in credit status. In this study, we propose a personal credit discriminator based on machine learning using parameters extracted from the voice.

A CELP Coder using the Band-Divided Long Term Prediction (대역 분할 장구간 예측을 이용한 CELP 부호화기)

  • Choi, Young-Soo;Kang, Hong-Goo;Lim, Myoung-Seob;Ahn, Dong-Soon;Youn, Dae-Hee
    • The Journal of the Acoustical Society of Korea / v.14 no.4 / pp.38-45 / 1995
  • In this paper, a way to improve the performance of long-term prediction is proposed, which adopts the Multi-band Excitation (MBE) method in addition to the Code-Excited Linear Prediction (CELP) method at low bit rates below 4.8 kbps. In the proposed method, multiband long-term prediction is performed on the periodic components that still remain after the long-term prediction of the conventional CELP method. At this point, the whole frequency region is divided into subbands whose size is equal to the spacing between the harmonics of the fundamental frequency, and the periodic multiband excitation signals are represented as the sum of sine waves approximately as large as the spectrum of the excitation signals, so that the actual characteristics of the excitation signals can be better taken into account. To evaluate the performance of the proposed method, a computer simulation was performed at 4.8 kbps. The 4.8 kbps DoD CELP and the 4.4 kbps IMBE were chosen as the reference vocoders for the speech quality measure. The perceptual speech quality measure showed that the performance of the proposed method is better than that of the 4.8 kbps DoD CELP vocoder and similar to that of the 4.4 kbps IMBE vocoder.
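The band division described in the abstract can be illustrated with a minimal sketch. It assumes a 4000 Hz analysis bandwidth (typical of 8 kHz telephone-band speech) and subbands centred on each harmonic; both choices, and the function name `harmonic_subbands`, are assumptions for illustration and are not taken from the paper.

```python
def harmonic_subbands(f0_hz, bandwidth_hz=4000.0):
    """Return (low, high) edges of subbands of width f0, one per harmonic.

    Assumption: band k is centred on the k-th harmonic k*f0, so it spans
    (k - 0.5)*f0 to (k + 0.5)*f0. The region below 0.5*f0 is left uncovered,
    and the last band is clipped to the analysis bandwidth.
    """
    if f0_hz <= 0:
        raise ValueError("f0_hz must be positive")
    bands = []
    k = 1
    while (k - 0.5) * f0_hz < bandwidth_hz:
        low = (k - 0.5) * f0_hz
        high = min((k + 0.5) * f0_hz, bandwidth_hz)
        bands.append((low, high))
        k += 1
    return bands


# Hypothetical speaker with a 200 Hz fundamental frequency.
bands = harmonic_subbands(200.0)
for low, high in bands[:3]:
    print(f"{low:.0f}-{high:.0f} Hz")
```

Because the band width follows the fundamental frequency, a lower-pitched voice yields more, narrower subbands over the same bandwidth.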


Analysis of Phonatory Aerodynamic & Electroglottography of a Countertenor (Countertenor 1인의 Modal Register와 Falsetto Register에서의 공기역학적 변화 및 전기성문파형의 변화 연구)

  • Nam, Do-Hyun;Choi, Seong-Hee;Choi, Jae-Nam;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.17 no.1 / pp.43-48 / 2006
  • Background and Objectives: Countertenors can produce a high vocal pitch like that of a female classical singer and use both the modal and falsetto registers. This study was conducted to examine the phonatory characteristics of the modal and falsetto registers of a countertenor. Materials and Methods: A male countertenor with 8 years of experience was examined using videostroboscopy, and his voice was analyzed using aerodynamic measures (fundamental frequency (F0), mean air flow rate (MFR), intensity (SPL), and subglottal air pressure (Psub)) with a phonatory function analyzer (Nagashima); acoustic measures (jitter, shimmer, HNR, and closed quotient (CQ)) using electroglottography (EGG) with Lx Speech Studio (Laryngograph Ltd, UK); and the voice range profile (VRP) of CSL (Kay Elemetrics). Results: In the stroboscopic findings, the longitudinal length of the vocal folds increased in the falsetto register, and the upper margin of the vocal folds vibrated with incomplete closure of the true vocal folds. In the aerodynamic analysis, intensity was the same in the modal and falsetto registers. However, MFR, Psub, and MPT were higher in the falsetto register. In the electroglottographic analysis, the closed quotient (CQ) was high in the modal register and was also much higher in the high-pitch falsetto than in the loud falsetto. In the VRP, intensity was similar although F0 differed between the modal and falsetto registers. Conclusion: The results imply that the countertenor could produce a powerful voice quality by increasing respiratory pressure and respiratory volume even though glottal closure was incomplete. In addition, no change in the EGG waveform and a voice range similar to that of an alto were observed.


Effects of Injection Laryngoplasty with Hyaluronic Acid in Patients with Vocal Fold Paralysis

  • Kim, Geun-Hyo;Lee, Jae-Seok;Lee, Chang-Yoon;Lee, Yeon-Woo;Bae, In-Ho;Park, Hee-June;Lee, Byung-Joo;Kwon, Soon-Bok
    • Osong Public Health and Research Perspectives / v.9 no.6 / pp.354-361 / 2018
  • Objectives: The purpose of this study was to explore the effects of injection laryngoplasty (IL) with hyaluronic acid in patients with vocal fold paralysis (VFP). Methods: A total of 50 patients with VFP participated in this study. Pre- and post-IL assessments were performed, which included analysis of sustained vowel /a/ phonation and of the patient reading 1 Korean sentence from the "Walk" passage, comprising 25 syllables in 10 words. To investigate the effect of IL on vocal fold function, acoustic analysis (acoustic voice quality index, cepstral peak prominence, maximum phonation time, speaking fundamental frequency) was conducted, and auditory-perceptual (grade and overall severity), visual judgment (gap), and self-questionnaire (voice handicap index-10) assessments were performed. Results: The patients with VFP showed statistically significant differences between pre- and post-IL assessments for the acoustic, auditory-perceptual, visual judgment, and self-questionnaire assessments. Conclusion: The patients with VFP showed a positive change in vocal fold function between pre- and post-IL measurements. The findings showed that IL with hyaluronic acid is an effective method to improve vocal fold function in patients with VFP.

Influence of gender on Dysphonia Severity Index : A study of normative values (성(性)이 DSI(Dysphonia Severity Index)에 미치는 영향)

  • Hwang, Young-Jin;Lee, Jae-Hong;Kim, Chang-Tae
    • Journal of the Korea Academia-Industrial cooperation Society / v.13 no.3 / pp.1161-1169 / 2012
  • This study investigated the usefulness of the Dysphonia Severity Index (DSI) according to gender. The DSI was evaluated in a Korean population of thirty voluntary participants (15 males, 15 females) who were rated G0 on the Grade, Roughness, Breathiness, Asthenia, Strain (GRBAS) scale and lived in Kyunggi-Do or Seoul, from March 2011 to June 2011. Maximum phonation time, the highest fundamental frequency (Fhi), the lowest intensity, and jitter (%) were measured using CSL 4500 (Kay Pentax, USA). Prior to the experiment, the experimenter explained the experimental conditions and procedures to the subjects and demonstrated the procedures. The results showed that the differences between males and females were not significant, although Fhi was higher for females than for males. Therefore, we conclude that the DSI is a useful instrument to objectively measure the severity of dysphonia.
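The four measures listed in the abstract are the inputs of the DSI. The abstract does not give the weighting, so the sketch below assumes the commonly cited formulation of Wuyts et al. (2000), DSI = 0.13 × MPT + 0.0053 × F0-high − 0.26 × I-low − 1.18 × jitter(%) + 12.4; the function name and the input values are hypothetical and are not data from this study.

```python
def dysphonia_severity_index(mpt_s, f0_high_hz, i_low_db, jitter_pct):
    """Weighted combination of four voice measures.

    Assumption: coefficients follow the commonly cited Wuyts et al. (2000)
    formulation; they are not stated in the abstract above.
    """
    return (0.13 * mpt_s            # maximum phonation time, seconds
            + 0.0053 * f0_high_hz   # highest fundamental frequency, Hz
            - 0.26 * i_low_db       # lowest intensity, dB
            - 1.18 * jitter_pct     # jitter, percent
            + 12.4)


# Hypothetical speaker, for illustration only.
dsi = dysphonia_severity_index(mpt_s=20.0, f0_high_hz=800.0,
                               i_low_db=50.0, jitter_pct=0.5)
print(round(dsi, 2))
```

Under this weighting, a higher Fhi raises the score, which is one reason a gender comparison of normative values is of interest.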

Effects of Respiration and Oral Motor Training based on Musical Elements and Singing on Voice of Healthy Elderly (음악요소와 노래 부르기를 활용한 호흡 및 구강훈련이 정상노인의 음성에 미치는 영향)

  • Jun, Hee-Un;Kim, Soo-Ji
    • The Journal of the Korea Contents Association / v.11 no.10 / pp.380-387 / 2011
  • This study investigated the effects of music-combined respiration and oral motor training on the voice of healthy elderly adults. Twenty-seven women attending a senior center in Seoul participated and were randomly assigned to the experimental group (n = 16) or the control group (n = 11). Subjects attended a music program (25 minutes per session) once a week for 4 weeks. For both groups, Fundamental Frequency (F0), Maximum Phonation Time (MPT), and Sequential Motion Rates (SMR) were measured using the Praat speech analysis program before and after the training. The results showed statistically significant changes in intensity, F0, MPT, and SMR in the experimental group, while only intensity changed significantly in the control group. Given increasing life expectancy and the growing number of older adults, their quality of life has become important. This study therefore suggests that respiration and oral motor training could be effectively incorporated into training and services for this population.

The Effect of An Increase of Closed Quotient on Improvement of Voice Quality after Type I Thyroplasty in Patients with Unilateral Vocal Cord Paralysis (일측 성대마비 환자에서 성대내전술 후 성대접촉율의 증가가 음질 개선에 미치는 영향)

  • Kim, Han-Su;Choi, Seung-Hee;Lim, Jae-Yol;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.15 no.1 / pp.16-20 / 2004
  • Purpose: To assess perceptual, acoustic, and aerodynamic measures of voice quality in patients with unilateral vocal cord paralysis before and after type I thyroplasty. Methods: The clinical records of patients who underwent type I thyroplasty in the Department of Otorhinolaryngology, Yongdong Severance Hospital, from November 2001 to November 2003 were reviewed. All patients underwent a vocal function evaluation including perceptual, acoustic, and aerodynamic measures of voice preoperatively and on the 60th postoperative day. The perceptual and acoustic measures were obtained from recordings of patients reading the 'Sanchak' passage. The perceptual evaluation was performed by 2 speech pathologists using a 4-point rating scale. Acoustic parameters (voice range profile low (RAL), voice range profile high (RAH), average fundamental frequency (AFx), closed quotient, harmonic-to-noise ratio, jitter, and shimmer) were investigated with Lx Speech Studio. Mean flow rate (MFR), subglottic pressure (Psub), and intensity were measured using the phonatory function analyzer. The maximum phonation time (MPT) was also measured. The data were statistically analyzed. A paired t-test (p<0.1) was used to compare preoperative and postoperative results, and a multiple regression test was used to find which parameter was most correlated with improvement of postoperative voice quality. Results: Among the aerodynamic parameters, Psub (88.11 mmH₂O → 58.7 mmH₂O), MPT (7.87 sec → 12.53 sec), and MFR (359.8 ml/sec → 161.06 ml/sec) were statistically improved. AFx (205.5 Hz → 163.27 Hz), AQx (23.9% → 48.3%), RAL, RAH, jitter, and shimmer were improved. In the multiple regression test, AFx and AQx were the two parameters most correlated with improvement of postoperative breathiness, but the general grade of voice quality was more correlated with Psub and shimmer. Conclusion: Vocal fold medialization procedures effectively reduce the glottic gap. The increased contact area of the two vocal folds improved the aerodynamic parameters and stabilized vocal fold vibration. That effect results in improvement in acoustic parameters (shimmer, jitter, signal-to-noise ratio, voice range profile) and voice quality.
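As a worked example, the pre- and postoperative means reported in the abstract can be expressed as relative changes. This is only arithmetic on the group means quoted above; it says nothing about per-patient changes or significance, and the variable names are illustrative.

```python
# (preoperative mean, postoperative mean) as reported in the abstract.
reported_means = {
    "Psub (mmH2O)": (88.11, 58.7),
    "MPT (sec)": (7.87, 12.53),
    "MFR (ml/sec)": (359.8, 161.06),
    "AFx (Hz)": (205.5, 163.27),
    "AQx (%)": (23.9, 48.3),
}


def percent_change(pre, post):
    """Relative change from pre to post, in percent of the pre value."""
    return (post - pre) / pre * 100.0


changes = {name: percent_change(pre, post)
           for name, (pre, post) in reported_means.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

Note that AQx is itself a percentage, so its relative change is a percent of a percent; the absolute change is 24.4 percentage points.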


Reliability of OperaVOX™ against Multi-Dimensional Voice Program to Assess Voice Quality before and after Laryngeal Microsurgery in Patient with Vocal Polyp (성대 용종 환자의 후두미세수술 전후 음성 평가에서 OperaVOX™와 Multi-Dimensional Voice Program 간의 신뢰도 연구)

  • Kim, Sun Woo;Kim, So Yean;Cho, Jae Kyung;Jin, Sung Min;Lee, Sang Hyuk
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.31 no.2 / pp.71-77 / 2020
  • Background and Objectives: OperaVOX™ (Oxford Wave Research Ltd.) is a portable voice analysis software package designed for use with iOS devices. As a relatively cheap, portable, and easily accessible form of acoustic analysis, OperaVOX™ may be more clinically useful than laboratory-based software in many situations. The aim of this study was to evaluate the agreement between OperaVOX™ and the Multi-Dimensional Voice Program (MDVP; Computerized Speech Lab) in assessing voice quality before and after laryngeal microsurgery in patients with vocal polyp. Materials and Method: Twenty patients who had undergone laryngeal microsurgery for vocal polyp were enrolled in this study. Preoperative and postoperative voices were assessed by acoustic analysis using MDVP and OperaVOX™. A five-second recording of the vowel /a/ was used to measure fundamental frequency (F0), jitter, shimmer, and noise-to-harmonic ratio (NHR). Results: Several acoustic parameters of MDVP and OperaVOX™ related to short-term variability showed significant improvement. In MDVP, the pre-operative values of F0, jitter, shimmer, and NHR were 155.75 Hz (male: 125.37 Hz, female: 183.37 Hz), 2.20%, 6.28%, and 0.16, and the post-operative values were 164.34 Hz (male: 129.42 Hz, female: 199.26 Hz), 2.15%, 5.18%, and 0.14. In OperaVOX™, the pre-operative values of F0, jitter, shimmer, and NHR were 168.26 Hz (male: 135.16 Hz, female: 201.37 Hz), 2.27%, 6.95%, and 0.26, and the post-operative values were 162.72 Hz (male: 128.267 Hz, female: 197.18 Hz), 1.71%, 5.36%, and 0.20. Intraclass correlation coefficients showed high intersoftware agreement for F0, jitter, and shimmer. Conclusion: Our results showed that the short-term variability of acoustic parameters in both MDVP and OperaVOX™ was useful for the objective assessment of voice quality in patients who received laryngeal microsurgery. OperaVOX™ is comparable to MDVP and has high intersoftware reliability with MDVP in measuring F0, jitter, and shimmer.

Physiologic Phonetics for Korean Stop Production (한국어 자음생성의 생리음성학적 특성)

  • Hong, Ki-Hwan;Yang, Yoon-Soo
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.17 no.2 / pp.89-97 / 2006
  • The stop consonants in Korean are classified into three types according to the manner of articulation: unaspirated (UA), slightly aspirated (SA), and heavily aspirated (HA) stops. Both the UA and the HA types are always voiceless in any environment. Generally, the voice onset time (VOT) can be measured spectrographically from the release of the consonant burst to the onset of the following vowel. The VOT of the UA type is within 20 msec of the burst, and it is about 40-50 msec for the SA type and 50-70 msec for the HA type. There have been many efforts to clarify the properties that differentiate these manner categories. Umeda et al.$^{1)}$ reported that the fundamental frequency at voice onset after both the UA and HA consonants was higher than that for the SA consonants, and that the voice onset times were longest in the HA, followed by the SA and UA. Han et al.$^{2)}$ reported in their speech synthesis and perception studies that the SA and UA stops differed primarily in terms of a gradual versus a relatively rapid intensity build-up of the following vowel after the stop release. Lee et al.$^{3)}$ measured both the intraoral and subglottal air pressure and found that the subglottal pressure was higher for the HA stop than for the other two stops. They also compared the dynamic pattern of the subglottal pressure slope for the three categories and found that the HA stop showed the most rapid increase in subglottal pressure in the time period immediately before the stop release. Kagaya$^{4)}$ reported fiberscopic and acoustic studies of the Korean stops. He noted that the UA type may be characterized by a completely adducted state of the vocal folds, stiffened vocal folds, and an abrupt decrease in stiffness near the voice onset, while the HA type may be characterized by an extensively abducted state of the vocal folds and a heightened subglottal pressure. On the other hand, none of these positive gestures is observed for the SA type. Hong et al.$^{5)}$ studied the electromyographic activity of the thyroarytenoid and posterior cricoarytenoid (PCA) muscles during stop production. They reported a marked and early activation of the PCA muscle associated with a steep reactivation of the thyroarytenoid muscle before voice onset in the production of the HA consonants. For the production of the UA consonants, little or no activation of the PCA muscle and the earliest and most marked reactivation of the thyroarytenoid muscle were characteristic. For the SA consonants, they reported a more moderate activation of the PCA muscle than for the UA consonant, and the least and latest reactivation of the thyroarytenoid muscle. Hong et al.$^{6)}$ observed the vibratory movements of the vocal fold edges in terms of laryngeal gestures according to the different types of stop consonants. The movements of the vocal fold edges were evaluated using high-speed digital images. EGG signals and acoustic waveforms were also evaluated and related to the vibratory movements of the vocal fold edges during stop production.
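The VOT ranges quoted in the abstract can be written as a simple lookup. This is a minimal sketch: treating the range limits as inclusive is an assumption, the names are hypothetical, and VOT alone does not separate the categories (the SA and HA ranges meet at 50 msec, and values between 20 and 40 msec fall in none of the quoted ranges).

```python
# Approximate VOT ranges in msec, as quoted in the abstract.
VOT_RANGES_MS = {
    "UA": (0.0, 20.0),   # unaspirated: within 20 msec of the burst
    "SA": (40.0, 50.0),  # slightly aspirated
    "HA": (50.0, 70.0),  # heavily aspirated
}


def candidate_stop_types(vot_ms):
    """Return every stop type whose quoted VOT range contains vot_ms.

    Assumption: range limits are inclusive, so a boundary value can
    match two types; an empty list means no quoted range applies.
    """
    return [label for label, (low, high) in VOT_RANGES_MS.items()
            if low <= vot_ms <= high]


# Hypothetical measurements, for illustration only.
for vot in (12.0, 30.0, 50.0, 65.0):
    print(vot, candidate_stop_types(vot))
```

The overlap and the gap are one reason the studies summarized above also look at fundamental frequency at voice onset, intensity build-up, and subglottal pressure.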


The Experimental Phonetic Study of Word Accent in Standard Korean (표준한국어 악센트의 실험음성학적 연구 -청취 테스트 및 음향분석-)

  • Seong, Cheol-jae
    • MALSORI / no.21_24 / pp.43-89 / 1992
  • In this thesis, the prominence of word accent in standard Korean is studied by an auditory test and an acoustic analysis experiment. Following Hoyoung Lee's discussion (1990), 'accent' is defined as 'the means whereby a focused part of an utterance is made to stand out in order to concentrate the hearer's attention on it.' That is to say, the term 'accent' may be described as a phonological phenomenon, and the accented syllable can be phonetically prominent as the result of those phonological processes. Prosodic features may have different characteristics in different languages depending on whether they carry linguistically important functions. Thus the characteristics of word accent in standard Korean will be determined by the content and traits of its prosodic features. From this viewpoint, the present study examined the prosodic features that may affect the characteristics of word accent in standard Korean through a systematic experimental procedure. The results were verified by a statistical method, the t-test, to identify the relatedness among the prosodic features (parameters). This thesis therefore aimed to investigate the intrinsic acoustic and physical qualities of word accent in standard Korean. Nonsense words composed of 'mal' and 'ma', which can be classed as a 'heavy syllable' and a 'light syllable' following Hyman (1975), were classified into 28 types with respect to the number of syllables (2 syl., 3 syl., 4 syl.), and these words were the targets of the auditory test and acoustic experiment. The results suggest that word accent in standard Korean tends to be fixed on the first two syllables regardless of the number of syllables. The syllable types HH, HL, and LL in the first two syllables tend to be prominent on the first syllable, and the type LH on the second syllable. Various prosodic features (parameters) including duration, intensity, and F0 (purely phonetic terms) were also strengthened in those positions. The results can be summarized as follows: 1. Duration proved to be the most important feature; intensity was more subsidiary than duration. 2. F0 (fundamental frequency) showed a coherent contour across almost all syllable types (99%): a rising contour in 2-syllable types, a rising-falling contour in 3-syllable types, and a rising-falling-rising contour in 4-syllable types. The results of the auditory test differed from these F0 contour forms. In view of these results, F0 was excluded from the discussion, unlike the other features. 3. Finally, the thesis concludes that word accent in standard Korean is a fixed (somewhat weaker) accent, fixed on the first two syllables in almost all words. 4. The various syllable types of 2-, 3-, and 4-syllable words can therefore be reclassified into 4 types, HH, HL, LH, and LL, following the concept of accent-fixing placement (i.e., the first two syllables). Among these 4 types, HH, HL, and LL were prominent on the first syllable, and LH was prominent on the second syllable.
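The F0 contour shapes named in the abstract (rising, rising-falling, rising-falling-rising) can be derived from one F0 value per syllable by tracking the direction of change between neighbouring syllables. The sketch below is illustrative only: the function name, the one-value-per-syllable simplification, and the F0 values are assumptions, not the procedure or data of the thesis.

```python
def f0_contour(f0_per_syllable):
    """Label a contour from the direction of F0 change between syllables.

    Consecutive moves in the same direction are merged, and equal
    neighbouring values are ignored; a flat sequence is labelled "level".
    """
    moves = []
    for previous, current in zip(f0_per_syllable, f0_per_syllable[1:]):
        if current == previous:
            continue
        move = "rising" if current > previous else "falling"
        if not moves or moves[-1] != move:
            moves.append(move)
    return "-".join(moves) if moves else "level"


# Hypothetical per-syllable F0 values in Hz for 2-, 3-, and 4-syllable words.
print(f0_contour([120.0, 150.0]))
print(f0_contour([120.0, 150.0, 130.0]))
print(f0_contour([120.0, 150.0, 130.0, 140.0]))
```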
