• Title/Summary/Keyword: Voice Speakers

The Relationship between Acoustic Characteristics and Voice Handicap Index in Esophageal Speakers (식도발성 환자의 음향학적 특성과 음성장애지수의 상관성)

  • Jang, Hyo-Ryung;Shim, Hee-Jeong;Shin, Hee-Baek;Ko, Do-Heung;Kim, Hyun-Ki
    • Phonetics and Speech Sciences / v.6 no.2 / pp.115-121 / 2014
  • This paper investigates the relationship between acoustic characteristics and the Voice Handicap Index (VHI) in 29 male esophageal speakers. Acoustic characteristics were measured from three productions of the sustained vowel /a/. A stable 2-second portion of each vocalization was analyzed with the MDVP program. Specifically, relationships between four VHI scores (total, functional, physical, and emotional) and three acoustic characteristics (jitter, shimmer, and NHR) were investigated using the Pearson correlation coefficient. As a result, we found no relationship between NHR and the VHI scores. However, both jitter and shimmer had statistically significant correlations with all four VHI scores. This research will contribute to establishing a baseline for speech characteristics in voice rehabilitation with esophageal speakers. Further research could examine the overall quality of life survey, which is widely used as a subjective measure of voice for esophageal speakers.
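The correlation analysis described in this abstract can be illustrated with a minimal sketch. The function name `pearson_r` and all numeric values below are hypothetical and are not taken from the paper; the sketch only shows how a Pearson coefficient between one acoustic measure (for example, jitter) and one VHI score could be computed.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical illustration values, NOT data from the study.
jitter_percent = [2.1, 3.4, 4.0, 5.2, 6.8]
vhi_total = [35, 48, 52, 70, 81]

r = pearson_r(jitter_percent, vhi_total)
print(f"r = {r:.3f}")
```

A significance test on the coefficient would normally follow; it is omitted here because the abstract does not say which procedure the authors used.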

An Acoustic Study on the Voice Imitation (3) - Based on a professional voice imitator's speech - (모방 발화의 음향음성학적 연구(3) -전문 성대 모사자의 자료를 중심으로-)

  • Ahn Byoung-seob;Park Mi-young
    • MALSORI / no.52 / pp.1-14 / 2004
  • In this study, we investigated the acoustic characteristics of utterances imitated by a professional voice imitator, focusing on prosodic properties such as vowel formants and f0 distribution. To identify the patterns of voice imitation, we compared the imitator's voice data with the target speakers' voice data. The professional imitator, Mr. Bae, produced utterances imitating the voices of former President Kim, the comedian Choi, and the singer Bae. Auditorily, the imitator was judged to have imitated all the target speakers' voices successfully. However, acoustic examination showed that the imitator was better at imitating the singer Bae's voice: the imitator's and the singer Bae's voices were more alike with respect to vowel formants and f0 distribution. We infer that this is because the imitator's normal voice is very similar to the singer Bae's voice. In contrast, the vowel formant and f0 distribution patterns in the imitations of the other two target speakers differed from those of the target speakers' own voices.

The Aerodynamic Analysis between Normal Voice and Esophageal Voice (정상인과 식도발성 음성에서의 공기역학적 비교 연구)

  • 박국진;최홍식;정형진;유신영;박준호;김한수
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.9 no.1 / pp.5-10 / 1998
  • Voice rehabilitation is a very important concern for laryngectomees. Esophageal speech is a common and widely used method of voice restoration. Until now, however, there have been no reliable data showing the aerodynamic characteristics of esophageal speech. To evaluate the vocal quality of normal laryngeal and esophageal speech, several aerodynamic parameters were measured in 13 adults with normal laryngeal voice and 2 excellent esophageal speakers using the Aerophone II voice function analyzer. The examined parameters were maximal flow rate, mean airflow rate, subglottic pressure, vocal efficiency, glottic resistance, maximal phonation time, and mean sound pressure level. Vocal efficiency did not differ between the two groups, but the other parameters, especially mean resistance, showed marked differences in the esophageal speakers. The results indicate that esophageal speakers produce efficient voices under poor aerodynamic conditions compared with normal laryngeal speakers.

Vocal acoustic characteristics of speakers with depression (우울증 화자 음성의 음향음성학적 특성)

  • Baek, Yeon-Sook;Kim, Se-Joo;Kim, Eun-Yeon;Choi, Yae-Lin
    • Phonetics and Speech Sciences / v.4 no.1 / pp.91-98 / 2012
  • The purpose of this paper is to compare the vocal characteristics of speakers with and without depression, and to propose an objective method for measuring therapeutic effects and diagnosing depression based on those characteristics. Voice samples obtained from 11 female speakers with depression, aged 20 to 40 and diagnosed with major depressive disorder by a psychiatrist, were compared with those from 12 normal controls matched for sex, age, height, weight, education, smoking, and drinking. The voice samples were recorded with a portable digital recorder (TASCAM DR-07, Japan) and analyzed using the MDVP (Multi-Dimensional Voice Program) software module of the CSL (Computerized Speech Lab, Kay Elemetrics Co., model 4100). The results of the investigation are as follows. First, the average speaking fundamental frequency and loudness range of the depression group were significantly lower than those of the control group. The pitch range of the control group was somewhat higher than that of the depression group, but the difference was not statistically significant. Overall speech rate did not differ significantly between the two groups. Second, the average speaking fundamental frequency and loudness range had statistically significant negative correlations with Beck Depression Inventory scores, i.e., more severe depression was associated with lower average speaking fundamental frequency and loudness range. Other vocal parameters, such as pitch range and overall speech rate, had no statistically significant correlations with Beck Depression Inventory scores.

Spectral and Cepstral Analyses of Esophageal Speakers (식도발성화자 음성의 spectral & cepstral 분석)

  • Shim, Hee-Jeong;Jang, Hyo-Ryung;Shin, Hee-Baek;Ko, Do-Heung
    • Phonetics and Speech Sciences / v.6 no.2 / pp.47-54 / 2014
  • The purpose of this study was to analyze spectral versus cepstral measurements in esophageal speakers. Measurements from thirteen male esophageal speakers were compared with those from a control group of thirteen normal speakers using the sustained vowel /a/. The main results can be summarized as follows: (a) the CPP and L/H ratio of the esophageal group were significantly lower than those of the control group; (b) the CPP was significantly correlated with spectral parameters such as jitter, shimmer, NHR, and VTI; and (c) the ROC analysis showed that a CPP threshold of 10.25 dB achieved good classification of esophageal speakers, with 100% sensitivity and specificity. Thus, cepstral-based acoustic measures such as CPP may be more reliable predictors than spectral-based acoustic measures such as jitter and shimmer. Cepstral-based acoustic measures were also found to be effective in distinguishing esophageal voice quality from normal voice quality. This research will contribute to establishing a baseline for speech characteristics in voice rehabilitation with laryngectomees.
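As a rough illustration of the threshold-based classification reported in this abstract, the sketch below computes sensitivity and specificity for a single cut-off. Only the 10.25 dB threshold comes from the abstract. The function name, the CPP values, and the decision rule (a CPP below the threshold is labeled esophageal, chosen because the abstract reports lower CPP in that group) are assumptions made for illustration.

```python
def sensitivity_specificity(patient_values, control_values, threshold):
    """Return (sensitivity, specificity) for a 'below threshold = positive' rule.

    Assumption: a value below the threshold is classified as esophageal.
    """
    if not patient_values or not control_values:
        raise ValueError("both groups must contain at least one value")
    true_pos = sum(v < threshold for v in patient_values)
    false_neg = len(patient_values) - true_pos
    true_neg = sum(v >= threshold for v in control_values)
    false_pos = len(control_values) - true_neg
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical CPP values in dB, NOT data from the study.
esophageal_cpp = [4.2, 5.9, 6.5, 7.8, 9.1]
control_cpp = [11.0, 12.4, 13.3, 14.7, 15.2]

sens, spec = sensitivity_specificity(esophageal_cpp, control_cpp, threshold=10.25)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

A full ROC analysis would repeat this calculation over a range of thresholds and select the one with the best trade-off.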

A Study of Fundamental Frequency about Voice Imitation (모방발화의 기본주파수 연구)

  • Park, Mi-Young;Shin, Ji-Young;Kang, Sun-Mee
    • Proceedings of the KSPS conference / 2004.05a / pp.199-204 / 2004
  • The purpose of this paper is to find prosodic characteristics in voice imitation. Speakers change various phonetic features in voice imitation, and in most cases they change their pitch range. The pitch range is especially important in word conditions. In addition, as imitators change their voice, the average f0 tends toward high frequencies rather than low or mid-level frequencies.

Many-to-many voice conversion experiments using a Korean speech corpus (다수 화자 한국어 음성 변환 실험)

  • Yook, Dongsuk;Seo, HyungJin;Ko, Bonggu;Yoo, In-Chul
    • The Journal of the Acoustical Society of Korea / v.41 no.3 / pp.351-358 / 2022
  • Recently, Generative Adversarial Networks (GAN) and Variational AutoEncoders (VAE) have been applied to voice conversion that can make use of non-parallel training data. In particular, Conditional Cycle-Consistent Generative Adversarial Networks (CC-GAN) and Cycle-Consistent Variational AutoEncoders (CycleVAE) show promising results in many-to-many voice conversion among multiple speakers. However, the number of speakers has been relatively small in conventional voice conversion studies using CC-GANs and CycleVAEs. In this paper, we extend the number of speakers to 100 and experimentally analyze the performance of the many-to-many voice conversion methods. The experiments show that the CC-GAN yields 4.5% less Mel-Cepstral Distortion (MCD) for a small number of speakers, whereas the CycleVAE yields 12.7% less MCD in a limited training time for a large number of speakers.
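For readers unfamiliar with the metric named in this abstract, the sketch below uses one commonly cited definition of Mel-Cepstral Distortion: per frame, (10 / ln 10) · sqrt(2 · Σ (c_d − c'_d)²), averaged over frames. The abstract does not state the exact configuration used in the paper, so the function name, the exclusion of the 0th (energy) coefficient, and the requirement that frames are already time-aligned are all assumptions.

```python
import math


def mel_cepstral_distortion(ref_frames, conv_frames):
    """Mean MCD in dB over paired frames of mel-cepstral coefficients.

    Assumptions: frames are already time-aligned (e.g. by DTW), and the
    0th coefficient of each frame is excluded from the distance.
    """
    if not ref_frames or len(ref_frames) != len(conv_frames):
        raise ValueError("need the same non-zero number of frames in both inputs")
    scale = 10.0 / math.log(10.0) * math.sqrt(2.0)
    total = 0.0
    for ref, conv in zip(ref_frames, conv_frames):
        if len(ref) != len(conv):
            raise ValueError("frames must have the same number of coefficients")
        # Squared Euclidean distance over coefficients 1..D (0th skipped).
        squared = sum((r - c) ** 2 for r, c in zip(ref[1:], conv[1:]))
        total += scale * math.sqrt(squared)
    return total / len(ref_frames)


# Hypothetical coefficient frames, NOT data from the study.
reference = [[1.0, 0.5, -0.2], [0.9, 0.4, -0.1]]
converted = [[1.1, 0.6, -0.1], [0.8, 0.2, -0.3]]
print(f"MCD = {mel_cepstral_distortion(reference, converted):.3f} dB")
```

Lower values indicate that the converted speech is spectrally closer to the reference, which is the sense in which the abstract reports "less MCD".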

The Perception and Production of Vietnamese Tones by Japanese, Lao and Taiwanese Second Language Speakers

  • Dao, Muc Dich;Anh, Thu T. Nguyen
    • SUVANNABHUMI / v.14 no.1 / pp.193-228 / 2022
  • This study investigates the production and perception of Vietnamese tones by Japanese, Lao, and Taiwanese second language (L2) learners [n=30], comparing their performance in an Imitation task to that in Identification and Read-Aloud tasks. The results show that the Imitation task is generally easier for L2 speakers than the Identification and Read-Aloud tasks, suggesting that imitation is performed without some of the skills required by the other two tasks. Lao and Taiwanese speakers also outperform Japanese speakers, suggesting that prior experience with one tone language facilitates the acquisition of tone in another language. The results on speakers' tonal range show that L2 learners have a significantly narrower tonal F0 range than control Vietnamese speakers [n=11]. The results of error pattern analysis and tonal transcription also suggest that non-modal voice (glottal stop and creakiness) and contour tones (bidirectional fall-rise) are more difficult for L2 learners than modal voice tones (e.g., unidirectional contours: rising, falling, and level).

The Effects of the Methods of Disguised Voice on the Aural Decision (위장 발화 방법의 차이가 청취 판단에 미치는 영향)

  • Song Min-Chang;Shin Jiyoung;Kang SunMee
    • MALSORI / no.46 / pp.25-35 / 2003
  • This study deals with disguised voice (or voice disguise) in the field of forensic phonetics. In particular, we studied the effects of different voice disguise methods on aural decisions. Within the area of nonelectronic, deliberate voice disguise, the methods include lowered pitch, pinched nostrils, falsetto, and whisper. Ten Seoul speakers (5 male, 5 female) recorded 16 sentences. In the aural test, 30 subjects listened to normal and disguised voices and were asked to decide whether or not the speakers could be identified. The result is as follows: speaker verification was more difficult for falsetto and whisper than for lowered pitch and pinched nostrils.

Usability Analysis and Improvement Plan for Intelligent Speakers in the 4th Industrial Revolution Environment

  • Seong-Hoon Lee;Dong-Woo Lee
    • International journal of advanced smart convergence / v.12 no.4 / pp.119-125 / 2023
  • A smart home in the 4th industrial revolution environment is one in which all devices in the home are connected to each other to provide the optimal living environment desired by the user. Artificial intelligence speakers are being used as a way to manage and control all devices in this environment. Their function ranges from simple music playback to serving as an interface that controls and manages all devices in a smart home space. In this study, we investigated and analyzed the usability of artificial intelligence speakers based on the current status of domestic and overseas markets and the survey contents of two organizations (Korea Consumer Agency and Korea Information and Communication Policy Institute (KISDI)). Based on the user responses collected by the two organizations, major problems were derived, and major improvement measures, such as discovering new functions and improving voice recognition performance, were described.