• Title/Summary/Keyword: Voice Speakers

Voice and the Image triggered by the Voice - American speakers and Korean listeners - (음성과 그로 인해 만들어지는 이미지의 연계성 - 미국인 화자와 한국인 청자 -)

  • Tak, Ji-Hyun;Moon, Seung-Jae
    • Proceedings of the KSPS conference / 2005.04a / pp.71-75 / 2005
  • We can easily recognize voices already known to us. But what about unknown voices? Is there any relationship between voices and the images they trigger? This question has been partly addressed by Moon (2000, 2002). The current study aims to shed more light on the topic by investigating the relationship between unknown foreign voices and the images triggered by them. Speech samples from 16 American speakers (8 male and 8 female) were recorded, and 180 Korean subjects with no knowledge of the American speakers were asked to match the voices with the corresponding photos. The number of correct matches between voices and pictures was lower than in the earlier case of Korean speakers and Korean listeners. In terms of majority matches, however, regardless of correctness, the present study showed a similar trend: there is a more-than-chance relationship between voices and the images triggered by the voices.
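A "more than chance" claim of this kind is usually checked against a binomial model. The sketch below is only an illustration of that check, not the authors' analysis: the chance level of 1/8 (one voice matched against eight candidate photos) and the count of 40 agreeing listeners are hypothetical, since the abstract does not report the matching design or the per-voice counts.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p_chance: float) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p_chance)."""
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical values for illustration only (not taken from the study).
listeners = 180          # number of listeners, as stated in the abstract
p_chance = 1 / 8         # assumed: each voice is matched against 8 photos
majority_votes = 40      # assumed: listeners choosing the same photo

p_value = binomial_tail(majority_votes, listeners, p_chance)
print(f"P(X >= {majority_votes}) under chance = {p_value:.3g}")
```

Under these assumed numbers the expected count by chance is 22.5, so a small tail probability would indicate agreement beyond chance.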

Intelligibility and Aerodynamic Study of Tracheopharyngeal and Tracheoesophageal Speech (기관인두발성과 기관식도발성에 대한 이해도 및 공기역학적 검사)

  • 조승호;김민식;박영학;서병도
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.4 no.1 / pp.19-23 / 1991
  • Selected characteristics were compared in the speech of three tracheoesophageal, five tracheopharyngeal, and ten normal laryngeal adult speakers. The tracheoesophageal speakers used a Blom-Singer voice prosthesis after total laryngectomy, and the tracheopharyngeal speakers used a tracheopharyngeal myomucosal shunt after near-total laryngectomy. Intelligibility judgement was based on standard Korean monosyllabic and bisyllabic word lists of 50 items. The aerodynamic study comprised maximum phonation time, phonation quotient, phonation pressure, and mean air flow rate. Results indicate that the intelligibility of tracheopharyngeal speech is closer to that of normal laryngeal speech than is the intelligibility of tracheoesophageal speech produced with a Blom-Singer voice prosthesis.

A Comparison Study of Breath Groups during Reading Paragraph Tasks in Normal Adults and Adult Patients with Voice Disorders: A Preliminary Study (정상 성인 화자와 음성장애 성인 화자의 문단낭독 시 호흡단락에 대한 비교 연구: 예비연구)

  • Pyo, Hwayoung;Kim, Soyeon;Baek, Seungkuk
    • Phonetics and Speech Sciences / v.6 no.4 / pp.181-187 / 2014
  • The present study investigated the characteristics of breath groups during paragraph reading in normal adults and adult patients with voice disorders. Ten normal females (mean age 20.6 years), 10 young females with voice disorders (mean age 33.5 years, P1 group), and 10 older females with voice disorders (mean age 56.3 years, P2 group) read a paragraph of 210 syllables. Using the 'Running Speech' program of the Phonatory Aerodynamic System (PAS), total duration, number of breath groups, duration per breath group, and number of syllables per breath group were measured, and their correlations with the aerodynamic measurements taken during reading were analyzed. For total duration and number of breath groups, the normal group scored highest and the P2 group lowest. The normal group showed the longest duration per breath group, although the difference was not significant. The P2 group showed the highest number of syllables per breath group. Correlation analysis showed significantly high correlations between total duration and expiratory airflow, and between number of breath groups and inspiratory volume.
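The four breath-group measures named above reduce to simple arithmetic once each breath group has been segmented. The following is a minimal sketch under stated assumptions: the input format (one `(duration, syllables)` pair per breath group), the function name `summarize_breath_groups`, and the sample values are all hypothetical, and total duration is taken here as the sum of breath-group durations, which may differ from how PAS defines it.

```python
from statistics import mean


def summarize_breath_groups(groups):
    """Summarize a reading from (duration_seconds, syllable_count) pairs.

    Each pair describes one breath group, i.e. the speech produced
    between two consecutive inhalations.
    """
    if not groups:
        raise ValueError("at least one breath group is required")
    durations = [duration for duration, _ in groups]
    syllables = [count for _, count in groups]
    return {
        "total_duration": sum(durations),          # seconds of speech
        "n_breath_groups": len(groups),
        "duration_per_group": mean(durations),     # seconds per group
        "syllables_per_group": mean(syllables),
    }


# Hypothetical segmentation of a short reading (not data from the study).
reading = [(4.0, 20), (5.0, 25), (3.0, 15)]
summary = summarize_breath_groups(reading)
print(summary)
```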

Gender Classification of Speakers Using SVM

  • Han, Sun-Hee;Cho, Kyu-Cheol
    • Journal of the Korea Society of Computer and Information / v.27 no.10 / pp.59-66 / 2022
  • This study classifies the gender of speakers by analyzing feature vectors extracted from voice data. It allows the gender of customers to be recognized automatically, without a manual classification step, when they request a service by voice, for example over a phone call. After classification with the learned model, the services frequently requested by each gender can also be analyzed and used to offer customized recommendation services. Working from male and female voice data with blank (silent) segments excluded, the study extracts feature vectors with MFCC (Mel Frequency Cepstral Coefficients) and trains SVM (Support Vector Machine) models on them. The resulting gender recognition rate was 94%.
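To make the classification step concrete, here is a minimal sketch of a linear SVM trained by subgradient descent on the regularized hinge loss. It is not the authors' model: the abstract does not state the kernel, the training procedure, or the feature dimensionality, and the two-dimensional vectors below are hypothetical stand-ins for MFCC-derived features. In practice a library implementation and real MFCC vectors would be used.

```python
def train_linear_svm(samples, labels, epochs=50, lr=0.1, reg=0.01):
    """Fit weights w and bias b by subgradient descent on the hinge loss.

    labels must be +1 or -1.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The L2 penalty shrinks the weights on every step.
            w = [wi * (1 - lr * reg) for wi in w]
            if margin < 1:
                # Sample is misclassified or inside the margin:
                # move the decision boundary away from it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the decision boundary."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical feature vectors; +1 and -1 stand for the two gender labels.
features = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0),
            (-2.0, -2.0), (-3.0, -3.0), (-2.0, -3.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(features, labels)
accuracy = sum(predict(w, b, x) == y for x, y in zip(features, labels)) / len(labels)
print(f"training accuracy: {accuracy:.2f}")
```

The recognition rate reported in the abstract corresponds to this kind of accuracy figure, but measured on the study's own voice data.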

Basic Phonetic Problems Encountered by Poles Studying Korean. (폴란드인이 한국어 학습에 나타난 발음상의 음성학적 문제)

  • Paradowska Anna Isabella
    • Proceedings of the KSPS conference / 1996.10a / pp.247-251 / 1996
  • This paper is intended as a preliminary study of the phonetic and phonological differences between the Polish and Korean languages. An attempt is made to examine the most conspicuous difficulties encountered by Polish learners who begin to speak Korean (and in doing so, I would hope that it might be of help to future learners of both languages). Since the phoneme inventories and general phonetic rules of the two languages are very different, teaching and learning accurate pronunciation is extremely difficult for both Poles and Koreans without previous phonetic training. In the case of Polish and Korean we can see how strong and persistent the influence of the mother tongue is on the target language. As an example I would like to discuss the basic differences between Polish and Korean consonants. The most important consonantal opposition in Polish is voiced/voiceless (e.g., 〔b〕 / 〔p〕, 〔g〕 / 〔k〕), while in Korean this opposition is of secondary importance. Therefore Korean speakers do not perceive the difference between Polish voiced and voiceless consonants. On the other hand, Polish speakers cannot distinguish the Korean lenis / fortis / aspirated opposition (e.g., ㅂ 〔b〕 / ㅃ 〔p〕 / ㅍ 〔pʰ〕, ㄱ 〔g〕 / ㄲ 〔k〕 / ㅋ 〔kʰ〕). Another very important factor is palatalization, which is of vital importance in Polish; because of this, Polish speakers are extremely sensitive to it. In Korean, palatalization is not important phonetically, and Korean speakers do not distinguish between palatalized and non-palatalized consonants. The transcription used here is based on 'The Principles of the International Phonetic Association and the Korean Phonetic Alphabet' (1981) by Hyun Bok Lee.

The Voiceless Stop Distinction in the Alaryngeal Speech

  • Hong, Ki-Hwan;Kim, Hyun-Ki
    • Speech Sciences / v.7 no.1 / pp.53-64 / 2000
  • Theoretically, alaryngeal speakers have difficulty producing voiceless consonants. However, perceptual studies often reveal clear production of voiceless consonants, with good articulation scores, in skilled alaryngeal speakers. The purpose of the present study was to clarify how voiceless stops are produced, in terms of mode of articulation, by normal speakers and skilled alaryngeal speakers. The acoustic characteristics of alaryngeal speech were compared with those of normal speech, with special reference to the voiceless stop consonants. Surface electromyography from the neck was used to monitor pharyngeal activity during speech. The general result is that esophageal, shunt, and neoglottal speakers realize the distinctions between the three types of [p] in a manner parallel to normal speakers, whereas those using an electric voice generator do not.

A Research on Response Time and Identification of English High Back Vowels (영어 후위고설모음들의 반응시간과 인식에 대한 연구)

  • Yun, Yung-Do
    • Phonetics and Speech Sciences / v.3 no.3 / pp.49-56 / 2011
  • This study investigates how American English high back vowels are identified. American English and Korean speakers participated in a phonetic experiment. The study reports their response times to the vowels and discusses how the speakers identified them. For the experiment I used a synthesized vowel continuum between American English /u/ and /ʊ/, based on the voices of American English male speakers recorded by Peterson and Barney (1952). I manipulated the spectral steps and the vowel duration of the stimuli. The statistical results showed that the American English speakers were not able to distinguish the stimuli on the basis of spectral quality; instead they relied on vowel duration. This suggests that the American English high back vowels have changed since Peterson and Barney recorded them in 1952. The Korean speakers also relied on vowel duration rather than spectral quality, since they could not distinguish the stimuli spectrally. The American speakers' response times to these vowels were affected by neither spectral quality nor vowel duration. The Koreans' response times were affected by vowel duration only.
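A stimulus grid of this kind, spectral steps crossed with durations, can be sketched as a linear interpolation between two formant targets. Everything numeric below is an assumption for illustration: the F1/F2 endpoints are approximate values commonly quoted for Peterson and Barney's male averages, and the number of steps and the durations are hypothetical, since the abstract does not report them. The study may also have interpolated on a different frequency scale.

```python
def continuum(start, end, n_steps):
    """Linearly interpolate between two formant tuples, endpoints included."""
    if n_steps < 2:
        raise ValueError("a continuum needs at least two steps")
    return [
        tuple(s + (e - s) * i / (n_steps - 1) for s, e in zip(start, end))
        for i in range(n_steps)
    ]


# Assumed (F1, F2) targets in Hz; approximate, for illustration only.
U_FORMANTS = (300.0, 870.0)         # /u/
UPSILON_FORMANTS = (440.0, 1020.0)  # /ʊ/

# Hypothetical design: 7 spectral steps crossed with 3 vowel durations.
spectral_steps = continuum(U_FORMANTS, UPSILON_FORMANTS, 7)
durations_ms = [150, 200, 250]
stimuli = [(f1, f2, d) for (f1, f2) in spectral_steps for d in durations_ms]

for f1, f2 in spectral_steps:
    print(f"F1 = {f1:.1f} Hz, F2 = {f2:.1f} Hz")
print(f"{len(stimuli)} stimuli in total")
```

Crossing the two dimensions is what lets the analysis separate reliance on spectral quality from reliance on duration.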

Zero-shot voice conversion with HuBERT

  • Hyelee Chung;Hosung Nam
    • Phonetics and Speech Sciences / v.15 no.3 / pp.69-74 / 2023
  • This study introduces an innovative model for zero-shot voice conversion that utilizes the capabilities of HuBERT. Zero-shot voice conversion models can transform the speech of one speaker to mimic that of another, even when the model has not been exposed to the target speaker's voice during the training phase. Comprising five main components (HuBERT, feature encoder, flow, speaker encoder, and vocoder), the model offers remarkable performance across a range of scenarios. Notably, it excels in the challenging unseen-to-unseen voice-conversion tasks. The effectiveness of the model was assessed based on the mean opinion scores and similarity scores, reflecting high voice quality and similarity to the target speakers. This model demonstrates considerable promise for a range of real-world applications demanding high-quality voice conversion. This study sets a precedent in the exploration of HuBERT-based models for voice conversion, and presents new directions for future research in this domain. Despite its complexities, the robust performance of this model underscores the viability of HuBERT in advancing voice conversion technology, making it a significant contributor to the field.

Reinterpretation of Stop Production in Korean Elderly Speakers (노년층 파열음 발음의 재해석)

  • Kim, Ji-Eun
    • Phonetics and Speech Sciences / v.7 no.2 / pp.139-145 / 2015
  • Researchers have claimed that younger Korean speakers tend to differentiate aspirated and lax stops less clearly by VOT, while older speakers clearly differentiate the two stop types by VOT. To explain this phenomenon, the current study considers both an aging effect and a general sound shift. VOT values and F0 of Korean stops produced by eight male speakers (born 1942 ~ 1952) were analyzed using Praat. Their productions were compared with the values of the participants born 1943 ~ 1952 in Silva's (2006) research, which was conducted in 2004 using the same methods. The results show a larger VOT gap and a smaller F0 gap between aspirated and lax stops in 2014 than in 2004. Considering that F0 values are related to the physical condition of the larynx, the results can be interpreted as follows: to distinguish the three-way phonation contrast clearly, older speakers depend more on VOT than on F0, which they have difficulty controlling.

Acoustic characteristics of the sustained vowel phonation according to age groups (모음 연장 발성이 보이는 연령대별 음향음성학적 특성 연구)

  • Seo, Yoon-Jeong;Shin, Jiyoung
    • Phonetics and Speech Sciences / v.10 no.4 / pp.67-76 / 2018
  • This study investigated the acoustic characteristics of sustained vowels produced by Seoul Korean speakers. For this study, 309 healthy adults were chosen as participants from the Korean Standard Speech Database. The subjects were divided into five chronological age groups (20s, 30s, 40s, 50s, 60-70s) and two gender groups (male and female). Fundamental frequency (f0), jitter, shimmer, and NHR (noise-to-harmonics ratio) were measured for 8 Korean vowels (/ɑ/, /æ/, /ʌ/, /e/, /o/, /u/, /ɯ/, /i/) using Praat. The results showed that vowel type significantly affected all acoustic parameters. Gender significantly affected f0, jitter, and NHR. The mean f0 of female speakers was greater than that of male speakers, and the mean jitter and NHR of male speakers were greater than those of female speakers. Moreover, age significantly affected shimmer and NHR; in particular, the shimmer and NHR of elderly speakers were greater than those of young speakers.
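Jitter and shimmer measure cycle-to-cycle instability in period and amplitude. The sketch below follows the usual "local" definition (mean absolute difference between consecutive cycles, divided by the mean), which is an assumption about which variant the study used; the abstract does not say. The period and amplitude sequences are hypothetical, and in practice they would come from Praat's pitch-period detection rather than being typed in by hand.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to the mean."""
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def jitter_local_percent(periods_s):
    """Local jitter in percent, from glottal cycle durations in seconds."""
    return 100 * local_perturbation(periods_s)


def shimmer_local_percent(peak_amplitudes):
    """Local shimmer in percent, from per-cycle peak amplitudes."""
    return 100 * local_perturbation(peak_amplitudes)


# Hypothetical cycle measurements for one sustained vowel.
periods_s = [0.0100, 0.0102, 0.0098, 0.0100]
amplitudes = [0.80, 0.78, 0.82, 0.80]

print(f"mean f0: {1 / mean(periods_s):.1f} Hz")
print(f"jitter (local): {jitter_local_percent(periods_s):.2f} %")
print(f"shimmer (local): {shimmer_local_percent(amplitudes):.2f} %")
```

A perfectly periodic signal gives zero for both measures, so higher values indicate a less stable voice source.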