• Title/Abstract/Keywords: female speakers

Search results: 124 items (processing time 0.02 s)

A Study of Extracting Acoustic Parameters for Individual Speakers: Focusing on the Correlations among Acoustic Parameters

  • 고도흥
    • 음성과학, Vol. 10, No. 2, pp. 129-143, 2003
  • Fundamental frequency (Fo), jitter, shimmer, and noise-to-harmonics ratio (NHR) were measured with the Multi-Dimensional Voice Program (MDVP) to examine how the parameters interact. One hundred normal Korean adults (50 males and 50 females), ranging in age from their early 20s to their early 30s, produced eight sustained vowels, including /a/, /i/, /u/, /c/, /e/, /ɛ/, /i/, and /e/. The subjects were asked to read each vowel five times in isolation at five-second intervals. On average, male voices showed 130.7 Hz in Fo, 0.6696% in jitter, 1.8151% in shimmer, and 0.12 in NHR, while female voices showed 232.8 Hz in Fo, 0.9222% in jitter, 1.9199% in shimmer, and 0.1098 in NHR. For male speakers, the correlations for jitter vs. shimmer, shimmer vs. NHR, Fo vs. shimmer, and Fo vs. NHR were statistically significant. For female speakers, the correlations for jitter vs. shimmer and Fo vs. shimmer were statistically significant. However, the study concludes that the correlation coefficients for females, although all statistically significant, are not meaningful in practice.
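The distinction this abstract draws between statistical and practical significance can be illustrated with a minimal Python sketch. The jitter and shimmer values below are hypothetical, not data from the study, and the helper names `pearson_r` and `t_statistic` are introduced here for illustration only. The abstract does not say how significance was tested; the sketch assumes the usual t statistic for a Pearson coefficient.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two samples of equal length, n >= 3")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical per-speaker measurements (percent), NOT values from the paper.
jitter = [0.52, 0.61, 0.70, 0.88, 0.95, 1.10]
shimmer = [1.60, 1.72, 1.69, 1.95, 2.05, 2.20]

r = pearson_r(jitter, shimmer)
t = t_statistic(r, len(jitter))
print(f"r = {r:.3f}, r^2 = {r * r:.3f}, t = {t:.2f}")
```

As an example of the gap the abstract describes, `t_statistic(0.3, 50)` is about 2.18, which exceeds the usual two-tailed 5% critical value for 48 degrees of freedom (roughly 2.01), yet r² = 0.09 means the relationship accounts for only about 9% of the variance. The sample size of 50 is an assumption for illustration.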

A Corpus-based study on the Effects of Gender on Voiceless Fricatives in American English

  • Yoon, Tae-Jin
    • 말소리와 음성과학, Vol. 7, No. 1, pp. 117-124, 2015
  • This paper investigates the acoustic characteristics of English fricatives in the TIMIT corpus, with a special focus on the role of gender in rendering fricatives in American English. The TIMIT database includes 630 talkers and 2342 different sentences, comprising over five hours of speech. Acoustic analyses of spectral and temporal properties are conducted with gender treated as an independent factor. The analyses revealed that most acoustic properties of voiceless sibilants differ between male and female speakers, whereas those of voiceless non-sibilants do not. A classification experiment using linear discriminant analysis (LDA) revealed that 85.73% of voiceless fricatives are correctly classified. The sibilants are 88.61% correctly classified, whereas the non-sibilants are only 57.91% correctly classified. The majority of the errors come from the misclassification of /θ/ as [f]. The average accuracy of gender classification is 77.67%. Most of the errors arise from the classification of female speakers in non-sibilants. The results are accounted for by biological differences as well as macro-social factors. The paper contributes to the understanding of the role of gender in a large-scale speech corpus.

Comparison of Male/Female Speech Features and Improvement of Recognition Performance by Gender-Specific Speech Recognition

  • 이창영
    • 한국전자통신학회논문지, Vol. 5, No. 6, pp. 568-574, 2010
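  • As part of the effort to improve recognition rates in speech recognition, this paper compares the performance of general speaker-independent speech recognition, which does not distinguish gender, with that of gender-specific speech recognition. For the experiments, 20 male and 20 female speakers each uttered 300 words, and the speech data were divided into four groups: female, male, mixed A, and mixed B. First, to assess the validity of the rationale for gender-specific recognition, gender differences in the frequency analysis of the speech signals and in the MFCC feature vectors were examined. The results confirmed gender differences pronounced enough to support the motivation for gender-specific speech recognition. In the recognition experiments, the error rate of gender-specific recognition fell to less than half that of general speaker-independent recognition. This suggests that the speaker-independent recognition rate could be raised by performing gender recognition and gender-specific speech recognition hierarchically.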

A Study on Speaker Identification by Difference Sum and Correlation Coefficient of Intensity Levels from Band-pass Filtered Sounds

  • 양병곤
    • 음성과학, Vol. 10, No. 2, pp. 249-258, 2003
  • This study examined a speaker identification method using the difference sum and the correlation coefficient determined from pairs of intensity-level matrices of band-pass-filtered numeric sounds produced by ten female speakers of similar age and height. Subjects recorded three-digit numbers in a quiet room at a sampling rate of 22 kHz on a personal computer. The collected data were band-pass filtered at five different band ranges, and matrices of five intensity levels at 100 proportional time points were obtained. Pearson correlation coefficients and the sum of absolute intensity differences between a given pair of matrices were determined within and across speakers. Results showed that very high correlation coefficients and small difference sums generally occurred within each speaker, though some individual variation was also observed. Thus, the matrix pair with the higher coefficient and the smaller difference sum was averaged to form each individual's model. Comparison among the speakers yielded generally low coefficients and large differences, which suggests successful speaker identification, but there were a few cases with very high coefficients and small differences. Future studies will focus on finer band ranges and additional spectral parameters at some peak points of the intensity contour in a low frequency band.
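The two similarity measures named in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's procedure: the matrices here are tiny hypothetical examples (2 bands by 4 time points rather than 5 by 100), the function names are invented for the sketch, and flattening each matrix before computing the Pearson coefficient is an assumption, since the abstract does not say how the matrix pair was compared.

```python
import math

def flatten(matrix):
    """Turn a band-by-time matrix (list of rows) into one flat list."""
    return [value for row in matrix for value in row]

def difference_sum(a, b):
    """Sum of absolute intensity differences between two same-shaped matrices."""
    fa, fb = flatten(a), flatten(b)
    if len(fa) != len(fb):
        raise ValueError("matrices must have the same shape")
    return sum(abs(x - y) for x, y in zip(fa, fb))

def correlation(a, b):
    """Pearson correlation over all cells of two same-shaped matrices."""
    fa, fb = flatten(a), flatten(b)
    if len(fa) != len(fb):
        raise ValueError("matrices must have the same shape")
    n = len(fa)
    ma, mb = sum(fa) / n, sum(fb) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    saa = sum((x - ma) ** 2 for x in fa)
    sbb = sum((y - mb) ** 2 for y in fb)
    return sab / math.sqrt(saa * sbb)

# Hypothetical intensity levels in dB (rows = bands, columns = time points).
speaker_a_take1 = [[60.0, 62.0, 65.0, 61.0], [50.0, 55.0, 53.0, 49.0]]
speaker_a_take2 = [[61.0, 62.5, 64.0, 60.0], [51.0, 54.0, 53.5, 50.0]]
speaker_b_take1 = [[55.0, 58.0, 57.0, 63.0], [58.0, 50.0, 47.0, 56.0]]

within = (correlation(speaker_a_take1, speaker_a_take2),
          difference_sum(speaker_a_take1, speaker_a_take2))
across = (correlation(speaker_a_take1, speaker_b_take1),
          difference_sum(speaker_a_take1, speaker_b_take1))
print("within speaker:  r = %.3f, difference sum = %.1f" % within)
print("across speakers: r = %.3f, difference sum = %.1f" % across)
```

In this toy data the within-speaker pair has the higher coefficient and the smaller difference sum, which is the pattern the abstract reports as typical.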

The Study of Breath Group Based on Oral Airflow in Reading by Healthy Speakers

  • 한지연;이옥분;심이슬
    • 음성과학, Vol. 15, No. 4, pp. 135-146, 2008
  • A breath group generally refers to one of the units of speech production. It is an integral component of the structural and contextual features of utterances and is partly responsible for fluctuations in speech intelligibility. The purpose of this study was to identify the characteristics of breath groups in reading passages spoken by healthy speakers, specifically from an aerodynamic perspective. Eighteen healthy female speakers aged 20 to 30 without communication problems participated in this study. The PAS (Phonatory Aerodynamic System) was used for aerodynamic measurement of breath groups. Results showed that the mean number of breath groups in reading tasks was 16.03 per minute (SD=3.1), and the number of spoken syllables per breath group was 17.95. The mean time (m) of a breath group was 3.06 (SD=0.62), and the ratio between exhalation and inhalation was 1:5. The results need to be discussed in terms of normative values for breath groups and from a clinical viewpoint, especially their potential implications for speech intelligibility impaired by brain damage.

The effects of length of residence (LOR) on voice onset time (VOT)

  • Kim, Mi-Ryoung
    • 말소리와 음성과학, Vol. 12, No. 4, pp. 9-17, 2020
  • Changes in the first language (L1) sound system as a result of acquiring a second language (L2) (i.e., phonetic drift) have received considerable attention across a variety of speakers, settings, and environments. Less attention has been given to phonetic drift in adult speakers' L2 learning as their length of residence in America (LOR) increases. This study examines the effects of LOR on voice onset time (VOT) in L1 Korean stops. Three different groups of Korean adult learners of L2 English were compared to assess how malleable their L1 representations are in terms of LOR and whether there is any relationship between L1 change and L2 acquisition. The results showed that the effect of LOR was linguistically unimportant in the production of Korean stops. However, VOT merger, as evidence of sound change in Korean stops, was robust in the speech production of most of the female speakers across the groups. The results suggest that L2 English may not be the primary cause of L1 sound change. For generalizability, further study is necessary to see whether other acoustic cues show a similar pattern.

The Characteristics and Significance of 'Nim' Texts in the Late Choson Period: Focused on Saseol-sijo and Chap-ga

  • 신은경
    • 한국시조학회지:시조학논총, Vol. 20, pp. 113-139, 2004
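  • In saseol-sijo, works in which a male speaker voices longing and love for a female 'nim' (beloved) increase sharply, and this phenomenon reaches its peak in chap-ga. Such texts form the internal foundation on which themes such as 'praise of women' or 'the focalization of women' become active in poetry from the modern period onward, and because they reflect the changing view of women at the time, they provide an important clue for identifying the 'modernity' that saseol-sijo embodies. Saseol-sijo that sing of love for the 'nim' can be divided, according to the nature of the love, into the 'longing type' and the 'carnal type'. The type that begins to stand out in saseol-sijo in particular is the 'male longing type', in which a male speaker sings of longing for a female nim. Unlike the 'female longing type', the typical pattern of earlier Korean poetry in which a female speaker expresses longing and love for a male nim, the male longing type of saseol-sijo is characterized by placing 'woman' at the center of utterance and value. In 'longing type' texts, the 'nim' is the totality of value that gives meaning to the poetic speaker's life and is the center of utterance, so the 'male longing type' is a sign that women, too, had come to occupy the position of 'nim' that only men had held. This paper subsumes this under the term 'focalization of women', examines its concrete aspects through saseol-sijo and chap-ga, and considers how it became the internal foundation for foregrounding the female nim referred to as 'Madonna', 'geudae', or 'dangsin' in poetry from the modern period onward.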

The fundamental frequency (f0) distribution of Korean speakers in a dialogue corpus using Praat and R

  • 양병곤
    • 말소리와 음성과학, Vol. 15, No. 3, pp. 17-25, 2023
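  • This paper measured fundamental frequency (f0) values, which represent the vibration of the speaker's vocal folds, in the Korean dialogue speech corpus distributed by the National Institute of Korean Language, in order to examine basic statistics of f0 in everyday Korean conversation and to investigate how age relates to the distribution of f0 values. Data collection and analysis were carried out with Praat and R, and the final f0 data were obtained by drawing a boxplot for each intonational phrase of each individual and removing extreme values using the quartiles. The mean f0 of all Korean speakers was 185 Hz and the median was 187 Hz. The skewness, which describes the shape of the distribution, was 0.11, indicating a positively skewed distribution, and the kurtosis was -0.09, a shape very close to the normal distribution. The range of pitch variation in everyday conversation was 238 Hz. As for the difference between men and women, the female median of 199 Hz was almost twice the male median of 114 Hz, and a t-test showed a significant difference. The skewness was 1.24 for males and 0.58 for females, about half the male value. The kurtosis was 5.21 and 3.88 for the male and female groups, respectively, so the male distribution was about 34% more peaked. By age group, with the male and female groups combined, f0 tended to decrease gradually with age. A regression analysis of the median f0 of each age group against age yielded a slope of 0.15 for the male group and -0.586 for the female group, showing opposite trends. In conclusion, although the varied f0 distributions of Koreans by group and age can be identified from conversational speech recorded by a large number of participants, the relationship between age and f0 requires more precise data collection.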
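The quartile-based cleaning step and the shape statistics mentioned in this abstract can be sketched as follows. The paper used Praat and R; this Python version is only an illustration. It assumes the conventional boxplot rule (values beyond 1.5 times the interquartile range from the quartiles are extreme), population-moment skewness, and excess kurtosis (0 for a normal distribution). The abstract does not state which fence multiplier or which definitions were used, and the f0 values below are hypothetical.

```python
import statistics

def remove_extremes(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR] (boxplot rule, assumed k)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def skewness(values):
    """Third standardized moment; positive means a longer right tail."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return central_moment(values, 3) / m2 ** 1.5

def excess_kurtosis(values):
    """Fourth standardized moment minus 3, so a normal distribution gives 0."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return central_moment(values, 4) / m2 ** 2 - 3

# Hypothetical f0 samples (Hz) from one intonational phrase, with two
# implausible tracking errors (420 and 75). NOT data from the corpus.
f0 = [180, 185, 190, 187, 183, 192, 188, 420, 75, 186]

cleaned = remove_extremes(f0)
print("kept", len(cleaned), "of", len(f0), "values")
print("median after cleaning:", statistics.median(cleaned))
print("skewness: %.3f" % skewness(cleaned))
print("excess kurtosis: %.3f" % excess_kurtosis(cleaned))
```

Applying the rule per phrase, as the abstract describes, keeps the fences local to each speaker's range, so one speaker's ordinary values are not rejected against another's distribution.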

Acoustic Cues in Spoken French for the Pronunciation Assessment Multimedia System

  • 이은영;송미영
    • 음성과학, Vol. 12, No. 3, pp. 185-200, 2005
  • The objective of this study is to examine acoustic cues in spoken French for pronunciation assessment, which is necessary for implementing a multimedia system. The corpus consists of simple expressions that cover the French phonological system, including all phonemes. The experiment was conducted with 4 male and female native French speakers and 20 Korean speakers, university students who had studied French for more than two years. We analyzed the recorded data using a spectrograph and measured comparative features numerically. First, we found the mean and the deviation for all phonemes, and then chose the features that had a high error frequency and large differences between French and Korean pronunciations. The selected data were simplified and compared with one another. After judging, in articulatory and auditory terms, whether each Korean speaker's pronunciation problems were utterance mistakes or interference from the mother tongue, we tried to find acoustic features that were as simple as possible. From this experiment, we were able to extract acoustic cues for the construction of a French pronunciation training system.

The Effects of the Methods of Disguised Voice on the Aural Decision

  • 송민창;신지영;강선미
    • 대한음성학회지:말소리, No. 46, pp. 25-35, 2003
  • This study deals with the disguised voice (or voice disguise) in the field of forensic phonetics. In particular, we studied the effects of different methods of voice disguise on aural decisions. Within the area of nonelectronic, deliberate voice disguise, the methods include lowered pitch, pinched nostrils, falsetto, and whisper. Ten Seoul speakers (5 male, 5 female) recorded 16 sentences. In the aural test, 30 subjects listened to normal and disguised voices and were asked to decide whether or not the speakers were identified. The results showed that speaker verification was more difficult for falsetto and whisper than for lowered pitch and pinched nostrils.
