• Title/Abstract/Keywords: individual speakers

69 search results (processing time: 0.023 s)

Individual differences in categorical perception: L1 English learners' L2 perception of Korean stops

  • Kong, Eun Jong
    • Phonetics and Speech Sciences / Vol. 11, No. 4 / pp.63-70 / 2019
  • This study investigated individual variability in L2 learners' categorical judgments of L2 stops by exploring English learners' perceptual processing of two acoustic cues (voice onset time [VOT] and f0) and working memory capacity as sources of variation. As prior research has reported that English speakers' greater use of the redundant cue f0 was responsible for gradient processing of native stops, we examined whether the same processing characteristics would be observed in L2 learners' perception of Korean stops (/t/-/tʰ/). Twenty-two English learners of L2 Korean with a range of L2 proficiency participated in a visual analogue scaling task and judged the L2 Korean stops in variable ways: some were more gradient than others in performing the task. Correlation analysis revealed that L2 learners' categorical responses were modestly related to individuals' use of a primary cue for the stop contrast (VOT for L1 English stops and f0 for L2 Korean stops), and were also related to better working memory capacity. Together, the current experimental evidence demonstrates adult L2 learners' top-down processing of stop consonants, in which linguistic and cognitive resources are devoted to determining abstract phonemic identity.

The Role of Speech Factors in Speech Intelligibility: A Review

  • 김수진
    • Malsori (Journal of the Phonetic Society of Korea) / No. 43 / pp.25-44 / 2002
  • The intelligibility of a spoken message is influenced by a number of factors. Intelligibility is a joint product of a speaker and a listener. In addition, intelligibility varies with the nature of the language context and the context of communication. Thus a single intelligibility score cannot be ascribed to a given individual apart from the listener and the listening situation. However, there is a clinical and research need to develop assessment measures of intelligibility that are quantitative and analytic. Before an index of intelligibility is developed, the crucial factors need to be examined. Among them, the most significant for intelligibility are the speech factors of the speaker. This paper reviews the literature on the contribution of segmental and suprasegmental factors to speech intelligibility in speakers with hearing impairment, alaryngeal speech, and motor disorders.

A Study on Formants of Vowels for Speaker Recognition

  • 안병섭;신지영;강선미
    • Malsori (Journal of the Phonetic Society of Korea) / No. 51 / pp.1-16 / 2004
  • The aim of this paper is to analyze vowels in voice imitation and disguised voice, and to find the invariant phonetic features of the speaker. We examined the formants of the monophthongs /a, u, i, o, $\omega$, $\varepsilon$, $\Lambda$/. The results of the present study are as follows: (1) Speakers change their vocal tract features. (2) The vowels /a, $\varepsilon$, i/ appear to be suitable for speaker recognition since they show invariant acoustic features during voice modulation. (3) F1 does not change easily compared to higher formants. (4) F3-F2 appears to contribute to speaker identification in the vowels /a/ and /$\varepsilon$/, and F4-F2 in the vowel /i/. (5) According to the F-ratio, differences between formants were more useful for speaker recognition than the individual formants of a vowel.
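
The last finding in this abstract relies on the F-ratio, which the abstract does not define. A common definition in speaker recognition is the variance of the per-speaker means divided by the average within-speaker variance, and the sketch below assumes that definition. The formant values are made-up illustration data, not measurements from the paper, and `f_ratio` is a hypothetical helper name.

```python
from statistics import mean, pvariance


def f_ratio(values_by_speaker):
    """Between-speaker variance of the means over mean within-speaker variance.

    Assumed definition; a larger value means the feature separates
    speakers better relative to each speaker's own variability.
    """
    speaker_means = [mean(v) for v in values_by_speaker.values()]
    between = pvariance(speaker_means)
    within = mean([pvariance(v) for v in values_by_speaker.values()])
    return between / within


# Hypothetical F2 and F3 values (Hz) for three tokens of one vowel per speaker.
f2 = {"spk_a": [1210.0, 1250.0, 1190.0], "spk_b": [1400.0, 1360.0, 1430.0]}
f3 = {"spk_a": [2500.0, 2560.0, 2480.0], "spk_b": [2950.0, 2900.0, 2990.0]}

# Token-by-token formant difference F3-F2, the kind of feature the abstract compares
# against individual formants.
f3_minus_f2 = {
    spk: [high - low for low, high in zip(f2[spk], f3[spk])] for spk in f2
}

for name, feature in [("F2", f2), ("F3", f3), ("F3-F2", f3_minus_f2)]:
    print(name, round(f_ratio(feature), 2))
```

With real data, the feature with the highest F-ratio would be the most promising candidate for telling speakers apart under this criterion.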

An acoustical analysis method of numeric sounds by Praat

  • 양병곤
    • Speech Sciences / Vol. 7, No. 2 / pp.127-137 / 2000
  • This paper presents a macro script for analyzing numeric sounds with the speech analysis shareware Praat, and analyzes those sounds as produced by three students who were born and raised in Pusan. Recording was done in a quiet office. To make a meaningful comparison, acoustic values were measured at dynamic time points defined relative to the total duration of the voiced segments. Results showed strong correlation coefficients between repeated productions of the numeric sounds within and across speakers. Very high coefficients were found for the diphthongal numbers (0 and 6), which usually show wide formant variation. This suggests that each speaker produced the numbers quite consistently. Also, the frequency differences between the three subjects were within a perceptually similar range. Identifying a speaker among others may require finding subtle individual differences within this range. Perceptual experiments with synthesized numeric sounds may help resolve the issue.

Impact and Evaluation of International Cancer Control Congresses

  • Sarwal, Kavita;Trapido, Edward J.;Sutcliffe, Simon;Qiao, You-Lin
    • Asian Pacific Journal of Cancer Prevention / Vol. 14, No. 2 / pp.1159-1163 / 2013
  • International meetings on various aspects of cancer, including its etiology, diagnosis, treatment, palliation, and prevention and control, are held frequently. Many have similar themes, and many seek and receive the same speakers and audiences. A fundamental question arises: what difference does any individual meeting, congress, or conference make or add to our understanding of the relevant issues? While many meetings conduct evaluations at the end of the Congress, few use evaluation as a tool to guide the design, implementation, and evaluation of both short- and long-term impacts, and to address the question of "what difference did the Congress make?" The International Cancer Control Congresses, which are held biennially in different regions of the world, took the opportunity to use evaluation in this way and to ask the relevant questions. This paper describes the evaluation session of the ICCC4, held in Seoul, Korea in November 2011, which was part of the larger evaluation issue.

How Korean Learner's English Proficiency Level Affects English Speech Production Variations

  • Hong, Hye-Jin;Kim, Sun-Hee;Chung, Min-Hwa
    • Phonetics and Speech Sciences / Vol. 3, No. 3 / pp.115-121 / 2011
  • This paper examines how L2 speech production varies according to learners' L2 proficiency level. L2 speech production variations are analyzed with quantitative measures at the word and phone levels using a corpus of Korean learners' English. Word-level variations are analyzed using correctness to explain how speech realizations differ from the canonical forms, while accuracy is used for analysis at the phone level to reflect phone insertions and deletions together with substitutions. The results show that the speech production of learners with different L2 proficiency levels differs considerably in terms of performance and individual realizations at the word and phone levels. These results confirm that the speech production of non-native speakers varies according to their L2 proficiency levels, even though they share the same L1 background. Furthermore, they will contribute to improving the non-native speech recognition performance of ASR-based English language educational systems for Korean learners of English.
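
The abstract names correctness and accuracy without giving formulas. A minimal sketch, assuming the convention widely used in speech recognition scoring: with N reference tokens, S substitutions, D deletions, and I insertions, correctness is (N - D - S) / N and accuracy is (N - D - S - I) / N. The function names and the phone sequences below are hypothetical and are not taken from the paper's corpus.

```python
def align_counts(reference, hypothesis):
    """Count (substitutions, deletions, insertions) in a minimum-edit alignment."""
    n, m = len(reference), len(hypothesis)
    # table[i][j] = (total_edits, subs, dels, ins) for reference[:i] vs hypothesis[:j]
    table = [[None] * (m + 1) for _ in range(n + 1)]
    table[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        table[i][0] = (i, 0, i, 0)  # only deletions
    for j in range(1, m + 1):
        table[0][j] = (j, 0, 0, j)  # only insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, ins = table[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                diagonal = (t, s, d, ins)  # match, no edit
            else:
                diagonal = (t + 1, s + 1, d, ins)  # substitution
            t, s, d, ins = table[i - 1][j]
            deletion = (t + 1, s, d + 1, ins)
            t, s, d, ins = table[i][j - 1]
            insertion = (t + 1, s, d, ins + 1)
            # Tuples compare on total edits first, so this keeps a minimum-edit path.
            table[i][j] = min(diagonal, deletion, insertion)
    _, subs, dels, inss = table[n][m]
    return subs, dels, inss


def correctness_and_accuracy(reference, hypothesis):
    """Return (correctness, accuracy) under the assumed convention."""
    subs, dels, inss = align_counts(reference, hypothesis)
    n = len(reference)
    return (n - dels - subs) / n, (n - dels - subs - inss) / n


# Hypothetical example: canonical phones vs. a realization with an inserted vowel.
canonical = ["s", "t", "r", "ay", "k"]
realized = ["s", "u", "t", "r", "ay", "k"]
print(correctness_and_accuracy(canonical, realized))
```

Under this convention an insertion lowers accuracy but not correctness, which is why accuracy is the measure that reflects insertions in addition to deletions and substitutions. The same functions work for word sequences.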

Automated Speech Analysis Applied to Sasang Constitution Classification

  • 강재환;유종향;이혜정;김종열
    • Phonetics and Speech Sciences / Vol. 1, No. 3 / pp.155-163 / 2009
  • This paper introduces an automatic voice classification system for the diagnosis of individual constitution based on Sasang Constitutional Medicine (SCM) in Traditional Korean Medicine (TKM). To develop this algorithm, we used the voices of 473 speakers and extracted a total of 144 speech features from speech data consisting of five sustained vowels and one sentence. The classification system, based on a rule-based algorithm derived from a nonparametric statistical method, presents binary negative decisions. In conclusion, 55.7% of the speech data were diagnosed by this system, of which 72.8% were correct negative decisions.

Combination of Classifiers Decisions for Multilingual Speaker Identification

  • Nagaraja, B.G.;Jayanna, H.S.
    • Journal of Information Processing Systems / Vol. 13, No. 4 / pp.928-940 / 2017
  • State-of-the-art speaker recognition systems may work better for the English language. However, if the same system is used to recognize speakers of other languages, it may perform poorly. In this work, the decisions of a Gaussian mixture model-universal background model (GMM-UBM) and a learning vector quantization (LVQ) classifier are combined to improve the recognition performance of a multilingual speaker identification system. The two classifiers differ in their modeling techniques: the former is based on a probabilistic approach and the latter on the fine-tuning of neurons. Since the approaches are different, each modeling technique identifies a different set of speakers for the same database. Therefore, the decisions of the classifiers may be used to improve the performance. In this study, multitaper mel-frequency cepstral coefficients (MFCCs) are used as the features, and the monolingual and cross-lingual speaker identification studies are conducted using NIST-2003 and our own database. The experimental results show that the combined system improves the performance by nearly 10% compared with the individual classifiers.
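
The abstract does not say how the GMM-UBM and LVQ decisions are combined. One simple possibility, shown here purely as an assumption, is score-level fusion: rescale each classifier's per-speaker scores to a common range, take a weighted sum, and pick the top-scoring speaker. The function names, the weight, the "higher is better" score convention, and all score values are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale a {speaker: score} mapping to the range [0, 1]."""
    lowest, highest = min(scores.values()), max(scores.values())
    if highest == lowest:
        return {speaker: 0.0 for speaker in scores}
    return {
        speaker: (value - lowest) / (highest - lowest)
        for speaker, value in scores.items()
    }


def combine_decisions(gmm_scores, lvq_scores, weight=0.5):
    """Fuse two classifiers' scores with a weighted sum (an assumed rule).

    Both inputs are assumed to be "higher is better"; a distance-based
    score would need to be negated before being passed in.
    """
    gmm = min_max_normalize(gmm_scores)
    lvq = min_max_normalize(lvq_scores)
    fused = {
        speaker: weight * gmm[speaker] + (1.0 - weight) * lvq[speaker]
        for speaker in gmm
    }
    best_speaker = max(fused, key=fused.get)
    return best_speaker, fused


# Hypothetical scores for one test utterance against three enrolled speakers.
gmm_ubm_scores = {"spk1": -41.2, "spk2": -39.8, "spk3": -45.0}  # log-likelihoods
lvq_scores = {"spk1": 0.9, "spk2": 0.4, "spk3": 0.1}  # similarity scores

winner, fused_scores = combine_decisions(gmm_ubm_scores, lvq_scores)
print(winner, fused_scores)
```

In this toy case the two classifiers disagree on their own top choice, which is the situation where combining decisions can change the outcome.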

A Study on the User Experience according to the Existence of Explanation Facilities and Individuals Privacy Concern Level

  • 강찬영;최기은;강현민
    • The Journal of the Korea Contents Association / Vol. 20, No. 2 / pp.203-214 / 2020
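  • Today, smart speakers are becoming increasingly personalized and act as recommendation agents that recommend specific products to users. The purpose of this study is to examine, in the context of a smart speaker's conversational agent, how an "explanation facility" affects transparency, perceived trust, user satisfaction, behavioral intention to reuse, privacy threat, and recommendation quality. In addition, to determine whether an individual's level of privacy concern affects these evaluations, the concern level was used as a measure for grouping users. The results confirmed that the condition with explanations was rated higher than the condition without explanations on all measured variables, and that the level of privacy concern had a positive effect on perceived trust and privacy threat. This study suggests that explanation facilities can be applied in the smart speaker context, identifies a privacy paradox phenomenon, and raises the possibility of cognitive dissonance depending on the level of privacy concern.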

The fundamental frequency (f0) distribution of Korean speakers in a dialogue corpus using Praat and R

  • 양병곤
    • Phonetics and Speech Sciences / Vol. 15, No. 3 / pp.17-25 / 2023
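  • This paper measured fundamental frequency (f0) values, which reflect the vibration of a speaker's vocal folds, in the Korean dialogue speech corpus distributed by the National Institute of Korean Language, in order to examine basic statistics of f0 in everyday Korean conversation and to investigate how the f0 distribution relates to age. Data collection and analysis were carried out with Praat and R, and the final f0 data were obtained by computing a boxplot for each intonational phrase of each individual and removing extreme values using the quartiles. As a result, the mean f0 for all Korean speakers was 185 Hz and the median was 187 Hz. The skewness, which describes the shape of the distribution, was 0.11, indicating a positively skewed distribution, and the kurtosis was -0.09, a shape very close to a normal distribution. The range of pitch variation in everyday conversation was 238 Hz. As for the difference between the sexes, the female median of 199 Hz was almost twice the male median of 114 Hz, and a t-test showed a significant difference. The skewness was 1.24 for males and 0.58 for females, about half the male value. The kurtosis was 5.21 for the male group and 3.88 for the female group, so the male distribution was about 34% more peaked. By age group, with the male and female groups combined, f0 tended to decrease gradually with age. A regression analysis between the median f0 of each age group and age yielded a slope of 0.15 for the male group and -0.586 for the female group, showing opposite trends. In conclusion, although the varied f0 distributions of Koreans by group and age can be identified from dialogue speech recorded by a large number of participants, the relationship between age and f0 requires more precise data collection.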
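
The outlier removal and shape statistics described in this abstract can be sketched as follows. The paper used Praat and R; this Python version is only an illustration. It assumes the usual boxplot rule (drop values more than 1.5 times the interquartile range beyond the quartiles) and moment-based skewness and excess kurtosis, none of which the abstract spells out. The function names and the f0 values are hypothetical.

```python
from statistics import mean, quantiles


def remove_extremes(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed boxplot rule)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


def central_moment(values, order):
    """Population central moment of the given order."""
    center = mean(values)
    return sum((v - center) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness; 0 for a symmetric distribution."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3; 0 for a normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical f0 samples (Hz) from one intonational phrase of one speaker;
# 240 Hz stands in for a pitch-tracking error.
phrase_f0 = [110, 112, 115, 118, 120, 122, 125, 240]

cleaned = remove_extremes(phrase_f0)
print(cleaned)
print(skewness(cleaned), excess_kurtosis(cleaned))
```

Applying the filter per phrase, as the abstract describes, means each phrase is judged against its own quartiles rather than against a speaker-wide or corpus-wide range.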