• Title/Abstract/Keywords: Speaker differences

의인화 수준에 따른 스마트 스피커 이용자의 고독감, 사회적 단절, 애착 및 이용행태의 차이 (Differences in Loneliness, Social Disconnection, Attachment, and Usage Behavior of Smart Speaker Users Depending on Anthropomorphism Level)

  • 장예빛
    • 한국콘텐츠학회논문지 / Vol. 22, No. 10 / pp.93-102 / 2022
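  • Studies on the influence of anthropomorphism are steadily increasing, but the characteristics of users who experience anthropomorphism and their usage behavior remain underexplored. This study therefore examined differences in loneliness, social disconnection, attachment, and speaker usage behavior according to smart speaker users' level of anthropomorphism. An online survey was conducted with smart speaker users, and a total of 320 users participated. When the users were divided into two groups by anthropomorphism level and the groups were compared, statistically significant differences were found in loneliness, social disconnection, attachment anxiety, average daily frequency of speaker use, and level of anthropomorphic behavior, whereas attachment avoidance showed no statistically significant difference. The findings are expected to offer important implications for interaction design that can enhance positive social interaction with speakers.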

러시아어 발화시 억양의 역할 (On the Role of the Phatic Function of Intonation in Russian)

  • 박근우
    • 음성과학 / Vol. 4, No. 1 / pp.81-89 / 1998
  • This paper investigates the phatic function of intonation in Russian by recording and analysing 11 female native speakers of standard Moscow Russian. It shows that differences in the intonation pattern of a sentence are associated with differences in the degree of the listener's involvement in the speech. The intonation pattern of an utterance with a phatic function appears to be determined by 1) the speaker's readiness to talk in order to evoke the listener's attention; 2) the speaker's intention to continue the communication. Some emphasis is placed on the relationship between the intonation pattern of an utterance and speaker-listener interaction.

대역별로 여과한 음성 강도의 차이값과 상관계수에 의한 화자확인 연구 (A Study on Speaker Identification by Difference Sum and Correlation Coefficient of Intensity Levels from Band-pass Filtered Sounds)

  • 양병곤
    • 음성과학 / Vol. 10, No. 2 / pp.249-258 / 2003
  • This study examined a speaker identification method using the difference sum and correlation coefficient determined from a pair of intensity level matrices of band-pass-filtered numeric sounds produced by ten female speakers of similar age and height. Subjects recorded three-digit numbers in a quiet room at a sampling rate of 22 kHz on a personal computer. The collected data were band-pass-filtered at five different band ranges. Then, matrices of five intensity levels at 100 proportional time points were obtained. Pearson correlation coefficients and the sum of absolute intensity differences between a pair of given matrices were determined within and across the speakers. Results showed that very high correlation coefficients and small difference sums generally occurred within each speaker, but some individual variation was also observed. Thus, the matrix pair with a higher coefficient and a smaller difference sum was averaged to form each individual's model. Comparison among the speakers yielded generally low coefficients and large differences, which suggests successful speaker identification, but among them there were a few cases with very high coefficients and small differences. Future studies will focus on finer band ranges and additional spectral parameters at some peak points of the intensity contour at a low frequency band.
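
As a rough illustration of the two measures named in this abstract, the sketch below computes a Pearson correlation coefficient and a sum of absolute differences for a pair of intensity matrices. It is not the author's implementation: the function names are hypothetical, and flattening each matrix before correlating is an assumption, since the abstract does not say how the matrix entries are paired.

```python
import math


def _flatten(matrix):
    """Turn a matrix (list of rows) into one flat list of values."""
    return [value for row in matrix for value in row]


def pearson_correlation(matrix_a, matrix_b):
    """Pearson correlation over all paired entries of two same-shaped matrices."""
    x, y = _flatten(matrix_a), _flatten(matrix_b)
    if len(x) != len(y) or not x:
        raise ValueError("matrices must be non-empty and have the same shape")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return covariance / (spread_x * spread_y)


def difference_sum(matrix_a, matrix_b):
    """Sum of absolute entry-wise differences between two same-shaped matrices."""
    x, y = _flatten(matrix_a), _flatten(matrix_b)
    if len(x) != len(y):
        raise ValueError("matrices must have the same shape")
    return sum(abs(a - b) for a, b in zip(x, y))
```

In the study's setting each matrix would hold five band intensity levels at 100 time points; a high coefficient together with a small difference sum would point to the same speaker.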

SNR을 이용한 프레임별 유사도 가중방법을 적용한 문맥종속 화자인식에 관한 연구 (A Study on the Context-dependent Speaker Recognition Adopting the Method of Weighting the Frame-based Likelihood Using SNR)

  • 최홍섭
    • 대한음성학회지:말소리 / No. 61 / pp.113-123 / 2007
  • Environmental differences between training and testing are generally considered a critical factor in the performance degradation of speaker recognition systems. In particular, speaker recognition systems are generally trained on speech that is as clean as possible, but this does not hold in the actual testing phase because of environmental and channel noise. To cope with this problem, this paper proposes a new method of weighting the frame-based likelihood according to the frame SNR. The method exploits the strong correlation between speech SNR and speaker discrimination rate. To verify its usefulness, the proposed method is applied to a context-dependent speaker identification system. Experimental results on the cellular phone speech DB designed by ETRI for Korean speaker recognition show that the proposed method is effective and increases identification accuracy by up to 11%.
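
The idea of weighting frame likelihoods by SNR can be sketched as follows. The abstract does not give the weighting function, so the logistic mapping from SNR to weight, its parameters `midpoint_db` and `slope`, and all function names here are assumptions made purely for illustration.

```python
import math


def snr_weight(snr_db, midpoint_db=10.0, slope=0.3):
    """Map a frame SNR in dB to a weight in (0, 1); hypothetical logistic choice."""
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - midpoint_db)))


def weighted_score(frame_log_likelihoods, frame_snrs_db):
    """Weighted average of per-frame log-likelihoods, trusting cleaner frames more."""
    if len(frame_log_likelihoods) != len(frame_snrs_db):
        raise ValueError("need one SNR value per frame")
    weights = [snr_weight(snr) for snr in frame_snrs_db]
    weighted_sum = sum(w * ll for w, ll in zip(weights, frame_log_likelihoods))
    return weighted_sum / sum(weights)


def identify(log_likelihoods_by_speaker, frame_snrs_db):
    """Return the speaker whose model gets the highest SNR-weighted score."""
    return max(
        log_likelihoods_by_speaker,
        key=lambda speaker: weighted_score(
            log_likelihoods_by_speaker[speaker], frame_snrs_db
        ),
    )
```

With this kind of weighting, a noisy frame that happens to favor the wrong speaker model contributes little to the final decision, which is the effect the abstract attributes to the proposed method.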

Inter-speaker and intra-speaker variability on sound change in contemporary Korean

  • Kim, Mi-Ryoung
    • 말소리와 음성과학 / Vol. 9, No. 3 / pp.25-32 / 2017
  • Besides their effect on the f0 contour of the following vowel, Korean stops are undergoing a sound change in which a partial or complete consonantal merger on voice onset time (VOT) is taking place between aspirated and lax stops. Many previous studies on sound change have mainly focused on group-normative effects, that is, effects that are representative of the population as a whole. Few systematic quantitative studies of change in adult individuals have been carried out. The current study examines whether the sound change holds for individual speakers. It focuses on inter-speaker and intra-speaker variability on sound change in contemporary Korean. Speech data were collected for thirteen Seoul Korean speakers studying abroad in America. In order to minimize the possible effects of speech production, socio-phonetic factors such as age, gender, dialect, speech rate, and L2 exposure period were controlled when recruiting participants. The results showed that, for nine out of thirteen speakers, the consonantal merger is taking place between the aspirated and lax stops in terms of VOT. There were also intra-speaker variations on the merger in three aspects: First, is the consonantal (VOT) merger between the two stops in progress or not? Second, are VOTs for aspirated stops getting shorter or not (i.e., the aspirated-shortening process)? Third, are VOTs for lax stops getting longer or not (i.e., the lax-lengthening process)? The results of remarkable inter-speaker and intra-speaker variability indicate a synchronous speech sound change of the stop system in contemporary Korean. Some speakers are early adopters or active propagators of sound change whereas others are not. Further study is necessary to see whether the inter-speaker differences exceed intra-speaker differences in sound change.

Speaker Adaptation Using ICA-Based Feature Transformation

  • Jung, Ho-Young;Park, Man-Soo;Kim, Hoi-Rin;Hahn, Min-Soo
    • ETRI Journal / Vol. 24, No. 6 / pp.469-472 / 2002
  • Speaker adaptation techniques are generally used to reduce speaker differences in speech recognition. In this work, we focus on the features fitted to a linear regression-based speaker adaptation. These are obtained by feature transformation based on independent component analysis (ICA), and the feature transformation matrices are estimated from the training data and adaptation data. Since the adaptation data is not sufficient to reliably estimate the ICA-based feature transformation matrix, it is necessary to adjust the ICA-based feature transformation matrix estimated from a new speaker utterance. To cope with this problem, we propose a smoothing method through a linear interpolation between the speaker-independent (SI) feature transformation matrix and the speaker-dependent (SD) feature transformation matrix. From our experiments, we observed that the proposed method is more effective in the mismatched case. In the mismatched case, the adaptation performance is improved because the smoothed feature transformation matrix makes speaker adaptation using noisy speech more robust.
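
The smoothing step described in this abstract, a linear interpolation between the SI and SD feature transformation matrices, can be sketched as below. The function name, the parameter `lam`, and the convention that `lam` weights the SD matrix are assumptions; the abstract does not state how the interpolation weight is defined or chosen.

```python
def smooth_transform(w_si, w_sd, lam):
    """Interpolate entry-wise: lam * W_SD + (1 - lam) * W_SI.

    w_si: speaker-independent transformation matrix (list of rows)
    w_sd: speaker-dependent transformation matrix of the same shape
    lam:  interpolation weight in [0, 1]; 0 keeps W_SI, 1 keeps W_SD
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(w_si) != len(w_sd) or any(
        len(row_si) != len(row_sd) for row_si, row_sd in zip(w_si, w_sd)
    ):
        raise ValueError("matrices must have the same shape")
    return [
        [lam * sd + (1.0 - lam) * si for si, sd in zip(row_si, row_sd)]
        for row_si, row_sd in zip(w_si, w_sd)
    ]
```

A small `lam` would be the cautious choice when little adaptation data is available, since the SD estimate is then the less reliable of the two matrices.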

과학수사용 화자 식별 시스템의 피치 차이에 따른 신뢰성 척도 (Confidence Measure of Forensic Speaker Identification System According to Pitch Variances)

  • 김민석;김경화;양일호;유하진
    • 말소리와 음성과학 / Vol. 2, No. 3 / pp.135-139 / 2010
  • Forensic speaker identification requires high accuracy and reliability. However, the current level of speaker identification does not meet this demand. Therefore, evaluating the confidence of results is one of the issues in forensic speaker identification. In this paper, we propose a new confidence measure for forensic speaker identification systems. It is based on pitch differences between the registered utterances of the identified speaker and the test utterance. In the experiments, we evaluate this confidence measure on speaker identification tasks in various environments. The results show that the proposed measure can be a good indicator of whether an identification result is reliable.

화자 인식을 위한 모음의 포만트 연구 (A Study on Formants of Vowels for Speaker Recognition)

  • 안병섭;신지영;강선미
    • 대한음성학회지:말소리 / No. 51 / pp.1-16 / 2004
  • The aim of this paper is to analyze vowels in voice imitation and disguised voice, and to find the invariable phonetic features of the speaker. In this paper we examined the formants of the monophthongs /a, u, i, o, $\omega$, $\varepsilon$, $\Lambda$/. The results of the present study are as follows: (1) Speakers change their vocal tract features. (2) The vowels /a, $\varepsilon$, i/ appear to be suitable for speaker recognition since they show invariable acoustic features during voice modulation. (3) F1 does not change easily compared to higher formants. (4) F3-F2 appears to be a constituent cue for speaker identification in the vowels /a/ and /$\varepsilon$/, and F4-F2 in the vowel /i/. (5) According to the F-ratio, differences between formants were more useful for speaker recognition than the individual formants of a vowel.

자율형 이동로봇을 위한 전방위 화자 추종 시스템 (Speaker Tracking System for Autonomous Mobile Robot)

  • 이창훈;김용호
    • 대한전기학회:학술대회논문집 / Proceedings of the KIEE 2002 Joint Autumn Conference, Information and Control Section / pp.142-145 / 2002
  • This paper describes an omnidirectional speaker tracking system for a mobile robot interface in a real environment. Its purpose is to robustly detect a sound source over 360 degrees and to recognize voice commands at a long distance (60-300 cm). We consider spatial features, namely the relation between position and interaural time differences, and realize the speaker tracking system using a fuzzy inference process based on inference rules generated from these spatial features.
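
The relation between source position and interaural time difference mentioned in this abstract can be illustrated with the standard far-field formula for a two-microphone pair, theta = arcsin(c * dt / d). This is only the textbook geometry, not the paper's fuzzy inference system; the function name, the two-microphone setup, and the speed of sound of 343 m/s are assumptions.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees Celsius


def azimuth_from_itd(itd_seconds, mic_spacing_m):
    """Estimate the source azimuth in degrees from a time difference of arrival.

    Uses the far-field approximation sin(theta) = c * itd / d, where d is the
    microphone spacing. 0 degrees is straight ahead; the sign follows the sign
    of the time difference.
    """
    if mic_spacing_m <= 0:
        raise ValueError("microphone spacing must be positive")
    ratio = SPEED_OF_SOUND_M_S * itd_seconds / mic_spacing_m
    # Clamp to the valid asin domain to absorb measurement noise and rounding.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

A single microphone pair cannot tell front from back, so covering the full 360 degrees described in the abstract would require additional microphones or further cues.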

동적 시간 신축 알고리즘을 이용한 화자 식별 (Speaker Identification Using Dynamic Time Warping Algorithm)

  • 정승도
    • 한국산학기술학회논문지 / Vol. 12, No. 5 / pp.2402-2409 / 2011
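  • Speech carries speaker-specific acoustic characteristics in addition to the information it is meant to convey. Speaker recognition is the task of determining who is speaking by using the acoustic differences between speakers. It is divided into speaker verification and speaker identification: speaker verification checks whether a single person's voice belongs to the claimed identity, whereas speaker identification finds the most similar model among a number of pre-registered dependent sentences to determine who the person in question is. In this paper, feature vectors are built by extracting MFCCs (Mel Frequency Cepstral Coefficients), and the similarity between features is compared using the Dynamic Time Warping algorithm. Two dependent sentences per speaker are used as training data to describe common features based on phonological properties, so that the same speaker can be identified even when words not stored in the database are used.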
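
The matching step described in this abstract can be sketched with a textbook Dynamic Time Warping distance over sequences of feature vectors. This is a minimal sketch rather than the paper's implementation: MFCC extraction is assumed to have been done elsewhere, the Euclidean frame distance is an assumption, and the function names and the one-template-per-speaker layout are hypothetical.

```python
import math


def frame_distance(u, v):
    """Euclidean distance between two feature vectors (e.g. MFCC frames)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dtw_distance(seq_a, seq_b):
    """Accumulated cost of the best monotonic alignment of two sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] is the best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_b frame j is reused for another seq_a frame
                cost[i][j - 1],      # seq_a frame i is reused for another seq_b frame
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def identify_speaker(test_sequence, templates):
    """Return the name of the template with the smallest DTW distance."""
    return min(
        templates, key=lambda name: dtw_distance(test_sequence, templates[name])
    )
```

Because the alignment may stretch or compress time, two utterances of the same content spoken at different rates can still yield a small distance.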