Speaker Variation in Number Production by Males (남성의 숫자음 발성에 나타난 화자변이)

  • Yang, Byung-Gon
    • Speech Sciences / v.8 no.3 / pp.93-104 / 2001
  • The author used Praat to analyze acoustic parameters of ten Korean numbers produced by ten male students. Variations of f0, F1, F2 and F3 within and between speakers were examined by computing the average and standard deviation of each parameter for each number and by comparing the acoustic values with one another. Results showed that each subject produced the numbers within a certain range of variation across time. Thus, speaker identification can be made more reliable by using the dynamic information of the acoustic parameters within each vocalic segment. In addition, the percentage of within-subject variation relative to between-subject variation can be used to determine which sounds are better stimuli for speaker identification. By this criterion, the number '2' proved the best stimulus while the number '7' was the worst. Future studies will be necessary to explore robust methods of speaker identification under noisy conditions.
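
The within-subject versus between-subject comparison described in this abstract can be sketched as follows. This is a minimal illustration, not the author's procedure: the function name `variation_ratio`, the exact definition of the ratio, and the f0 values are all assumptions made for the example.

```python
from statistics import mean, stdev


def variation_ratio(samples_by_speaker):
    """Within-speaker SD as a percentage of between-speaker SD.

    Assumed definition (not taken from the paper): within-speaker
    variation is the mean of the per-speaker standard deviations, and
    between-speaker variation is the standard deviation of the
    per-speaker means. A lower percentage suggests a better stimulus.
    """
    per_speaker = list(samples_by_speaker.values())
    within = mean([stdev(values) for values in per_speaker])
    between = stdev([mean(values) for values in per_speaker])
    return 100.0 * within / between


# Made-up f0 values (Hz) for one number, three repetitions per speaker.
f0_number_2 = {
    "speaker_a": [118.0, 120.0, 122.0],
    "speaker_b": [138.0, 140.0, 142.0],
    "speaker_c": [158.0, 160.0, 162.0],
}
print(variation_ratio(f0_number_2))  # 10.0 for this made-up data
```

Under this assumed definition, a stimulus whose repetitions are stable within each speaker but well separated across speakers yields a small percentage.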

Performance Enhancement of Speaker Identification in Noisy Environments by Optimization Membership Function Based on Particle Swarm (Particle Swarm 기반 최적화 멤버쉽 함수에 의한 잡음 환경에서의 화자인식 성능향상)

  • Min, So-Hee;Song, Min-Gyu;Na, Seung-You;Kim, Jin-Young
    • Speech Sciences / v.14 no.2 / pp.105-114 / 2007
  • The performance of a speaker identifier is severely degraded in noisy environments. A previous study [1] introduced the concept of observation membership to improve speaker identification on noisy speech. That method scales the observation probabilities of the input speech by observation identification values determined by the SNR, and its authors suggested heuristic parameter values for the membership function. In this paper, we apply particle swarm optimization (PSO) to obtain optimal parameters for speaker identification in noisy environments. Speaker identification experiments using the ETRI database show that the optimization approach yields better performance than the original membership function alone.
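
A generic particle swarm optimizer of the kind this abstract refers to might look like the sketch below. It is not the authors' implementation. The objective here is a toy quadratic standing in for the identification error, and the names `pso` and `toy_error`, the bounds, and the coefficients `w`, `c1`, `c2` are assumptions chosen for illustration.

```python
import random


def pso(objective, bounds, n_particles=20, n_iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position per particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position overall

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_error(params):
    """Hypothetical stand-in for identification error; minimum at (2.0, 0.5)."""
    a, b = params
    return (a - 2.0) ** 2 + (b - 0.5) ** 2


best_params, best_val = pso(toy_error, bounds=[(0.0, 5.0), (0.0, 1.0)])
print(best_params, best_val)
```

In an actual experiment the objective would evaluate the speaker identifier with the candidate membership-function parameters, which is far more expensive than this toy function.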

PCA Covariance Model Based on Multiband for Speaker Verification (화자 확인을 위한 다중대역에 기반한 주성분 분석 공분산 모델)

  • Choi, Min-Jung;Lee, Youn-Jeong;Seo, Chang-Woo
    • Speech Sciences / v.14 no.2 / pp.127-135 / 2007
  • Feature vectors of speech are generally extracted from the whole frequency band. The inherent characteristics of a speaker are located in the low or high frequency bands. However, if the speech is corrupted by narrowband noise with concentrated energy, speaker verification performance is reduced because these individual characteristics are lost. In this paper, we propose a PCA covariance model based on multiple bands to extract feature vectors that are robust against narrowband noise. First, we divide the overall frequency band into several subbands. Second, the correlation of the feature vectors extracted independently from each subband is removed by PCA. The distance obtained from each subband has a different distribution. To normalize these differing distributions, we map the values onto a normalized distribution through a mapping function. Finally, the representative value obtained by applying a weighting function is used for speaker verification. In the experiments, the proposed method shows better speaker verification performance and reduces the computation.

A Frequency Characteristics of the Underwater using moving Coil Type Driver Unit (可動 코일형 Driver Unit 를 이용한 水中擴聲器의 周波數 特性)

  • Lee, Chang-Heon;Seo, Du-Ok;Kim, Byeong-Yeop
    • Journal of the Korean Society of Fisheries and Ocean Technology / v.30 no.1 / pp.25-32 / 1994
  • An underwater speaker was built from the moving-coil driver unit of an ordinary speaker, acrylic boards, polyester resin, rubber, and castor oil, and its frequency characteristics were measured over the range 250~600 Hz in air, in a water tank, and in the sea. The results of the measurements are as follows: 1. The transmitted and received frequencies at each measurement frequency were similar in air, in the water tank, and in the sea. 2. In air, the input and output waveforms of the manufactured speaker without waterproofing were similar to each other at 300~450 Hz, but were distorted at other frequencies. 3. The input and output waveforms of the underwater speaker in the water tank and in the sea were similar to each other at 250~600 Hz, but the output waveforms were combined with very low-frequency waves. 4. The transmitted and received frequency waveforms and the pressure resistance of the underwater speaker were in good condition at a water depth of 80 m. It can therefore be used as an underwater speaker.

A study on User Experience of Artificial Intelligence speaker (인공지능 스피커(AI speaker) 사례 분석을 통한 고찰)

  • Jo, Gyu-Eun;Kim, Seung-In
    • Journal of the Korea Convergence Society / v.9 no.8 / pp.127-133 / 2018
  • The purpose of this study is to analyze the technology trends of artificial intelligence speakers (AI speakers) and to suggest a direction for domestic AI speakers through a case study. As a research method, the technical background was studied through the literature, and then cases of AI speakers were investigated. The results show attempts to extend AI speakers to the visual interface; one such attempt is the attention given to AI speakers with built-in screens. AI speakers should be a platform through which humans and computers interact, not just a convenience. We hope that the implications presented in this study can serve as a reference for predicting the direction of service development for domestic AI speakers.

Experimental study of the sound quality performance and improvement of magnetic fluid speaker (자성유체 스피커의 음질 성능 및 향상에 관한 실험적 연구)

  • Lee, Moo-Yeon
    • Journal of the Korea Academia-Industrial cooperation Society / v.15 no.12 / pp.6993-6997 / 2014
  • The aim of this study was to experimentally investigate the sound quality characteristics of a magnetic type speaker, such as sound deflection, sound pressure level, and frequency characteristics, in an anechoic chamber in order to overcome sound quality and voice-coil temperature problems. To accomplish this, the sound quality performance of the magnetic type speaker was tested for different magnetic fluid amounts and magnetic field intensities. The sound deflection, sound pressure level, and frequency characteristics were measured using the Smarrt program. As a result, at a magnetic fluid amount of 2.4 ml, the sound deflection and the sound pressure level of the magnetic type speaker were improved compared with those of the general type speaker. The frequency characteristics and the sound pressure level of the magnetic type speaker improved greatly as the magnetic field intensity increased from 8.06 mT to 9.10 mT. In addition, the sound deflection of the magnetic type speaker was 0.01% lower than that of the general type speaker.

Quantization Based Speaker Normalization for DHMM Speech Recognition System (DHMM 음성 인식 시스템을 위한 양자화 기반의 화자 정규화)

  • 신옥근
    • The Journal of the Acoustical Society of Korea / v.22 no.4 / pp.299-307 / 2003
  • There have been many studies on speaker normalization, which aims to minimize the effect of a speaker's vocal tract length on the recognition performance of a speaker-independent speech recognition system. In this paper, we propose a simple vector-quantizer-based linear warping method for speaker normalization, motivated by the observation that a vector quantizer can be used successfully for speaker verification. For this purpose, we first generate an optimal codebook to serve as the basis of the speaker normalization, and then extract the warping factor of the unknown speaker by comparing the feature vectors with the codebook. Finally, the extracted warping factor is used to linearly warp the Mel-scale filter bank used in the MFCC calculation. To test the performance of the proposed method, a series of recognition experiments was conducted on a discrete HMM with thirteen monosyllabic Korean number utterances. The results showed that the word error rate can be reduced by about 29%, and that the proposed warping factor extraction method is useful because it is simpler than other line-search warping methods.
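
The final step of this abstract, linearly warping a Mel-scale filter bank by a warping factor, can be sketched as below. This is only an illustration under assumptions: the function `warped_centers`, the filter count, the frequency range, and the clipping at the upper band edge are not taken from the paper, and the codebook-based extraction of the warping factor is not shown. The Hz-to-Mel conversion uses the common 2595·log10(1 + f/700) formula.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to the Mel scale (common 2595/700 formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def warped_centers(n_filters, f_low, f_high, alpha):
    """Center frequencies (Hz) of a Mel filter bank, scaled linearly by alpha.

    Centers are spaced evenly on the Mel scale between f_low and f_high,
    then multiplied by the warping factor alpha and clipped at f_high
    (the clipping is an assumption made for this sketch).
    """
    mel_lo, mel_hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (mel_hi - mel_lo) / (n_filters + 1)
    centers = [mel_to_hz(mel_lo + step * (k + 1)) for k in range(n_filters)]
    return [min(alpha * c, f_high) for c in centers]


# Hypothetical settings: 5 filters over 0-4000 Hz, unwarped and warped by 1.1.
print(warped_centers(5, 0.0, 4000.0, 1.0))
print(warped_centers(5, 0.0, 4000.0, 1.1))
```

A warping factor of 1.0 leaves the filter bank unchanged; values above or below 1.0 stretch or compress the frequency axis.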

Numerical Analysis on Temperature Characteristics of the Voice-Coil for Woofer Speaker Using Ferrofluid (자성유체를 이용한 우퍼 스피커의 보이스 코일 온도 특성에 관한 수치적 연구)

  • Lee, Moo-Yeon;Kim, Hyung-Jin;Lee, Woo-Young
    • Journal of the Korean Magnetics Society / v.23 no.5 / pp.166-172 / 2013
  • This article numerically investigates how the temperature and heat transfer characteristics of the voice coil in a woofer speaker using ferrofluid vary with the input current. The temperature and heat transfer of the major components of woofer speakers with and without ferrofluid are calculated and analyzed as the input current increases from 10 W to 50 W in steps of 10 W. The results show that the temperature of the voice coil increases linearly with the input current. At an input current of 40 W, the temperature of the woofer speaker with ferrofluid is 51.0% lower than that of the woofer speaker without ferrofluid, and at a voice coil temperature of 490 K, the required input current of the woofer speaker with ferrofluid is 42.5% lower than that of the woofer speaker without ferrofluid. In addition, the heat transfer from the voice coil to the other components is 51.7% higher for the woofer speaker with ferrofluid than for the one without.

A study on speech disentanglement framework based on adversarial learning for speaker recognition (화자 인식을 위한 적대학습 기반 음성 분리 프레임워크에 대한 연구)

  • Kwon, Yoohwan;Chung, Soo-Whan;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea / v.39 no.5 / pp.447-453 / 2020
  • In this paper, we propose a system to extract effective speaker representations from a speech signal using a deep learning method. Because a speech signal contains identity-unrelated information such as text content, emotion, and background noise, we train the system so that the extracted features represent only speaker-related information and not speaker-unrelated information. Specifically, we propose an auto-encoder-based disentanglement method that outputs both speaker-related and speaker-unrelated embeddings using effective loss functions. To further improve the reconstruction performance in the decoding process, we also introduce a discriminator of the kind widely used in Generative Adversarial Network (GAN) structures. Since improving the decoding capability helps preserve speaker information and supports disentanglement, it improves speaker verification performance. Experimental results demonstrate the effectiveness of the proposed method through an improved Equal Error Rate (EER) on the benchmark dataset VoxCeleb1.
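
For readers unfamiliar with the metric, the Equal Error Rate reported in this abstract can be approximated from verification scores with a simple threshold sweep, as sketched below. This is a generic illustration and not the authors' evaluation code; the function name `equal_error_rate` and the scores are made up for the example.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    A trial is accepted when its score is >= the threshold. The EER is
    taken at the threshold where the false acceptance rate (FAR) and the
    false rejection rate (FRR) are closest, as the average of the two.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, best_eer = None, None
    for t in thresholds:
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer


# Made-up similarity scores for same-speaker and different-speaker trials.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.6]
print(equal_error_rate(genuine, impostor))  # 0.25 for these made-up scores
```

With these scores, one impostor trial scores above one genuine trial, so at the balancing threshold a quarter of each group is misclassified.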