• Title/Summary/Keyword: Background Speaker

Search Result 70, Processing Time 0.026 seconds

A Research on the Vibration Characteristics of Vehicle due to Speaker Sound at Low Frequency (저주파 스피커 출력음 대비 차량 진동 특성 연구)

  • Kim, Ki-Chang;Kim, Chan-Mook
    • Transactions of the Korean Society for Noise and Vibration Engineering
    • /
    • v.17 no.8
    • /
    • pp.673-682
    • /
    • 2007
  • Recently the trend of automobile industry is that IQS evaluation index against a sensitivity quality is increasing. To reduce rattle noise due to speaker sound at low frequencies, it is required the advanced technology analysis process of body structure. This paper optimized the design parameters of package tray panel according to the theoretical background about robust design and suggested the design guideline for resonance avoidance and the reduction of vibrational sensitivity considering the excitation frequency of woofer speaker. And this paper described the design process of a door module panel through the sensitivity analysis in case of the door speaker excitation. Finally, the analysis of the quality deviation using mother car is suggested to guarantee the stable characteristics of vehicle vibration in the early stage of vehicle development. These improvements can lead to shortening the time needed to develop better vehicles.

An Improvement of the MLP Based Speaker Verification System through Improving the learning Speed and Reducing the Learning Data (학습속도 개선과 학습데이터 축소를 통한 MLP 기반 화자증명 시스템의 등록속도 향상방법)

  • Lee, Baek-Yeong;Lee, Tae-Seung;Hwang, Byeong-Won
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.39 no.3
    • /
    • pp.88-98
    • /
    • 2002
  • The multilayer perceptron (MLP) has several advantages against other pattern recognition methods, and is expected to be used as the learning and recognizing speakers of speaker verification system. But because of the low learning speed of the error backpropagation (EBP) algorithm that is used for the MLP learning, the MLP learning requires considerable time. Because the speaker verification system must provide verification services just after a speaker's enrollment, it is required to solve the problem. So, this paper tries to make short of time required to enroll speakers with the MLP based speaker verification system, using the method of improving the EBP learning speed and the method of reducing background speakers which adopts the cohort speakers method from the existing speaker verification.

Speaker Detection System for Video Conference (영상회의를 위한 화자 검출 시스템)

  • Lee, Byung-Sun;Ko, Sung-Won;Kwon, Heak-Bong
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers
    • /
    • v.17 no.5
    • /
    • pp.68-79
    • /
    • 2003
  • In this paper, we propose a system that detects the current speaker in multi-speaker video conference by using lip motion. First, the system detects the face and lip area of each of the speakers using face color and shape information. Then, to detect the current speaker, it calculates the change between the current frame and the previous frame. To accomplish this, we used two CCD cameras. One is a general CCD camera, the other is a PTZ camera controlled by RS-232C serial port. The result is a system capable of detecting the face of current speaker in a video feed with more than three people, regardless of orientation of the faces. With this system, it only takes 4 to 5 seconds to zoom in on the speaker from the initial image. Also, it is amore efficient image transmission system for such things as video conference and internet broadcasting because it offers a face area screen at a resolution of 320X240, while at the same time providing a whole background screen.

Histogram Equalization Using Centroids of Fuzzy C-Means of Background Speakers' Utterances for Majority Voting Based Speaker Identification (다수 투표 기반의 화자 식별을 위한 배경 화자 데이터의 퍼지 C-Means 중심을 이용한 히스토그램 등화기법)

  • Kim, Myung-Jae;Yang, Il-Ho;Yu, Ha-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.33 no.1
    • /
    • pp.68-74
    • /
    • 2014
  • In a previous work, we proposed a novel approach of histogram equalization using a supplement set which is composed of centroids of Fuzzy C-Means of the background utterances. The performance of the proposed method is affected by the size of the supplement set, but it is difficult to find the best size at the point of recognition. In this paper, we propose a histogram equalization using a supplement set for majority voting based speaker identification. The proposed method identifies test utterances using a majority voting on the histogram equalization methods with various sizes of supplement sets. The proposed method is compared with the conventional feature normalization methods such as CMN(Cepstral Mean Normalization), MVN(Mean and Variance Normalization), and HEQ(Histogram Equalization) and the histogram equalization method using a supplement set.

An Improvement of Korean Speech Recognition Using a Compensation of the Speaking Rate by the Ratio of a Vowel length (모음길이 비율에 따른 발화속도 보상을 이용한 한국어 음성인식 성능향상)

  • 박준배;김태준;최성용;이정현
    • Proceedings of the IEEK Conference
    • /
    • 2003.11b
    • /
    • pp.195-198
    • /
    • 2003
  • The accuracy of automatic speech recognition system depends on the presence of background noise and speaker variability such as sex, intonation of speech, and speaking rate. Specially, the speaking rate of both inter-speaker and intra-speaker is a serious cause of mis-recognition. In this paper, we propose the compensation method of the speaking rate by the ratio of each vowel's length in a phrase. First the number of feature vectors in a phrase is estimated by the information of speaking rate. Second, the estimated number of feature vectors is assigned to each syllable of the phrase according to the ratio of its vowel length. Finally, the process of feature vector extraction is operated by the number that assigned to each syllable in the phrase. As a result the accuracy of automatic speech recognition was improved using the proposed compensation method of the speaking rate.

  • PDF

A Blind Segmentation Algorithm for Speaker Verification System (화자확인 시스템을 위한 분절 알고리즘)

  • 김지운;김유진;민홍기;정재호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.3
    • /
    • pp.45-50
    • /
    • 2000
  • This paper proposes a delta energy method based on Parameter Filtering(PF), which is a speech segmentation algorithm for text dependent speaker verification system over telephone line. Our parametric filter bank adopts a variable bandwidth along with a fixed center frequency. Comparing with other methods, the proposed method turns out very robust to channel noise and background noise. Using this method, we segment an utterance into consecutive subword units, and make models using each subword nit. In terms of EER, the speaker verification system based on whole word model represents 6.1%, whereas the speaker verification system based on subword model represents 4.0%, improving about 2% in EER.

  • PDF

A Study on the Fast Enrollment of Text-Independent Speaker Verification for Vehicle Security (차량 보안을 위한 어구독립 화자증명의 등록시간 단축에 관한 연구)

  • Lee, Tae-Seung;Choi, Ho-Jin
    • Journal of Advanced Navigation Technology
    • /
    • v.5 no.1
    • /
    • pp.1-10
    • /
    • 2001
  • Speech has a good characteristics of which car drivers busy to concern with miscellaneous operation can make use in convenient handling and manipulating of devices. By utilizing this, this works proposes a speaker verification method for protecting cars from being stolen and identifying a person trying to access critical on-line services. In this, continuant phonemes recognition which uses language information of speech and MLP(mult-layer perceptron) which has some advantages against previous stochastic methods are adopted. The recognition method, though, involves huge computation amount for learning, so it is somewhat difficult to adopt this in speaker verification application in which speakers should enroll themselves at real time. To relieve this problem, this works presents a solution that introduces speaker cohort models from speaker verification score normalization technique established before, dividing background speakers into small cohorts in advance. As a result, this enables computation burden to be reduced through classifying the enrolling speaker into one of those cohorts and going through enrollment for only that cohort.

  • PDF

The Study on the Verification of Speaker Change using GMM-UBM based KL distance (GMM-UBM 기반 KL 거리를 활용한 화자변화 검증에 대한 연구)

  • Cho, Joon-Beom;Lee, Ji-eun;Lee, Kyong-Rok
    • Journal of Convergence Society for SMB
    • /
    • v.6 no.4
    • /
    • pp.71-77
    • /
    • 2016
  • In this paper, we proposed a verification of speaker change utilizing the KL distance based on GMM-UBM to improve the performance of conventional BIC based Speaker Change Detection(SCD). We have verified Conventional BIC-based SCD using KL-distance based SCD which is robust against difference of information volume than BIC-based SCD. And we have applied GMM-UBM to compensate asymmetric information volume. Conventional BIC-based SCD was composed of two steps. Step 1, to detect the Speaker Change Candidate Point(SCCP). SCCP is positive local maximum point of dissimilarity d. Step 2, to determine the Speaker Change Point(SCP). If ${\Delta}BIC$ of SCCP is positive, it decides to SCP. We examined verification of SCP using GMM-UBM based KL distance D. If the value of D on each SCP is higher than threshold, we accepted that point to the final SCP. In the experimental condition MDR(Missed Detection Rate) is 0, FAR(False Alarm Rate) when the threshold value of 0.028 has been improved to 60.7%.

Speaker verification with ECAPA-TDNN trained on new dataset combined with Voxceleb and Korean (Voxceleb과 한국어를 결합한 새로운 데이터셋으로 학습된 ECAPA-TDNN을 활용한 화자 검증)

  • Keumjae Yoon;Soyoung Park
    • The Korean Journal of Applied Statistics
    • /
    • v.37 no.2
    • /
    • pp.209-224
    • /
    • 2024
  • Speaker verification is becoming popular as a method of non-face-to-face identity authentication. It involves determining whether two voice data belong to the same speaker. In cases where the criminal's voice remains at the crime scene, it is vital to establish a speaker verification system that can accurately compare the two voice evidence. In this study, to achieve this, a new speaker verification system was built using a deep learning model for Korean language. High-dimensional voice data with a high variability like background noise made it necessary to use deep learning-based methods for speaker matching. To construct the matching algorithm, the ECAPA-TDNN model, known as the most famous deep learning system for speaker verification, was selected. A large dataset of the voice data, Voxceleb, collected from people of various nationalities without Korean. To study the appropriate form of datasets necessary for learning the Korean language, experiments were carried out to find out how Korean voice data affects the matching performance. The results showed that when comparing models learned only with Voxceleb and models learned with datasets combining Voxceleb and Korean datasets to maximize language and speaker diversity, the performance of learning data, including Korean, is improved for all test sets.

RPCA-GMM for Speaker Identification (화자식별을 위한 강인한 주성분 분석 가우시안 혼합 모델)

  • 이윤정;서창우;강상기;이기용
    • The Journal of the Acoustical Society of Korea
    • /
    • v.22 no.7
    • /
    • pp.519-527
    • /
    • 2003
  • Speech is much influenced by the existence of outliers which are introduced by such an unexpected happenings as additive background noise, change of speaker's utterance pattern and voice detection errors. These kinds of outliers may result in severe degradation of speaker recognition performance. In this paper, we proposed the GMM based on robust principal component analysis (RPCA-GMM) using M-estimation to solve the problems of both ouliers and high dimensionality of training feature vectors in speaker identification. Firstly, a new feature vector with reduced dimension is obtained by robust PCA obtained from M-estimation. The robust PCA transforms the original dimensional feature vector onto the reduced dimensional linear subspace that is spanned by the leading eigenvectors of the covariance matrix of feature vector. Secondly, the GMM with diagonal covariance matrix is obtained from these transformed feature vectors. We peformed speaker identification experiments to show the effectiveness of the proposed method. We compared the proposed method (RPCA-GMM) with transformed feature vectors to the PCA and the conventional GMM with diagonal matrix. Whenever the portion of outliers increases by every 2%, the proposed method maintains almost same speaker identification rate with 0.03% of little degradation, while the conventional GMM and the PCA shows much degradation of that by 0.65% and 0.55%, respectively This means that our method is more robust to the existence of outlier.