• Title/Summary/Keyword: Equal Error Rate


Calculation of a Threshold for Decision of Similar Features in Different Spatial Data Sets (이종의 공간 데이터 셋에서 매칭 객체 판별을 위한 임계값 산출)

  • Kim, Jiyoung;Huh, Yong;Yu, Kiyun;Kim, Jung Ok
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography, v.31 no.1, pp.23-28, 2013
  • Matching features between two different spatial data sets resembles binary classification: each candidate pair is either a match or a non-match. In this paper, we calculate a matching threshold by applying the equal error rate (EER), which is widely used in biometrics, where classification is a central topic, to spatial data sets. As the threshold is varied progressively, the number of pairs declared as matches changes, so precision and recall change and a trade-off appears between the two indexes. The trade-off point is the EER, and the corresponding value is taken as the threshold. Applying this method to training data yields an estimated threshold of 0.802 in shape similarity. Applying the estimated threshold to test data gives a high F-measure, the evaluation index of the matching method, of 0.940. We therefore confirm that EER yields an accurate threshold without human intervention and that it is appropriate for matching different spatial data sets.
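
The threshold search described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy similarity scores, and the rule of picking the candidate threshold where precision and recall are closest are all assumptions made for the example.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when pairs scoring >= threshold are declared matches."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def balance_threshold(scores, labels):
    """Return the observed score at which |precision - recall| is smallest."""
    def gap(threshold):
        precision, recall = precision_recall(scores, labels, threshold)
        return abs(precision - recall)

    # Only observed scores are tried as candidate thresholds (an assumption).
    return min(sorted(set(scores)), key=gap)


# Hypothetical shape-similarity scores; 1 marks a true matching pair.
toy_scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.60, 0.50]
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
print(balance_threshold(toy_scores, toy_labels))
```

On real data the two curves rarely cross exactly at an observed score, so interpolating between neighbouring thresholds would be a natural refinement.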

An SVM-based Face Verification System Using Multiple Feature Combination and Similarity Space (다중 특징 결합과 유사도 공간을 이용한 SVM 기반 얼굴 검증 시스템)

  • 김도형;윤호섭;이재연
    • Journal of KIISE:Software and Applications, v.31 no.6, pp.808-816, 2004
  • This paper proposes a method for implementing a practical online face verification system based on multiple feature combination and a similarity space. The main issue in face verification is dealing with variability in appearance, which seems difficult to solve using a single feature. A combination of mutually complementary features is therefore necessary to cope with various changes in appearance. From this point of view, we describe feature extraction approaches based on multiple principal component analysis and edge distribution. These features are projected onto a new intra-person/extra-person similarity space that consists of several simple similarity measures, and are finally evaluated by a support vector machine. In experiments on a large, realistic database, an equal error rate of 0.029 is achieved, which is a sufficiently practical level for many real-world applications.

Speaker verification system combining attention-long short term memory based speaker embedding and I-vector in far-field and noisy environments (Attention-long short term memory 기반의 화자 임베딩과 I-vector를 결합한 원거리 및 잡음 환경에서의 화자 검증 알고리즘)

  • Bae, Ara;Kim, Wooil
    • The Journal of the Acoustical Society of Korea, v.39 no.2, pp.137-142, 2020
  • Many studies based on the I-vector have been conducted in a variety of environments, from text-dependent short utterances to text-independent long utterances. In this paper, we propose a speaker verification system for far-field and noisy environments that combines an I-vector with Probabilistic Linear Discriminant Analysis (PLDA) and a speaker embedding from a Long Short Term Memory (LSTM) network with an attention mechanism. The Equal Error Rate (EER) of the LSTM model is 15.52 % and that of the Attention-LSTM model is 8.46 %, an improvement of 7.06 percentage points. We show that the proposed method solves the problem of the existing extraction process, which defines the embedding heuristically. The EER of the I-vector/PLDA system without combination is 6.18 %, the best performance among the uncombined systems. Combined with the attention-LSTM based embedding, the EER is 2.57 %, which is 3.61 percentage points lower than the baseline system, a relative improvement of 58.41 %.
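
The abstract reports both absolute and relative EER reductions, which are easy to confuse. The short check below recomputes them from the EER values quoted in the abstract; the helper names are assumptions for illustration.

```python
def absolute_reduction(baseline, improved):
    """Difference between two EERs, in percentage points."""
    return baseline - improved


def relative_reduction(baseline, improved):
    """Reduction expressed as a percentage of the baseline EER."""
    return (baseline - improved) / baseline * 100.0


# EER values (in %) as quoted in the abstract.
lstm, attention_lstm = 15.52, 8.46
ivector_plda, combined = 6.18, 2.57

print(round(absolute_reduction(lstm, attention_lstm), 2))    # 7.06
print(round(absolute_reduction(ivector_plda, combined), 2))  # 3.61
print(round(relative_reduction(ivector_plda, combined), 2))  # 58.41
```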

A New Test Data Generation for SPRT in Speaker Verification (화자 확인에서 SPRT를 위한 새로운 테스트 데이터 생성)

  • 서창우;이기용
    • The Journal of the Acoustical Society of Korea, v.22 no.1, pp.42-47, 2003
  • This paper proposes a method to generate new test data by shifting the start frame of the samples for the SPRT (sequential probability ratio test) in speaker verification. The SPRT is an effective algorithm that can reduce the computational complexity of testing. However, its decision procedure assumes that the input samples are i.i.d. (independent and identically distributed) samples from a probability density function (pdf), and it is not suitable for short utterances. The proposed method can apply the SPRT regardless of the utterance length of the test data, because it generates new test data by shifting the start frame. In addition, the correlation in the data, which must be considered in the SPRT, can be effectively removed by employing principal component analysis. Experimental results show that the proposed method slightly increases the computational complexity because of the sample shift, but outperforms the conventional method by more than 0.7 % in EER (equal error rate) on average.
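
For readers unfamiliar with the SPRT, the sketch below shows the standard Wald decision rule on a stream of per-frame log-likelihood ratios. It does not reproduce the paper's start-frame shifting or PCA step; the function name, the error rates `alpha` and `beta`, and the assumption that per-frame ratios are already available and i.i.d. are all illustrative.

```python
import math


def sprt(llr_per_frame, alpha=0.01, beta=0.01):
    """Wald's SPRT: stop as soon as the accumulated log-likelihood ratio
    crosses either decision boundary.

    alpha is the target false-accept rate and beta the target false-reject
    rate (hypothetical defaults chosen for this example).
    """
    upper = math.log((1 - beta) / alpha)   # crossing it accepts the claimed speaker
    lower = math.log(beta / (1 - alpha))   # crossing it rejects the claimed speaker
    total = 0.0
    for n, llr in enumerate(llr_per_frame, start=1):
        total += llr
        if total >= upper:
            return "accept", n
        if total <= lower:
            return "reject", n
    # Ran out of frames before reaching a boundary.
    return "undecided", len(llr_per_frame)


print(sprt([1.0] * 10))   # strong evidence for the claimed speaker
print(sprt([0.1] * 5))    # too few frames to decide
```

The "undecided" outcome is what makes short utterances a problem for the plain SPRT, which is the limitation the abstract addresses.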

A study on speech disentanglement framework based on adversarial learning for speaker recognition (화자 인식을 위한 적대학습 기반 음성 분리 프레임워크에 대한 연구)

  • Kwon, Yoohwan;Chung, Soo-Whan;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea, v.39 no.5, pp.447-453, 2020
  • In this paper, we propose a system that extracts effective speaker representations from a speech signal using deep learning. Because a speech signal contains identity-unrelated information such as text content, emotion, and background noise, we train the model so that the extracted features represent only speaker-related information and not speaker-unrelated information. Specifically, we propose an auto-encoder based disentanglement method that outputs both speaker-related and speaker-unrelated embeddings using effective loss functions. To further improve reconstruction performance in the decoding process, we also introduce a discriminator of the kind widely used in Generative Adversarial Network (GAN) structures. Since improving the decoding capability helps preserve speaker information and supports disentanglement, it improves speaker verification performance. Experimental results demonstrate the effectiveness of the proposed method through an improved Equal Error Rate (EER) on the VoxCeleb1 benchmark dataset.

Text Independent Speaker Verification Using Dominant State Information of HMM-UBM (HMM-UBM의 주 상태 정보를 이용한 음성 기반 문맥 독립 화자 검증)

  • Shon, Suwon;Rho, Jinsang;Kim, Sung Soo;Lee, Jae-Won;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea, v.34 no.2, pp.171-176, 2015
  • We present a speaker verification method that extracts i-vectors based on the dominant state information of a Hidden Markov Model (HMM) - Universal Background Model (UBM). An ergodic HMM is used to estimate the UBM so that the various characteristics of individual speakers can be effectively classified. Unlike a Gaussian Mixture Model (GMM)-UBM based speaker verification system, the proposed system obtains an i-vector for each HMM state. Among them, the i-vector used as the feature is extracted from the specific state that contains the dominant state information. Experiments validating the proposed system are conducted using the National Institute of Standards and Technology (NIST) 2008 Speaker Recognition Evaluation (SRE) database. As a result, a 12 % improvement is attained in terms of equal error rate.

Performance Improvement analysis of Acoustic Communication System using Receive Diversity (수신 다이버시티를 이용한 음향 통신 시스템의 성능 향상 분석)

  • Bok, Jun-Yeong;Ryu, Heung-Gyoon
    • The Journal of Korean Institute of Communications and Information Sciences, v.36 no.3A, pp.198-204, 2011
  • An acoustic communication system is a transmission technology that sends sound and data simultaneously. However, the data signal can become audible when data is transmitted with high transmission power, and the more the transmission power is reduced, the shorter the distance over which data can be transmitted. A study on increasing the transmission distance is therefore needed. In this paper, we aim to increase the transmission distance by adopting receive diversity in an acoustic communication system. We measure the reception performance of the proposed system and of a Single Input Single Output (SISO) system as a function of distance at the same transmission power. Whereas SISO satisfies a Bit Error Rate (BER) of $7 \times 10^{-3}$ at about 2 m, the Selection Combining (SC) technique satisfies it at 2 meters and the Equal Gain Combining (EGC) technique at 4 meters.
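
The two diversity schemes compared in this abstract differ in how they use the receive branches. The sketch below illustrates the textbook definitions for a single noise-free symbol; it assumes a complex baseband model with known channel gains, which the abstract does not specify, and the function names and toy gains are hypothetical.

```python
def selection_combining(received, gains):
    """Use only the branch with the largest channel magnitude, phase-corrected."""
    best = max(range(len(gains)), key=lambda i: abs(gains[i]))
    h = gains[best]
    return received[best] * (h.conjugate() / abs(h))


def equal_gain_combining(received, gains):
    """Phase-correct every branch and add them with equal (unit) weights."""
    return sum(r * (h.conjugate() / abs(h)) for r, h in zip(received, gains))


# Toy example: one transmitted symbol seen through two branches (no noise).
symbol = 1 + 0j
toy_gains = [0.5j, -0.8 + 0j]             # hypothetical channel gains
toy_received = [h * symbol for h in toy_gains]

print(abs(selection_combining(toy_received, toy_gains)))   # amplitude of the best branch only
print(abs(equal_gain_combining(toy_received, toy_gains)))  # sum of both branch amplitudes
```

Because EGC adds the amplitudes of all branches coherently, its combined signal is at least as strong as the single branch SC selects, which is consistent with the longer range reported for EGC.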

Face Recognition Network using gradCAM (gradCam을 사용한 얼굴인식 신경망)

  • Chan Hyung Baek;Kwon Jihun;Ho Yub Jung
    • Smart Media Journal, v.12 no.2, pp.9-14, 2023
  • In this paper, we propose a face recognition network that attempts to use more facial features while using a smaller number of training sets. When combining neural networks for face recognition, we want to use networks that rely on different parts of the facial features. However, network training chooses at random where these facial features are obtained. On the other hand, the basis of a network model's judgment can be expressed as a saliency map through gradCAM. Therefore, we use gradCAM to visualize where the trained face recognition model makes its observations and recognition judgments, so that the network combination can be constructed based on the different facial features used. Using this approach, we trained a network for a small face recognition problem. In a simple toy face recognition example, the recognition network used in this paper improves accuracy by 1.79% and reduces the equal error rate (EER) by 0.01788 compared to the conventional approach.

Experimental analysis of very long range spread spectrum underwater acoustic communication using vertical sensor array (수직 배열 센서를 이용한 초장거리 대역확산 수중음향통신의 실험 분석)

  • Youn, Chang-hyun;Ra, Hyung-in;An, Jeong-ha;Kim, Ki-man;Kim, In-soo
    • The Journal of the Acoustical Society of Korea, v.41 no.2, pp.150-158, 2022
  • This paper presents the results of a sea trial of very long range spread spectrum underwater acoustic communication conducted in the East Sea in September 2021. Signals were collected through 8 vertical sensors, and the range between the transmitter and receiver was about 160 km. A 30 bps Multi-Code Spread Spectrum (MCSS) method and a 100 bps Chirp Spread Spectrum method were used to generate the transmitted signals. The results show that, when no channel coding was used on a single channel, the uncoded bit error rate was high, but when the Equal Gain Combining (EGC) diversity technique was applied after frame synchronization in each receiving channel, the uncoded bit error rate was reduced to 0.1 or less.

Wavelet-based Feature Extraction Algorithm for an Iris Recognition System

  • Panganiban, Ayra;Linsangan, Noel;Caluyo, Felicito
    • Journal of Information Processing Systems, v.7 no.3, pp.425-434, 2011
  • The success of iris recognition depends mainly on two factors: image acquisition and the iris recognition algorithm. In this study, we present a system that considers both factors and focuses on the latter. The proposed algorithm aims to find the most efficient wavelet family and its coefficients for encoding the iris template of the experiment samples. The algorithm, implemented in software, performs segmentation, normalization, feature encoding, data storage, and matching. Feature encoding is performed by decomposing the normalized iris image using the Haar and Biorthogonal wavelet families at various levels. The vertical coefficient is encoded into the iris template and stored in the database. The performance of the system is evaluated using the number of degrees of freedom, False Reject Rate (FRR), False Accept Rate (FAR), and Equal Error Rate (EER), and the metrics show that the proposed algorithm can be employed for an iris recognition system.
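
The FAR, FRR, and EER metrics named in this abstract can be computed from two lists of match distances. The sketch below assumes distance scores where lower means more similar (as with Hamming distance between iris codes) and picks the observed threshold where FAR and FRR are closest; the function names and toy distances are assumptions, not the paper's data.

```python
def far_frr(genuine, impostor, threshold):
    """FAR: share of impostor comparisons accepted (distance <= threshold).
    FRR: share of genuine comparisons rejected (distance > threshold)."""
    far = sum(d <= threshold for d in impostor) / len(impostor)
    frr = sum(d > threshold for d in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Return (threshold, EER) at the observed distance where FAR and FRR are closest."""
    def gap(threshold):
        far, frr = far_frr(genuine, impostor, threshold)
        return abs(far - frr)

    best = min(sorted(set(genuine) | set(impostor)), key=gap)
    far, frr = far_frr(genuine, impostor, best)
    # Averaging covers the case where the two rates do not meet exactly.
    return best, (far + frr) / 2


# Hypothetical distances: same-eye comparisons vs. different-eye comparisons.
toy_genuine = [0.20, 0.25, 0.30, 0.45]
toy_impostor = [0.35, 0.48, 0.50, 0.52]
print(equal_error_rate(toy_genuine, toy_impostor))
```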