Integrated Search | Korea Science

A Robust Method for Speech Replay Attack Detection

Lin, Lang;Wang, Rangding;Yan, Diqun;Dong, Li
- KSII Transactions on Internet and Information Systems (TIIS), Vol. 14, No. 1, pp. 168-182, 2020
Spoofing attacks, especially replay attacks, pose great security challenges to automatic speaker verification (ASV) systems. Current work on replay attack detection has focused primarily on either developing new features or improving classifier performance, ignoring the effects of feature variability such as channel variability. In this paper, we first establish a mathematical model for replay speech and introduce a method for eliminating the negative interference of the channel. Then a novel feature is proposed to detect replay attacks. To further boost detection performance, four post-processing methods using normalization techniques are investigated. We evaluate our proposed method on the ASVspoof 2017 dataset. The experimental results show that our approach outperforms the competing methods in terms of detection accuracy. More interestingly, we find that the proposed normalization strategy can also improve the performance of existing algorithms.
https://doi.org/10.3837/tiis.2020.01.010

The Effects of the Methods of Disguised Voice on the Aural Decision

송민창;신지영;강선미
- Malsori (Journal of the Korean Society of Phonetic Sciences), No. 46, pp. 25-35, 2003
This study deals with disguised voice (or voice disguise) in the field of forensic phonetics, focusing on how the method of disguise affects aural decisions. Within nonelectronic, deliberate voice disguise, the methods include lowered pitch, pinched nostrils, falsetto, and whisper. Ten Seoul speakers (5 male, 5 female) recorded 16 sentences. In the aural test, 30 subjects listened to normal and disguised voices and were asked to decide whether they could identify the speakers. The results show that speaker verification was more difficult for falsetto and whisper than for lowered pitch and pinched nostrils.

A Study on the Vowel Formants in Disguised Speech

노석은;박미경;조민하;신지영;강선미
- Proceedings of the Korean Society of Phonetic Sciences Conference, 2004 Spring Conference, pp. 215-218, 2004
The aim of this paper is to analyze the acoustic features of disguised voice. We examined features such as pitch range and vowel formants (F1, F2, F3, F4). The results of the analysis are as follows: (1) Pitch range and average pitch value are very important cues for speaker verification. (2) F3-F2 is also an important cue for speaker verification. (3) Speakers are verified more readily on /a/ than on other vowels.

Covariance Model Based on Multi-Band for Speaker Verification in Noise

최민정;이기용
- Proceedings of the Acoustical Society of Korea Conference, 2004 Autumn Conference, Vol. 23, No. 2, pp. 127-130, 2004
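Conventional speaker verification systems that extract feature parameters from the full band tend to lose speaker-specific information in the low or high bands. Moreover, when part of the frequency spectrum is corrupted, the feature parameters are distorted and verification performance degrades. To address these problems, this paper proposes a multi-band covariance model. The proposed method divides the full band into several sub-bands in the frequency domain, extracts feature parameters independently for each sub-band, and derives a covariance model from them. To evaluate the method, speaker verification experiments were carried out by measuring the distance between covariance models. In a noisy environment, the proposed method improved performance by about 2% over the conventional full-band covariance model. In addition, because the proposed method divides the full-band parameter dimensionality among the sub-bands, it reduces computation and is efficient in terms of storage.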
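The multi-band idea in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the equal-width band split, the use of raw spectral bins as features, and the squared Frobenius distance between covariance matrices are all choices made here for brevity, and every function name is hypothetical.

```python
def split_bands(frame, n_bands):
    """Split one spectral frame into n_bands contiguous, equal-width sub-bands.

    Assumption: bins that do not fill a whole band are dropped.
    """
    width = len(frame) // n_bands
    return [frame[b * width:(b + 1) * width] for b in range(n_bands)]


def covariance(vectors):
    """Sample covariance matrix (list of lists) of equal-length vectors."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(d)]
    return [
        [sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]


def multiband_model(frames, n_bands):
    """One covariance matrix per sub-band, each estimated independently."""
    per_frame = [split_bands(f, n_bands) for f in frames]
    return [covariance([bands[b] for bands in per_frame])
            for b in range(n_bands)]


def model_distance(model_a, model_b):
    """Sum over bands of the squared Frobenius distance (an assumed metric)."""
    return sum(
        (x - y) ** 2
        for cov_a, cov_b in zip(model_a, model_b)
        for row_a, row_b in zip(cov_a, cov_b)
        for x, y in zip(row_a, row_b)
    )


if __name__ == "__main__":
    enrolled = multiband_model([[1, 2, 3, 4], [2, 4, 3, 5], [3, 6, 3, 6]], 2)
    claimed = multiband_model([[1, 2, 3, 4], [2, 5, 3, 5], [3, 6, 4, 6]], 2)
    print(model_distance(enrolled, claimed))
```

With a d-dimensional full-band feature, one full covariance matrix holds d × d entries, whereas B sub-band matrices hold B × (d/B)² entries in total, which is the source of the storage and computation savings the abstract mentions.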

The Hardware Implementation of Speaker Verification System Using Support Vector Machine

황병희;최우용;문대성;반성범;정용화;정상화
- Proceedings of the Korea Information Processing Society Conference, 2003 Spring Conference (Part 2), pp. 1933-1936, 2003
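Interest in speaker verification, which authenticates users by their voice, has been growing recently, and among the various speaker verification methods, those based on SVM show better performance than other algorithms. However, because of its computational complexity, SVM-based speaker verification is difficult to run in real time on portable devices such as mobile phones. This paper proposes a hardware architecture for real-time processing of an SVM-based speaker verification algorithm, analyzes experimental results obtained after modeling it in VHDL, and describes the overall system configuration.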

Short utterance speaker verification using PLDA model adaptation and data augmentation

윤성욱;권오욱
- Phonetics and Speech Sciences, Vol. 9, No. 2, pp. 85-94, 2017
Conventional speaker verification systems using time delay neural network, identity vector and probabilistic linear discriminant analysis (TDNN-Ivector-PLDA) are known to be very effective for verifying long-duration speech utterances. However, when test utterances are short, the duration mismatch between enrollment and test utterances significantly degrades the performance of TDNN-Ivector-PLDA systems. To compensate for the I-vector mismatch between long and short utterances, this paper proposes probabilistic linear discriminant analysis (PLDA) model adaptation with augmented data. A PLDA model is trained on a vast amount of speech data, most of which is of long duration. The PLDA model is then adapted with I-vectors obtained from short-utterance data augmented by vocal tract length perturbation (VTLP). In computer experiments using the NIST SRE 2008 database, the proposed method achieves significantly better performance than conventional TDNN-Ivector-PLDA systems when there is a duration mismatch between enrollment and test utterances.
https://doi.org/10.13064/KSSS.2017.9.2.085

A Study On the Disguised Voice - From a prosodic point of view -

조민하;노석은;송민규;신지영;강선미
- Proceedings of the Korean Society of Phonetic Sciences Conference, May 2003 Conference, pp. 191-195, 2003
The aim of this paper is to analyze the phonetic features of disguised voice. We examined features such as phonation type, pitch range, speech rate, intonation type, and boundary tones. The results of the analysis are as follows: (1) Phonation type is a very important means of voice disguise for male subjects. (2) Pitch range and average pitch value are very important cues for speaker verification. (3) Pitch contour, speech rate, and boundary tones can serve as secondary cues for speaker verification.

Efficient Speaker Verification in Noise Environment with Noise-added Speaker Model Composition

안성주;강선미;고한석
- Proceedings of the Korean Institute of Information Scientists and Engineers Conference, 1999 Fall Conference, Vol. 26, No. 2 (2), pp. 542-544, 1999
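This paper proposes a noise-robust speaker verification method that builds multiple speaker models. Because it is difficult to measure the SNR of input speech with non-stationary noise, several noise-added speaker models are constructed for each speaker by combining the clean speaker model with noise models for various SNRs. During verification, the likelihood of the input speech is computed under each of these models and only the largest likelihood is selected. Speaker verification is then performed using this value. Experiments show that, in noisy environments where the SNR of the input speech is unknown, the proposed method performs much better than the usual approach of using a single model.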
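The selection step described in this abstract (score the input against every noise-added model, keep the largest likelihood, then decide) might look like the sketch below. It makes strong simplifying assumptions that are not taken from the paper: each model is a single scalar Gaussian, and noise is combined with the clean model by adding variances. All names and the threshold value are hypothetical.

```python
import math


def gaussian_loglik(frames, mean, var):
    """Average per-frame log-likelihood of scalar frames under N(mean, var)."""
    total = 0.0
    for x in frames:
        total += -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
    return total / len(frames)


def compose_noisy_models(clean_mean, clean_var, noise_vars):
    """Build one noise-added model per assumed noise level.

    Assumption: independent additive noise, so the variances simply add.
    """
    return [(clean_mean, clean_var + nv) for nv in noise_vars]


def verify(frames, models, threshold):
    """Score against every composed model, keep the best score, then threshold."""
    best = max(gaussian_loglik(frames, m, v) for m, v in models)
    return best, best >= threshold


if __name__ == "__main__":
    # Three assumed noise levels standing in for different SNR conditions.
    models = compose_noisy_models(0.0, 1.0, [0.0, 1.0, 3.0])
    score, accepted = verify([2.0, -2.0, 2.0, -2.0], models, threshold=-2.5)
    print(score, accepted)
```

Taking the maximum means the SNR of the input never has to be estimated explicitly: the model whose noise level best matches the input simply scores highest.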

A Study for Effective Speaker Adaptation and a priori Threshold Updating in Speaker Verification

조영훈;이수호;홍대희;고한석
- Proceedings of the Institute of Electronics Engineers of Korea Conference, 14th Joint Conference on Signal Processing (2001), pp. 491-494, 2001
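The biggest problems in designing a practical speaker verification system are that, because the speaker model is built from a small amount of enrollment data, the system's performance degrades considerably over time, and that, because the threshold is set only from previously trained data, variations that arise in later real-world use are not taken into account, which also degrades performance. To solve these problems, this paper applies the MAP method to build the speaker model and applies a method for resetting the threshold. The proposed method is shown to reduce the HTER by about 23%.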

Quantization Based Speaker Normalization for DHMM Speech Recognition System

신옥근
- The Journal of the Acoustical Society of Korea, Vol. 22, No. 4, pp. 299-307, 2003
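There has been much research on speaker normalization, which improves recognition performance in speaker-independent speech recognizers by minimizing the effect of vocal tract length differences between speakers. Motivated by the fact that speaker verification is possible with a vector quantizer, this study proposes a relatively simple linear-warping speaker normalization method based on a vector quantizer. The proposed method first generates an optimal codebook to be used for normalization, then uses this codebook to extract the speaker's linear warping factor; the extracted warping factor is used to warp the mel-scale filter bank used in mel-cepstrum extraction. To verify the proposed warping-factor extraction and application method, recognition experiments were performed with a discrete-HMM recognizer for 13 monosyllabic Korean digit sounds. The results showed a reduction in the misrecognition rate of about 29%, confirming that the proposed speaker normalization method is simpler than other line-search warping-factor extraction methods while still being useful.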
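To make the last step concrete, the sketch below shows one way a linear warping factor could be applied to the center frequencies of a mel-scale filter bank. It does not reproduce the codebook-based factor extraction from the paper. The mel formula is the common 2595·log10(1 + f/700) variant, and the clipping at the upper band edge, the function names, and the example factor of 1.1 are assumptions made for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Common mel-scale mapping (one of several variants in use)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_centers(n_filters, f_low, f_high):
    """Center frequencies (Hz) of filters spaced uniformly on the mel scale."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (k + 1)) for k in range(n_filters)]


def warp_centers(centers, alpha, f_high):
    """Linear warping f -> alpha * f, clipped at f_high (clipping is assumed)."""
    return [min(alpha * c, f_high) for c in centers]


if __name__ == "__main__":
    centers = mel_centers(8, 0.0, 4000.0)
    # alpha = 1.1 is an arbitrary example value, not a result from the paper.
    for original, warped in zip(centers, warp_centers(centers, 1.1, 4000.0)):
        print(round(original, 1), round(warped, 1))
```

A factor of 1.0 leaves the filter bank unchanged; factors above or below 1.0 stretch or compress the frequency axis to compensate for differences in vocal tract length between speakers.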

Search results: 162 items (processing time: 0.021 s)
