• Title/Summary/Keyword: speech parameter

Search Result 373

A Blind Segmentation Algorithm for Speaker Verification System (화자확인 시스템을 위한 분절 알고리즘)

  • 김지운;김유진;민홍기;정재호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.3
    • /
    • pp.45-50
    • /
    • 2000
  • This paper proposes a delta energy method based on Parameter Filtering (PF), a speech segmentation algorithm for a text-dependent speaker verification system operating over telephone lines. The parametric filter bank uses a variable bandwidth with a fixed center frequency. Compared with other methods, the proposed method proves very robust to channel noise and background noise. Using this method, we segment an utterance into consecutive subword units and build a model for each subword unit. The speaker verification system based on a whole-word model yields an equal error rate (EER) of 6.1%, whereas the system based on subword models yields 4.0%, an improvement of about 2 percentage points in EER.
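
The EER figures quoted in the abstract above can be made concrete with a minimal sketch of how an equal error rate is commonly approximated from verification scores. This is not the authors' evaluation code: the function name `equal_error_rate`, the accept-if-score-at-or-above-threshold rule, and the example scores are all assumptions made for illustration.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by trying every observed score as a threshold.

    Assumption: a trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap = float("inf")
    eer = None
    for t in thresholds:
        # False rejection rate: genuine trials scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance rate: impostor trials scored at or above it.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap = gap
            eer = (far + frr) / 2
    return eer


# Hypothetical scores, for illustration only (not data from the paper).
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.6]
eer = equal_error_rate(genuine, impostor)
```

For scale, moving from an EER of 6.1% to 4.0% is an absolute reduction of 2.1 percentage points, or roughly 34% in relative terms (2.1 / 6.1).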

Tracking Performance Improvement of the Double-Talk Robust Algorithm for Network Echo Cancellation (네트워크 반향제거를 위한 동시통화에 강인한 알고리듬의 추적 성능 개선)

  • Yoo, Jae-Ha
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.12 no.1
    • /
    • pp.195-200
    • /
    • 2012
  • We present a new algorithm that improves the tracking performance of the double-talk robust algorithm. We propose a method for detecting echo path changes and a modification of the update equation of the conventional adaptive filter. The duration for which the ratio of the error signal to the scale parameter remains high varies with the call status, and this property is used to detect echo path changes. The proposed update equation improves the tracking performance by preventing incorrect selection of the error signal. Simulations using real speech signals and the echo paths of the ITU-T G.168 standard confirmed that the proposed algorithm improves the tracking performance by more than 4 dB compared with the conventional algorithm.

On the Interval Detection of Implosive Stop Sounds by Frame Energy Difference (프레임간 에너지 차를 이용한 음성신호의 종성 폐쇄음 구간 검출에 관한 연구)

  • Bae, Myung-Jin;Choi, Jung-Ah;Ann, Sou-Guil
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.26 no.4
    • /
    • pp.145-150
    • /
    • 1989
  • Preprocessing in a speech recognition system is useful because it reduces some of the complicated procedures required for final recognition. In this paper, we suggest a new preprocessing algorithm for detecting the intervals of implosive stop sounds. In Korean, implosive stop sounds follow vowels, and their characteristics are contained in the vowel region. When an implosive stop is pronounced, the velum is quickly closed, so the energy decays abruptly and the closure lasts for about 50 to 150 msec. The energy difference between adjacent frames is chosen as a parameter that represents these features well.
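
The frame energy difference described in the abstract above can be sketched as follows. This is only an illustration of the general idea, not the authors' algorithm: the non-overlapping framing, the use of log energy in dB, the 20 dB drop threshold, and the names `frame_log_energies` and `abrupt_decay_frames` are all assumptions.

```python
import math


def frame_log_energies(signal, frame_len):
    """Log energy (dB) of consecutive non-overlapping frames."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame)
        # A small constant avoids log10(0) on silent frames.
        energies.append(10.0 * math.log10(energy + 1e-12))
    return energies


def abrupt_decay_frames(signal, frame_len, drop_db=20.0):
    """Indices of frames whose energy fell by at least drop_db
    relative to the preceding frame (assumed threshold)."""
    energies = frame_log_energies(signal, frame_len)
    flagged = []
    for i in range(1, len(energies)):
        diff = energies[i] - energies[i - 1]  # adjacent-frame difference
        if diff <= -drop_db:
            flagged.append(i)
    return flagged


# Synthetic example: a loud segment followed by a near-silent closure.
signal = [1.0] * 40 + [0.01] * 40
decay_frames = abrupt_decay_frames(signal, frame_len=10)
```

A fuller detector would presumably also check that the low-energy interval lasts for about 50 to 150 msec, as the abstract states for the closure; that step is omitted here.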

Optimal Wavelet Selection for AR Model Parameter Identification of Nonstationary Time-Varying Signal (비정상 시변신호의 AR모델 파라메터 인식을 위한 최적의 웨이브렛 선택)

  • Shin, D.H.;Kim, S.H.
    • The Journal of the Acoustical Society of Korea
    • /
    • v.15 no.4
    • /
    • pp.50-57
    • /
    • 1996
  • In this paper, we propose a method for optimal wavelet selection and for the wavelet expansion of AR (autoregressive) parameters with the selected wavelet, using the F-test. A cost function is introduced as the wavelet selection criterion. Using this cost function, wavelets (D4 to D20) are tested on a synthesized signal. With the selected wavelet, we obtain the wavelet coefficients of the AR parameters for both the synthesized signal and a real speech signal. To evaluate the proposed method, the wavelet-based algorithm is compared with the Kalman filtering algorithm. As a result, the proposed method outperforms the Kalman filter by about 5-10 dB.

Speaker Verification Model Using Short-Time Fourier Transform and Recurrent Neural Network (STFT와 RNN을 활용한 화자 인증 모델)

  • Kim, Min-seo;Moon, Jong-sub
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.29 no.6
    • /
    • pp.1393-1401
    • /
    • 2019
  • Recently, as voice authentication functions are built into systems, accurate speaker authentication is becoming more important. Accordingly, various speaker verification models have been suggested. In this paper, we propose a new speaker verification method using the Short-Time Fourier Transform (STFT). Unlike the existing Mel-Frequency Cepstral Coefficients (MFCC) extraction method, we used a window function with an overlap parameter of around 66.1%. The speaker's speech characteristics, including their temporal characteristics, are then learned using a deep learning model, a Recurrent Neural Network (RNN) with LSTM cells. The accuracy of the proposed model is around 92.8%, approximately 5.5% higher than that of the existing speaker verification model.
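
To illustrate how an overlap parameter such as the 66.1% mentioned above translates into STFT framing, here is a minimal sketch using a Hann window and a naive DFT. It is not the authors' feature extractor: the window length, the Hann window, the rounding rule in `hop_length`, and the function names are assumptions.

```python
import cmath
import math


def hop_length(win_len, overlap):
    """Samples to advance between frames for a given overlap ratio (0..1)."""
    return max(1, round(win_len * (1.0 - overlap)))


def stft_magnitudes(signal, win_len, overlap):
    """Magnitude spectra (bins 0..win_len//2) of overlapping Hann-windowed frames."""
    hop = hop_length(win_len, overlap)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / win_len)
              for n in range(win_len)]
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(win_len)]
        mags = []
        for k in range(win_len // 2 + 1):
            # Naive DFT of the windowed segment at frequency bin k.
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / win_len)
                      for n in range(win_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# With an assumed 400-sample window, 66.1% overlap gives a 136-sample hop.
hop_400 = hop_length(400, 0.661)

# Tiny demo: a tone centred on bin 2 of an 8-point frame, 50% overlap.
tone = [math.cos(2 * math.pi * 2 * n / 8) for n in range(16)]
spec = stft_magnitudes(tone, win_len=8, overlap=0.5)
```

The resulting sequence of magnitude frames is the kind of time-ordered input a recurrent model could consume; the naive DFT is used only to keep the sketch dependency-free and would normally be replaced by an FFT routine.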

Signal Processing and Implementation of Transmitter for Cochlear Implant (인공 와우를 위한 신호 처리 및 전달부의 구현)

  • Chae, D.;Choi, D.;Byun, J.;Baeck, S.;Kong, H.;Park, S.
    • Proceedings of the KIEE Conference
    • /
    • 1993.07a
    • /
    • pp.284-286
    • /
    • 1993
  • Software and hardware for a cochlear implant system have been developed to create a speech signal processing system that extracts, in real time, model parameters including formants, pitch, and amplitude information. The system is based on the Texas Instruments TMS320 family. In hardware, a computer interface has been designed and implemented that allows biphasic pulse stimuli to be presented to patients with hearing impairment. The host computer sends a stream of bytes to the parallel port. Upon receipt of the data, the interface generates the appropriate burst sequence, which is delivered to the patient's external transmitter coil. The coded information is interpreted by the Nucleus-22 internal receiver, which delivers the pulse to the specified electrodes at the specified amplitude and pulse width.

Position Estimation of Underwater Target Using Proximity Sensor with Bearing Information (근접 센서의 방위정보를 이용한 수중표적 예상위치 추정 기법)

  • Choi, Young-Doo;Kim, Jung-Hoon;Yoon, Kyung-Sik;Seo, Ik-Su;Lee, Dong-Hun;Lee, Kyun-Kyung
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.17 no.4
    • /
    • pp.422-429
    • /
    • 2014
  • Proximity sensor networks aim to estimate the kinematic state of a target from the target position estimated by each sensor node or from target parameters. To analyze the kinematic state of a target, traditional approaches require detections on multiple sensors and a very large number of sensors to achieve acceptable performance. To address this problem, we propose a novel method that estimates the predicted position of an underwater target using a minimal number of proximity sensors with bearing information. The performance of the proposed algorithm was verified through simulation.

Analysis of Sound Quality Parameters of Sound Sources applied for Soundscape Design (사운드 스케이프 적용 음원의 음질 지수 분석)

  • Park, Hyeon-Ku;Song, Min-Jeong;Jang, Gil-Soo
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference
    • /
    • 2004.11a
    • /
    • pp.814-819
    • /
    • 2004
  • Various measures exist for evaluating noise, such as A-weighted SPL (sound pressure level), NC (noise criteria), NR (noise rating), and SIL (speech interference level). However, they are not sufficient for the sounds supplied to public places in soundscape design. Consequently, a tool is needed for evaluating good acoustical environments and, furthermore, for quantifying the improvement achieved by supplying sound sources. This study analyzes the sound sources applied in soundscape design using sound quality parameters. The sound sources used were natural and artificial sounds. The sound quality parameters were Loudness (L), Sharpness (S), Fluctuation strength (FL), Tonality (T), Roughness (R), and Unbiased Annoyance (UA), and their values were compared for natural and artificial sounds as a function of the convolution of the sound sources with background noise, the duration, the frequency content, and the SPL. As a result, the values of L and UA were found to change in comparison with the other parameters, and the correlation with subjects' responses still needs to be analyzed.

A Study on the Acoustical Characteristics of Pistol Impulse and MLS Source Measurements in Room Types (음향측정시 실의 종류와 음원에 따르는 음향인자 측정분석에 관한 연구)

  • Kim, Jeong-Jung;Son, Jang-Ryeol
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference
    • /
    • 2004.11a
    • /
    • pp.1028-1031
    • /
    • 2004
  • The ultimate goal of architectural acoustics is to convey speech effectively within a space in a manner suited to the intended use of the building. To quantify how accurately a sound source is transmitted through an indoor space, acoustic evaluation methods propose various physical parameters for estimating speech articulation and reverberation. This paper evaluates an MLS signal and a pistol impulse signal as sound sources. The results indicate that, for the reverberation time (RT) acoustic parameter, the measurements converge with the MLS source to within the range of measurement error. Acoustic parameters such as D50 and C80, however, appear problematic because their values vary irregularly. Finally, in a diffuse sound field, measurements of the change in sound pressure level showed no difference in effect on the acoustic parameters.

Research on PEFT Feasibility for On-Device Military AI (온 디바이스 국방 AI를 위한 PEFT 효용성 연구)

  • Gi-Min Bae;Hak-Jin Lee;Sei-Ok Kim;Jang-Hyong Lee
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2024.01a
    • /
    • pp.51-54
    • /
    • 2024
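  • This paper proposes an efficient training method for on-device military AI. Instead of retraining the entire model, the proposed method applies LoRA, a PEFT technique that fine-tunes only the necessary parts and thereby greatly reduces computational cost and time. LoRA does not modify the existing neural network weights directly; it trains additional low-rank matrices, which allows efficient adaptation to new tasks without substantially changing the structure of the existing model. We also applied quantization, a model compression technique that uses lower-precision floating point (FP16, FP8) or integer (INT8) formats instead of 32-bit floating point (FP32) for the training parameters and for the input and output data of operations. As a result, the GPU memory required during training fell from 32 GB to 5.7 GB, a reduction of 82.19%. When the models were evaluated on the same data under the same conditions, the error of the model with LoRA and quantization was 53.34% higher than that of the base model for the same number of training iterations. Increasing the number of training iterations to limit this loss in performance reduced the error increase to 29.29%.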
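
The low-rank update described in the abstract above can be sketched in a few lines. This is a generic illustration of the LoRA idea, not the authors' implementation: the `alpha / rank` scaling convention, the helper names, and the 1024 x 1024 layer size are assumptions, and plain Python lists stand in for tensors.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(w, a, b, x, alpha):
    """y = W x + (alpha / r) * B (A x), with W frozen and only A, B trained.

    w: d_out x d_in frozen weights, a: r x d_in, b: d_out x r.
    """
    rank = len(a)
    base = matvec(w, x)               # output of the unchanged base layer
    update = matvec(b, matvec(a, x))  # low-rank correction
    scale = alpha / rank
    return [y + scale * u for y, u in zip(base, update)]


def trainable_params(d_in, d_out, rank):
    """Number of parameters in the two low-rank matrices A and B."""
    return rank * (d_in + d_out)


def percent_reduction(before, after):
    return (before - after) / before * 100.0


w = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0, 1.0]]                      # rank 1
x = [2.0, 3.0]
# With B initialised to zero, the adapted layer reproduces the base layer.
y_init = lora_forward(w, a, [[0.0], [0.0]], x, alpha=2.0)
y_tuned = lora_forward(w, a, [[0.5], [-0.5]], x, alpha=2.0)

# Hypothetical 1024 x 1024 layer: full fine-tuning vs. rank-8 adaptation.
full_count = 1024 * 1024
lora_count = trainable_params(1024, 1024, rank=8)

# The memory figures quoted in the abstract: 32 GB down to 5.7 GB.
gpu_saving = round(percent_reduction(32.0, 5.7), 2)
```

For the hypothetical layer above, the low-rank matrices hold 16,384 trainable parameters against 1,048,576 in the full weight matrix, which is the kind of saving that makes on-device fine-tuning plausible; the last line reproduces the 82.19% figure from the reported 32 GB and 5.7 GB.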