Search | Korea Science

Robust Endpoint Detection for Bimodal System in Noisy Environments (잡음환경에서의 바이모달 시스템을 위한 견실한 끝점검출)

오현화;권홍석;손종목;진성일;배건성
- Journal of the Institute of Electronics Engineers of Korea CI, v.40 no.5, pp.289-297, 2003
The performance of a bimodal system depends on the accuracy of endpoint detection from the input signal as well as on the performance of the speech recognition or lipreading system. In this paper, we propose an endpoint detection method that detects endpoints from the audio and video signals separately and uses the signal-to-noise ratio (SNR) estimated from the input audio signal to select the endpoints that are more reliable under acoustic noise. In other words, the endpoints are taken from the audio signal when the SNR is high and from the video signal when the SNR is low. Experimental results show that the bimodal system using the proposed endpoint detector achieves satisfactory recognition rates, especially when the acoustic environment is very noisy.
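The selection rule described in this abstract can be illustrated with a minimal sketch. The abstract does not give the SNR estimator or the switching threshold, so the function names, the power-ratio SNR estimate, and the 10 dB default below are all assumptions made for illustration, not the authors' implementation.

```python
import math

def snr_db(signal_power, noise_power):
    """Return the SNR in decibels from signal and noise power estimates."""
    return 10.0 * math.log10(signal_power / noise_power)

def select_endpoints(audio_endpoints, video_endpoints, estimated_snr_db,
                     threshold_db=10.0):
    """Pick the endpoint pair expected to be more reliable.

    threshold_db is a hypothetical switching point; the abstract only says
    that audio endpoints are used at high SNR and video endpoints at low SNR.
    """
    if estimated_snr_db >= threshold_db:
        return audio_endpoints
    return video_endpoints

# Example: (start_frame, end_frame) pairs from each modality.
audio = (3, 9)
video = (4, 10)
chosen_clean = select_endpoints(audio, video, snr_db(100.0, 1.0))  # 20 dB
chosen_noisy = select_endpoints(audio, video, snr_db(1.0, 1.0))    # 0 dB
```

In this sketch the clean case falls back on the audio endpoints and the noisy case on the video endpoints, which mirrors the behavior the abstract describes.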

A Spectral Compensation Method for Noise Robust Speech Recognition (잡음에 강인한 음성인식을 위한 스펙트럼 보상 방법)

Cho, Jung-Ho
- 전자공학회논문지 IE, v.49 no.2, pp.9-17, 2012
One of the problems in applying speech recognition systems in the real world is the degradation of performance caused by acoustical distortions. The most important source of acoustical distortion is additive noise. This paper describes a spectral compensation technique for noise-robust speech recognition, based on a spectral peak enhancement scheme followed by an efficient noise subtraction scheme. The proposed methods emphasize the formant structure and compensate for the spectral tilt of the speech spectrum while maintaining broad-bandwidth spectral components. The recognition experiments were conducted using noisy speech corrupted by white Gaussian noise, car noise, babble noise, or subway noise. Compared with the case without spectral compensation, the new technique reduced the average error rate slightly in high signal-to-noise ratio (SNR) environments and reduced it significantly, by half, in a low-SNR (10 dB) environment.

A study on speech enhancement using complex-valued spectrum employing Feature map Dependent attention gate (특징 맵 중요도 기반 어텐션을 적용한 복소 스펙트럼 기반 음성 향상에 관한 연구)

Jaehee Jung;Wooil Kim
- The Journal of the Acoustical Society of Korea, v.42 no.6, pp.544-551, 2023
Speech enhancement, which is used to improve the perceptual quality and intelligibility of noisy speech, has moved from methods that use only the magnitude spectrum to methods that use the complex-valued spectrum, which can improve both magnitude and phase. In this paper, we study how to apply an attention mechanism to complex-valued spectrum-based speech enhancement systems to further improve the intelligibility and quality of noisy speech. The attention is based on additive attention and allows the attention weight to be calculated in consideration of the complex-valued spectrum. In addition, global average pooling is used to take the importance of each feature map into account. Complex-valued spectrum-based speech enhancement was performed with the Deep Complex U-Net (DCUNET) model, and additive attention followed the method proposed in the Attention U-Net model. Experiments on noisy speech in a living room environment showed that the proposed method outperforms the baseline model on evaluation metrics such as Source-to-Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI), and that it consistently improves performance across various background noise environments and low signal-to-noise ratio (SNR) conditions. These results demonstrate the effectiveness of the proposed speech enhancement system in improving the intelligibility and quality of noisy speech.
https://doi.org/10.7776/ASK.2023.42.6.544

PN Code Acquisition at Low Signal-to-Noise Ratio Based on Seed Accumulating Sequential Estimation (시드 누적 순차적 추정 기법을 이용한 낮은 신호대잡음비 환경에서의 의사 잡음 부호 획득)

윤석호;김선용
- The Journal of Korean Institute of Communications and Information Sciences, v.28 no.9A, pp.678-683, 2003
Pseudo-noise (PN) code acquisition based on the sequential estimation (SE) proposed by Ward performs well only at relatively high chip signal-to-noise ratios (SNRs). In this paper, a seed accumulating sequential estimation (SASE) method and a PN code acquisition system based on it are proposed, which also perform well at low chip SNRs, the range of practical interest. The mean acquisition time performance of the proposed system is then investigated. Numerical results show that the system based on SASE performs dramatically better than that based on SE at low chip SNRs, and that the improvement becomes larger as the period of the PN code increases.
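To make the baseline concrete, the following is a minimal sketch of plain sequential estimation in the noise-free case: the receiver loads a few consecutive chip decisions as the seed of its local generator and then regenerates the code. It does not implement the SASE method, whose details the abstract does not give. The short period-7 code, the recurrence lags, and all names are assumptions chosen for illustration.

```python
def extend_pn(seed, lags, count):
    """Extend a binary sequence with a linear recurrence over GF(2).

    seed  : most recent chips, oldest first (length >= max(lags))
    lags  : each new chip is the XOR of the chips `lag` positions back
    count : number of new chips to generate
    """
    chips = list(seed)
    for _ in range(count):
        nxt = 0
        for lag in lags:
            nxt ^= chips[-lag]
        chips.append(nxt)
    return chips[len(seed):]

# Hypothetical short code: s[k] = s[k-1] XOR s[k-3], which has period 7.
LAGS = (1, 3)
INITIAL = [1, 0, 0]
transmitted = INITIAL + extend_pn(INITIAL, LAGS, 25)

# Receiver side: take three consecutive chip decisions at an arbitrary
# offset, use them as the seed, and regenerate the following chips.
offset = 5
seed = transmitted[offset:offset + 3]
local = extend_pn(seed, LAGS, 10)

# A single chip error in the seed makes the local code diverge, which is
# why this baseline needs a relatively high chip SNR.
bad_seed = [seed[0] ^ 1, seed[1], seed[2]]
local_bad = extend_pn(bad_seed, LAGS, 10)
```

With an error-free seed, `local` matches the transmitted chips that follow the seed; with one wrong chip, `local_bad` does not.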

Study of Target Tracking Algorithm using iterative Joint Integrated Probabilistic Data Association in Low SNR Multi-Target Environments (낮은 SNR 다중 표적 환경에서의 iterative Joint Integrated Probabilistic Data Association을 이용한 표적추적 알고리즘 연구)

Kim, Hyung-June;Song, Taek-Lyul
- Journal of the Korea Institute of Military Science and Technology, v.23 no.3, pp.204-212, 2020
Target tracking generally works by receiving a set of measurements from a sensor. However, if the signal-to-noise ratio (SNR) is low because of a small radar cross section (RCS), as with small, distant targets, the target's information can be lost during signal processing. Track Before Detect (TBD) is an algorithm that performs target tracking without a detection threshold. That is, all sensor data is sent to the tracking system, which prevents the loss of target information that thresholding the signal intensity would cause. On the other hand, using all sensor data inevitably leads to computational problems that can severely limit its application. In this paper, we propose an iterative Joint Integrated Probabilistic Data Association as a practical target tracking technique suitable for low-SNR multi-target environments with real-time operation capability, and verify its performance through simulation studies.
https://doi.org/10.9766/KIMST.2020.23.3.204

LP-Based SNR Estimation with Low Computation Complexity (낮은 계산 복잡도를 갖는 Linear Prediction 기반의 SNR 추정 기법)

Kim, Seon-Ae;Jo, Byung-Gak;Baek, Gwang-Hoon;Ryu, Heung-Gyoon
- The Journal of Korean Institute of Electromagnetic Engineering and Science, v.20 no.12, pp.1287-1296, 2009
It is very important to estimate the signal-to-noise ratio (SNR) of the received signal in a time-varying channel. Most SNR estimation techniques derive the SNR estimates solely from samples of the received signal after the matched filter. In a severely distorted wireless channel, the performance of these estimators becomes unstable and degraded. An LP-based SNR estimator, which can operate on data samples collected at the front end of a receiver, shows more stable performance than other SNR estimators. In this paper, we study an efficient SNR estimation algorithm based on linear prediction (LP) and propose a new estimation method that reduces the computational complexity. The proposed algorithm performs the SNR estimation efficiently because it uses the forward prediction error and its conjugate during the linear prediction error update. Through computer simulation, the performance of the proposed estimation method is compared with that of conventional SNR estimators in digital communication channels.
https://doi.org/10.5515/KJKIEES.2009.20.12.1287

Performance of Serial Concatenated Convolutional Codes according to the Concatenation Methods of Component Codes (구성부호의 연접방법에 따른 직렬연접 길쌈부호의 성능)

Bae, Sang-Jae;Lee, Sang-Hoon;Joo, Eon-Kyeong
- The Journal of Korean Institute of Communications and Information Sciences, v.27 no.1A, pp.18-25, 2002
In this paper, the performance of three types of serial concatenated convolutional codes (SCCC) in an additive white Gaussian noise (AWGN) channel is compared and analyzed. The simulation results show that Type I gives the best error performance in the lower signal-to-noise ratio (SNR) region, whereas Type III gives the best error performance in the higher SNR region. Type I also exhibits an error floor, where the performance does not improve even when the number of iterations and the SNR are increased. In contrast, the performance of Type II and Type III still improves over five iterations at higher SNR, without an error floor. The bit error rate (BER) performance of the three types approaches their respective upper bounds as the SNR increases. It can also be observed that the upper bound of Type III shows the best performance among the three types because it has the greatest free distance.

A study on DEMONgram frequency line extraction method using deep learning (딥러닝을 이용한 DEMON 그램 주파수선 추출 기법 연구)

Wonsik Shin;Hyuckjong Kwon;Hoseok Sul;Won Shin;Hyunsuk Ko;Taek-Lyul Song;Da-Sol Kim;Kang-Hoon Choi;Jee Woong Choi
- The Journal of the Acoustical Society of Korea, v.43 no.1, pp.78-88, 2024
Ship-radiated noise received by passive sonar, which measures underwater noise, can be used to identify and classify ships through Detection of Envelope Modulation on Noise (DEMON) analysis. However, in a low signal-to-noise ratio (SNR) environment, it is difficult to analyze and identify the target frequency line containing ship information in the DEMONgram. In this paper, we study the extraction of target frequency lines using semantic segmentation, a deep learning technique, for more accurate target identification in low-SNR environments. The semantic segmentation models U-Net, UNet++, and DeepLabv3+ were trained and evaluated using simulated DEMONgram data generated by varying the SNR and fundamental frequency. The trained models were then compared on DEMONgrams made from DeepShip, a dataset of ship-radiated noise recordings from the Strait of Georgia in Canada. Evaluation with the simulated DEMONgrams confirmed that U-Net had the highest performance, and that the target frequency lines of the DEMONgrams made from DeepShip could be extracted to some extent.
https://doi.org/10.7776/ASK.2024.43.1.078

Preprocessing Technique for Improvement of Speech Recognition in a Car (차량에서의 음성인식율 향상을 위한 전처리 기법)

Kim, Hyun-Tae;Park, Jang-Sik
- The Journal of the Korea Contents Association, v.9 no.1, pp.139-146, 2009
This paper addresses a modified spectral subtraction scheme that is suitable for speech recognition in low signal-to-noise ratio (SNR) environments, such as an automatic speech recognition (ASR) system in a car. Conventional spectral subtraction schemes rely on the SNR, attenuating the parts of the spectrum that appear to have low SNR and accentuating the parts with high SNR. Although this assumption is adequate for high-SNR environments, it is grossly inadequate for low-SNR scenarios such as the car environment. The proposed method focuses specifically on low-SNR noisy environments by using a weighting function that enhances the speech-dominant regions of the speech spectrum. Experimental results using voice commands for a car show the superior performance of the proposed method over conventional methods.
https://doi.org/10.5392/JKCA.2009.9.1.139
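For orientation, the conventional baseline that this abstract contrasts with can be sketched as basic magnitude spectral subtraction applied per frequency bin. This is not the proposed weighting-function method, whose form the abstract does not give. The function name and the `over_subtraction` and `floor` parameters are assumptions for illustration.

```python
def spectral_subtract(noisy_mag, noise_mag, over_subtraction=1.0, floor=0.01):
    """Basic magnitude spectral subtraction for one frame.

    noisy_mag : magnitude spectrum of the noisy frame, one value per bin
    noise_mag : estimated noise magnitude spectrum, same length
    Each bin is reduced by the scaled noise estimate and clamped to a small
    fraction of the noisy magnitude so that no bin becomes negative.
    Both tuning parameters are hypothetical defaults.
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        diff = y - over_subtraction * n
        cleaned.append(max(diff, floor * y))
    return cleaned

# Example frame with three bins and a flat noise estimate.
enhanced = spectral_subtract([1.0, 0.5, 2.0], [0.5, 0.5, 0.5])
```

Bins dominated by noise, such as the second one here, are pushed down to the spectral floor. At low SNR most bins fall into that category, which is the limitation of the conventional scheme that the abstract points to.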

Voice Activity Detection Using Global Speech Absence Probability Based on Teager Energy in Noisy Environments (잡음환경에서 Teager Energy 기반의 전역 음성부재확률을 이용하는 음성검출)

Park, Yun-Sik;Lee, Sang-Min
- Journal of the Institute of Electronics Engineers of Korea SP, v.49 no.1, pp.97-103, 2012
In this paper, we propose a novel voice activity detection (VAD) algorithm to effectively distinguish speech from nonspeech in various noisy environments. The global speech absence probability (GSAP), derived from the likelihood ratio (LR) based on a statistical model, is widely used as the feature parameter for VAD. However, the feature parameter based on the conventional GSAP is not sufficient to distinguish speech from noise at low signal-to-noise ratios (SNRs). The presented VAD algorithm uses a GSAP based on Teager energy (TE) as the feature parameter to improve the decision on speech segments in noisy environments. The performance of the proposed VAD algorithm is evaluated by objective tests under various environments, and it yields better results than the conventional methods.
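The Teager energy mentioned in this abstract is commonly computed with the discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1] * x[n+1]. The sketch below shows only that operator on a time-domain signal. How the paper combines it with the GSAP is not described in the abstract and is not reproduced here, and the function name and test tone are assumptions.

```python
import math

def teager_energy(x):
    """Discrete Teager energy operator for the interior samples of x.

    Returns psi[n] = x[n]**2 - x[n-1] * x[n+1] for n = 1 .. len(x) - 2.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# For a pure tone A*cos(omega*n) the operator is constant and equal to
# A**2 * sin(omega)**2, so it reflects both amplitude and frequency.
AMPLITUDE = 2.0
OMEGA = 0.3
tone = [AMPLITUDE * math.cos(OMEGA * n) for n in range(50)]
tone_energy = teager_energy(tone)
expected_energy = AMPLITUDE ** 2 * math.sin(OMEGA) ** 2
```

Because the output depends on frequency as well as amplitude, the operator responds differently to speech-like components than a plain squared-amplitude energy measure does.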

