• Title/Abstract/Keywords: noise-robust features


A Tracking Method of Robust Lip Movement Image Regions for Blocking the External Acoustic Noise (외부음향잡음 차단을 위한 강인한 입술움직임 영상영역 추적방법)

  • Kim, Eung-Kyeu
    • Proceedings of the KIEE Conference
    • KIEE 40th Summer Conference, 2009
    • pp.1913-1914
    • 2009
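  • This paper proposes a method for robustly tracking lip-movement image regions so that a linked speech/image system can block external acoustic noise under varying lighting conditions. To track the lip-movement regions robustly under such lighting, standard lip-movement images were collected online, and features of lip-movement images that adapt to various lighting environments were extracted. Online template images were acquired at the same time and used for template matching. Linking the speech and image processing systems raised the interlocking rate to as high as 99.3% under various lighting conditions and blocked, at the source, speech recognition runs triggered by acoustic noise.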


Robust Distributed Speech Recognition under noise environment using MESS and EH-VAD (멀티밴드 스펙트럼 차감법과 엔트로피 하모닉을 이용한 잡음환경에 강인한 분산음성인식)

  • Choi, Gab-Keun;Kim, Soon-Hyob
    • Journal of the Institute of Electronics Engineers of Korea CI
    • Vol. 48, No. 1
    • pp.101-107
    • 2011
  • Background noise and channel distortion are major obstacles to the practical use of speech recognition. Noise generally degrades the performance of speech recognition systems, and DSR (Distributed Speech Recognition)-based speech recognition is difficult to improve for the same reason. To improve DSR-based speech recognition in noisy environments, this paper proposes a method that detects the speech region accurately so that accurate features can be extracted. The proposed method distinguishes speech from noise using entropy and the detected spectral energy of speech. Speech detection based on spectral energy performs well at relatively high SNR (15 dB). However, when the noise environment varies, the threshold between speech and noise also varies, and detection performance degrades at low SNR (0 dB). The proposed method therefore uses the spectral entropy and harmonics of speech for better speech detection. Precise speech detection also improves the performance of the AFE. Experimental results show that the proposed method achieves better recognition performance in noisy environments.
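
The spectral entropy mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's EH-VAD: the naive DFT, the frame length of 128, and the helper names `power_spectrum` and `spectral_entropy` are assumptions chosen only for illustration. The idea is that a harmonic frame concentrates its power in a few bins (low entropy), while a noise frame spreads it across many bins (high entropy).

```python
import cmath
import math
import random


def power_spectrum(frame):
    """Naive DFT power spectrum over the non-negative frequency bins."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy (bits) of the power spectrum normalized to sum to 1."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        return 0.0  # silent frame: treat as zero entropy
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


# Hypothetical test frames: a pure tone (stand-in for a harmonic frame)
# and Gaussian white noise.
rng = random.Random(0)
n = 128
tone = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

tone_entropy = spectral_entropy(tone)
noise_entropy = spectral_entropy(noise)
```

A detector built on this feature would compare the per-frame entropy against a threshold; how that threshold is chosen and how harmonics are combined with it is specific to the paper and not reproduced here.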

Bird sounds classification by combining PNCC and robust Mel-log filter bank features (PNCC와 robust Mel-log filter bank 특징을 결합한 조류 울음소리 분류)

  • Badi, Alzahra;Ko, Kyungdeuk;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea
    • Vol. 38, No. 1
    • pp.39-46
    • 2019
  • In this paper, combining features is proposed as a way to enhance the classification accuracy of sounds under noisy environments using the CNN (Convolutional Neural Network) structure. A robust log Mel-filter bank using Wiener filter and PNCCs (Power Normalized Cepstral Coefficients) are extracted to form a 2-dimensional feature that is used as input to the CNN structure. An ebird database is used to classify 43 types of bird species in their natural environment. To evaluate the performance of the combined features under noisy environments, the database is augmented with 3 types of noise under 4 different SNRs (Signal to Noise Ratios) (20 dB, 10 dB, 5 dB, 0 dB). The combined feature is compared to the log Mel-filter bank with and without incorporating the Wiener filter and the PNCCs. The combined feature is shown to outperform the other mentioned features under clean environments with a 1.34 % increase in overall average accuracy. Additionally, the accuracy under noisy environments at the 4 SNR levels is increased by 1.06 % and 0.65 % for shop and schoolyard noise backgrounds, respectively.
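
Augmenting a database with noise at fixed SNRs, as described in this abstract, usually amounts to scaling the noise before adding it. The following is a minimal sketch of that step under the standard definition SNR = 10·log10(P_signal / P_noise); the function names and the synthetic signals are assumptions, not the authors' pipeline.

```python
import math
import random


def mean_power(signal):
    """Average squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean-to-noise power ratio equals `snr_db`, then add it."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_power = mean_power(noise)
    if noise_power == 0.0:
        raise ValueError("noise must not be all zeros")
    target_noise_power = mean_power(clean) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [c + gain * v for c, v in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB recovered from the clean signal and the mixture."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(residual))


# Hypothetical stand-ins for a clean recording and a background noise clip.
rng = random.Random(1)
clean = [math.sin(0.05 * t) for t in range(1000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# The four SNR levels listed in the abstract.
mixtures = {snr: mix_at_snr(clean, noise, snr) for snr in (20, 10, 5, 0)}
```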

Combining multi-task autoencoder with Wasserstein generative adversarial networks for improving speech recognition performance (음성인식 성능 개선을 위한 다중작업 오토인코더와 와설스타인식 생성적 적대 신경망의 결합)

  • Kao, Chao Yuan;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea
    • Vol. 38, No. 6
    • pp.670-677
    • 2019
  • As the presence of background noise in an acoustic signal degrades the performance of speech or acoustic event recognition, it is still challenging to extract noise-robust acoustic features from a noisy signal. In this paper, we propose a combined structure of a Wasserstein Generative Adversarial Network (WGAN) and a MultiTask AutoEncoder (MTAE) as a deep learning architecture that integrates the strengths of both, so that it estimates not only noise but also speech features from a noisy acoustic source. The proposed MTAE-WGAN structure is used to estimate the speech signal and the residual noise by employing a gradient penalty and a weight initialization method for the Leaky Rectified Linear Unit (LReLU) and Parametric ReLU (PReLU). The proposed MTAE-WGAN structure with the adopted gradient penalty loss function enhances the speech features and subsequently achieves substantial Phoneme Error Rate (PER) improvements over the stand-alone Deep Denoising Autoencoder (DDAE), MTAE, Redundant Convolutional Encoder-Decoder (R-CED) and Recurrent MTAE (RMTAE) models for robust speech recognition.

Robust-Detection of Pig Respiratory Diseases in the Noisy Environment (잡음 환경에 강인한 돼지 호흡기 질병 탐지)

  • Lee, Jonguk;Choi, Yongju;Lee, Junhee;Park, Daihee;Chung, Yongwha
    • Proceedings of the Korea Information Processing Society Conference
    • Korea Information Processing Society 2018 Spring Conference
    • pp.327-330
    • 2018
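  • Most livestock farms in Korea divide pig pens into sections and house about 30 pigs together in each section. An outbreak of a highly contagious respiratory disease therefore spreads across the entire pen and causes serious damage. This paper proposes a sound-based respiratory disease detection system that is robust to the various noises arising in a pig pen. The proposed system first extracts spectrogram information from the sound signal and uses a CNN to generate feature vectors that are effective for pig respiratory diseases. Finally, the extracted feature vectors are fed to an MLP to detect and identify the respiratory disease. Experimental results confirm that pig respiratory diseases can be detected and identified even in various noisy environments.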

A Single Channel Voice Activity Detection for Noisy Environments Using Wavelet Packet Decomposition and Teager Energy (웨이블렛 패킷 변환과 Teager 에너지를 이용한 잡음 환경에서의 단일 채널 음성 판별)

  • Koo, Boneung
    • The Journal of the Acoustical Society of Korea
    • Vol. 33, No. 2
    • pp.139-145
    • 2014
  • In this paper, a feature parameter is obtained by applying the Teager energy to the WPD (Wavelet Packet Decomposition) coefficients. The threshold value is obtained from the means and standard deviations of nonspeech frames. Experimental results using the TIMIT speech and NOISEX-92 noise databases show that the proposed algorithm is superior to the typical VAD algorithm. ROC (Receiver Operating Characteristic) curves are used to compare the performance of VADs for SNR values ranging from 10 to -10 dB.
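
The two ingredients named in this abstract can be sketched briefly. The discrete Teager energy operator is ψ[x(n)] = x(n)² − x(n−1)·x(n+1), and a threshold can be formed from the mean and standard deviation of nonspeech feature values. In this sketch the operator is applied directly to samples rather than to WPD coefficients, and the multiplier `k` is a hypothetical parameter; the paper's exact threshold rule is not given in the abstract.

```python
import math
import statistics


def teager_energy(x):
    """Discrete Teager energy: x[n]^2 - x[n-1] * x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def noise_threshold(nonspeech_features, k=3.0):
    """Mean plus k standard deviations of nonspeech feature values.

    `k` is an assumed tuning parameter, not a value from the paper.
    """
    mean = statistics.mean(nonspeech_features)
    std = statistics.pstdev(nonspeech_features)
    return mean + k * std


# For a sinusoid A*sin(omega*n + phase) the operator returns the constant
# A^2 * sin(omega)^2, which makes it easy to check.
amplitude, omega, phase = 2.0, 0.3, 0.5
tone = [amplitude * math.sin(omega * n + phase) for n in range(50)]
energies = teager_energy(tone)
expected_energy = amplitude ** 2 * math.sin(omega) ** 2
```

Because the output depends on both amplitude and frequency, the operator responds to high-frequency bursts that a plain squared-amplitude energy measure would weight less.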

Noise Robust Speaker Verification Using Sub-Band Weighting (서브밴드 가중치를 이용한 잡음에 강인한 화자검증)

  • Kim, Sung-Tak;Ji, Mi-Kyong;Kim, Hoi-Rin
    • The Journal of the Acoustical Society of Korea
    • Vol. 28, No. 3
    • pp.279-284
    • 2009
  • Speaker verification determines whether the claimed speaker is accepted based on the score of the test utterance. In recent years, methods based on Gaussian mixture models and a universal background model have been the dominant approaches for text-independent speaker verification. Speaker verification systems based on these methods perform very well under laboratory conditions. In real situations, however, their performance degrades dramatically. To overcome this degradation, the feature recombination method was proposed, but it has the drawback that all sub-band feature vectors are used to compute the likelihood scores. To deal with this drawback, a modified feature recombination method that can use each sub-band likelihood score independently was proposed in our previous research. In this paper, we propose a sub-band weighting method based on the sub-band signal-to-noise ratio, combined with the previously proposed modified feature recombination. The proposed method reduces errors by 28% compared with the conventional feature recombination method.

Development of Interferences Reduction Algorithm for Ambulatory Blood Pressure Measurement (휴대용 혈압 측정을 위한 잡음 제거 알고리즘의 개발)

  • Choi, Hyun-Seok;Park, Ho-Dong;Lee, Kyoung-Joung
    • Proceedings of the KIEE Conference
    • KIEE 2008 Symposium Proceedings, Information and Control Section
    • pp.131-132
    • 2008
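  • A new noise reduction algorithm is proposed to reduce the distortion of the oscillation signal caused by the noise that frequently occurs during ambulatory blood pressure measurement with the oscillometric method. The proposed algorithm uses an adaptive filter based on a linear predictor structure. To evaluate its performance, the conventional method, which applies linear interpolation to the distorted oscillation signal, and the proposed adaptive-filter method were both applied, and their noise reduction performance was compared. Unlike the conventional method, the proposed method proved robust to noise when the noise overlapped the oscillations.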
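
An adaptive filter with a linear predictor structure can be sketched as a one-step-ahead predictor whose weights are updated from its own prediction error. The abstract does not say which update rule the authors use, so the normalized LMS (NLMS) rule below, the filter order, the step size `mu`, and the sinusoidal test signal are all assumptions made for illustration.

```python
import math


def nlms_predict(signal, order=2, mu=0.5, eps=1e-8):
    """One-step-ahead NLMS linear predictor.

    Returns (predictions, errors, final_weights). The first `order` samples
    have no prediction and are skipped.
    """
    weights = [0.0] * order
    predictions, errors = [], []
    for n in range(order, len(signal)):
        past = [signal[n - 1 - i] for i in range(order)]  # most recent first
        y = sum(w * x for w, x in zip(weights, past))
        e = signal[n] - y
        norm = eps + sum(x * x for x in past)
        # Normalized LMS update: step size scaled by the input energy.
        weights = [w + mu * e * x / norm for w, x in zip(weights, past)]
        predictions.append(y)
        errors.append(e)
    return predictions, errors, weights


# Hypothetical noiseless periodic signal standing in for an oscillation.
omega = math.pi / 4
oscillation = [math.sin(omega * n) for n in range(400)]
_, errors, weights = nlms_predict(oscillation)

early_error = sum(abs(e) for e in errors[:50]) / 50
late_error = sum(abs(e) for e in errors[-50:]) / 50
```

For a pure sinusoid the exact order-2 predictor is x(n) = 2·cos(ω)·x(n−1) − x(n−2), so the weights should settle near `[2*cos(omega), -1]`. Once the predictor tracks the regular component, a sudden artifact shows up as a large prediction error, which is one way such a structure can separate interference from the underlying oscillation.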


Robust iris recognition for local noise based on wavelet transforms (국부잡음에 강인한 웨이블릿 기반의 홍채 인식 기법)

  • Park Jonggeun;Lee Chulhee
    • Journal of the Institute of Electronics Engineers of Korea SP
    • Vol. 42, No. 2
    • pp.121-130
    • 2005
  • In this paper, we propose a feature extraction method for iris recognition using wavelet transforms. The wavelet transform is fast and has a good localization characteristic. In particular, the low frequency band can be used as an effective feature vector. In iris recognition, noise caused by the eyelids, eyebrows, glint, etc. may be included in the iris image. The noise itself distorts the iris pattern, and filter-based feature extraction algorithms such as wavelet and Gabor transforms spread the noise across the whole iris region. Such noise is a major cause of performance degradation in iris recognition systems. To solve these problems, we propose to divide the iris image into a number of sub-regions and apply the wavelet transform to each sub-region. Experimental results show that the performance of the proposed method is comparable to that of existing methods using the Gabor transform, and that region division noticeably improves recognition performance. In addition, the wavelet transform requires much less processing time than the existing methods.

Noise-Robust Porcine Respiratory Diseases Classification Using Texture Analysis and CNN (질감 분석과 CNN을 이용한 잡음에 강인한 돼지 호흡기 질병 식별)

  • Choi, Yongju;Lee, Jonguk;Park, Daihee;Chung, Yongwha
    • KIPS Transactions on Software and Data Engineering
    • Vol. 7, No. 3
    • pp.91-98
    • 2018
  • Automatic detection of pig wasting diseases is an important issue in the management of group-housed pigs. In particular, porcine respiratory diseases are one of the main causes of mortality among pigs and of productivity loss in intensive pig farming. In this paper, we propose a noise-robust system for the early detection and recognition of pig wasting diseases using sound data. In this method, we first convert one-dimensional sound signals to two-dimensional gray-level images by normalization and extract texture images by means of the dominant neighborhood structure technique. The texture features are then used as inputs to convolutional neural networks that serve as an early anomaly detector and a respiratory disease classifier. Our experimental results show that this new method can detect pig wasting diseases both economically (low-cost sound sensor) and accurately (over 96% accuracy) even under noisy environmental conditions, either as a standalone solution or as a complement to known methods to obtain a more accurate solution.
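
The first step in this abstract, converting a one-dimensional sound signal to a two-dimensional gray-level image by normalization, can be sketched as min-max scaling to the 0-255 range followed by reshaping into rows. The abstract does not specify the normalization or the image width, so both the min-max choice and the `width` parameter here are assumptions; the dominant neighborhood structure step is not shown.

```python
def signal_to_gray_image(signal, width):
    """Min-max normalize samples to 0..255 and reshape into rows of `width`.

    Trailing samples that do not fill a complete row are dropped.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if not signal:
        return []
    lo, hi = min(signal), max(signal)
    span = hi - lo
    if span == 0:
        pixels = [0] * len(signal)  # constant signal: uniform black image
    else:
        pixels = [round(255 * (x - lo) / span) for x in signal]
    usable = len(pixels) - len(pixels) % width
    return [pixels[i:i + width] for i in range(0, usable, width)]


# Hypothetical seven-sample signal; the last sample does not fill a row.
image = signal_to_gray_image([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 2.0], width=3)
```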