Search | Korea Science

Model adaptation employing DNN-based estimation of noise corruption function for noise-robust speech recognition (잡음 환경 음성 인식을 위한 심층 신경망 기반의 잡음 오염 함수 예측을 통한 음향 모델 적응 기법)

Yoon, Ki-mu;Kim, Wooil
- The Journal of the Acoustical Society of Korea
- v.38 no.1
- pp.47-50
- 2019
This paper proposes an acoustic model adaptation method for effective speech recognition in noisy environments. In the proposed algorithm, the noise corruption function is estimated using a DNN (Deep Neural Network), and the estimated function is applied to model parameter estimation. Experimental results using the Aurora 2.0 framework and database demonstrate that the proposed model adaptation method is more effective than conventional methods in both known and unknown noisy environments. In particular, the experiments in unknown environments show a relative improvement of 15.87 % in average WER (Word Error Rate).
https://doi.org/10.7776/ASK.2019.38.1.047
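The relative improvement quoted in the abstract above is conventionally computed as the reduction in WER divided by the baseline WER. A minimal sketch, assuming that convention; the function name and the WER values are hypothetical and are not taken from the paper:

```python
def relative_wer_improvement(baseline_wer: float, proposed_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Assumes the usual definition: (baseline - proposed) / baseline * 100.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - proposed_wer) / baseline_wer * 100.0


# Hypothetical WERs (in percent), chosen only to illustrate the arithmetic.
improvement = relative_wer_improvement(baseline_wer=20.0, proposed_wer=17.0)
print(f"{improvement:.2f} % relative improvement")
```

A relative figure depends on the baseline, so it should be read together with the absolute WERs reported in the paper itself.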

Recognition for Noisy Speech by a Nonstationary AR HMM with Gain Adaptation Under Unknown Noise (잡음하에서 이득 적응을 가지는 비정상상태 자기회귀 은닉 마코프 모델에 의한 오염된 음성을 위한 인식)

이기용;서창우;이주헌
- The Journal of the Acoustical Society of Korea
- v.21 no.1
- pp.11-18
- 2002
In this paper, a gain-adapted method for recognizing noisy speech is developed in the time domain. The noise is assumed to be colored. To cope with the notably nonstationary nature of speech signals such as fricatives, glides, liquids, and transition regions between phones, the nonstationary autoregressive (NAR) hidden Markov model (HMM) is used. The nonstationary AR process is represented by polynomial functions formed as a linear combination of M known basis functions. When only noisy signals are available, the problem of estimating the noise inevitably arises. Multiple Kalman filters are used to estimate the noise model and the gain contour of the speech. The noise estimate of the proposed method can be used to remove noise from noisy speech and obtain an enhanced speech signal. Compared to the conventional ARHMM with noise estimation, the proposed NAR-HMM with noise estimation improves recognition performance by about 2-3%.

A study on combination of loss functions for effective mask-based speech enhancement in noisy environments (잡음 환경에 효과적인 마스크 기반 음성 향상을 위한 손실함수 조합에 관한 연구)

Jung, Jaehee;Kim, Wooil
- The Journal of the Acoustical Society of Korea
- v.40 no.3
- pp.234-240
- 2021
In this paper, mask-based speech enhancement is improved for effective speech recognition in noisy environments. In mask-based speech enhancement, the enhanced spectrum is obtained by multiplying the noisy speech spectrum by the mask. The VoiceFilter (VF) model is used for mask estimation, and the Spectrogram Inpainting (SI) technique is used to remove residual noise from the enhanced spectrum. We propose a combined loss to further improve speech enhancement. To remove the residual noise in the speech effectively, the positive part of the Triplet loss is used together with the component loss. For the experiments, the TIMIT database is reconstructed using NOISEX92 noise and background music samples under various Signal to Noise Ratio (SNR) conditions. Source to Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI) are used as performance metrics. When the VF was trained with the mean squared error and the SI model was trained with the combined loss, SDR, PESQ, and STOI improved by 0.5, 0.06, and 0.002 respectively compared to the system trained only with the mean squared error.
https://doi.org/10.7776/ASK.2021.40.3.234
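The masking step described in the abstract above (the enhanced spectrum is the noisy spectrum multiplied by the mask) can be sketched as an element-wise product. This is only an illustration: the function name, the list-of-frames layout, and the toy values are assumptions, and the mask here is given by hand rather than estimated by a VoiceFilter model.

```python
def apply_mask(noisy_spectrum, mask):
    """Multiply a magnitude spectrogram by a mask, element by element.

    Both arguments are lists of frames; each frame is a list of
    per-frequency-bin values. Mask values are assumed to lie in [0, 1].
    """
    if len(noisy_spectrum) != len(mask):
        raise ValueError("spectrum and mask must have the same number of frames")
    enhanced = []
    for frame, mask_frame in zip(noisy_spectrum, mask):
        if len(frame) != len(mask_frame):
            raise ValueError("each frame and its mask must have the same number of bins")
        enhanced.append([x * m for x, m in zip(frame, mask_frame)])
    return enhanced


# Toy values for illustration only (2 frames x 3 frequency bins).
noisy = [[4.0, 2.0, 1.0], [3.0, 0.5, 2.0]]
mask = [[1.0, 0.5, 0.0], [0.5, 0.0, 1.0]]
enhanced = apply_mask(noisy, mask)
print(enhanced)
```

A mask value near 1 keeps a time-frequency bin and a value near 0 suppresses it, which is why residual noise can remain wherever the estimated mask is too permissive.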

Color Image Processing using Fuzzy Cluster Filters and Weighted Vector $\alpha$-trimmed Mean Filter (퍼지 클러스터 필터와 가중화 된 벡터 $\alpha$-trimmed 평균 필터를 이용한 칼라 영상처리)

엄경배;이준환
- The Journal of Korean Institute of Communications and Information Sciences
- v.24 no.9B
- pp.1731-1741
- 1999
Color images are often corrupted by noise from noisy sensors or channel transmission errors. Filters such as the vector median filter and the vector $\alpha$-trimmed mean filter have been used for color noise removal. In this paper, we propose fuzzy cluster filters based on possibilistic c-means clustering, because possibilistic c-means clustering yields robust memberships in noisy environments. We also propose a weighted vector $\alpha$-trimmed mean filter to improve on the conventional vector $\alpha$-trimmed mean filter. In this filter, the central data are weighted more heavily than the outlying data. We implemented a color noise generator to evaluate the performance of the proposed filters in color noise environments. The NCD measure and visual assessment by human observers are used to evaluate the performance of the proposed filters. In the experiments, the proposed fuzzy cluster filters gave the best performance in terms of the NCD measure among the compared filters in mixed noise. Simulation results showed that the proposed weighted vector $\alpha$-trimmed mean filter performs better than the conventional vector $\alpha$-trimmed mean filter for every kind of noise tested.
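To make the filtering idea in the abstract above concrete, the sketch below ranks the color vectors in one window by their aggregate distance to all other vectors, discards the most outlying fraction $\alpha$, and averages the rest. This follows a common textbook formulation of the vector $\alpha$-trimmed mean, which is an assumption here; the optional rank-based `weights` argument is a hypothetical stand-in for "central data are weighted more" and is not the weighting defined in the paper.

```python
import math


def aggregate_distances(vectors):
    """Sum of Euclidean distances from each vector to every vector in the window."""
    return [sum(math.dist(v, w) for w in vectors) for v in vectors]


def vector_alpha_trimmed_mean(vectors, alpha, weights=None):
    """Average the vectors that remain after trimming the most outlying ones.

    vectors: color vectors (e.g. RGB tuples) from one filter window.
    alpha:   fraction of vectors to discard, 0 <= alpha < 1.
    weights: optional weights by rank (rank 0 = most central vector);
             a hypothetical scheme for illustration only.
    """
    if not 0 <= alpha < 1:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    distances = aggregate_distances(vectors)
    # Most central vectors (smallest aggregate distance) come first.
    order = sorted(range(len(vectors)), key=lambda i: distances[i])
    keep = max(1, len(vectors) - int(alpha * len(vectors)))
    kept = [vectors[i] for i in order[:keep]]
    if weights is None:
        weights = [1.0] * keep
    if len(weights) < keep:
        raise ValueError("need at least one weight per kept vector")
    w = weights[:keep]
    total = sum(w)
    dims = len(kept[0])
    return tuple(sum(wi * v[d] for wi, v in zip(w, kept)) / total for d in range(dims))


# Toy 5-pixel window with one impulsive outlier (illustrative values).
window = [(10, 10, 10), (12, 10, 10), (11, 11, 10), (10, 12, 10), (255, 0, 0)]
plain = vector_alpha_trimmed_mean(window, alpha=0.2)
weighted = vector_alpha_trimmed_mean(window, alpha=0.2, weights=[4.0, 3.0, 2.0, 1.0])
print(plain, weighted)
```

With `alpha=0.2` one of the five vectors is trimmed, so the impulsive pixel does not contribute to either output.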

The Algorithm for Weak Signal Detection and Estimation (미소신호 검출과 추정에 관한 알고리즘)

신승호;진용옥
- The Journal of Korean Institute of Communications and Information Sciences
- v.11 no.5
- pp.349-359
- 1986
This paper presents basic research on automatically identifying signals that have a bandwidth of less than 200 Hz and rarely appear in the shortwave band between 3 and 7 MHz. To this end, we first describe a detection and estimation method for testing the presence or absence of OOK signals at around 0 dB in a 100 kHz bandwidth. In the course of detection and estimation, the presence of an OOK-modulated signal in additive noise was determined at a rate of about 77% using LOD and E-C, and about 90% using the pattern model method based on the correlation function.

A study on loss combination in time and frequency for effective speech enhancement based on complex-valued spectrum (효과적인 복소 스펙트럼 기반 음성 향상을 위한 시간과 주파수 영역 손실함수 조합에 관한 연구)

Jung, Jaehee;Kim, Wooil
- The Journal of the Acoustical Society of Korea
- v.41 no.1
- pp.38-44
- 2022
Speech enhancement is performed to improve the intelligibility and quality of noise-corrupted speech. In this paper, speech enhancement performance is compared using different loss functions in the time and frequency domains. This study proposes a combination of loss functions that exploits the advantages of each domain by considering both the details of the spectrum and the speech waveform. In our study, Scale Invariant-Source to Noise Ratio (SI-SNR) is used as the time domain loss function, and Mean Squared Error (MSE), calculated over the complex-valued spectrum and the magnitude spectrum, is used in the frequency domain. The phase loss is obtained using the sine function. The speech enhancement results are evaluated using Source-to-Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI). The resulting spectrograms are also compared to confirm the speech enhancement results. The experimental results on the TIMIT database show the highest performance when the SI-SNR and magnitude loss functions are combined.
https://doi.org/10.7776/ASK.2022.41.1.038
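The time-domain loss named in the abstract above can be sketched as follows. This uses the widely cited SI-SNR formulation (zero-mean signals, projection of the estimate onto the target), which is assumed here rather than taken from the paper; the function names, the `eps` constant, and the synthetic test signals are all illustrative.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length waveforms."""
    if len(estimate) != len(target):
        raise ValueError("waveforms must have the same length")
    # Remove the mean so that a DC offset does not affect the score.
    est_mean = sum(estimate) / len(estimate)
    tgt_mean = sum(target) / len(target)
    est = [x - est_mean for x in estimate]
    tgt = [x - tgt_mean for x in target]
    # Project the estimate onto the target: the part explained by the target.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    s_target = [dot / tgt_energy * t for t in tgt]
    # Whatever is left over is treated as noise.
    e_noise = [e - s for e, s in zip(est, s_target)]
    ratio = sum(s * s for s in s_target) / (sum(n * n for n in e_noise) + eps)
    return 10.0 * math.log10(ratio + eps)


def si_snr_loss(estimate, target):
    """Negative SI-SNR, so that minimizing the loss maximizes SI-SNR."""
    return -si_snr(estimate, target)


# Synthetic example: a sinusoid plus a weaker, orthogonal sinusoid as "noise".
n = 64
clean = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
noisy = [c + 0.1 * math.cos(2 * math.pi * 13 * k / n) for k, c in enumerate(clean)]
louder = [2.0 * x for x in noisy]
print(si_snr(noisy, clean), si_snr(louder, clean))
```

Rescaling the estimate leaves the score essentially unchanged, which is the property that distinguishes SI-SNR from a plain SNR and makes it insensitive to output gain.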

Hierarchical Smoothing Technique by Empirical Mode Decomposition (경험적 모드분해법에 기초한 계층적 평활방법)

Kim Dong-Hoh;Oh Hee-Seok
- The Korean Journal of Applied Statistics
- v.19 no.2
- pp.319-330
- 2006
A real-world signal is usually composed of multiple signals with different frequency scales. For example, sunspot data fluctuate over cycles of 11 and 85 years. Economic data are commonly regarded as a compound of a seasonal component, a cyclic component, and a long-term trend. Decomposition of such signals is one of the main topics in time series analysis. However, when the signal is nonstationary, traditional time series methods such as spectral analysis are not suitable. Huang et al. (1998) proposed a data-adaptive method called empirical mode decomposition (EMD). Because of its robustness to nonstationarity, EMD has been applied in various fields. Huang et al., however, did not consider denoising when the data are contaminated by error. In this paper, we propose an efficient denoising method that utilizes cross-validation.
https://doi.org/10.5351/KJAS.2006.19.2.319

Region-Segmental Scheme in Local Normalization Process of Digital Image (디지털영상 국부정규화처리의 영역분할 구도)

Hwang, Jung-Won;Hwang, Jae-Ho
- Journal of the Institute of Electronics Engineers of Korea SP
- v.44 no.4 s.316
- pp.78-85
- 2007
This paper presents a segmental scheme for images composed of regions in the local normalization process. The scheme is based on local statistics computed through a moving window. The normalization algorithm uses linear or nonlinear functions to transfer the pixel distribution and the homogeneous affine of regions which is corrupted by additive noise. It adjusts the mean and standard deviation for the nearest-neighbor interpoint distance between the current and the normalized image signals, and adaptively changes the segmentation performance according to local statistics and parameter variation. The performance of the proposed local normalization algorithm is evaluated and compared with that of conventional normalization methods. Experimental results are presented to show the region segmentation properties of these approaches.
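As a rough illustration of "local statistics computed through a moving window" in the abstract above, the sketch below normalizes each sample of a one-dimensional signal by the mean and standard deviation of its neighborhood. This is the generic form of local normalization, assumed here for illustration; it is not the segmentation scheme of the paper, and the function name, window handling at the borders, and `eps` are hypothetical choices.

```python
import math


def local_normalize(signal, half_width, eps=1e-6):
    """Normalize each sample by the mean and standard deviation of its window.

    The window spans `half_width` samples on each side and is clipped
    at the ends of the signal.
    """
    out = []
    n = len(signal)
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        window = signal[lo:hi]
        mean = sum(window) / len(window)
        var = sum((x - mean) ** 2 for x in window) / len(window)
        # eps avoids division by zero in perfectly flat regions.
        out.append((signal[i] - mean) / (math.sqrt(var) + eps))
    return out


# One image row with two flat regions (illustrative values).
row = [10.0, 10.0, 10.0, 50.0, 50.0, 50.0]
normalized = local_normalize(row, half_width=1)
print(normalized)
```

Inside a flat region the output is zero, while samples next to the boundary between regions take values of opposite sign, which is what makes local statistics useful for telling regions apart.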

Fault Detection and Reuse of Self-Adaptive Module (자가 적응 모듈의 오류 탐지와 재사용)

Lee, Joon-Hoon;Lee, Hee-Won;Park, Jeong-Min;Jung, Jin-Su;Lee, Eun-Seok
- Proceedings of the Korean Information Science Society Conference
- 2007.10b
- pp.247-252
- 2007
Computing environments are becoming increasingly complex, and managing such complex environments is becoming increasingly important. For this kind of management, research on self-healing, which adapts to the environment without exposing the internal structure of an application, has become an important issue. In our previous work, switches were used to decide whether each component operates, in order to improve the performance of the self-adaptive module. However, the self-adaptive module may fail to operate normally because of external conditions such as viruses, and when many files are transferred, components whose switches are turned off waste resources such as memory. This study proposes a method to resolve the problems that can arise in the performance-improved self-adaptive module of our previous work: 1) the switches that decide whether components operate are checked to find components in an abnormal state and heal them, and 2) components that are not used at the current stage are reused in other tasks. With the proposed methodology, the total number of components can be reduced even when many files are transferred, and the self-adaptive control module can operate stably. For evaluation, we apply the proposed methodology to the file transfer module of a video conferencing system and compare whether the module from the previous work and the module using the proposed methodology adapt normally in predefined situations. We also compare the number of components with that of the previous methodology when many files are transferred. As a result, we were able to detect abnormal states of the self-adaptive module of the previous work and to reduce resource usage through component reuse when two or more files are transferred.

Empirical Mode Decomposition using the Second Derivative (이차 미분을 이용한 경험적 모드분해법)

Park, Min-Su;Kim, Donghoh;Oh, Hee-Seok
- The Korean Journal of Applied Statistics
- v.26 no.2
- pp.335-347
- 2013
There are various types of real-world signals. For example, an electrocardiogram (ECG) represents myocardial activity (contraction and relaxation) according to the beating of the heart. An ECG can be expressed as the fluctuation of ampere ratings over time. A signal is a composite of various types of signals. An orchestra (which boasts a beautiful melody) consists of a variety of instruments, each with a unique frequency; their sounds combine to form a perfect harmony. Various studies on how to decompose mixed stationary signals have been conducted. For non-stationary signals, methodologies designed for stationary signals are of limited use. Huang et al. (1998) proposed empirical mode decomposition (EMD) to deal with non-stationarity. EMD provides a data-driven approach that decomposes a signal into intrinsic mode functions according to local oscillation, through the identification of local extrema. However, because of the repeated process used to construct envelopes, the EMD algorithm is not efficient and not robust to noise, and its computational complexity tends to increase as the size of the signal grows. In this research, we propose a new method that extracts a local oscillation embedded in a signal by utilizing the second derivative.
https://doi.org/10.5351/KJAS.2013.26.2.335
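Two ingredients mentioned in the abstract above, the identification of local extrema and the second derivative, can be illustrated on a sampled signal. The sketch below is not the decomposition method of the paper: it only locates strict interior extrema and computes a central second difference as a discrete stand-in for the second derivative. The function names and the sample values are hypothetical.

```python
def local_extrema(signal):
    """Return (maxima, minima): indices of strict interior local extrema."""
    maxima, minima = [], []
    for i in range(1, len(signal) - 1):
        if signal[i - 1] < signal[i] > signal[i + 1]:
            maxima.append(i)
        elif signal[i - 1] > signal[i] < signal[i + 1]:
            minima.append(i)
    return maxima, minima


def second_difference(signal):
    """Central second difference for interior samples 1 .. len(signal) - 2."""
    return [signal[i - 1] - 2 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]


# Small oscillating sequence (illustrative values).
x = [0, 2, 1, 3, 0, 1]
maxima, minima = local_extrema(x)
curvature = second_difference(x)
print(maxima, minima, curvature)
```

In this example the second difference is negative at each local maximum and positive at each local minimum, which hints at why curvature can carry information about local oscillation.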

Search Result 10, Processing Time 0.021 seconds
