Search | Korea Science

Robust Speech Endpoint Detection in Noisy Environments for HRI (Human-Robot Interface) (인간로봇 상호작용을 위한 잡음환경에 강인한 음성 끝점 검출 기법)

Park, Jin-Soo;Ko, Han-Seok
- The Journal of the Acoustical Society of Korea / v.32 no.2 / pp.147-156 / 2013
This paper proposes a new speech endpoint detection method for moving robot platforms in noisy environments. The conventional method finds the endpoint of speech by applying an edge detection filter that responds to abrupt changes in the feature domain. However, because the frame-energy feature is unstable in such noisy environments, the endpoint is difficult to locate accurately. The paper therefore proposes a novel feature extraction method based on the twice-iterated fast Fourier transform (TIFFT) and statistical models of speech, and applies the resulting feature to an edge detection filter to detect the endpoint of speech. Representative experiments show a substantial improvement over the conventional method.
https://doi.org/10.7776/ASK.2013.32.2.147

Signal Processing for Speech Recognition in Noisy Environment (잡음 환경에서 음성 인식을 위한 신호처리)

Kim, Weon-Goo;Lim, Yong-Hoon;Cha, Il-Whan;Youn, Dae-Hee
- The Journal of the Acoustical Society of Korea / v.11 no.2 / pp.73-84 / 1992
This paper studies noise subtraction methods and distance measures for speech recognition in a noisy environment, and investigates the noise robustness of the distance measures when applied to isolated word recognition in white Gaussian and colored (vehicle) noise environments. Noise subtraction methods that can serve as a pre-processor for the speech recognition system are studied, namely spectral subtraction, autocorrelation subtraction, adaptive noise cancellation and acoustic beamforming. The distance measures investigated are the log likelihood ratio ($d_{LLR}$), the cepstral distance measure ($d_{CEP}$), the weighted cepstral distance measure ($d_{WCEP}$), the spectral slope distance measure ($d_{RPS}$) and the cepstral projection distance measures ($d_{CP}$, $d_{BCP}$, $d_{WCP}$, $d_{BWCP}$). Tests of the distance measures for speaker-dependent isolated word recognition in a noisy environment indicate that $d_{RPS}$ and $d_{WCEP}$, which weight higher-order cepstral coefficients more heavily, give a considerable performance improvement over $d_{CEP}$ and $d_{LLR}$. In addition, when no pre-emphasis is performed, the recognizer maintains higher performance under high noise conditions.
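As a rough illustration of why index weighting emphasizes higher-order cepstral coefficients, the sketch below compares an unweighted squared cepstral distance with an index-weighted one. The weight k^2 is an assumption borrowed from the common root-power-sum formulation; the abstract does not give the exact definitions of $d_{RPS}$ or $d_{WCEP}$, and the function names and sample vectors are hypothetical.

```python
from typing import Sequence


def cepstral_distance(c_ref: Sequence[float], c_test: Sequence[float]) -> float:
    """Unweighted squared distance between two cepstral vectors (c_1 ... c_p)."""
    return sum((a - b) ** 2 for a, b in zip(c_ref, c_test))


def index_weighted_distance(c_ref: Sequence[float], c_test: Sequence[float]) -> float:
    """Squared distance in which the k-th coefficient (k = 1, 2, ...) is weighted by k**2.

    Assumption: this mirrors the usual root-power-sum form, not necessarily
    the exact measure evaluated in the paper.
    """
    pairs = zip(c_ref, c_test)
    return sum((k * (a - b)) ** 2 for k, (a, b) in enumerate(pairs, start=1))


# Hypothetical cepstral vectors: the same perturbation (0.1) is applied
# either to the first or to the fourth coefficient.
reference = [1.0, 0.5, 0.25, 0.125]
low_order_change = [1.1, 0.5, 0.25, 0.125]
high_order_change = [1.0, 0.5, 0.25, 0.225]

plain_low = cepstral_distance(reference, low_order_change)
plain_high = cepstral_distance(reference, high_order_change)
weighted_low = index_weighted_distance(reference, low_order_change)
weighted_high = index_weighted_distance(reference, high_order_change)
```

The unweighted measure scores both perturbations equally, while the index-weighted one scores the fourth-coefficient change 16 times higher.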

A Study on SNR Estimation of Continuous Speech Signal (연속음성신호의 SNR 추정기법에 관한 연구)

Song, Young-Hwan;Park, Hyung-Woo;Bae, Myung-Jin
- The Journal of the Acoustical Society of Korea / v.28 no.4 / pp.383-391 / 2009
In speech signal processing, a speech signal corrupted by noise should be enhanced to improve its quality. Noise estimation methods usually need to adapt to a changing environment, so the noise profile is updated in silence regions to avoid the influence of speech. This requires locating the voice regions before noise estimation. However, if the received signal contains no silence region, that approach cannot be applied. This paper proposes an SNR estimation method for continuous speech signals. In the stationary region of voiced speech, the waveform is highly correlated from one pitch period to the next, so the SNR can be estimated from the correlation of adjacent waveforms after dividing a frame into pitch periods. For unvoiced speech, the vocal tract characteristic is reflected in the noise, so the SNR can be estimated from the spectral distance between the spectrum of the received signal and the estimated vocal tract characteristic. Lastly, because the energy of speech is concentrated mostly in voiced regions, the SNR can also be estimated from the ratio of voiced-region energy to unvoiced-region energy.
https://doi.org/10.7776/ASK.2009.28.4.383
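The voiced-speech idea can be sketched under a simple model that is an assumption here, not the paper's stated derivation: if the signal is exactly periodic and the noise is white, zero-mean and uncorrelated with it, the normalized correlation of adjacent pitch periods is rho = Ps / (Ps + Pn), so SNR = rho / (1 - rho). The function name, the synthetic test signal and the known pitch period are all hypothetical.

```python
import math
import random


def snr_db_from_adjacent_periods(x, period):
    """Estimate SNR (dB) from the correlation between samples one pitch period apart.

    Assumes x = s + n with s exactly periodic and n white, zero-mean and
    uncorrelated with s, so that rho = Ps / (Ps + Pn).
    """
    n = len(x) - period
    cross = sum(x[i] * x[i + period] for i in range(n))
    power_a = sum(x[i] ** 2 for i in range(n))
    power_b = sum(x[i + period] ** 2 for i in range(n))
    rho = cross / math.sqrt(power_a * power_b)
    rho = min(max(rho, 1e-6), 1.0 - 1e-6)  # keep the ratio finite
    return 10.0 * math.log10(rho / (1.0 - rho))


# Synthetic "voiced" signal: two harmonics with an 80-sample pitch period.
# Signal power is 0.5 + 0.125 = 0.625; noise variance 0.0625 gives 10 dB.
random.seed(0)
PERIOD = 80
clean = [
    math.sin(2.0 * math.pi * i / PERIOD) + 0.5 * math.sin(4.0 * math.pi * i / PERIOD)
    for i in range(PERIOD * 200)
]
noisy = [s + random.gauss(0.0, 0.25) for s in clean]
estimated_snr_db = snr_db_from_adjacent_periods(noisy, PERIOD)
```

On real speech the pitch period varies and must be tracked per frame, which this sketch deliberately leaves out.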

Time- and Frequency-Domain Optimization of Sparse Multisine Coefficients for Nonlinear Amplifier Characterization

Park, Youngcheol;Yoon, Hoijin
- Journal of electromagnetic engineering and science / v.15 no.1 / pp.53-58 / 2015
For testing nonlinear power amplifiers, this paper suggests an approach to designing optimized multisine signals that can be substituted for the original modulated signal. In multisine design, the complex coefficients should be chosen so that the multisine mimics the target signal as closely as possible, but very few methods have been adopted as general solutions for the coefficients. Furthermore, no method for choosing the coefficient phases has been proven to give the best resemblance to the original. Therefore, a time-domain nonlinear optimization method is suggested for determining the phases of the multisine coefficients. A frequency-domain method based on the spectral response of the target signal is also suggested for the magnitudes of the coefficients. For verification, multisine signals are designed to emulate an LTE downlink signal of 10 MHz bandwidth and are used to test a nonlinear amplifier at 1.9 GHz. The suggested phase-optimized multisine had a normalized error lower by 0.163 dB when N = 100, and the measurement results showed that it estimated the adjacent-channel leakage ratio (ACLR) more accurately, by as much as 12 dB, than the conventional iterative method.
https://doi.org/10.5515/JKIEES.2015.15.1.53

An Adaptive Wind Noise Reduction Method Based on a priori SNR Estimation for Speech Enhancement (음성 강화를 위한 a priori SNR 추정기반 적응 바람소리 저감 방법)

Seo, Ji-Hun;Lee, Seok-Pil
- The Transactions of The Korean Institute of Electrical Engineers / v.64 no.12 / pp.1756-1760 / 2015
This paper focuses on an a priori signal-to-noise ratio (SNR) estimation method for speech enhancement. Many studies have addressed speech enhancement using various ambient noise cancellation methods. Spectral subtraction (SS), which is widely used in noise reduction, involves a trade-off between noise reduction performance and signal distortion. There is therefore a growing need for adaptive methods, such as those based on an estimated a priori SNR, that achieve high performance with low distortion. The decision-directed (DD) approach is used to determine the a priori SNR in noisy speech signals. In that approach the a priori SNR is estimated from the magnitude components only and consequently follows the a posteriori SNR with a one-frame delay. We propose a modified a priori SNR estimator and a weighted rational transfer function for enhancing speech corrupted by wind noise. The experimental results show that the proposed estimator achieves better Perceptual Evaluation of Speech Quality (PESQ, ITU-T P.862) scores than conventional DD-based systems and other noise reduction methods.
https://doi.org/10.5370/KIEE.2015.64.12.1756
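For orientation, here is a minimal sketch of the conventional decision-directed estimator that the abstract takes as its baseline, applied to a single frequency bin with a Wiener gain. It is not the modified estimator proposed in the paper, whose details the abstract does not give; the function name, the smoothing factor 0.98, the SNR floor and the toy power sequence are assumptions.

```python
def decision_directed_gains(noisy_power, noise_power, alpha=0.98, xi_min=1e-3):
    """Wiener gains for one frequency bin using the decision-directed a priori SNR.

    noisy_power: per-frame |Y|^2 values for the bin.
    noise_power: noise power estimate for the bin (assumed known and constant).
    """
    gains = []
    prev_clean_power = None  # |S_hat|^2 from the previous frame
    for y2 in noisy_power:
        gamma = y2 / noise_power           # a posteriori SNR
        instantaneous = max(gamma - 1.0, 0.0)
        if prev_clean_power is None:
            xi = instantaneous             # no history for the first frame
        else:
            xi = alpha * prev_clean_power / noise_power + (1.0 - alpha) * instantaneous
        xi = max(xi, xi_min)               # floor on the a priori SNR
        gain = xi / (1.0 + xi)             # Wiener gain
        prev_clean_power = (gain ** 2) * y2
        gains.append(gain)
    return gains


# Two noise-only frames followed by two frames 20 dB above the noise.
gains = decision_directed_gains([1.0, 1.0, 100.0, 100.0], noise_power=1.0)
```

The gain in the first high-energy frame is still well below its eventual level and only catches up in the following frame, which is the one-frame delay mentioned in the abstract.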

Feasibility study of using triple-energy CT images for improving stopping power estimation

Yejin Kim;Jin Sung Kim ;Seungryong Cho
- Nuclear Engineering and Technology / v.55 no.4 / pp.1342-1349 / 2023
The planning accuracy of charged particle therapy (CPT) depends on the accuracy of stopping power (SP) estimation. In this study, we propose a method of deriving a pseudo-triple-energy CT (pTECT) that is achievable on existing dual-energy CT (DECT) systems for better SP estimation. To remove the direct effect of errors in CT values, relative CT values across three scanning voltage settings were used. CT values of each tissue substitute phantom were measured to show the non-linearity of the values, which suggested the absolute difference and the ratio of CT values as parameters for SP estimation. Electron density, effective atomic number (EAN), mean excitation energy and SP were calculated from these parameters. Two conventional methods were implemented and compared with the proposed pTECT method in terms of residuals, absolute error and root-mean-square error (RMSE). The proposed method outperformed the comparison methods on every evaluation metric. In particular, the estimation errors for EAN and mean excitation energy using pTECT converged toward zero. This proof-of-concept study showed the feasibility of using three CT values for accurate SP estimation. The suggested pTECT method indicates the potential clinical utility of spectral CT imaging for CPT planning.
https://doi.org/10.1016/j.net.2022.12.018

Frequency Recognition in SSVEP-based BCI systems With a Combination of CCA and PSDA (CCA와 PSDA를 결합한 SSVEP 기반 BCI 시스템의 주파수 인식 기법)

Lee, Ju-Yeong;Lee, Yu-Ri;Kim, Hyoung-Nam
- Journal of the Institute of Electronics and Information Engineers / v.52 no.10 / pp.139-147 / 2015
The steady-state visual evoked potential (SSVEP) has been actively studied because of its short training time, relatively high signal-to-noise ratio, and high information transfer rate. There are two popular analysis methods for SSVEP signals: power spectral density analysis (PSDA) and canonical correlation analysis (CCA). However, PSDA is known to be vulnerable to noise because it uses a single channel. Although conventional CCA is more accurate than PSDA, it may not be appropriate for a real-time SSVEP-based BCI system with a short time window (TW) because it uses sinusoidal signals as references. Neither method is therefore efficient for a real-time BCI system that requires both a short TW and high recognition accuracy. To overcome this limitation of the conventional methods, this paper proposes a frequency recognition method that combines CCA and PSDA, using the difference between the powers of the canonical variables obtained from CCA. Experimental results show that the combination of CCA and PSDA performs better than CCA alone for a short TW.
https://doi.org/10.5573/ieie.2015.52.10.139
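To make the PSDA baseline concrete, the sketch below recognizes a stimulus frequency on a single channel by comparing the spectral power at each candidate frequency. It covers only plain PSDA, not CCA or the proposed combination; the function names, the 250 Hz sampling rate, the candidate frequencies and the synthetic signal are assumptions for illustration.

```python
import math


def bin_power(signal, freq_hz, fs_hz):
    """Normalized power of one frequency component, computed with a single-bin DFT."""
    re = sum(v * math.cos(2.0 * math.pi * freq_hz * i / fs_hz) for i, v in enumerate(signal))
    im = sum(v * math.sin(2.0 * math.pi * freq_hz * i / fs_hz) for i, v in enumerate(signal))
    return (re * re + im * im) / len(signal) ** 2


def psda_recognize(signal, candidates_hz, fs_hz):
    """Return the candidate stimulus frequency with the largest spectral power."""
    return max(candidates_hz, key=lambda f: bin_power(signal, f, fs_hz))


# Synthetic single-channel signal: a 10 Hz response plus a weaker 7 Hz
# interferer, sampled for 2 s at 250 Hz (hypothetical values).
FS_HZ = 250.0
synthetic = [
    math.sin(2.0 * math.pi * 10.0 * i / FS_HZ) + 0.3 * math.sin(2.0 * math.pi * 7.0 * i / FS_HZ)
    for i in range(500)
]
detected_hz = psda_recognize(synthetic, [8.0, 10.0, 12.0, 15.0], FS_HZ)
```

Because the decision rests on one channel's power spectrum, any noise that adds power at a candidate frequency directly affects the result, which is the weakness the abstract attributes to PSDA.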

Estimation of Rice Canopy Leaf Area Index(LAI) by Spectral Reflectance of Solar Radiation in Paddy Field (태양광 반사율을 이용한 벼 군락의 엽면적지수 추정)

이정택;이춘우;주문갑;홍석영
- KOREAN JOURNAL OF CROP SCIENCE / v.42 no.2 / pp.173-181 / 1997
To estimate the leaf area index (LAI) of rice plants by a non-destructive method, spectral reflectance from the rice canopy was measured with a spectroradiometer (LI-1800, LICOR Inc.) at one-week intervals during the rice growing season in a paddy field at Suwon in 1993. The LAI of two medium-late maturing varieties, Daechungbyeo and Ilpumbyeo, and one early maturing variety, Jinbubyeo, was observed and compared with values estimated from vegetation indices. The reflectance (R) at visible wavelengths remained below 0.1 over the entire growing season, whereas that at near-infrared wavelengths ranged from 0.1 to 0.5 and showed a significant positive correlation with LAI. Vegetation indices determined from the reflectance at visible wavelengths against that at near-infrared wavelengths showed a high correlation with the LAI of the rice canopy. The vegetation index derived from the wide-band ratio NIR (720~1,100 nm) / Blue (400~500 nm) showed the highest correlation coefficient with LAI. The vegetation index derived from the narrow-band (10 nm interval) ratio R910/R460 corresponded well to measured values from transplanting to heading stage (Y = 0.16799X - 0.79776; $R^2$ = 0.94). From heading stage to maturity, however, another vegetation index, NIR (720~1,100 nm) / Red (600~700 nm), showed a higher correlation with LAI than NIR/Blue did.

Protein molecular structure, degradation and availability of canola, rapeseed and soybean meals in dairy cattle diets

Tian, Yujia;Zhang, Xuewei;Huang, Rongcai;Yu, Peiqiang
- Asian-Australasian Journal of Animal Sciences / v.32 no.9 / pp.1381-1388 / 2019
Objective: The aims of this study were to reveal the magnitude of the differences in protein structures at a cellular level, as well as in protein utilization and availability, among soybean meal (SBM), canola meal (CM), and rapeseed meal (RSM) as feedstocks in China. Methods: Experiments were designed to compare the three types of feedstocks in terms of: i) protein chemical profiles; ii) protein fractions partitioned according to the Cornell Net Carbohydrate and Protein System; iii) protein molecular structures and protein secondary structures; iv) special protein compounds, namely amino acids (AA); v) total digestible protein and energy values; vi) in situ rumen protein degradability and intestinal digestibility. The protein secondary structures were measured using the FT/IR molecular spectroscopy technique. A summary chemical approach in the National Research Council (NRC) model was applied to analyze truly digestible protein. Results: The results showed significant differences among SBM, CM, and RSM in both protein nutritional profiles and protein structure parameters, in terms of $\alpha$-helix and $\beta$-sheet spectral intensity and their ratio, and amide I and amide II spectral intensity and their ratio. SBM had higher crude protein (CP) and AA content than CM and RSM. SBM and CM had a higher dry matter (DM) content than RSM (p<0.05), whereas no statistically significant difference was found between SBM and CM (p = 0.28). Effective degradability of CP and DM did not differ significantly among the three groups (p>0.05). Intestinal digestibility of rumen undegradable protein, measured by the three-step in vitro method, differed significantly (p = 0.05) among SBM, CM, and RSM, with SBM the highest, RSM the lowest, and CM in between. NRC modeling results showed that digestible CP content in SBM was significantly higher than in CM and RSM (p<0.05). Conclusion: This study suggested that SBM and CM had similar protein value and availability for dairy cattle, while RSM had the lowest protein quality and utilization.
https://doi.org/10.5713/ajas.18.0829

Robust Blind Source Separation to Noisy Environment For Speech Recognition in Car (차량용 음성인식을 위한 주변잡음에 강건한 브라인드 음원분리)

Kim, Hyun-Tae;Park, Jang-Sik
- The Journal of the Korea Contents Association / v.6 no.12 / pp.89-95 / 2006
The performance of blind source separation (BSS) using independent component analysis (ICA) declines significantly in a reverberant environment. The post-processing method proposed in this paper is designed to remove the residual components precisely. It uses a modified NLMS (normalized least mean square) filter in the frequency domain to estimate the cross-talk path that causes residual cross-talk components. Residual cross-talk components in one channel correspond to direct components in the other channel, so the cross-talk path can be estimated with an adaptive filter driven by the other channel's input signal. In the conventional NLMS filter the step size is normalized by the input signal power, whereas in the modified NLMS filter it is normalized by the sum of the input signal power and the error signal power. This prevents misadjustment of the filter weights. The estimated residual cross-talk components are then removed by non-stationary spectral subtraction. Computer simulation results using speech signals show that the proposed method improves the noise reduction ratio (NRR) by approximately 3 dB over conventional FDICA.
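The difference between the two normalizations can be sketched as follows. This is a real-valued, time-domain simplification, whereas the paper works in the frequency domain; the error power is approximated by the instantaneous squared error, and the function name, step size, two-tap "cross-talk path" and test signal are hypothetical.

```python
import random


def nlms_identify(x, d, num_taps, mu=0.5, modified=True, eps=1e-8):
    """Identify an FIR path from input x to desired signal d with (modified) NLMS.

    modified=False: step normalized by input power only (conventional NLMS).
    modified=True:  step normalized by input power plus squared error
                    (an assumed time-domain reading of the modified rule).
    """
    w = [0.0] * num_taps
    for n in range(num_taps - 1, len(x)):
        window = [x[n - k] for k in range(num_taps)]   # most recent sample first
        y = sum(wk * xk for wk, xk in zip(w, window))  # filter output
        e = d[n] - y                                   # estimation error
        norm = sum(xk * xk for xk in window) + eps
        if modified:
            norm += e * e                              # shrink the step when the error is large
        w = [wk + mu * e * xk / norm for wk, xk in zip(w, window)]
    return w


# Hypothetical two-tap path: d[n] = 0.6 * x[n] - 0.3 * x[n - 1].
random.seed(1)
x = [random.uniform(-1.0, 1.0) for _ in range(2000)]
d = [0.6 * x[n] - 0.3 * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]
estimated_path = nlms_identify(x, d, num_taps=2)
```

With the extra error term in the denominator, large errors produce smaller weight updates, while the rule approaches conventional NLMS as the error shrinks.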
