Search | Korea Science

Speech Enhancement in Noisy Speech Using Neural Network (신경회로망을 사용한 잡음이 중첩된 음성 강조)

Choi, Jae-Seung
Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.5 s.305 / pp.165-172 / 2005
Speech recognition in noisy environments requires a system that reduces noise and enhances speech. For speech enhancement, it is effective to imitate the human auditory system, which has an excellent spectral analysis mechanism. Accordingly, this paper proposes an adaptive method based on the auditory mechanism called lateral inhibition. The method first estimates the noise intensity with a neural network, then adaptively adjusts both the lateral inhibition coefficients and the amplitude adjustment coefficient according to the noise intensity of each input frame. Spectral distortion measurements confirm that the proposed method is effective for speech degraded by white noise, colored noise, and road noise.

A Study on the Cross-Polarization Interference Canceller of Radio Relay System for Spectral Efficiency Enhancement (주파수 효율 향상을 위한 무선 중계장치의 직교편파 간섭제거기에 관한 연구)

서경환
Proceedings of the Korea Electromagnetic Engineering Society Conference / 2001.11a / pp.25-28 / 2001
In this paper, to eliminate the cross-polarization interference caused by the co-channel dual-polarization technique of a digital radio relay system (DRRS), a cross-polarization interference canceller (XPIC) is analysed in terms of analytical modeling, digital design, and performance. Computer simulation shows that a 13-tap adaptive equalizer and XPIC yield an XPIC improvement of about 23 dB. To show the operation of the designed XPIC, simulated results are reviewed for a 64-QAM DRRS with co-channel dual polarization.

Comparison of Different Methods to Merge IRS-1C PAN and Landsat TM Data (IRS-1C PAN 데이터와 Landsat TM 데이터의 종합방법 비교분석)

안기원;서두천
Korean Journal of Remote Sensing / v.14 no.2 / pp.149-164 / 1998
The main objective of this study was to evaluate the effectiveness of different merging methods using high-resolution IRS (Indian Remote Sensing Satellite)-1C panchromatic data and multispectral Landsat TM data. The five methods used to merge the information content of the two satellite data sets were the intensity-hue-saturation (IHS), principal component analysis (PCA), high-pass filter (HPF), ratio enhancement, and look-up-table (LUT) procedures. Two measures were used to evaluate the merging methods: visual inspection, and comparison of the mean, standard deviation, and root mean square error between the merged and original image data values of each band. The ratio enhancement method preserved the spectral characteristics of the data well. In visual inspection of spatial resolution preservation, the PCA method gave the best result, followed by HPF, ratio enhancement, and IHS, with LUT the worst.
https://doi.org/10.7780/kjrs.1998.14.2.149
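
The quantitative measure described in this abstract (mean, standard deviation, and root mean square error between merged and original band values) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `band_comparison`, the flat-list pixel representation, and the use of the population standard deviation are all assumptions made here.

```python
import math
from statistics import mean, pstdev


def band_comparison(original, merged):
    """Compare one band of a merged image against the same band of the original.

    Both arguments are flat sequences of pixel values of equal length
    (a hypothetical representation chosen for this sketch).
    """
    if len(original) != len(merged):
        raise ValueError("bands must have the same number of pixels")
    squared_errors = [(m - o) ** 2 for o, m in zip(original, merged)]
    return {
        # Shift in band mean introduced by merging
        "mean_shift": mean(merged) - mean(original),
        # Shift in band spread (population standard deviation)
        "std_shift": pstdev(merged) - pstdev(original),
        # Root mean square error between merged and original pixel values
        "rmse": math.sqrt(mean(squared_errors)),
    }


# Example with made-up pixel values
stats = band_comparison([10, 20, 30, 40], [12, 18, 33, 37])
print(stats)
```

Smaller values of all three quantities suggest that a merging method has altered the spectral content of the band less.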

Assessment of the Ochang Plain NDVI using Improved Resolution Method from MODIS Images (MODIS영상의 고해상도화 수법을 이용한 오창평야 NDVI의 평가)

Park, Jong-Hwa;La, Sang-Il
Journal of the Korean Society of Environmental Restoration Technology / v.9 no.6 / pp.1-12 / 2006
Remote sensing cannot measure a vegetation index (VI) directly, but it can provide a reasonably good estimate of a vegetation index, defined as a ratio of satellite bands. Monitoring vegetation near urban regions is made difficult by the low spatial and temporal resolution of the captured images. In this study, a spatial resolution enhancement method is adopted to improve low spatial resolution. Recent studies have successfully estimated the normalized difference vegetation index (NDVI) using improved-resolution methods applied to data from the Moderate Resolution Imaging Spectroradiometer (MODIS) onboard the EOS Terra satellite. Spatial resolution enhancement is an important tool in remote sensing, as many Earth observation satellites provide both high-resolution and low-resolution multi-spectral images. Examples of the enhancement of a MODIS multi-spectral image and a MODIS NDVI image of Cheongju using a Landsat TM high-resolution multi-spectral image are presented. The results are compared with those of the IHS technique for enhancing the spatial resolution of multi-spectral bands using a higher-resolution data set. To provide a continuous monitoring capability for NDVI, in situ measurements of NDVI in paddy fields were carried out in 2004 for comparison with remotely sensed MODIS data. We compare and discuss NDVI estimates from MODIS sensors and in situ spectroradiometer data over the Ochang plain region. The results indicate that the MODIS NDVI is underestimated by approximately 50%.
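
For readers unfamiliar with the index, the standard NDVI definition, (NIR - Red) / (NIR + Red), and a relative-bias comparison against a reference measurement can be sketched as below. The reflectance values are made up for illustration and are not taken from the paper; `relative_bias` is a hypothetical helper name.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from near-infrared and red reflectance."""
    total = nir + red
    if total == 0:
        # Convention chosen for this sketch to avoid division by zero
        return 0.0
    return (nir - red) / total


def relative_bias(estimate: float, reference: float) -> float:
    """Signed relative difference of an estimate from a reference value."""
    return (estimate - reference) / reference


# Made-up values: a satellite-derived NDVI compared with an in situ NDVI
in_situ = ndvi(nir=0.50, red=0.10)
satellite = 0.5 * in_situ  # an estimate that is half the reference
print(f"in situ NDVI = {in_situ:.3f}, relative bias = {relative_bias(satellite, in_situ):+.0%}")
```

A relative bias near -50% corresponds to the kind of underestimation the abstract reports.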

Lossless Coding of Audio Spectral Coefficients Using Selective Bit-Plane Coding (선택적 비트 플레인 부호화를 이용한 오디오 주파수 계수의 무손실 부호화 기술)

Yoo, Seung-Kwan;Park, Ho-Chong;Oh, Seoung-Jun;Ahn, Chang-Beom;Sim, Dong-Gyu;Beak, Seung-Kwon;Kang, Kyoung-Ok
The Journal of the Acoustical Society of Korea / v.27 no.1 / pp.18-25 / 2008
In this paper, a new lossless coding method for the spectral coefficients of an audio codec is proposed. The conventional lossless coder uses Huffman coding that exploits the statistical characteristics of the spectral coefficients, but its simple structure limits its coding efficiency. To overcome this limitation, a new lossless coding scheme with better performance is proposed, consisting of a bit-plane transform and run-length coding. In the proposed scheme, the spectral coefficients are first transformed by bit-plane into a 1-D bit-stream with better correlation properties, which is then run-length coded and finally Huffman coded. In addition, coding performance is further improved by dividing the entire frequency range into 3 groups and applying the proposed bit-plane coding selectively to each group. The performance of the proposed scheme is measured as the theoretical number of bits based on entropy, and shows an improvement of at most 6% over the conventional lossless coder used in the AAC audio codec.
https://doi.org/10.7776/ASK.2008.27.1.018
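
The first two stages named in this abstract, the bit-plane transform into a 1-D bit-stream and the run-length coding, can be sketched roughly as follows. This is only an illustration under assumptions: coefficients are taken to be non-negative integers, planes are emitted most significant first, and the final Huffman stage and the selective per-group application are omitted. All function names are hypothetical.

```python
from itertools import groupby


def to_bitstream(coeffs, num_bits):
    """Concatenate bit planes (most significant first) into one 1-D bit list."""
    return [
        (c >> b) & 1
        for b in range(num_bits - 1, -1, -1)  # one plane per bit position
        for c in coeffs                        # one bit per coefficient in the plane
    ]


def run_lengths(bits):
    """Run-length code a bit list as (bit value, run length) pairs."""
    return [(bit, len(list(group))) for bit, group in groupby(bits)]


def from_bitstream(bits, count, num_bits):
    """Invert to_bitstream, showing that the transform itself is lossless."""
    coeffs = [0] * count
    for i, bit in enumerate(bits):
        plane = i // count
        coeffs[i % count] |= bit << (num_bits - 1 - plane)
    return coeffs


coeffs = [5, 1, 0, 4]  # made-up quantized magnitudes
bits = to_bitstream(coeffs, num_bits=3)
print(run_lengths(bits))
```

Because small coefficients leave the upper planes mostly zero, the bit-stream tends to contain long runs, which is what makes the run-length stage worthwhile.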

Parameter Estimation and Fitting Error Analysis of the Representative Spectrums using the Wave Spectrum off the Namhangjin, East Sea (남항진 파랑 스펙트럼 정보를 이용한 대표 스펙트럼 매개변수 추정 및 분석)

Cho, Hong Yeon;Jeong, Weon Mu;Oh, Sang-Ho;Baek, Won Dae
Journal of Korean Society of Coastal and Ocean Engineers / v.32 no.5 / pp.363-371 / 2020
The parameters of the modified BM and JONSWAP spectra are estimated using a spectral data set collected during high wave events off Namhangjin, located on the east coast of Korea. The parameters of the modified BM spectrum were estimated to be 1.04 and 0.27, which were similar to the conventional values of 1.098 and 0.30 but differed significantly in statistical terms. On the other hand, the peak enhancement factor of the JONSWAP spectrum was estimated to be 1.4, substantially smaller than the conventional value of 3.3. The RMSE differences between the fitted results of the two spectra were small, approximately 0.2. At frequencies above the peak frequency, however, the spectral energy density decreased relatively mildly with increasing frequency compared to the standard forms of the modified BM and JONSWAP spectra.
https://doi.org/10.9765/KSCOE.2020.32.5.363
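
To give a sense of what the peak enhancement factor does, the sketch below evaluates the commonly cited JONSWAP form, S(f) = α g² (2π)⁻⁴ f⁻⁵ exp(-1.25 (f_p/f)⁴) γ^r with r = exp(-(f - f_p)² / (2 σ² f_p²)), for the two factor values mentioned in the abstract. The peak frequency of 0.1 Hz and α = 0.0081 are illustrative assumptions, not values from the paper, and the paper's fitting procedure is not reproduced.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def jonswap(f: float, fp: float, gamma: float, alpha: float = 0.0081) -> float:
    """Spectral density of the commonly cited JONSWAP form at frequency f (Hz)."""
    sigma = 0.07 if f <= fp else 0.09  # standard spectral width parameters
    base = alpha * G**2 * (2 * math.pi) ** -4 * f**-5 * math.exp(-1.25 * (fp / f) ** 4)
    r = math.exp(-((f - fp) ** 2) / (2 * sigma**2 * fp**2))
    return base * gamma**r


fp = 0.1  # assumed peak frequency in Hz
for gamma in (3.3, 1.4):
    # Ratio to the gamma = 1 spectrum at the peak equals gamma itself
    ratio = jonswap(fp, fp, gamma) / jonswap(fp, fp, 1.0)
    print(f"gamma = {gamma}: peak is {ratio:.2f} times the gamma = 1 spectrum")
```

With γ = 1.4 the spectral peak is much less pronounced than with the conventional 3.3, while the spectrum far from the peak is almost unchanged.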

Enhancement of DNA-mediated Energy Transfer from Ethidium to meso-Tetrakis(N-methylpyridinium-4-yl)porphyrin by Ca²⁺ Ion

Kim, Jong-Moon;Park, Bo-Ra-Mi;Kim, Young-Rhan;Gong, Lindan;Jang, Myung-Duk;Kim, Seog-K.
Bulletin of the Korean Chemical Society / v.33 no.4 / pp.1165-1169 / 2012
The fluorescence of DNA-intercalated ethidium, at an [ethidium]/[DNA base] ratio of 0.005, was quenched upon the binding of another intercalating ligand, meso-tetrakis(N-methylpyridinium-4-yl)porphyrin (TMPyP). Addition of Ca²⁺ enhanced the quenching efficiency. The range of separations between donor and acceptor molecules within which total quenching occurs was calculated, using a one-dimensional resonance energy transfer mechanism, to be 9.5 base-pairs or 32.3 Å in the absence of Ca²⁺ ions. The distance increased to 18.7 base-pairs or about 63.6 Å in the presence of 100 μM Ca²⁺. Considering that (1) Ca²⁺ had little effect on the binding modes of ethidium and TMPyP, as investigated by reduced linear dichroism, and (2) the spectral overlap between the emission spectrum of ethidium and the absorption spectrum of TMPyP was maintained in the presence of Ca²⁺, the contributions of the orientation factor and spectral overlap to the Ca²⁺-induced enhancement in DNA-mediated energy transfer were limited. Although there is no direct evidence, electron transfer along the DNA stem may accompany the observed fluorescence quenching. In this respect, DNA-bound Ca²⁺ acts as a partially conducting medium.
https://doi.org/10.5012/bkcs.2012.33.4.1165

A Generalized Subspace Approach for Enhancing Speech Corrupted by Colored Noise Using Whitening Transformation (유색 잡음에 오염된 음성의 향상을 위한 백색 변환을 이용한 일반화 부공간 접근)

Lee, Jeong-Wook;Son, Kyung-Sik;Park, Jang-Sik;Kim, Hyun-Tae
Journal of the Korea Institute of Information and Communication Engineering / v.15 no.8 / pp.1665-1674 / 2011
In this paper, we propose an algorithm for enhancing speech corrupted by colored noise. When the colored noise and the speech signal are uncorrelated, the colored noise is turned into white noise by a whitening transformation. The transformed signal is then passed to the generalized subspace approach for speech enhancement. The speech spectral distortion introduced by the whitening transformation as pre-processing is restored by applying the inverse whitening transformation as post-processing. The performance of the proposed algorithm was confirmed by computer simulation. The colored noises used in the experiment were car noise and multi-talker babble. With data from the AURORA and TIMIT databases, the proposed algorithm outperforms the previous approach in terms of SNR and SSD.
https://doi.org/10.6109/jkiice.2011.15.8.1665
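
The whitening and inverse-whitening steps that bracket the enhancement stage can be illustrated with a Cholesky factor of the noise covariance: if R = L Lᵀ, then y = L⁻¹x has white noise, and x = L y undoes the transform. This is one standard way to whiten and is an assumption here; the abstract does not say which factorization the authors use, and the subspace enhancement step itself is not shown. The 2×2 covariance is made up.

```python
import math


def cholesky(cov):
    """Lower-triangular factor L of a symmetric positive-definite matrix, cov = L L^T."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def whiten(low, x):
    """Pre-processing: solve L y = x by forward substitution, i.e. y = L^-1 x."""
    y = []
    for i, row in enumerate(low):
        y.append((x[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y


def unwhiten(low, y):
    """Post-processing: apply the inverse whitening transform, x = L y."""
    return [sum(low[i][k] * y[k] for k in range(i + 1)) for i in range(len(low))]


noise_cov = [[4.0, 2.0], [2.0, 3.0]]  # made-up colored-noise covariance
L = cholesky(noise_cov)
frame = [1.0, 2.0]                    # made-up noisy signal vector
restored = unwhiten(L, whiten(L, frame))
print(restored)
```

In the scheme the abstract describes, the enhancement would operate on the whitened vector between the `whiten` and `unwhiten` calls.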

Nonlinear Speech Enhancement Method for Reducing the Amount of Speech Distortion According to Speech Statistics Model (음성 통계 모형에 따른 음성 왜곡량 감소를 위한 비선형 음성강조법)

Choi, Jae-Seung
The Journal of the Korea institute of electronic communication sciences / v.16 no.3 / pp.465-470 / 2021
Robust speech recognition technology is required that degrades neither recognition performance nor speech quality when recognition is performed in a real environment where speech is mixed with noise. As such technology develops, applications are needed that achieve a stable, high speech recognition rate even in noisy environments similar to the human speech spectrum. Therefore, this paper proposes a speech enhancement algorithm that performs noise suppression based on the MMSA-STSA estimation algorithm, a short-time spectral amplitude method based on the least mean square error. This is an effective nonlinear speech enhancement algorithm that takes a single-channel input and has high noise suppression performance. Moreover, it reduces the amount of speech distortion based on a statistical model of speech. In the experiment, the effectiveness of the proposed algorithm is verified by comparing the input and output speech waveforms.
https://doi.org/10.13067/JKIECS.2021.16.3.465

Performance comparison evaluation of real and complex networks for deep neural network-based speech enhancement in the frequency domain (주파수 영역 심층 신경망 기반 음성 향상을 위한 실수 네트워크와 복소 네트워크 성능 비교 평가)

Hwang, Seo-Rim;Park, Sung Wook;Park, Youngcheol
The Journal of the Acoustical Society of Korea / v.41 no.1 / pp.30-37 / 2022
This paper compares and evaluates model performance from two perspectives according to the learning target and network structure for training Deep Neural Network (DNN)-based speech enhancement models in the frequency domain. In this case, spectrum mapping and Time-Frequency (T-F) masking techniques were used as learning targets, and a real network and a complex network were used for the network structure. The performance of the speech enhancement model was evaluated through two objective evaluation metrics: Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI) depending on the scale of the dataset. Test results show the appropriate size of the training data differs depending on the type of networks and the type of dataset. In addition, they show that, in some cases, using a real network may be a more realistic solution if the number of total parameters is considered because the real network shows relatively higher performance than the complex network depending on the size of the data and the learning target.
https://doi.org/10.7776/ASK.2022.41.1.030
