Search | Korea Science

Robust Speech Enhancement Based on Soft Decision Employing Spectral Deviation (스펙트럼 변이를 이용한 Soft Decision 기반의 음성향상 기법)

Choi, Jae-Hun;Chang, Joon-Hyuk;Kim, Nam-Soo
- Journal of the Institute of Electronics Engineers of Korea SP
- /
- v.47 no.5
- /
- pp.222-228
- /
- 2010
In this paper, we propose a new approach to noise estimation that incorporates spectral deviation into a soft decision scheme to enhance the intelligibility of degraded speech in non-stationary noisy environments. Because the conventional soft-decision noise estimation technique estimates and updates the noise power spectrum using a fixed smoothing parameter chosen under the assumption of stationary noise, it is difficult to obtain robust estimates of the noise power spectrum in non-stationary environments, such as a restaurant, where the spectral characteristics of the noise change constantly. We first classify the environment as stationary or non-stationary based on an analysis of the spectral deviation of the noise signal, and then adaptively estimate and update the noise power spectrum according to the classified noise type. The performance of the proposed algorithm is evaluated with the ITU-T P.862 perceptual evaluation of speech quality (PESQ) under various ambient noise environments and is better than that of the conventional method.
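The update described in this abstract can be illustrated with a minimal sketch. It assumes a generic soft-decision recursion in which the noisy periodogram is weighted by a speech-absence probability, and a smoothing factor chosen from a scalar spectral-deviation measure. The function names, the threshold, and the two smoothing values are hypothetical and are not taken from the paper.

```python
def choose_smoothing(spectral_deviation, threshold=1.0,
                     stationary_alpha=0.98, nonstationary_alpha=0.90):
    """Pick a smoothing factor from a spectral-deviation measure.

    A larger deviation is treated as non-stationary noise and gets a
    smaller alpha, so the estimate tracks changes faster.
    All numeric values are illustrative assumptions.
    """
    if spectral_deviation > threshold:
        return nonstationary_alpha
    return stationary_alpha


def update_noise_psd(noise_psd, noisy_psd, speech_absence_prob, alpha):
    """One recursive soft-decision update of the noise power per bin."""
    updated = []
    for lam, y2, p0 in zip(noise_psd, noisy_psd, speech_absence_prob):
        # Expected noise power given the observation: trust the noisy
        # periodogram when speech is likely absent, otherwise keep the
        # previous estimate.
        expected_noise = p0 * y2 + (1.0 - p0) * lam
        updated.append(alpha * lam + (1.0 - alpha) * expected_noise)
    return updated
```

With a fixed alpha this reduces to the conventional scheme; switching alpha by noise class is one simple way to make the tracking speed adaptive.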

ULTRAVIOLET COLOR-COLOR RELATION OF EARLY-TYPE GALAXIES AT 0.05 < z < 0.12

Lee, Chang-Hui;Jeong, Hyeon-Jin;O, Gyu-Seok;Jeong, Cheol;Lee, Jun-Hyeop;Kim, Sang-Cheol;Gyeong, Jae-Man
- The Bulletin of The Korean Astronomical Society
- /
- v.37 no.1
- /
- pp.48.1-48.1
- /
- 2012
We present the ultraviolet (UV) color-color relation of early-type galaxies (ETGs) in the nearby universe (0.05 < z < 0.12) to investigate the properties of the hot stellar populations responsible for the UV excess (UVX). The initial sample of ETGs is selected by spectroscopic redshift and the morphology parameter from SDSS DR7, and then cross-matched with the GALEX far-UV (FUV) and near-UV (NUV) GR6 data. The cross-matched ETG sample is further classified by emission line characteristics in the optical spectra into quiescent, star-forming, and active galactic nucleus categories. Contamination from early-type spiral galaxies, mergers, and morphologically disturbed galaxies is removed by visual inspection. By drawing the FUV-NUV (as a measure of UV spectral shape) versus FUV-r (as a measure of UVX strength) diagram for the final sample of ~3700 quiescent ETGs, we find that the "old and dead" ETGs form a well-defined sequence in UV colors, the "UV red sequence," such that galaxies with stronger UVX systematically have a harder UV spectral shape. However, the observed UV spectral slope is too steep to be reproduced by the canonical stellar population models in which the UV flux is mainly controlled by age or metallicity parameters. Moreover, color spreads of 2 mag in both FUV-NUV and FUV-r appear to be ubiquitous among all subsets in distance or luminosity. This implies that the UVX in ETGs could be driven by yet another parameter, which might be even more influential than age or metallicity.

Fast Spectral Inversion of the Strong Absorption Lines in the Solar Chromosphere Based on a Deep Learning Model

Lee, Kyoung-Sun;Chae, Jongchul;Park, Eunsu;Moon, Yong-Jae;Kwak, Hannah;Cho, Kyuhyun
- The Bulletin of The Korean Astronomical Society
- /
- v.46 no.2
- /
- pp.46.3-47
- /
- 2021
Recently, a multilayer spectral inversion (MLSI) model has been proposed to infer the physical parameters of plasmas in the solar chromosphere. The inversion solves a three-layer radiative transfer model using the strong absorption line profiles, H alpha and Ca II 8542 Å, taken by the Fast Imaging Solar Spectrograph (FISS). The model successfully provides the physical plasma parameters, such as source functions, Doppler velocities, and Doppler widths, in the layers from the photosphere to the chromosphere. However, it is quite expensive to apply the MLSI to a huge number of line profiles. For example, the computing time ranges from one hour to several hours depending on the size of the scan raster. We apply a deep neural network (DNN) to the inversion code to reduce the cost of calculating the physical parameters. We train the models using pairs of absorption line profiles from FISS and their 13 physical parameters (source functions, Doppler velocities, Doppler widths in the chromosphere, and the pre-determined parameters for the photosphere) calculated from the spectral inversion code for 49 scan rasters (~2,000,000 samples) including quiet and active regions. We use fully connected dense layers for the model. In addition, we use a skip connection to avoid the vanishing gradient problem. We evaluate the model by comparing the pairs of absorption line profiles and their inverted physical parameters from other quiet and active regions. Our results show that the deep learning model successfully reproduces physical parameter maps of a scan raster observation per second within a mean absolute percentage error of 15% and a mean squared error of 0.3 to 0.003, depending on the parameter. Taking advantage of the high performance of the deep learning model, we plan to provide physical parameter maps from the FISS observations to understand the chromospheric plasma conditions in various solar features.
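Two ingredients named in this abstract, a dense layer wrapped in a skip connection and the mean absolute percentage error, can be sketched in a few lines. This is a toy illustration in plain Python, not the authors' network: the layer sizes, weights, and function names are hypothetical.

```python
def dense(x, weights, bias):
    """Fully connected layer: one output per row of the weight matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def residual_block(x, weights, bias):
    """Dense layer with a skip connection: output = x + relu(Wx + b).

    The identity path lets gradients bypass the layer, which is the
    usual motivation for skip connections. The weight matrix must be
    square so that the input and output sizes match.
    """
    hidden = relu(dense(x, weights, bias))
    return [xi + hi for xi, hi in zip(x, hidden)]


def mape(true_values, predicted_values):
    """Mean absolute percentage error, in percent (true values nonzero)."""
    errors = [abs((t - p) / t) for t, p in zip(true_values, predicted_values)]
    return 100.0 * sum(errors) / len(errors)
```

For instance, `mape([100.0, 200.0], [110.0, 180.0])` averages two 10% errors.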

Adaptive Threshold for Speech Enhancement in Nonstationary Noisy Environments (비정상 잡음환경에서 음질향상을 위한 적응 임계 치 알고리즘)

Lee, Soo-Jeong;Kim, Sun-Hyob
- The Journal of the Acoustical Society of Korea
- /
- v.27 no.7
- /
- pp.386-393
- /
- 2008
This paper proposes a new approach to speech enhancement in highly nonstationary noisy environments. Spectral subtraction (SS) is a well-known technique for speech enhancement in stationary noisy environments. However, in the real world, noise is mostly nonstationary. The proposed method uses an auto control parameter for an adaptive threshold so that it works well in highly nonstationary noisy environments. Specifically, the auto control parameter is governed by a linear function of the a posteriori signal-to-noise ratio (SNR) that follows the increase or decrease of the noise level. The proposed algorithm is combined with spectral subtraction (SS) using a hangover scheme (HO) for speech enhancement. The performance of the proposed method is evaluated using the ITU-T P.835 signal distortion (SIG) measure and the segmental signal-to-noise ratio (SNR) in various highly nonstationary noisy environments, and it is superior to that of conventional spectral subtraction (SS) using a hangover (HO) and SS using minimum statistics (MS).
https://doi.org/10.7776/ASK.2008.27.7.386
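For readers unfamiliar with the baseline, the following is a minimal sketch of plain magnitude spectral subtraction with a spectral floor, plus the a posteriori SNR per bin. It is the generic textbook form, not the adaptive-threshold method of the paper. The parameter names `over_subtraction` and `floor` and their default values are assumptions.

```python
def a_posteriori_snr(noisy_power, noise_power):
    """Ratio of noisy-signal power to estimated noise power per bin."""
    return [y / n for y, n in zip(noisy_power, noise_power)]


def spectral_subtract(noisy_mag, noise_mag, over_subtraction=1.0, floor=0.01):
    """Subtract the noise magnitude estimate from each frequency bin.

    Bins that would fall below floor * noisy magnitude are clamped to
    that floor so the result never becomes negative.
    """
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        diff = y - over_subtraction * n
        cleaned.append(max(diff, floor * y))
    return cleaned
```

The enhanced magnitudes are normally recombined with the noisy phase before the inverse transform; that step is omitted here.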

Transcoding Algorithm for SMV and G.723.1 Vocoders via Direct Parameter Transformation (SMV와 G.723.1 음성부호화기를 위한 파라미터 직접 변환 방식의 상호부호화 알고리듬)

서성호;장달원;이선일;유창동
- Proceedings of the IEEK Conference
- /
- 2003.07e
- /
- pp.2228-2231
- /
- 2003
In this paper, a transcoding algorithm for the Selectable Mode Vocoder (SMV) and the G.723.1 speech coder via direct parameter transformation is proposed. In contrast to the conventional tandem transcoding algorithm, the proposed algorithm converts the parameters of one coder to the other without going through the decoding and encoding process. The proposed algorithm is composed of four parts: the parameter decoding, line spectral pair (LSP) conversion, pitch period conversion, excitation conversion and rate selection. The evaluation results show that the proposed algorithm achieves speech quality equivalent to that of tandem transcoding with reduced computational complexity and delay.

Speech Active Interval Detection Method in Noisy Speech (잡음음성에서의 음성 활성화 구간 검출 방법)

Lee, Kwang-Seok;Choo, Yeon-Gyu;Kim, Hyun-Deok
- Proceedings of the Korean Institute of Information and Communication Sciences Conference
- /
- 2008.10a
- /
- pp.779-782
- /
- 2008
It is important to detect speech-active intervals in noisy speech for speech communication and speech recognition. In this research, we propose a characteristic parameter that incorporates spectral entropy for detecting speech-active intervals in noisy speech, and compare its performance with that of energy-based detection. The results show that analysis using the proposed characteristic parameter performs better than the other methods in noisy environments.
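As background for the spectral entropy feature mentioned here, the sketch below computes the Shannon entropy of a normalized power spectrum for one frame. It shows only the generic definition. How the paper combines entropy with other features is not stated in the abstract, so no combination is attempted, and the function name is hypothetical.

```python
import math


def spectral_entropy(power_spectrum):
    """Shannon entropy (in nats) of one frame's normalized power spectrum.

    A flat, noise-like spectrum gives high entropy. A spectrum whose
    power is concentrated in a few bins gives low entropy.
    """
    total = sum(power_spectrum)
    if total <= 0.0:
        return 0.0  # silent frame: define the entropy as zero
    entropy = 0.0
    for power in power_spectrum:
        p = power / total
        if p > 0.0:
            entropy -= p * math.log(p)
    return entropy
```

A detector built on this feature would typically compare the per-frame entropy against a threshold estimated from frames assumed to contain only noise.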

Truncation Parameter Selection in Binary Choice Models (이항 선택 모형에서의 절단 모수 선택)

Kim, Kwang-Rae;Cho, Kyu-Dong;Koo, Ja-Yong
- Communications for Statistical Applications and Methods
- /
- v.17 no.6
- /
- pp.811-827
- /
- 2010
This paper deals with a density estimation method in binary choice models that can be regarded as a statistical inverse problem. We use an orthogonal basis to estimate density function and consider the choice of an appropriate truncation parameter to reflect the model complexity and the prediction accuracy. We propose a data-dependent rule to choose the truncation parameter in the context of binary choice models. A numerical simulation is provided to illustrate the performance of the proposed method.
https://doi.org/10.5351/CKSS.2010.17.6.811

Spectrum Requirements for the Future Development of IMT-2000 and Systems beyond IMT-2000 (4세대 이동통신 서비스 주파수 소요량에 관한 연구)

Chung Woo-Ghee;Yoon Hyun-Goo;Lim Jae-Woo;Yook Jong-Gwan;Park Han-Kyu
- The Journal of Korean Institute of Electromagnetic Engineering and Science
- /
- v.17 no.2 s.105
- /
- pp.110-116
- /
- 2006
In this paper, an algorithm implementing a methodology for calculating spectrum requirements is presented. In addition, the influence of the traffic distribution ratio among radio access technology groups, spectral efficiency, and the flexible spectrum usage (FSU) margin on the spectrum requirements is analyzed, with a view toward the future development of IMT-2000 and systems beyond IMT-2000. The ratio of the spectrum requirement to the traffic distribution ratio is approximately $1\;GHz/20\;\%$, and the spectrum requirement varies from 5 to 9 GHz. As the FSU margin increases by 1.0 dB, the total spectrum requirement decreases by 0.9 dB. The required spectrum for the market input parameter ${\rho}=0.5$ is 801.63 MHz, while the required spectrum for ${\rho}=1.0$ is 6295.4 MHz. It can be concluded that the market input parameter is the most influential parameter in the calculation of spectrum requirements.

Spectral & Aerodynamic Analysis of Cries in Infants with Cleft Lip and Palate. (구순구개열 환아의 crying에 대한 음향학적 및 공기역학적 분석)

Kim Eun-Ju;Ko Seung-O;Shin Hyo-Keun;Kim Hyun-Ki
- Korean Journal of Cleft Lip And Palate
- /
- v.5 no.2
- /
- pp.95-108
- /
- 2002
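As part of efforts to understand the early stages of language development, crying has been studied extensively in many disciplines as a basic stage of prelinguistic development. However, for infants with cleft lip and palate (CLP), there has been almost no research in this area because the cry-producing/control mechanism varies considerably. This study therefore analyzed the cry features of CLP infants with the following questions in mind. First, is there a typical difference between the cries of normal infants and CLP infants? Second, do the cry features of CLP infants change between before and after surgery? Third, does cry analysis have value as a prelinguistic assessment of later speech disorders in CLP infants? Fourth, can a specific parameter serve as an appropriate tool for prelinguistic assessment? Through aerodynamic and acoustic-phonetic analysis of the cries of 3 CLP infants under 15 months of age and 8 normal infants of similar age, we compared the cry characteristics of CLP infants with those of normal infants, and those of CLP infants before and after surgery. The results are as follows. 1. Aerodynamic analysis: 1) Airflow was slightly higher in CLP infants than in normal infants and increased slightly after surgery. 2) Volume, which reflects lung capacity, was compensatorily larger in preoperative CLP patients than in normal infants and increased slightly after surgery. 3) The intensity parameter (SPL) was slightly lower in preoperative CLP patients than in normal infants but tended to increase after surgery. 2. Acoustic-phonetic analysis: 1) In the fundamental frequency analysis, the values for preoperative CLP patients were slightly lower than those for normal infants but increased after surgery, approaching the values of the normal group. 2) In the measurement of energy, which reflects intensity, the preoperative CLP values were compensatorily slightly larger than those of normal infants and increased slightly further after surgery. 3) For shimmer, the postoperative values of CLP patients decreased markedly compared with the preoperative values, approaching those of the normal group.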

The Application of Quantitative Electroencephalography (Spectral Edge Frequency 95) to Evaluate Sedation in Dogs (개에서 진정 평가를 위한 정량적 뇌파검사의 적용)

Kim Min-Su;Nam Tchi-Chou
- Journal of Veterinary Clinics
- /
- v.23 no.1
- /
- pp.31-35
- /
- 2006
This study was performed to evaluate sedation with quantitative electroencephalography (EEG) analysis in dogs. EEG is used to objectively evaluate CNS effects associated with brain and behavioral changes. In particular, the spectral edge frequency 95 (SEF 95) parameter is an effective measure for determining sedative status. The SEF 95 is the frequency below which 95% of the total power lies. Twelve healthy intact male Miniature Schnauzer dogs, which did not show any neurological abnormalities or disease, were used for the study. EEG electrodes were inserted into the subcutaneous tissue over the calvaria without entering adjacent muscles. The EEG data were acquired and analyzed by EEG raw wave and spectral edge frequency 95 analysis. After the administration of sedatives, the SEF 95 values showed significant changes compared with the normal state in all groups (p<0.05). It is suggested that SEF 95 analysis is a useful method for assessing the state of sedation in dogs.
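The SEF 95 definition given in this abstract translates directly into a cumulative sum over a power spectrum. The sketch below assumes the spectrum is already available as two parallel lists, frequencies in ascending order and the power at each frequency. The function name and the input layout are hypothetical.

```python
def spectral_edge_frequency(freqs, power, fraction=0.95):
    """Return the lowest frequency at which the cumulative power
    reaches the given fraction of the total power.

    freqs must be in ascending order and aligned with power.
    """
    total = sum(power)
    if total <= 0.0:
        raise ValueError("power spectrum must contain positive power")
    cumulative = 0.0
    for f, p in zip(freqs, power):
        cumulative += p
        if cumulative >= fraction * total:
            return f
    return freqs[-1]  # guard against floating-point rounding
```

With a spectrum of `[50.0, 30.0, 18.0, 2.0]` at 2, 4, 6, and 8 Hz, the cumulative power first exceeds 95% of the total at 6 Hz, so the function returns 6.0 for those inputs.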
