• Title/Summary/Keyword: Spectral enhancement

Artificial speech bandwidth extension technique based on opus codec using deep belief network (심층 신뢰 신경망을 이용한 오푸스 코덱 기반 인공 음성 대역 확장 기술)

  • Choi, Yoonsang;Li, Yaxing;Kang, Sangwon
    • The Journal of the Acoustical Society of Korea / v.36 no.1 / pp.70-77 / 2017
  • Bandwidth extension is a technique that improves speech quality, intelligibility, and naturalness by extending 300 to 3,400 Hz narrowband speech to 50 to 7,000 Hz wideband speech. In this paper, an Artificial Bandwidth Extension (ABE) module embedded in the Opus audio decoder is designed. It uses information from the narrowband speech to reduce the computational complexity of LPC (Linear Prediction Coding) and LSF (Line Spectral Frequencies) analysis, as well as the algorithmic delay of the ABE module. We propose a spectral envelope extension method using a DBN (Deep Belief Network), a deep learning technique, and the proposed scheme produces a better extended spectrum than the traditional codebook mapping method.

A Downlink Spectral Efficiency Improvement Scheme Using Intercell Cooperative Spatial Multiplexing and Beamforming (셀 간 협조적 공간 다중화 및 빔포밍을 이용한 하향링크 전송 효율 증대 방안)

  • Chang, Jae-Won;Jin, Gwy-Un;Sung, Won-Jin
    • Journal of the Institute of Electronics Engineers of Korea TC / v.45 no.7 / pp.45-52 / 2008
  • In typical cellular systems that use frequency reuse, a terminal suffers performance degradation from the intercell interference of adjacent cells as it moves toward the cell boundary. In this paper, we propose a signal transmission and reception scheme for improving spectral efficiency in a multi-cell downlink environment in which geographically separated base stations transmit cooperatively. The scheme achieves spatial multiplexing and beamforming gains from a distributed MIMO (multiple-input multiple-output) channel using a multiple-antenna terminal. In particular, we analyze the effective signal-to-interference ratio and spectral efficiency of the proposed scheme for different frequency reuse patterns and varying numbers of receive antennas, and compare them with the performance of MRC (maximal ratio combining) reception in typical cellular systems. We evaluate the gain in transmission efficiency by comparing performance near the cell boundary, where strong intercell interference is experienced.

The effects of clouds on enhancing surface solar irradiance (구름에 의한 지표 일사량의 증가)

  • Jung, Yeonjin;Cho, Hi Ku;Kim, Jhoon;Kim, Young Joon;Kim, Yun Mi
    • Atmosphere / v.21 no.2 / pp.131-142 / 2011
  • Spectral solar irradiances were observed using a visible and UV Multi-Filter Rotating Shadowband Radiometer on the rooftop of the Science Building at Yonsei University, Seoul ($37.57^{\circ}N$, $126.98^{\circ}E$, 86 m) over a one-year period in 2006. One-minute measurements of global (total) and diffuse solar irradiance at solar zenith angles (SZA) from $20^{\circ}$ to $70^{\circ}$ were used to examine the effects of clouds and total optical depth (TOD) on enhancing four solar irradiance components (broadband 395-955 nm, UV channel 304.5 nm, visible channel 495.2 nm, and infrared channel 869.2 nm), together with sky camera images for assessing cloud conditions at the time of each measurement. The clear-sky irradiance measurements were used to build empirical models of clear-sky irradiance with the cosine of the SZA as the independent variable. These models produce continuous estimates of global and diffuse solar irradiance for clear sky. The clear-sky estimates are then used to quantify the effects of clouds and TOD on the enhancement of surface solar irradiance as the difference between the measured and the estimated clear-sky values. Enhancements were found to occur at TODs less than 1.0 (i.e., transmissivity greater than 37%) when the solar disk was unobscured or obscured only by optically thin clouds. Even when the TOD is less than 1.0, the probability of enhancement ranges from 50 to 65% across the four solar radiation components, with the low value for UV irradiance. Cumulus-type clouds such as stratocumulus and altocumulus were found to produce localized enhancement of broadband global solar irradiance of up to 36.0% at a TOD of 0.43 under overcast skies (cloud cover 90%) when the direct solar beam passed unobstructed through the broken clouds. However, the same cloud types were found to attenuate up to 80% of the incoming global solar irradiance at a TOD of about 7.0. The maximum global UV enhancement was only 3.8%, much lower than that of the other three components, because of the light-scattering efficiency of cloud drops. Most of the enhancements occurred under cloud cover from 40 to 90%. Broadband global enhancement greater than 20% occurred for SZAs ranging from $28^{\circ}$ to $62^{\circ}$. Clouds increased the broadband diffuse irradiance by up to 467.8% (TOD 0.34). For the 869.0 nm channel, the maximum diffuse enhancement was 609.5%. Thus, irradiance must be measured under various cloud conditions in order to obtain climatological values, to trace the differences among cloud types, and eventually to estimate the influence of cloud characteristics on solar irradiance.
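The two quantities this abstract relies on can be illustrated with a short sketch. It assumes that transmissivity follows exp(-TOD), which is consistent with the abstract's pairing of TOD 1.0 with 37%, and that the enhancement percentages are expressed relative to the modeled clear-sky value. The function names and the sample irradiances are hypothetical and do not come from the paper.

```python
import math


def transmissivity(tod: float) -> float:
    """Transmissivity for a total optical depth, assuming exp(-TOD)."""
    return math.exp(-tod)


def enhancement_percent(measured: float, clear_sky: float) -> float:
    """Difference between measured and modeled clear-sky irradiance,
    expressed as a percentage of the clear-sky value (an assumption)."""
    return 100.0 * (measured - clear_sky) / clear_sky


# TOD = 1.0 corresponds to roughly 37% transmissivity.
print(round(100 * transmissivity(1.0)))  # 37

# Hypothetical irradiances (W/m^2): measured 680 against clear-sky 500.
print(enhancement_percent(680.0, 500.0))  # 36.0
```

A positive result indicates enhancement by clouds, and a negative result indicates attenuation.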

A study on deep neural speech enhancement in drone noise environment (드론 소음 환경에서 심층 신경망 기반 음성 향상 기법 적용에 관한 연구)

  • Kim, Jimin;Jung, Jaehee;Yeo, Chaneun;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.41 no.3 / pp.342-350 / 2022
  • In this paper, actual drone noise samples are collected to build a noise-corrupted speech database for speech processing in disaster environments, and speech enhancement performance is evaluated by applying spectral subtraction and mask-based speech enhancement techniques. To improve the performance of VoiceFilter (VF), an existing deep neural network-based speech enhancement model, we apply a self-attention operation and use the estimated noise information as input to the attention model. Compared with the existing VF model, the experimental results show improvements of 3.77%, 1.66%, and 0.32% in Source-to-Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI), respectively. When the model is trained with a 75% mix of speech data with drone sounds collected from the Internet, the relative performance drops for SDR, PESQ, and STOI are 3.18%, 2.79%, and 0.96%, respectively, compared with using only actual drone noise. This confirms that data similar to real data can be collected and used effectively to train speech enhancement models for environments where real data is difficult to obtain.
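Spectral subtraction, one of the baseline techniques named in this abstract, can be sketched on per-bin magnitude spectra as follows. This is a generic textbook form and not the paper's implementation. The over-subtraction factor `alpha`, the spectral floor `floor`, and the sample values are assumptions chosen for illustration.

```python
def spectral_subtraction(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Subtract a noise magnitude estimate from each frequency bin.

    Bins that would fall below floor * noisy magnitude are clamped to
    that floor so that no magnitude becomes negative.
    """
    return [max(y - alpha * n, floor * y) for y, n in zip(noisy_mag, noise_mag)]


# Hypothetical magnitude spectrum of one frame and a flat noise estimate.
noisy = [1.0, 0.5, 0.2]
noise = [0.25, 0.25, 0.25]
print(spectral_subtraction(noisy, noise))
```

In a full system the enhanced magnitudes would be recombined with the noisy phase and inverted back to the time domain. That step is omitted here.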

Speech Enhancement System Using a Model of Auditory Mechanism (청각기강의 모델을 이용한 음성강조 시스템)

  • 최재승
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.6 / pp.295-302 / 2004
  • In the field of speech processing, the treatment of noise remains an important problem. In particular, background noise is known to cause a marked reduction in the speech recognition rate. Examples of such background noise include various non-stationary noises found in real environments, such as the driving noise of automobiles on the road or the typing noise of a printer. These kinds of noise cannot be eliminated simply with a conventional Wiener filter and require more sophisticated techniques. In this paper, as one such attempt, we present a speech enhancement algorithm that uses a model of mutual inhibition to reduce noise in speech contaminated by white noise or by the background noise mentioned above. Judging from spectral distortion measurements, the proposed algorithm is confirmed to be effective for speech degraded not only by white noise but also by colored noise.

An Implementation of an ARM Platform based MP3 Sound Enhancement System (ARM 플랫폼 기반의 MP3 오디오 음질 향상 시스템 구현)

  • Oh, Sang-Hun;Park, Kyu-Sik
    • Journal of the Institute of Electronics Engineers of Korea SP / v.44 no.1 / pp.70-75 / 2007
  • To reduce the storage space and network bandwidth demands of full CD-quality audio with a 44.1 kHz sampling rate, existing digital audio is typically restricted in sampling rate and bandwidth. This restriction is normally handled by using a low-bit-rate audio codec such as MP3, OGG, or AAC. However, such codecs suffer from a major problem: loss of high-frequency fidelity. Because of this loss, only the band-limited low-frequency part of the standard CD-quality audio is reproduced. In general, the high-frequency content of audio carries much information, such as localization and ambience cues and the brightness of the sound. The purpose of this paper is to implement, on an ARM platform, a system that can effectively estimate and compensate for the missing high-frequency content of MP3 audio. Experimental results from spectrum analysis and listening tests confirm the superiority of the proposed algorithms for MP3 audio quality enhancement.

Iodine Quantification on Spectral Detector-Based Dual-Energy CT Enterography: Correlation with Crohn's Disease Activity Index and External Validation

  • Kim, Yeon Soo;Kim, Se Hyung;Ryu, Hwa Sung;Han, Joon Koo
    • Korean Journal of Radiology / v.19 no.6 / pp.1077-1088 / 2018
  • Objective: To correlate CT parameters on detector-based dual-energy CT enterography (DECTE) with Crohn's disease activity index (CDAI) and externally validate quantitative CT parameters. Materials and Methods: Thirty-nine patients with CD were retrospectively enrolled. Two radiologists reviewed DECTE images by consensus for qualitative and quantitative CT features. CT attenuation and iodine concentration for the diseased bowel were also measured. Univariate statistical tests were used to evaluate whether there was a significant difference in CTE features between remission and active groups, on the basis of the CDAI score. Pearson's correlation test and multiple linear regression analyses were used to assess the correlation between quantitative CT parameters and CDAI. For external validation, an additional 33 consecutive patients were recruited. The correlation and concordance rate were calculated between real and estimated CDAI. Results: There were significant differences between remission and active groups in the bowel enhancement pattern, subjective degree of enhancement, mesenteric fat infiltration, comb sign, and obstruction (p < 0.05). Significant correlations were found between CDAI and quantitative CT parameters, including number of lesions (correlation coefficient, r = 0.573), bowel wall thickness (r = 0.477), iodine concentration (r = 0.744), and relative degree of enhancement (r = 0.541; p < 0.05). Iodine concentration remained the sole independent variable associated with CDAI in multivariate analysis (p = 0.001). The linear regression equation for CDAI (y) and iodine concentration (x) was y = 53.549x + 55.111. For validation patients, a significant correlation (r = 0.925; p < 0.001) and high concordance rate (87.9%, 29/33) were observed between real and estimated CDAIs. Conclusion: Iodine concentration, measured on detector-based DECTE, represents a convenient and reproducible biomarker to monitor disease activity in CD.
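The regression equation and the concordance rate reported in this abstract can be reproduced arithmetically with a short sketch. The function names and the sample iodine concentration are hypothetical, and the abstract does not state the unit of iodine concentration, so none is assumed here.

```python
def estimate_cdai(iodine_concentration: float) -> float:
    """Estimated CDAI from the reported fit y = 53.549x + 55.111."""
    return 53.549 * iodine_concentration + 55.111


def concordance_rate(concordant: int, total: int) -> float:
    """Share of patients whose real and estimated CDAI agree, in percent."""
    return 100.0 * concordant / total


# Hypothetical iodine concentration of 2.0 (unit not given in the abstract).
print(round(estimate_cdai(2.0), 3))

# Validation cohort from the abstract: 29 concordant out of 33 patients.
print(round(concordance_rate(29, 33), 1))  # 87.9
```

This is only a numerical illustration of the published equation and is not a clinical tool.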

Stereo Sound Image Expansion Using Phase Difference and Sound Pressure Level Difference in Television (위상차와 음압 레벨차를 이용한 텔레비전에서의 스테레오 음상 확대)

  • 박해광;오제화
    • Proceedings of the IEEK Conference / 1998.10a / pp.1243-1246 / 1998
  • Three-dimensional (3-D) sound is a technique for generating or recreating sounds so that they are perceived as emanating from locations in three-dimensional space. It has the potential to increase the feeling of realism in music or movie soundtracks. Three-dimensional sound effects depend on psychoacoustic spectral and phase cues being present in the reproduced signal. In this paper, we propose an effective algorithm for sound image expansion in television systems using stereo image enhancement techniques. Compared with other three-dimensional sound techniques, the proposed algorithm uses only two speakers to expand the sound image while maintaining the original sound characteristics.

Low-band Extension of CELP Speech Coder by Recovery of Harmonics (고조파 복원에 의한 CELP 음성 부호화기의 저대역 확장)

  • Park Jin Soo;Choi Mu Yeol;Kim Hyung Soon
    • MALSORI / no.49 / pp.63-75 / 2004
  • Most telephone speech transmitted over current public networks is band-limited to 0.3-3.4 kHz. Compared with wideband speech (0-8 kHz), narrowband speech lacks the low-band (0-0.3 kHz) and high-band (3.4-8 kHz) components of sound. As a result, the speech has reduced intelligibility and a muffled quality, and speaker identification is degraded. Bandwidth extension is a technique for providing wideband speech quality by reconstructing the low-band and high-band components without any additional transmitted information. Our new approach uses a harmonic synthesis method to reconstruct the low-band speech from CELP-coded speech. A spectral distortion measure and a listening test are used to assess the proposed method, and they verify an improvement in the quality of the synthesized speech.

Beamforming Optimization Using Filterbank-based Frost Algorithm (필터뱅크 기반 프로스트 알고리즘을 이용한 빔포밍 최적화)

  • Park, Ji-Hoon;Lee, Sung-Joo;Hong, Jeong-Pyo;Jeong, Sang-Bae;Hahn, Min-Soo
    • MALSORI / no.66 / pp.73-86 / 2008
  • Beamforming is a spatial filtering technique that uses microphone arrays to extract only the desired signals from noisy environments. Fixed beamforming is simple in concept and easy to implement, but it does not perform well in real noisy conditions. Among adaptive beamformers, the Frost algorithm is a good candidate. It is based on the linearly constrained minimum variance (LCMV) algorithm. The difference between the Frost and LCMV algorithms is the error correction scheme, which is a very effective feature in terms of performance. In this paper, a quadrature mirror filter (QMF)-based filterbank is used as pre-processing for Frost beamforming, and the filter length and learning rate of each band are optimized to improve performance. Performance is measured by the signal-to-noise ratio (SNR) and the Bark spectral distortion (BSD).
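The constrained adaptive update that the Frost algorithm is known for can be sketched for the simplest case of a single linear constraint c^T w = g. This is a generic illustration and not the filterbank-based method of the paper. The function name, the step size `mu`, and the sample values are assumptions. Each step re-imposes the constraint on the updated weights, so numerical drift away from the constraint is corrected rather than accumulated.

```python
def frost_update(w, x, c, g, mu):
    """One constrained-LMS step for a single linear constraint c^T w = g."""
    # Beamformer output for the current input snapshot.
    y = sum(wi * xi for wi, xi in zip(w, x))
    # Unconstrained gradient step that reduces output power.
    v = [wi - mu * y * xi for wi, xi in zip(w, x)]
    # Project back onto the constraint plane, which also corrects
    # any error already present in w.
    cc = sum(ci * ci for ci in c)
    err = (g - sum(ci * vi for ci, vi in zip(c, v))) / cc
    return [vi + err * ci for vi, ci in zip(v, c)]


# Hypothetical 4-tap example with the constraint that the weights sum to 1.
c = [1.0, 1.0, 1.0, 1.0]
w = [0.3, 0.3, 0.3, 0.3]   # deliberately violates the constraint (sum 1.2)
x = [0.3, -0.1, 0.4, 0.2]  # one input snapshot
w = frost_update(w, x, c, g=1.0, mu=0.1)
print(abs(sum(w) - 1.0) < 1e-12)  # True
```

In a subband setting like the one the abstract describes, an update of this kind would run independently in each band with its own filter length and learning rate.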
