• Title/Summary/Keyword: 음원 분리 (sound source separation)

A Perception Based Active Matrix Decoder with Virtual Source Location Information (가상 음원 위치 정보를 이용한 능동 메트릭스 디코더)

  • Moon, Han-Gil
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.5 / pp.18-24 / 2010
  • In this paper, a new matrix decoding system using vector-based Virtual Source Location Information (VSLI) is proposed as an alternative to the conventional Dolby Pro Logic II/IIx system for reconstructing multi-channel output signals from matrix-encoded two-channel signals, Lt/Rt. The new matrix decoding system is composed of a passive decoding part and an active part. The passive part produces crude multi-channel signals as linear combinations of the two encoded signals (Lt/Rt), and the active part enhances each channel according to the virtual source that emerges in each inter-channel region. Since the virtual sources are related to the perceptual sound images in the virtual sound field, the reconstructed multi-channel sound yields good dynamic perception and stable image localization. Moreover, good channel separation is maintained by a nonlinear trigonometric enhancing function.
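The passive part described above can be illustrated with a minimal sketch. The abstract does not give the decoding coefficients, so the sum/difference matrix and the -3 dB gain used here are assumptions borrowed from the basic passive surround matrix, and the function name `passive_decode` is hypothetical.

```python
import math


def passive_decode(lt, rt):
    """Derive crude L, R, C, S feeds from Lt/Rt by fixed linear combinations.

    Assumed matrix (not taken from the paper):
        L = Lt, R = Rt, C = (Lt + Rt) / sqrt(2), S = (Lt - Rt) / sqrt(2)
    """
    g = 1.0 / math.sqrt(2.0)  # assumed -3 dB gain for the sum and difference feeds
    left = list(lt)
    right = list(rt)
    center = [g * (l + r) for l, r in zip(lt, rt)]
    surround = [g * (l - r) for l, r in zip(lt, rt)]
    return left, right, center, surround
```

With this kind of fixed matrix, a signal that is identical in Lt and Rt cancels in the surround feed but still appears in the left and right feeds, which is why the output is only a crude starting point for an active enhancement stage.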

A Study on the Implementation of Realistic Sound Through Cross-Talk Cancellation (크로스토크 제거를 통한 입체 음향 구현에 관한 연구)

  • 김학진
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.2 / pp.99-108 / 2004
  • This thesis deals with a method for delivering more realistic sound by cancelling the cross-talk that is inherent in the 5.1-channel speaker system. The acoustical model for cross-talk cancellation is the free-field model, which minimizes distortion of the sound. I used Bark-scale sound quality compensation, which is based on psychoacoustics. For the surround channels, band-limited sound quality compensation is performed in the frequency domain. I also performed a sound quality assessment test on the traditional 2-channel stereo and 5.1-channel systems. This test was performed in a test chamber that satisfies the ITU-R specifications. I used the IACC (Inter-Aural Cross-Correlation) to determine the preferences of amateurs and golden-ear experts in order to assess the trans-aural filter. With the proposed method, I obtained a separation of more than 38 dB with the Dolby standard speaker array. In the subjective test with the experts, the diffusion score increased by 0.4 points compared with before.
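As a rough illustration of the IACC measure mentioned above, the sketch below takes the maximum absolute value of the normalized cross-correlation between the two ear signals over a limited lag range. The function name `iacc` and the `max_lag` parameter are assumptions for illustration; the lag range is commonly set to about 1 ms (48 samples at 48 kHz), and the abstract does not state which settings were used.

```python
import math


def iacc(left, right, max_lag):
    """Maximum |normalized cross-correlation| for lags in [-max_lag, max_lag] samples."""
    n = min(len(left), len(right))
    energy = math.sqrt(sum(x * x for x in left[:n]) * sum(y * y for y in right[:n]))
    if energy == 0.0:
        return 0.0  # at least one signal is silent, so no correlation is defined
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        best = max(best, abs(acc) / energy)
    return best
```

Values near 1 indicate nearly identical ear signals (a narrow, well-localized image), while lower values indicate more decorrelated, diffuse sound.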

A Study on the Sound Quality Improvement Using the Equal Compensation Filter in Bark-scale for the Cross-talk Cancellation (크로스토크 제거를 위한 바크스케일 등가 보상 필터를 이용한 음질 향상에 관한 연구)

  • Kim, Hack-Jin;Kim, Soon-Hyub
    • The KIPS Transactions:PartB / v.11B no.3 / pp.345-352 / 2004
  • This paper deals with a method for delivering more realistic sound by cancelling the cross-talk that is inherent in the 5.1-channel speaker system. The acoustical model for cross-talk cancellation is the free-field model, which minimizes distortion of the sound. I used Bark-scale sound quality compensation, which is based on psychoacoustics. For the surround channels, band-limited sound quality compensation is performed in the frequency domain. I also performed a sound quality assessment test on the traditional 2-channel stereo and 5.1-channel systems. This test was performed in a test chamber that satisfies the ITU-R specifications. I used the IACC (Inter-Aural Cross-Correlation) to determine the preferences of amateurs and golden-ear experts in order to assess the trans-aural filter. With the proposed method, I obtained a separation of more than 38 dB with the Dolby standard speaker array. In the subjective test with the experts, the diffusion score increased by 0.4 to 0.5 points compared with before.

Estimation of a source range using acoustic wavefront in bottom reflection environment (해저면 반사 환경에서 음파의 파면을 이용하는 음원의 거리 추정)

  • Joung-Soo Park;Jungyong Park;Su-Uk Son;Ho Seuk Bae
    • The Journal of the Acoustical Society of Korea / v.43 no.3 / pp.324-334 / 2024
  • Wavefront Curvature Ranging (WCR) is a method for estimating the source range from the wavefront curvature of acoustic waves. The conventional method uses trigonometry to estimate the source range, assuming a constant sound speed. Because of this assumption, a range error occurs in ocean environments where the bottom reflection is clearly separated. To reduce the range error, Matched Wavefront Curvature Ranging (MWCR) was proposed, which applies the sound speed structure of the ocean environment and Maximum Likelihood Estimation (MLE). In simulations of the proposed method, the range error was reduced. In the future, this method will be applicable to sonar systems if the reliability of ranging is confirmed with measured signals.
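The conventional constant-sound-speed approach can be sketched for one simple geometry. The abstract does not describe the array, so the three equally spaced collinear sensors, the function name `wcr_range`, and the default sound speed of 1500 m/s are all assumptions. With sensors at -d, 0, and +d and a source at range R from the center sensor, the path differences a = r1 - R and b = r3 - R satisfy (R + a)^2 + (R + b)^2 = 2R^2 + 2d^2, which gives R = (2d^2 - a^2 - b^2) / (2(a + b)).

```python
import math


def wcr_range(tau_1, tau_3, spacing, sound_speed=1500.0):
    """Range from the center sensor of a 3-element line array (constant sound speed).

    tau_1, tau_3: arrival-time delays (s) at the outer sensors relative to the
                  center sensor (positive when the outer sensor hears it later).
    spacing:      distance d (m) between adjacent sensors.
    """
    a = sound_speed * tau_1  # path difference r1 - R
    b = sound_speed * tau_3  # path difference r3 - R
    curvature_term = a + b   # tends to zero for a plane wave (source at infinity)
    if curvature_term == 0.0:
        return math.inf
    return (2.0 * spacing ** 2 - a * a - b * b) / (2.0 * curvature_term)
```

Because the range depends on the small quantity a + b, any bias in the delays, such as one caused by treating a depth-dependent sound speed as constant, translates into a range error, which is the limitation the matched approach in the abstract is meant to address.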

Vocal Separation Using Selective Frequency Subtraction Considering with Energies and Phases (에너지와 위상을 고려한 선택적 주파수 차감법을 이용한 보컬 분리)

  • Kim, Hyuntae;Park, Jangsik
    • Journal of Broadcast Engineering / v.20 no.3 / pp.408-413 / 2015
  • Recently, with increasing interest in original-sound karaoke machines, MIDI-type karaoke manufacturers have been seeking a cheaper method to replace the original recording method. The specific approach is to create the original-sound accompaniment by removing only the singer's voice from the singer's music album. In this paper, a system for separating vocal components from the music accompaniment in stereo recordings is proposed. The proposed system consists of two stages. The first stage is vocal detection, which classifies the input into vocal and non-vocal portions using an SVM with MFCC features. In the second stage, selective frequency subtraction is performed at each frequency bin in the vocal portions. The subtraction decision takes into account not only the energy of each frequency bin but also its phase in each channel signal. Listening tests on the vocal-removed music produced by the proposed system show a relatively high level of satisfaction.

Robust Blind Source Separation to Noisy Environment For Speech Recognition in Car (차량용 음성인식을 위한 주변잡음에 강건한 브라인드 음원분리)

  • Kim, Hyun-Tae;Park, Jang-Sik
    • The Journal of the Korea Contents Association / v.6 no.12 / pp.89-95 / 2006
  • The performance of blind source separation (BSS) using independent component analysis (ICA) declines significantly in a reverberant environment. The post-processing method proposed in this paper is designed to remove the residual component precisely. The proposed method uses a modified NLMS (normalized least mean square) filter in the frequency domain to estimate the cross-talk path that causes residual cross-talk components. Residual cross-talk components in one channel correspond to direct components in the other channel. Therefore, the cross-talk path can be estimated by using the other channel's signal as the input to the adaptive filter. In the conventional NLMS filter the step size is normalized by the input signal power, whereas in the modified NLMS filter it is normalized by the sum of the input signal power and the error signal power. This prevents misadjustment of the filter weights. The estimated residual cross-talk components are subtracted by non-stationary spectral subtraction. Computer simulation results using speech signals show that the proposed method improves the noise reduction ratio (NRR) by approximately 3 dB over conventional FDICA.
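The difference between the two step-size normalizations can be sketched as follows. This is a time-domain illustration only, whereas the paper works in the frequency domain; the function names, the use of the instantaneous squared error as the "error signal power", and the default `mu` and `eps` values are assumptions.

```python
import random


def white_noise(length, seed=0):
    """Reproducible uniform white noise, used here only as a demo input."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(length)]


def fir(x, h):
    """Filter x with FIR taps h (stands in for an unknown cross-talk path)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def modified_nlms(x, d, num_taps, mu=0.5, eps=1e-8):
    """Adapt w so that filtering x approximates d; returns (weights, errors)."""
    w = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        # Most recent num_taps input samples, newest first, zero-padded at the start.
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        e = d[n] - sum(wk * uk for wk, uk in zip(w, u))
        input_power = sum(uk * uk for uk in u)
        # Conventional NLMS divides by input_power only; the modified form
        # also adds the error power, which shrinks the step when e is large.
        norm = input_power + e * e + eps
        w = [wk + mu * e * uk / norm for wk, uk in zip(w, u)]
        errors.append(e)
    return w, errors
```

For example, with `x = white_noise(2000)` and `d = fir(x, [0.5, -0.3, 0.2, 0.1])`, calling `modified_nlms(x, d, 4)` should drive the weights toward the four path coefficients.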

Music Transcription Using Non-Negative Matrix Factorization (비음수 행렬 분해 (NMF)를 이용한 악보 전사)

  • Park, Sang-Ha;Lee, Seok-Jin;Sung, Koeng-Mo
    • The Journal of the Acoustical Society of Korea / v.29 no.2 / pp.102-110 / 2010
  • Music transcription is the process of extracting pitch (the height of a musical note) and rhythm (the length of a musical note) information from an audio file and producing a music score. In this paper, we decompose a waveform into frequency and rhythm components using Non-Negative Matrix Factorization (NMF) and Non-Negative Sparse Coding (NNSC), which are often used for source separation and data clustering. The fundamental frequency is then calculated from the decomposed frequency components using the subharmonic summation method, so that the pitch in each score can be estimated accurately. The proposed method successfully performed music transcription, with results superior to those of conventional methods that use either NMF or NNSC alone.
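To make the NMF decomposition concrete, here is a minimal sketch using the standard multiplicative update rules for the squared-error cost. It is not the paper's combined NMF/NNSC method; the function names, the random initialization, and the iteration count are assumptions. For a magnitude spectrogram V (frequency by time), the columns of W play the role of the frequency components and the rows of H the role of their activations over time.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-12):
    """Factor nonnegative V into W (rows x rank) and H (rank x cols)."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(rows)]
    h = [[rng.uniform(0.1, 1.0) for _ in range(cols)] for _ in range(rank)]
    errors = [frobenius_error(v, w, h)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[r][j] * num[r][j] / (den[r][j] + eps) for j in range(cols)]
             for r in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][r] * num[i][r] / (den[i][r] + eps) for r in range(rank)]
             for i in range(rows)]
        errors.append(frobenius_error(v, w, h))
    return w, h, errors
```

Because the updates only multiply by nonnegative ratios, W and H stay nonnegative throughout, which is what lets each component be read as a spectrum and its activation.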

Inverse Estimation of Geoacoustic Parameters in Shallow Water Using Light Bulb Sound Source (천해환경에서 전구음원을 이용한 지음향인자의 역추정)

  • 한주영;이성욱;나정열;김성일
    • The Journal of the Acoustical Society of Korea / v.23 no.1 / pp.8-16 / 2004
  • An inversion method is presented for determining the compressional wave speed, compressional wave attenuation, sediment layer thickness, and density as a function of depth for a horizontally stratified ocean bottom. An experiment for estimating these properties was conducted in the shallow water of the South Sea of Korea. In the experiment, a light bulb was imploded and the propagating sound was measured using a VLA (vertical line array). To estimate the geoacoustic properties, coherent broadband matched field processing combined with a genetic algorithm was employed. When a time-dependent signal is very short, the Fourier transform results are not accurate, since the frequency components cannot be located in time and the windowed Fourier transform is limited by the length of the window. However, this is possible with the wavelet transform, which yields a time-frequency representation of a signal. In this study, the wavelet transform is used to identify and extract the acoustic components from the multipath time series. The inversion is formulated as an optimization problem that maximizes a cost function defined as the normalized correlation between the measured and modeled signals in the wavelet transform coefficient vector. The experiments and procedures for deploying the light bulbs and the coherent broadband inversion method are described, and the estimated geoacoustic profile in the vicinity of the VLA site is presented.
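The cost function described above can be sketched as a normalized correlation between two coefficient vectors. The abstract does not give the exact definition, so treating the coefficients as real-valued and using the plain normalized inner product are assumptions, as is the function name.

```python
import math


def normalized_correlation(measured, modeled):
    """Normalized inner product of two real coefficient vectors, in [-1, 1]."""
    dot = sum(a * b for a, b in zip(measured, modeled))
    norm = math.sqrt(sum(a * a for a in measured) * sum(b * b for b in modeled))
    if norm == 0.0:
        return 0.0  # one vector is all zeros, so there is nothing to correlate
    return dot / norm
```

The value is insensitive to overall amplitude scaling, so a search procedure such as a genetic algorithm can compare candidate bottom models by signal shape without knowing the exact source level.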

DNN based Speech Detection for the Media Audio (미디어 오디오에서의 DNN 기반 음성 검출)

  • Jang, Inseon;Ahn, ChungHyun;Seo, Jeongil;Jang, Younseon
    • Journal of Broadcast Engineering / v.22 no.5 / pp.632-642 / 2017
  • In this paper, we propose a DNN-based speech detection system that uses the acoustic characteristics and context information of media audio. Speech detection, which discriminates between the speech and non-speech contained in media audio, is a necessary preprocessing technique for effective speech processing. However, since media audio signals include various types of sound sources, it has been difficult to achieve high performance with conventional signal processing techniques. The proposed method improves speech detection performance by separating the harmonic and percussive components of the media audio and constructing a DNN input vector that reflects the acoustic characteristics and context information of the media audio. To verify the performance of the proposed system, a data set for speech detection was built from more than 20 hours of drama, and a publicly available 8-hour Hollywood movie data set was additionally acquired and used for the experiments. Cross-validation on the two data sets shows that the proposed system performs better than the conventional method.

Online Monaural Ambient Sound Extraction based on Nonnegative Matrix Factorization Method for Audio Contents (오디오 컨텐츠를 위한 비음수 행렬 분해 기법 기반의 실시간 단일채널 배경 잡음 추출 기법)

  • Lee, Seokjin
    • Journal of Broadcast Engineering / v.19 no.6 / pp.819-825 / 2014
  • In this paper, a monaural ambient component extraction algorithm based on nonnegative matrix factorization (NMF) is described. The algorithm is developed for audio upmixing systems; recent research has shown that listener envelopment can be enhanced if the extracted ambient signal is applied to a multichannel audio upmixing system. However, the conventional method stores the entire audio signal and processes it all at once, so it cannot be applied to streaming systems or digital signal processor (DSP) systems. To solve this problem, an ambient component extraction algorithm based on online nonnegative matrix factorization is developed and evaluated. Analysis of the processed signals with spectral flatness measures showed that the developed system can extract the ambient signal with results similar to those of the conventional batch-processing system.
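The spectral flatness measure used in the evaluation above is commonly defined as the ratio of the geometric mean to the arithmetic mean of the power spectrum. The sketch below follows that common definition; the function name and the small `floor` that guards against taking the logarithm of zero are assumptions, and the abstract does not say how the measure was computed in the paper.

```python
import math


def spectral_flatness(power_spectrum, floor=1e-12):
    """Geometric mean divided by arithmetic mean of a power spectrum (0 to 1)."""
    p = [max(x, floor) for x in power_spectrum]  # avoid log(0)
    log_mean = sum(math.log(x) for x in p) / len(p)
    arithmetic_mean = sum(p) / len(p)
    return math.exp(log_mean) / arithmetic_mean
```

Values close to 1 indicate a flat, noise-like spectrum, as expected of diffuse ambience, while values close to 0 indicate a spectrum dominated by a few strong peaks, as with tonal foreground sources.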