• Title/Summary/Keyword: sound source separation (음원 분리)

Investigation of Timbre-related Music Feature Learning using Separated Vocal Signals (분리된 보컬을 활용한 음색기반 음악 특성 탐색 연구)

  • Lee, Seungjin
    • Journal of Broadcast Engineering
    • /
    • v.24 no.6
    • /
    • pp.1024-1034
    • /
    • 2019
  • Preference for music is determined by a variety of factors, and identifying characteristics that reflect specific factors is important for music recommendation. In this paper, we propose a method to extract singing-voice-related music features that reflect various musical characteristics by using a model trained for singer identification. The model can be trained on music sources that contain background accompaniment, but singer identification performance may then be degraded. To mitigate this problem, this study first separates the background accompaniment and creates a data set of separated vocals by using a proven model structure that appeared in the Signal Separation Evaluation Campaign (SiSEC). Finally, we use the separated vocals to discover singing-voice-related music features that reflect the singer's voice. We compare the effect of source separation against existing methods that use music sources without source separation.

Robust Primary-ambient Signal Decomposition Method using Principal Component Analysis with Phase Alignment (위상 정렬을 이용한 주성분 분석법의 강인한 스테레오 음원 분리 성능유지 기법)

  • Baek, Yong-Hyun;Hyun, Dong-Il;Park, Young-Cheol
    • Journal of Broadcast Engineering
    • /
    • v.19 no.1
    • /
    • pp.64-74
    • /
    • 2014
  • The primary and ambient signal decomposition of a stereo sound is a key step in stereo upmixing. Principal component analysis (PCA) is one of the most widely used methods for primary-ambient signal decomposition. However, previous PCA-based decomposition algorithms assume that stereo sound sources are only amplitude-panned, without any consideration of phase difference. As a result, performance degrades for live-recorded stereo sound. In this paper, we propose a new PCA-based stereo decomposition algorithm that takes into account the phase difference between the channel signals. The proposed algorithm overcomes the limitation of the conventional signal model by using PCA with phase alignment. The phase alignment is realized using the inter-channel phase difference (IPD), which is widely used in parametric stereo coding. Moreover, Enhanced Modified PCA (EMPCA) is combined to solve the problems of conventional PCA caused by its dependency on the primary-to-ambient energy ratio (PAR) and the panning angle. Simulation results are presented to show the improvements of the proposed algorithm.
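
As a rough illustration of the conventional baseline that this abstract starts from (plain time-domain PCA on an amplitude-panned stereo frame, not the proposed phase-aligned or EMPCA variant), the following sketch projects the two channels onto the principal axis of their 2x2 correlation matrix and treats the residual as ambience. The function name `pca_primary_ambient` and the test signal are assumptions made for this example, and the channels are assumed to be zero-mean.

```python
import math


def pca_primary_ambient(left, right):
    """Split a zero-mean stereo frame into primary and ambient parts with plain PCA."""
    # Zero-lag auto- and cross-correlations of the two channels.
    r_ll = sum(l * l for l in left)
    r_rr = sum(r * r for r in right)
    r_lr = sum(l * r for l, r in zip(left, right))

    # Orientation of the principal eigenvector of [[r_ll, r_lr], [r_lr, r_rr]].
    theta = 0.5 * math.atan2(2.0 * r_lr, r_ll - r_rr)
    c, s = math.cos(theta), math.sin(theta)

    primary_l, primary_r, ambient_l, ambient_r = [], [], [], []
    for l, r in zip(left, right):
        p = c * l + s * r            # projection onto the principal axis
        primary_l.append(c * p)      # primary part mapped back to each channel
        primary_r.append(s * p)
        ambient_l.append(l - c * p)  # the residual is treated as ambience
        ambient_r.append(r - s * p)
    return (primary_l, primary_r), (ambient_l, ambient_r)


# Hypothetical test frame: one source amplitude-panned with gains 0.6 and 0.8.
source = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
left = [0.6 * x for x in source]
right = [0.8 * x for x in source]
(primary_l, primary_r), (ambient_l, ambient_r) = pca_primary_ambient(left, right)
```

For this purely amplitude-panned frame, the whole signal lands in the primary part. If one channel were delayed relative to the other, the zero-lag cross-correlation would drop and part of the source would leak into the ambient estimate, which is the kind of degradation that phase alignment is meant to address.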

Online blind source separation and dereverberation of speech based on a joint diagonalizability constraint (공동 행렬대각화 조건 기반 온라인 음원 신호 분리 및 잔향제거)

  • Yu, Ho-Gun;Kim, Do-Hui;Song, Min-Hwan;Park, Hyung-Min
    • The Journal of the Acoustical Society of Korea
    • /
    • v.40 no.5
    • /
    • pp.503-514
    • /
    • 2021
  • Reverberation in speech signals tends to significantly degrade the performance of Blind Source Separation (BSS) systems, and the degradation is especially severe in online systems. Methods based on joint diagonalizability constraints have recently been developed to tackle the problem. To improve the quality of separated speech in reverberant environments, in this paper, we add the proposed dereverberation method to the online BSS algorithm based on these constraints. Through experiments on the WSJCAM0 corpus, the proposed method was compared with the existing online BSS algorithm. The performance evaluation using the Signal-to-Distortion Ratio (SDR) and the Perceptual Evaluation of Speech Quality (PESQ) demonstrated that SDR improved from 1.23 dB to 3.76 dB and PESQ improved from 1.15 to 2.12 on average.

Convolutional Neural Network Based Source Separation Using a Non-uniform Linear Microphone Array (비균등 선형 마이크로폰 어레이를 활용한 합성곱 신경망 기반의 음원분리)

  • Moon, Jung Min;Park, In Young;Kim, Hong Kook
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2017.11a
    • /
    • pp.44-45
    • /
    • 2017
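  • This paper proposes a convolutional neural network (CNN)-based sound source separation method using a non-uniform linear microphone array. First, the inter-channel time differences are analyzed according to the given array configuration, and, based on the analyzed time differences, the spectral magnitude of the input audio signal is predicted for each frequency according to the radiation angle and width. Then, the optimal radiation angle and width are selected by a CNN classifier and used to separate the sound sources.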

A Signal Separation Method Based on Sparsity Estimation of Source Signals and Non-negative Matrix Factorization (음원 희소성 추정 및 비음수 행렬 인수분해 기반 신호분리 기법)

  • Hong, Serin;Nam, Siyeon;Yun, Deokgyu;Choi, Seung Ho
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2017.11a
    • /
    • pp.202-203
    • /
    • 2017
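  • Sparse NMF (SNMF) is an algorithm that imposes a sparsity constraint to improve the signal separation performance of non-negative matrix factorization (NMF). Conventional SNMF algorithms use an arbitrarily chosen sparsity constraint without considering the sparsity of the individual sources. This paper proposes a new signal separation method that estimates the sparsity according to the characteristics of each source and applies it to the SNMF learning algorithm. Noise removal experiments on mixed signals showed that the proposed method outperforms conventional NMF and SNMF.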
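
To make the role of the sparsity constraint concrete, here is a minimal sketch of conventional SNMF with multiplicative updates for a Euclidean cost plus an L1 penalty on the activations. It is not the method proposed in the paper: the `sparsity` weight is a fixed, hand-chosen value (the "arbitrarily chosen" setting the abstract criticizes), and all function names and the toy matrix are assumptions made for this example.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def squared_error(v, w, h):
    """Sum of squared differences between V and the product W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))


def sparse_nmf(v, rank, sparsity, iterations=200, seed=0, eps=1e-12):
    """Factor V ~ W H with non-negative W, H and an L1 penalty on H."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(rows)]
    h = [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rank)]
    initial_error = squared_error(v, w, h)
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H + sparsity): the penalty weight enters
        # the denominator and shrinks the activations.
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + sparsity + eps)
              for j in range(cols)] for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T): standard unpenalized basis update.
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps)
              for k in range(rank)] for i in range(rows)]
    return w, h, initial_error


# Hypothetical toy "spectrogram": an exact rank-2 non-negative matrix.
v = [[1.0, 0.0, 2.0, 0.0],
     [0.0, 3.0, 0.0, 1.0],
     [1.0, 3.0, 2.0, 1.0]]
w, h, initial_error = sparse_nmf(v, rank=2, sparsity=0.01)
final_error = squared_error(v, w, h)
```

Practical SNMF implementations typically also normalize the columns of W so that the penalty cannot be sidestepped by rescaling; that step is left out here to keep the sketch short. A per-source sparsity estimate, as the abstract describes, would replace the single `sparsity` constant.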

Sound Source Separation Using Interaural Intensity Difference in Closely Spaced Stereo Omnidirectional Microphones (인접 배치된 스테레오 무지향성 마이크로폰 환경에서 양이간 강도차를 활용한 음원 분리 기법)

  • Chun, Chan Jun;Jeong, Seok Hee;Kim, Hong Kook
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.12
    • /
    • pp.191-196
    • /
    • 2013
  • In this paper, an interaural intensity difference (IID)-based sound source separation method for closely spaced stereo omnidirectional microphones is proposed. First, to improve channel separability, a minimum variance distortionless response (MVDR) beamformer is employed to increase the intensity difference between the stereo channels. After that, the IID-based sound source separation method is applied. To evaluate the performance of the proposed method, the source-to-distortion ratio (SDR), source-to-interference ratio (SIR), and sources-to-artifacts ratio (SAR), which are defined as objective evaluation criteria in the stereo audio source separation evaluation campaign (SASSEC), are measured. The objective evaluation shows that the proposed method outperforms a sound source separation method that does not apply a beamformer.
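
For readers unfamiliar with the metrics named here, the sketch below computes a simplified energy-ratio SDR, 10·log10(‖s‖² / ‖s − ŝ‖²), between a reference source and its estimate. This is only an illustration: the campaign-style SDR, SIR, and SAR are, as far as we understand, obtained after decomposing the estimate into target, interference, and artifact components, which this sketch does not attempt. The function name `simple_sdr_db` and the sample values are assumptions.

```python
import math


def simple_sdr_db(reference, estimate):
    """Simplified SDR in dB: reference energy over the energy of the error."""
    signal_energy = sum(x * x for x in reference)
    distortion_energy = sum((x - y) ** 2 for x, y in zip(reference, estimate))
    if distortion_energy == 0.0:
        return math.inf  # a perfect estimate has no distortion
    return 10.0 * math.log10(signal_energy / distortion_energy)


# Hypothetical example: the estimate is the reference scaled by 0.9.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -0.9, 0.9, -0.9]
sdr = simple_sdr_db(reference, estimate)  # about 20 dB
```

Because this simplified ratio penalizes any gain mismatch, a pure scaling error already limits the score; the decomposition-based metrics are designed to separate such effects from interference and artifacts.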

A Drum Onset Detection Scheme Based on Probabilistic Latent Component Analysis (확률적 은닉 성분 분석에 기반한 드럼 Onset 검출 방법)

  • Han, Byeong-jun;Kim, Yunjoo;Lee, Jangwoo;Kim, Minje;Lee, Kyogu
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2010.11a
    • /
    • pp.762-765
    • /
    • 2010
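  • Detecting the onsets of multiple sound sources played simultaneously at a given time first requires solving the source separation problem. In particular, the performance of the source separation method is important for detecting the signals of noise-like instruments such as drums. This study therefore proposes an onset detection method for major instrument signals based on probabilistic latent component analysis (PLCA), which is known as an effective source separation method. For effective onset detection, first, a method is applied that selects the optimal components among the non-negative frequency components trained with PLCA; second, a thresholding method for non-negative time-series signals, designed for accurate onset detection of drum signals, is applied. Experiments show that the proposed method improves onset detection performance for the major drum instrument signals.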

Blind Rhythmic Source Separation (블라인드 방식의 리듬 음원 분리)

  • Kim, Min-Je;Yoo, Ji-Ho;Kang, Kyeong-Ok;Choi, Seung-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.8
    • /
    • pp.697-705
    • /
    • 2009
  • An unsupervised (blind) method is proposed for extracting rhythmic sources from commercial polyphonic music whose number of channels is limited to one. Commercial music signals are not usually provided with more than two channels, while they often contain multiple instruments including singing voice. Therefore, instead of using conventional modeling of mixing environments or statistical characteristics, we should introduce other source-specific characteristics for separating or extracting sources in underdetermined environments. In this paper, we concentrate on extracting rhythmic sources from a mixture with other harmonic sources. An extension of nonnegative matrix factorization (NMF), called nonnegative matrix partial co-factorization (NMPCF), is used to analyze multiple relationships between spectral and temporal properties in the given input matrices. Moreover, the temporal repeatability of rhythmic sound sources is exploited as a common rhythmic property among segments of an input mixture signal. The proposed method shows acceptable, though not superior, separation quality compared with the referenced prior-knowledge-based drum source separation systems, but it has better applicability owing to its blind manner of separation, for example, when there is no prior information or the target rhythmic source is irregular.

Implementation of Environmental Noise Remover for Speech Signals (배경 잡음을 제거하는 음성 신호 잡음 제거기의 구현)

  • Kim, Seon-Il;Yang, Seong-Ryong
    • 전자공학회논문지 IE
    • /
    • v.49 no.2
    • /
    • pp.24-29
    • /
    • 2012
  • The exhaust sounds of automobiles are independent sound sources that have nothing to do with voices. We have no information about the sources of the voices and the exhaust sounds. Accordingly, Independent Component Analysis, one of the Blind Source Separation methods, was used to segregate the two source signals from each mixed signal. Maximum Likelihood Estimation was applied to the signals that came through the stereo microphone to segregate the two source signals so as to maximize independence. Since there is no clue as to whether a separated signal is speech or not, the slope coefficients were calculated from the autocovariances of the signals in the frequency domain. A noise remover for speech signals was implemented by coupling the two algorithms.

Evaluation of a signal segregation by FDBM (FDBM의 음원분리 성능평가)

  • Lee, Chai-Bong
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.8 no.12
    • /
    • pp.1793-1802
    • /
    • 2013
  • Various approaches for sound source segregation have been proposed. Among these approaches, the frequency domain binaural model (FDBM) has the advantages of low computational load and effective howling cancellation. A binaural hearing assistance system based on FDBM has been proposed. This system can enhance the desired signal based on directivity information. Although FDBM has been evaluated in terms of signal-to-noise ratio (SNR) and coherence function, the evaluation results do not always agree with human impressions. These evaluation methods provide physical measures and do not take human perception into account. Considering a binaural hearing assistance system as one of the major applications, the quality of the segregated sound should be kept at a sufficient level. In this paper, signal segregation performance by means of FDBM is evaluated by three objective methods, i.e., SNR, coherence, and Perceptual Evaluation of Speech Quality (PESQ), to discuss the characteristics of FDBM in sound source segregation. The simulation results show that FDBM improves the quality of the left and right channel signals to an equivalent level. The results also suggest that PESQ may provide a more useful measure than SNR and coherence in terms of the segregation performance of FDBM. The evaluation results by PESQ show the effects of the segregation parameters and indicate appropriate parameters under the conditions.