• Title/Summary/Keyword: Sound source separation (음원 분리)

Search results: 88

On Altering the Pitch of Speech Signals in Waveform Coding -Alteration Method by the LPC and the Pitch Halving- (음성 파형코딩 음원피치 변경에 관한 연구 -LPC와 주기반분법에 의한 피치변경법-)

  • 배명진;윤희상;안수길
    • The Journal of the Acoustical Society of Korea
    • /
    • v.10 no.5
    • /
    • pp.11-19
    • /
    • 1991
  • Among speech synthesis techniques, waveform coding is widely used for analysis-based synthesis because of its high speech quality. However, since it stores the waveform itself after removing only its redundancy, without separating the excitation source and vocal-tract characteristics, it is difficult to use for synthesis by rule. This paper proposes a period-halving method that can bisect the pitch of speech waveforms stored by linear PCM coding, yielding a new pitch-alteration method that changes the pitch period without separating the excitation source from the waveform itself. As a result, synthesis by rule can be performed with the high-quality waveform-coding synthesis method.

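The period-halving idea can be sketched as follows. This is a minimal, hypothetical illustration that assumes the pitch period of the linear-PCM waveform is already known; the paper's actual period-marker detection and waveform smoothing are not reproduced. Keeping only the first half of each pitch period halves the period, raising the pitch by an octave:

```python
import numpy as np

def halve_pitch_period(x, period):
    """Shorten every pitch period to its first half (hypothetical
    illustration of period halving; marker placement and smoothing
    from the paper are not reproduced)."""
    out = []
    for start in range(0, len(x) - period + 1, period):
        out.append(x[start:start + period // 2])  # keep first half of each period
    return np.concatenate(out)

fs = 8000
period = 80                      # 100 Hz pitch at 8 kHz sampling
t = np.arange(fs // 10) / fs     # 100 ms of signal
x = np.sin(2 * np.pi * 100 * t)  # idealized voiced waveform
y = halve_pitch_period(x, period)
# y now repeats every 40 samples, i.e. the pitch is one octave higher
```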

Target Speaker Speech Restoration via Spectral bases Learning (주파수 특성 기저벡터 학습을 통한 특정화자 음성 복원)

  • Park, Sun-Ho;Yoo, Ji-Ho;Choi, Seung-Jin
    • Journal of KIISE:Software and Applications
    • /
    • v.36 no.3
    • /
    • pp.179-186
    • /
    • 2009
  • This paper proposes a target speech extraction method that restores the speech signal of a target speaker from a noisy convolutive mixture of speech and an interference source. We assume that the target speaker is known and that his/her utterances are available at training time. Incorporating additional information extracted from the training utterances into the separation, we combine convolutive blind source separation (CBSS) with non-negative decomposition techniques, e.g., a probabilistic latent variable model. The non-negative decomposition is used to learn a set of bases from the spectrogram of the training utterances, where the bases represent the spectral information corresponding to the target speaker. Based on the learned spectral bases, our method provides two post-processing steps for CBSS. The channel-selection step finds a desirable output channel from CBSS, one that dominantly contains the target speech. The reconstruction step recovers the original spectrogram of the target speech from the selected output channel so that the remaining interference source and background noise are suppressed. Experimental results show that our method substantially improves the separation results of CBSS and, as a result, successfully recovers the target speech.
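As a rough sketch of the basis-learning and channel-selection steps, the snippet below uses plain multiplicative-update NMF as a stand-in for the paper's probabilistic latent variable model, on toy spectrograms (the CBSS front end is not reproduced; all data and dimensions are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def nmf(V, r, iters=200, eps=1e-9):
    """Basic multiplicative-update NMF (stand-in for the paper's
    probabilistic latent variable model): V (F x T) ~= W @ H."""
    F, T = V.shape
    W = rng.random((F, r)) + eps
    H = rng.random((r, T)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def channel_score(S, W, iters=100, eps=1e-9):
    """Fit a channel's magnitude spectrogram with the fixed target
    bases W; a lower residual suggests the channel is dominated by
    the target speaker."""
    H = rng.random((W.shape[1], S.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ S) / (W.T @ W @ H + eps)
    return np.linalg.norm(S - W @ H)

# Toy spectrograms: the target speaker concentrates energy in low bands,
# the interference in high bands (hypothetical stand-in for real data).
F, T = 32, 60
target = np.abs(rng.normal(size=(F, T))) * np.linspace(2, 0.1, F)[:, None]
interf = np.abs(rng.normal(size=(F, T))) * np.linspace(0.1, 2, F)[:, None]

W, _ = nmf(target, r=4)                      # learn target spectral bases
scores = [channel_score(ch, W) for ch in (target, interf)]
best = int(np.argmin(scores))                # channel-selection step
```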

Efficient Primary-Ambient Decomposition Algorithm for Audio Upmix (오디오 업믹스를 위한 효율적인 주성분-주변성분 분리 알고리즘)

  • Baek, Yong-Hyun;Jeon, Se-Woon;Lee, Seok-Pil;Park, Young-Cheol
    • Journal of Broadcast Engineering
    • /
    • v.17 no.6
    • /
    • pp.924-932
    • /
    • 2012
  • Decomposition of a stereo signal into primary and ambient components is a key step in stereo upmix, and it is often based on principal component analysis (PCA). However, a major shortcoming of the PCA-based method is that the accuracy of the decomposed components depends on both the primary-to-ambient power ratio (PAR) and the panning angle. Previously, a modified PCA was suggested to solve the PAR-dependence problem; however, its performance still depends on the panning angle of the primary signal. In this paper, we propose a new PCA-based primary-ambient decomposition algorithm whose performance is affected by neither the PAR nor the panning angle. The proposed algorithm finds scale factors based on a criterion set to preserve the powers of the mixed components, so that the original primary and ambient powers are correctly retrieved. Simulation results are presented to show the effectiveness of the proposed algorithm.
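The classical PCA baseline that the paper improves on can be sketched in a few lines (the proposed power-preserving scale factors are not reproduced here; signals and gains are synthetic):

```python
import numpy as np

def pca_primary_ambient(x_l, x_r):
    """Classical PCA-based decomposition (the baseline, not the paper's
    proposed variant): project the stereo pair onto the principal
    eigenvector of the 2x2 channel covariance."""
    X = np.vstack([x_l, x_r])            # 2 x N
    C = X @ X.T / X.shape[1]             # channel covariance
    w, V = np.linalg.eigh(C)             # eigenvalues in ascending order
    v = V[:, np.argmax(w)]               # principal direction (panning)
    primary = np.outer(v, v @ X)         # rank-1 projection
    ambient = X - primary                # orthogonal residual
    return primary, ambient

rng = np.random.default_rng(1)
n = 4096
s = rng.normal(size=n)                   # primary source
g_l, g_r = 0.9, 0.5                      # panning gains
amb = 0.1 * rng.normal(size=(2, n))      # weak uncorrelated ambience
x_l, x_r = g_l * s + amb[0], g_r * s + amb[1]
P, A = pca_primary_ambient(x_l, x_r)
```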

Sound Source Localization and Separation for Emotional Robot (감성로봇을 위한 음원의 위치측정 및 분리)

  • 김경환;김연훈;곽윤근
    • Journal of the Korean Society for Precision Engineering
    • /
    • v.20 no.5
    • /
    • pp.116-123
    • /
    • 2003
  • These days, research related to emotional robots is actively in progress. Human language, facial expression, action, etc. are integrated into the emotional robot so that it can understand human emotion. However, there are many sound sources and much background noise around the robot, so the robot should be able to separate a mixture of these sound sources into the original sources and, moreover, to understand the meaning of the voice of a specific person. It should also be able to turn or move toward that person to observe his expression or action effectively. Until now, research on the localization and separation of sound sources has been so theoretical and computationally heavy that real-time processing is hardly possible. For this reason, a practical emotional robot requires fast computation based on simple principles. In this paper, methods are proposed for detecting the direction of sound sources using the phase difference between peaks in the spectra, and for separating the sound sources using the fundamental frequency and overtones of the human voice. It is also shown that, using these methods, effective real-time localization and separation of sound sources in a living room are possible.
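The phase-difference direction estimate can be illustrated for a single source and two microphones (a toy sketch with synthetic signals; the paper's multi-source, real-time system and the harmonic-based separation are not reproduced):

```python
import numpy as np

fs = 16000
f0 = 500.0                      # dominant spectral peak (Hz)
tau = 5 / fs                    # true inter-microphone delay (s)
t = np.arange(1600) / fs
x1 = np.sin(2 * np.pi * f0 * t)          # microphone 1
x2 = np.sin(2 * np.pi * f0 * (t - tau))  # microphone 2 (delayed)

X1, X2 = np.fft.rfft(x1), np.fft.rfft(x2)
k = np.argmax(np.abs(X1))                       # spectral peak bin
phase_diff = np.angle(X1[k] * np.conj(X2[k]))   # phase difference at the peak
tau_est = phase_diff / (2 * np.pi * f0)         # delay recovered from phase

d, c = 0.15, 343.0                              # mic spacing (m), sound speed (m/s)
angle_deg = np.degrees(np.arcsin(np.clip(c * tau_est / d, -1, 1)))
```

Note that the phase at a single frequency is only unambiguous while `2*pi*f0*tau` stays below pi, which is why a low peak frequency and small spacing are used here.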

A Study on the Sweet-Spot Widening using 2-Channel Sound Transaural Filter (2채널 트랜스오럴 필터를 이용한 최적 청취영역 확대에 관한 연구)

  • Ahn Chan-Shik;Hwang Shin;Kim Soon-Hyob
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • spring
    • /
    • pp.53-56
    • /
    • 2002
  • This paper concerns experiments and system implementation for canceling crosstalk and widening the sweet spot, in order to present a more immersive sound to the listener through two-channel loudspeakers and to allow freer listening positions. To cancel the crosstalk paths from the two front loudspeakers, a transaural filter implemented with a free-field model that minimizes timbre distortion was used. To widen the sweet spot, the loudspeakers were configured with band-pass filters (BPF) to reproduce low and high frequencies separately, and only the mid-to-high frequency band was used, excluding the low-frequency band. Using a conventional crosstalk-cancellation system, measurements were taken while moving left and right in 5 cm steps up to 100 cm from the fixed listening point; at 30 cm, 55 cm, 75 cm, 90 cm, and 100 cm, the channel separation exceeded 5 dB, indicating that crosstalk was canceled. Transaural filtering based on the free-field model was then applied at each of these points, and to prevent mutual interference, timbre compensation was performed in the frequency domain using a psychoacoustics-based 1/3-octave band-pass filter. Sound sources were produced, and listening tests compared the sources at each position against those presented by a conventional two-channel system, as well as against a conventional transaural filter.

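The core crosstalk-cancellation step can be sketched with an idealized, symmetric free-field model (the gain and delay parameters are hypothetical; the band splitting and 1/3-octave timbre compensation described above are not reproduced):

```python
import numpy as np

# Idealized free-field head model: the ipsilateral path is direct;
# the contralateral (crosstalk) path is attenuated and delayed.
fs = 8000
freqs = np.fft.rfftfreq(512, 1 / fs)
a, dt = 0.4, 2.5e-4             # hypothetical crosstalk gain and extra delay (s)

E = np.zeros((len(freqs), 2, 2), dtype=complex)  # achieved ear responses
for i, f in enumerate(freqs):
    x = a * np.exp(-2j * np.pi * f * dt)
    H = np.array([[1, x], [x, 1]])       # speakers -> ears transfer matrix
    C = np.linalg.inv(H)                 # transaural (crosstalk-cancel) filter
    E[i] = H @ C                         # ideally the identity at every frequency

# Channel separation: diagonal (desired) vs. off-diagonal (residual crosstalk)
residual = np.max(np.abs(E[:, 0, 1]))
```

In this ideal model the inversion is exact; in practice the sweet-spot problem arises because H changes as the listener moves, which is what the measurement points above characterize.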

A system for recommending audio devices based on frequency band analysis of vocal component in sound source (음원 내 보컬 주파수 대역 분석에 기반한 음향기기 추천시스템)

  • Jeong-Hyun, Kim;Cheol-Min, Seok;Min-Ju, Kim;Su-Yeon, Kim
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.27 no.6
    • /
    • pp.1-12
    • /
    • 2022
  • As the music streaming service and the Hi-Fi market grow, various audio devices are being released. As a result, consumers have a wider range of product choices, but it has become more difficult to find products that match their musical tastes. In this study, we propose a system that extracts the vocal component from the user's preferred sound source and, based on this information, recommends the most suitable audio device. To achieve this, the original sound source was first separated using Python's Spleeter library and the vocal track was extracted; frequency-band data collected for manufacturers' audio devices were then shown in a grid graph. The Matching Gap Index (MGI) is proposed as an indicator for comparing the frequency band of the extracted vocal track with the measured frequency-band data of the audio devices. Based on the calculated MGI value, the audio device with the highest similarity to the user's preference is recommended. The recommendation results were verified using per-genre equalizer data provided by professional audio companies.
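Since the MGI formula is not given in this abstract, the sketch below uses a hypothetical stand-in, the mean absolute dB gap between the vocal band profile and each device's frequency response, to show how the recommendation step could work (band names, profiles, and device data are all invented for illustration):

```python
import numpy as np

bands = ["low", "mid-low", "mid", "mid-high", "high"]
vocal_profile = np.array([-6.0, -2.0, 0.0, -1.0, -8.0])   # dB per band (toy data)

# Hypothetical measured device responses over the same bands.
devices = {
    "flat":      np.array([0.0, 0.0, 0.0, 0.0, 0.0]),
    "mid-boost": np.array([-6.0, -2.0, 0.0, -1.0, -8.0]),
    "v-shaped":  np.array([4.0, -3.0, -6.0, -3.0, 4.0]),
}

def mgi(vocal, device):
    """Stand-in gap index: mean absolute dB difference across bands
    (the paper's actual MGI definition may differ)."""
    return float(np.mean(np.abs(vocal - device)))

scores = {name: mgi(vocal_profile, resp) for name, resp in devices.items()}
recommended = min(scores, key=scores.get)   # smallest gap -> best match
```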

Acoustic parabolic equation model with a directional source (방향성 있는 음원이 적용된 음향 포물선 방정식 모델)

  • Lee, Keunhwa;Na, Youngnam;Son, Su-Uk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.39 no.1
    • /
    • pp.1-7
    • /
    • 2020
  • The acoustic parabolic equation method in the ocean is an efficient technique for calculating the acoustic field emanating from a point source in a range-dependent environment. In practical problems, however, we often need a directional source with a main beam. In this paper, we present two methods to implement a directional source easily in an acoustic parabolic equation code. One is simply to filter the delta function that idealizes an omnidirectional point source. The other is based on rational filtering of the self-starter solution; it has the limitation that it cannot separate the up-going and down-going waves in depth, but it would be useful for implementing mode propagation. Numerical examples for validation are given in the Pekeris environment and a deep-sea environment.
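The first method, filtering the delta function, can be illustrated on a depth grid: the delta starter excites all vertical wavenumbers (propagation angles) equally, and applying a beam filter in the wavenumber domain concentrates the angular spectrum around a main beam (the grid and Gaussian filter here are illustrative assumptions, not the paper's actual filter):

```python
import numpy as np

N, dz = 256, 1.0
z = np.arange(N) * dz
psi0 = np.zeros(N)
psi0[N // 2] = 1.0                              # delta at the source depth

kz = np.fft.fftfreq(N, dz) * 2 * np.pi          # vertical wavenumbers
beam = np.exp(-(kz / 0.3) ** 2)                 # illustrative Gaussian beam filter
psi = np.fft.ifft(np.fft.fft(psi0) * beam)      # filtered (directional) starter

# The delta excites all angles equally; the filtered starter concentrates
# its angular spectrum near kz = 0 (a horizontally directed main beam).
spec_delta = np.abs(np.fft.fft(psi0))
spec_beam = np.abs(np.fft.fft(psi))
```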

A method for localization of multiple drones using the acoustic characteristic of the quadcopter (쿼드콥터의 음향 특성을 활용한 다수의 드론 위치 추정법)

  • In-Jee Jung;Wan-Ho Cho;Jeong-Guon Ih
    • The Journal of the Acoustical Society of Korea
    • /
    • v.43 no.3
    • /
    • pp.351-360
    • /
    • 2024
  • With the increasing use of drone technology, the Unmanned Aerial Vehicle (UAV) is now being utilized in various fields. However, this increased use of drones has resulted in various issues. Due to its small size, the drone is difficult to detect with radar or optical equipment, so acoustical tracking methods have recently been applied. In this paper, a method for localizing multiple drones using the acoustic characteristics of the quadcopter drone is suggested. Because the acoustic characteristics induced by each rotor differ depending on the type of drone and its movement state, the drone's sound source can be reconstructed by spatially clustering the estimated positions of the blade-passing frequency and its harmonic sound sources. The reconstructed sound sources are then used to determine the locations of multiple drone sound sources by applying a source localization algorithm. An experiment is conducted to analyze the acoustic characteristics of the test quadcopter drones, and simulations for three different types of drones are conducted to localize multiple drones based on the measured acoustic signals. The test results show that the locations of multiple drones can be estimated by utilizing the acoustic characteristics of the drone. One can also see that the clarity of the separated drone sound sources and the source localization algorithm affect the accuracy of localization for multiple drone sound sources.
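The spatial-clustering step can be sketched with a greedy distance-threshold rule on synthetic harmonic-position estimates (a hypothetical stand-in for the paper's clustering procedure; positions, noise level, and radius are invented):

```python
import numpy as np

def cluster_positions(points, radius):
    """Greedy distance-threshold clustering: group estimated BPF/harmonic
    source positions that lie within `radius` of a cluster centroid."""
    clusters = []
    for p in points:
        for c in clusters:
            if np.linalg.norm(p - np.mean(c, axis=0)) < radius:
                c.append(p)
                break
        else:
            clusters.append([p])
    return [np.mean(c, axis=0) for c in clusters]

rng = np.random.default_rng(2)
drones = np.array([[0.0, 0.0], [10.0, 5.0], [-8.0, 12.0]])  # true positions (m)
# Each drone contributes several noisy harmonic-position estimates.
points = np.vstack([d + 0.2 * rng.normal(size=(6, 2)) for d in drones])
rng.shuffle(points)
centers = cluster_positions(points, radius=2.0)   # one center per drone
```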

Real-time Orchestra Method using MIDI Files (MIDI파일을 이용한 실시간 합주 기법)

  • Lee, Ji-Hye;Kim, Svetlana;Yoon, Yong-Ik
    • The Journal of the Korea Contents Association
    • /
    • v.10 no.4
    • /
    • pp.91-97
    • /
    • 2010
  • Recently, Internet users have become interested in social media services in the Web 2.0 environment. We suggest an orchestra service as a social media service to satisfy users in this changed web environment. We adopt the concept of MMMD (Multiple Media Multiple Devices): users listen to music not on a single device but on multiple devices. Each device plays the sound source assigned to a designated instrument, giving users the realistic feeling of an orchestra. To meet this purpose, we define three steps. First, we separate the sound source based on instrument information. Second, we extract the sound source suitable for the orchestra performance. In the final step, each sound source is transmitted to the appropriate playback device. We name these three steps the AET process. In addition, we suggest a synchronization method that uses rest points in the MIDI file to control the sound sources. Using the AET process and the synchronization method, we provide an orchestra service that satisfies users.
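The AET steps can be sketched on a toy event list (real MIDI parsing and network transmission are omitted; the instrument and device names are hypothetical):

```python
# Separate / extract / transmit sketch on a toy list of note events.
events = [
    {"instrument": "violin", "note": 64, "time": 0.0},
    {"instrument": "cello",  "note": 48, "time": 0.0},
    {"instrument": "violin", "note": 67, "time": 0.5},
    {"instrument": "flute",  "note": 72, "time": 0.5},
]
device_map = {"violin": "phone-1", "cello": "phone-2", "flute": "tablet-1"}

# Separate: group events by their instrument information.
parts = {}
for ev in events:
    parts.setdefault(ev["instrument"], []).append(ev)

# Extract + transmit: pair each instrument part with its playback device.
playlist = {device_map[inst]: part for inst, part in parts.items()}
```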