• Title/Summary/Keyword: Sound channel

High Directivity Sound Beamforming Algorithm (방향성이 높은 사운드 빔 형성 알고리즘)

  • Kim, Seona-Woo;Hur, Yoo-Mi;Park, Young-Chul;Youn, Dae-Hee
    • The Journal of the Acoustical Society of Korea, v.29 no.1, pp.24-33, 2010
  • This paper proposes a sound beamforming technique that generates highly directive sound beams, and presents applications of the proposed algorithm to multi-channel 3D sound systems. The proposed algorithm consists of two phases: first, optimum weights that maximize the sound pressure level ratio between the target and control acoustic regions are designed; then, the directivity of the pre-designed sound beam is iteratively enhanced by modifying the covariance matrix. The proposed method was evaluated under various conditions, and the results showed that it provides more focused sound beams than conventional methods.
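The first phase described above (weights that maximize the pressure ratio between a target and a control region) can be sketched with the standard acoustic contrast formulation, in which the weights are the dominant eigenvector of R_c^{-1} R_t. This is only a minimal illustration under assumptions that are not in the abstract: free-field monopole loudspeakers, a hypothetical 8-element line array at 1 kHz, made-up zone positions, and a small diagonal loading `reg`. The iterative covariance modification of the second phase is not specified in the abstract and is not shown.

```python
import numpy as np

def free_field(speakers, points, k):
    """Monopole free-field transfer functions exp(-jkr) / (4*pi*r), shape (points, speakers)."""
    r = np.linalg.norm(points[:, None, :] - speakers[None, :, :], axis=2)
    return np.exp(-1j * k * r) / (4 * np.pi * r)

def contrast_weights(G_target, G_control, reg=1e-6):
    """Loudspeaker weights maximizing target/control energy (assumed formulation).

    reg is a relative diagonal loading (assumed value) that keeps R_c invertible.
    """
    n = G_target.shape[1]
    # Spatial covariance matrices of the two zones.
    R_t = G_target.conj().T @ G_target / G_target.shape[0]
    R_c = G_control.conj().T @ G_control / G_control.shape[0]
    R_c = R_c + reg * np.trace(R_c).real / n * np.eye(n)
    # The dominant eigenvector of R_c^{-1} R_t maximizes w^H R_t w / w^H R_c w.
    vals, vecs = np.linalg.eig(np.linalg.solve(R_c, R_t))
    w = vecs[:, np.argmax(vals.real)]
    return w / np.linalg.norm(w)

def contrast_db(w, G_target, G_control):
    """Mean squared pressure in the target zone relative to the control zone, in dB."""
    p_t = np.mean(np.abs(G_target @ w) ** 2)
    p_c = np.mean(np.abs(G_control @ w) ** 2)
    return 10 * np.log10(p_t / p_c)

# Hypothetical geometry (metres): line array on y = 0, zones on y = 2.
k = 2 * np.pi * 1000.0 / 343.0
speakers = np.column_stack([np.linspace(-0.35, 0.35, 8), np.zeros(8)])
target = np.column_stack([np.linspace(-0.2, 0.2, 5), np.full(5, 2.0)])
control = np.column_stack([np.linspace(1.0, 2.5, 12), np.full(12, 2.0)])
G_t = free_field(speakers, target, k)
G_c = free_field(speakers, control, k)

w_opt = contrast_weights(G_t, G_c)
w_uniform = np.ones(8) / np.sqrt(8)
print(contrast_db(w_uniform, G_t, G_c), contrast_db(w_opt, G_t, G_c))
```

Comparing the two printed values shows how much contrast the optimized weights gain over uniform (broadside) weights for this particular made-up layout.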

Sound event detection based on multi-channel multi-scale neural networks for home monitoring system used by the hard-of-hearing (청각 장애인용 홈 모니터링 시스템을 위한 다채널 다중 스케일 신경망 기반의 사운드 이벤트 검출)

  • Lee, Gi Yong;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea, v.39 no.6, pp.600-605, 2020
  • In this paper, we propose a sound event detection method using multi-channel multi-scale neural networks for a sound-sensing home monitoring system for the hearing impaired. In the proposed system, the two channels with the highest signal quality are selected from several wireless microphone sensors in the home. Three features extracted from the sensor signals (time difference of arrival, pitch range, and the outputs obtained by applying a multi-scale convolutional neural network to the log mel spectrogram) are applied to a classifier based on a bidirectional gated recurrent neural network to further improve the performance of sound event detection. The detected sound event is converted into text, together with the sensor position of the selected channel, and provided to the hearing impaired. The experimental results show that the sound event detection method of the proposed system is superior to the existing method and can effectively deliver sound information to the hearing impaired.
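One of the three features named above is the time difference of arrival between the two selected channels. The abstract does not say how it is estimated; the sketch below assumes the simplest approach, picking the lag that maximizes the plain time-domain cross-correlation. The function name, the toy signals, and the 16 kHz sample rate are all hypothetical.

```python
def tdoa_samples(x, y, max_lag):
    """Return the lag (in samples) at which y best matches a shifted copy of x.

    A positive result means y lags behind x.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n in range(len(x)):
            m = n + lag
            if 0 <= m < len(y):
                score += x[n] * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Toy two-channel signal: channel y is channel x delayed by 3 samples.
x = [0.0] * 20
x[5], x[6] = 1.0, 0.5
y = [0.0, 0.0, 0.0] + x[:-3]

fs = 16000                      # assumed sample rate in Hz
lag = tdoa_samples(x, y, max_lag=8)
tdoa_seconds = lag / fs
print(lag, tdoa_seconds)
```

In practice a weighted variant such as GCC-PHAT is often preferred in reverberant rooms; the brute-force loop here is kept only for readability.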

A Study on the Implementation of Realistic Sound Through Cross-Talk Cancellation (크로스토크 제거를 통한 입체 음향 구현에 관한 연구)

  • Kim, Hack-Jin
    • Journal of the Institute of Electronics Engineers of Korea SP, v.41 no.2, pp.99-108, 2004
  • This thesis deals with a method of delivering more realistic sound by cancelling the cross-talk that is inherent in the 5.1 channel speaker system. The acoustical model for cross-talk cancellation is the free-field model, which minimizes distortion of the sound. I used Bark-scale sound quality compensation, which is based on psychoacoustics. For the surround channels, band-limited sound quality compensation is performed in the frequency domain. I also performed a sound quality assessment test on the traditional 2 channel stereo and 5.1 channel systems. This test was performed in a test chamber that satisfies the ITU-R specifications. I used the IACC (Inter-Aural Cross-Correlation) to determine the preferences of amateur listeners and golden-ear experts in order to assess the trans-aural filter. With the proposed method, I obtained a separation of more than 38 dB with the Dolby standard speaker array. The diffusion score in the subjective test with the experts increased by 0.4 points over the previous result.
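The idea of cross-talk cancellation under a free-field model can be illustrated at a single frequency by inverting the 2x2 matrix of speaker-to-ear paths. This is a simplified sketch, not the thesis's filter design: it assumes two loudspeakers and two ear positions only, point-source propagation exp(-jkr)/r, and hypothetical path lengths of 2.00 m (direct) and 2.08 m (cross).

```python
import cmath
import math

def free_field_tf(distance_m, freq_hz, c=343.0):
    """Free-field transfer function of a point source: exp(-jkr) / r."""
    k = 2 * math.pi * freq_hz / c
    return cmath.exp(-1j * k * distance_m) / distance_m

def crosstalk_canceller(h_ll, h_lr, h_rl, h_rr):
    """Invert H = [[h_ll, h_lr], [h_rl, h_rr]], where h_xy is speaker y to ear x.

    Returns the 2x2 matrix C such that H times C is the identity.
    """
    det = h_ll * h_rr - h_lr * h_rl
    if abs(det) < 1e-12:
        raise ValueError("H is singular at this frequency")
    return [[h_rr / det, -h_lr / det],
            [-h_rl / det, h_ll / det]]

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Symmetric listening setup at 1 kHz (hypothetical distances).
direct = free_field_tf(2.00, 1000.0)
cross = free_field_tf(2.08, 1000.0)
H = [[direct, cross], [cross, direct]]
C = crosstalk_canceller(direct, cross, cross, direct)
P = matmul2(H, C)   # should be close to the identity matrix
```

Repeating the inversion for every frequency bin gives a frequency-domain canceller; near frequencies where `det` approaches zero the inverse becomes very large, which is one reason practical designs add regularization or the kind of sound quality compensation the abstract mentions.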

Improved methods for measuring early reflections from Five-channel room impulse response using newly introduced Peak-Detecting algorithm

  • Kim Lae-Hoon;Doo Sejin;Oh Yangki;Lee Heewon;Sung Koeng-Mo
    • Proceedings of the Acoustical Society of Korea Conference, spring, pp.439-442, 2000
  • When we measure the acoustical properties of a room using a multiple-microphone system, it is important to determine the exact time delay of the early reflections from an impulse response pair. However, it is often very difficult to identify the early reflections in their natural shape, because a waveform may be deformed by the characteristics of the source loudspeaker, the microphone, and the reflecting walls, and by the overlap of multiple waveforms. In this paper, to obtain more accurate early reflections in sufficient number, we propose a new five-channel sound receiving system and introduce a peak-detecting algorithm. The system has microphones mounted at the origin and at the four vertices of a regular tetrahedron. The newly introduced peak-detecting algorithm can show the exact peak position in each channel in spite of the deformation caused by the reflecting walls, loudspeaker, and microphone.

Polyphonic sound event detection using multi-channel audio features and gated recurrent neural networks (다채널 오디오 특징값 및 게이트형 순환 신경망을 사용한 다성 사운드 이벤트 검출)

  • Ko, Sang-Sun;Cho, Hye-Seung;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea, v.36 no.4, pp.267-272, 2017
  • In this paper, we propose an effective method of applying multi-channel audio feature values to GRNNs (Gated Recurrent Neural Networks) in polyphonic sound event detection. Real-life sounds often overlap with each other, so it is difficult to distinguish them using mono-channel audio features. In the proposed method, we try to improve the performance of polyphonic sound event detection by using multi-channel audio features. In addition, we try to improve the performance further by applying a gated recurrent neural network, which is simpler than the LSTM (Long Short-Term Memory) network that shows the highest performance among current recurrent neural networks. The experimental results show that the proposed method achieves better sound event detection performance than other existing methods.

A Study on the Sound Quality Improvement Using the Equal Compensation Filter in Bark-scale for the Cross-talk Cancellation (크로스토크 제거를 위한 바크스케일 등가 보상 필터를 이용한 음질 향상에 관한 연구)

  • Kim, Hack-Jin;Kim, Soon-Hyub
    • The KIPS Transactions:PartB, v.11B no.3, pp.345-352, 2004
  • This paper deals with a method of delivering more realistic sound by cancelling the cross-talk that is inherent in the 5.1 channel speaker system. The acoustical model for cross-talk cancellation is the free-field model, which minimizes distortion of the sound. I used Bark-scale sound quality compensation, which is based on psychoacoustics. For the surround channels, band-limited sound quality compensation is performed in the frequency domain. I also performed a sound quality assessment test on the traditional 2 channel stereo and 5.1 channel systems. This test was performed in a test chamber that satisfies the ITU-R specifications. I used the IACC (Inter-Aural Cross-Correlation) to determine the preferences of amateur listeners and golden-ear experts in order to assess the trans-aural filter. With the proposed method, I obtained a separation of more than 38 dB with the Dolby standard speaker array. The diffusion score in the subjective test with the experts increased by 0.4 to 0.5 points over the previous result.
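Bark-scale compensation implies grouping frequency bins into critical bands and applying one gain per band. The sketch below shows only that mapping, using the Zwicker-Terhardt approximation of the Bark scale; the per-band gains are hypothetical placeholders, since the abstract does not give the actual compensation filter.

```python
import math

def hz_to_bark(freq_hz):
    """Zwicker-Terhardt approximation of the Bark critical-band rate."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))

def bark_band_gains(freqs_hz, band_gain_db):
    """Map each bin frequency to the linear gain of its Bark band.

    band_gain_db[b] is a hypothetical compensation gain in dB for band b,
    where b is the integer part of the Bark value; missing bands get 0 dB.
    """
    gains = []
    for f in freqs_hz:
        band = int(hz_to_bark(f))
        gains.append(10 ** (band_gain_db.get(band, 0.0) / 20.0))
    return gains

# Example: boost only band 8 (which contains 1 kHz) by 6 dB.
bin_freqs = [100.0, 1000.0, 4000.0]
gains = bark_band_gains(bin_freqs, {8: 6.0})
```

Multiplying each frequency-domain bin by its entry in `gains` applies the band-wise compensation; restricting the dictionary to a subset of bands gives a band-limited variant.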

A Method of Sound Segmentation in Time-Frequency Domain Using Peaks and Valleys in Spectrogram for Speech Separation (음성 분리를 위한 스펙트로그램의 마루와 골을 이용한 시간-주파수 공간에서 소리 분할 기법)

  • Lim, Sung-Kil;Lee, Hyon-Soo
    • The Journal of the Acoustical Society of Korea, v.27 no.8, pp.418-426, 2008
  • In this paper, we propose an algorithm for frequency channel segmentation using peaks and valleys in the spectrogram. A frequency channel segment is a local group of channels in the frequency domain that could arise from the same sound source. The proposed algorithm is based on the smoothed spectrum of the input sound. Peaks and valleys in the smoothed spectrum are used to determine the centers and boundaries of segments, respectively. To evaluate the suitability of the proposed segmentation algorithm before the grouping stage is applied, we compare the results synthesized using the ideal mask with those of the proposed algorithm. Simulations are performed on speech signals mixed with narrow-band noise, wide-band noise, and other speech signals.
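The segmentation rule stated above (peaks of the smoothed spectrum become segment centers, valleys become segment boundaries) can be sketched for a single frame as follows. The moving-average smoother, its window size, and the toy spectrum are assumptions; the abstract does not specify the smoothing used in the paper.

```python
def smooth(spectrum, half_width):
    """Moving-average smoothing; the window shrinks at the edges."""
    out = []
    for i in range(len(spectrum)):
        lo = max(0, i - half_width)
        hi = min(len(spectrum), i + half_width + 1)
        out.append(sum(spectrum[lo:hi]) / (hi - lo))
    return out

def segment_channels(spectrum, half_width=1):
    """Split frequency channels into segments centered on spectral peaks.

    Returns a list of (start, center, end) channel indices, end inclusive.
    Adjacent segments share the valley channel that separates them.
    """
    s = smooth(spectrum, half_width)
    n = len(s)
    # Local maxima are segment centers, local minima are boundaries.
    peaks = [i for i in range(1, n - 1) if s[i - 1] < s[i] >= s[i + 1]]
    valleys = [i for i in range(1, n - 1) if s[i - 1] > s[i] <= s[i + 1]]
    bounds = [0] + valleys + [n - 1]
    segments = []
    for p in peaks:
        start = max(b for b in bounds if b <= p)
        end = min(b for b in bounds if b >= p)
        segments.append((start, p, end))
    return segments

# Toy magnitude spectrum of one frame with two spectral humps.
frame = [0, 1, 3, 1, 0, 2, 5, 2, 0]
segments = segment_channels(frame)
print(segments)  # [(0, 2, 4), (4, 6, 8)]
```

Running this per frame of a spectrogram yields the time-frequency segments that a later grouping stage could assign to sources.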

Comparison of Speech Intelligibility & Performance of Speech Recognition in Real Driving Environments (자동차 주행 환경에서의 음성 전달 명료도와 음성 인식 성능 비교)

  • Lee Kwang-Hyun;Choi Dae-Lim;Kim Young-Il;Kim Bong-Wan;Lee Yong-Ju
    • MALSORI, no.50, pp.99-110, 2004
  • The normal transmission characteristics of sound are hard to obtain in a running car because of various noises and structural factors. The resulting channel distortion of the source sound recorded by the microphones seriously degrades the performance of speech recognition in real driving environments. In this paper, we analyze the degree of intelligibility under various channel-dependent sound distortion conditions according to driving speed, using the speech transmission index (STI), and compare the STI with speech recognition rates. We examine the correlation between the intelligibility measures, which depend on the sound pick-up pattern, and speech recognition performance, and thereby consider the optimal location of a microphone in a single-channel environment. In the experiments, we find a high correlation between the STI and speech recognition rates.

HRTF Enhancement Algorithm for Stereo Sound Systems (스테레오 시스템을 위한 머리전달함수의 개선)

  • Koo, Kyo-Sik;Cha, Hyung-Tai
    • The Journal of the Acoustical Society of Korea, v.27 no.4, pp.207-214, 2008
  • To create 3D sound, we usually use one of two methods: a two-channel or a multi-channel sound system. Because of cost and space constraints, the two-channel sound system is preferred to the multi-channel one. With headphones or two speakers, the most typical method of creating 3D sound effects is the head-related transfer function (HRTF), which contains information on how sound travels from a sound source to the ears of the listener. However, it is difficult to localize a sound source in certain regions, a problem known as the cone of confusion. In this paper, we propose a new algorithm to reduce confusion in sound image localization. HRTF grouping and psychoacoustic theory are used to boost the spectral cues using the spectral differences among directions. Informal listening tests show that the proposed method improves front-back sound localization considerably more than conventional methods.

A Perception Based Active Matrix Decoder with Virtual Source Location Information (가상 음원 위치 정보를 이용한 능동 메트릭스 디코더)

  • Moon, Han-Gil
    • Journal of the Institute of Electronics Engineers of Korea SP, v.47 no.5, pp.18-24, 2010
  • In this paper, a new matrix decoding system using vector-based Virtual Source Location Information (VSLI) is proposed as an alternative to the conventional Dolby Pro Logic II/IIx system for reconstructing multi-channel output signals from matrix-encoded two-channel signals, Lt/Rt. The new matrix decoding system is composed of a passive decoding part and an active part. The passive part produces crude multi-channel signals using linear combinations of the two encoded signals (Lt/Rt), and the active part enhances each channel with regard to the virtual source that emerges between channels. Since the virtual sources are related to the perceptual sound images in the virtual sound field, the reconstructed multi-channel sound provides good dynamic perception and stable image localization. Moreover, good channel separation is maintained with a nonlinear trigonometric enhancing function.
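The passive part described above forms each output channel as a linear combination of Lt and Rt. A minimal sketch follows, assuming the commonly used sum/difference passive matrix (center from the sum, surround from the difference, each scaled by 1/sqrt(2)); the paper's actual coefficients and its VSLI-based active enhancement are not given in the abstract and are not reproduced here.

```python
import math

def passive_decode(lt, rt):
    """Passive matrix decoding of one Lt/Rt sample pair (assumed coefficients).

    Returns (left, right, center, surround).
    """
    g = 1.0 / math.sqrt(2.0)
    left = lt
    right = rt
    center = g * (lt + rt)      # in-phase content goes to the center
    surround = g * (lt - rt)    # out-of-phase content goes to the surround
    return left, right, center, surround

def decode_block(lt_block, rt_block):
    """Apply the passive matrix to equal-length blocks of samples."""
    return [passive_decode(l, r) for l, r in zip(lt_block, rt_block)]

# A center-panned source appears equally in Lt and Rt, so it produces no
# surround output; an out-of-phase source produces no center output.
in_phase = passive_decode(0.5, 0.5)
out_of_phase = passive_decode(0.5, -0.5)
```

The example also shows the weakness of passive decoding: the center-panned source still leaks fully into left and right, which is the limited separation an active stage is meant to improve.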