• Title/Summary/Keyword: 분산음원

Search Result 39, Processing Time 0.023 seconds

Speech extraction based on AuxIVA with weighted source variance and noise dependence for robust speech recognition (강인 음성 인식을 위한 가중화된 음원 분산 및 잡음 의존성을 활용한 보조함수 독립 벡터 분석 기반 음성 추출)

  • Shin, Ui-Hyeop;Park, Hyung-Min
    • The Journal of the Acoustical Society of Korea
    • /
    • v.41 no.3
    • /
    • pp.326-334
    • /
    • 2022
  • In this paper, we propose a speech enhancement algorithm as a pre-processing step for robust speech recognition in noisy environments. Auxiliary-function-based Independent Vector Analysis (AuxIVA) is performed with a weighted covariance matrix whose time-varying variances are scaled by factors derived from target masks, which represent the time-frequency contributions of the target speech. The mask estimates can be obtained from a Neural Network (NN) pre-trained for speech extraction, or from diffuseness estimated with the Coherence-to-Diffuse power Ratio (CDR), which identifies the direct-sound component of the target speech. In addition, the outputs for omni-directional noise are closely chained by sharing time-varying variances, similarly to independent subspace analysis or IVA. The AuxIVA-based speech extraction method is also carried out in the Independent Low-Rank Matrix Analysis (ILRMA) framework by extending the Non-negative Matrix Factorization (NMF) for noise outputs to Non-negative Tensor Factorization (NTF), which maintains the inter-channel dependency among the noise output channels. Experimental results on the CHiME-4 datasets demonstrate the effectiveness of the presented algorithms.

Underdetermined blind source separation using normalized spatial covariance matrix and multichannel nonnegative matrix factorization (멀티채널 비음수 행렬분해와 정규화된 공간 공분산 행렬을 이용한 미결정 블라인드 소스 분리)

  • Oh, Son-Mook;Kim, Jung-Han
    • The Journal of the Acoustical Society of Korea
    • /
    • v.39 no.2
    • /
    • pp.120-130
    • /
    • 2020
  • This paper addresses underdetermined convolutive mixtures by remedying shortcomings of the multichannel nonnegative matrix factorization technique widely used in blind source separation. In conventional approaches based on the Spatial Covariance Matrix (SCM), each element, composed of values such as the power gain of a single channel and correlation, has high variance, which tends to degrade the quality of the separated sources. In this paper, level and frequency normalization is performed so that the estimated sources can be clustered effectively, and a novel SCM and an effective distance function for cluster pairs are proposed. The proposed SCM is used both to initialize the spatial model and for bottom-up hierarchical agglomerative clustering. The proposed algorithm was evaluated on the 'Signal Separation Evaluation Campaign 2008 development dataset'. Using the 'Blind Source Separation Eval toolbox', an objective tool for assessing source separation quality, improvement was confirmed in most performance indicators; in particular, an improvement of 1 dB to 3.5 dB in SDR, the representative indicator, was verified.

Geophysical Investigation of Gas Hydrate-Bearing Sediments in the Sea of Okhotsk (오호츠크해 가스하이드레이트 퇴적층의 지구물리 탐사)

  • Jin, YoungKeun;Chung, KyungHo;Kim, YeaDong
    • Journal of the Korean Geophysical Society
    • /
    • v.7 no.3
    • /
    • pp.207-215
    • /
    • 2004
  • As a sea connected to the East Sea, the Sea of Okhotsk is the most promising area for gas hydrates in the world. In order to examine the geophysical structures of gas hydrate-bearing sediments in the Sea of Okhotsk, the CHAOS (hydro-Carbon Hydrate Accumulation in the Okhotsk) international research expedition was carried out in August 2003. During the expedition, high-resolution seismic and geochemical surveys were also conducted. Sparker seismic profiles show only diffuse high-amplitude reflections, without bottom-simulating reflectors (BSRs), at the BSR depth. This means that the BSR appears as completely different images on seismic profiles acquired at different frequencies. Many gas chimneys rise from the BSR depth to the seafloor. The chimneys can be divided into two groups with different seismic characteristics: wipe-out (WO) and enhanced reflection (ER) chimneys. The different seismic responses of the chimneys are probably caused by the amount of gas and gas hydrate filling them. In the hydroacoustic data, many gas flares rise several hundred meters from the seafloor into the water column. All flares occurred at depths within the gas hydrate stability zone. This is interpreted to mean that gas hydrate-bearing sediments, whose porosity and permeability are lowered by gas hydrate filling the pore space, form an effective pipe around the gas chimneys, through which gas migrates upward without loss. As a result, large-scale gas flares are released into the water column at sites above gas chimneys.

Distribution of Seagrass (Zostera marina) Beds and High Frequency Backscattering Characteristics by Photosynthesis (잘피 서식지의 분포와 광합성에 의한 고주파 후방산란 특성)

  • Yoon Kwan-Seob;La Hyoung Sul;Na Jungyul;Lee Jae-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.23 no.8
    • /
    • pp.562-569
    • /
    • 2004
  • An experiment was conducted off the coast to observe the distribution of seagrass (Zostera marina) beds and the characteristics of high-frequency backscattering caused by photosynthesis. Acoustic data were acquired as a function of grazing angle and relative azimuth angle over seagrass beds on a sandy-mud bottom. The transmitted source signal was a 120 kHz CW waveform. The distribution of the seagrass beds was mapped using the seagrass backscattering strength as a function of azimuth and grazing angle. The resulting backscattering strength distribution was similar to photographs of the actual seagrass beds. The seagrass backscattering strength was also compared between day and night to verify the effect on acoustic scattering of the photosynthetic oxygen bubbles formed on the seagrass. The results clearly show that observations of the seagrass beds differ between day and night because the photosynthetic oxygen bubbles affect the acoustic scattering.

Error analysis of acoustic target detection and localization using Cramer Rao lower bound (크래머 라오 하한을 이용한 음향 표적 탐지 및 위치추정 오차 분석)

  • Park, Ji Sung;Cho, Sungho;Kang, Donhyug
    • The Journal of the Acoustical Society of Korea
    • /
    • v.36 no.3
    • /
    • pp.218-227
    • /
    • 2017
  • In this paper, an algorithm is proposed for calculating both the bearing and distance errors in target detection and localization, using the Cramer Rao lower bound to estimate the minimum variance of the error in DOA (Direction Of Arrival) estimation. The detection and localization performance of an array depends on the accuracy of the DOA estimate, which is affected by variations in SNR (Signal to Noise Ratio). The SNR is determined by sonar parameters such as SL (Source Level), TL (Transmission Loss), NL (Noise Level), array shape, and beam steering angle. To verify the suggested method, a Monte Carlo simulation was performed to calculate probabilistically the bearing and distance errors as a function of the SNR, which varies with the relative position of the target in space and the noise level.
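
The dependence of bearing error on SNR described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the simple passive sonar equation SNR = SL - TL - NL + AG (the array gain term `ag` is an added assumption) and the textbook single-source Cramer Rao bound for a uniform line array, with bearing measured from broadside. All function and parameter names are hypothetical.

```python
import math


def snr_db(sl, tl, nl, ag=0.0):
    """Passive sonar equation in dB (assumed form): SNR = SL - TL - NL + AG."""
    return sl - tl - nl + ag


def doa_crlb_std_deg(snr_db_value, n_sensors, spacing_over_lambda,
                     theta_deg, n_snapshots=1):
    """Lower bound on the bearing standard deviation, in degrees.

    Assumes one narrowband source and a uniform line array, using the
    textbook bound
        var(theta) >= 6 / (K * SNR * N * (N^2 - 1) * (k*d*cos(theta))^2)
    with theta measured from broadside and k*d = 2*pi*d/lambda.
    """
    snr_linear = 10.0 ** (snr_db_value / 10.0)
    kd = 2.0 * math.pi * spacing_over_lambda
    theta = math.radians(theta_deg)
    variance = 6.0 / (
        n_snapshots * snr_linear * n_sensors * (n_sensors ** 2 - 1)
        * (kd * math.cos(theta)) ** 2
    )
    return math.degrees(math.sqrt(variance))


# Illustrative values only (not taken from the paper).
snr_near = snr_db(sl=150.0, tl=60.0, nl=70.0)   # closer target, lower TL
snr_far = snr_db(sl=150.0, tl=70.0, nl=70.0)    # farther target, higher TL
bound_near = doa_crlb_std_deg(snr_near, n_sensors=16,
                              spacing_over_lambda=0.5, theta_deg=0.0)
bound_far = doa_crlb_std_deg(snr_far, n_sensors=16,
                             spacing_over_lambda=0.5, theta_deg=0.0)
bound_endfire_side = doa_crlb_std_deg(snr_near, n_sensors=16,
                                      spacing_over_lambda=0.5, theta_deg=60.0)
```

Under these assumptions the bound grows as transmission loss rises or as the beam is steered away from broadside, which is the qualitative behaviour the abstract attributes to SNR and beam steering angle.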

Hearing Threshold of Children with Hearing Screening-Passed in Day Care Center and Speech-Language Pathology Clinic (청각선별을 통과한 주간 보호와 언어재활 서비스 수혜 소아의 가청역치)

  • Heo, Seung-Deok
    • Journal of rehabilitation welfare engineering & assistive technology
    • /
    • v.10 no.4
    • /
    • pp.273-278
    • /
    • 2016
  • The threshold level at which a subject responds in hearing screening depends on the noise level of the test surroundings, the physiological characteristics of the hearing organs, excessive exposure to sound sources, and so on. The purpose of this study is to obtain basic information on the hearing threshold level at each frequency in children who passed hearing screening. The subjects were 110 children aged 3.3 to 16.3 years ($9.01 \pm 2.52$) who attended private speech-language pathology clinics and daycare centers. The hearing screening methods were tympanometry, acoustic reflex threshold, automated otoacoustic emission, and pure tone screening. All subjects met the normal criteria of the hearing screening. Differences in hearing threshold among ages and frequencies were analyzed by repeated measures ANOVA. The mean hearing threshold levels at 500, 1,000, 2,000, and 4,000 Hz were $16 \pm 6.49$, $11.5 \pm 4.79$, $6.86 \pm 4.99$, and $5.95 \pm 6.65$ dB HL in the right ear and $15.68 \pm 6.01$, $9.95 \pm 5.24$, $5.72 \pm 5.21$, and $5.63 \pm 7.04$ dB HL in the left ear, respectively. There were significant differences between 500 Hz and 1,000, 2,000, and 4,000 Hz (p=.000), and between 1,000 Hz and 2,000 and 4,000 Hz (p=.000).

Effectiveness of multi-mode surface wave inversion in shallow engineering site investigations (토목관련 천부층 조사에서 다중 모드 표면파 역산의 효과)

  • Feng Shaokong;Sugiyama Takeshi;Yamanaka Hiroaki
    • Geophysics and Geophysical Exploration
    • /
    • v.8 no.1
    • /
    • pp.26-33
    • /
    • 2005
  • Inversion of multi-mode surface-wave phase velocity for shallow engineering site investigation has received much attention in recent years. A sensitivity analysis and inversion of both synthetic and field data demonstrate the greater effectiveness of this method over employing the fundamental mode alone. Perturbation of thickness and shear-wave velocity parameters in multi-modal Rayleigh wave phase velocities revealed that the sensitivities of higher modes: (a) concentrate in different frequency bands, and (b) are greater than those of the fundamental mode for deeper parameters. These observations suggest that multi-mode phase velocity inversion can provide better parameter discrimination and imaging of deep structure, especially with a velocity reversal, than can inversion of fundamental mode data alone. An inversion of the theoretical phase velocities in a model with a low velocity layer at 20 m depth can only image the soft layer when the first higher mode is incorporated. This is especially important when the lowest measurable frequency is only 6 Hz. Field tests were conducted at sites surveyed by borehole and PS logging. At the first site, an array microtremor survey, often used for deep geological surveying in Japan, was used to survey the soil down to 35 m depth. At the second site, linear multichannel spreads with a sledgehammer source were recorded, for an investigation down to 12 m depth. The f-k power spectrum method was applied for dispersion analysis, and velocities up to the second higher mode were observed in each test. The multi-mode inversion results agree well with PS logs, but models estimated from the fundamental mode alone show a large underestimation of the depth to shallow soft layers below artificial fill.

Construction of an Audio Steganography Botnet Based on Telegram Messenger (텔레그램 메신저 기반의 오디오 스테가노그래피 봇넷 구축)

  • Jeon, Jin;Cho, Youngho
    • Journal of Internet Computing and Services
    • /
    • v.23 no.5
    • /
    • pp.127-134
    • /
    • 2022
  • Steganography is a technique for hiding secret messages in various multimedia files, and it is widely exploited for cyber crime and attacks because it is very difficult for third parties other than the sender and receiver to identify the presence of hidden information in communication messages. A botnet typically consists of botmasters, bots, and C&C (Command & Control) servers, and is a botmaster-controlled network with structures such as centralized, distributed (P2P), and hybrid. Recently, in order to enhance the concealment of botnets, research has been actively conducted on the Stego Botnet, which uses SNS platforms instead of C&C servers and performs C&C communication by applying steganography techniques, but the techniques studied so far are oriented toward image or video media. Audio files such as sound sources and recordings are also actively shared on SNS, so research on stego botnets based on audio steganography is needed. Therefore, in this study, we build a stego botnet that performs hidden C&C communication in Telegram Messenger using audio files as the cover medium, and present the results of an experimental comparative analysis of hidden capacity by file type and tool.

S-wave Velocity and Attenuation Structure from Multichannel Seismic surface waves: Geotechnical Characteristics of NakDong Delta Soil (다중채널 표면파 자료를 이용하여 구한 S파 속도와 감쇠지수 구조: 낙동강 하구의 연약 지반 특성)

  • Jung, Hee-Ok
    • Journal of the Korean earth science society
    • /
    • v.25 no.8
    • /
    • pp.774-783
    • /
    • 2004
  • The S-wave velocity and $Q_s^{-1}$ structure of the uppermost part of the soil in the Nakdong Delta area have been obtained to determine the characteristics of the soil. The phase and attenuation coefficients of multichannel seismic records were inverted to obtain the S-wave velocity and $Q_s^{-1}$ structure of the soil. The inversion results have been compared with the borehole measurements of the area. The seismic signal of the geophone nearest to the seismic source was used as the source signal to obtain the attenuation coefficients. Amplitude ratios of the signal at each geophone to the source signal were plotted as a function of distance for the frequency range between 10 Hz and 45 Hz. The slope of the linear regression line that best fits the amplitude ratio-distance relationship at a given frequency was used as the attenuation coefficient for that frequency. The dispersion curve of Rayleigh waves and the attenuation coefficients were inverted to obtain the S-wave velocity and $Q_s^{-1}$, respectively, in the uppermost 8 m of the soil layer. The borehole measurements of the area show that there are two distinct layers: an upper 4 m of silty sand and a lower 4 m of silty clay. The inversion results indicate that the shear wave velocity is 80 m/s in the upper silty-sand layer and 40 m/s in the lower silty-clay layer. The spatial resolution of the shear wave velocity structure is very good down to a depth of 8 m. The $Q_s^{-1}$ is 0.02 in the upper silty-sand layer and increases to 0.03 in the lower silty-clay layer. The spatial resolution of the quality factor is relatively good down to a depth of 5 m, but very poor below that depth. In this study, the S-wave velocity is higher and the $Q_s^{-1}$ is smaller in the silty sand than in the silty clay. However, much more data should be analyzed and accumulated before making any generalization about the shear wave velocity and $Q_s^{-1}$ of the sediments.
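
The regression step described in this abstract can be sketched as follows. This is a minimal illustration, not the author's processing: it assumes that amplitude decays exponentially with distance, so the fit is made to the logarithm of the amplitude ratio, and it ignores geometrical spreading. The function name and the synthetic data are hypothetical.

```python
import math


def attenuation_coefficient(distances_m, amplitude_ratios):
    """Estimate an attenuation coefficient (1/m) at one frequency.

    Fits ln(A(x) / A_source) = intercept - alpha * x by ordinary least
    squares and returns alpha, the negated slope. The log-linear model
    and the neglect of geometrical spreading are assumptions.
    """
    xs = list(distances_m)
    ys = [math.log(r) for r in amplitude_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope


# Synthetic, noise-free example: geophone offsets in meters and amplitude
# ratios generated with a known coefficient (illustrative values only).
true_alpha = 0.05
distances = [2.0, 4.0, 6.0, 8.0, 10.0]
ratios = [math.exp(-true_alpha * x) for x in distances]
alpha_hat = attenuation_coefficient(distances, ratios)
```

Repeating the fit for each frequency between 10 Hz and 45 Hz would give the set of attenuation coefficients that the abstract says were inverted for $Q_s^{-1}$.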