• Title/Summary/Keyword: 음향 정보 (acoustic information)

High-frequency Reverberation Simulation of High-speed Moving Source in Range-independent Ocean Environment (거리독립 해양환경에서 고속이동 음원의 고주파 잔향음 신호모의)

  • Kim, Sunhyo;Lee, Wonbyoung;You, Seung-Ki;Choi, Jee Woong;Kim, Wooshik;Park, Joung Soo;Park, Kyoung Ju
    • The Journal of the Acoustical Society of Korea / v.32 no.2 / pp.104-115 / 2013
  • In a shallow-water waveguide, reverberation signals and their Doppler effects are the primary limitation on sonar system performance. In a reverberation-limited environment, it is therefore necessary to estimate the reverberation level that will be encountered under the conditions in which the sonar system operates. This paper proposes a high-frequency reverberation model capable of simulating the reverberation signals received by a high-speed moving source in a range-independent waveguide. In this model, eigenray information from the source to each boundary is calculated using a ray-based approach and an optimization method for the launch angles. The receiving position, which changes as the source moves, is found by a scattering path-finding algorithm that considers the speed and direction of the source and the sound speed to determine the path of source movement. Scattering from the sea surface and bottom boundaries is modeled with the APL-UW scattering models. The proposed model is verified by comparison with measurements made in August 2010, and it reproduces the statistical properties of the reverberation signals well.

Masked cross self-attentive encoding based speaker embedding for speaker verification (화자 검증을 위한 마스킹된 교차 자기주의 인코딩 기반 화자 임베딩)

  • Seo, Soonshin;Kim, Ji-Hwan
    • The Journal of the Acoustical Society of Korea / v.39 no.5 / pp.497-504 / 2020
  • Constructing speaker embeddings is an important issue in speaker verification. In general, a self-attention mechanism has been applied for speaker embedding encoding. Previous studies focused on training the self-attention in a high-level layer, such as the last pooling layer. In this case, the effect of low-level layers is not well represented in the speaker embedding encoding. In this study, we propose Masked Cross Self-Attentive Encoding (MCSAE) using ResNet. It focuses on training the features of both high-level and low-level layers. Based on multi-layer aggregation, the output features of each residual layer are used for the MCSAE. In the MCSAE, the interdependence of the input features is trained by a cross self-attention module. A random masking regularization module is also applied to prevent overfitting. The MCSAE enhances the weight of frames representing the speaker information. The output features are then concatenated and encoded into the speaker embedding, so that a more informative speaker embedding is obtained. The experimental results showed an equal error rate of 2.63 % on the VoxCeleb1 evaluation dataset, an improvement over the previous self-attentive encoding and state-of-the-art methods.

Optimal design of impeller in fan motor unit of cordless vacuum cleaner for improving flow performance and reducing aerodynamic noise (무선진공청소기 팬 모터 단품의 유량성능 향상과 공력소음 저감을 위한 임펠라 최적설계)

  • Kim, KunWoo;Ryu, Seo-Yoon;Cheong, Cheolung;Seo, Seongjin;Jang, Cheolmin
    • The Journal of the Acoustical Society of Korea / v.39 no.5 / pp.379-389 / 2020
  • In this study, the flow and noise performance of a high-speed fan motor unit for a cordless vacuum cleaner is improved by optimizing the impeller, which drives the suction air through the flow passage of the cleaner. First, the unsteady incompressible Reynolds-averaged Navier-Stokes (RANS) equations are solved using computational fluid dynamics techniques to investigate the flow through the fan motor unit. Based on the flow field results, the Ffowcs-Williams and Hawkings (FW-H) integral equation is used to predict the flow noise radiated from the impeller. The predicted results are compared with measurements, which confirms the validity of the numerical method. A strong vortex is found to form around the mid-chord region of the main blades, where the blade curvature changes rapidly. Because this vortex acts both as a flow loss and as a noise source, the impeller blade is redesigned to suppress it. The response surface method with two factors is employed to determine the optimum inlet and outlet sweep angles for maximum flow rate and minimum noise. Further analysis of the final selected design confirms the improved flow and noise performance.
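
The two-factor response surface step described above can be illustrated with a minimal sketch. It assumes a single response fitted with a full second-order polynomial by least squares; the factor levels and response values are synthetic, not data from the paper, and the function names are hypothetical. The paper optimizes two objectives (flow rate and noise), which this sketch does not attempt to combine.

```python
import numpy as np

def fit_quadratic_surface(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 + b11*x1^2 + b22*x2^2 + b12*x1*x2."""
    design = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

def stationary_point(coeffs):
    """Solve grad(y) = 0 for the fitted quadratic surface."""
    _, b1, b2, b11, b22, b12 = coeffs
    hessian = np.array([[2.0 * b11, b12], [b12, 2.0 * b22]])
    return np.linalg.solve(hessian, -np.array([b1, b2]))

# Synthetic 3x3 factorial design in coded units (assumed, not from the paper).
levels = np.array([-1.0, 0.0, 1.0])
g1, g2 = np.meshgrid(levels, levels)
x1, x2 = g1.ravel(), g2.ravel()
# Hypothetical response with a known maximum at (0.3, -0.2).
y = 5.0 - (x1 - 0.3) ** 2 - 2.0 * (x2 + 0.2) ** 2

coeffs = fit_quadratic_surface(x1, x2, y)
optimum = stationary_point(coeffs)  # approximately [0.3, -0.2]
```

In practice the stationary point should be checked against the sign of the Hessian eigenvalues and against the bounds of the tested factor range before it is treated as an optimum.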

I-vector similarity based speech segmentation for interested speaker to speaker diarization system (화자 구분 시스템의 관심 화자 추출을 위한 i-vector 유사도 기반의 음성 분할 기법)

  • Bae, Ara;Yoon, Ki-mu;Jung, Jaehee;Chung, Bokyung;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.39 no.5 / pp.461-467 / 2020
  • In noisy and multi-speaker environments, speech recognition performance is unavoidably lower than in a clean environment. To improve speech recognition, this paper extracts the signal of the speaker of interest from mixed speech signals containing multiple speakers. The VoiceFilter model is used to effectively separate overlapped speech signals. Clustering by Probabilistic Linear Discriminant Analysis (PLDA) similarity score is employed to detect the speech signal of the speaker of interest, which is then used as the reference speaker for VoiceFilter-based separation. By utilizing the speaker feature extracted from the speech detected by the proposed clustering method, this paper proposes a speaker diarization system that uses only the mixed speech, without an explicit reference speaker signal. A telephone dataset consisting of two speakers is used to evaluate the performance of the speaker diarization system. The Source to Distortion Ratio (SDR) of the operator (Rx) speech and the customer (Tx) speech is 5.22 dB and -5.22 dB, respectively, before separation, and 11.26 dB and 8.53 dB, respectively, after separation by the proposed system.

A Development of Telephone for the Hearing Impaired to Improve Listening Ability of Telephone Speech (난청인의 통화 청취도 향상을 위한 전화기 개발)

  • 이상민;송철규;이영묵;김원기
    • Journal of Biomedical Engineering Research / v.18 no.4 / pp.457-466 / 1997
  • We developed a new hearing aid telephone that helps hearing-impaired people improve their ability to listen to telephone speech. Recently, the number of hearing-impaired people and elderly people with hearing loss has been increasing continuously, and so has their desire to participate in society as producers. They therefore strongly want hearing aid devices that compensate for their handicap. The hearing aid telephone is one of the basic aid devices that helps the hearing impaired communicate well with other people and easily acquire useful information over the phone. We analyze the hearing ability of the hearing impaired, design a new model of the hearing aid telephone, and test the telephone in three areas: electrical performance, word perception, and user testing. Our new telephone has four band-pass filter channels whose center frequencies are 500, 1000, 2000, and 3000 Hz, chosen in consideration of psychoacoustic factors and telephone line characteristics. Hearing-impaired users can adjust the total gain characteristics of the received sound to their hearing ability by setting four volume controls on the telephone. This procedure, called fitting, is a very important factor in enabling the hearing impaired to understand the meaning of speech. The total gain of this telephone is over 20 dB in the range from 250 Hz to 3200 Hz. From the test results, we confirm that our new model is better than older general models at helping the hearing impaired understand the meaning of telephone speech. The next step in developing the hearing aid telephone is to study sidetone and noise compression, frequency band division, selection of the hearing aid pattern, and compensation for psychoacoustic loudness. We expect that an advanced hearing aid telephone can be developed through research on the speech perception characteristics of the hearing impaired from both engineering and clinical perspectives.
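
The per-channel fitting described above can be sketched in software as follows. This is only an illustration under stated assumptions: each FFT bin is assigned to the nearest of the four center frequencies and scaled by that channel's gain, whereas the actual device uses band-pass filters whose band edges and responses are not given in the abstract. The function name and the example gains are hypothetical.

```python
import numpy as np

# Center frequencies stated in the abstract.
CENTER_FREQS_HZ = (500.0, 1000.0, 2000.0, 3000.0)

def apply_band_gains(signal, sample_rate, gains_db, centers=CENTER_FREQS_HZ):
    """Scale each FFT bin by the gain of the channel with the nearest center frequency.

    Nearest-center assignment is an assumption made for this sketch;
    it stands in for the unspecified band-pass filter shapes.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    centers = np.asarray(centers, dtype=float)
    nearest = np.argmin(np.abs(freqs[:, None] - centers[None, :]), axis=1)
    linear_gains = 10.0 ** (np.asarray(gains_db, dtype=float)[nearest] / 20.0)
    return np.fft.irfft(spectrum * linear_gains, n=len(signal))

# Example: boost only the 1000 Hz channel by 20 dB (hypothetical fitting).
fs = 8000
t = np.arange(fs) / fs
tone_1k = np.sin(2.0 * np.pi * 1000.0 * t)
boosted = apply_band_gains(tone_1k, fs, gains_db=[0.0, 20.0, 0.0, 0.0])
```

A 20 dB gain corresponds to a tenfold increase in amplitude, so the 1000 Hz tone in the example comes out ten times larger while a tone in another channel would pass unchanged.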

Identification of the Sectional Distribution of Sound Source in a Wide Duct (넓은 덕트 단면내의 음원 분포 규명)

  • Heo, Yong-Ho;Ih, Jeong-Guon
    • The Journal of the Acoustical Society of Korea / v.33 no.2 / pp.87-93 / 2014
  • If the detailed distribution of pressure and axial velocity at a source plane is identified, the position and strength of the major noise sources can be determined, and the propagation characteristics in the axial direction can be understood well enough to be used for low-noise design. Conventional techniques are usually limited to assuming constant source characteristics over the whole source surface, so the source activity cannot be known in detail. In this work, a method to estimate the pressure and velocity field distribution on the source surface with high spatial resolution is studied. A matrix formulation that includes the evanescent modes is given, and a nearfield measurement method is proposed. A validation experiment is conducted on a wide duct system in which a part of the source plane is excited by an acoustic driver in the absence of airflow. As the number of evanescent modes increases, the prediction of the pressure spectrum becomes more precise, and it has less than -25 dB error with 26 converged evanescent modes within the Helmholtz number range of interest. Using the converged modal amplitudes, the source parameter distribution is restored, and the position of the driver is clearly identified at kR = 1. By applying a regularization technique to the restored result, the unphysical minor peaks at the source plane can be effectively suppressed by filtering the over-estimated pure radial modes.

A study on recognition improvement of velopharyngeal insufficiency patient's speech using various types of deep neural network (심층신경망 구조에 따른 구개인두부전증 환자 음성 인식 향상 연구)

  • Kim, Min-seok;Jung, Jae-hee;Jung, Bo-kyung;Yoon, Ki-mu;Bae, Ara;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.38 no.6 / pp.703-709 / 2019
  • This paper proposes speech recognition systems employing Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) structures combined with a Hidden Markov Model (HMM) to effectively recognize the speech of VeloPharyngeal Insufficiency (VPI) patients, and compares their recognition performance with that of speech recognition systems based on the Gaussian Mixture Model (GMM-HMM) and the fully-connected Deep Neural Network (DNN-HMM). The initial model is trained using normal speakers' speech, and simulated VPI speech is used to generate a prior model for speaker adaptation. For VPI speaker adaptation, selected layers are trained in the CNN-HMM based model, and a dropout regularization technique is applied in the LSTM-HMM based model, showing a 3.68 % improvement in recognition accuracy. The experimental results demonstrate that the proposed LSTM-HMM based speech recognition system is effective for VPI speech with a small amount of speech data, compared to the conventional GMM-HMM and fully-connected DNN-HMM systems.

Acoustic Analysis and Auditory-Perceptual Assessment for Diagnosis of Functional Dysphonia (기능성 음성장애의 진단을 위한 음향학적, 청지각적 평가)

  • Kim, Geun-Hyo;Lee, Yeon-Yoo;Bae, In-Ho;Lee, Jae-Seok;Lee, Chang-Yoon;Park, Hee-June;Lee, Byung-Joo;Kwon, Soon-Bok
    • Journal of Clinical Otolaryngology Head and Neck Surgery / v.29 no.2 / pp.212-222 / 2018
  • Background and Objectives: The purpose of this study was to compare the measured values of acoustic and auditory-perceptual assessments between normal and functional dysphonia (FD) groups. Materials and Methods: A total of 102 subjects with FD and 59 subjects with normal voices participated in this study. The mid-vowel portion of the sustained vowel /a/ and two sentences of 'Sanchaek' were edited, concatenated, and analyzed with a Praat script. Auditory-perceptual (AP) rating was then completed by three listeners. Results: The FD group showed higher values of the acoustic voice quality index version 2.02 and version 3.01 (AVQIv2 and AVQIv3), slope, Hammarberg index (HAM), grade (G), and overall severity (OS) than the normal group. In addition, smoothed cepstral peak prominence in Praat (PraatCPPS), tilt, the ratio of low-to-high spectral band energies (L/H ratio), and long-term average spectrum (LTAS) were lower in the FD group than in the normal voice group. The correlations among the measured values ranged from -0.250 to 0.960. In ROC curve analysis, the cutoff values of AVQIv2, AVQIv3, PraatCPPS, slope, tilt, L/H ratio, HAM, and LTAS were 3.270, 2.013, 13.838, -22.286, -9.754, 369.043, 27.912, and 34.523, respectively, and the AUC was over 0.890 for AVQIv2, AVQIv3, and PraatCPPS, over 0.731 for HAM, tilt, and slope, and over 0.605 for LTAS and L/H ratio. Conclusions: AVQI and CPPS showed the highest predictive power for distinguishing between the normal and FD groups. Acoustic analyses and AP rating, as noninvasive examinations, can reinforce the screening capability for FD and help establish an efficient diagnosis and treatment plan for FD.
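
The ROC analysis mentioned above reports an AUC and a cutoff per measure. A minimal sketch of how such values can be computed is given below. It assumes the cutoff is chosen by maximizing Youden's J (sensitivity + specificity - 1); the abstract does not state which criterion was used. The scores are made-up numbers, not study data, and the function names are hypothetical.

```python
import numpy as np

def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case scores higher than a negative one.

    Ties between a positive and a negative score count as one half.
    """
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))

def youden_cutoff(scores_pos, scores_neg):
    """Return (cutoff, J) maximizing Youden's J, classifying score >= cutoff as positive."""
    pos = np.asarray(scores_pos, dtype=float)
    neg = np.asarray(scores_neg, dtype=float)
    best_cut, best_j = None, -1.0
    for c in np.unique(np.concatenate([pos, neg])):
        sensitivity = np.mean(pos >= c)
        specificity = np.mean(neg < c)
        j = sensitivity + specificity - 1.0
        if j > best_j:  # keep the lowest cutoff among ties
            best_cut, best_j = float(c), float(j)
    return best_cut, best_j

# Made-up index values for a disordered group and a normal group.
disordered = [3.0, 4.0, 5.0, 6.0]
normal = [1.0, 2.0, 3.5, 2.5]
auc = roc_auc(disordered, normal)               # 0.9375
cutoff, j_stat = youden_cutoff(disordered, normal)  # (3.0, 0.75)
```

For measures that are lower in the disordered group, such as CPPS in the abstract, the scores would be negated (or the comparison reversed) before applying the same functions.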

Flow-Induced Noise Prediction for Submarines (잠수함 형상의 유동소음 해석기법 연구)

  • Yeo, Sang-Jae;Hong, Suk-Yoon;Song, Jee-Hun;Kwon, Hyun-Wung;Seol, Hanshin
    • Journal of the Korean Society of Marine Environment & Safety / v.24 no.7 / pp.930-938 / 2018
  • Underwater noise radiated from submarines is directly related to the probability of being detected by the sonar of an enemy vessel. Therefore, minimizing the noise of a submarine is essential for improving survival outcomes. For modern submarines, as the speed and size of a submarine increase and noise reduction technology is developed, interest in flow noise around the hull has been increasing. In this study, a noise analysis technique was developed to predict flow noise generated around a submarine shape considering the free surface effect. When a submarine is operated near a free surface, turbulence-induced noise due to the turbulence of the flow and bubble noise from breaking waves arise. First, to analyze the flow around a submarine, VOF-based incompressible two-phase flow analysis was performed to derive flow field data and the shape of the free surface around the submarine. Turbulence-induced noise was analyzed by applying permeable FW-H, which is an acoustic analogy technique. Bubble noise was derived through a noise model for breaking waves based on the turbulent kinetic energy distribution results obtained from the CFD results. The analysis method developed was verified by comparison with experimental results for a submarine model measured in a Large Cavitation Tunnel (LCT).

A study on loss combination in time and frequency for effective speech enhancement based on complex-valued spectrum (효과적인 복소 스펙트럼 기반 음성 향상을 위한 시간과 주파수 영역 손실함수 조합에 관한 연구)

  • Jung, Jaehee;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.41 no.1 / pp.38-44 / 2022
  • Speech enhancement is performed to improve the intelligibility and quality of noise-corrupted speech. In this paper, speech enhancement performance is compared using different loss functions in the time and frequency domains. The study proposes a combination of loss functions that exploits the advantages of each domain by considering both the details of the spectrum and the speech waveform. Scale Invariant-Source to Noise Ratio (SI-SNR) is used as the time-domain loss function, and Mean Squared Error (MSE), calculated over the complex-valued spectrum and the magnitude spectrum, is used in the frequency domain. The phase loss is obtained using the sine function. The speech enhancement results are evaluated using Source-to-Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI). The resulting spectrograms are also compared to confirm the enhancement results. The experimental results on the TIMIT database show the highest performance when a combination of the SI-SNR and magnitude loss functions is used.
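
The best-performing combination reported above, SI-SNR in the time domain plus a magnitude-spectrum MSE, can be sketched as follows. This is an illustration in NumPy under stated assumptions, not the authors' implementation: the frame length, hop size, Hann window, and the weight `alpha` are all assumed, and the function names are hypothetical.

```python
import numpy as np

def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant signal-to-noise ratio in dB for 1-D waveforms."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to obtain the scaled reference.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    s_target = scale * target
    e_noise = estimate - s_target
    ratio = (np.dot(s_target, s_target) + eps) / (np.dot(e_noise, e_noise) + eps)
    return float(10.0 * np.log10(ratio))

def magnitude_mse(estimate, target, n_fft=256, hop=128):
    """MSE between magnitude spectra of Hann-windowed frames (assumed STFT settings)."""
    window = np.hanning(n_fft)

    def magnitude(x):
        starts = range(0, len(x) - n_fft + 1, hop)
        frames = np.array([x[i:i + n_fft] * window for i in starts])
        return np.abs(np.fft.rfft(frames, axis=1))

    return float(np.mean((magnitude(estimate) - magnitude(target)) ** 2))

def combined_loss(estimate, target, alpha=1.0):
    """Negative SI-SNR (to be minimized) plus a weighted magnitude-spectrum MSE."""
    return -si_snr(estimate, target) + alpha * magnitude_mse(estimate, target)

# Synthetic example: a clean signal and a noisy estimate of it.
rng = np.random.default_rng(0)
clean = rng.standard_normal(2048)
noisy = clean + 0.5 * rng.standard_normal(2048)
loss_value = combined_loss(noisy, clean)
```

SI-SNR ignores the overall gain of the estimate, which is why a spectral term that is sensitive to level, such as the magnitude MSE, is a natural complement to it.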