• Title/Summary/Keyword: Acoustic signals

Shooting sound analysis using convolutional neural networks and long short-term memory (합성곱 신경망과 장단기 메모리를 이용한 사격음 분석 기법)

  • Kang, Se Hyeok; Cho, Ji Woong
    • The Journal of the Acoustical Society of Korea / v.41 no.3 / pp.312-318 / 2022
  • This paper proposes a deep neural network model that classifies the gun type and estimates information about the sound source location. The proposed classification model is composed of convolutional neural networks (CNN) and long short-term memory (LSTM). To train and test the model, we use the Gunshot Audio Forensic Dataset generated by a project supported by the National Institute of Justice (NIJ). The acoustic signals are transformed into Mel-spectrograms, which serve as training and test data for the proposed model. The model is compared with a control model consisting of convolutional neural networks only. The proposed model achieves an accuracy of more than 90 %.
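As a rough illustration of the mel scale behind the Mel-spectrogram mentioned in this abstract, the sketch below converts between hertz and mel and lists band edges that are equally spaced in mel. The abstract does not state which mel formula or filterbank settings were used, so the 2595·log10(1 + f/700) convention, the 0 to 8000 Hz range, and the names `hz_to_mel`, `mel_to_hz`, and `mel_band_edges` are all assumptions made for illustration.

```python
import math

def hz_to_mel(f_hz):
    # One common mel convention (assumed here, not taken from the paper).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(f_min, f_max, n_bands):
    """Return n_bands + 2 edge frequencies (Hz), equally spaced on the mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_bands + 1)
    return [mel_to_hz(m_min + i * step) for i in range(n_bands + 2)]

# Hypothetical settings: 4 bands between 0 Hz and 8 kHz.
edges = mel_band_edges(0.0, 8000.0, 4)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
```

Because the spacing is uniform in mel, the bands widen toward high frequencies in hertz, which is the property a Mel-spectrogram relies on.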

Linear prediction analysis-based method for detecting snapping shrimp noise (선형 예측 분석 기반의 딱총 새우 잡음 검출 기법)

  • Jinuk Park; Jungpyo Hong
    • The Journal of the Acoustical Society of Korea / v.42 no.3 / pp.262-269 / 2023
  • In this paper, we propose a Linear Prediction (LP) analysis-based feature for detecting Snapping Shrimp (SS) Noise (SSN) in underwater acoustic data. SS is a species that creates high-amplitude signals in shallow, warm waters, and its frequent and loud sound is a major source of noise. The proposed feature takes advantage of the characteristic of SSN, which appears suddenly and disappears rapidly, by using LP analysis to detect the exact noise interval and reduce the effects of SSN. The error between the predicted and measured values is large during SSN intervals, which enables effective SSN detection. To further improve performance, a constant false alarm rate detector is incorporated into the proposed feature. Our evaluation shows that the proposed methods outperform the state-of-the-art MultiLayer-Wavelet Packet Decomposition (ML-WPD) in terms of the receiver operating characteristic curve and Area Under the Curve (AUC), with the LP analysis-based feature achieving an AUC that is 0.12 higher on average at lower computational complexity.
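The idea that an impulsive event produces a large linear prediction error can be sketched as follows. This is a minimal illustration, not the authors' feature: the LP coefficients come from the autocorrelation method with a Levinson-Durbin recursion, the background is a synthetic first-order autoregressive signal, and a global median-based threshold stands in for the constant false alarm rate detector named in the abstract. All function names, the LP order, and the threshold factor are assumptions.

```python
import random

def autocorrelation(x, max_lag):
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def lp_coefficients(x, order):
    """Levinson-Durbin recursion; x[n] is predicted by sum(c[k] * x[n-1-k])."""
    r = autocorrelation(x, order)
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]

def prediction_error(x, coeffs):
    order = len(coeffs)
    errors = [0.0] * len(x)
    for n in range(order, len(x)):
        predicted = sum(coeffs[k] * x[n - 1 - k] for k in range(order))
        errors[n] = abs(x[n] - predicted)
    return errors

def detect_impulses(x, order=2, factor=8.0):
    # Simplified global threshold (assumption): factor times the median error.
    errors = prediction_error(x, lp_coefficients(x, order))
    median = sorted(errors)[len(errors) // 2]
    return [n for n, e in enumerate(errors) if e > factor * median], errors

# Synthetic background (hypothetical) with one impulse injected at sample 200.
random.seed(0)
signal = [0.0]
for _ in range(399):
    signal.append(0.5 * signal[-1] + random.gauss(0.0, 1.0))
signal[200] += 30.0
hits, errors = detect_impulses(signal)
peak_index = max(range(len(errors)), key=errors.__getitem__)
```

The sample right after the impulse can also exceed the threshold, because the impulse enters the predictor's input; a practical detector would merge such neighbouring hits into one interval.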

Passive sonar signal classification using attention based gated recurrent unit (어텐션 기반 게이트 순환 유닛을 이용한 수동소나 신호분류)

  • Kibae Lee; Guhn Hyeok Ko; Chong Hyun Lee
    • The Journal of the Acoustical Society of Korea / v.42 no.4 / pp.345-356 / 2023
  • The target signal of a passive sonar shows narrowband harmonic characteristics, with intensity variation within a few seconds and long-term frequency variation due to the Lloyd's mirror effect. We propose a signal classification algorithm based on the Gated Recurrent Unit (GRU) that learns local and global time-series features. The proposed algorithm implements a multilayer network using GRUs and extracts local and global time-series features via dilated connections. We train an attention mechanism to weight the time-series features and classify passive sonar signals. In experiments using public underwater acoustic data, the proposed network achieved a classification accuracy of 96.50 %, which is 4.17 % higher than that of an existing skip-connected GRU network.
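The attention step described in this abstract, weighting time-series features before classification, can be sketched as a softmax over per-step scores. This is a generic dot-product attention pooling, not the authors' network: in a trained model the query vector would be learned and the features would come from the GRU layers, whereas here both are fixed toy values and the names `softmax` and `attention_pool` are assumptions.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, query):
    """Weight each time step by softmax(feature . query) and sum the steps."""
    scores = [sum(f * q for f, q in zip(h, query)) for h in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(w * h[d] for w, h in zip(weights, features)) for d in range(dim)]
    return weights, context

# Toy per-step features (hypothetical stand-ins for GRU outputs).
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
query = [1.0, 1.0]  # would be a learned parameter in a real model
weights, context = attention_pool(features, query)
```

The third time step has the largest score and therefore dominates the pooled context vector that a classifier layer would consume.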

Research on Developing a Conversational AI Callbot Solution for Medical Counselling

  • Won Ro LEE; Jeong Hyon CHOI; Min Soo KANG
    • Korean Journal of Artificial Intelligence / v.11 no.4 / pp.9-13 / 2023
  • In this study, we explored the potential of integrating interactive AI callbot technology into the medical consultation domain as part of a broader service development initiative. Aimed at enhancing patient satisfaction, the AI callbot was designed to efficiently address queries from hospitals' primary users, especially the elderly and those using phone services. By incorporating an AI-driven callbot into the hospital's customer service center, routine tasks such as appointment modifications and cancellations were efficiently managed by the AI Callbot Agent. Tasks requiring more detailed attention or specialization were addressed by Human Agents, ensuring a balanced and collaborative approach. The deep learning model for voice recognition in this study was based on the Transformer model and fine-tuned from a pre-trained model to fit the medical field. Existing recording files were converted into training data, and a self-supervised learning (SSL) model was implemented. An artificial neural network (ANN) model was used to analyze voice signals and interpret them as text, and after actual application, intent recognition was enriched through reinforcement learning to continuously improve accuracy. For Text To Speech (TTS), the Transformer model was applied to text analysis, the acoustic model, and the vocoder, and Google's Natural Language API was applied to recognize intent. As the research progresses, challenges remain to be solved, such as interconnection issues between various EMR providers, problems with doctors' time slots, problems with appointments at two or more hospitals, and problems with patient use. However, there are specialized cases in which reservations are easy to make. The callbot service appears to be immediately applicable in hospitals.

Optimizing Wavelet in Noise Canceler by Deep Learning Based on DWT (DWT 기반 딥러닝 잡음소거기에서 웨이블릿 최적화)

  • Won-Seog Jeong; Haeng-Woo Lee
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.19 no.1 / pp.113-118 / 2024
  • In this paper, we identify an optimal wavelet for a system that cancels background noise in acoustic signals. The system performs a Discrete Wavelet Transform (DWT) instead of the conventional Short Time Fourier Transform (STFT) and then improves noise cancellation performance through a deep learning process. The DWT functions as a multi-resolution band-pass filter and obtains transform parameters by time-shifting the mother wavelet at each level and using several scaled versions of it. Here, the noise cancellation performance of several wavelets was tested to select the most suitable mother wavelet for analyzing speech. To verify the performance of the noise cancellation system for various wavelets, a simulation program using the TensorFlow and Keras libraries was created, and simulation experiments were performed for the four most commonly used wavelets. In the experiments, the Haar and Daubechies wavelets showed the best noise cancellation performance, and the mean square error (MSE) was significantly improved compared with the other wavelets.
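To make the multi-resolution view of the DWT concrete, the sketch below implements the Haar wavelet, one of the wavelets named in the abstract, in plain Python. Each level splits the current approximation band into a low-pass (approximation) half and a high-pass (detail) half. The function names and the sample values are assumptions for illustration, and the deep learning stage of the paper's system is not shown.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_dwt(x):
    """One-level orthonormal Haar DWT of an even-length sequence."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x

def haar_multilevel(x, levels):
    """Repeatedly split the approximation band; details are ordered fine to coarse."""
    approx = list(x)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    return approx, details

samples = [4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 2.0, 0.0]  # hypothetical frame
approx, details = haar_multilevel(samples, 2)
# Reconstruct by inverting the levels from coarse to fine.
restored = haar_idwt(haar_idwt(approx, details[1]), details[0])
```

Because the transform is orthonormal, signal energy is preserved across the coefficient bands, so a mean-square-error criterion can be evaluated in either domain.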

Deep learning-based approach to improve the accuracy of time difference of arrival - based sound source localization (도달시간차 기반의 음원 위치 추정법의 정확도 향상을 위한 딥러닝 적용 연구)

  • Iljoo Jeong; Hyunsuk Huh; In-Jee Jung; Seungchul Lee
    • The Journal of the Acoustical Society of Korea / v.43 no.2 / pp.178-183 / 2024
  • This study introduces an enhanced sound source localization technique, bolstered by a data-driven deep learning approach, to improve the precision and accuracy of direction of arrival estimation. Focused on refining Time Difference Of Arrival (TDOA) based sound source localization, the research hinges on accurately estimating the TDOA from cross-correlation functions. Accurate TDOA estimation remains a limitation in this research field because the values measured by actual microphones are mixed with considerable noise. Additionally, the digitization of acoustic signals introduces quantization errors, associated with the sampling frequency of the measurement system, that limit the precision of TDOA estimation. A deep learning-based approach is designed to overcome these limitations in TDOA accuracy and precision. To validate the method, we conduct comprehensive evaluations using both two- and three-microphone array configurations. Moreover, the feasibility and real-world applicability of the suggested method are further substantiated through experiments conducted in an anechoic chamber.
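The conventional baseline that this abstract sets out to improve, TDOA estimation from the peak of a cross-correlation function, can be sketched in a few lines. The example is an assumption-laden toy: the sampling rate, the pulse shape, and the names `cross_correlation` and `estimate_tdoa` are made up for illustration, and the deep learning refinement itself is not shown.

```python
def cross_correlation(x, y, max_lag):
    """Return {lag: sum over n of x[n] * y[n + lag]} for lags in [-max_lag, max_lag]."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(len(x)):
            j = i + lag
            if 0 <= j < len(y):
                total += x[i] * y[j]
        result[lag] = total
    return result

def estimate_tdoa(x, y, fs, max_lag):
    """Pick the lag with the largest correlation and convert it to seconds."""
    cc = cross_correlation(x, y, max_lag)
    best_lag = max(cc, key=cc.get)
    return best_lag, best_lag / fs

fs = 16000  # hypothetical sampling frequency in Hz
pulse = [0.0, 1.0, 3.0, 1.0, 0.0]
mic1 = [0.0] * 10 + pulse + [0.0] * 25
mic2 = [0.0] * 13 + pulse + [0.0] * 22  # same pulse, arriving 3 samples later
lag, tdoa = estimate_tdoa(mic1, mic2, fs, max_lag=8)
resolution = 1.0 / fs  # smallest TDOA step this integer-lag estimator can report
```

Since the estimate is an integer number of samples, its resolution is one sample period (62.5 µs at 16 kHz), which illustrates the sampling-related quantization limit mentioned in the abstract.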

Effects of vowel types and sentence positions in standard passage on auditory and cepstral and spectral measures in patients with voice disorders (모음 유형과 표준문단의 문장 위치가 음성장애 환자의 청지각적 및 켑스트럼 및 스펙트럼 분석에 미치는 효과)

  • Mi-Hyeon Choi; Seong Hee Choi
    • Phonetics and Speech Sciences / v.15 no.4 / pp.81-90 / 2023
  • Auditory perceptual assessment and acoustic analysis are commonly used in clinical practice for voice evaluation. This study aims to explore the effects of speech task context on auditory perceptual assessment and acoustic measures in patients with voice disorders. Sustained vowel phonations (/a/, /e/, /i/, /o/, /u/, /ɯ/, /ʌ/) and connected speech (a standardized paragraph 'kaeul' and nine sub-sentences) were obtained from a total of 22 patients with voice disorders. GRBAS ('G', 'R', 'B', 'A', 'S') and CAPE-V ('OS', 'R', 'B', 'S', 'P', 'L') auditory-perceptual assessments were performed by two certified speech language pathologists specializing in voice disorders using blinded, randomized voice samples. Additionally, spectral and cepstral measures were analyzed using the Analysis of Dysphonia in Speech and Voice (ADSV) model. Voice quality assessed with the GRBAS scale was not significantly affected by the vowel type except for 'B', while 'OS', 'R' and 'B' in CAPE-V were affected by the vowel type (p<.05). In addition, measurements of CPP and L/H ratio were influenced by vowel types and sentence positions. CPP values in the standard paragraph showed significant negative correlations with all vowels, with the highest correlation observed for the /e/ vowel (r=-.739). The CPP of the second sentence had the strongest correlation with all vowels. Auditory-perceptual assessment with CAPE-V may be more affected by the speech stimulus than assessment with GRBAS, and vowel types and sentence positions with consonants influenced the 'B' scale, CPP, and L/H ratio. When using vowels in the voice assessment of patients with voice disorders, it would be beneficial to use not only /a/ but also the vowel /i/, which is acoustically highly correlated with 'breathy'. In addition, the /e/ vowel was highly correlated acoustically with the standardized passage and sub-sentences. Furthermore, given that most dysphonic signals are aperiodic, the second sentence of the 'kaeul' passage, which is the most acoustically correlated with all vowels, can be used with CPP. These results provide clinical evidence of the impact of speech tasks on auditory perceptual and acoustic measures, which may help to provide guidelines for voice evaluation in patients with voice disorders.

Condition Monitoring of Low Speed Slewing Bearings Based on Ensemble Empirical Mode Decomposition Method (EEMD법을 이용한 저속 선회베어링 상태감시)

  • Caesarendra, W.; Park, J.H.; Kosasih, P.B.; Choi, B.K.
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.23 no.2 / pp.131-143 / 2013
  • Vibration condition monitoring of low-speed rotational slewing bearings has become essential for establishing a proper maintenance schedule for replacing the slewing bearings installed in massive machinery in the steel industry, among other applications. So far, acoustic emission (AE) is still the primary technique used for low-speed bearing cases. Few studies have employed vibration analysis because the signal generated by the impact between the rolling element and natural defect spots at low rotational speeds is generally weak and sometimes buried in noise and other interference frequencies. To increase the impact energy, some researchers generate artificial defects with a predetermined crack length, width, and depth on the inner or outer race surfaces. Consequently, the fault frequency of a particular fault is easy to identify. This paper presents applications of empirical mode decomposition (EMD) and ensemble empirical mode decomposition (EEMD) to vibration signals measured from slewing bearings running at a low rotational speed of 15 rpm. The natural vibration damage data used in this paper were obtained from a Korean industrial company. In this study, EEMD is used to support and clarify the results of the fast Fourier transform (FFT) in identifying bearing fault frequencies.

Inverse Estimation of Geoacoustic Parameters in Shallow Water Using Light Bulb Sound Source (천해환경에서 전구음원을 이용한 지음향인자의 역추정)

  • 한주영; 이성욱; 나정열; 김성일
    • The Journal of the Acoustical Society of Korea / v.23 no.1 / pp.8-16 / 2004
  • An inversion method is presented for determining the compressional wave speed, compressional wave attenuation, sediment layer thickness, and density as a function of depth for a horizontally stratified ocean bottom. An experiment for estimating these properties was conducted in the shallow water of the South Sea of Korea. In the experiment, a light bulb was imploded as the sound source and the propagating sound was measured using a vertical line array (VLA). To estimate the geoacoustic properties, coherent broadband matched field processing combined with a Genetic Algorithm was employed. When a time-dependent signal is very short, the Fourier transform results are not accurate, since the frequency components cannot be located in time and the windowed Fourier transform is limited by the length of the window. However, this becomes possible with the wavelet transform, which yields a time-frequency representation of a signal. In this study, the wavelet transform is used to identify and extract the acoustic components from the multipath time series. The inversion is formulated as an optimization problem that maximizes a cost function defined as the normalized correlation between the measured and modeled signals in the wavelet transform coefficient vector. The experiments and procedures for deploying the light bulbs and the coherent broadband inversion method are described, and the estimated geoacoustic profile in the vicinity of the VLA site is presented.
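The cost function described in this abstract, a normalized correlation between measured and modeled coefficient vectors, can be illustrated with a toy search. Everything specific here is an assumption: `toy_model` is a hypothetical one-parameter forward model standing in for the acoustic propagation model, the vectors are not wavelet coefficients, and an exhaustive search over a few candidates replaces the Genetic Algorithm used in the paper.

```python
import math

def normalized_correlation(measured, modeled):
    """|<measured, modeled>| / (||measured|| * ||modeled||), a value in [0, 1]."""
    dot = sum(m * s for m, s in zip(measured, modeled))
    norm = math.sqrt(sum(m * m for m in measured)) * math.sqrt(sum(s * s for s in modeled))
    return abs(dot) / norm if norm > 0.0 else 0.0

def toy_model(decay, n=8):
    # Hypothetical forward model: one parameter shapes the predicted vector.
    return [math.exp(-decay * i) for i in range(n)]

# Pretend the "measurement" was produced with decay = 0.3.
measured = toy_model(0.3)
candidates = [0.1, 0.2, 0.3, 0.4, 0.5]
costs = {c: normalized_correlation(measured, toy_model(c)) for c in candidates}
best = max(costs, key=costs.get)
```

The normalization makes the cost insensitive to the overall signal level, so only the shape of the modeled vector has to match the measurement.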

Impact Monitoring of Composite Structures using Fiber Bragg Grating Sensors (광섬유 브래그 격자 센서를 이용한 복합재 구조물의 충격 모니터링 기법 연구)

  • Jang, Byeong-Wook; Park, Sang-Oh; Lee, Yeon-Gwan; Kim, Chun-Gon; Park, Chan-Yik; Lee, Bong-Wan
    • Composites Research / v.24 no.1 / pp.24-30 / 2011
  • Low-velocity impacts can cause various types of damage that are mostly hidden inside the laminate or occur on the opposite side. Such damage therefore cannot be easily detected by visual inspection or conventional non-destructive testing (NDT) systems. If it occurs between scheduled NDT inspections, the likelihood of extensive damage or structural failure increases. For these reasons, built-in NDT systems such as real-time impact monitoring systems will be required in the near future. In this paper, we study an impact monitoring system consisting of impact location detection and damage assessment techniques for composite flat and stiffened panels. To acquire the impact-induced acoustic signals, four multiplexed FBG sensors and a high-speed FBG interrogator were used. Neural networks and wavelet transforms were adopted to detect impacts and damage occurrence. Finally, these algorithms were implemented in MATLAB and LabVIEW to provide a user-friendly interface.