• Title/Summary/Keyword: acoustic identification (음향 식별)


Restoration of damaged speech files using deep neural networks (심층 신경망을 활용한 손상된 음성파일 복원 자동화)

  • Heo, Hee-Soo;So, Byung-Min;Yang, IL-Ho;Yoon, Sung-Hyun;Yu, Ha-Jin
    • The Journal of the Acoustical Society of Korea / v.36 no.2 / pp.136-143 / 2017
  • In this paper, we propose a method for restoring damaged audio files using a deep neural network. Unlike conventional restoration based on file carving, our method aims to infer lost information that existing techniques such as file carving cannot restore. We devised methods to automate tasks that are essential for restoration but unsuitable for humans to perform. The study shows that damaged files that conventional file carving could not restore can be recovered by using deep-neural-network tasks such as speech/non-speech decision and speech encoder recognition.

Variable Length Optimum Convergence Factor Algorithm for Adaptive Filters (적응 필터를 위한 가변 길이 최적 수렴 인자 알고리듬)

  • Boo, In-Hyoung;Kang, Chul-Ho
    • The Journal of the Acoustical Society of Korea / v.13 no.4 / pp.77-85 / 1994
  • In this study, we propose an adaptive algorithm with an optimum convergence factor for the steepest descent method that automatically adjusts the filter order to an appropriate level. Until now, adaptive signal processing applications have used fixed-order adaptive filters, with the order chosen from prior knowledge or experience. Because the required filter order is difficult to know in real implementations, high-order filters have to be used, which increases redundant computation. The proposed variable length optimum convergence factor (VLOCF) algorithm selects an appropriate filter order within the given maximum, reducing redundant computation while improving convergence speed and lowering the steady-state convergence error. The validity of the proposed algorithm is evaluated by computer simulation of system identification.


Speech Recognition on Korean Monosyllable using Phoneme Discriminant Filters (음소판별필터를 이용한 한국어 단음절 음성인식)

  • Hur, Sung-Phil;Chung, Hyun-Yeol;Kim, Kyung-Tae
    • The Journal of the Acoustical Society of Korea / v.14 no.1 / pp.31-39 / 1995
  • In this paper, we construct phoneme discriminant filters (PDF) based on the linear discriminant function. These filters are obtained not from experts' heuristic rules but from mathematical methods with iterative learning. The proposed system is based on a piecewise linear classifier and an error-correction learning method. The PDF carries out speech segmentation and phoneme classification simultaneously. Because each filter operates independently, some speech intervals may produce multiple outputs, so we introduce unified coefficients through an output unification process. The output sometimes also contains an insensitive region that shows no response, so we propose time windows and median filters to remove such problems. We trained the system with 549 monosyllables uttered 3 times by 3 male speakers. After the endpoints of the speech signal are detected using a threshold value and the zero crossing rate, the PDF separates vowels from consonants, and the selected phoneme then passes through the following PDF. Finally, the system unifies the outputs for competitive regions and insensitive areas using the time window and median filter.
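
The endpoint detection step mentioned in this abstract (a threshold value combined with the zero crossing rate) can be illustrated with a minimal sketch. The frame length, both thresholds, and the rule of extending the energy-based boundaries over adjacent high-ZCR frames are assumptions chosen for illustration; the abstract does not give the authors' actual settings.

```python
import math

def frame_features(signal, frame_len):
    """Return (mean energy, zero-crossing rate) for each non-overlapping frame."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats

def detect_endpoints(signal, frame_len=80, energy_thr=0.01, zcr_thr=0.3):
    """Return (start, end) sample indices of the detected speech, or None.

    Assumed rule: frames above the energy threshold mark the core of the
    utterance; the boundaries are then pushed outward over neighbouring
    frames whose zero-crossing rate is high (possible unvoiced sounds).
    """
    feats = frame_features(signal, frame_len)
    voiced = [i for i, (energy, _) in enumerate(feats) if energy > energy_thr]
    if not voiced:
        return None
    first, last = voiced[0], voiced[-1]
    while first > 0 and feats[first - 1][1] > zcr_thr:
        first -= 1
    while last < len(feats) - 1 and feats[last + 1][1] > zcr_thr:
        last += 1
    return first * frame_len, (last + 1) * frame_len

# Synthetic example: silence, a 200 Hz tone sampled at 8 kHz, silence.
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(800)]
signal = [0.0] * 400 + tone + [0.0] * 400
print(detect_endpoints(signal))  # (400, 1200)
```

Real recordings contain background noise, so in practice the thresholds would be estimated from a stretch of known silence rather than fixed in advance.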


A Study on the Underwater Target Detection Using the Waveform Inversion Technique (파형역산 기법을 이용한 수중표적 탐지 연구)

  • Bae, Ho Seuk;Kim, Won-Ki;Kim, Woo Shik;Choi, Sang Moon
    • The Journal of the Acoustical Society of Korea / v.34 no.6 / pp.487-492 / 2015
  • Short-range underwater target detection and identification techniques using mid- and high-frequency bands are highly developed. Long-range detection using the low-frequency band, however, is now in demand and remains one of the most challenging issues. The waveform inversion technique is widely used and is among the most actively studied technologies in both academia and industry for seismic exploration. It is based on numerical analysis and can reconstruct subsurface structures over more than a few kilometers, along with model parameters such as P-wave velocity, using a low-frequency band. By applying this technique to the underwater acoustic environment, we first verify its applicability to underwater target detection. Furthermore, the subsurface structures of the battlefield and their parameters are well reconstructed. We confirm that the technique greatly reduces the false-alarm rate for underwater targets because it accurately reproduces both the shape and the model parameters at the same time.

A Study on the Automatic Lexical Acquisition for Multi-lingustic Speech Recognition (다국어 음성 인식을 위한 자동 어휘모델의 생성에 대한 연구)

  • 지원우;윤춘덕;김우성;김석동
    • The Journal of the Acoustical Society of Korea / v.22 no.6 / pp.434-442 / 2003
  • Software internationalization, the process of making software easier to localize for specific languages, has deep implications when applied to speech technology, where the goal of the task lies in the very essence of the particular language. A great deal of work and fine-tuning has gone into language processing software based on ASCII or a single language such as English, which makes porting to other languages difficult. The inherent identity of a language manifests itself in its lexicon, where its character set, phoneme set, and pronunciation rules are revealed. We propose decomposing the lexicon building process into four discrete, sequential steps. As preprocessing for building a lexical model, we convert text from the language-specific code to Unicode. The four steps are then: (step 1) transliterating code points from Unicode; (step 2) phonetically standardizing rules; (step 3) implementing grapheme-to-phoneme rules; (step 4) implementing phonological processes.
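
A rough sketch of the front of this pipeline, using Korean as the example language: the preprocessing converts a legacy encoding to Unicode, step 1 is approximated here by Unicode NFD normalization (which splits each Hangul syllable into conjoining jamo), and a grapheme-to-phoneme lookup stands in for step 3. The choice of EUC-KR as the source encoding, the use of NFD for transliteration, and the tiny `JAMO_TO_PHONE` table are all illustrative assumptions, not the authors' implementation; phonetic standardization and phonological processes (steps 2 and 4) are omitted.

```python
import unicodedata

# Hypothetical toy table covering only the jamo used in the example below;
# a real lexicon would cover the full phoneme set and context-dependent rules.
JAMO_TO_PHONE = {
    "\u1112": "h",  # initial HIEUH
    "\u1100": "g",  # initial KIYEOK
    "\u1161": "a",  # vowel A
    "\u116e": "u",  # vowel U
    "\u11ab": "n",  # final NIEUN
    "\u11a8": "k",  # final KIYEOK
}

def to_unicode(raw, encoding):
    """Preprocessing: decode language-specific bytes into a Unicode string."""
    return raw.decode(encoding)

def transliterate(text):
    """Step 1 (approximated): split precomposed syllables into conjoining jamo."""
    return list(unicodedata.normalize("NFD", text))

def to_phones(jamo_seq):
    """Step 3 (toy version): context-free grapheme-to-phoneme lookup."""
    return [JAMO_TO_PHONE.get(j, "?") for j in jamo_seq]

# "\ud55c\uad6d" is the two-syllable word 한국, stored here as EUC-KR bytes.
legacy_bytes = "\ud55c\uad6d".encode("euc_kr")
jamo = transliterate(to_unicode(legacy_bytes, "euc_kr"))
print(to_phones(jamo))  # ['h', 'a', 'n', 'g', 'u', 'k']
```

Keeping each stage as a separate function mirrors the decomposition described in the abstract: only the encoding name and the lookup tables need to change when moving to another language.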

Feature Vector Extraction Method for Transient Sonar Signals Using PR-QMF Wavelet Transform (PR-QMF Wavelet Transform을 이용한 천이 수중 신호의 특징벡타 추출 기법)

  • Jung, Yong-Min;Choi, Jong-Ho;Cho, Yong-Soo;Oh, Won-Tcheon
    • The Journal of the Acoustical Society of Korea / v.15 no.1 / pp.87-92 / 1996
  • Underwater transient signals have several characteristics, namely short duration, strong nonstationarity, and various types of transient sources, that make them difficult to analyze and classify. In this paper, we discuss a feature vector extraction method for transient sonar signals that applies digital signal processing methods to transient signal analysis. We propose feature vector extraction methods using the wavelet transform, which yield a better recognition rate than automatic classification using the classical method. Simulation confirms that the proposed wavelet-based method outperforms the classical method even with a smaller number of feature vectors. In particular, the feature vector extraction method using the PR-QMF wavelet transform with Daubechies coefficients is shown to perform well in noisy environments and to be easy to implement.
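
To make the idea of a wavelet-based feature vector concrete, the sketch below runs a two-channel perfect-reconstruction filter bank built from the 4-tap Daubechies coefficients and uses the energy in each subband as a feature. Using subband energies as the features, the periodic signal extension, and the number of levels are assumptions for illustration; the abstract does not specify which features the authors extract.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# 4-tap Daubechies low-pass filter and its quadrature-mirror high-pass partner.
H = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM, (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
G = [H[3], -H[2], H[1], -H[0]]

def analysis_step(x):
    """One analysis stage: filter with H and G, then downsample by 2.

    The signal is extended periodically, so len(x) must be even.
    """
    n = len(x)
    approx = [sum(H[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    detail = [sum(G[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    return approx, detail

def subband_energy_features(x, levels=3):
    """Energies of the detail bands (finest first), then the final approximation band.

    len(x) should be divisible by 2**levels.
    """
    features = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = analysis_step(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features

# A slow sinusoid with a one-sample click standing in for a transient.
signal = [math.sin(0.3 * n) + (1.0 if n == 20 else 0.0) for n in range(64)]
print(subband_energy_features(signal, levels=3))
```

Because the filter pair is orthonormal, the subband energies add up to the energy of the input, so the feature vector is a lossless partition of signal energy across scales.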


An Enhanced Affine Projection Sign Algorithm in Impulsive Noise Environment (충격성 잡음 환경에서 개선된 인접 투사 부호 알고리즘)

  • Lee, Eun Jong;Chung, Ik Joo
    • The Journal of the Acoustical Society of Korea / v.33 no.6 / pp.420-426 / 2014
  • In this paper, we propose a new affine projection sign algorithm (APSA) that improves the convergence speed of the conventional APSA, which was proposed to make the affine projection algorithm (APA) robust in impulsive noise environments. The conventional APSA has two advantages: it operates robustly against impulsive noise and does not require a matrix inversion. The proposed algorithm retains these advantages and also converges faster. In the conventional algorithm, each input signal is normalized by the $l_2$-norm of all input signals, whereas the proposed algorithm normalizes each input signal by its own $l_2$-norm. We compared the performance of the proposed and conventional algorithms using a system identification model. The results show that the proposed algorithm converges faster than the conventional algorithm.
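
The difference between the two normalizations can be sketched in a small system identification experiment. With `per_vector_norm=False` the code follows the commonly published APSA update, in which the sign-weighted sum of the recent input vectors is divided by its own $l_2$-norm. With `per_vector_norm=True` each input vector is divided by its own $l_2$-norm before summing, which is only one possible reading of the abstract and should not be taken as the authors' exact update. The filter length, projection order, step size, and noise-free white input are likewise assumptions.

```python
import math
import random

def sign(v):
    return (v > 0) - (v < 0)

def apsa_identify(w_true, n_iter=3000, order=2, mu=0.005,
                  per_vector_norm=False, seed=0):
    """Estimate w_true from white Gaussian input with a sign-type affine projection update."""
    rng = random.Random(seed)
    m = len(w_true)
    w = [0.0] * m
    x_hist = [0.0] * (m + order - 1)  # most recent input sample first
    eps = 1e-8
    for _ in range(n_iter):
        x_hist = [rng.gauss(0.0, 1.0)] + x_hist[:-1]
        # The `order` most recent input vectors, each of length m.
        regs = [x_hist[k:k + m] for k in range(order)]
        # A priori errors: desired outputs of the unknown system minus filter outputs.
        errs = [sum((wt - wi) * xi for wt, wi, xi in zip(w_true, w, x)) for x in regs]
        if per_vector_norm:
            # Assumed variant: normalize each input vector by its own l2-norm.
            regs = [[xi / (math.sqrt(sum(v * v for v in x)) + eps) for xi in x]
                    for x in regs]
            g = [sum(sign(e) * x[j] for e, x in zip(errs, regs)) for j in range(m)]
        else:
            # Conventional APSA: normalize the sign-weighted sum as a whole.
            g = [sum(sign(e) * x[j] for e, x in zip(errs, regs)) for j in range(m)]
            g_norm = math.sqrt(sum(v * v for v in g)) + eps
            g = [v / g_norm for v in g]
        w = [wi + mu * gi for wi, gi in zip(w, g)]
    return w

w_true = [0.5, -0.3, 0.2, 0.1]
for flag in (False, True):
    w_est = apsa_identify(w_true, per_vector_norm=flag)
    misalignment = math.sqrt(sum((a - b) ** 2 for a, b in zip(w_true, w_est)))
    print(flag, misalignment)
```

Neither branch inverts a matrix, which is the computational advantage the abstract attributes to APSA over APA.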

A Subband Structured Digital Hearing Aid Design for Compensating Sensorineural Hearing Loss (감음성 난청 보상을 위한 부밴드 구조 디지털 보청기 설계)

  • Park Jo-Dong;Choi Hun;Bae Hveon-Deok
    • The Journal of the Acoustical Society of Korea / v.24 no.5 / pp.238-247 / 2005
  • In this paper, we present subband design techniques for the compensating filter and adaptive feedback canceller of a digital hearing aid. Sensorineural hearing loss has a hearing threshold with a nonlinear characteristic in the frequency domain, and its compensation suffers from an echo produced by an undesired time-varying feedback path. The digital hearing aid therefore requires a compensator that can adjust gains nonlinearly across frequency bands and eliminate the echo rapidly. In the proposed digital hearing aid, the compensating filter is designed by the adaptive system identification method in a subband structure, and the adaptive feedback canceller is designed by the subband affine projection algorithm. The designed compensation filter can control the nonlinear gain in each subband separately, so precise compensation is possible, and the feedback canceller using the subband adaptive filter achieves a fast convergence rate. The performance of the proposed method is verified by computer simulations in comparison with previous approaches.

Separation of passive sonar target signals using frequency domain independent component analysis (주파수영역 독립성분분석을 이용한 수동소나 표적신호 분리)

  • Lee, Hojae;Seo, Iksu;Bae, Keunsung
    • The Journal of the Acoustical Society of Korea / v.35 no.2 / pp.110-117 / 2016
  • Passive sonar systems detect and classify targets by analyzing the noise radiated from vessels. If multiple noise sources exist within the sonar detection range, classifying each source becomes difficult because a mixture of the sources is observed. To overcome this problem, beamforming is used to separate noise sources spatially, though it has various limitations. In this paper, we propose a new method that uses FDICA (Frequency Domain Independent Component Analysis) to separate noise sources from the mixture. For the experiments, each noise source signal was synthesized by considering features such as machinery tonal components and propeller tonal components. The results before and after separation were compared using LOFAR (Low Frequency Analysis and Recording) and DEMON (Detection Envelope Modulation On Noise) analysis.

Segmentation of underwater images using morphology for deep learning (딥러닝을 위한 모폴로지를 이용한 수중 영상의 세그먼테이션)

  • Ji-Eun Lee;Chul-Won Lee;Seok-Joon Park;Jea-Beom Shin;Hyun-Gi Jung
    • The Journal of the Acoustical Society of Korea / v.42 no.4 / pp.370-376 / 2023
  • In underwater images, the shape of the target is hard to distinguish because of underwater noise and low resolution. In addition, underwater images used as deep learning input require pre-processing, and segmentation must come first. Even after pre-processing, the target is not clear, and deep learning detection and identification performance may not be high. It is therefore necessary to distinguish and clarify the target. In this study, we confirm the importance of target shadows in underwater images, perform object detection and target area acquisition using shadows, and generate data containing only the shapes of targets and shadows without the underwater background. We present the process of converting the shadow image into a 3-mode image in which the target is white, the shadow is black, and the background is gray. This provides a clearly pre-processed, easily discriminated image as deep learning input. In addition, when image processing code using the Open Source Computer Vision (OpenCV) Library was used, the processing speed was also suitable for real-time processing.
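
The 3-mode conversion can be sketched without any image library as two thresholded masks cleaned by a morphological opening. The paper itself uses OpenCV; the pure-Python version below, the two intensity thresholds, and the 3x3 structuring element are assumptions made so the example is self-contained, not the authors' pipeline.

```python
def erode(mask):
    """3x3 binary erosion; pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    return [[int(all(0 <= r + dr < h and 0 <= c + dc < w and mask[r + dr][c + dc]
                     for dr in (-1, 0, 1) for dc in (-1, 0, 1)))
             for c in range(w)] for r in range(h)]

def dilate(mask):
    """3x3 binary dilation; pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    return [[int(any(0 <= r + dr < h and 0 <= c + dc < w and mask[r + dr][c + dc]
                     for dr in (-1, 0, 1) for dc in (-1, 0, 1)))
             for c in range(w)] for r in range(h)]

def opening(mask):
    """Erosion followed by dilation: removes specks smaller than the 3x3 element."""
    return dilate(erode(mask))

def to_three_mode(image, shadow_max=60, target_min=180):
    """Map a grayscale image (0-255) to target=255, shadow=0, background=128.

    shadow_max and target_min are hypothetical thresholds.
    """
    target = opening([[int(p >= target_min) for p in row] for row in image])
    shadow = opening([[int(p <= shadow_max) for p in row] for row in image])
    return [[255 if t else 0 if s else 128 for t, s in zip(t_row, s_row)]
            for t_row, s_row in zip(target, shadow)]

# 7x7 example: a bright 3x3 target, a dark 3x3 shadow, and one bright noise pixel.
image = [[100] * 7 for _ in range(7)]
for r in range(1, 4):
    for c in range(1, 4):
        image[r][c] = 200
for r in range(3, 6):
    for c in range(4, 7):
        image[r][c] = 20
image[5][1] = 200  # isolated speck, removed by the opening
for row in to_three_mode(image):
    print(row)
```

With OpenCV the same structure would typically be expressed with its thresholding and morphology functions, which operate on whole arrays and are much faster than these nested loops.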