Search | Korea Science

Audio signal clustering and separation using a stacked autoencoder (복층 자기부호화기를 이용한 음향 신호 군집화 및 분리)

Jang, Gil-Jin
- The Journal of the Acoustical Society of Korea / v.35 no.4 / pp.303-309 / 2016
This paper proposes a novel approach to audio signal clustering using a stacked autoencoder. The stacked autoencoder learns an efficient representation of the input signal, which allows constituent signals with similar characteristics to be clustered, so that the original sources can be separated based on the clustering results. An STFT (Short-Time Fourier Transform) is performed to extract the time-frequency spectrum, and rectangular windows at all possible locations are used as input to the autoencoder. The outputs of the middle (encoding) layer are used to cluster the rectangular windows, and the original sources are separated by Wiener filters derived from the clustering results. Source separation experiments were carried out in comparison with conventional NMF (Non-negative Matrix Factorization), and the sources estimated by the proposed method represent the characteristics of the original sources well, as shown in the time-frequency representation.
https://doi.org/10.7776/ASK.2016.35.4.303
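
As a rough illustration of the last step in the abstract above, Wiener-style gains can be computed once each cluster has been turned into a per-source power estimate on the time-frequency grid. The sketch below assumes those power estimates already exist; the function names and the toy values are hypothetical, and how the paper accumulates power from the clustered windows is not stated in the abstract.

```python
def wiener_masks(source_powers, eps=1e-12):
    """Return one soft mask per source.

    source_powers: list of K power spectrograms, each a list of rows
    (frequency bins) of non-negative floats (time frames).
    Assumption: these estimates come from the clustering stage.
    """
    n_rows = len(source_powers[0])
    n_cols = len(source_powers[0][0])
    masks = [[[0.0] * n_cols for _ in range(n_rows)] for _ in source_powers]
    for r in range(n_rows):
        for c in range(n_cols):
            # Total estimated power in this time-frequency cell.
            total = sum(p[r][c] for p in source_powers) + eps
            for k, p in enumerate(source_powers):
                masks[k][r][c] = p[r][c] / total
    return masks


def apply_mask(mask, mixture):
    """Scale each (possibly complex) mixture STFT cell by the mask gain."""
    return [[g * x for g, x in zip(g_row, x_row)]
            for g_row, x_row in zip(mask, mixture)]


# Toy example: two sources, one frequency bin, two time frames.
power_a = [[3.0, 0.0]]
power_b = [[1.0, 2.0]]
mixture = [[4 + 0j, 2j]]
mask_a, mask_b = wiener_masks([power_a, power_b])
estimate_a = apply_mask(mask_a, mixture)
estimate_b = apply_mask(mask_b, mixture)
```

Because the gains in each cell sum to (almost exactly) one, the source estimates add back up to the mixture, which is the usual property of this kind of soft masking.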

A clutter reduction algorithm based on clustering for active sonar systems (능동소나 시스템을 위한 군집화 기반의 클러터 제거 기법)

Kwak, ChulHyun;Cheong, Myoung Jun;Ahn, Jae-Kyun
- The Journal of the Acoustical Society of Korea / v.35 no.2 / pp.149-157 / 2016
In this paper, we propose a new clutter reduction algorithm, based on a clustering method, for the heavy clutter density found in shallow water environments. First, it applies density-based clustering to active sonar measurements, taking into account the speed of targets, pulse repetition intervals, etc. Clustered measurements are treated as target candidates, and unclustered measurements are removed as noise. After clustering, target and clutter measurements are classified by a validation check method. We evaluate the performance of the proposed algorithm on synthetic data and sea-trial data. The results demonstrate that the proposed algorithm reduces clutter significantly better than the conventional algorithm.
https://doi.org/10.7776/ASK.2016.35.2.149
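
The idea of keeping clustered measurements as target candidates and discarding unclustered ones as noise can be sketched with a generic DBSCAN-style routine. This is only an illustration under stated assumptions: it uses plain Euclidean distance on 2-D points, whereas the abstract says the authors account for target speed and pulse repetition intervals in a way that is not described there. The function name, the parameters `eps` and `min_pts`, and the sample points are all hypothetical.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster index, or -1 for noise."""
    NOISE, UNSEEN = -1, None
    labels = [UNSEEN] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reach)
        cluster += 1
    return labels


# Two dense groups of measurements plus one isolated measurement.
measurements = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(measurements, eps=1.5, min_pts=3)
candidates = [p for p, lab in zip(measurements, labels) if lab != -1]
```

Here `candidates` holds the measurements that survive the noise-removal step; the later validation check that separates targets from clutter is not modeled.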

The reliability analysis of Acoustic Emission(AE) testing for crack detectivity by sensors and materials (AE(음향방출) 검사 시 센서 및 재료에 따른 균열 검출능에 대한 신뢰성 분석)

Nam, Jun-Young;Lee, Sang-Yun;Hwang, Woong-Gi;Lee, Bo-Young
- Proceedings of the Computational Structural Engineering Institute Conference / 2011.04a / pp.419-423 / 2011
Unlike other non-destructive inspection methods, AE can detect structural defects that are likely to grow during operation, and because it allows continuous monitoring of large structures, it has been widely used to evaluate stability. AE sensors detect sound waves in the range of 20 kHz to 20 MHz, and the results may vary depending on the sensor's sensitivity. In this paper, tensile tests were conducted on STS 304 and SS400 in order to detect crack signals. The specimens were tested with sensors of different sensitivity under the same tensile test conditions. The crack signal parameters were divided into four groups by cluster analysis. Statistical testing of the null hypothesis showed that the crack signals from the two sensors do not differ. Based on these results, the waveforms recorded in this tensile test are crack signals.

A method for localization of multiple drones using the acoustic characteristic of the quadcopter (쿼드콥터의 음향 특성을 활용한 다수의 드론 위치 추정법)

In-Jee Jung;Wan-Ho Cho;Jeong-Guon Ih
- The Journal of the Acoustical Society of Korea / v.43 no.3 / pp.351-360 / 2024
With the increasing use of drone technology, the Unmanned Aerial Vehicle (UAV) is now being utilized in various fields. However, this increased use of drones has resulted in various issues. Due to its small size, the drone is difficult to detect with radar or optical equipment, so acoustical tracking methods have been recently applied. In this paper, a method of localization of multiple drones using the acoustic characteristics of the quadcopter drone is suggested. Because the acoustic characteristics induced by each rotor are differentiated depending on the type of drone and its movement state, the sound source of the drone can be reconstructed by spatially clustering the results of the estimated positions of the blade passing frequency and its harmonic sound source. The reconstructed sound sources are utilized to finally determine the location of multiple-drone sound sources by applying the source localization algorithm. An experiment is conducted to analyze the acoustic characteristics of the test quadcopter drones, and the simulations for three different types of drones are conducted to localize the multiple drones based on the measured acoustic signals. The test result shows that the location of multiple drones can be estimated by utilizing the acoustic characteristics of the drone. Also, one can see that the clarity of the separated drone sound source and the source localization algorithm affect the accuracy of the localization for multiple-drone sound sources.
https://doi.org/10.7776/ASK.2024.43.3.351

A DB Pruning Method in a Large Corpus-Based TTS with Multiple Candidate Speech Segments (대용량 복수후보 TTS 방식에서 합성용 DB의 감량 방법)

Lee, Jung-Chul;Kang, Tae-Ho
- The Journal of the Acoustical Society of Korea / v.28 no.6 / pp.572-577 / 2009
Large corpus-based concatenating Text-to-Speech (TTS) systems can generate natural synthetic speech without additional signal processing. To prune the redundant speech segments in a large speech segment DB, we can use a decision-tree based triphone clustering algorithm that is widely used in speech recognition. However, the conventional methods have problems in representing the acoustic transitional characteristics of the phones and in applying context questions with hierarchical priority. In this paper, we propose a new clustering algorithm to downsize the speech DB. First, three 13th-order MFCC vectors from the first, medial, and final frames of a phone are combined into a 39-dimensional vector to represent the transitional characteristics of the phone. Then three hierarchically grouped question sets are used to construct the triphone trees. For the performance test, we used the DTW algorithm to calculate the acoustic similarity between the target triphone and the triphone returned by the tree search. Experimental results show that the proposed method can reduce the size of the speech DB by 23% and select better phones with higher acoustic similarity. Therefore, the proposed method can be applied to build a small-sized TTS.
https://doi.org/10.7776/ASK.2009.28.6.572
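
The 39-dimensional transitional vector and the DTW comparison mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: taking the middle index as the "medial" frame, using Euclidean frame distance inside DTW, and the names `phone_vector` and `dtw_distance` are all assumptions, and the MFCC values are placeholders rather than real features.

```python
import math


def phone_vector(frames, order=13):
    """Concatenate the first, medial, and final MFCC frames of a phone.

    frames: list of MFCC vectors (each of length `order`).
    Assumption: the medial frame is the one at index len(frames) // 2.
    """
    if not frames:
        raise ValueError("a phone needs at least one frame")
    if any(len(f) != order for f in frames):
        raise ValueError("every frame must have %d coefficients" % order)
    picked = (frames[0], frames[len(frames) // 2], frames[-1])
    return [c for frame in picked for c in frame]


def dtw_distance(a, b):
    """Classic dynamic time warping cost between two frame sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


# Placeholder phone with five frames of 13 coefficients each.
frames = [[float(i)] * 13 for i in range(5)]
vec = phone_vector(frames)

# The same contour at two different speaking rates aligns at zero cost.
slow = [[0.0], [0.0], [1.0]]
fast = [[0.0], [1.0]]
cost_same_shape = dtw_distance(slow, fast)
```

A lower DTW cost corresponds to higher acoustic similarity between the target triphone and the candidate selected from the tree.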

Analytic Verification of Optimal Degaussing Technique using a Scaled Model Ship (축소 모델 함정을 이용한 소자 최적화 기법의 해석적 검증)

Cho, Dong-Jin
- Journal of the Korean Magnetics Society / v.27 no.2 / pp.63-69 / 2017
Naval ships are particularly required to maintain acoustic and magnetic silence because of their operational characteristics. In particular, the underwater magnetic field signals produced by ships are likely to be detected at close range by threats such as surveillance systems and mine systems. To increase the survivability of vessels, various techniques for reducing the magnetic field signal are being studied, and it is necessary to consider not only the magnitude of the magnetic field signal but also its gradient. In this paper, we use a commercial electromagnetic finite element analysis tool to predict the induced magnetic field signal of a scaled ship model and to arrange the degaussing coils. The optimum degaussing current of the coils is then derived by applying the particle swarm optimization algorithm with a gradient constraint. The validity of the optimal degaussing technique is verified analytically by comparing the magnetic field signals after degaussing with and without the gradient constraint.
https://doi.org/10.4283/JKMS.2017.27.2.063

I-vector similarity based speech segmentation for interested speaker to speaker diarization system (화자 구분 시스템의 관심 화자 추출을 위한 i-vector 유사도 기반의 음성 분할 기법)

Bae, Ara;Yoon, Ki-mu;Jung, Jaehee;Chung, Bokyung;Kim, Wooil
- The Journal of the Acoustical Society of Korea / v.39 no.5 / pp.461-467 / 2020
In noisy and multi-speaker environments, the performance of speech recognition is unavoidably lower than in a clean environment. To improve speech recognition, in this paper, the signal of the speaker of interest is extracted from mixed speech signals containing multiple speakers. The VoiceFilter model is used to effectively separate overlapped speech signals. In this work, clustering by Probabilistic Linear Discriminant Analysis (PLDA) similarity score is employed to detect the speech signal of the speaker of interest, which is then used as the reference speaker for VoiceFilter-based separation. By using the speaker feature extracted from the speech detected by the proposed clustering method, this paper proposes a speaker diarization system that uses only the mixed speech, without an explicit reference speaker signal. We use a phone dataset consisting of two speakers to evaluate the performance of the speaker diarization system. The Source to Distortion Ratio (SDR) of the operator (Rx) speech and the customer (Tx) speech is 5.22 dB and -5.22 dB respectively before separation, and 11.26 dB and 8.53 dB respectively with the proposed separation system.
https://doi.org/10.7776/ASK.2020.39.5.461
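
The mirrored pre-separation figures reported above (5.22 dB and -5.22 dB) are what a simple energy-ratio SDR gives when the mixture is just the sum of two speakers: each speaker's distortion is the other speaker, so the two ratios are reciprocals. The sketch below illustrates this with a basic definition, SDR = 10 log10(||s||^2 / ||s - ŝ||^2). It is an assumption that this simplified form matches the metric used in the paper (toolkits such as BSS Eval use a projection-based variant), and the function name and sample signals are hypothetical.

```python
import math


def sdr_db(reference, estimate):
    """Simple Source to Distortion Ratio in dB (energy-ratio form)."""
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error_energy == 0.0:
        return float("inf")  # perfect estimate
    return 10.0 * math.log10(signal_energy / error_energy)


# Toy two-speaker mixture: energies are 8 and 2.
speaker_1 = [2.0, 0.0, -2.0, 0.0]
speaker_2 = [0.0, 1.0, 0.0, -1.0]
mixture = [a + b for a, b in zip(speaker_1, speaker_2)]

# Before separation, the unprocessed mixture serves as the estimate of each speaker.
sdr_1 = sdr_db(speaker_1, mixture)  # 10 * log10(8 / 2)
sdr_2 = sdr_db(speaker_2, mixture)  # 10 * log10(2 / 8)
```

After separation the two values no longer have to be symmetric, since each estimate carries its own residual error.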

UA Tree-based Reduction of Speech DB in a Large Corpus-based Korean TTS (대용량 한국어 TTS의 결정트리기반 음성 DB 감축 방안)

Lee, Jung-Chul
- Journal of the Korea Society of Computer and Information / v.15 no.7 / pp.91-98 / 2010
Large corpus-based concatenating Text-to-Speech (TTS) systems can generate natural synthetic speech without additional signal processing. Because improvements in the naturalness, personality, speaking style, and emotion of synthetic speech require a larger speech DB, it is necessary to prune the redundant speech segments in a large speech segment DB. In this paper, we propose a new method, based on a clustering algorithm, for constructing a downsized segmental speech DB for the Korean TTS system. For the performance test, synthetic speech was generated using the Korean TTS system, which consists of a language processing module, prosody processing module, segment selection module, speech concatenation module, and segmental speech DB. An MOS test was then carried out on a set of synthetic speech generated with 4 different segmental speech DBs. We constructed the 4 segmental speech DBs by combining the CM1 (or CM2) tree clustering method with the full DB (or reduced DB). Experimental results show that the proposed method can reduce the size of the speech DB by 23% and obtain a high MOS in the perception test. Therefore, the proposed method can be applied to build a small-sized TTS.
https://doi.org/10.9708/jksci.2010.15.7.091

Search Result 8, Processing Time 0.023 seconds
