• Title/Summary/Keyword: 스펙트로그램 (spectrogram)

Search Results: 136

Analyzing performance of time series classification using STFT and time series imaging algorithms

  • Sung-Kyu Hong;Sang-Chul Kim
    • Journal of the Korea Society of Computer and Information / v.28 no.4 / pp.1-11 / 2023
  • In this paper, instead of using recurrent neural networks, we compare the classification performance of time series imaging algorithms combined with a convolutional neural network. The TSC (Time Series Classification) community has traditional algorithms for imaging time series data, e.g. GAF (Gramian Angular Field), MTF (Markov Transition Field), and RP (Recurrence Plot). In addition, we compare the STFT (Short-Time Fourier Transform), which produces a spectrogram that visualizes features of voice data. We measure the CNN's performance while adjusting the hyperparameters of the imaging algorithms. When evaluated on the GunPoint dataset from the UCR archive, the STFT achieves higher accuracy than the other algorithms. GAF also reaches 98~99% accuracy, but has the disadvantage that the resulting images are very large.
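
As a rough illustration of the STFT step described above, the sketch below turns a short synthetic series into a magnitude spectrogram with scipy (the signal, sampling rate, and window parameters are invented for illustration, not taken from the paper). Note that a GAF image of an n-point series is n×n, while the STFT image is only (frequency bins)×(frames), which matches the paper's remark about GAF image size.

```python
import numpy as np
from scipy.signal import stft

# Illustrative 1-D "time series" standing in for a GunPoint-style sample.
fs = 128                      # sampling rate (assumed)
t = np.arange(fs * 2) / fs
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 20 * t)

# Short-Time Fourier Transform -> 2-D time-frequency image for a CNN.
f, seg_t, Z = stft(x, fs=fs, nperseg=32, noverlap=24)
spec = np.abs(Z)              # magnitude spectrogram

print(spec.shape)             # (frequency bins, time frames)
```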

Attention Modules for Improving Cough Detection Performance based on Mel-Spectrogram (사전 학습된 딥러닝 모델의 Mel-Spectrogram 기반 기침 탐지를 위한 Attention 기법에 따른 성능 분석)

  • Changjoon Park;Inki Kim;Beomjun Kim;Younghoon Jeon;Jeonghwan Gwak
    • Proceedings of the Korean Society of Computer Information Conference / 2023.01a / pp.43-46 / 2023
  • Coughing, the main symptom of respiratory infectious diseases, spreads infected pathogens through the air, and a non-infected person exposed to those pathogens has a high probability of contracting the disease. Detecting and responding to coughs in crowded public and indoor spaces is therefore an effective way to prevent large-scale epidemics. In this paper, cough sounds to be detected and similar background sounds that can occur in daily life are converted into Mel-Spectrograms, and the visualized features are used to train CNN models for cough detection. We demonstrate that applying the proposed Attention module to commonly used pre-trained CNN models improves cough detection performance.
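
The Mel-Spectrogram front end mentioned in this abstract can be sketched with plain numpy. The filterbank below follows the common triangular-filter recipe; all parameters (sampling rate, FFT size, number of mel bands) are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_spectrogram(x, sr, n_fft=512, hop=128, n_mels=40):
    """Log mel-spectrogram from a mono signal (numpy-only sketch)."""
    # Frame and window the signal, then take the power spectrum per frame.
    frames = [x[s:s + n_fft] * np.hanning(n_fft)
              for s in range(0, len(x) - n_fft + 1, hop)]
    power = np.abs(np.fft.rfft(np.array(frames), axis=1)) ** 2  # (frames, bins)

    # Triangular mel filterbank spanning 0 .. sr/2.
    mels = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fb[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising slope
        fb[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling slope

    return np.log(power @ fb.T + 1e-10).T  # (n_mels, frames)

# Illustrative use on a synthetic noise burst standing in for a cough.
sr = 8000
x = np.random.randn(sr) * np.exp(-np.linspace(0, 8, sr))
print(mel_spectrogram(x, sr).shape)
```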


Experimental Study on Estimation of Flight Trajectory Using Ground Reflection and Comparison of Spectrogram and Cepstrogram Methods (지면 반사효과를 이용한 비행 궤적 추정에 대한 실험적 연구와 스펙트로그램 및 캡스트로그램 방법 비교)

  • Jung, Ookjin;Go, Yeong-Ju;Lee, Jaehyung;Choi, Jong-Soo
    • Journal of the Korea Institute of Military Science and Technology / v.18 no.2 / pp.115-124 / 2015
  • A methodology is proposed to estimate the trajectory and velocity of a flying target using time-frequency analysis of its acoustic signal. A microphone placed above the ground that measures the sound emitted from a flying acoustic source receives both the direct and the ground-reflected sound waves. At certain frequencies, destructive interference occurs in the received waveform when the reflected path length exceeds the direct path length by an odd multiple of half a wavelength. This phenomenon is referred to as the acoustical mirror effect, and it can be observed in a spectrogram plot. The spectrogram of an acoustic measurement of a flying vehicle shows several orders of destructive interference curves. The first- or second-order curve is used to find the best approximate path with a nonlinear least-squares method. A simulated acoustic signal is generated for a condition with known geometry of the sensor and the source in flight. The estimation based on cepstrogram analysis provides a more accurate estimate than the spectrogram.
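
Under the usual rigid-ground assumption, the interference-null frequencies of the acoustical mirror effect follow directly from the direct and reflected path lengths. The sketch below computes them for an invented sensor/source geometry (heights, range, and the speed of sound are assumptions, not values from the paper):

```python
import numpy as np

c = 343.0                   # speed of sound, m/s (assumed)
h_src, h_mic = 50.0, 1.5    # source / microphone heights, m (illustrative)
rng = 200.0                 # horizontal range, m (illustrative)

d_direct = np.hypot(rng, h_src - h_mic)
d_reflect = np.hypot(rng, h_src + h_mic)
delta = d_reflect - d_direct          # path-length difference

# With a rigid ground (reflection coefficient ~ +1), destructive
# interference occurs where the path difference equals an odd
# multiple of half a wavelength: delta = (n + 1/2) * c / f.
n = np.arange(5)
f_null = (n + 0.5) * c / delta
print(np.round(f_null, 1))            # first few null frequencies, Hz
```

As the aircraft flies past, `delta` changes continuously, so these nulls trace the interference curves seen in the spectrogram.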

Intonation Conversion Using the Other Speaker's Excitation Signal (他話者의 勵起信號를 이용한 抑揚變換)

  • Lee, Ki-Young;Choi, Chang-Seok;Choi, Kap-Seok;Lee, Hyun-Soo
    • The Journal of the Acoustical Society of Korea / v.14 no.4 / pp.21-28 / 1995
  • In this paper, an intonation conversion method is presented that provides a basis for converting original speech into artificially intoned speech. The method uses the other speaker's excitation signals as intonation information and the original vocal tract spectra, warped to the other speaker's with DTW, as vocal features; intonation-converted speech is synthesized through the short-time inverse Fourier transform (STIFT) of their product. To evaluate the converted speech, we collected Korean single vowels and sentences spoken by 30 males and compared fundamental frequency contours, spectrograms, distortion measures, and MOS tests between the original and converted speech. The results show that this method can convert speech into speech carrying the other speaker's intonation.
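
A toy version of the excitation/vocal-tract recombination can be sketched with scipy's STFT and inverse STFT. This is only a loose analogue of the paper's method: the DTW alignment step is omitted, the "envelope" is a simple moving average over frequency, and both signals are synthetic stand-ins for the two speakers.

```python
import numpy as np
from scipy.signal import stft, istft

def envelope(mag, width=9):
    """Smooth each frame's magnitude over frequency -> rough vocal-tract envelope."""
    k = np.ones(width) / width
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, mag)

fs = 8000
t = np.arange(fs) / fs
src = np.sign(np.sin(2 * np.pi * 120 * t))   # "speaker A": excitation-rich buzz
tgt = np.sin(2 * np.pi * 500 * t)            # "speaker B": supplies the envelope

_, _, Za = stft(src, fs=fs, nperseg=256)
_, _, Zb = stft(tgt, fs=fs, nperseg=256)

env_a = envelope(np.abs(Za)) + 1e-8
exc_a = Za / env_a                           # A's excitation (fine structure + phase)
out = exc_a * envelope(np.abs(Zb))           # impose B's envelope on A's excitation
_, y = istft(out, fs=fs, nperseg=256)        # resynthesize (the STIFT step)
print(y.shape)
```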


Characteristics of Dairy Cow's Vocalization in Postpartum Related with Calf Isolation (출산 후 새끼와의 분리에 따른 유우의 발성음 특성)

  • Kim, Min-Jin;Son, Seung-Hun;Rhim, Shin-Jae;Chang, Moon-Baek
    • Journal of Animal Science and Technology / v.52 no.1 / pp.51-56 / 2010
  • This study was conducted to clarify the characteristics of Holstein dairy cows' postpartum vocalizations related to calf isolation. Vocalizations of 16 cows were recorded 6 hours per day (1:00 am~4:00 am and 1:00 pm~4:00 pm) with a digital recorder and microphone between October 2008 and May 2009. The vocalizations were divided into 4 types. Frequency, intensity, and duration were analyzed by GLM (general linear model) and Duncan's multiple range test. Based on spectrogram and spectrum analyses, there were significant differences in frequency and intensity among the 4 vocalization types. The frequency of vocalization decreased dramatically on the 2nd and 3rd days. Vocalization appears to be an important factor affecting the mother-young bond in Holstein dairy cattle.

Characteristics of Estrus-related Vocalizations of Sows after Artificial Insemination (모돈의 인공수정 후 시기별 발성음의 특성)

  • Rhim, Shin-Jae;Kim, Min-Jin;Lee, Ju-Young;Kim, Na Ra;Kang, Jeong-Hoon
    • Journal of Animal Science and Technology / v.50 no.3 / pp.401-406 / 2008
  • This study was conducted to clarify the characteristics of estrus-related vocalizations of sows after artificial insemination. Vocalizations of sows on the day of artificial insemination and 3 days and 50 days afterward were recorded 3 hours per day from September 2006 to March 2007 using an MD recorder (Marantz PMD-650) and microphone (RF Condenser MIC, MKH 416P48). The shapes of the spectrum and spectrogram of the vocalizations differed in each period after artificial insemination. There were significant differences in frequency and intensity, but not in the duration of vocalization. The fact that a signal may give a reliable indication of the signaller's needs suggests that in some circumstances vocalizations can provide information on animal welfare.
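
The frequency/intensity/duration measurements used in this and the preceding study can be approximated from a single recording as follows (a numpy sketch; the energy threshold, frame length, and test signal are assumptions for illustration, not the papers' protocol):

```python
import numpy as np

def call_features(x, sr, thresh_db=-30.0):
    """Duration, intensity and dominant frequency of one recorded call (sketch)."""
    # Intensity: RMS level in dB relative to full scale.
    rms = np.sqrt(np.mean(x ** 2))
    intensity_db = 20 * np.log10(rms + 1e-12)

    # Duration: span where short-time energy stays above a threshold.
    frame = sr // 100                        # 10 ms frames
    n = len(x) // frame
    e = 20 * np.log10(np.sqrt(np.mean(
        x[:n * frame].reshape(n, frame) ** 2, axis=1)) + 1e-12)
    active = np.flatnonzero(e > thresh_db)
    duration = (active[-1] - active[0] + 1) * frame / sr if active.size else 0.0

    # Dominant frequency: peak of the magnitude spectrum.
    spec = np.abs(np.fft.rfft(x))
    dom_freq = np.argmax(spec) * sr / len(x)
    return duration, intensity_db, dom_freq

sr = 8000
t = np.arange(sr) / sr                       # 1 s of signal
x = np.where((t > 0.2) & (t < 0.7), np.sin(2 * np.pi * 440 * t), 0.0)
print(call_features(x, sr))
```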

An Analysis of Preference for Korean Pop Music By Applying Acoustic Signal Analysis Techniques (음향신호분석 기술을 적용한 한국가요의 시대별 선호도 분석)

  • Cho, Dong-Uk;Kim, Bong-Hyun
    • The KIPS Transactions: Part D / v.19D no.3 / pp.211-220 / 2012
  • Recently, K-Pop has gained sensational worldwide popularity, no longer limited to the domestic pop music scene. One main cause is that most K-Pop songs are "hook songs": a certain melody and/or rhythm is repeated up to 70 times in one song so that it hooks the ear of the listener. Visual effects by K-Pop dance groups are also thought to contribute to this popularity. In this paper, we propose a method that traces changes in the preference for Korean pop music over time and investigates their causes using acoustic signal analysis. For this, acoustic signal analysis experiments are performed on Korean pop music from popular female singers of the 1960s through to the present day. The experimental results show that the periods can be discriminated on a scientific basis. In addition, compared with pre-existing subjective and statistical methods, quantitative, objective, numerical data are extracted using acoustic signal processing techniques.

A Study on the Acoustic Characteristics of the Pansori by Voice Signals Analysis (음성신호 분석에 의한 판소리의 음성학적 특징 연구)

  • Kim, HyunSook
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.7 / pp.3218-3222 / 2013
  • Pansori is a traditional Korean vocal art whose originality and excellence in narrative song and gesture are recognized globally as a UNESCO intangible cultural heritage. In particular, Pansori is valued for its shrewd and humorous expression, for its high artistic value with audience participation, and as an art enjoyed across all social classes, thereby serving a function of social integration. In this paper, therefore, speech signal analysis techniques are applied to the five Pansori madang (works) to extract the acoustic features of Pansori and to study how they correlate with the society and era they represent. For the experiment, spectrogram, pitch, stability, and intensity analyses were performed on the five madang. The experimental results show that, to keep the audience focused and interested while delivering a comical story, Pansori is expressed with a stable and loud voice, with large variation in the energy of the sound wave and in the width of vocal fold vibration.

Speech emotion recognition using attention mechanism-based deep neural networks (주목 메커니즘 기반의 심층신경망을 이용한 음성 감정인식)

  • Ko, Sang-Sun;Cho, Hye-Seung;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.36 no.6 / pp.407-412 / 2017
  • In this paper, we propose a speech emotion recognition method using a deep neural network based on an attention mechanism. The proposed method combines CNN (Convolutional Neural Networks), GRU (Gated Recurrent Unit), DNN (Deep Neural Networks), and an attention mechanism. The spectrogram of a speech signal contains characteristic patterns that depend on the emotion. We therefore model these patterns by applying tuned Gabor filters as the convolutional filters of a typical CNN. In addition, we apply the attention mechanism over the CNN and FC (Fully-Connected) layers to obtain attention weights that take the context of the extracted features into account, and use them for emotion recognition. To verify the proposed method, we conducted recognition experiments on six emotions. The experimental results show that the proposed method achieves higher speech emotion recognition performance than conventional methods.
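
The attention-weighted pooling over frame-level features that this abstract describes can be sketched in numpy: score each frame, softmax the scores into weights, and take the weighted sum as the utterance-level representation. The projection matrices below are random stand-ins, not the paper's trained parameters.

```python
import numpy as np

def attention_pool(features, w, b, v):
    """Attention-weighted pooling over time frames (numpy sketch).

    features: (T, D) frame-level features (e.g. CNN/GRU outputs).
    Scores each frame, softmaxes the scores into weights, and returns
    the weighted sum as the utterance-level representation.
    """
    scores = np.tanh(features @ w + b) @ v          # (T,) frame scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                        # softmax over frames
    return weights @ features, weights              # (D,), (T,)

rng = np.random.default_rng(0)
T, D = 50, 16
feats = rng.standard_normal((T, D))
pooled, attn = attention_pool(feats, rng.standard_normal((D, D)),
                              np.zeros(D), rng.standard_normal(D))
print(pooled.shape, attn.sum())
```

In the full model, `pooled` would be fed to the DNN classifier; the weights `attn` indicate which frames the model attends to.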

Sound event detection based on multi-channel multi-scale neural networks for home monitoring system used by the hard-of-hearing (청각 장애인용 홈 모니터링 시스템을 위한 다채널 다중 스케일 신경망 기반의 사운드 이벤트 검출)

  • Lee, Gi Yong;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.39 no.6 / pp.600-605 / 2020
  • In this paper, we propose a sound event detection method using multi-channel multi-scale neural networks for sound-sensing home monitoring for the hearing impaired. In the proposed system, the two channels with the highest signal quality are selected from several wireless microphone sensors in the home. Three features extracted from the sensor signals (time difference of arrival, pitch range, and the outputs obtained by applying a multi-scale convolutional neural network to the log mel spectrogram) are fed to a classifier based on a bidirectional gated recurrent neural network to further improve sound event detection performance. The detected sound event is converted into text, together with the sensor position of the selected channel, and presented to the hearing-impaired user. The experimental results show that the sound event detection of the proposed system is superior to the existing method and can effectively deliver sound information to the hearing impaired.
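
One standard way to obtain the time-difference-of-arrival feature mentioned above is GCC-PHAT; the sketch below recovers a known integer-sample delay between two synthetic channels (the choice of GCC-PHAT and all parameters are assumptions for illustration, not details from the paper):

```python
import numpy as np

def gcc_phat_delay(a, b, fs):
    """Time difference of arrival between two channels via GCC-PHAT (sketch)."""
    n = 2 * len(a)                             # zero-pad to avoid wraparound
    A, B = np.fft.rfft(a, n), np.fft.rfft(b, n)
    cross = A * np.conj(B)
    # PHAT weighting: keep only phase, then back to the lag domain.
    cc = np.fft.irfft(cross / (np.abs(cross) + 1e-12), n)
    cc = np.concatenate((cc[-len(a) + 1:], cc[:len(a)]))   # center zero lag
    lag = np.argmax(cc) - (len(a) - 1)
    return lag / fs

fs = 16000
rng = np.random.default_rng(1)
s = rng.standard_normal(fs // 4)
delay = 40                                     # samples (2.5 ms, illustrative)
ch1 = s
ch2 = np.concatenate((np.zeros(delay), s[:-delay]))        # delayed copy
print(gcc_phat_delay(ch2, ch1, fs))            # ≈ delay / fs seconds
```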