• Title/Summary/Keyword: Spectrogram Energy

A Study on Partial Discharge Diagnostic System for Power Cable using RLCR

  • Park, Keeyoung;Choi, Hyungkee;Lee, Chulhee;Hong, Soomi
    • KEPCO Journal on Electric Power and Energy / v.2 no.1 / pp.43-47 / 2016
  • This system diagnoses whether a partial discharge (PD) is occurring in a power cable. It classifies normal sound and abnormal PD sound by analyzing the relative level crossing rate (RLCR) and the spectrogram energy. The diagnostic system also stores PD sound and analyzes the recorded data. The waveform of PD sound resembles noise and is generated systematically by the partial discharge. In this paper, we therefore discriminate between normal and abnormal cases using the RLCR and the spectrogram frequency energy rate.
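The abstract does not define the RLCR or the spectrogram energy rate precisely. The sketch below is one plausible reading, not the authors' implementation: it counts crossings of a level set relative to the signal peak, and it measures the share of spectrogram energy that falls inside a frequency band. The function names, the `rel_level` parameter, the band limits, and the frame sizes are all assumptions made for illustration.

```python
import numpy as np

def relative_level_crossing_rate(x, rel_level=0.5):
    """Crossings per sample of a level set to rel_level * peak amplitude (assumed definition)."""
    x = np.asarray(x, dtype=float)
    level = rel_level * np.max(np.abs(x))
    above = x > level
    # A crossing occurs wherever consecutive samples lie on opposite sides of the level.
    crossings = np.count_nonzero(above[1:] != above[:-1])
    return crossings / (len(x) - 1)

def spectrogram(x, frame_len=256, hop=128):
    """Power spectrogram from a Hann-windowed short-time FFT; shape (frames, bins)."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop: i * hop + frame_len] * window for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def band_energy_ratio(x, fs, f_lo, f_hi, frame_len=256, hop=128):
    """Fraction of total spectrogram energy between f_lo and f_hi (Hz)."""
    power = spectrogram(x, frame_len, hop)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    return power[:, in_band].sum() / power.sum()
```

A classifier along these lines would compare both numbers against thresholds tuned on recorded normal and PD sounds; the abstract gives no threshold values, so none are suggested here.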

Attention Modules for Improving Cough Detection Performance based on Mel-Spectrogram (사전 학습된 딥러닝 모델의 Mel-Spectrogram 기반 기침 탐지를 위한 Attention 기법에 따른 성능 분석)

  • Changjoon Park;Inki Kim;Beomjun Kim;Younghoon Jeon;Jeonghwan Gwak
    • Proceedings of the Korean Society of Computer Information Conference / 2023.01a / pp.43-46 / 2023
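  • Coughing, a main symptom of respiratory infectious diseases, spreads pathogens into the air, and uninfected people exposed to those pathogens face a high risk of infection. Detecting coughs and responding to them in public places and indoor spaces where many people gather is therefore an efficient way to prevent large-scale outbreaks. In this paper, the cough sounds to be detected and cough-like background sounds that can occur in daily life are converted into Mel-spectrograms, and a CNN model is trained on the resulting visual features to perform cough detection. We demonstrate that applying the proposed Attention module to commonly used pre-trained CNN models helps improve cough detection performance.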
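The conversion of audio into a Mel-spectrogram mentioned in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' pipeline: the FFT size, hop, number of Mel bands, and the common `2595 * log10(1 + f / 700)` Mel formula are assumptions, and the CNN and Attention module are not shown.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, fs):
    """Triangular filters with edges equally spaced on the Mel scale; shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        lo, center, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        # The triangle is the lower of the two slopes, floored at zero outside [lo, hi].
        bank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def log_mel_spectrogram(x, fs, n_fft=512, hop=256, n_mels=40):
    """Log-power Mel-spectrogram in dB; shape (frames, n_mels)."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop: i * hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel_power = power @ mel_filterbank(n_mels, n_fft, fs).T
    return 10.0 * np.log10(mel_power + 1e-10)  # small offset avoids log(0)
```

The resulting 2-D array can be treated as a single-channel image, which is the form a CNN would typically consume.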

A Comparison Study on the Speech Signal Parameters for Chinese Leaners' Korean Pronunciation Errors - Focused on Korean /ㄹ/ Sound (중국인 학습자의 한국어 발음 오류에 대한 음성 신호 파라미터들의 비교 연구 - 한국어의 /ㄹ/ 발음을 중심으로)

  • Lee, Kang-Hee;You, Kwang-Bock;Lim, Ha-Young
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.6 / pp.239-246 / 2017
  • This paper compares speech signal parameters between Korean and Chinese speakers for the Korean /ㄹ/ sound, which Chinese learners frequently mispronounce. The allophones of Korean /ㄹ/ are divided into a lateral group and a tap group. The reasons for these errors are investigated by studying the similarities and differences between the Korean /ㄹ/ pronunciation and its corresponding Chinese pronunciation. From a phonological perspective, the following speech signal parameters are measured and compared: signal energy, the waveform in the time domain, the spectrogram in the frequency domain, the pitch (F0) based on the autocorrelation function (ACF), and the formant frequencies (f1, f2, f3, and f4). The data consist of a group of Korean words selected through a philological investigation. According to the simulation results for energy and spectrogram, there are meaningful differences between native Korean speakers and Chinese learners in the pronunciation of Korean /ㄹ/. The simulation results also show some differences in the other parameters. Chinese learners can be expected to reduce their errors considerably by exploiting the parameters used in this paper.
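Of the parameters listed, the ACF-based pitch is the easiest to illustrate. The sketch below estimates F0 for a single voiced frame by picking the strongest autocorrelation peak within a plausible lag range. The function name and the 60 to 500 Hz search range are assumptions, and the paper's actual procedure may differ.

```python
import numpy as np

def acf_pitch(frame, fs, f_min=60.0, f_max=500.0):
    """Estimate F0 (Hz) of one voiced frame from the peak of its autocorrelation."""
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()                                    # remove DC offset
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]  # keep lags 0 .. N-1
    lag_min = int(fs / f_max)                           # shortest period considered
    lag_max = min(int(fs / f_min), len(x) - 1)          # longest period considered
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return fs / best_lag
```

The resolution is limited to whole-sample lags, so at 8 kHz an estimate near 200 Hz can only move in steps of about 5 Hz.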

A Study on Speaker Identification Parameter Using Difference and Correlation Coeffieicent of Digit_sound Spectrum (숫자음의 스펙트럼 차이값과 상관계수를 이용한 화자인증 파라미터 연구)

  • Lee, Hoo-Dong;Kang, Sun-Mee;Chang, Moon-Soo;Yang, Byung-Gon
    • Speech Sciences / v.11 no.3 / pp.131-142 / 2004
  • A speaker identification system basically functions by comparing the spectral energy of an individual production model with that of an input signal. This study aimed to develop a new speaker identification system based on two parameters derived from the spectral energy of numeric sounds: the difference sum and the correlation coefficient. A narrow-band spectrogram yielded more stable spectral energy across time than a wide-band one. We collected empirical data from four male speakers and tested the speaker identification system. The subjects produced 18 combinations of three-digit numeric sounds ten times each. Five productions of each three-digit number were statistically averaged to make a model for each speaker, and the remaining five productions were tested on the system. Results showed that when the threshold for the absolute difference sum was set to 1200, none of the speakers could pass the system, while everybody could pass when it was set to 2800. The minimum correlation coefficient that allowed all speakers to pass was 0.82, while a coefficient of 0.95 rejected all. Thus, both threshold levels can be adjusted to the needs of a speaker identification system, which merits further study.
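The two parameters described above can be sketched as follows, assuming each production has already been reduced to a spectral-energy array of fixed shape. The function names are hypothetical, and combining the two tests with a logical AND is an assumption; the abstract reports the thresholds separately and does not say how they are combined.

```python
import numpy as np

def build_model(productions):
    """Average several spectral-energy arrays (same shape) into one speaker model."""
    return np.mean(np.stack([np.asarray(p, dtype=float) for p in productions]), axis=0)

def difference_sum(model_spec, input_spec):
    """Sum of absolute differences between model and input spectral energy."""
    diff = np.asarray(model_spec, dtype=float) - np.asarray(input_spec, dtype=float)
    return float(np.sum(np.abs(diff)))

def correlation_coefficient(model_spec, input_spec):
    """Pearson correlation between the flattened model and input arrays."""
    a = np.ravel(np.asarray(model_spec, dtype=float))
    b = np.ravel(np.asarray(input_spec, dtype=float))
    return float(np.corrcoef(a, b)[0, 1])

def accept(model_spec, input_spec, max_diff_sum, min_corr):
    """Accept only if both criteria pass (assumed combination rule)."""
    return (difference_sum(model_spec, input_spec) <= max_diff_sum
            and correlation_coefficient(model_spec, input_spec) >= min_corr)
```

Lowering `max_diff_sum` or raising `min_corr` makes the system stricter, which matches the trade-off the abstract reports between thresholds that reject every speaker and thresholds that accept every speaker.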

Determining Key Features of Recognition Korean Traditional Music Using Spectrogram

  • Kim Jae Chun;Kwak Kyung Sup
    • The Journal of the Acoustical Society of Korea / v.24 no.2E / pp.67-70 / 2005
  • To realize a traditional music recognition system, characteristics pertinent to Far East Asian music must be identified. Using spectrograms, several distinct attributes of Korean traditional music are surveyed. The frequency distribution, beat cycle, and frequency energy intensity within samples each have distinct characteristics. A preliminary experiment toward a Korean traditional music recognition system is carried out. Using these characteristics of Korean traditional music, a classification accuracy of 94.5% is obtained. As Korea, Japan, and China share the same musical roots, both in instruments and in playing style, analyzing Korean traditional music can be helpful in understanding Far East Asian traditional music.

Estimation of Fundamental Frequency Using an Instantaneous Frequency Based on the Symmetric Higher Order Differential Energy Operator (대칭구조를 갖는 일반적인 고차의 미분 에너지함수를 기반한 순간주파수를 이용한 음성의 기본주파수 추정)

  • Iem, Byeong-Gwan
    • The Transactions of The Korean Institute of Electrical Engineers / v.60 no.12 / pp.2374-2379 / 2011
  • The fundamental frequency of voiced speech is estimated using the instantaneous frequency based on the symmetric higher order differential energy operator. The instantaneous frequency based on this operator gives better frequency estimates because it is aligned to the time instant of the signal. The speech is first pre-processed by a lowpass filter to remove higher frequency components. The instantaneous frequency is then computed to obtain the fundamental frequency estimates. The symmetric higher order energy operator is also used as an indicator to separate voiced from unvoiced speech. The fundamental frequency estimates are further processed by a moving average filter to obtain monotonically changing estimates. The resulting estimates are compared with the spectrogram of the speech to confirm their accuracy.
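The abstract does not give the form of the symmetric higher order operator, so the sketch below substitutes the standard Teager-Kaiser energy operator with the DESA-2 frequency estimator, which also uses a symmetric difference. It is a stand-in for illustration only, not the paper's operator. The lowpass pre-filter and the voiced/unvoiced decision are omitted, and the moving-average width is an assumption.

```python
import numpy as np

def teager_kaiser(x):
    """Psi[x](n) = x(n)^2 - x(n-1) * x(n+1), returned for n = 1 .. N-2."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def desa2_frequency(x, fs):
    """Instantaneous frequency (Hz) by DESA-2; valid for frequencies below fs / 4."""
    x = np.asarray(x, dtype=float)
    psi_x = teager_kaiser(x)[1:-1]   # trimmed to n = 2 .. N-3
    y = x[2:] - x[:-2]               # symmetric difference x(n+1) - x(n-1)
    psi_y = teager_kaiser(y)         # also covers n = 2 .. N-3
    # Guard against division by zero where the signal energy vanishes.
    ratio = 1.0 - psi_y / (2.0 * np.maximum(psi_x, 1e-12))
    omega = 0.5 * np.arccos(np.clip(ratio, -1.0, 1.0))  # radians per sample
    return omega * fs / (2.0 * np.pi)

def moving_average(values, width=5):
    """Smooth the frequency track; edge samples are biased by zero padding."""
    return np.convolve(values, np.ones(width) / width, mode="same")
```

For a pure sinusoid the estimator recovers the frequency exactly up to rounding; on real speech the track is noisy, which is why a smoothing stage such as the moving average follows.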

A Study on a Intelligent GIS Monitoring System using the Preventive Diagnostic Technology (예방진단기술을 이용한 지능형 GIS 감시시스템에 관한 연구)

  • Park, Kee-Young;Lee, Jong-Ha;Cho, Sook-Jin;Choi, Hyung-Ki;Jung, Eui-Bung
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.6 / pp.244-251 / 2014
  • In this study, we give a detailed account of the normal and abnormal states of GIS (Gas Insulated Switchgear) using preventive diagnostic technology. The work is based on the analysis and diagnosis of GIS data stored by an intelligent GIS monitoring system. The waveform of GIS sound resembles noise and is generated systematically by discharge and its corona sound. To classify normal and abnormal GIS sound, we therefore discriminate between the two cases using the level crossing rate (LCR) and the spectrogram energy rate.

An Study on the Correlation between Sound Characteristics and Sasang Constitution by CSL (CSL을 통한 음향특성과 사상체질간의 상관성 연구)

  • Shin, Mi-ran;Kim, Dal-lae
    • Journal of Sasang Constitutional Medicine / v.11 no.1 / pp.137-157 / 1999
  • The purpose of this study is to help classify Sasang Constitution through its correlation with sound characteristics. The study was carried out under the assumption that Sasang Constitution is correlated with the sound spectrogram. The following results were obtained by comparison and analysis. 1. In the survey, Soeumin described their voices as low-toned, smooth, and quiet; Soyangin described theirs as high, clear, fast, and random in speech; Taeumin described theirs as low, thick, and muddy. 2. Taeyangin was significantly slow compared with the others in composition reading time. Taeyangin was significantly slow compared with the others in Formant frequency 1. Taeyangin was significantly discriminated from Soeumin in Formant frequency 5. Taeyangin was significantly low compared with the others in Bandwidth 2. Soeumin was significantly low compared with Taeyangin in Pitch Maximum and in Pitch Maximum minus Pitch Minimum. Taeyangin was significantly high compared with the others in Energy mean. 3. In the multi-dimensional 4-class minimum-distance results, the discrimination rate with the list of specification was higher than that with the lists of 13. In the one-way ANOVA and discriminant analysis in SPSS/PC+, the discrimination rate for three constitutions, excluding Soyangin, was higher than that for all four. In CART, the estimated discrimination rate for Sasang Constitution was higher than with any other method. These results suggest a correlation between the sound spectrogram and Sasang Constitution, and classification through sound spectrogram analysis can serve as an auxiliary method for making Sasang Constitution classification more objective.

Emotion Recognition using Various Combinations of Audio Features and Textual Information (음성특징의 다양한 조합과 문장 정보를 이용한 감정인식)

  • Seo, Seunghyun;Lee, Bowon
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2019.11a / pp.137-139 / 2019
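  • This paper presents emotion recognition results obtained with a multimodal recurrent neural network that uses various audio features together with text, using both a categorical classification method and a classification method in the Arousal-Valence (AV) domain. As audio features, we used combinations of MFCC, Energy, Velocity, Acceleration, Prosody, and Mel Spectrogram, and fused them with the corresponding text information through a recurrent-neural-network-based model to classify emotions discretely with the categorical method and with the AV-domain method. In the experiments, the combination of MFCC Energy, Velocity, and Acceleration at 13 dimensions each with 35-dimensional Prosody achieved 75% with the categorical method, higher than the other feature combinations, and the same combination also gave the highest results in the AV domain, with 55.3% for Arousal and 53.1% for Valence.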

A Study on the Acoustic Characteristics of the Pansori by Voice Signals Analysis (음성신호 분석에 의한 판소리의 음성학적 특징 연구)

  • Kim, HyunSook
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.7 / pp.3218-3222 / 2013
  • Pansori is a traditional Korean vocal art whose originality and excellence as a comprehensive art of narration and gesture have earned it global recognition as a world intangible heritage. In particular, Pansori is valued for its satirical and humorous expression and for audience participation, for its high artistic value, and for its role in social integration as an art enjoyed across all social strata. In this paper, speech signal analysis techniques are applied to the five Pansori works ("five yards") to analyze their acoustic features and to study how those features correlate with the society and era they represent. The experiments analyzed the spectrogram, pitch, stability, and intensity of the five works. The results indicate that, to keep the audience focused and interested while telling a comical story, Pansori uses a stable, loud voice with large variation in the energy of the voice waveform and in vocal cord tremor.