• Title/Summary/Keyword: Zero-crossing rate


Automatic Vowel Onset Point Detection Based on Auditory Frequency Response (청각 주파수 응답에 기반한 자동 모음 개시 지점 탐지)

  • Zang, Xian;Kim, Hag-Tae;Chong, Kil-To
    • Journal of the Korea Academia-Industrial cooperation Society / v.13 no.1 / pp.333-342 / 2012
  • This paper presents a vowel onset point (VOP) detection method based on the human auditory system. The method maps the perceptual frequency scale, i.e. the Mel scale, onto linear acoustic frequency and builds a bank of triangular Mel-weighted filters to simulate the band-pass filtering of the human ear. This nonlinear critical-band filter bank greatly reduces the data dimensionality and suppresses the effect of harmonics, making the formants more prominent in the nonlinearly spaced Mel spectrum. The summed energy of the Mel spectrum peaks is extracted as the feature for each frame, and the instant at which the energy amplitude starts rising sharply is detected as the VOP by convolving with a Gabor window. On a single-word database containing 12 vowels articulated with different kinds of consonants, the experiments showed an average detection rate of 72.73%, higher than that of other vowel detection methods based on short-time energy and zero-crossing rate.

Open-Loop Pipeline ADC Design Techniques for High Speed & Low Power Consumption (고속 저전력 동작을 위한 개방형 파이프라인 ADC 설계 기법)

  • Kim Shinhoo;Kim Yunjeong;Youn Jaeyoun;Lim Shin-ll;Kang Sung-Mo;Kim Suki
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.1A / pp.104-112 / 2005
  • Design techniques for a high-speed, low-power pipelined 8-bit ADC are described. To achieve high-speed operation with relatively low power consumption, an open-loop architecture is adopted, whereas conventional pipeline ADCs use a closed-loop architecture (with an MDAC). A distributed track-and-hold amplifier and a cascading structure are also adopted to increase the sampling rate. To reduce power consumption and die area, the number of amplifiers in each stage is optimized and reduced using the proposed zero-crossing point generation method. At a 500-MHz sampling rate, simulation results show a power consumption of 210 mW, including digital logic, with a 1.8 V power supply. The targeted ADC achieves an ENOB of about 8 bits for input frequencies up to 200 MHz and an input range of 1.2 Vpp (differential). The ADC is designed in a $0.18\,\mu\mathrm{m}$ 6-metal 1-poly CMOS process and occupies an area of $900\,\mu\mathrm{m} \times 500\,\mu\mathrm{m}$.

A Study on Emotion Recognition of Chunk-Based Time Series Speech (청크 기반 시계열 음성의 감정 인식 연구)

  • Hyun-Sam Shin;Jun-Ki Hong;Sung-Chan Hong
    • Journal of Internet Computing and Services / v.24 no.2 / pp.11-18 / 2023
  • Recently, in the field of Speech Emotion Recognition (SER), many studies have sought to improve accuracy through voice features and modeling. Alongside modeling studies aimed at improving the accuracy of existing voice emotion recognition, various studies using voice features are being conducted. In this paper, voice files are split into time intervals as a time series, on the premise that voice emotions are related to the flow of time. After splitting the voice files, we propose a model that classifies the emotions of speech data by extracting the speech features Mel, Chroma, zero-crossing rate (ZCR), root mean square (RMS), and mel-frequency cepstral coefficients (MFCC) and feeding them to a recurrent neural network model used for sequential data processing. In the proposed method, voice features were extracted from all files using the 'librosa' library and applied to neural network models. The experiments compared and analyzed the performance of recurrent neural network (RNN), long short-term memory (LSTM), and gated recurrent unit (GRU) models on the English Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset.

A Design and Implementation of 30w class Er:YAG laser adopted skin and dental clinic. (치과 및 교부과용 30W급 Er:YAG 레이저 설계 및 구현)

  • 김휘영;신경애
    • Proceedings of the IEEK Conference / 2001.06e / pp.211-214 / 2001
  • In a general laser power supply, the secondary of the power transformer is connected to a rectifier and filter capacitor, and the output of the rectifier is connected to a switching element on the secondary side of the transformer. The power supply is therefore complicated, and the switching loss is considerable. In addition, as the pulse repetition rate increases, the energy stored in the energy-storage capacitor is not transferred sufficiently to the flashlamp, and laser output efficiency decreases. In this paper, to improve laser efficiency, we designed and fabricated a power supply in which the SCR is turned on at the zero point using ZCC (zero crossing control) and a PFN (pulse forming network). As a result, laser output efficiency increased by 4% over the conventional supply at a repetition rate of 10 pps, and by about 8% at 20 pps.


A Study on Operation Rate and Output of Wave Power Generator by Waves Condition (파랑 조건에 따른 파력발전장치의 가동률과 발전량 산정에 대한 연구)

  • Ryu, Hwang-Jin;Hong, Key-Yong;Shin, Seung-Ho;Kim, Sang-Ho
    • 한국신재생에너지학회:학술대회논문집 / 2009.06a / pp.615-619 / 2009
  • This paper investigates how the operation rate, operating capacity, and output of wave power generation vary with wave conditions, represented by a wave height-period window. Using long-term wave data from 1979 to 2002 provided by the Korea Ocean Research & Development Institute (KORDI), we calculated the monthly variation of significant wave height (Hs) and zero-up crossing period (Tz), and the distribution of the wave appearance rate. Using the same wave data, we also charted the Hs-Tz and wave-energy scatter diagrams.
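
For readers unfamiliar with Tz, the following is a minimal sketch of how a mean zero-up crossing period is conventionally estimated from a surface-elevation record. It is not taken from the paper, which does not say how its Tz values were derived. The function name `zero_up_crossing_period`, the `elevation` list, and the sampling interval `dt` are hypothetical.

```python
def zero_up_crossing_period(elevation, dt):
    """Mean time between successive upward zero crossings, in the units of dt.

    elevation: surface-elevation samples (hypothetical input record)
    dt: sampling interval between samples
    Returns None when fewer than two up-crossings are found.
    """
    # Remove the mean so that "zero" is the mean water level.
    mean = sum(elevation) / len(elevation)
    eta = [x - mean for x in elevation]

    crossing_times = []
    for i in range(1, len(eta)):
        if eta[i - 1] < 0 <= eta[i]:
            # Linear interpolation between the two samples around the crossing.
            frac = -eta[i - 1] / (eta[i] - eta[i - 1])
            crossing_times.append((i - 1 + frac) * dt)

    if len(crossing_times) < 2:
        return None
    return (crossing_times[-1] - crossing_times[0]) / (len(crossing_times) - 1)
```

For a regular sine wave the estimate equals the wave period. For an irregular sea state it is an average over the individual waves in the record.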


Continuous digits recognition using spatio-temporal neural network (시공간 신경회로망을 이용한 연속 숫자음 인식)

  • 이종식;정재호
    • The Journal of Korean Institute of Communications and Information Sciences / v.21 no.7 / pp.1605-1612 / 1996
  • In this paper, a new approach to continuous digit recognition using the Spatio-Temporal Neural Network (STNN) is reported. Continuous seven-digit strings are the recognition target, and our initial recognition rate was 28%. To increase the recognition rate, two methods are proposed. In the first method, to compensate for the STNN's own defect as well as to emphasize the phonetic characteristics of Korean digits, the starting point of each digit is detected using the energy and zero-crossing rate, but the ending point is detected using only the energy value. In this case, the seven-digit recognition rate increased to 61%. In the second method, considering that the same digit can be pronounced differently in continuous speech, the number of STNNs used to represent each digit is increased from one to five. Consequently, the same digit pronounced differently can be handled well in the new system. As a result, the recognition rate for continuously spoken seven digits increased to 89%.
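
The endpoint rule in the first method (start from energy and zero-crossing rate, end from energy alone) can be sketched roughly as below. The abstract does not say how the two measures are combined or which thresholds are used, so the "either measure exceeds its threshold" rule, the names `find_endpoints`, `energy_thr`, and `zcr_thr`, and the framing of the input are all assumptions made for illustration.

```python
def short_time_energy(frame):
    """Mean squared amplitude of one non-empty frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    pairs = list(zip(frame, frame[1:]))
    if not pairs:
        return 0.0
    return sum((a >= 0) != (b >= 0) for a, b in pairs) / len(pairs)


def find_endpoints(frames, energy_thr, zcr_thr):
    """Return (start, end) frame indices, or None if no speech is found.

    Assumed rule: the start is the first frame where energy OR ZCR exceeds
    its threshold; the end is the last frame where energy alone does.
    """
    start = None
    for i, frame in enumerate(frames):
        if short_time_energy(frame) > energy_thr or zero_crossing_rate(frame) > zcr_thr:
            start = i
            break
    if start is None:
        return None

    end = start
    for i in range(len(frames) - 1, start, -1):
        if short_time_energy(frames[i]) > energy_thr:
            end = i
            break
    return start, end
```

Using the zero-crossing rate at the start lets a weak, noise-like consonant onset count as speech even when its energy is low, while the energy-only end rule ignores such low-energy frames after the digit.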


Engine Fault Diagnosis Using Sound Source Analysis Based on Hidden Markov Model (HMM기반 소음분석에 의한 엔진고장 진단기법)

  • Le, Tran Su;Lee, Jong-Soo
    • The Journal of Korean Institute of Communications and Information Sciences / v.39A no.5 / pp.244-250 / 2014
  • The most serious engine faults are those that occur within the engine. Traditional engine fault diagnosis is highly dependent on the engineer's technical skills and has a high failure rate. Neural networks and support vector machines have been proposed for use in a diagnosis model. In this paper, noisy sound from faulty engines is represented by Mel-frequency cepstrum coefficients, zero crossing rate, mean square, and fundamental frequency features, which are used in a hidden Markov model for diagnosis. Our experimental results indicate that the proposed method performs the diagnosis with a high accuracy rate of about 98% for all eight fault types.

Speech Feature Selection of Normal and Autistic children using Filter and Wrapper Approach

  • Akhtar, Muhammed Ali;Ali, Syed Abbas;Siddiqui, Maria Andleeb
    • International Journal of Computer Science & Network Security / v.21 no.5 / pp.129-132 / 2021
  • Two feature selection approaches are analyzed in this study. The first is the filter approach, which uses a correlation technique and provides two reduced feature sets, one from positive and one from negative correlation. The second is the wrapper approach, which uses the Sequential Forward Selection technique. The reduced feature set obtained from positive correlation comprises Rate of Acceleration, Intensity, and Formant. The reduced feature set obtained from negative correlation comprises Rasta PLP, Log energy, Log power, and Zero Crossing Rate. Sequential Forward Selection yields the reduced feature set Pitch, Rate of Acceleration, Log Power, MFCC, and LPCC.

Implementation of Variable Threshold Dual Rate ADPCM Speech CODEC Considering the Background Noise (배경잡음을 고려한 가변임계값 Dual Rate ADPCM 음성 CODEC 구현)

  • Yang, Jae-Seok;Han, Kyong-Ho
    • Proceedings of the KIEE Conference / 2000.07d / pp.3166-3168 / 2000
  • This paper proposes a variable threshold dual rate ADPCM coding method, modified from the standard ITU G.726 ADPCM, to improve speech quality. Using ZCR (Zero Crossing Rate), variable threshold dual rate ADPCM gives better speech quality than single rate ADPCM in noisy environments without increasing complexity. Here, ZCR is used to divide input signal samples into two categories (noise and speech): samples with higher ZCR are categorized as the noisy region, and samples with lower ZCR are categorized as the speech region. The noisy region uses a higher threshold value and is compressed at 16 kbps to reduce the bit rate, and the speech region uses a lower threshold value and is compressed at 40 kbps to improve speech quality. Compared with conventional ADPCM, which adopts a fixed coding rate, the proposed method improves noise characteristics without increasing the bit rate. For real-time applications, ZCR calculation was considered as a simple way to obtain background noise information as a preprocessing step for speech analysis such as FFT, and the experiment showed that the simple ZCR calculation can be used without increasing complexity. Dual rate ADPCM can efficiently decrease the amount of transferred data without increasing complexity or reducing speech quality. The results of this paper can therefore be applied to real-time speech applications such as internet phone or VoIP.
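
The ZCR-based split into noisy and speech regions might look roughly like the sketch below. Only the mapping (higher ZCR to 16 kbps, lower ZCR to 40 kbps) comes from the abstract. The per-frame decision, the function names, and the `zcr_boundary` value of 0.3 are hypothetical, and `zcr_boundary` is the ZCR decision level, not the ADPCM "threshold value" the abstract refers to.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for prev, cur in zip(frame, frame[1:])
        if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)


def select_bit_rate(frame, zcr_boundary=0.3):
    """Return an ADPCM coding rate in kbit/s for one frame.

    zcr_boundary is an illustrative value, not one reported in the paper.
    """
    # Higher ZCR: treated as the noisy region, coded at the lower rate.
    # Lower ZCR: treated as the speech region, coded at the higher rate.
    if zero_crossing_rate(frame) > zcr_boundary:
        return 16
    return 40
```

Counting sign changes needs only a comparison per sample, which fits the abstract's point that ZCR adds little complexity in real-time use.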


An acoustic sensor fault detection method based on root-mean-square crossing-rate analysis for passive sonar systems (수동 소나 시스템을 위한 실효치교차율 분석 기반 음향센서 결함 탐지 기법)

  • Kim, Yong Guk;Park, Jeong Won;Kim, Young Shin;Lee, Sang Hyuck;Kim, Hong Kook
    • The Journal of the Acoustical Society of Korea / v.36 no.1 / pp.30-38 / 2017
  • In this paper, we propose an underwater acoustic sensor fault detection method for passive sonar systems. In general, a passive sonar system displays the processed results of array signals obtained from tens of acoustic sensors as a two-dimensional image, such as displays for broadband or narrowband analysis. Because the detection result display in the operation software shows the result accumulated through array signal processing, it is difficult to determine whether the signal may be contaminated by the fault or failure of a single sensor channel. Accordingly, we propose a detection method based on the analysis of RMSCR (Root Mean Square Crossing-Rate), and analyze processing techniques for the faulty sensors. To evaluate the performance of the proposed method, the precision of detecting faulty sensors is measured using signals acquired from a real array operated in several coastal areas. We also compare the performance of fault processing techniques. The experiments show that the proposed method works well in underwater environments with high average RMS, and that muting (setting to zero) performs best among the fault processing techniques.