• Title/Summary/Keyword: multi-pitch extraction

Search Result 7, Processing Time 0.019 seconds

A Study on Multi-Pulse Speech Coding Method by Using Individual Pitch Information (개별 피치정보를 이용한 멀티펄스 음성부호화 방식에 관한 연구)

  • Lee, See-Woo
    • The Journal of the Korea Contents Association
    • /
    • v.6 no.2
    • /
    • pp.59-64
    • /
    • 2006
  • In this paper, 1 propose a new method of Multi-Pulse Coding(IP-MPC) use individual pitch pulses in order to accommodate the changes in each pitch interval and reduce pitch errors. The extraction rate of individual pitch pulses was $85\%$ for female voice and $96\%$ for male voice respectively, 1 evaluate the MPC by using pitch information of autocorrelation method and the IP-MPC by using individual pitch pulses. As a result, 1 knew that synthesis speech of the IP-MPC was better in speech quality than synthesis speech of the MPC.

  • PDF

A Study on Multi-Pulse Speech Coding Method by using Individual Pitch Pulses (개별 피치펄스를 이용한 멀티펄스 음성부호화 방식에 관한 연구)

  • 이시우
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.8 no.5
    • /
    • pp.977-982
    • /
    • 2004
  • In this paper, I propose a new method of Multi-Pulse Coding(IP-MPC) use individual pitch pulses in order to accommodate the changes in each pitch interval and reduce pitch errors. The extraction rate of individual pitch pulses was 85% for female voice and 96% for male voice respectively. 1 evaluate the MPC by using pitch information of autocorrelation method and the IP-MPC by using individual pitch pulses. As a result, I knew that synthesis speech of the IP-MPC was better in speech quality than synthesis speech of the MPC.

Extracting Predominant Melody from Polyphonic Music using Harmonic Structure (하모닉 구조를 이용한 다성 음악의 주요 멜로디 검출)

  • Yoon, Jea-Yul;Lee, Seok-Pil;Seo, Kyeung-Hak;Park, Ho-Chong
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.47 no.5
    • /
    • pp.109-116
    • /
    • 2010
  • In this paper, we propose a method for extracting predominant melody of polyphonic music based on harmonic structure. Since polyphonic music contains multiple sound sources, the process of melody detection consists of extraction of multiple fundamental frequencies and determination of predominant melody using those fundamental frequencies. Harmonic structure is an important feature parameter of monophonic signal that has spectral peaks at the integer multiples of its fundamental frequency. We extract all fundamental frequency candidates contained in the polyphonic signal by verifying the required condition of harmonic structure. Then, we combine those harmonic peaks corresponding to each extracted fundamental frequency and assign a rank to each after calculating its harmonic average energy. We finally run pitch tracking based on the rank of extracted fundamental frequency and continuity of fundamental frequency, and determine the predominant melody. We measure the performance of proposed method using ADC 2004 DB and 100 Korean pop songs in terms of MIREX 2005 evaluation metrics, and pitch accuracy of 90.42% is obtained.

A Study on Multi-Pulse Speech Coding Method by Using V/S/TSIUVC (V/S/TSIUVC를 이용한 멀티펄스 음성부호화 방식에 관한 연구)

  • Lee See-Woo
    • Journal of Korea Multimedia Society
    • /
    • v.7 no.9
    • /
    • pp.1233-1239
    • /
    • 2004
  • In a speech coding system using excitation source of voiced and unvoiced, it would be involved a distortion of speech qualify in case coexist with a voiced and an unvoiced consonants in a frame. This paper present a new multi-pulse coding method by using V/S/TSIUVC switching, individual pitch pulses and TSIUVC approximation-synthesis method in order to restrict a distortion of speech quality. The TSIUVC is extracted by using the zero crossing rate and individual pitch pulse. And the TSIUVC extraction rate was 91% for female voice and 96.2% for male voice respectively. The important thing is that the frequency information of 0.347kHz below and 2.813kHz above can be made with high quality synthesis waveform within TSIUVC. I evaluate the MPC use V/UV and the FBD-MPC use V/S/TSIUVC. As a result, I knew that synthesis speech of the FBD-MPC was better in speech quality than synthesis speech of the MPC.

  • PDF

Dysarthric speaker identification with different degrees of dysarthria severity using deep belief networks

  • Farhadipour, Aref;Veisi, Hadi;Asgari, Mohammad;Keyvanrad, Mohammad Ali
    • ETRI Journal
    • /
    • v.40 no.5
    • /
    • pp.643-652
    • /
    • 2018
  • Dysarthria is a degenerative disorder of the central nervous system that affects the control of articulation and pitch; therefore, it affects the uniqueness of sound produced by the speaker. Hence, dysarthric speaker recognition is a challenging task. In this paper, a feature-extraction method based on deep belief networks is presented for the task of identifying a speaker suffering from dysarthria. The effectiveness of the proposed method is demonstrated and compared with well-known Mel-frequency cepstral coefficient features. For classification purposes, the use of a multi-layer perceptron neural network is proposed with two structures. Our evaluations using the universal access speech database produced promising results and outperformed other baseline methods. In addition, speaker identification under both text-dependent and text-independent conditions are explored. The highest accuracy achieved using the proposed system is 97.3%.

A Study on ACFBD-MPC in 8kbps (8kbps에 있어서 ACFBD-MPC에 관한 연구)

  • Lee, See-Woo
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.17 no.7
    • /
    • pp.49-53
    • /
    • 2016
  • Recently, the use of signal compression methods to improve the efficiency of wireless networks have increased. In particular, the MPC system was used in the pitch extraction method and the excitation source of voiced and unvoiced to reduce the bit rate. In general, the MPC system using an excitation source of voiced and unvoiced would result in a distortion of the synthesis speech waveform in the case of voiced and unvoiced consonants in a frame. This is caused by normalization of the synthesis speech waveform in the process of restoring the multi-pulses of the representation segment. This paper presents an ACFBD-MPC (Amplitude Compensation Frequency Band Division-Multi Pulse Coding) using amplitude compensation in a multi-pulses each pitch interval and specific frequency to reduce the distortion of the synthesis speech waveform. The experiments were performed with 16 sentences of male and female voices. The voice signal was A/D converted to 10kHz 12bit. In addition, the ACFBD-MPC system was realized and the SNR of the ACFBD-MPC estimated in the coding condition of 8kbps. As a result, the SNR of ACFBD-MPC was 13.6dB for the female voice and 14.2dB for the male voice. The ACFBD-MPC improved the male and female voice by 1 dB and 0.9 dB, respectively, compared to the traditional MPC. This method is expected to be used for cellular telephones and smartphones using the excitation source with a low bit rate.

A study on Speech Coding Method using V/S/TSIUVC Switching (V/S/TSIUVC 스위칭을 이용한 음성부호화 방식에 관한 연구)

  • Lee, See-Woo
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.7 no.6
    • /
    • pp.1180-1184
    • /
    • 2006
  • In a speech coding system using excitation source of voiced and unvoiced, it would be a distortion of speech quality in a voiced and an unvoiced consonants in a frame. In this paper, I propose a new multi-pulse coding method make use of V/S/TSIUVC switching and TSIUVC approximation-synthesis method in order to restrict a distortion of speech quality. The TSIUVC is extracted by using the zero crossing rate and individual pitch pulse. And the TSIUVC extraction rate was 91% for female voice and 96.2% for male voice. The important thing is that the frequency information of 0.547kHz below and 2.813kHz above can be made with high quality synthesis waveform within TSIUVC. I evaluated the MPC of V/UV and FBD-MPC of V/S/TSIUVC. As a result, the synthesis speech of FBD-MPC was better in speech quality than the MPC.

  • PDF