Search | Korea Science

A High Speed Pitch Extraction Method Based on Peak Detection and AMDF (Peak 검출과 AMDF에 의한 고속도 음성주기 추출방법)

성원용;은종관
- Journal of the Korean Institute of Telematics and Electronics
- Vol. 17, No. 4
- pp. 38-44
- 1980
We present a high speed pitch estimation algorithm that is based on peak detection and the average magnitude difference function (AMDF). A few pitch candidates are first estimated from the low-pass filtered (800 Hz) speech by a peak detection algorithm. AMDF values of the pitch candidates are then calculated, and the pitch candidate that yields the minimum AMDF value is chosen as the desired pitch period. The new method requires far less computation time than other pitch estimation algorithms, while it yields fairly accurate results.
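The two-stage idea in this abstract (a few candidates from peak detection, then AMDF to choose among them) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the way candidates are derived from peak spacing, and the lag bounds are all assumptions made for illustration, and the input is assumed to be already low-pass filtered.

```python
def amdf(x, lag):
    """Average magnitude difference between x and x shifted by `lag` samples."""
    n = len(x) - lag
    return sum(abs(x[i] - x[i + lag]) for i in range(n)) / n


def peak_candidates(x, min_lag, max_lag):
    """Hypothetical candidate rule: distances from the largest positive
    local maximum to every other positive local maximum, kept only if
    they fall inside [min_lag, max_lag]."""
    peaks = [i for i in range(1, len(x) - 1)
             if x[i - 1] < x[i] >= x[i + 1] and x[i] > 0]
    if not peaks:
        return []
    ref = max(peaks, key=lambda i: x[i])
    lags = {abs(p - ref) for p in peaks if p != ref}
    return sorted(lag for lag in lags if min_lag <= lag <= max_lag)


def estimate_pitch_period(x, min_lag, max_lag):
    """Return the candidate lag with the smallest AMDF, or None if there are no candidates."""
    candidates = peak_candidates(x, min_lag, max_lag)
    if not candidates:
        return None
    return min(candidates, key=lambda lag: amdf(x, lag))
```

Because AMDF is evaluated only at the handful of candidate lags rather than at every lag in the search range, the cost per frame scales with the number of candidates, which is consistent with the speed advantage the abstract claims.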

The Pitch Beginning Point Extraction Using Property of G-peak (G-Peak의 특성에 의한 피치시점검출)

이해군
- Proceedings of the Acoustical Society of Korea Conference
- Proceedings of the Acoustical Society of Korea 1993 Conference, Vol. 12, No. 1
- pp. 259-262
- 1993
In this paper, a new pitch beginning point detection method based on extracting the G-peak is proposed. According to the speech production model, the area of the first peak in a pitch interval of a speech signal is emphasized. Using this characteristic, the method has advantages over other methods for pitch beginning point detection. Erroneous decisions caused by impulsive noise are minimized, and pre-filtering is unnecessary, because the signal is integrated as part of the process.

Pitch Detection Using Variable Bandwidth LPF (가변 대역폭 LPF를 이용한 피치 검출)

Keum, Hong;Baek, Guem-Ran;Bae, Myung-Jin;Jang, Ho-Sung
- The Journal of the Acoustical Society of Korea
- Vol. 13, No. 5
- pp. 77-82
- 1994
In speech signal processing, it is very important to detect the pitch exactly. Although various methods for detecting the pitch of speech signals have been developed, it is difficult to extract the pitch exactly for a wide range of speakers and various utterances. We therefore propose a new pitch detection algorithm that takes advantage of G-peak extraction. The method detects the pitch period of voiced signals by finding the MZCI (maximum zero-crossing interval) of the G-peak, which is defined as the cut-off bandwidth rate of the LPF (low-pass filter). The algorithm performs robustly, with a gross error rate of 3.63% even in a 0 dB SNR environment. The gross error rate for clean speech is only 0.18%. All processing stages also run at high speed.

A Study on the Energy Extraction Using G-peak from the Speech Production Model (음성발생 모델로부터의 G-peak를 이용한 음성에너지 추출에 관한 연구)

Bae, Myungjin;Rheem, Jaeyeol;ANN, Souguil
- Journal of the Korean Institute of Telematics and Electronics
- Vol. 24, No. 3
- pp. 381-386
- 1987
According to the speech production model, the first positive peak in a pitch interval of voiced speech is mainly affected by the glottis and the first formant component, which is known as a typical energy source of voiced speech. Based on these characteristics, the energy parameter, which is generally used for classification of speech signals, can be replaced by the area of the positive peak in a pitch interval. In this method, the modified energy parameter is independent of the window length applied for analysis, and the pitch can be extracted simultaneously. Furthermore, the energy can be extracted in units of the pitch period.
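As a rough illustration of using a peak area in place of a windowed energy measure, the sketch below sums the samples of the first positive lobe in one pitch interval. The function name and the definition of the lobe (the first contiguous run of strictly positive samples) are assumptions for illustration only; the abstract does not specify how the authors delimit the peak.

```python
def first_positive_lobe_area(interval):
    """Sum of the first contiguous run of strictly positive samples.

    `interval` is assumed to hold the samples of a single pitch interval.
    Returns 0 if the interval contains no positive sample.
    """
    area = 0
    in_lobe = False
    for sample in interval:
        if sample > 0:
            in_lobe = True
            area += sample
        elif in_lobe:
            # The first positive lobe has ended; later lobes are ignored.
            break
    return area
```

Since the sum runs over one lobe of one pitch interval, the result does not depend on an analysis window length, which matches the property the abstract highlights.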

Characteristics on the response of the stern trawler according to the state of its operation (선미트롤어선의 운항 형태에 따른 거동 특성)

PARK, Chi-Wan;KIM, Jong-Wha;KIM, Hyong-Seok;KANG, Il-Kwon
- Journal of the Korean Society of Fisheries and Ocean Technology
- Vol. 52, No. 4
- pp. 339-346
- 2016
The aim of this research was to analyze experimental data using statistical and spectral analysis methods to obtain the motion responses of a stern trawler in operating states such as drifting, sailing, and trawling according to the wave height. In drifting, the significant and maximum values of roll in beam sea increased with the wave height, but those of pitch decreased. The response and the peak period of roll in beam sea increased, but those of pitch decreased. In navigation, the significant and maximum values of roll increased remarkably with the wave height, but those of pitch changed little. The response of roll was highest in quartering sea, beam sea, and then following sea, in that order, whereas that of pitch was highest in bow sea, head sea, and then beam sea, in that order, at all wave heights. Depending on the wave height and the wave direction, the peak period of roll ranged from 3.8 to 9.9 seconds, and that of pitch ranged from 3.3 to 10.4 seconds. In trawling, the significant and maximum values of roll increased slightly with the wave height, but those of pitch increased significantly. The response of roll was highest in beam sea, bow sea, and then quartering sea, in that order, whereas that of pitch was highest in head sea, following sea, and then beam sea, in that order. Depending on the wave height and the wave direction, the peak period of roll ranged from 6.6 to 10.9 seconds, and that of pitch ranged from 6.7 to 11.2 seconds.
https://doi.org/10.3796/KSFT.2016.52.4.339

Pitch Accent Realization in North Kyungsang Korean: Tonal Alignment as a Function of Nasal Position in Syllables

Sohn, Hyang-Sook
- Phonetics and Speech Sciences
- Vol. 3, No. 2
- pp. 37-52
- 2011
This study investigates patterns of the alignment of the accentual peaks in bisyllabic words of the CVNCV, CVNV, and CVNNV structures in North Kyungsang Korean. Based on the tonal alignment, patterns of the F0 pitch excursion are discussed relative to one another. Issues are addressed concerning how the tonal targets are aligned, and how the tonal specifications of nasals in postvocalic, intervocalic, and prevocalic environments are supplied in the LH, HL, and HH classes. Tonal specification of nasals in various environments is accounted for by extension of the L target, displacement of the pitch peak, and interpolation between two tonal targets, depending on the tonal class. The results in this study provide preliminary evidence that the categorical alignment of the tonal targets is implemented by simply checking the presence or absence of a nasal before or after the nucleus vowel on the segmental string, without reference to the constituency of the nasal in the syllable structure. However, the prosodic structure has a key role to play in explaining speaker-dependent variations in the tonal alignment. Sensitivity to tautosyllabicity has an effect on the shape of the F0 contour, and disparity in the patterns of the pitch excursion is represented as a function of syllable structure correlated with segmental composition of the nasal.

A Study of the Prosodic Characteristics of Homographs with Context Cues by Subjects with Right and Left Hemisphere Damage (문맥 내에서 좌우반구 손상자의 동음어에 대한 운율 산출 비교)

Lee, Myoung-Soon
- Phonetics and Speech Sciences
- Vol. 2, No. 1
- pp. 13-21
- 2010
The purpose of this study was to examine the prosodic characteristics of sentence-level utterances containing homographs with context cues in patients with neurogenic communication disorders. Homographs that may be affected by prosody, especially tonic length features, were used to investigate this matter. Tone, duration, pitch, and pitch peak were analyzed to examine the characteristics of prosody in patients with lesions in the left or right hemisphere and in normal controls. The whole process was recorded using Praat 4.3.14. For the statistical analyses, three-way ANOVA and multiple comparative analyses, Chi-Square tests, and a one-way ANOVA were carried out using SPSS 12.0 for Windows. The conclusions of this study are as follows. First, the length of syllables and vowels in Korean homographs differed depending on the meaning, but the difference between groups was not significant. Second, patients with lesions in the right hemisphere showed a significant difference in pitch. Third, the frequency of the pitch peak and the tone in 'short' tone syllables differed between groups. Overall, the prosody of homographs was not categorically differentiated between groups. Accordingly, more detailed studies are needed of acoustic and other parameters that could reveal prosodic differences between groups.

A Study on Speaker Recognition using the Peak and valley pitch detection and the Fuzzy (국부 봉우리와 골에 의한 피치 검출과 퍼지를 이용한 화자 인식에 관한 연구)

김연숙;김희주;김경재
- Journal of the Korea Institute of Information and Communication Engineering
- Vol. 8, No. 1
- pp. 213-219
- 2004
This paper proposes a speaker recognition algorithm that includes a pitch parameter based on local peaks and valleys. The time-frequency hybrid method for pitch extraction is valuable in that it can improve resolution in the time domain and accuracy in the frequency domain at the same time. The algorithm builds reference patterns using membership functions and performs recognition of common vocal tract characteristics using fuzzy pattern matching, in order to accommodate the range of time variation in non-linear utterances. For the proposed method, speaker recognition experiments are carried out using vowels and number sounds.

A Study on the Pitch Detection of Speech Harmonics by the Peak-Fitting (음성 하모닉스 스펙트럼의 피크-피팅을 이용한 피치검출에 관한 연구)

Kim, Jong-Kuk;Jo, Wang-Rae;Bae, Myung-Jin
- Speech Sciences
- Vol. 10, No. 2
- pp. 85-95
- 2003
In speech signal processing, exact pitch detection is very important for speech recognition, synthesis, and analysis. If the pitch is detected exactly, it can be used in analysis to obtain the vocal tract parameters properly, in synthesis to easily change or maintain naturalness and intelligibility, and in recognition to remove speaker individuality for speaker independence. In this paper, we propose a new pitch detection algorithm. First, positive center clipping is performed in the time domain using the slope of the speech signal, in order to emphasize the pitch period with the glottal component after the vocal tract characteristic has been removed. Next, a rough formant envelope is computed by peak-fitting the spectrum of the original speech signal in the frequency domain. A smoothed formant envelope is then obtained from the rough envelope by linear interpolation. The flattened harmonics waveform is obtained as the algebraic difference between the spectrum of the original speech signal and the smoothed formant envelope. Applying the inverse fast Fourier transform (IFFT) to the flattened harmonics yields a residual signal from which the vocal tract component has been removed. The performance was compared with the LPC, cepstrum, and ACF methods. With this algorithm, the accuracy of pitch detection improved, and the gross error rate was reduced in voiced speech regions and in transition regions where the phoneme changes.
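The first step named in this abstract, positive center clipping, can be sketched in its generic textbook form. The abstract says the authors derive the clipping from the slope of the speech signal, and that variant is not reproduced here; the fixed-ratio threshold and the function name below are assumptions for illustration.

```python
def positive_center_clip(frame, ratio=0.5):
    """Generic positive center clipping.

    Samples above the threshold are reduced by the threshold; all other
    samples (including every negative sample) are set to zero. The
    threshold is `ratio` times the largest sample in the frame, which is
    an assumed rule, not the slope-based rule described in the abstract.
    """
    peak = max(frame)
    if peak <= 0:
        return [0.0 for _ in frame]
    threshold = ratio * peak
    return [s - threshold if s > threshold else 0.0 for s in frame]
```

Clipping of this kind suppresses the low-amplitude formant ripple between the main excitation peaks, so the surviving pulses tend to be spaced at the pitch period.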

A Study on the Characteristics of the Korean Adult Female Sound According to Sasang Constitution Using PSSC with a Sentence (사상체질음성분석기(四象體質音聲分析機)(PSSC)를 통한 한국인 성인여성(成人女性)의 체질별(體質別) 음향특성연구(音響特性硏究) - 단문(短文)을 중심으로 -)

Youn, Ji-Young;Yoon, Woo-Young;Cho, Sung-Eon;Wang, Hyang-Lan;Jeon, Jong-Weon;Kim, Dal-Rae;Yoo, Jun-Sang
- Journal of Sasang Constitutional Medicine
- Vol. 18, No. 3
- pp. 75-93
- 2006
1. Objectives and Methods
Sasang Constitutional Medicine is the original Korean Medicine. The purpose of this study was to objectify the diagnosis of Sasang Constitution. 212 women's sentences were analyzed into 228 factors such as Pitch, APQ, Shimmer, Octave, and Energy. The women's sentences were classified into 3 categories: total group, under 54 years old group, and over 55 years old group.
2. Results
1) In the total group, Soyangin's Center freq.(3) was significantly high compared with the Taeyangin and Taeumin groups. Taeumin's Pitch2 was significantly high compared with the Soeumin and Taeyangin groups. Taeyangin's Pitch S.D. was significantly high compared with the Soyangin group. Taeyangin's Octave6 was significantly high compared with the Soeumin group. There were no significant differences among constitutional groups in the APQ and Shimmer segments. In terms of Energy, Taeyangin's G Tot E(1), G# Tot E(1), G dev.(1), G# dev.(1), G Tot E(2), G# Tot E(2), G dev.(4), and G# dev.(4) were significantly high compared with other groups. Soyangin's A#S.D.(2) was significantly high compared with the Taeyangin group. Taeyangin's A#S.D.(3) was significantly high compared with the Taeumin group. Taeyangin's F S.D.(5), F# S.D.(5), and Max Average were significantly high compared with the Soeumin group. Taeumin's Peak3 and Peak4 were significantly high compared with the Taeyangin group. Taeumin's PeakValue1 was significantly high compared with the Soeumin group. Taeyangin's PeakValue2 was significantly high compared with the Soeumin group. Taeyangin's PeakValue3 and PeakValue5 were significantly high compared with other groups.
2) In the under 54 years old group, there were no significant differences among constitutional groups in the APQ, Shimmer, and Octave segments. Taeumin's Center freq.(2) was significantly high compared with the Taeyangin and Soyangin groups. Taeumin's Pitch(2) and Pitch(3) were significantly high compared with the Taeyangin and Soeumin groups. Taeyangin's and Taeumin's Pitch S.D. were significantly high compared with the Soyangin group. Taeyangin's and Soyangin's Octave2 were significantly high compared with the Taeumin group. In terms of Energy, Taeyangin's and Soyangin's A# S.D.(2) were significantly high compared with the Soeumin group. Taeyangin's and Soyangin's G# dev.(1) and G# dev.(2) were significantly high compared with the Taeumin group. Taeyangin's and Taeumin's F# S.D.(3) were significantly high compared with the Soeumin group. Taeyangin's and Soyangin's Max Average were significantly high compared with the Soeumin group. Taeumin's Peak3 was significantly high compared with the Taeyangin and Soeumin groups. Taeyangin's and Taeumin's PeakValue2 were significantly high compared with the Soeumin group. Taeyangin's and Soeumin's PeakValue3 were significantly high compared with the Taeumin group. Taeyangin's and Soyangin's PeakValue5 were significantly high compared with the Soeumin group. Taeyangin's and Soyangin's PeakValue9 were significantly high compared with the Taeumin group.
3) In the over 55 years old group, there were no significant differences among constitutional groups in the Pitch, APQ, and Peak segments. Soeumin's F Shimmer(1) and F Shimmer(2) were significantly high compared with the Taeyangin and Taeumin groups. Soeumin's G# Shimmer(1) and G# Shimmer(2) were significantly high compared with the Soyangin group. Taeyangin's Octave5 and Octave6 were significantly high compared with the Soeumin group. In terms of Energy, Soyangin's C S.D., F# S.D.(1), F# S.D.(2), and G dev.(2) were significantly high compared with other groups. Soyangin's F# S.D.(3) was significantly high compared with the Taeumin and Soeumin groups. Taeyangin's and Taeumin's G# S.D.(2) and G# S.D.(3) were significantly high compared with the Soyangin group.
3. Conclusions
From the above results, voice analysis may provide an efficient standard guide for constitution diagnosis.