Search | Korea Science

A Korean Speech Recognition Using Fuzzy Rule Base (Korean Continuous Speech Recognition Using a Fuzzy Rule Base)

Song, Jeong-Young
- The Journal of Engineering Research / v.2 no.1 / pp.13-21 / 1997
This paper describes how to represent variations of feature parameters to improve recognition of continuous speech. Speech recognition generally uses feature parameters such as formant frequencies, pitch, logarithmic energy, and zero-crossing rate. However, their values and variations depend on the speaker, for example on differences between men and women and on age, so it is difficult to decide the width of the variation a priori. Hence, we represent this variation by introducing fuzziness and recognize continuous speech by fuzzy inference using fuzzy production rules.
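
The idea of replacing a fixed variation width with a fuzzy one can be illustrated with a minimal sketch. The triangular membership shape, the formant ranges, the rule itself, and the function names `triangular` and `rule_strength` are all hypothetical and chosen only for illustration; the abstract does not state the paper's actual membership functions or rules.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def rule_strength(f1_hz, f2_hz):
    """Firing strength of one hypothetical rule: IF F1 is low AND F2 is high."""
    # Hypothetical ranges in Hz; a real system would tune them on speaker data.
    f1_low = triangular(f1_hz, 200.0, 300.0, 400.0)
    f2_high = triangular(f2_hz, 1800.0, 2300.0, 2800.0)
    # Taking the minimum is a common choice for fuzzy AND.
    return min(f1_low, f2_high)


print(rule_strength(300.0, 2300.0))
print(rule_strength(350.0, 2300.0))
```

A speaker whose first formant sits halfway toward the edge of the "low" range still fires the rule, only with reduced strength, instead of being rejected by a hard threshold.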

Threshold Crossing Rate, Phase Distribution and Group Properties of Nonlinear Random Waves of Finite Bandwidth (Threshold Crossing Rate, Phase Distribution and Wave Group Properties in Nonlinear Random Wave Trains with Finite Bandwidth)

Jo, Yong-Jun
- Journal of Korea Water Resources Association / v.30 no.3 / pp.225-233 / 1997
The nonlinear effects on the statistical properties of wave groups, in terms of the average number of waves in a group and the mean number of waves in a high run, are studied in this paper using the complex envelope and total phase function, a random variable transformation technique, and a perturbation method. It turns out that the phase distribution is modified significantly by nonlinearities: it shows a systematic excess of values near the mean phase and a corresponding symmetrical deficiency on both sides away from the mean. The threshold crossing rate reaches its maximum just below the mean water level rather than zero, and a considerable amount of probability mass is shifted toward larger values of water surface elevation as the nonlinearity becomes more pronounced. Furthermore, the mean number of waves in a high run for nonlinear waves is shown to be larger than its linear counterpart. A similar trend is found in the average number of waves in a group.

Voiced, Unvoiced, and Silence Classification of human speech signals by emphasis characteristics of spectrum (Voiced-Unvoiced-Silence Classification of Speech Signals Using Spectrum Emphasis Characteristics)

배명수;안수길
- The Journal of the Acoustical Society of Korea / v.4 no.1 / pp.9-15 / 1985
In this paper, we describe a new algorithm for deciding whether a given segment of a speech signal is voiced speech, unvoiced speech, or silence, based on measurements made on the signal. The parameter measured for the voiced-unvoiced classification is the area of each zero-crossing interval, obtained by multiplying the magnitude by the inverse zero-crossing rate of the speech signal. The parameter used for the unvoiced-silence classification is the sum of the positive areas over each four-millisecond interval of the high-frequency-emphasized speech signal.
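
The voiced-unvoiced parameter described above, magnitude multiplied by the inverse zero-crossing rate, can be sketched roughly as follows. This is only a frame-level approximation under stated assumptions: the use of mean absolute magnitude, the 8 kHz sampling rate, the synthetic test frames, and the function names are illustrative and are not taken from the paper.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def area_per_crossing_interval(frame):
    """Mean magnitude times the inverse zero-crossing rate (assumed proxy)."""
    zcr = zero_crossing_rate(frame)
    mean_mag = sum(abs(x) for x in frame) / len(frame)
    # Guard against frames with no crossings (e.g. DC offset or pure silence).
    return mean_mag / zcr if zcr > 0 else float("inf")


# Synthetic 20 ms frames at an assumed 8 kHz sampling rate.
fs, n = 8000, 160
voiced_like = [math.sin(2 * math.pi * 150 * t / fs) for t in range(n)]
unvoiced_like = [0.1 * math.sin(2 * math.pi * 3000 * t / fs) for t in range(n)]

print(area_per_crossing_interval(voiced_like) > area_per_crossing_interval(unvoiced_like))
```

A strong low-frequency frame has large magnitude and few crossings, so the parameter is large; a weak high-frequency frame gives a small value, which is what makes a threshold on this quantity plausible for separating the two classes.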

Recognition of Korean Isolated Digits Using a Pole-Zero Model (Korean Isolated Digit Recognition Using a Pole-Zero Model)

;;Alan Conrad Bovik
- Journal of the Korean Institute of Telematics and Electronics / v.25 no.4 / pp.356-365 / 1988
In this paper, we describe an isolated-word recognition system for Korean isolated digits based on a voiced-unvoiced decision algorithm and frequency-domain analysis. The algorithm first performs a voiced-unvoiced decision on the beginning part of each uttered word, using the normalized log energy and zero-crossing rate as decision parameters. Based on this decision, each word is assigned to one of two classes. To identify the uttered word within each class, a dynamic time warping algorithm is applied using formant frequencies as the basis for the distance measure. We use pole-zero analysis to measure formant frequencies in each frame. We have observed that pole-zero analysis can estimate formant frequencies more accurately than analysis based on poles only. In experiments, the recognition system achieved a recognition rate of 97.3%.
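
The dynamic time warping step can be sketched with the textbook recurrence. This is a generic formulation, not the paper's implementation: the Euclidean local cost, the lack of path constraints, and the (F1, F2) formant tracks below are assumptions made for illustration.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic DTW with a Euclidean local cost between feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] is the best cumulative cost aligning seq_a[:i] with seq_b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Hypothetical (F1, F2) formant tracks in Hz, one tuple per frame.
template = [(300, 2300), (400, 2000), (700, 1200)]
stretched = [(300, 2300), (300, 2300), (400, 2000), (700, 1200), (700, 1200)]
reversed_track = [(700, 1200), (400, 2000), (300, 2300)]

print(dtw_distance(template, stretched))
print(dtw_distance(template, reversed_track) > 0)
```

The slower utterance of the same formant trajectory aligns with zero cost, while a different trajectory does not; picking the reference template with the smallest distance gives the recognized word within a class.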

A New Endpoint Detection Method Based on Chaotic System Features for Digital Isolated Word Recognition System

Zang, Xian;Chong, Kil-To
- Proceedings of the IEEK Conference / 2009.05a / pp.37-39 / 2009
In speech recognition research, locating the beginning and end of a speech utterance against background noise is of great importance. Background noise in a recording disturbs the stationary parameters used to represent the corresponding speech section, and a major source of error in automatic recognition of isolated words is inaccurate detection of the beginning and ending boundaries of test and reference templates, so an effective method is needed to remove the unnecessary regions of a speech signal. Conventional methods for speech endpoint detection are based on two simple time-domain measurements, short-time energy and short-time zero-crossing rate, which cannot guarantee precise results in low signal-to-noise ratio environments. This paper proposes a novel approach that finds the Lyapunov exponent of the time-domain waveform. The proposed method does not need frequency-domain parameters for endpoint detection, e.g. Mel-scale features, which have been introduced in other papers. Compared with the conventional methods based on short-time energy and short-time zero-crossing rate, the novel approach based on time-domain Lyapunov exponents (LEs) has low complexity and is suitable for a digital isolated word recognition system.
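
For context, the conventional baseline mentioned above (short-time energy plus short-time zero-crossing rate) can be sketched as below. This is not the Lyapunov-exponent method the paper proposes; the non-overlapping frames, the thresholds, the boundary-extension rule, and the synthetic signal are all assumptions for illustration.

```python
import math


def frame_features(signal, frame_len):
    """Return (energy, zcr) for each non-overlapping frame."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def detect_endpoints(signal, frame_len, energy_thresh, zcr_thresh):
    """First and last frame index judged to be speech, or None."""
    feats = frame_features(signal, frame_len)
    active = [i for i, (e, _) in enumerate(feats) if e > energy_thresh]
    if not active:
        return None
    start, end = active[0], active[-1]
    # Extend outward over low-energy frames with high ZCR (weak fricatives).
    while start > 0 and feats[start - 1][1] > zcr_thresh:
        start -= 1
    while end < len(feats) - 1 and feats[end + 1][1] > zcr_thresh:
        end += 1
    return start, end


# Synthetic signal: 3 silent frames, 1 weak high-ZCR frame, 2 loud frames, 2 silent.
signal = (
    [0.0] * 240
    + [0.01 * (-1) ** t for t in range(80)]
    + [math.sin(2 * math.pi * 200 * t / 8000) for t in range(160)]
    + [0.0] * 160
)
print(detect_endpoints(signal, 80, 0.01, 0.5))  # (3, 5)
```

With added noise, both the energy and the zero-crossing rate of the silent frames rise, so fixed thresholds like these start to misplace the boundaries, which is the weakness at low signal-to-noise ratio that the abstract points to.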

Optimal Characteristics of a Long-pulse $CO_2$ Laser by Controlling SCR Firing Angle in AC Power Line

Noh, Ki-Kyung;Kim, Geun-Yong;Chung, Hyun-Ju;Min, Byoung-Dae;Song, Keun-Ju;Kim, Hee-Je
- KIEE International Transactions on Electrophysics and Applications / v.2C no.6 / pp.304-308 / 2002
We demonstrate a simple pulsed $CO_2$ laser with a millisecond-long pulse duration in a tube at a low pressure of less than 30 Torr. The novel power supply for our laser system switches the voltage of the AC power line (60 Hz) directly. The power supply does not need elements such as a rectifier bridge, energy-storage capacitors, or a current-limiting resistor in the discharge circuit. To control the laser output power, the pulse repetition rate is adjusted up to 60 Hz and the firing angle of the SCR (Silicon Controlled Rectifier) gate is varied from 30° to 150°. A ZCS (Zero Crossing Switch) circuit and a PIC one-chip microprocessor are used to control the gate signal of the SCR precisely. The maximum laser output of 35 W is obtained at a total pressure of 18 Torr, a pulse repetition rate of 60 Hz, and an SCR gate firing angle of 90°. In addition, the resulting laser pulse width is approximately 3 ms (FWHM). This is a relatively long pulse width compared with other repetitively pulsed $CO_2$ lasers.
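
In phase control of this kind, a firing angle corresponds to a delay after the zero crossing of the line voltage, equal to the angle divided by 360° times the line period. The sketch below applies that standard relation to the angles quoted in the abstract; the function name `firing_delay_ms` is hypothetical, and this is not the authors' microprocessor code.

```python
def firing_delay_ms(angle_deg, line_freq_hz=60.0):
    """Delay in ms from the line zero crossing to the SCR gate pulse."""
    period_ms = 1000.0 / line_freq_hz
    return angle_deg / 360.0 * period_ms


for angle in (30.0, 90.0, 150.0):
    print(angle, round(firing_delay_ms(angle), 3))
```

On a 60 Hz line one period is about 16.67 ms, so the quoted 30° to 150° range corresponds to gate delays of roughly 1.39 ms to 6.94 ms, and the 90° angle at which maximum output is reported falls at about 4.17 ms, the peak of the half cycle.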

A Study on TSIUVC Approximate-Synthesis Method using Least Mean Square (A Study on a TSIUVC Approximate-Synthesis Method Using the Least Squares Method)

Lee, See-Woo
- The KIPS Transactions:PartB / v.9B no.2 / pp.223-230 / 2002
In a speech coding system that uses voiced and unvoiced excitation sources, the speech waveform is distorted when a voiced sound and an unvoiced consonant coexist in a frame. This paper presents a new method of TSIUVC (Transition Segment Including Unvoiced Consonant) approximate synthesis using Least Mean Square. The TSIUVC extraction is based on the zero-crossing rate and an IPP (Individual Pitch Pulses) extraction algorithm using the residual signal of a FIR-STREAK digital filter. As a result, the method obtains a high-quality approximate-synthesis waveform using Least Mean Square. Importantly, a low-distortion approximate-synthesis waveform can be produced from the frequency signals in the maximum error signal. The method can be applied to a new Voiced/Silence/TSIUVC speech coding scheme, to speech analysis, and to speech synthesis.
https://doi.org/10.3745/KIPSTB.2002.9B.2.223

Stereo Matching using Dynamic Programming in Scale-Space (Stereo Matching Using Dynamic Programming in Scale Space)

최우영;박래홍
- Journal of the Korean Institute of Telematics and Electronics B / v.29B no.8 / pp.44-53 / 1992
This paper proposes a matching method that improves the correct matching rate in stereo correspondence by using the fingerprint of zero-crossing points in scale space as a robust matching feature. Dynamic programming, which suits the fingerprint feature, is introduced for correspondence matching. Post-processing that corrects mismatched points further improves the matching rate. In simulations, we apply the proposed algorithm to synthetic and real images and obtain good matching results.

A Novel Algorithm for Discrimination of Voiced Sounds (A Study on an Algorithm for Detecting Voiced Segments)

Jang, Gyu-Cheol;Woo, Soo-Young;Yoo, Chang-D.
- Speech Sciences / v.9 no.3 / pp.35-45 / 2002
A simple algorithm for discriminating voiced sounds in speech is proposed. In addition to low-frequency energy and zero-crossing rate (ZCR), both of which have been widely used for identifying voiced sounds, the proposed algorithm incorporates pitch variation to improve the discrimination rate. Evaluation on the TIMIT corpus shows a 13% improvement in the discrimination of voiced phonemes over the traditional algorithm that uses only energy and ZCR.

A Study on the Endpoint Detection Algorithm (A Study on the Endpoint Detection Algorithm)

양진우
- Proceedings of the Acoustical Society of Korea Conference / 1984.12a / pp.66-69 / 1984
This paper is a study on endpoint detection for Korean speech recognition. For speech signal processing, the analysis parameters considered are zero crossing rate (Z.C.R), log energy (L.E), and energy in the prediction error (Ep), and the basic Korean spoken digits /영/ through /구/ (zero through nine) are selected as data for speech recognition. The main goal of this paper is to develop techniques and a system for speech input to machines. To detect the endpoint, this paper selects log energy (L.E) from among the analyzed parameters, since log energy is a very effective parameter for classifying speech and nonspeech segments. The analysis yields an error rate of 1.43%.

