Search | Korea Science

Design of A Speech Recognition System using Hidden Markov Models (은닉 마코프 모델을 이용한 음성 인식 시스템 설계)

Lee, Chul-Won;Lim, In-Chil
- Journal of the Korean Institute of Telematics and Electronics B / v.33B no.1 / pp.108-115 / 1996
This paper proposes an algorithm and a model topology for connected speech recognition using discrete hidden Markov models. The proposed model uses diphone and triphone models, chosen with the recognition rate and the recognisable vocabulary in mind. For more exact inter-phoneme segmentation and faster execution, the diphone model has 4 states: the first and last states are steady states and the other two are transient states. The triphone model has 7 states, refined into 3 steady states and 4 transition states. The proposed recognition algorithm is also designed to detect inter-phoneme segment boundaries during recognition.
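The state layout described in this abstract can be illustrated with a left-to-right transition matrix. The following is a minimal sketch, not the authors' implementation: the function name, the self-loop probabilities, and the ordering of steady (S) and transient (T) states in the 7-state triphone model are assumptions, since the abstract gives only the state counts.

```python
def left_to_right_transitions(kinds, stay_steady=0.8, stay_transient=0.5):
    """Build a left-to-right HMM transition matrix as a list of rows.

    kinds is a string of 'S' (steady) and 'T' (transient) state labels.
    The self-loop probabilities are illustrative placeholders, not values
    from the paper.
    """
    n = len(kinds)
    matrix = [[0.0] * n for _ in range(n)]
    for i, kind in enumerate(kinds):
        if i == n - 1:
            matrix[i][i] = 1.0  # last state only loops on itself
            continue
        stay = stay_steady if kind == "S" else stay_transient
        matrix[i][i] = stay            # remain in the current state
        matrix[i][i + 1] = 1.0 - stay  # advance to the next state
    return matrix


DIPHONE = "STTS"      # 4 states: steady, transient, transient, steady
TRIPHONE = "STTSTTS"  # 7 states: 3 steady, 4 transient (ordering assumed)

diphone_A = left_to_right_transitions(DIPHONE)
triphone_A = left_to_right_transitions(TRIPHONE)
```

Giving steady states a higher self-loop probability than transient states reflects the idea that stable phoneme regions last longer than transitions; in practice these values would be estimated from training data.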

Korean Word Recognition Using Diphone-Level Hidden Markov Model (Diphone 단위의 hidden Markov model을 이용한 한국어 단어 인식)

Park, Hyun-Sang;Un, Chong-Kwan;Park, Yong-Kyu;Kwon, Oh-Wook
- The Journal of the Acoustical Society of Korea / v.13 no.1 / pp.14-23 / 1994
This paper studies speech units appropriate for recognition of the Korean language. For better speech recognition, co-articulatory effects within an utterance should be considered when selecting a recognition unit. One way to model such effects is to use larger units of speech. The diphone is found to be a good recognition unit because it can model transitional regions explicitly. When diphones are used, stationary phoneme models may be inserted between them. Computer simulation of isolated word recognition was done with a 7 word database spoken by seven male speakers. The best performance was obtained when transition regions between phonemes were modeled by two-state HMMs and stationary phoneme regions by one-state HMMs, excluding /b/, /d/, and /g/. Merging rarely occurring diphone units increased the recognition rate from 93.98% to 96.29%. In addition, a local interpolation technique was used to smooth a poorly modeled HMM with a well-trained HMM. With this technique, the recognition rate reached 97.22% after merging some diphone units.
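The unit layout described in this abstract can be sketched as a decomposition of a phoneme sequence into diphone units with stationary phoneme models in between. This is an illustrative sketch only: the function name, the unit naming scheme, and the reading that /b/, /d/, and /g/ receive no stationary model are assumptions, and word-boundary silence is ignored.

```python
# Phonemes assumed to have no stationary model (one reading of the abstract).
NO_STATIONARY = {"b", "d", "g"}


def to_recognition_units(phonemes):
    """Expand a phoneme list into (unit_name, state_count) pairs.

    Each transition between adjacent phonemes becomes a two-state diphone
    unit; each phoneme outside NO_STATIONARY also gets a one-state
    stationary unit placed before its outgoing diphone.
    """
    units = []
    for i, phoneme in enumerate(phonemes):
        if phoneme not in NO_STATIONARY:
            units.append((phoneme, 1))  # stationary region, one state
        if i + 1 < len(phonemes):
            units.append((phoneme + "-" + phonemes[i + 1], 2))  # transition
    return units


example_units = to_recognition_units(["g", "a", "m"])
total_states = sum(states for _, states in example_units)
```

For the hypothetical sequence g, a, m this yields the units g-a, a, a-m, m, for six HMM states in total.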

Performance Improvement of Continuous Digits Speech Recognition using the Transformed Successive State Splitting and Demi-syllable pair (반음절쌍과 변형된 연쇄 상태 분할을 이용한 연속 숫자음 인식의 성능 향상)

Kim Dong-Ok;Park No-Jin
- Journal of the Korea Institute of Information and Communication Engineering / v.9 no.8 / pp.1625-1631 / 2005
This paper describes an optimization of a language model and an acoustic model that improves speech recognition of Korean unit digits. Recognition errors of the language model are reduced by analysing the grammatical features of Korean unit digits and building FSN nodes from disyllables. The acoustic model uses demi-syllable pairs to reduce recognition errors caused by inaccurate segmentation of phones and syllables, which arises from monosyllables, short pronunciations, and articulation. We used the k-means clustering algorithm with transformed successive state splitting at the feature level to model the features of the recognition units efficiently. In the experiments, the proposed language model raised the recognition rate by 10.5%, the acoustic model with demi-syllable pairs raised it by 12.5%, and transformed successive state splitting improved it by 1.5%.

A novel method to aging state recognition of viscoelastic sandwich structures

Qu, Jinxiu;Zhang, Zhousuo;Luo, Xue;Li, Bing;Wen, Jinpeng
- Steel and Composite Structures / v.21 no.6 / pp.1183-1210 / 2016
Viscoelastic sandwich structures (VSSs) are widely used in mechanical equipment, but in service they suffer from aging, which affects the overall performance of the equipment. Aging state recognition of VSSs is therefore important for monitoring structural state and ensuring equipment reliability. However, non-stationary vibration response signals and weak state change characteristics make this task challenging. This paper proposes a novel method for this task based on the adaptive second generation wavelet packet transform (ASGWPT) and the multiwavelet support vector machine (MWSVM). To obtain feature parameters that are sensitive to different structural aging states, the ASGWPT, whose wavelet function can adaptively match the frequency spectrum characteristics of the inspected vibration response signal, is developed to process the vibration response signals for energy feature extraction. To improve the classification performance of the SVM, multiwavelet kernel functions are constructed based on the kernel method of SVM and multiwavelet theory, and the MWSVM is then developed to classify the different aging states. To demonstrate the effectiveness of the proposed method, different aging states of a VSS are created through hot oxygen accelerated aging of the viscoelastic material. The application results show that the proposed method can accurately and automatically recognize the different structural aging states and is a promising approach to aging state recognition of VSSs. Furthermore, the capability of the ASGWPT in processing the vibration response signals for feature extraction is validated by comparison with the conventional second generation wavelet packet transform, and the performance of the MWSVM in classifying the structural aging states is validated by comparison with the traditional wavelet support vector machine.
https://doi.org/10.12989/scs.2016.21.6.1183

A Study on the Speech Recognition for Commands of Ticketing Machine using CHMM (CHMM을 이용한 발매기 명령어의 음성인식에 관한 연구)

Kim, Beom-Seung;Kim, Soon-Hyob
- Journal of the Korean Society for Railway / v.12 no.2 / pp.285-290 / 2009
This paper implements a speech recognition system that recognizes ticketing machine commands (314 station names) in real time using a continuous hidden Markov model (CHMM). It uses 39 MFCCs as feature vectors and, to improve the recognition rate, 895 tied-state triphone models. In the performance evaluation, the multi-speaker-dependent and multi-speaker-independent recognition rates are 99.24% and 98.02%, respectively. In a noisy environment, the recognition rate is 93.91%.

Application of SA-SVM Incremental Algorithm in GIS PD Pattern Recognition

Tang, Ju;Zhuo, Ran;Wang, DiBo;Wu, JianRong;Zhang, XiaoXing
- Journal of Electrical Engineering and Technology / v.11 no.1 / pp.192-199 / 2016
With changes in insulation defects, the environment, and so on, new partial discharge (PD) data can differ greatly from the original samples, which lowers the on-line recognition rate. The ultra-high frequency (UHF) signal and pulse current signal of four kinds of typical artificial defect models in gas insulated switchgear (GIS) are obtained simultaneously by experiment. The relationship map between UHF cumulative energy and the corresponding apparent discharge is plotted for the four defect models, and these two quantities are used as inputs. A support vector machine (SVM) incremental method is then constructed. Examples show that the PD SVM incremental method based on simulated annealing (SA) effectively speeds up the data update rate and improves the adaptability of the classifier compared with the original method, in which the total sample consists of both the old and the new data. The PD SVM incremental method is a better pattern recognition technology for PD on-line monitoring.
https://doi.org/10.5370/JEET.2016.11.1.192

A Study on Speech Recognition Using the HM-Net Topology Design Algorithm Based on Decision Tree State-clustering (결정트리 상태 클러스터링에 의한 HM-Net 구조결정 알고리즘을 이용한 음성인식에 관한 연구)

정현열;정호열;오세진;황철준;김범국
- The Journal of the Acoustical Society of Korea / v.21 no.2 / pp.199-210 / 2002
In this paper, we study speech recognition using the HM-Net topology design algorithm based on decision tree state-clustering, with the aim of improving the performance of acoustic models. Korean has many allophonic and grammatical rules compared to other languages, so we investigate the allophonic variations defined in Korean phonetics and construct the phoneme question set for the phonetic decision tree. The basic idea of the HM-Net topology design algorithm is that it keeps the basic structure of the SSS (Successive State Splitting) algorithm and splits again the states of pre-constructed context-dependent acoustic models. That is, it generates the phonetic decision tree using the phoneme question sets for each state of the models, and iteratively trains the state sequences of the context-dependent acoustic models using the PDT-SSS (Phonetic Decision Tree-based SSS) algorithm. To verify the effectiveness of the algorithm, we carried out speech recognition experiments on 452 words from the Center for Korean Language Engineering (KLE452) and 200 sentences from an air flight reservation task (YNU200). The results show that recognition accuracy improves progressively as the number of states changes after state splitting, in the phoneme, word, and continuous speech recognition experiments alike. We obtained average phoneme and word recognition accuracies of 71.5% and 99.2%, respectively, when the number of states is 2,000, and an average continuous speech recognition accuracy of 91.6% when the number of states is 800. We also carried out word recognition experiments using HTK (HMM Toolkit), which performs state tying, for comparison with the parameter sharing of the HM-Net topology design algorithm. In the word recognition experiments, the HM-Net topology design algorithm achieves on average 4.0% higher recognition accuracy than the context-dependent acoustic models generated by HTK, which indicates its effectiveness.

A Rule-based Approach for the recognition of system isolation state using information on circuit breakers (차단기 정보를 이용한 계통의 분리 상태 인식의 룰-베이스적 접근)

Park, Y.M.;Lee, J.H.
- Proceedings of the KIEE Conference / 1988.07a / pp.841-842 / 1988
To determine the blackout area and the restoration area in an expert system for fault section estimation and power system restoration using information from circuit breakers, it is necessary to recognize the system isolation state and to find changes in that state caused by state transitions of breakers in the isolated system. This paper presents a rule-based approach to this problem.

Speech Recognition Using the Energy and VQ (에너지와 VQ를 이용한 음성 인식)

Hwang, Young-Soo
- The Journal of The Korea Institute of Intelligent Transport Systems / v.6 no.3 / pp.87-94 / 2007
This paper studies the performance of speech recognition and speaker adaptation methods. Speech recognition using energy state and VQ (Vector Quantization) is proposed, and speaker adaptation methods (maximum a posteriori probability estimation, linear spectrum estimation) are considered. The experimental results show that the recognition rate using energy state is 2-3% better than that of general VQ.

Optimization of State-Based Real-Time Speech Endpoint Detection Algorithm (상태변수 기반의 실시간 음성검출 알고리즘의 최적화)

Kim, Su-Hwan;Lee, Young-Jae;Kim, Young-Il;Jeong, Sang-Bae
- Phonetics and Speech Sciences / v.2 no.4 / pp.137-143 / 2010
This paper proposes a speech endpoint detection algorithm that follows a state-transition-based approach to speech detection. To reject short-duration acoustic pulses that can be considered noise, it uses the duration information of all detected pulses. To optimize the parameters related to pulse lengths and the energy threshold for detecting speech intervals, an exhaustive search scheme is adopted, with the speech recognition rate as its performance index. Experimental results show that the proposed algorithm outperforms the baseline state-based endpoint detection algorithm. At 5 dB input SNR for the beamforming input, the word recognition accuracies of its outputs were 78.5% for human voice noise and 81.1% for music noise.
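The idea of rejecting short pulses by duration can be sketched with a two-state detector over frame energies. This is a simplified illustration, not the algorithm from the paper: the function name, the two-state design, and the parameters `threshold` and `min_frames` are assumptions standing in for the energy threshold and pulse-length parameters that the abstract says are tuned by exhaustive search.

```python
def detect_speech_segments(energies, threshold, min_frames):
    """Return (start, end) frame index pairs, end exclusive.

    A pulse starts when frame energy reaches the threshold and ends when
    it drops below it. Pulses shorter than min_frames are treated as
    noise and discarded.
    """
    segments = []
    state = "SILENCE"
    start = 0
    for i, energy in enumerate(energies):
        if state == "SILENCE" and energy >= threshold:
            state, start = "PULSE", i
        elif state == "PULSE" and energy < threshold:
            if i - start >= min_frames:
                segments.append((start, i))
            state = "SILENCE"
    # Close a pulse that is still open at the end of the signal.
    if state == "PULSE" and len(energies) - start >= min_frames:
        segments.append((start, len(energies)))
    return segments


frame_energies = [0, 5, 0, 0, 6, 7, 8, 0]
segments = detect_speech_segments(frame_energies, threshold=3, min_frames=2)
```

In this toy input the single-frame pulse at index 1 is rejected and only the three-frame pulse starting at index 4 is kept.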
