Search | Korea Science

Performance Evaluation of HM-Net Speech Recognition System using Korea Large Vocabulary Speech DB (한국어 대어휘 음성DB를 이용한 HM-Net 음성인식 시스템의 성능평가)

오세진;김광동;노덕규;송민규;김범국;황철준;정현열
- Proceedings of the IEEK Conference / 2003.07e / pp.2443-2446 / 2003
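In this paper, we evaluate the performance of an HM-Net (Hidden Markov Network) speech recognition system using the large-vocabulary speech database provided by the Electronics and Telecommunications Research Institute (ETRI). For acoustic modeling we adopted HM-Net, an improvement on the HMM (Hidden Markov Model), the statistical modeling method widely used in speech recognition. An HM-Net is generated by state splitting in the contextual and temporal domains using the PDT-SSS algorithm. In contextual-domain state splitting, a phonetic decision tree is adopted to represent effectively the context information that does not appear in the training speech data; in temporal-domain state splitting, states are split to represent effectively the duration information of each phoneme in the training speech data. Through this state splitting, parameters are shared and an optimal model network is constructed. Acoustic models were built from the large-vocabulary speech data and recognition experiments were performed; the results showed average recognition rates of 97.5% and 96.7% for 100 words and 60 sentences, respectively, spoken by 100 speakers.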

A Study on Performance Evaluation of HM-Net Adaptation System Using the State Level Sharing (상태레벨 공유를 이용한 HM-Net 적응화 시스템의 성능평가에 관한 연구)

오세진;김광동;노덕규;황철준;김범국;김광수;성우창;정현열
- Proceedings of the IEEK Conference / 2003.11a / pp.397-400 / 2003
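In this study, to apply HM-Net (Hidden Markov Network) to various tasks and to represent speaker characteristics effectively, we introduce the MLLR (Maximum Likelihood Linear Regression) adaptation method into the HM-Net speech recognition system and propose a regression class generation method obtained by modifying the HM-Net training algorithm. The proposed method uses the state-level sharing produced by the contextual-domain state splitting of the PDT-SSS (Phonetic Decision Tree-based Successive State Splitting) algorithm, which makes it possible to control freely the many adaptation parameters (means and variances), whose number is determined by the context information and the amount of adaptation data uttered by a new speaker. To verify the effectiveness of the proposed method, recognition experiments were carried out on the Center for Korean Language Engineering (KLE) 452 speech data and on continuous speech from an air flight reservation task. Overall, recognition performance improved on average by 34-37% for phoneme recognition, 9% for word recognition, and 7-8% for continuous speech recognition. In a comparison of recognition performance by amount of adaptation data, the recognition system using the proposed method showed improved recognition rates even with a small amount of adaptation data. It also showed improved recognition performance in adaptation experiments on speech with added noise, consistent with the characteristics of MLLR adaptation. We therefore confirmed that the proposed regression class generation method is effective for an HM-Net speech recognition system using MLLR adaptation.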

A Study on Speech Recognition Using the HM-Net Topology Design Algorithm Based on Decision Tree State-clustering (결정트리 상태 클러스터링에 의한 HM-Net 구조결정 알고리즘을 이용한 음성인식에 관한 연구)

정현열;정호열;오세진;황철준;김범국
- The Journal of the Acoustical Society of Korea / v.21 no.2 / pp.199-210 / 2002
In this paper, we study speech recognition using the HM-Net topology design algorithm based on decision tree state-clustering, with the aim of improving the performance of acoustic models. Korean has many allophonic and grammatical rules compared with other languages, so we investigate the allophonic variations defined in Korean phonetics and construct a phoneme question set for the phonetic decision tree. The basic idea of the HM-Net topology design algorithm is that it keeps the basic structure of the SSS (Successive State Splitting) algorithm and splits again the states of pre-constructed context-dependent acoustic models. That is, it generates a phonetic decision tree for each state of the models using the phoneme question sets, and iteratively trains the state sequences of the context-dependent acoustic models using the PDT-SSS (Phonetic Decision Tree-based SSS) algorithm. To verify the effectiveness of the algorithm, we carried out speech recognition experiments on 452 words from the Center for Korean Language Engineering (KLE452) and 200 sentences from an air flight reservation task (YNU200). The experimental results show that, after state splitting, recognition accuracy improved progressively as the number of states was varied in the phoneme, word, and continuous speech recognition experiments. We obtained average accuracies of 71.5% and 99.2% for phoneme and word recognition, respectively, with 2,000 states, and an average accuracy of 91.6% for continuous speech recognition with 800 states. We also carried out word recognition experiments using HTK (HMM Toolkit), which performs state tying, for comparison with the parameter sharing of the HM-Net topology design algorithm. In the word recognition experiments, the HM-Net topology design algorithm achieved on average 4.0% higher recognition accuracy than the context-dependent acoustic models generated by HTK, which indicates its effectiveness.

A Study on Performance Evaluation of Hidden Markov Network Speech Recognition System (Hidden Markov Network 음성인식 시스템의 성능평가에 관한 연구)

오세진;김광동;노덕규;위석오;송민규;정현열
- Journal of the Institute of Convergence Signal Processing / v.4 no.4 / pp.30-39 / 2003
In this paper, we evaluate the performance of an HM-Net (Hidden Markov Network) speech recognition system on Korean speech databases. The acoustic models are built with HM-Nets, a modification of HMMs (Hidden Markov Models), which are widely used for statistical modeling. An HM-Net is produced by state splitting in the contextual and temporal domains using the PDT-SSS (Phonetic Decision Tree-based Successive State Splitting) algorithm, a modification of the original SSS algorithm. In contextual-domain state splitting, a phonetic decision tree is adopted to represent effectively the context information that does not appear in the training speech data. In temporal-domain state splitting, states are split to represent effectively the duration information of each phoneme, and an optimal triphone-type model network is then constructed by sharing parameters. Speech recognition was performed using the one-pass Viterbi beam search algorithm with a phone-pair grammar for phoneme recognition and a word-pair grammar for word recognition, and using a multi-pass search algorithm with n-gram language models for sentence recognition. A tree-structured lexicon was used to reduce the number of nodes by sharing common prefixes among words. The HM-Net speech recognition system is evaluated under various recognition conditions. The experiments verified that it achieves substantially better recognition performance than the previously introduced recognition system.
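The node saving that a tree-structured lexicon gives can be illustrated with a small prefix tree. The following is a minimal sketch, not the system described in the abstract: the toy lexicon is hypothetical, letters stand in for phone-like units, and the helper names are invented for illustration.

```python
def count_flat_nodes(words):
    """Nodes needed when every word keeps its own chain of symbols."""
    return sum(len(word) for word in words)


def build_prefix_tree(words):
    """Build a nested-dict prefix tree; each key is one symbol."""
    root = {}
    for word in words:
        node = root
        for symbol in word:
            node = node.setdefault(symbol, {})
        node["<end>"] = {}  # marks the end of a complete word
    return root


def count_tree_nodes(node):
    """Count symbol nodes, ignoring the end-of-word markers."""
    return sum(1 + count_tree_nodes(child)
               for key, child in node.items() if key != "<end>")


# Hypothetical toy lexicon (assumption): letters stand in for phone-like units.
lexicon = ["seoul", "seong", "seo", "busan"]
tree = build_prefix_tree(lexicon)
print(count_flat_nodes(lexicon), count_tree_nodes(tree))  # prints "18 12"
```

The three words that begin with "seo" share one chain of three nodes, so the tree needs 12 nodes where separate chains would need 18.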

A Study on Regression Class Generation of MLLR Adaptation Using State Level Sharing (상태레벨 공유를 이용한 MLLR 적응화의 회귀클래스 생성에 관한 연구)

오세진;성우창;김광동;노덕규;송민규;정현열
- The Journal of the Acoustical Society of Korea / v.22 no.8 / pp.727-739 / 2003
In this paper, we propose a method for generating regression classes for adaptation in the HM-Net (Hidden Markov Network) system. The MLLR (Maximum Likelihood Linear Regression) adaptation approach is applied to the HM-Net speech recognition system to represent speaker characteristics effectively and to allow HM-Net to be used in various tasks. For state-level sharing, we adopt the contextual-domain state splitting of the PDT-SSS (Phonetic Decision Tree-based Successive State Splitting) algorithm, which performs clustering in both the contextual and temporal domains. In each contextual-domain state, the desired phoneme classes are determined by splitting the context information (classes) that includes the target speaker's speech data. The number of adaptation parameters, such as means and variances, is controlled automatically by the contextual-domain state splitting of PDT-SSS, depending on the context information and the amount of adaptation utterances from a new speaker. Experiments were performed on the KLE (Center for Korean Language Engineering) 452 data and the YNU (Yeungnam Univ.) 200 data to verify the effectiveness of the proposed method. The results show that the accuracies of phone, word, and sentence recognition increased by 34-37%, 9%, and 20%, respectively. In a comparison by length of adaptation utterances, performance also improved significantly even with short adaptation utterances. We therefore argue that the proposed regression class method is well suited to an HM-Net speech recognition system employing MLLR speaker adaptation.
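The role of regression classes in MLLR can be sketched as follows: every Gaussian mean in a class is moved by the same affine transform, mu' = A mu + b, so fewer classes mean fewer parameters to estimate from the adaptation data. This is a minimal sketch of mean adaptation only (variances are left out). The state names, class names, and transform values are hypothetical, and the estimation of A and b from adaptation data is not shown.

```python
def apply_mllr_means(means, state_to_class, transforms):
    """Return adapted means mu' = A @ mu + b, with (A, b) shared per class."""
    adapted = {}
    for state, mu in means.items():
        A, b = transforms[state_to_class[state]]
        adapted[state] = [
            sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)
        ]
    return adapted


# Hypothetical 2-dimensional means for three shared states (assumption).
means = {"s1": [1.0, 2.0], "s2": [0.0, 1.0], "s3": [3.0, -1.0]}

# Hypothetical regression classes: s1 and s2 share one transform.
state_to_class = {"s1": "class_a", "s2": "class_a", "s3": "class_b"}

# Hypothetical transforms (A, b); class_b gets the identity transform.
transforms = {
    "class_a": ([[1.0, 0.0], [0.0, 2.0]], [0.5, 0.0]),
    "class_b": ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
}

adapted = apply_mllr_means(means, state_to_class, transforms)
print(adapted["s1"], adapted["s2"], adapted["s3"])
```

With little adaptation data, many states would be mapped to a few classes; with more data, the mapping can be made finer so that each class still has enough data for its own transform.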

Definition and Evaluation of Korean Phone-Like Units using Hidden Markov Network (HM-Net을 이용한 한국어 유사음소 단위의 재 정의와 평가)

Lim Young-Chun;Oh Se-Jin;Jung Ho-Youl;Chung Hyun-Yeol
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.183-186 / 2002
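Context-dependent acoustic models have recently come into wide use as recognition units in speech recognition. This is because the acoustic characteristics of a phoneme, that is, the allophonic variation of the center phoneme caused by the preceding and following phonemes, can be modeled more accurately than with context-independent models. However, building robust context-dependent acoustic models requires more sophisticated solutions for tying model parameters and for handling unseen contexts. With this in mind, this paper introduces an HM-Net (Hidden Markov Network) topology design method that grafts a phonetic decision tree onto the contextual-domain state splitting of the SSS (Successive State Splitting) algorithm, so that state splitting can combine acoustic and linguistic features. In addition, considering that HM-Net can effectively model the allophones that occur frequently in Korean through successive state splitting, we redefined a set of 39 phone-like units by removing, from the 48 phone-like units previously used in our laboratory, the allophones that are unnecessary for building context-dependent acoustic models. To verify the effectiveness of the introduced method and the newly defined phone-like units, we carried out recognition experiments on isolated words, 4-connected digits, and continuous speech. In all experiments, the redefined 39 phone-like units proved effective for Korean speech recognition using context-dependent HM-Net acoustic models. In the continuous speech recognition experiment in particular, the recognition rate improved by 15.08% on average over the existing 48 phone-like units.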

Economic analysis of Phellinus spp. cultivation (진흙버섯속(상품명: 상황버섯) 재배방법에 따른 경제적 효과 분석)

Chang, Hyun-You;Lee, Young-Suk
- Journal of Mushroom / v.2 no.2 / pp.76-87 / 2004
These experiments were conducted to analyze the economics of cultivating Phellinus spp. (commercial name: Sanghwang mushroom). The results were as follows. Phellinus spp. can be cultivated for about 4 years from a single inoculation. The mushroom was first cultivated by burying logs in the soil (BM). Recently, however, the method of hanging logs on shelves in a house (HM) has come into use, because HM can accommodate more logs than BM. On the other hand, HM requires 5,678,230 Won more investment in equipment than BM. HM also requires 14,400 logs, 2.8 times as many as the 5,000 logs for BM. HM further requires 3,680,000 Won more to purchase logs and 1,104,000 Won more to purchase spawn. The production cost is 20,180,971 Won for BM and 37,953,825 Won for HM; accordingly, the production cost of HM is 1.9 times higher than that of BM. The operating cost is 1,207,712 Won for BM and 24,075,432 Won for HM; accordingly, the operating cost of HM is 2.0 times higher than that of BM. The net income is 580,940,000 Won for BM and 1,683,300,000 Won for HM; accordingly, the net income of HM is 2.9 times higher than that of BM. The income is 589,040,000 Won for BM and 169,718,000 Won for HM; accordingly, the income of HM is 2.9 times higher than that of BM. In conclusion, HM requires 2.8 times more logs, yields 1.03 times more product per log, and has 1.9 times more production cost and 2.0 times more operating cost. As shown above, HM and BM differ in two respects. BM requires less investment than HM, but BM has lower income because of the difference in production capacity. Comparing the two methods, HM proved to be the more efficient method of producing the mushroom. In terms of cash flow alone, BM requires less cash expenditure in the first year, but it has no production in the first year; BM yields income 2 years after the logs are buried. HM requires a large cash expenditure for equipment and logs in the first year, but HM yields income in the first year.

ASSESSMENT OF ACTIVITY-BASED PYROPROCESS COSTS FOR AN ENGINEERING-SCALE FACILITY IN KOREA

KIM, SUNGKI;KO, WONIL;BANG, SUNGSIG
- Nuclear Engineering and Technology / v.47 no.7 / pp.849-858 / 2015
This study set the pyroprocess facility at an engineering scale as a cost object, and presented the cost consumed during the unit processes of the pyroprocess. For the cost calculation, the activity based costing (ABC) method was used instead of the engineering cost estimation method, which calculates the cost based on the conceptual design of the pyroprocess facility. The calculation results demonstrate that the pyroprocess facility's unit process cost is $194/kgHM for pretreatment, $298/kgHM for electrochemical reduction, $226/kgHM for electrorefining, and $299/kgHM for electrowinning. An analysis demonstrated that the share of each unit process cost in the total pyroprocess cost is as follows: 19% for pretreatment, 29% for electrochemical reduction, 22% for electrorefining, and 30% for electrowinning. The total unit cost of the pyroprocess was calculated at $1,017/kgHM. Electrochemical reduction and electrowinning took up most of the cost, and the individual costs of these two processes were found to be similar. This is because the electrochemical reduction process, which uses platinum as the anode, requires a significant raw material cost. In addition, significant raw material costs are required for materials such as $Li_3PO_4$, which is used in large quantities during the salt purification process.
https://doi.org/10.1016/j.net.2015.07.002
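The relation between the unit process costs and the total can be checked with a few lines of arithmetic. This is only a sketch that uses the rounded figures quoted in the abstract above; the variable names are chosen for illustration.

```python
# Unit process costs in $/kgHM, as quoted in the abstract.
unit_costs = {
    "pretreatment": 194,
    "electrochemical reduction": 298,
    "electrorefining": 226,
    "electrowinning": 299,
}

# Total unit cost of the pyroprocess.
total = sum(unit_costs.values())

# Share of each unit process in the total, in percent (one decimal place).
shares = {name: round(100 * cost / total, 1) for name, cost in unit_costs.items()}

print(total)   # prints 1017
print(shares)
```

The four costs add up to the reported $1,017/kgHM. The shares computed from these rounded costs come to 19.1%, 29.3%, 22.2%, and 29.4%; the abstract reports 19%, 29%, 22%, and 30%, so the electrowinning share there may have been derived from unrounded costs or adjusted so that the shares sum to 100%.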

Performance Improvement of Microphone Array Speech Recognition Using Features Weighted Mahalanobis Distance (가중특징 Mahalanobis거리를 이용한 마이크 어레이 음석인식의 성능향상)

Nguyen, Dinh Cuong;Chung, Hyun-Yeol
- The Journal of the Acoustical Society of Korea / v.29 no.1E / pp.45-53 / 2010
In this paper, we present the use of the Features Weighted Mahalanobis Distance (FWMD) to improve the performance of the Likelihood Maximizing Beamforming (Limabeam) algorithm in microphone array speech recognition. The proposed approach replaces the traditional distance measure in a Gaussian classifier with a Mahalanobis distance in which different features are weighted according to their distances after variance normalization. Using FWMD with the Limabeam algorithm (FWMD-Limabeam), we obtained correct word recognition rates of 90.26% for Calibrated Limabeam and 87.23% for Unsupervised Limabeam, which are 3% and 6% higher, respectively, than those of the original Limabeam. By alternatively implementing an HM-Net speech recognition strategy, we could save memory and reduce computational complexity.
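The idea of weighting features inside a Mahalanobis distance can be sketched for the diagonal-covariance case. The abstract does not give the weighting formula, so the weights below are hypothetical inputs rather than the paper's scheme, and the function name is invented for illustration.

```python
import math


def weighted_mahalanobis(x, mean, var, weights):
    """Diagonal-covariance Mahalanobis distance with one weight per feature."""
    return math.sqrt(sum(
        w * (xi - mi) ** 2 / vi
        for xi, mi, vi, w in zip(x, mean, var, weights)
    ))


# Hypothetical 2-dimensional feature vector and Gaussian parameters (assumption).
x, mean, var = [1.0, 2.0], [0.0, 0.0], [1.0, 4.0]

# All weights equal to 1 gives the ordinary diagonal Mahalanobis distance.
plain = weighted_mahalanobis(x, mean, var, [1.0, 1.0])

# Hypothetical weights that emphasize the first feature and drop the second.
weighted = weighted_mahalanobis(x, mean, var, [4.0, 0.0])

print(plain, weighted)
```

Each squared difference is first divided by the feature's variance (the variance normalization) and then scaled by its weight, so a feature with a larger weight contributes more to the distance used by the classifier.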

Recognition Performance Improvement of Unsupervised Limabeam Algorithm using Post Filtering Technique

Nguyen, Dinh Cuong;Choi, Suk-Nam;Chung, Hyun-Yeol
- IEMEK Journal of Embedded Systems and Applications / v.8 no.4 / pp.185-194 / 2013
In distant-talking environments, speech recognition performance degrades significantly due to noise and reverberation. Recent work by Michael L. Seltzer shows that, in microphone array speech recognition, the word error rate can be reduced significantly by adapting the beamformer weights to generate a sequence of features that maximizes the likelihood of the correct hypothesis. One way to implement this approach, called the Likelihood Maximizing Beamforming (Limabeam) algorithm, is UnSupervised Limabeam (USL), which can improve recognition performance in any environment. Our investigation of USL showed that, because the optimization depends strongly on the transcription output of the first recognition step, the output becomes unstable, which may lower performance. To improve the recognition performance of USL, post-filtering techniques can be employed to obtain a more accurate transcription from the first step. In this work, as a post-filtering technique for the first recognition step of USL, we propose adding a Wiener filter combined with the Feature Weighted Mahalanobis Distance. We also suggest an alternative way to implement the Limabeam algorithm efficiently for a Hidden Markov Network (HM-Net) speech recognizer. Speech recognition experiments performed in a real distant-talking environment confirm the efficacy of the Limabeam algorithm in the HM-Net speech recognition system, and also confirm the improved performance of the proposed method.
https://doi.org/10.14372/IEMEK.2013.8.4.185

