Search | Korea Science

A Study on Context-Dependent Acoustic Models to Improve the Performance of the Korea Speech Recognition (한국어 음성인식 성능향상을 위한 문맥의존 음향모델에 관한 연구)

황철준;오세진;김범국;정호열;정현열
- Journal of the Institute of Convergence Signal Processing / v.2 no.4 / pp.9-15 / 2001
In this paper, we investigate context-dependent acoustic models to improve the performance of Korean speech recognition. The algorithms use Korean phonological rules and decision trees. With the Successive State Splitting (SSS) algorithm, the Hidden Markov Network (HM-Net), an efficient representation of phoneme-context-dependent HMMs, can be generated automatically. SSS is a powerful technique for designing the topologies of tied-state HMMs, but it does not adequately handle contexts that are absent from the training data. In addition, it has some problems in its procedure for the contextual domain. We therefore adopt a new state-clustering algorithm for SSS, called Phonetic Decision Tree-based SSS (PDT-SSS), which includes context splits based on Korean phonological rules. This method combines the advantages of decision tree clustering and SSS, and can generate a highly accurate HM-Net that can express any context. To verify the effectiveness of the adopted method, experiments were carried out using the KLE 452-word database and the YNU 200-sentence database. Through Korean phoneme, word, and sentence recognition experiments, we showed that the new state-clustering algorithm yields better phoneme, word, and continuous speech recognition accuracy than conventional HMMs.
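The abstract does not describe the tree itself, so the following is only a minimal sketch of the general idea behind phonetic decision trees: a context that never occurred in training can still be mapped to a tied state by answering class-membership questions. The phone classes, the tree shape, and the names `Node` and `find_state` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Hypothetical phone classes, for illustration only (not from the paper).
NASALS: Set[str] = {"m", "n", "ng"}
VOWELS: Set[str] = {"a", "e", "i", "o", "u"}


@dataclass
class Node:
    """A tree node: either a yes/no question about the context or a leaf."""
    side: Optional[str] = None            # "left" or "right"; None on a leaf
    phone_class: Optional[Set[str]] = None
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    state_id: Optional[int] = None        # set only on leaves (tied state)


def find_state(node: Node, left: str, right: str) -> int:
    """Walk the tree by answering each question until a leaf is reached."""
    while node.state_id is None:
        phone = left if node.side == "left" else right
        node = node.yes if phone in node.phone_class else node.no
    return node.state_id


# Toy tree for one state of some centre phone.
tree = Node(
    side="left", phone_class=NASALS,
    yes=Node(state_id=0),
    no=Node(side="right", phone_class=VOWELS,
            yes=Node(state_id=1), no=Node(state_id=2)),
)

# The context (left="ng", right="k") need not appear in the training data:
# the questions still route it to a leaf.
print(find_state(tree, "ng", "k"))
```

Because every question has both a yes and a no branch, any left/right context reaches some leaf, which is the property the abstract refers to as expressing any context.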

A Study on Construction of Acoustical Phoneme Models Using Hidden Markov Network (Hidden Markov Network를 이용한 음향학적 음소모델 작성에 관한 검토)

Oh Se-Jin;Lim Young-Choon;Hwang Cheol-Jun;Kim Bum-Koog;Chung Hyun-Yeol
- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.29-32 / 2000
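As a basic study toward improving the acoustic models of speech recognition systems, this paper examines methods for constructing HMnet (Hidden Markov Network) acoustic models using the SSS (Successive State Splitting) algorithm, which requires contextual information, and the SSS-free algorithm, which does not, and confirms their effectiveness by applying the resulting acoustic models to Korean. To construct phoneme models with HMnet, initial models with two states each are built from the entire training data; the state with the largest distribution is then re-split in the temporal or contextual direction, and state splitting is repeated until a given number of states is reached. To confirm the effectiveness of the resulting HMnet acoustic models, speaker-dependent phoneme recognition experiments were performed for three speakers on the ETRI 445-word set. In the speaker-dependent experiments, the SSS algorithm achieved an average recognition rate of 62.8% with 520 states, and the SSS-free algorithm achieved an average of 64.2% with 420 states. These results show an improvement of 20% or more over the HMM case (about 43.4%), confirming the effectiveness of the algorithms. Comparing the two, SSS-free achieved an average recognition rate 1.4% higher than SSS with fewer states.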

A Study on the Korean Continuous Speech Recognition using Adaptive Pruning Algorithm and PDT-SSS Algorithm (적응 프루닝 알고리즘과 PDT-SSS 알고리즘을 이용한 한국어 연속음성인식에 관한 연구)

황철준;오세진;김범국;정호열;정현열
- Journal of Korea Multimedia Society / v.4 no.6 / pp.524-533 / 2001
An efficient continuous speech recognition system for practical applications must run in real time and achieve high recognition accuracy. In this paper, we build acoustic models with the PDT-SSS algorithm and language models by iterative learning to improve speech recognition accuracy, and we apply an adaptive pruning algorithm to continuous speech. To verify the effectiveness of the proposed method, we carried out continuous speech recognition experiments on a Korean air flight reservation task. Experimental results show that the adopted algorithm achieves an average accuracy of 90.9% for continuous speech recognition and 90.7% for word recognition including continuous speech. Applying the adaptive pruning algorithm to continuous speech reduces recognition time by about 1.2 seconds (15%) without any loss of accuracy. These results demonstrate the effectiveness of the PDT-SSS algorithm and the adaptive pruning algorithm.

Complexity Reduced CP Length Pre-decision Algorithm for SSS Detection at Initial Cell Searcher of 3GPP LTE Downlink System (3GPP LTE 하향링크 시스템의 초기 셀 탐색기 SSS 검출 시 복잡도 최소화를 위한 CP 길이 선 결정 알고리즘)

Kim, Young-Bum;Kim, Jong-Hun;Chang, Kyung-Hi
- The Journal of Korean Institute of Communications and Information Sciences / v.34 no.9A / pp.656-663 / 2009
In the 3GPP LTE downlink, the PSS (primary synchronization signal) and SSS (secondary synchronization signal) sequences are used for initial cell search and synchronization. The UE (user equipment) detects slot timing, frequency offset, and cell ID using the PSS. It then detects frame timing, cell group ID, and CP (cyclic prefix) length using the SSS. Because 3GPP LTE defines two CP lengths, the FFT must be performed twice. In this paper, to minimize SSS detection complexity in the cell searcher, we propose a CP length pre-decision algorithm that reduces the arithmetic complexity by up to half with negligible performance degradation.

A Study on Speech Recognition Using the HM-Net Topology Design Algorithm Based on Decision Tree State-clustering (결정트리 상태 클러스터링에 의한 HM-Net 구조결정 알고리즘을 이용한 음성인식에 관한 연구)

정현열;정호열;오세진;황철준;김범국
- The Journal of the Acoustical Society of Korea / v.21 no.2 / pp.199-210 / 2002
In this paper, we study speech recognition using the HM-Net topology design algorithm based on decision tree state-clustering to improve the performance of acoustic models. Korean has many allophonic and grammatical rules compared to other languages, so we investigate the allophonic variations defined in Korean phonetics and construct the phoneme question set for the phonetic decision tree. The basic idea of the HM-Net topology design algorithm is that it keeps the basic structure of the SSS (Successive State Splitting) algorithm and re-splits the states of pre-constructed context-dependent acoustic models. That is, it generates a phonetic decision tree from the phoneme question sets for each state of the models, and iteratively trains the state sequences of the context-dependent acoustic models using the PDT-SSS (Phonetic Decision Tree-based SSS) algorithm. To verify the effectiveness of the algorithm, we carried out speech recognition experiments on the 452 words of the Center for Korean Language Engineering (KLE452) and the 200 sentences of an air flight reservation task (YNU200). Experimental results show that recognition accuracy improved progressively as the number of states varied after state splitting, in the phoneme, word, and continuous speech recognition experiments alike. We obtained average phoneme and word recognition accuracies of 71.5% and 99.2%, respectively, with 2,000 states, and an average continuous speech recognition accuracy of 91.6% with 800 states. We also carried out word recognition experiments using HTK (HMM Toolkit), which performs state tying, for comparison with the parameter sharing of the HM-Net topology design algorithm. In the word recognition experiments, the HM-Net topology design algorithm achieved on average 4.0% higher recognition accuracy than the context-dependent acoustic models generated by HTK, which indicates its effectiveness.

A Development for Sea Surface Salinity Algorithm Using GOCI in the East China Sea (GOCI를 이용한 동중국해 표층 염분 산출 알고리즘 개발)

Kim, Dae-Won;Kim, So-Hyun;Jo, Young-Heon
- Korean Journal of Remote Sensing / v.37 no.5_2 / pp.1307-1315 / 2021
The Changjiang Diluted Water (CDW) spreads over the East China Sea every summer and significantly affects sea surface salinity in the seas around Jeju Island and the southern coast of the Korean Peninsula. Sometimes its effect extends to the eastern coast of the Korean Peninsula through the Korea Strait. Specifically, the CDW has a significant impact on marine physics and ecology and causes damage to fisheries and aquaculture. However, because field surveys are limited, continuous observation of the CDW in the East China Sea is practically difficult. Many studies have used satellite measurements to monitor CDW distribution in near-real time. In this study, an algorithm for estimating Sea Surface Salinity (SSS) in the East China Sea was developed using the Geostationary Ocean Color Imager (GOCI). The Multilayer Perceptron Neural Network (MPNN) method was employed to develop the algorithm, and Soil Moisture Active Passive (SMAP) SSS data was selected as the output. In the previous study, an algorithm for estimating SSS using GOCI was trained on 2016 observation data. By comparison, the training data period here was extended to 2015 through 2020 to improve the algorithm's performance. Validation against the National Institute of Fisheries Science (NIFS) serial oceanographic observation data from 2011 to 2019 shows a coefficient of determination (R²) of 0.61 and a Root Mean Square Error (RMSE) of 1.08 psu. This study was carried out to develop an algorithm for monitoring the surface salinity of the East China Sea using GOCI and is expected to contribute to the development of an algorithm for estimating SSS using GOCI-II.
https://doi.org/10.7780/kjrs.2021.37.5.2.8
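The two validation metrics quoted in the abstract above (R² and RMSE) can be computed as in the minimal sketch below. The salinity values are made up for illustration and are not from the study. R² is taken here as 1 − SS_res/SS_tot, which is an assumption: the abstract does not say which definition was used, and the squared Pearson correlation is another common convention.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], estimated: Sequence[float]) -> float:
    """Root mean square error between in-situ and estimated values."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)


def r_squared(observed: Sequence[float], estimated: Sequence[float]) -> float:
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up salinity values in psu, for illustration only.
in_situ = [30.0, 31.0, 32.0, 33.0]
satellite = [30.5, 30.5, 32.5, 32.5]

print(rmse(in_situ, satellite))       # error in the same unit as the data (psu)
print(r_squared(in_situ, satellite))  # unitless
```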

The Application of the Spectral Similarity Scale Algorithm and Expectation-Maximization for Unsupervised Change Detection using Hyperspectral Image (하이퍼스펙트럴 영상의 무감독 변화탐지를 위한 SSS 알고리즘과 기대최대화 기법의 적용)

Kim, Yong-Hyun;Kim, Dae-Sung;Kim, Yong-Il;Yu, Ki-Yun
- 한국공간정보시스템학회:학술대회논문집 / 2007.06a / pp.139-144 / 2007
By recording data in hundreds of narrow contiguous spectral intervals, hyperspectral images make it possible to detect small differences in material composition. A limitation of hyperspectral images, however, is that their signal-to-noise ratio (SNR) is lower than that of multispectral images. This paper evaluates the effectiveness of the Spectral Similarity Scale (SSS) for change detection in hyperspectral images, with experiments performed on Hyperion data. SSS is an algorithm that objectively quantifies differences between reflectance spectra in both the magnitude and direction dimensions. The thresholds for detecting changed areas were determined with the Expectation-Maximization (EM) algorithm. The experimental results show that the SSS and EM algorithms are efficient enough to be applied to unsupervised change detection in hyperspectral images.
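The abstract does not give the SSS formula. The sketch below assumes a formulation commonly cited for the Spectral Similarity Scale, in which a magnitude term (RMS Euclidean distance d_e) and a direction term (1 − r², with r the Pearson correlation) are combined as sqrt(d_e² + (1 − r²)²). Treat both the formula and the function name as assumptions rather than as the authors' implementation.

```python
import math
from typing import Sequence


def spectral_similarity_scale(x: Sequence[float], y: Sequence[float]) -> float:
    """Assumed SSS: sqrt(d_e^2 + (1 - r^2)^2) for two reflectance spectra."""
    n = len(x)
    # Magnitude term: RMS Euclidean distance between the spectra.
    d_e = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / n)

    # Direction (shape) term: 1 - r^2, with r the Pearson correlation.
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a flat spectrum")
    r_hat = 1.0 - (cov * cov) / (var_x * var_y)

    return math.sqrt(d_e ** 2 + r_hat ** 2)


# Made-up reflectance spectra at two dates, for illustration only.
before = [0.1, 0.2, 0.4]
after_shift = [0.2, 0.3, 0.5]   # same shape, brighter by 0.1

print(spectral_similarity_scale(before, before))       # no change
print(spectral_similarity_scale(before, after_shift))  # magnitude change only
```

For change detection, a per-pixel score of this kind would then be thresholded; the abstract states that the thresholds were chosen with EM, which is not reproduced here.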

A study on the robust context-dependent acoustic models by considering the state splitting and the time variant of speech (음성의 시간변이와 상태분할을 고려한 강건한 문맥의존 음향모델에 관한 연구)

오세진;김광동;노덕규;정현열
- Proceedings of the Korean Information Science Society Conference / 2003.04c / pp.229-231 / 2003
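Speech is generally expressed as a function of time, and modeling the reference models is a very important problem in speech recognition. When syllables, words, and continuous speech are uttered, the duration differs between consonants and vowels, and modeling this well is also an important problem in speech recognition. In this study, therefore, to train robust acoustic models, we built initial models with various structures, taking into account temporal variation and the changes in the model during the state splitting process. The HM-Net context-dependent acoustic models for each initial model were built with the phonetic decision tree-based SSS algorithm (PDT-SSS). To handle unseen context information, the PDT-SSS algorithm builds a model by splitting states in the contextual and temporal directions until the target number of states is reached. To confirm the effectiveness of each model structure set up for building robust context-dependent acoustic models that account for the temporal variation of speech, we performed phoneme and word recognition experiments on the 452 words of the Center for Korean Language Engineering. In phoneme recognition with 2,000 states, the 4-state structure improved recognition performance by about 11.4% and reduced recognition time by 39.2 seconds compared with the 2-state structure. In word recognition with 2,000 states, the 4-state structure improved recognition performance by about 5% compared with the 1-state structure, and recognizing one word with the 4-state structure took 0.8 seconds on average. We thus confirmed that this study of initial model structures for building robust context-dependent acoustic models is useful for building future speech recognition systems.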

A Study on Context Environment and Model State for Robustness Acoustic Models (강건한 음향모델을 위한 모델의 상태와 문맥환경에 관한 연구)

최재영;오세진;황도삼
- Proceedings of the Korea Multimedia Society Conference / 2003.05b / pp.366-369 / 2003
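As a basic study toward building robust context-dependent acoustic models, this work examines how acoustic model performance changes with the context environment and the number of states. Speech is expressed as a function of time; when syllables, words, and continuous speech are uttered, the duration differs between consonants and vowels, and recognition performance varies greatly with the context preceding and following the phoneme, which is widely used as the minimal recognition unit in speech recognition. In this study, therefore, we built various forms of context-dependent acoustic models, taking into account temporal variation (changes in the number of states) and changes in the context environment during the state splitting process. The models were trained with the phonetic decision tree-based SSS algorithm (Phonetic Decision Tree-based Successive State Splitting: PDT-SSS). To handle unseen context information, the PDT-SSS algorithm builds a model by splitting states in the contextual and temporal directions until the target number of states is reached. To confirm the effectiveness of the method for training robust context-dependent acoustic models, we performed phoneme and word recognition experiments on the 452 words of the Center for Korean Language Engineering. The experiments allowed us to examine how recognition performance changes with the number of model states, reflecting the temporal variation of speech, and with the context environment of each phoneme. This study is therefore expected to be useful for building robust context-dependent acoustic models for future speech recognition systems.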

A Study on Performance Evaluation of Hidden Markov Network Speech Recognition System (Hidden Markov Network 음성인식 시스템의 성능평가에 관한 연구)

오세진;김광동;노덕규;위석오;송민규;정현열
- Journal of the Institute of Convergence Signal Processing / v.4 no.4 / pp.30-39 / 2003
In this paper, we evaluate the performance of an HM-Net (Hidden Markov Network) speech recognition system on Korean speech databases. We construct acoustic models using HM-Nets, a modification of HMMs (Hidden Markov Models), which are widely used as a statistical modeling method. HM-Nets are built by state splitting in the contextual and temporal domains with the PDT-SSS (Phonetic Decision Tree-based Successive State Splitting) algorithm, a modification of the original SSS algorithm. In particular, contextual-domain state splitting uses a phonetic decision tree to effectively express context information that does not appear in the training speech data. In temporal-domain state splitting, states are split to effectively represent the duration information of each phoneme, and the optimal model network of triphone types is then constructed. Speech recognition was performed using the one-pass Viterbi beam search algorithm with a phone-pair/word-pair grammar for phoneme/word recognition, respectively, and the multi-pass search algorithm with n-gram language models for sentence recognition. A tree-structured lexicon was used to decrease the number of nodes by sharing common prefixes among words. The performance of the HM-Net speech recognition system was evaluated under various recognition conditions. Through the experiments, we verified that it has far better recognition performance than the previously introduced recognition system.
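The node saving from a tree-structured lexicon can be illustrated with a small prefix tree over phone sequences. This is a minimal sketch under stated assumptions: the words, their pronunciations, and the helper names `build_lexicon_tree` and `count_nodes` are hypothetical and do not come from the paper.

```python
from typing import Dict, List


def build_lexicon_tree(lexicon: Dict[str, List[str]]) -> dict:
    """Build a prefix tree in which words with a common phone prefix share nodes."""
    root = {"children": {}, "words": []}
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node["children"].setdefault(
                phone, {"children": {}, "words": []}
            )
        node["words"].append(word)  # word identity is known at the final node
    return root


def count_nodes(node: dict) -> int:
    """Count this node plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node["children"].values())


# Hypothetical pronunciations, for illustration only.
lexicon = {
    "seoul": ["s", "eo", "u", "l"],
    "seoho": ["s", "eo", "h", "o"],
    "busan": ["b", "u", "s", "a", "n"],
}

lex_tree = build_lexicon_tree(lexicon)
flat_nodes = sum(len(phones) for phones in lexicon.values())  # one chain per word
tree_nodes = count_nodes(lex_tree) - 1                        # exclude the root
print(flat_nodes, tree_nodes)
```

In this toy lexicon the two words that begin with the same two phones share those nodes, so the tree needs fewer nodes than one linear chain per word; the saving grows with the number of words that share prefixes.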
