• Title/Summary/Keyword: word decoding

Search Results: 56

Examining Line-breaks in Korean Language Textbooks: the Promotion of Word Spacing and Reading Skills (한국어 교재의 행 바꾸기 -띄어쓰기와 읽기 능력의 계발 -)

  • Cho, In Jung;Kim, Danbee
    • Journal of Korean language education
    • /
    • v.23 no.1
    • /
    • pp.77-100
    • /
    • 2012
  • This study investigates issues related to text segmenting, in particular line breaks, in Korean language textbooks. Research on L1 and L2 reading has shown that readers process texts by chunking (grouping words into phrases or meaningful syntactic units) and that phrase-cued texts are therefore helpful for readers whose syntactic knowledge has not yet been fully developed. It is thus important for language textbooks, particularly those for beginner and intermediate learners, to avoid awkward syntactic divisions at the end of a line. According to our analysis of a number of major Korean language textbooks for beginner-level learners, however, many textbooks display line breaks at awkward syntactic divisions. Moreover, some textbooks contain frequent instances where a single word (or eojeol, in the case of Korean) is split across lines. This can hamper not only learners' acquisition of the rules for spacing between eojeols in Korean but also their development of automatic word recognition, an essential part of the reading process. Based on the findings of our textbook analysis and of existing research on reading, this study suggests ways to avoid awkward line breaks in Korean language textbooks.

Analysis of an LDPC Code in the VDSL System (VDSL 시스템에서의 LDPC 코드 연구)

  • Joh, Kyung-Hyun;Kang, Hee-Hoon;Yi, Sang-Hoi;Na, Kuk-Hwan
    • Proceedings of the IEEK Conference
    • /
    • 2006.06a
    • /
    • pp.999-1000
    • /
    • 2006
  • The LDPC code is attracting attention as a powerful FEC (Forward Error Correction) code for 4G mobile communication systems. LDPC codes are used to minimize channel errors, modeling the VDSL system as an AWGN channel. With long code words and iterative decoding, the performance of LDPC codes is better than that of turbo codes. LDPC codes are encoded with a sparse parity-check matrix, and several decoding algorithms exist for them: bit flipping, message passing, and sum-product. Because LDPC codes use low-density parity bits, their mathematical complexity is low and the associated processing time is shortened.

  • PDF
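The bit-flipping algorithm named in the abstract can be sketched in a few lines. The parity-check matrix and error pattern below are a toy illustration, not taken from the paper:

```python
# Hard-decision bit-flipping decoding for an LDPC-style code: each round,
# evaluate all parity checks and flip the bit involved in the most
# unsatisfied checks. H below is a small textbook-style example.

def bit_flip_decode(H, r, max_iters=20):
    n = len(r)
    word = list(r)
    for _ in range(max_iters):
        # Rows of H whose parity check fails over GF(2).
        unsatisfied = [row for row in H
                       if sum(h * b for h, b in zip(row, word)) % 2 == 1]
        if not unsatisfied:
            return word  # all checks satisfied: valid codeword
        # Count, per bit, how many failing checks it participates in.
        votes = [sum(row[j] for row in unsatisfied) for j in range(n)]
        word[votes.index(max(votes))] ^= 1  # flip the most-suspect bit
    return word  # decoding failure after max_iters

# Toy (7,4) parity-check matrix, used here only for illustration.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
received = [0, 0, 0, 0, 0, 0, 0]
received[2] ^= 1                     # inject a single channel error
print(bit_flip_decode(H, received))  # recovers the all-zero codeword
```

For code words this short, turbo codes would normally win; the abstract's claim concerns long code words, where sparse checks keep the per-iteration cost low.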

Acoustic Characteristics and Pitch Accent Realization in English Elliptical Sentences - VP-ellipsis, sluicing, gapping - (영어 생략구문의 음성적 특성과 피치악센트 실현 양상-동사구 생략, 슬루싱, 공소화를 중심으로-)

  • Kim, Hee-Sung
    • Speech Sciences
    • /
    • v.11 no.2
    • /
    • pp.119-136
    • /
    • 2004
  • Ellipsis is the figure of speech characterized by the deliberate omission of words that are obviously understood but must be supplied to make a construction grammatically or semantically complete. The purpose of this study is to examine how ellipsis affects its adjacent elements acoustically and phonologically in English VP-ellipsis, sluicing, and gapping. In the experiment, the realizations by English native speakers were set as the criteria for observation and compared with Korean speakers' realizations. The results show that while English native speakers utilized various acoustic cues, such as word duration and pitch range, together with phonological cues such as pitch accent realization, to signal the missing constituent for decoding, Korean learners of English relied only on duration and could not use the various cues effectively.

  • PDF

A high speed huffman decoder using new ternary CAM (새로운 Ternary CAM을 이용한 고속 허프만 디코더 설계)

  • 이광진;김상훈;이주석;박노경;차균현
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.21 no.7
    • /
    • pp.1716-1725
    • /
    • 1996
  • In this paper, the Huffman decoder, which is part of the decoder in the JPEG standard format, is designed using a new ternary CAM. First, a 256-word × 16-bit all-parallel ternary CAM system is designed and verified using SPICE and CADENCE Verilog-XL, and the verified ternary CAM is then applied to a new Huffman decoder architecture for JPEG, verifying the performance of the designed CAM cell and its block. The new ternary CAM has various applications because it provides search-data mask and stored-data mask functions, which enable bit-wise search and don't-care-state storage. When the CAM is used as the Huffman look-up table in the Huffman decoder, it is partitioned according to decoding-symbol frequency. This partitioning scheme overcomes the drawbacks of an all-parallel CAM, namely its high power consumption and heavy load, so both operation speed and power consumption are improved.

  • PDF

Implementation of a 16-Bit Fixed-Point MPEG-2/4 AAC Decoder for Mobile Audio Applications

  • Kim, Byoung-Eul;Hwang, Sun-Young
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.33 no.3C
    • /
    • pp.240-246
    • /
    • 2008
  • An MPEG-2/4 AAC decoder on a 16-bit fixed-point processor is presented in this paper. To meet audio quality criteria despite the small word length, special design methods for a 16-bit fixed-point AAC decoder were devised. This paper presents the particular algorithms for 16-bit AAC decoding, and we have implemented an efficient AAC decoder using the proposed algorithms. Audio content can be replayed on the decoder without quality degradation.

Encoding and Decoding using Cyclic Product Code (순환곱셈코드를 이용한 인코딩 및 디코딩)

  • 김신령;강창언
    • Proceedings of the Korean Institute of Communication Sciences Conference
    • /
    • 1984.10a
    • /
    • pp.11-14
    • /
    • 1984
  • When the received sequence is not identical to the transmitted code word due to channel noise, it is necessary to detect and correct errors. In this paper, it is shown how to construct the encoder and the decoder using cyclic product codes. This system combines random and burst error correction and is easily decodable. The expected performance was obtained.

  • PDF

Phoneme distribution and syllable structure of entry words in the CMU English Pronouncing Dictionary

  • Yang, Byunggon
    • Phonetics and Speech Sciences
    • /
    • v.8 no.2
    • /
    • pp.11-16
    • /
    • 2016
  • This study explores the phoneme distribution and syllable structure of entry words in the CMU English Pronouncing Dictionary to provide phoneticians and linguists with fundamental phonetic data on English word components. Entry words in the dictionary file were syllabified using an R script and examined to obtain the following results: First, English words preferred consonants to vowels in their word components. In addition, monophthongs occurred much more frequently than diphthongs. When all consonants were categorized by manner and place, the distribution indicated the frequency order of stops, fricatives, and nasals according to manner and that of alveolars, bilabials and velars according to place. These results were comparable to the results obtained from the Buckeye Corpus (Yang, 2012). Second, from the analysis of syllable structure, two-syllable words were most favored, followed by three- and one-syllable words. Of the words in the dictionary, 92.7% consisted of one, two or three syllables. This result may be related to human memory or decoding time. Third, the English words tended to exhibit discord between onset and coda consonants and between adjacent vowels. Dissimilarity between the last onset and the first coda was found in 93.3% of the syllables, while 91.6% of the adjacent vowels were different. From the results above, the author concludes that an analysis of the phonetic symbols in a dictionary may lead to a deeper understanding of English word structures and components.
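The syllabification step described above exploits a property of the CMU dictionary's ARPAbet transcriptions: vowel phones carry a stress digit (0/1/2), so counting digit-bearing phones gives the syllable count. A minimal sketch, with sample entries in the dictionary's format (the paper's actual analysis used an R script over the full dictionary):

```python
# Count syllables in CMU-dictionary entries by counting ARPAbet vowel
# phones, which are the phones ending in a stress digit.

entries = {
    "DECODING": ["D", "IY0", "K", "OW1", "D", "IH0", "NG"],
    "WORD":     ["W", "ER1", "D"],
    "SYLLABLE": ["S", "IH1", "L", "AH0", "B", "AH0", "L"],
}

def syllable_count(phones):
    return sum(1 for p in phones if p[-1].isdigit())

for word, phones in entries.items():
    print(word, syllable_count(phones))  # DECODING 3, WORD 1, SYLLABLE 3
```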

LSTM based sequence-to-sequence Model for Korean Automatic Word-spacing (LSTM 기반의 sequence-to-sequence 모델을 이용한 한글 자동 띄어쓰기)

  • Lee, Tae Seok;Kang, Seung Shik
    • Smart Media Journal
    • /
    • v.7 no.4
    • /
    • pp.17-23
    • /
    • 2018
  • We propose an LSTM-based RNN model that effectively performs automatic word spacing. For long or noisy sentences, which are known to be difficult to handle in neural network learning, we defined proper input and decoding data formats and added dropout, bidirectional multi-layer LSTM, layer normalization, and an attention mechanism to improve performance. Although the Sejong corpus contains some spacing errors, the noise-robust learning model developed in this study, which avoids overfitting through dropout, trained successfully and returned meaningful results on Korean word spacing and its patterns. The experimental results showed that the LSTM sequence-to-sequence model achieves an F1-measure of 0.94, which is better than the GRU-CRF-based deep-learning method.

Optimizing Multiple Pronunciation Dictionary Based on a Confusability Measure for Non-native Speech Recognition (타언어권 화자 음성 인식을 위한 혼잡도에 기반한 다중발음사전의 최적화 기법)

  • Kim, Min-A;Oh, Yoo-Rhee;Kim, Hong-Kook;Lee, Yeon-Woo;Cho, Sung-Eui;Lee, Seong-Ro
    • MALSORI
    • /
    • no.65
    • /
    • pp.93-103
    • /
    • 2008
  • In this paper, we propose a method for optimizing a multiple pronunciation dictionary used for modeling pronunciation variations of non-native speech. The proposed method removes some confusable pronunciation variants in the dictionary, resulting in a reduced dictionary size and less decoding time for automatic speech recognition (ASR). To this end, a confusability measure is first defined based on the Levenshtein distance between two different pronunciation variants. Then, the number of phonemes for each pronunciation variant is incorporated into the confusability measure to compensate for ASR errors due to words of a shorter length. We investigate the effect of the proposed method on ASR performance, where Korean is selected as the target language and Korean utterances spoken by Chinese native speakers are considered as non-native speech. It is shown from the experiments that an ASR system using the multiple pronunciation dictionary optimized by the proposed method can provide a relative average word error rate reduction of 6.25%, with 11.67% less ASR decoding time, as compared with that using a multiple pronunciation dictionary without the optimization.

  • PDF
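The confusability measure described above starts from the Levenshtein distance between two pronunciation variants, with the phoneme count folded in to compensate for short words. A minimal sketch; normalizing by the longer variant's length is an illustrative choice, not the paper's exact formula:

```python
# Length-normalized Levenshtein similarity between pronunciation
# variants, each given as a sequence of phoneme symbols.

def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        cur = [i]
        for j, pb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (pa != pb)))   # substitution
        prev = cur
    return prev[-1]

def confusability(v1, v2):
    """Closer to 1.0 = more confusable; candidates near 1.0 are pruned."""
    return 1 - levenshtein(v1, v2) / max(len(v1), len(v2))

# Two hypothetical pronunciation variants of the same word:
print(confusability(["k", "a", "m", "s", "a"],
                    ["k", "a", "m", "s", "o"]))  # -> 0.8
```

Pruning variants whose pairwise confusability exceeds a threshold is what shrinks the dictionary and the decoding time reported in the abstract.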

A Method for Automatic Detection of Character Encoding of Multi Language Document File (다중 언어로 작성된 문서 파일에 적용된 문자 인코딩 자동 인식 기법)

  • Seo, Min Ji;Kim, Myung Ho
    • KIISE Transactions on Computing Practices
    • /
    • v.22 no.4
    • /
    • pp.170-177
    • /
    • 2016
  • Character encoding is a method of converting a document into a binary file, using a code table, for storage in a computer. When people decode a binary document file so that it can be read, they must know which code table was applied at the encoding stage in order to recover the original document. Identifying the code table used to encode the file is thus an essential part of decoding. In this paper, we propose a method for automatically detecting the character code of a given binary document file. The method uses several techniques to increase the detection rate, such as character-code range detection, escape-character detection, character-code characteristic detection, and commonly-used-word detection. The commonly-used-word detection method uses a multiple-word database, so it can achieve a much higher detection rate for multi-language files than other methods. When a language accounts for less than 20% of a document, conventional methods achieve only about 50% encoding recognition; the proposed method achieves up to 96% encoding recognition regardless of the proportion of each language.
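The character-code range detection mentioned above can be sketched with trial decoding: an encoding is rejected as soon as the byte stream contains a sequence outside its legal ranges. The candidate list and its priority order are illustrative choices, not the paper's detection pipeline:

```python
# Range-based encoding detection: try candidate code tables in order and
# accept the first one for which every byte sequence is legal. Strict
# decoding raises UnicodeDecodeError on any out-of-range sequence.

CANDIDATES = ["ascii", "utf-8", "euc-kr", "cp1252"]

def detect_encoding(data: bytes):
    for enc in CANDIDATES:
        try:
            data.decode(enc)  # strict mode: fails on illegal ranges
            return enc
        except UnicodeDecodeError:
            continue
    return None  # none of the candidates fit

print(detect_encoding("단어".encode("utf-8")))   # -> "utf-8"
print(detect_encoding(b"plain ASCII text"))      # -> "ascii"
```

Range checks alone cannot separate encodings whose legal byte ranges overlap (e.g. the single-byte legacy code pages), which is why the paper layers escape-character, characteristic, and common-word detection on top.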