• Title/Summary/Keyword: word decoding

HMM with Global Path Constraint in Viterbi Decoding for Isolated Word Recognition (전체 경로 제한 조건을 갖는 HMM을 이용한 단독음 인식)

  • Kim, Weon-Goo;Ahn, Dong-Soon;Youn, Dae-Hee
    • The Journal of the Acoustical Society of Korea / v.13 no.1E / pp.11-19 / 1994
  • Hidden Markov models (HMMs) with explicit state duration density (HMM/SD) can represent the time-varying characteristics of speech signals more accurately. However, this advantage is reduced when the state duration densities are relatively smooth or the bounded durations are long. To solve this problem, we propose HMMs with a global path constraint (HMM/GPC), in which transitions between states occur only within prescribed time slots. HMM/GPC explicitly limits state durations and describes the temporal structure of speech simply, accurately, and efficiently. HMMs formed by combining HMM/GPC with HMM/SD (HMM/SD+GPC) are also presented, and their performance is compared. HMM/GPC can be implemented with slight modifications to the conventional Viterbi algorithm. HMM/GPC and HMM/SD+GPC not only outperform the conventional HMM and HMM/SD but also require much less computation. In speaker-independent isolated word recognition experiments, the minimum recognition error rate of HMM/GPC (1.6%) is 1.1% lower than that of the conventional HMM, and the required computation decreases by about 57%.
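
The idea of restricting state transitions to prescribed time slots can be illustrated with a minimal Viterbi sketch for a left-to-right HMM. This is not the paper's implementation: the function name `viterbi_gpc`, the `slots` format (an earliest and latest frame for each forward transition), and the toy scores are all assumptions made for illustration.

```python
import math

NEG_INF = float("-inf")

def viterbi_gpc(log_b, log_a_stay, log_a_next, slots):
    """Best log-score of a left-to-right path that ends in the last state.

    log_b[t][j]   : log-likelihood of frame t in state j
    log_a_stay[j] : log-probability of staying in state j
    log_a_next[j] : log-probability of moving from state j to state j + 1
    slots[j]      : (earliest, latest) frame at which state j + 1 may be
                    entered from state j (hypothetical constraint format)
    """
    num_frames, num_states = len(log_b), len(log_b[0])
    delta = [NEG_INF] * num_states
    delta[0] = log_b[0][0]                      # every path starts in state 0
    for t in range(1, num_frames):
        new = [NEG_INF] * num_states
        for j in range(num_states):
            best = delta[j] + log_a_stay[j]     # self-loop is always allowed
            if j > 0:
                earliest, latest = slots[j - 1]
                if earliest <= t <= latest:     # the global path constraint
                    best = max(best, delta[j - 1] + log_a_next[j - 1])
            new[j] = best + log_b[t][j]
        delta = new
    return delta[num_states - 1]

# Toy example: two states, four frames, state 1 fits frames 1..3 best.
half = math.log(0.5)
log_b = [[0.0, -9.0], [-5.0, 0.0], [-5.0, 0.0], [-5.0, 0.0]]
free = viterbi_gpc(log_b, [half, half], [half], slots=[(1, 3)])
late = viterbi_gpc(log_b, [half, half], [half], slots=[(3, 3)])
```

With the wide slot the best path switches state at frame 1; with the narrow slot the switch is forced to frame 3, so `late` is lower than `free`. Outside the slot the forward transition is never evaluated, which is where a real implementation could save computation.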

An Amplitude Warping Approach to Intra-Speaker Normalization for Speech Recognition (음성인식에서 화자 내 정규화를 위한 진폭 변경 방법)

  • Kim Dong-Hyun;Hong Kwang-Seok
    • Journal of Internet Computing and Services / v.4 no.3 / pp.9-14 / 2003
  • Vocal tract normalization is a successful method for improving the accuracy of inter-speaker normalization. In this paper, we present an intra-speaker warping factor estimation based on pitch-altered utterances. The feature space distributions of untransformed speech from a speaker's pitch-altered utterances vary because of acoustic differences in the speech produced by the glottis and the vocal tract. Utterance variation is of two types: frequency variation and amplitude variation. Among inter-speaker normalization methods, vocal tract normalization is a frequency normalization. Therefore, amplitude variation must also be considered, and it may be possible to determine the amplitude warping factor by calculating the inverse ratio of the input pitch to the reference pitch. In the recognition results, the error rate is reduced by 0.4% to 2.3% for digit and word decoding.

H/W Design and Implementations of the Wideband Data Processing system for the AMPS (이동통신 AMPS에서 광대역 데이터 송.수신을 위한 하드웨어 설계에 관한 연구)

  • 이준동;김대중;김종일;이영천;조형래;강창언
    • The Journal of Korean Institute of Communications and Information Sciences / v.17 no.3 / pp.247-259 / 1992
  • In this paper, the types of data exchanged between a cell site and a mobile phone for call processing in AMPS (Advanced Mobile Phone Service) are investigated, and a circuit for processing the wideband data stream according to the data types is designed and implemented. Circuits are designed for detecting the Busy/Idle bit needed to determine channel access, for detecting the word sync, and for transmitting and receiving the wideband data. A 3-out-of-5 majority vote over the five received copies of the data is performed to reduce errors, and an algorithm that requires only a small buffer for real-time processing of the voting is proposed. A method to overcome the computational complexity and real-time constraint of conventional BCH decoding is also proposed.
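
A 3-out-of-5 majority vote can be sketched as a bitwise vote over five received copies of the same word. The abstract does not describe its small-buffer algorithm, so the running per-bit counter used here is only one plausible way to avoid storing all five copies; the function name `majority_vote_stream` is hypothetical.

```python
def majority_vote_stream(words, repeats=5):
    """Bitwise majority vote over `repeats` copies of a word.

    `words` may be any iterable (for example a generator fed by a receiver),
    so only one counter per bit position is kept, not the copies themselves.
    """
    counts = None
    seen = 0
    for word in words:
        if counts is None:
            counts = [0] * len(word)
        for i, bit in enumerate(word):
            counts[i] += bit                 # count how often each bit was 1
        seen += 1
    if seen != repeats:
        raise ValueError(f"expected {repeats} copies, got {seen}")
    threshold = repeats // 2 + 1             # 3 when repeats is 5
    return [1 if c >= threshold else 0 for c in counts]

# Five noisy copies of the word 1011; each bit position has at most one error.
received = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
]
voted = majority_vote_stream(iter(received))
```

Because the vote is taken per bit, up to two of the five copies can be wrong in any given bit position and the original bit is still recovered.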

Maritime English vs Maritime English Communication

  • Choe, Seung-Hui
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / 2015.07a / pp.272-274 / 2015
  • Success of communication at sea is directly linked with clear and complete delivery and receipt of the target message between interlocutors. It can be said that speakers' effective delivery of their intended message and listeners' precise decoding and accurate understanding are the keys to successful maritime communication. From this perspective, the scope of maritime English education and training needs to be reconceptualized and expanded into the area of communication itself, beyond the simple acquisition of, and familiarization with, IMO Standard Maritime Communication Phrases (SMCP). Therefore, in order to make learners' acquisition of marine communication knowledge more feasible, and the knowledge learned more practically applicable, training on effective and clear oral delivery should be also considered within the frame of maritime English education. Thus, critical training elements to realize this goal need to be suggested as guidelines. In this presentation, the theoretical background on this will be introduced in terms of English as a Lingua Franca (ELF) and Lingua Franca Core (LFC), which are the current mainstream forms of English communication in the international business context. Based on this, six key training elements will be discussed; that is, speech rate, word groups, pauses, nuclear stresses, consonants (including consonant clusters), and vowels (specifically long and short vowels). Finally, the practical pedagogical methods of each element, and its actual application into a real ESP classroom, will be suggested.

Decoder Design of a Nonbinary Code in the System with a High Code Rate (코드 레이트가 높은 시스템에 있어서의 비이진코드의 디코더 설계)

  • 정일석;강창언
    • The Journal of Korean Institute of Communications and Information Sciences / v.11 no.1 / pp.53-63 / 1986
  • In this paper, a decoder for a nonbinary code satisfying R > 1/t is designed and constructed, where R is the code rate and t is the error-correcting capability. The error-trapping decoder is designed using the concept of a covering monomial, and a decoder system using the (15, 11) Reed-Solomon code is then implemented. The decoder system is constructed simply, without Galois field multiplication and division circuits. Decoding one code word takes 60 clocks. Two symbol errors and eight binary burst errors are corrected simultaneously. This coding system is shown to be efficient when the channel error probability is approximately between $5{\times}10^{-4}$ and $5{\times}10^{-5}$.
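
The figures quoted for the (15, 11) Reed-Solomon code can be checked with a few lines of arithmetic. The sketch below assumes 4-bit symbols, since a length-15 Reed-Solomon code is defined over GF(16); the helper `rs_parameters` is hypothetical and does not implement any decoding.

```python
from fractions import Fraction

def rs_parameters(n, k, bits_per_symbol):
    """Basic parameters of an (n, k) Reed-Solomon code with n - k >= 2."""
    t = (n - k) // 2                      # correctable symbol errors
    rate = Fraction(k, n)                 # code rate R
    return {
        "t": t,
        "rate": rate,
        "rate_exceeds_1_over_t": rate > Fraction(1, t),
        "bits_per_codeword": n * bits_per_symbol,
        # Longest bit burst guaranteed to fit inside t symbols when the
        # burst is aligned to symbol boundaries.
        "max_aligned_burst_bits": t * bits_per_symbol,
    }

params = rs_parameters(n=15, k=11, bits_per_symbol=4)
```

Here t = 2 and R = 11/15, so R > 1/t holds, and two 4-bit symbols cover an 8-bit burst that falls within two adjacent symbols. A code word is 60 bits long, which equals the quoted 60 clocks if one clock per bit is assumed; the abstract does not state that relationship.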

High-Throughput Low-Complexity Successive-Cancellation Polar Decoder Architecture using One's Complement Scheme

  • Kim, Cheolho;Yun, Haram;Ajaz, Sabooh;Lee, Hanho
    • JSTS:Journal of Semiconductor Technology and Science / v.15 no.3 / pp.427-435 / 2015
  • This paper presents a high-throughput low-complexity decoder architecture and design technique to implement successive-cancellation (SC) polar decoding. A novel merged processing element with a one's complement scheme, a main frame with optimal internal word length, and optimized feedback part architecture are proposed. Generally, a polar decoder uses a two's complement scheme in merged processing elements, in which a conversion between two's complement and sign-magnitude requires an adder. However, the novel merged processing elements do not require an adder. Moreover, in order to reduce hardware complexity, optimized main frame and feedback part approaches are also presented. A (1024, 512) SC polar decoder was designed and implemented using 40-nm CMOS standard cell technology. Synthesis results show that the proposed SC polar decoder can lead to a 13% reduction in hardware complexity and a higher clock speed compared to conventional decoders.
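
The reason a one's complement scheme can avoid an adder is visible in the conversion to sign-magnitude form: a negative two's complement word needs a bitwise inversion plus an increment, while a negative one's complement word needs only the inversion. The sketch below models fixed-width words as Python integers; the function names are hypothetical and this is not the paper's processing element.

```python
def to_twos(value, width):
    """Encode a signed integer as a two's complement word."""
    return value & ((1 << width) - 1)

def to_ones(value, width):
    """Encode a signed integer as a one's complement word."""
    mask = (1 << width) - 1
    return value if value >= 0 else ~(-value) & mask

def twos_to_sign_mag(word, width):
    """Two's complement to (sign, magnitude): invert, then add 1."""
    mask = (1 << width) - 1
    sign = word >> (width - 1)
    magnitude = ((~word & mask) + 1) & mask if sign else word
    return sign, magnitude

def ones_to_sign_mag(word, width):
    """One's complement to (sign, magnitude): invert only, no adder."""
    mask = (1 << width) - 1
    sign = word >> (width - 1)
    magnitude = (~word & mask) if sign else word
    return sign, magnitude

WIDTH = 4
minus_five_twos = to_twos(-5, WIDTH)      # 0b1011
minus_five_ones = to_ones(-5, WIDTH)      # 0b1010
```

Both `twos_to_sign_mag(minus_five_twos, WIDTH)` and `ones_to_sign_mag(minus_five_ones, WIDTH)` yield sign 1 and magnitude 5, but only the first path contains the `+ 1` that costs an adder in hardware.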

A Study on the Realization of Digital Multimedia Broadcast Receiving System using Conditional Access System (제한수신시스템을 적용한 디지털 멀티미디어방송 수신시스템 구현에 관한 연구)

  • Kim, Young-Bin;Ryu, Kwang-Ryol
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.340-343 / 2005
  • This paper presents the realization of a digital multimedia broadcast receiving system that uses a Conditional Access System. The key word for descrambling is generated from the smart card and the Conditional Access System, which improves the stability of the method. Using a dual processor that combines a high-performance DSP and a RISC core, the system can decode H.264 video at an average of 15 frames/second and audio at sample rates of 24 kHz to 48 kHz. The system was verified to descramble correctly on a test stream to which signed user data had been added.

Performance of speech recognition unit considering morphological pronunciation variation (형태소 발음변이를 고려한 음성인식 단위의 성능)

  • Bang, Jeong-Uk;Kim, Sang-Hun;Kwon, Oh-Wook
    • Phonetics and Speech Sciences / v.10 no.4 / pp.111-119 / 2018
  • This paper proposes a method to improve speech recognition performance by extracting various pronunciations of pseudo-morpheme units from an eojeol-unit corpus and generating new recognition units that take pronunciation variation into account. In the proposed method, we first align the pronunciations of the eojeol units and the pseudo-morpheme units, and then expand the pronunciation dictionary by extracting new pronunciations of the pseudo-morpheme units from the pronunciations of the eojeol units. Then, we propose a new recognition unit that depends on pronunciation by tagging the obtained phoneme symbols according to the pseudo-morpheme units. The proposed units and their extended pronunciations are incorporated into the lexicon and language model of the speech recognizer. Performance is evaluated using a Korean speech recognizer with a trigram language model obtained from a 100-million pseudo-morpheme corpus and an acoustic model trained on 445 hours of multi-genre broadcast speech data. The proposed method is shown to reduce the word error rate by a relative 13.8% on the news-genre evaluation data and by 4.5% on the total evaluation data.

Encoding & Decoding of Radix 4 Polar Code (Radix 4 Polar code의 부호 및 복호)

  • Lee, Moon-Ho;Choi, Eun-Ji;Yang, Jae-Seung;Park, Ju-Yong
    • Journal of the Institute of Electronics Engineers of Korea TC / v.46 no.10 / pp.14-27 / 2009
  • Polar codes were proposed by the Turkish professor Erdal Arikan in 2006, based on the idea that splitting the input channel increases the cutoff rate. Channel polarization constructs code sequences that achieve the symmetric capacity of a given B-DMC (binary-input discrete memoryless channel) W. The symmetric capacity I(W) is the highest rate achievable when the input letters of the channel are used with equal probability. Channel polarization operates on N independent copies of the B-DMC W and produces a set of N binary-input channels $\{W_N^{(i)} : 1 \leq i \leq N\}$. As N increases, the fraction of indices i for which $I(W_N^{(i)})$ is near 1 approaches I(W), and the fraction for which it is near 0 approaches 1-I(W). The polarized channels $\{W_N^{(i)}\}$ are therefore well suited to channel coding. Based on these polar codes, this paper analyzes Arikan's polar encoding and decoding and newly proposes radix-4 polar coding.
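
Channel polarization is easiest to observe on the binary erasure channel (BEC), where one polarization step turns a channel of capacity I into two channels of capacity I^2 and 2I - I^2. The sketch below uses that standard radix-2 recursion, not the radix-4 construction proposed in the paper; the function names and the 0.01 threshold are assumptions chosen for illustration.

```python
def polarize_bec(capacity, levels):
    """Capacities of the 2**levels channels synthesized from a BEC.

    Each step replaces a channel of capacity i with a worse channel (i * i)
    and a better channel (2 * i - i * i); their average is still i.
    """
    caps = [capacity]
    for _ in range(levels):
        caps = [c for i in caps for c in (i * i, 2 * i - i * i)]
    return caps

def unpolarized_fraction(caps, delta=0.01):
    """Fraction of channels whose capacity is neither near 0 nor near 1."""
    middle = [c for c in caps if delta < c < 1 - delta]
    return len(middle) / len(caps)

caps_small = polarize_bec(0.5, levels=4)     # N = 16 channels
caps_large = polarize_bec(0.5, levels=12)    # N = 4096 channels
mean_large = sum(caps_large) / len(caps_large)
```

The mean capacity stays at I(W) = 0.5 at every level, while `unpolarized_fraction` shrinks as N grows, so the channels drift toward capacity 0 or 1. Polar encoding then sends information bits over the near-perfect channels and fixes the inputs of the rest.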

Automatic Text Summarization Based on a Selective Copy Mechanism for Addressing OOV (미등록 어휘에 대한 선택적 복사를 적용한 문서 자동요약)

  • Lee, Tae-Seok;Seon, Choong-Nyoung;Jung, Youngim;Kang, Seung-Shik
    • Smart Media Journal / v.8 no.2 / pp.58-65 / 2019
  • Automatic text summarization is the process of shortening a text document by either extraction or abstraction. Recent work applies the abstraction approach, inspired by deep learning methods that scale to large document collections. Abstractive text summarization relies on pre-generated word embedding information. Low-frequency but salient words, such as technical terms, are seldom included in dictionaries; this is the so-called out-of-vocabulary (OOV) problem. OOV words degrade the performance of the encoder-decoder neural network model. To address OOV words in abstractive text summarization, we propose a copy mechanism that facilitates copying new words from the target document and generating summary sentences. Unlike previous studies, the proposed approach combines accurate pointing information with a selective copy mechanism based on a bidirectional RNN and a bidirectional LSTM. In addition, a neural network gate model that estimates the generation probability and a loss function that optimizes the entire abstraction model are applied. The dataset was constructed from a collection of abstracts and titles of journal articles. Experimental results demonstrate that both ROUGE-1 (based on word recall) and ROUGE-L (based on the longest common subsequence) of the proposed encoder-decoder model improved, to 47.01 and 29.55, respectively.
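
The two evaluation measures named in the abstract can be sketched in a few lines. This is a simplified, recall-only illustration with whitespace tokenization; the function names are hypothetical, and official ROUGE tooling adds options such as stemming and F-measure weighting that are omitted here.

```python
from collections import Counter

def rouge1_recall(candidate, reference):
    """Share of reference unigrams that also appear in the candidate."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    overlap = sum(min(count, cand[word]) for word, count in ref.items())
    return overlap / max(1, sum(ref.values()))

def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def rouge_l_recall(candidate, reference):
    """LCS length divided by the reference length."""
    cand, ref = candidate.split(), reference.split()
    return lcs_length(cand, ref) / max(1, len(ref))

summary = "the cat sat on the mat"
gold = "the cat is on the mat"
r1 = rouge1_recall(summary, gold)
rl = rouge_l_recall(summary, gold)
```

In this toy pair, five of the six reference words are recovered and the longest common subsequence is "the cat on the mat", so both measures equal 5/6. ROUGE-1 ignores word order, whereas ROUGE-L rewards words that appear in the same order as in the reference.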