A Study on the Automatic Lexical Acquisition for Multi-linguistic Speech Recognition

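지원우 (Division of Computer Science, Hoseo University)
윤춘덕 (Division of Computer Science, Hoseo University)
김우성 (Division of Computer Science, Hoseo University)
김석동 (Division of Computer Science, Hoseo University)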
Abstract
Software internationalization, the process of making software easier to localize for specific languages, has deep implications when applied to speech technology, where the task itself is bound up with the particular language. A great deal of work and fine-tuning has gone into language processing software built around ASCII or a single language such as English, which makes porting it to other languages difficult. The identity of a language manifests itself in its lexicon, which reveals its character set, phoneme set, and pronunciation rules. We propose decomposing the lexicon-building process into four discrete, sequential steps. As preprocessing for building a lexical model, text is first converted from its language-specific character code to Unicode. The four steps are then: (1) transliterating the Unicode code points; (2) phonetically standardizing the rules; (3) implementing grapheme-to-phoneme rules; and (4) implementing phonological processes.
Keywords
IPE; PLI; Lexicon; Character set; Phoneme set; Pronunciation rules; Software internationalization
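
The preprocessing stage and four steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no rule tables or data formats, so every name here (`TRANSLIT`, `Rule`, `apply_rules`, `build_pronunciation`) and every table entry is a hypothetical stand-in. In particular, reading step 2 as "one shared rule format for all languages" is an assumption. The example word is Korean 국물 ("soup"), whose standard pronunciation nasalizes the syllable-final consonant before the following m.

```python
import unicodedata
from dataclasses import dataclass
from typing import List, Optional

# Step 1 table (hypothetical, covers only the jamo in the example word):
# conjoining Hangul jamo code points -> Latin graphemes.
TRANSLIT = {
    "\u1100": "g",  # initial kiyeok
    "\u1106": "m",  # initial mieum
    "\u116e": "u",  # medial u
    "\u11a8": "g",  # final kiyeok
    "\u11af": "l",  # final rieul
}


@dataclass(frozen=True)
class Rule:
    """Step 2 (assumed reading): a single rule format shared by all languages.

    Rewrites `target` as `output`; if `right` is set, the rule applies only
    when the next symbol equals `right`.
    """
    target: str
    output: str
    right: Optional[str] = None


# Step 3: toy grapheme-to-phoneme rules (hypothetical).
G2P_RULES = [Rule("g", "k")]
# Step 4: toy phonological process, nasalization of k before m (hypothetical table).
PHONOLOGICAL_RULES = [Rule("k", "ng", right="m")]


def to_unicode(raw: bytes, encoding: str) -> str:
    """Preprocessing: convert language-specific bytes to Unicode text."""
    return raw.decode(encoding)


def transliterate(text: str) -> List[str]:
    """Step 1: decompose syllables into jamo (NFD) and map each to a grapheme."""
    decomposed = unicodedata.normalize("NFD", text)
    return [TRANSLIT[ch] for ch in decomposed]  # KeyError on uncovered symbols


def apply_rules(symbols: List[str], rules: List[Rule]) -> List[str]:
    """Apply the first matching rule at each position; pass others through."""
    out = []
    for i, sym in enumerate(symbols):
        nxt = symbols[i + 1] if i + 1 < len(symbols) else None
        for rule in rules:
            if rule.target == sym and rule.right in (None, nxt):
                out.append(rule.output)
                break
        else:
            out.append(sym)
    return out


def build_pronunciation(raw: bytes, encoding: str) -> List[str]:
    text = to_unicode(raw, encoding)                   # preprocessing
    graphemes = transliterate(text)                    # step 1
    phonemes = apply_rules(graphemes, G2P_RULES)       # steps 2-3
    return apply_rules(phonemes, PHONOLOGICAL_RULES)   # step 4


if __name__ == "__main__":
    raw = "\uad6d\ubb3c".encode("euc-kr")  # the example word in a legacy encoding
    print(build_pronunciation(raw, "euc-kr"))  # ['k', 'u', 'ng', 'm', 'u', 'l']
```

Keeping the grapheme-to-phoneme rules and the phonological rules in the same `Rule` format means that, in this sketch, supporting another language would only require new tables, not new code. That is one plausible motivation for separating the steps, though the abstract does not state it.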