[KSCI] Korea Science Citation Index Service

A Study on the Automatic Lexical Acquisition for Multi-lingustic Speech Recognition

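지원우 (Division of Computer Science, Hoseo University)
윤춘덕 (Division of Computer Science, Hoseo University)
김우성 (Division of Computer Science, Hoseo University)
김석동 (Division of Computer Science, Hoseo University)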

Publication Information

The Journal of the Acoustical Society of Korea, v.22, no.6, 2003, pp. 434-442

Abstract

Software internationalization, the process of making software easier to localize for specific languages, has deep implications for speech technology, where the task itself depends on the essential properties of the particular language. A great deal of work and fine-tuning has gone into language processing software built around ASCII or a single language, such as English, which makes porting it to other languages difficult. The inherent identity of a language manifests itself in its lexicon, which reveals its character set, phoneme set, and pronunciation rules. We propose decomposing the lexicon-building process into four discrete, sequential steps. As preprocessing for building a lexical model, we convert text from its language-specific encoding to Unicode. The four steps are then: (step 1) transliterating code points from Unicode; (step 2) phonetically standardizing rules; (step 3) implementing grapheme-to-phoneme rules; and (step 4) implementing phonological processes.
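
The pipeline described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not give the actual rule tables, so the function names, the romanized labels, the tiny coda table (standing in for steps 2 and 3) and the single nasal-assimilation rule (standing in for step 4) are all hypothetical. Korean in the EUC-KR encoding is used as the example language, and the only standard behavior relied on is the arithmetic layout of the Unicode Hangul syllable block.

```python
# Illustrative sketch only: function names and rule tables are assumptions
# made for demonstration, not the rules used in the paper.

S_BASE, S_LAST = 0xAC00, 0xD7A3   # Unicode Hangul syllable block
V_COUNT, T_COUNT = 21, 28         # vowels and finals in the Unicode layout

# Romanized labels for initial consonants, vowels and finals (toy table).
LEADS = ["g", "kk", "n", "d", "tt", "r", "m", "b", "pp", "s",
         "ss", "", "j", "jj", "ch", "k", "t", "p", "h"]
VOWELS = ["a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "wa", "wae",
          "oe", "yo", "u", "wo", "we", "wi", "yu", "eu", "ui", "i"]
TAILS = ["", "g", "kk", "gs", "n", "nj", "nh", "d", "l", "lg", "lm", "lb",
         "ls", "lt", "lp", "lh", "m", "b", "bs", "s", "ss", "ng", "j", "ch",
         "k", "t", "p", "h"]

# Steps 2-3 (toy): map some final-consonant letters to a standardized phoneme.
CODA_G2P = {"g": "k", "kk": "k", "k": "k", "d": "t", "t": "t", "s": "t",
            "b": "p", "p": "p"}
# Step 4 (toy): an obstruent final turns nasal before a nasal initial.
NASALIZE = {"k": "ng", "t": "n", "p": "m"}


def to_unicode(raw: bytes, encoding: str) -> str:
    """Preprocessing: decode language-specific bytes into Unicode text."""
    return raw.decode(encoding)


def transliterate(text: str) -> list:
    """Step 1: turn each Hangul syllable code point into (lead, vowel, tail)."""
    out = []
    for ch in text:
        idx = ord(ch) - S_BASE
        if 0 <= idx <= S_LAST - S_BASE:      # skip non-Hangul characters
            lead, rest = divmod(idx, V_COUNT * T_COUNT)
            vowel, tail = divmod(rest, T_COUNT)
            out.append((LEADS[lead], VOWELS[vowel], TAILS[tail]))
    return out


def grapheme_to_phoneme(syllables: list) -> list:
    """Steps 2-3: apply the coda table; other labels pass through unchanged."""
    return [(l, v, CODA_G2P.get(t, t)) for l, v, t in syllables]


def apply_phonology(syllables: list) -> list:
    """Step 4: nasal assimilation across syllable boundaries."""
    out = list(syllables)
    for i in range(len(out) - 1):
        lead, vowel, tail = out[i]
        if out[i + 1][0] in ("n", "m") and tail in NASALIZE:
            out[i] = (lead, vowel, NASALIZE[tail])
    return out


def pronounce(raw: bytes, encoding: str) -> str:
    """Run preprocessing and all steps, returning one phoneme string."""
    syllables = transliterate(to_unicode(raw, encoding))
    syllables = apply_phonology(grapheme_to_phoneme(syllables))
    return "".join(l + v + t for l, v, t in syllables)


raw = "\uad6d\ubb3c".encode("euc_kr")   # the word 국물 in a legacy encoding
print(pronounce(raw, "euc_kr"))          # gungmul
```

Keeping each step as a separate function mirrors the sequential decomposition: supporting another language would, under this sketch, mean swapping the encoding name and the tables rather than rewriting the pipeline.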

Keywords

IPE; PLI; Lexicon; Character set; Phoneme set; Pronunciation rules; Software internationalization

Citations & Related Records

Reference

1	Statistical language modeling using the CMU-Cambridge toolkit / [ P.Clarkson;R.Rosenfeld ] / EUROSPEECH '97
2	Large-vocabulary continuous-speech recognition using a Japanese business newspaper (NIKKEI) / [ T.Matsuoka;K.Ohtsuki;T.Mori;S.Furui;K.Shirai ] / Proc. of the ARPA Workshop on Spoken Language Technology, Cohen (ed.), Morgan Kaufmann, Austin, TX
3	Integrated-multilingual speech recognition using universal features in a functional speech production model / [ L.Deng ] / ICASSP '97
4	A real-time Mandarin dictation machine for Chinese language with unlimited texts and very large vocabulary / [ L.S.Lee;C.Y.Tseng;H.Y.Gu;F.H.Liu;C.H.Chang;S.H.Hsieh;C.H.Chen ] / ICASSP '90
5	Self-learning and connectionist approaches to text-to-phoneme conversion / [ R.I.Damper ] / Connectionist Models of Memory and Language, J.Levy;J.Bairaktaris;J.Bullinaria;P.Cairns (eds.)
6	The Unicode Standard, Version 2.0 / [ The Unicode Consortium ]
7	NETtalk: a parallel network that learns to read aloud / [ T.J.Sejnowski;C.R.Rosenberg ] / The Johns Hopkins University Electrical Engineering and Computer Science Technical Report JHU/EECS-86/01
