Search | Korea Science

Lee, Kong Joo; Choi, Yong Seok
- KIPS Transactions on Software and Data Engineering, v.6 no.5, pp.267-278, 2017
Because the same letter can be pronounced differently depending on the word context, one must consult a lexicon to pronounce a word correctly. Lexicons differ both in the phonetic alphabets they adopt and in the pronunciations they give for the same word. In this paper, we use a sequence-to-sequence (seq2seq) model, which is widely used in deep learning research, to convert automatically from one pronunciation to another. Twelve seq2seq models are implemented using pronunciation training data collected from four different lexicons. The exact accuracy of the models ranges from 74.5% to 89.6%. The study has two aims: first, to understand the properties of the phonetic alphabets and pronunciations used in various lexicons; second, to understand the characteristics of seq2seq models through error analysis.
https://doi.org/10.3745/KTSDE.2017.6.5.267
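
The abstract above mentions 12 models built from 4 lexicons and reports an "exact accuracy". A minimal sketch of how those two numbers could fit together, under two assumptions that the abstract does not confirm: that there is one model per ordered (source, target) lexicon pair, and that "exact accuracy" means the whole predicted phoneme sequence must equal the reference. The lexicon names and the function `exact_accuracy` are hypothetical placeholders.

```python
from itertools import permutations

# Placeholder names: the abstract does not say which four lexicons were used.
LEXICONS = ["lexicon_A", "lexicon_B", "lexicon_C", "lexicon_D"]

# Assumption: one conversion model per ordered (source, target) pair.
DIRECTIONS = list(permutations(LEXICONS, 2))


def exact_accuracy(predictions, references):
    """Fraction of predictions whose entire phoneme sequence equals the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(1 for p, r in zip(predictions, references) if list(p) == list(r))
    return hits / len(references)
```

Under the ordered-pair assumption, `len(DIRECTIONS)` is 12, which matches the model count in the abstract. A sequence with a single wrong phoneme counts as fully wrong under this metric, which makes it stricter than a per-phoneme error rate.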

Lee, Kong-Joo; Kim, Jae-Hoon
- The KIPS Transactions:PartB, v.11B no.3, pp.387-394, 2004
In this paper, we propose a modified unsupervised linear alignment algorithm for building an aligned corpus. The original algorithm inserts null characters into both aligned strings (the source string and the target string) because the two strings differ in length. For applications that use the aligned corpus, these null characters cause difficulties such as search-space explosion, and they rule out several machine learning algorithms. To alleviate these difficulties, we modify the algorithm so that the aligned source strings contain no null characters. We demonstrate the usability of our approach by applying it to several areas: Korean-English back-transliteration, English grapheme-phoneme conversion, and Korean morphological analysis.
https://doi.org/10.3745/KIPSTB.2004.11B.3.387
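
The abstract above contrasts two alignment formats: one with null characters on both sides, and one with no nulls on the source side. The sketch below only illustrates that difference in output format; it is not the authors' algorithm, which the abstract does not describe. It assumes a hypothetical null marker `_` and a hypothetical merging rule in which a target symbol aligned to a source null is attached to the nearest preceding source symbol (or to the first one, if the null comes first).

```python
NULL = "_"  # hypothetical null marker, not taken from the paper


def remove_source_nulls(pairs):
    """Rewrite a null-padded alignment so the source side has no nulls.

    pairs: list of (source_symbol, target_symbol); either side may be NULL.
    Returns a list of (source_symbol, target_chunk), where target_chunk is a
    string of zero or more target symbols.
    """
    merged = []
    pending = ""  # target symbols seen before the first real source symbol
    for src, tgt in pairs:
        tgt = "" if tgt == NULL else tgt
        if src == NULL:
            if merged:
                # Attach to the preceding source symbol's target chunk.
                prev_src, prev_tgt = merged[-1]
                merged[-1] = (prev_src, prev_tgt + tgt)
            else:
                pending += tgt
        else:
            merged.append((src, pending + tgt))
            pending = ""
    if pending:
        raise ValueError("no source symbol to attach target symbols to")
    return merged
```

For example, with made-up one-character phoneme symbols, the null-padded alignment `[("b", "b"), ("o", "A"), ("x", "k"), ("_", "s")]` becomes `[("b", "b"), ("o", "A"), ("x", "ks")]`: every source grapheme now has exactly one (possibly empty or multi-symbol) target chunk, which is the shape that per-symbol classifiers typically expect.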