Computer Codes for Korean Sounds: K-SAMPA

  • Published : 2001.12.01

Abstract

An ASCII encoding of Korean sounds has been developed as an extension of the Speech Assessment Methods Phonetic Alphabet (SAMPA). SAMPA is a machine-readable phonetic alphabet used for multilingual computing. It has been under development since 1987 and has been extended to more than twenty languages. The motivation for creating Korean SAMPA (K-SAMPA) is to label Korean speech for a multilingual corpus, or to transcribe the native-language (L1) interfered pronunciation of second-language learners for bilingual education. Korean SAMPA represents each Korean allophone with a particular SAMPA symbol. Sounds that closely resemble a given allophone are represented by the same symbol, regardless of the language in which they are uttered. Each symbol represents a speech sound that is spectrally and temporally so distinct as to be perceptually different when the components are heard in isolation. Each type of sound has a separate IPA-like designation. Korean SAMPA is superior to other transcription systems with similar objectives in three respects. First, it describes the cross-linguistic sound quality of Korean better than the official Romanization system, proclaimed by the Korean government in July 2000, because it uses an internationally shared phonetic alphabet. Second, it is phonetically more accurate than the official Romanization in that it dispenses with orthographic adjustments. Third, it is more convenient for computing than the International Phonetic Alphabet (IPA) because it consists only of symbols found on a standard keyboard. This paper demonstrates how Korean SAMPA can express allophonic details and prosodic features by adopting the transcription conventions of the extended SAMPA (X-SAMPA) and the prosodic SAMPA (SAMPROSA).

Keywords