
Hangul Encoding Standard based on Unicode  

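Ahn, Dae-Hyuk (Software Research Lab, Microsoft Korea; Department of Computer Engineering, Myongji University)
Park, Young-Bae (Department of Computer Engineering, Myongji University)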
Abstract
Unicode currently provides two Hangul encoding models: the "precomposed modern Hangul syllables" model and the "conjoining Hangul characters" model. The current Unicode Hangul conjoining rules allow a precomposed Hangul syllable to be part of a syllable that also includes conjoining Hangul characters, which has led to a number of different Hangul encoding implementations. This problem stems from an incomplete understanding of the Hangul writing system when the normalization and encoding schemes were originally designed; in particular, the extended use of old Hangul was not taken into consideration. As a result, the same Hangul syllable can be represented in different ways, which causes problems in processing Hangul text, for instance in searching, comparison, and sorting. In this paper, we discuss the problems with the normalization of current Hangul encodings and suggest a single efficient rule for correctly processing Hangul encoded in Unicode.
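The representation problem described in the abstract can be illustrated with a minimal sketch using Python's standard `unicodedata` module. This is not the rule proposed in the paper; it only shows how the same syllable can have several encodings, and how standard NFC normalization leaves an old Hangul syllable as a precomposed syllable followed by a conjoining character. The helper name `same_text` and the sample strings are assumptions chosen for illustration.

```python
import unicodedata


def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Hypothetical helper: compare two strings after normalizing both."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Three encodings of the same modern syllable (initial U+1112, vowel U+1161,
# final U+11AB):
precomposed = "\uD55C"             # one precomposed syllable
conjoining = "\u1112\u1161\u11AB"  # three conjoining characters
mixed = "\uD558\u11AB"             # precomposed LV syllable + conjoining final

# A raw code point comparison treats them as different strings,
# which is what breaks naive searching, comparison, and sorting.
raw_equal = precomposed == conjoining

# After normalization, all three compare equal.
nfc_equal = same_text(precomposed, conjoining) and same_text(precomposed, mixed)

# Old Hangul example (illustrative): the final U+11EB lies outside the
# modern final range U+11A8..U+11C2, so NFC composes only the initial and
# vowel and leaves the final as a separate conjoining character.
old_hangul = "\u1112\u1161\u11EB"
old_nfc = unicodedata.normalize("NFC", old_hangul)

print(raw_equal, nfc_equal)
print([f"U+{ord(c):04X}" for c in old_nfc])
```

In the last case the normalized text mixes the two encoding models inside a single syllable, which is the situation the abstract identifies as the source of divergent implementations.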
Keywords
Hangul; Old Hangul; Hangul Code; Unicode; Encoding; Normalization