Correction for Hangul Normalization in Unicode  

Ahn, Dae-Hyuk (Software Research Institute, Microsoft Korea)
Park, Young-Bae (Department of Computer Engineering, Myongji University)
Abstract
Hangul text normalization as currently specified in Unicode produces incorrect Hangul syllables when precomposed modern Hangul syllables are used together with old Hangul composed from conjoining Hangul Jamo and compatibility Hangul Jamo. The problem arises because Unicode allows incorrect normalization forms for compatibility Hangul Jamo and Hangul symbols, and because the Unicode Hangul composition rule permits conjoining Hangul Jamo to be mixed with precomposed Hangul syllables. The underlying cause is that old Hangul was not taken into account, or Hangul code processing was insufficiently understood, when the specification for the Unicode normalization forms was written. In this paper, we therefore study Hangul code in the Unicode environment, focusing on the problems with normalization as it is now used for the Web, XML, and IDN. We also propose modifications to the Hangul normalization methods and Hangul composition rules so that Hangul is normalized correctly in Unicode.
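The kind of behavior the abstract describes can be observed with a standard normalization library. The following is a minimal sketch using Python's `unicodedata` module. The three input strings and the helper `codepoints` are illustrative choices made here, not examples taken from the paper. The sketch shows compatibility Jamo being folded into a precomposed syllable under NFKC, a precomposed syllable absorbing a modern conjoining final under NFC, and a conjoining sequence with an old Hangul final being left as a mixture of a precomposed syllable and a conjoining Jamo.

```python
import unicodedata


def codepoints(text):
    """Return the code points of text as a list of 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Case 1: two compatibility Jamo (U+3131 KIYEOK, U+314F A).
# NFKC maps them to conjoining Jamo and then composes a syllable.
compat = "\u3131\u314F"
nfkc_compat = unicodedata.normalize("NFKC", compat)

# Case 2: precomposed syllable U+AC00 followed by the modern
# conjoining final U+11A8. NFC merges them into one syllable.
mixed_modern = "\uAC00\u11A8"
nfc_modern = unicodedata.normalize("NFC", mixed_modern)

# Case 3: conjoining initial + vowel + old Hangul final U+11C3.
# NFC composes only the initial and vowel; the old final has no
# precomposed form, so the result mixes both encodings.
old_seq = "\u1100\u1161\u11C3"
nfc_old = unicodedata.normalize("NFC", old_seq)

print(codepoints(nfkc_compat))  # ['U+AC00']
print(codepoints(nfc_modern))   # ['U+AC01']
print(codepoints(nfc_old))      # ['U+AC00', 'U+11C3']
```

Printing code points rather than glyphs keeps the result independent of font support for old Hangul.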
Keywords
Hangul; Old Hangul; Hangul Code; Unicode; Normalization