• Title/Summary/Keyword: Unicode

Search Result 68, Processing Time 0.026 seconds

Unicode based Classics Archive Management System (Unicode 기반 고전문서 편찬 관리시스템)

  • 최윤수;진두석;안성수
    • Proceedings of the Korean Information Science Society Conference / 2002.10c / pp.115-117 / 2002
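  • Classical documents hold a depth of culture and knowledge beyond what we can imagine. Digitizing them is an essential task for creating new knowledge on that foundation, and many large-scale digitization projects have accordingly been carried out in recent years. In such projects, which span millions or tens of millions of pages, the most difficult and costly part is building a database while preserving the semantic characteristics of the documents as far as possible. This paper therefore introduces a classics archive management system that can build and manage a database with the characteristics of classical documents in mind. In particular, it introduces the input and search functions for extended Hanja that are indispensable for digitizing classical documents, the processing of document structure information that takes the sequential relationships between documents into account, and an information retrieval system for performing all of these functions efficiently.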

Analysis of Korean Language to Optimize the Hangul Character Coding for Information Processing and Communication (한글의 정보처리 및 통신용 부호 최적화를 위한 한국어 분석)

  • Hong, Wan-Pyo
    • The Journal of the Korea institute of electronic communication sciences / v.10 no.3 / pp.375-380 / 2015
  • This paper studies the Korean language in order to optimize Hangul character coding for information processing in terminal devices and for transmission over networks. It analyzes the Hangul characters used in Korean and the usage frequency of each character, and compares the results with the Hangul characters encoded in the Korean national standard and in Unicode. The study draws on the "Modern Korean Use Frequency Rate Survey Result" issued by the National Institute of the Korean Language, which lists 58,437 Korean words in total. These 58,437 words are composed of 1,540 distinct Hangul characters. The most frequently used character is "다", which accounts for 15% of total character usage; the least frequently used is "휫", at 0.00003%. The 1,540 characters identified are fewer than those in the Korean standard and in Unicode by factors of 7.2 and 1.5, respectively.
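The kind of per-character frequency count described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's procedure: the function name `hangul_syllable_frequency` and the four-word sample list are hypothetical, and the only fixed fact used is that precomposed Hangul syllables occupy U+AC00 to U+D7A3 in Unicode.

```python
from collections import Counter

# Precomposed Hangul syllables occupy U+AC00..U+D7A3 in Unicode.
HANGUL_FIRST, HANGUL_LAST = 0xAC00, 0xD7A3


def hangul_syllable_frequency(words):
    """Return each Hangul syllable's share of all syllable occurrences."""
    counts = Counter(
        ch
        for word in words
        for ch in word
        if HANGUL_FIRST <= ord(ch) <= HANGUL_LAST
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: n / total for ch, n in counts.items()}


# Hypothetical sample words, NOT the survey data used in the paper.
sample_words = ["하다", "가다", "오다", "휫손"]
freq = hangul_syllable_frequency(sample_words)

# List the syllables from most to least frequent.
for ch, share in sorted(freq.items(), key=lambda item: -item[1]):
    print(ch, f"{share:.3f}")
```

Run over a real word list, the number of keys in `freq` would correspond to the count of distinct characters that the abstract reports as 1,540.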

Consideration of CJK Joint Hanja Unicode when is used in AMI/HDB-3 Line Coding (AMI/HDB-3 회선부호화와 한·중·일 한자 유니코드 체계 고찰)

  • Tai, Dong-Zhen;Hong, Wan Pyo
    • The Journal of the Korea institute of electronic communication sciences / v.8 no.7 / pp.1011-1015 / 2013
  • This paper analyses the rate at which the Unicode codes of CJK unified Chinese characters violate the source coding rule. For the study, 150 Chinese characters with relatively high usage frequency were selected; together they account for about 50% of the total usage frequency of Chinese characters. The study applied AMI/HDB-3 line coding/scrambling and the HDLC protocol. According to the analysis, 77 of the 150 characters violated the rule, corresponding to 29% of usage frequency. Therefore, if the 77 violating characters are replaced with character codes that conform to the source coding rule, the processing rate of the line coder can be improved by about 37%.
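The abstract does not spell out its source coding rule, so the sketch below rests on an assumption: a 16-bit code is treated as a violation if it contains a run of four or more 0 bits (which HDB-3 replaces with a substitution pattern) or five or more 1 bits (after which HDLC inserts a stuffing bit). The function names are hypothetical, and the paper's actual criterion may differ.

```python
def longest_run(bits: str, symbol: str) -> int:
    """Length of the longest run of `symbol` in a string of '0'/'1'."""
    longest = current = 0
    for b in bits:
        current = current + 1 if b == symbol else 0
        longest = max(longest, current)
    return longest


def violates_assumed_rule(ch: str) -> bool:
    """ASSUMED rule, not taken from the paper: flag a code point whose
    16-bit pattern would trigger HDB-3 zero substitution (>= 4 zeros)
    or HDLC bit stuffing (>= 5 ones)."""
    bits = format(ord(ch), "016b")
    return longest_run(bits, "0") >= 4 or longest_run(bits, "1") >= 5


for ch in ["一", "的", chr(0x5555)]:
    print(f"U+{ord(ch):04X}", format(ord(ch), "016b"), violates_assumed_rule(ch))
```

Under this assumed rule, a violation rate for any character sample is the fraction of characters for which `violates_assumed_rule` returns true.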

Automatic translation system for hangul's romanization Based on the World Wide Web (웹 기반하의 국어의 로마자 전사 표기 자동 변환 시스템)

  • 김홍섭
    • Journal of the Korea Society of Computer and Information / v.7 no.4 / pp.108-114 / 2002
  • This web-based automatic conversion system for Hangul romanization converts Korean words, sentences, and documents into transliterated letters by applying an algorithm based on phonological principles, so that users need not know the basic rules of Korean romanization. It refers to a character correspondence table that follows the currently adopted official standard for Korean romanization, and it also makes conversion to machine-readable code possible. It provides fonts for toggling between Korean and English modes and between insert and edit modes by assigning them to ASCII and Unicode codes that are rarely used. The program is implemented in the C++ programming language with the Unified Modeling Language, and supports various fonts, font expansion and condensing, and alternative printing.
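A table-driven conversion of the kind this abstract describes can be sketched by decomposing each Unicode Hangul syllable arithmetically and looking up Latin letters for its parts. This is a simplified illustration in Python, not the paper's C++ program: the tables below are an assumed approximation based on Revised Romanization letter values, finals are mapped to their representative sounds, and the sound changes across syllable boundaries that a real romanizer must handle are ignored.

```python
# Simplified, ASSUMED letter tables in Unicode jamo order.
INITIALS = ["g", "kk", "n", "d", "tt", "r", "m", "b", "pp", "s", "ss", "",
            "j", "jj", "ch", "k", "t", "p", "h"]                      # 19 initials
MEDIALS = ["a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "wa", "wae",
           "oe", "yo", "u", "wo", "we", "wi", "yu", "eu", "ui", "i"]  # 21 medials
FINALS = ["", "k", "k", "k", "n", "n", "n", "t", "l", "k", "m", "l", "l", "l",
          "p", "l", "m", "p", "p", "t", "t", "ng", "t", "t", "k", "t", "p",
          "t"]                                                        # 28 finals, "" = none

SYLLABLE_BASE = 0xAC00
SYLLABLE_COUNT = 19 * 21 * 28  # 11,172 precomposed syllables


def romanize(text: str) -> str:
    """Romanize Hangul syllables one by one; pass other characters through."""
    out = []
    for ch in text:
        index = ord(ch) - SYLLABLE_BASE
        if 0 <= index < SYLLABLE_COUNT:
            initial, rest = divmod(index, 21 * 28)
            medial, final = divmod(rest, 28)
            out.append(INITIALS[initial] + MEDIALS[medial] + FINALS[final])
        else:
            out.append(ch)
    return "".join(out)


print(romanize("한글"))
print(romanize("서울"))
```

The decomposition arithmetic follows the Unicode layout of the Hangul Syllables block; only the three letter tables would change to follow a different romanization standard.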

Improved Pattern Recoginition Coding System of a Handwriting Character with 3D (3D Magnetic Ball을 이용한 필기체 인식 향상 Coding System)

  • Sim, Kyu Seung;Lee, Jae Hong;Lee, Byoung Yup
    • The Journal of the Korea Contents Association / v.13 no.9 / pp.10-19 / 2013
  • This paper proposes a new magnetic sensor and recognition system to expedite pattern recognition of handwritten characters. The received character graphics undergo session and balancing processing, with no separate extraction of end points, bend points, and junctions. An artificial intelligence algorithm is applied to structure analysis and recognition using a dictionary of individual basic letters rather than a dictionary of handwritten character graphics, which reduces the errors of the recognition algorithm and avoids the enormous dictionary otherwise needed for generalization. To test the performance of the magnetic ball sensor, the received characters are compared with characters pre-registered in the letter dictionary and the recognition rate is measured. After Unicode conversion and comparison, learning by the artificial intelligence algorithm raised the recognition rate from an initial 70% to more than 95%.

Hangul Porting and Display Performance Comparison of an Embedded System (임베디드 시스템을 위한 한글 포팅 및 출력 성능 비교)

  • Oh, Sam-Kweon;Park, Geun-Duk;Kim, Byoung-Kuk
    • Journal of Digital Contents Society / v.10 no.4 / pp.493-499 / 2009
  • Three methods are frequently used for Hangul display in computer systems: Standard Johab Code, in which each Hangul consonant and vowel is given a 5-bit code and each syllable created by combining them forms a 2-byte code; Standard Wansung Code, in which each of the syllables generally used for Hangul presentation forms a 2-byte code; and Unicode, in which each syllable in most of the world's language systems is given a unique code so that computers can represent and manipulate them consistently in a unified manner. An embedded system generally has lower processing power and a more limited amount of storage space than a personal computer (PC) system. Depending on its usage, however, the former may have processing power equal to that of the latter. Hence, when Hangul display needs to be adopted, an embedded system must choose a display method suited to its own resource environment. This paper introduces a TFT LCD initialization method and pixel display functions for an LN2440SBC embedded board to which an LP35, a 3.5" TFT LCD kit, is attached. Using the initialization and pixel display functions, we also compare the three aforementioned Hangul display methods in terms of processing speed and the amount of memory space required. According to the experiments, Standard Johab Code requires less memory space but more processing time than Standard Wansung Code, and Unicode requires the largest amount of memory space but the least processing time.
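The three code layouts named in this abstract can be inspected with Python's standard codecs. This is only an illustration of the encodings, assuming that the standard `johab` and `euc_kr` codecs stand in for Standard Johab Code and Standard Wansung Code; the helper names are hypothetical, and the sketch does not reproduce the paper's measurements of display speed or font memory on the embedded board.

```python
def encoded_sizes(text: str) -> dict:
    """Byte length of `text` under each of the three encodings."""
    return {
        "johab": len(text.encode("johab")),
        "wansung (euc_kr)": len(text.encode("euc_kr")),
        "unicode (utf-16-be)": len(text.encode("utf-16-be")),
    }


def johab_fields(syllable: str) -> tuple:
    """Split a 2-byte Johab code into its flag bit and three 5-bit fields
    (initial consonant, medial vowel, final consonant)."""
    code = int.from_bytes(syllable.encode("johab"), "big")
    return (code >> 15) & 1, (code >> 10) & 0x1F, (code >> 5) & 0x1F, code & 0x1F


print(encoded_sizes("한글"))
print(johab_fields("한"))
```

All three encodings spend 2 bytes per common syllable, so differences in memory use arise from how many glyphs must be stored rather than from the length of the codes.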

A Study on an Efficient Multilingual Programming Method (효율적인 다국어 프로그래밍 방법에 관한 연구)

  • Park, Moon-Mok;Kim, Jin-Soo
    • Proceedings of the Korea Society of Information Technology Applications Conference / 2007.05a / pp.69-74 / 2007
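  • For developing software products that support multiple languages in order to enter the world market, this paper presents an efficient multilingual programming method that minimizes development cost and is easier to maintain than previously known methods.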

An Arrangement of Hangul Codes in Unicode (유니코드에서 한글코드 정비 방안)

  • Byun, Jeongyong
    • Annual Conference on Human and Language Technology / 2018.10a / pp.234-236 / 2018
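  • To rearrange the three Hangul codes in Unicode according to the scientific principles of Hunminjeongeum, this paper analyzes and evaluates each of them. Based on the finding that the Jeongeum-style code, which reflects the principles behind the creation of Hunminjeongeum (that is, the Hangul Jamo code), encompasses the other syllable representations, it proposes a rearrangement in which only U+1100 is kept and the remaining code space is returned.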
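The claim that the Hangul Jamo code at U+1100 can express what the precomposed syllables express can be illustrated with Unicode normalization. This is a minimal sketch using Python's standard `unicodedata` module; the helper names `to_jamo` and `to_syllables` are hypothetical, and the example shows only the equivalence of the two representations, not the paper's evaluation.

```python
import unicodedata


def to_jamo(text: str) -> str:
    """Decompose precomposed syllables into conjoining jamo (NFD)."""
    return unicodedata.normalize("NFD", text)


def to_syllables(text: str) -> str:
    """Recompose conjoining jamo into precomposed syllables (NFC)."""
    return unicodedata.normalize("NFC", text)


jamo = to_jamo("한글")
print([f"U+{ord(ch):04X}" for ch in jamo])
print(to_syllables(jamo) == "한글")
```

Every code point in `jamo` lies in the Hangul Jamo block (U+1100 to U+11FF), and the round trip through `to_syllables` restores the original string.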

A Study on the Unicode Architecture (유니코드의 구조와 문제점)

  • 주리정
    • Proceedings of the Korean Society for Information Management Conference / 2001.08a / pp.23-28 / 2001
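  • Unicode is an international standard code for processing the characters of the various languages that exist today or existed in the past. It arranges the world's characters in order within the roughly 65,000 positions that can be formed with 2 bytes and assigns a code value to each character. Korea accepted the Unicode scheme of ISO 10646-1:1993 as is and adopted it in 1995 as the standard KS C 5700-1995. Unicode, however, has limitations in the sorting of Hangul and Hanja and in the representation of Old Hangul and Gugyeol characters. This paper examines the basic concepts of Unicode, as well as Hangul in Unicode and its problems.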
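The Hangul and Hanja sorting issue mentioned in this abstract can be seen in a small example. This sketch assumes a plain code-point sort, which is what Python's `sorted` does for strings; the three sample words are chosen for illustration and are not from the paper.

```python
# Number of positions addressable with 2 bytes (the "roughly 65,000" above).
TWO_BYTE_POSITIONS = 2 ** 16

# 家 is a Hanja read as "가", so a dictionary ordered by reading
# would place it next to the Hangul entry "가".
words = ["가", "家", "나"]

# A plain code-point sort puts every Hanja before every Hangul syllable,
# because CJK ideographs start at U+4E00 and Hangul syllables at U+AC00.
by_code_point = sorted(words)

print(TWO_BYTE_POSITIONS)
print([(w, f"U+{ord(w):04X}") for w in by_code_point])
```

Ordering mixed Hangul and Hanja text by reading therefore needs a separate collation table rather than the raw code values.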

Study on the Chinese Character Use in Acupuncture & Moxibustion Textbook (침구학 교재에서의 한자사용 분석연구)

  • Chae, Han;Hwang, Sang-Moon;Lee, Byung-Wook;Yang, Gi-Young;Lee, Byung-Ryul;Kim, Jae-Kyu
    • Journal of Acupuncture Research / v.27 no.4 / pp.187-194 / 2010
  • Objectives: There has been a need to establish a working curriculum for the Chinese characters and Chinese writing used in traditional Korean medicine (TKM), but this need has not been fully recognized so far. Methods: We analysed the usage of Unicode Chinese characters in an acupuncture & moxibustion textbook to identify the prerequisite Chinese characters for TKM studies from a clinical perspective. Results: The 20 most frequently used Chinese characters were 穴, 經, 鍼, 法, 寸, 部, 分, 刺, 下, 上, 中, 位, 氣, 陽, 灸, 脈, 陰, 治, 足, 主. We also showed that adequate prerequisite Chinese characters should be designated for more efficient education in TKM. Conclusions: This study is the first systematic approach to identifying essential and prerequisite Chinese characters for TKM education, especially for acupuncture & moxibustion. The prerequisite characters identified in this study will be used for the development of the KEET (Korean Medicine Education Eligibility Test), entrance exams to the Colleges of Oriental Medicine, textbooks, and the educational curriculum of premed students.