• Title/Summary/Keyword: ideographs

Problems with Chinese Ideographs Search in Unicode and Solutions to Them (유니코드 한자 검색의 문제점 및 개선방안)

  • Lee, Jeong-hyeon
    • Informatization Policy / v.19 no.3 / pp.50-63 / 2012
  • This paper analyzes how searches for Chinese ideographs are handled in Koreanology-related domestic databases, domestic library databases, domestic academic databases, and overseas library databases, with a view to identifying problems and suggesting solutions. The major obstacles to searching for Chinese ideographs in Unicode are classified as 'multicode characters', 'simplified characters', and 'variant characters', and three characters are chosen as samples to describe current practice. Thirteen Koreanology-related databases, five domestic library databases, five domestic academic databases, and two overseas library databases are analyzed in terms of Chinese ideograph search. To support searches for multicode characters, the open source of the Unicode Consortium must be applied. To improve searches for simplified and variant characters, a matching table must be standardized and proposed to the Unicode Consortium.
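The two remedies named in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's method: it assumes that "multicode characters" are the duplicate code points in the CJK Compatibility Ideographs block, which Unicode normalization folds onto their unified ideograph, and it uses a tiny hand-made `FOLD_TABLE` (a hypothetical name with illustrative entries) in place of the standardized matching table the author calls for.

```python
import unicodedata

# Illustrative only: a tiny hand-made folding table for simplified and
# variant forms. A real table would have to be standardized, as the
# abstract argues; these three entries are just examples.
FOLD_TABLE = {
    "\u56fd": "\u570b",  # simplified 国 -> traditional 國
    "\u5b66": "\u5b78",  # simplified 学 -> traditional 學
    "\u771f": "\u771e",  # variant 真 -> 眞
}


def search_key(text: str) -> str:
    """Return a normalized key to use for both indexing and querying."""
    # NFC replaces CJK compatibility ideographs (for example U+F914, U+F95C
    # and U+F9BF) with the unified ideograph they duplicate (U+6A02).
    unified = unicodedata.normalize("NFC", text)
    # Fold simplified and variant forms through the illustrative table.
    return "".join(FOLD_TABLE.get(ch, ch) for ch in unified)


def matches(query: str, document: str) -> bool:
    """Substring match on normalized keys rather than raw code points."""
    return search_key(query) in search_key(document)
```

With keys built this way, a query typed with any of the three compatibility code points for 樂 finds text stored with U+6A02, and a query for 國 finds text stored as 国.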

Improvement plan for 'Newly found ideographs(新出漢字)' in the digitalizing business of the old Korean documents (고전 자료 디지털화사업에서의 신출한자 처리 개선방안)

  • Lee, Jeong-Hwa
    • Korean Journal of Oriental Medicine / v.10 no.1 / pp.1-14 / 2004
  • As Korea enters the information age of the 21st century, the government is actively pursuing many projects to digitize the information resources of Korean studies, drawing on the country's advanced digital technologies and making these resources more widely available over the internet. 'Newly found ideographs (新出漢字)' are defined as characters that are identified and extracted from old Chinese-character documents during digitization and that are not yet registered in the Unicode CJK ideograph blocks and their extensions, the existing international standard. Korea is currently computerizing old documents on a very large scale. Meanwhile, the international standard for Chinese characters used in most Asian countries is being developed by the IRG. It is therefore very important for Korea to extract newly found ideographs accurately while building its databases, organize them for international standard encoding, submit them to the international organization, and finally have them registered in the standard.
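One way to locate candidates for such characters in transcribed text is to flag code points that have no assigned Unicode character. The sketch below rests on an assumption the abstract does not state: that the digitization project stores unencoded characters as private-use code points. The function name `unencoded_candidates` is hypothetical.

```python
import unicodedata


def unencoded_candidates(text: str) -> list[str]:
    """List code points in text that are not assigned Unicode characters."""
    found = []
    for ch in text:
        # General category 'Co' is private use; 'Cn' is unassigned in the
        # Unicode version bundled with this Python build.
        if unicodedata.category(ch) in ("Co", "Cn"):
            found.append(f"U+{ord(ch):04X}")
    return found
```

Characters flagged this way would still need manual review against the current standard before being proposed for encoding.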

Improvement plan for 'Newly found ideographs(新出漢字)' in the digitalizing business of the old Korean Medicine documents - with 'knowledge of oriental web service' - (한의학고전문헌 DB구축과 신출자 처리 - 한의학지식정보자원웹서비스를 중심으로 -)

  • Lee, Jeonghwa;Kim, Hong Jun
    • The Journal of Korean Medical History / v.18 no.1 / pp.127-141 / 2005
  • As we enter the 21st century, the Information Era, we are making a national effort to digitize the information resources of Korean Studies, based on our leading digital technology. However, it is very difficult to computerize the Chinese characters used in Korea, China, and Japan with technologies developed in the West. This paper gives an example of how to register and process the newly found ideographs (新出漢字) identified in the project to digitize the knowledge and information resources of Korean oriental medicine.

A Study of the Identity of Hangul Typography (한글 타이포그라피의 정체성에 관한 연구)

  • 안상수
    • Archives of design research / v.13 no.1 / pp.103-110 / 2000
  • Hangul came to life as part of the East Asian culture of the Chinese ideograph. Korean letter-culture is starkly different from Western letter-culture. In the Orient, letters were sacred and incantatory; they were objects of awe, which incorporated elements of the majestic, the mysterious, and the ritual. Here we had a cultural tradition that acknowledged the intrinsic value of letters, and it was in this context that Hangul was born as a completely phonetic system of writing. However, the characteristics of Hangul are quite different from those of Chinese ideographs, which are designed to convey a certain meaning. Despite the fact that Hangul is phonetic, its roots lie most definitely in the image of Chinese ideographs. This contrasts with the Latin alphabet, whose roots have been lost in its long journey of evolution. A notable characteristic of Hangul is that it combines the attributes of a phonetic writing system with the attributes of an image. In other words, as a compound script Hangul shares some attributes with Chinese ideographs, while as a phonetic writing system it is close to the Latin alphabet. Hangul is definitely a visual writing system that has its origins in the visual culture of Chinese characters, as well as being functionally a highly developed phonetic writing system. In short, Hangul has both of these attributes in one writing system. For us, living in the era of the image, these characteristics of Hangul awaken us to the meaning of existence in our visual culture. Unique among the world's writing systems, the identity of Hangul typography will become none other than the essence of our visual culture.

Distance Measures in HMM Clustering for Large-scale On-line Chinese Character Recognition (대용량 온라인 한자 인식을 위한 클러스터링 거리계산 척도)

  • Kim, Kwang-Seob;Ha, Jin-Young
    • Journal of KIISE:Software and Applications / v.36 no.9 / pp.683-690 / 2009
  • One of the major problems in building a good HMM-based system for large-scale on-line Chinese character recognition is increasing recognition time. In this paper, we propose a clustering method to address the recognition speed problem, together with an efficient distance measure between HMMs. In our experiments, recognition was about twice as fast, with a 10-candidate recognition accuracy of 95.37%, a decrease of only 0.9%, on the 20,902 Chinese characters defined in the Unicode CJK Unified Ideographs block.
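The figure of 20,902 matches the size of the code point range U+4E00 to U+9FA5, the CJK Unified Ideographs repertoire as first published in Unicode. That the paper's character set is exactly this range is an assumption; the short check below only confirms the arithmetic.

```python
# Assumed range: the original CJK Unified Ideographs repertoire.
FIRST, LAST = 0x4E00, 0x9FA5

# Number of code points in the inclusive range.
count = LAST - FIRST + 1


def in_original_cjk_range(ch: str) -> bool:
    """True if ch falls inside the assumed 20,902-character range."""
    return FIRST <= ord(ch) <= LAST
```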

A Comparative Study of Aphasics' Abilities in Reading and Writing Hangul and Hanja

  • Kim, Heui-Beom
    • Proceedings of the KSPS conference / 1996.10a / pp.289-293 / 1996
  • In Korean, as with Kana and Kanji in Japanese, two kinds of writing systems, Hangul (the Korean alphabet) and Hanja (Chinese characters; Kanji in Japanese), have been and still are in use. Hangul is phonetic while Hanja is ideographic. A phonetic alphabet represents the pronunciation of words, whereas in an ideographic system each character represents a concept. Aphasics suffer from language disorders following brain damage. The reading and writing of Hangul and Hanja by two Korean Broca's aphasics were analyzed with two goals. The first goal was to confirm the functional autonomy of reading and writing systems in the brain that has been argued by other researchers. The second goal was to reveal what differences the subjects show in reading and writing Hangul and Hanja. As experimental materials, 50 monosyllabic words were chosen in Hangul and in Hanja respectively. The 50 word pairs of Hangul and Hanja have the same meaning and are also the most familiar monosyllabic words for a group of normal adults in their fifties and sixties. The errors that the aphasic subjects made with the experimental materials are analyzed and discussed here. This analysis has confirmed that reading and writing systems are located in different parts of the brain. Furthermore, it seems clear that the two writing systems of Hangul and Hanja have their own respective processes.

Korean-Chinese Person Name Translation for Cross Language Information Retrieval

  • Wang, Yu-Chun;Lee, Yi-Hsun;Lin, Chu-Cheng;Tsai, Richard Tzong-Han;Hsu, Wen-Lian
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.489-497 / 2007
  • Named entity translation plays an important role in many applications, such as information retrieval and machine translation. In this paper, we focus on translating person names, the most common type of named entity in Korean-Chinese cross language information retrieval (KCIR). Unlike other languages, Chinese uses characters (ideographs), which makes person name translation difficult because one syllable may map to several Chinese characters. We propose an effective hybrid person name translation method to improve the performance of KCIR. First, we use Wikipedia as a translation tool based on the inter-language links between the Korean edition and the Chinese or English editions. Second, we adopt the Naver people search engine to find the query name's Chinese or English translation. Third, we extract Korean-English transliteration pairs from Google snippets, and then search for the English-Chinese transliteration in the database of Taiwan's Central News Agency or in Google. The performance of KCIR using our method is over five times better than that of a dictionary-based system. The mean average precision is 0.3490 and the average recall is 0.7534. The method can handle the translation of Chinese, Japanese, Korean, and non-CJK person names from Korean to Chinese. Hence, it substantially improves the performance of KCIR.
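For readers unfamiliar with the two reported metrics, the sketch below computes mean average precision and recall using their standard textbook definitions. Whether the paper applied a rank cutoff or another variant is not stated in the abstract, and the function names here are hypothetical.

```python
def average_precision(ranked: list[str], relevant: set[str]) -> float:
    """Average of precision values at each rank where a relevant item appears."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    # Divide by all relevant items, so unretrieved ones count as zero.
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs: list[tuple[list[str], set[str]]]) -> float:
    """Mean of average precision over (ranked list, relevant set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


def recall(ranked: list[str], relevant: set[str]) -> float:
    """Fraction of relevant items that appear anywhere in the ranked list."""
    return len(set(ranked) & relevant) / len(relevant) if relevant else 0.0
```

For a single query that returns documents a, b, c, of which a and c are relevant, the average precision is (1/1 + 2/3) / 2.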
