• Title/Summary/Keyword: Chinese Character Font

A study on compression and decompression of Hangeul and Chinese character bitmap font (한글 한자 비트 맵 폰트의 압축과 복원에 관한연구)

  • 조경윤
    • Journal of the Korean Institute of Telematics and Electronics B / v.33B no.4 / pp.63-71 / 1996
  • This paper proposes a variable-length block code for real-time compression and decompression of Hangeul and Chinese character bitmap fonts. The proposed code achieves a good compression ratio on complete-form Hangeul bitmap fonts in the Myeongjo and Gothic styles and on Chinese character bitmap fonts in the Batang and Dotum styles. In addition, a compression and decompression ASIC was designed and simulated with CAD tools. The ASIC is implemented in a 0.8-micron CMOS sea-of-gates technology using about 5,200 gates; with this simple hardware it compresses and decompresses at up to 33 Mbit/s, which suits real-time applications.


A Distinction of the Korean Character, Chinese Character and English Character using the Threshold Stroke Density (임계 획 밀도를 이용한 한글, 한자, 영문구분)

  • 원남식
    • Journal of Korea Society of Industrial Information Systems / v.5 no.4 / pp.32-38 / 2000
  • In a document recognition system that handles multiple fonts and multiple scripts, identifying the script of each character before recognition is an important factor in raising the recognition rate. The scripts of different countries each have distinctive compositional characteristics. In this paper, we use stroke density to distinguish scripts and apply it to Korean, English, and Chinese characters. Input data are normalized so that the method can handle multi-font documents. Experimental results show that the proposed method distinguishes Korean and English characters with a probability of more than 80%.
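
The abstract above does not define stroke density or give its threshold, so the following is only a minimal sketch under stated assumptions: stroke density is taken here to be the mean number of ink runs (stroke crossings) per horizontal and vertical scan line of a size-normalized binary glyph. The function names, the `threshold` value, and the "dense"/"sparse" labels are hypothetical and are not taken from the paper.

```python
from typing import List

Bitmap = List[List[int]]  # 1 = ink pixel, 0 = background pixel


def runs_in_line(line: List[int]) -> int:
    """Count maximal runs of ink pixels (stroke crossings) in one scan line."""
    count, prev = 0, 0
    for px in line:
        if px == 1 and prev == 0:  # background-to-ink transition starts a run
            count += 1
        prev = px
    return count


def stroke_density(glyph: Bitmap) -> float:
    """Mean number of stroke crossings per scan line, over all rows and columns.

    This definition is an assumption made for illustration only.
    """
    columns = [list(col) for col in zip(*glyph)]
    lines = glyph + columns
    return sum(runs_in_line(line) for line in lines) / len(lines)


def classify(glyph: Bitmap, threshold: float = 1.5) -> str:
    """Label a glyph by comparing its density with a hypothetical threshold."""
    return "dense" if stroke_density(glyph) >= threshold else "sparse"


# A simple L-shaped glyph: every row and every column crosses one stroke.
L_SHAPE = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]

# Three vertical bars: every row crosses three strokes.
BARS = [
    [1, 0, 1, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 0, 1, 0, 1],
]

print(stroke_density(L_SHAPE), classify(L_SHAPE))
print(stroke_density(BARS), classify(BARS))
```

In practice a threshold of this kind would have to be tuned on normalized samples of each script; the value used here is a placeholder.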


MSFM: Multi-view Semantic Feature Fusion Model for Chinese Named Entity Recognition

  • Liu, Jingxin;Cheng, Jieren;Peng, Xin;Zhao, Zeli;Tang, Xiangyan;Sheng, Victor S.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.6 / pp.1833-1848 / 2022
  • Named entity recognition (NER) is an important basic task in the field of Natural Language Processing (NLP). Recently, deep learning approaches that extract word segmentation or character features have proved effective for Chinese Named Entity Recognition (CNER). However, because these approaches extract only some of the available features, they do not mine textual information from multiple perspectives and dimensions, so the model cannot fully capture semantic features. To tackle this problem, we propose a novel Multi-view Semantic Feature Fusion Model (MSFM). The proposed model consists of two core components: the Multi-view Semantic Feature Fusion Embedding Module (MFEM) and the Multi-head Self-Attention Mechanism Module (MSAM). Specifically, the MFEM extracts character features, word boundary features, radical features, and pinyin features of Chinese characters. The acquired font shape, font sound, and font meaning features are fused to enhance the semantic information of Chinese characters at different granularities. Moreover, the MSAM captures the dependencies between characters in a multi-dimensional subspace to better understand the semantic features of the context. Extensive experimental results on four benchmark datasets show that our method improves the overall performance of the CNER model.

Few-Shot Content-Level Font Generation

  • Majeed, Saima;Hassan, Ammar Ul;Choi, Jaeyoung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.4 / pp.1166-1186 / 2022
  • Artistic font design has become an integral part of visual media. However, without prior knowledge of the font domain, it is difficult to create distinct font styles. When the number of characters is limited, this task becomes easier (e.g., only Latin characters). However, designing CJK (Chinese, Japanese, and Korean) characters presents a challenge due to the large number of character sets and complexity of the glyph components in these languages. Numerous studies have been conducted on automating the font design process using generative adversarial networks (GANs). Existing methods rely heavily on reference fonts and perform font style conversions between different fonts. Additionally, rather than capturing style information for a target font via multiple style images, most methods do so via a single font image. In this paper, we propose a network architecture for generating multilingual font sets that makes use of geometric structures as content. Additionally, to acquire sufficient style information, we employ multiple style images belonging to a single font style simultaneously to extract global font style-specific information. By utilizing the geometric structural information of content and a few stylized images, our model can generate an entire font set while maintaining the style. Extensive experiments were conducted to demonstrate the proposed model's superiority over several baseline methods. Additionally, we conducted ablation studies to validate our proposed network architecture.

A Character Shape Encoding Method to Input Chinese Characters in Old Documents (고문헌 벽자(僻字) 입력을 위한 한자 자형 부호화 방법)

  • Kim, Kiwang
    • Journal of Korean Medical Classics / v.32 no.1 / pp.105-116 / 2019
  • Objectives: Ancient classic literature contains many rare Chinese characters, so-called Byeokja (僻字), and it often includes Chinese characters that are not registered in Unicode as well as variant characters (heterogeneous characters) that cannot be found in current font sets. In order to register all possible Chinese characters, including these, as units of information exchange, this study proposes a method to encode the morphological information of Chinese characters according to fixed rules. Methods: This study suggests methods to encode the connections between the nodules constituting a Chinese character and the coordinates of those nodules. In addition, it adds rules for expressing information about curves, for expressing the aspect ratios of characters, for minimizing coordinate lines, and for expressing the aggregation status of character components. Results: With the proposed method, it is possible to generate codes of a certain length by extracting only the information that expresses the morphological configuration of characters. Conclusions: The character encoding method proposed in this study can be used to distinguish variant characters with small variations among Byeokja, new Chinese characters, and character strokes, and to store and search them.

Animation Generation for Chinese Character Learning on Mobile Devices (모바일 한자 학습 애니메이션 생성)

  • Koo, Sang-Ok;Jang, Hyun-Gyu;Jung, Soon-Ki
    • Journal of KIISE:Computer Systems and Theory / v.33 no.12 / pp.894-906 / 2006
  • Developing mobile content is difficult because of the many constraints of mobile environments. Good mobile content cannot be made simply by visually shrinking existing content from the wired Internet. Therefore, it is essential to devise a data representation and to develop an authoring tool that meet the needs of the mobile content market. We propose compact mobile content for learning Chinese characters and have developed an authoring tool for it. The animation our system produces is realistic, as if someone were writing the characters with a pen or brush. Moreover, our authoring tool lets a user generate a Chinese character animation easily and rapidly, even without much knowledge of computer graphics, mobile programming, or Chinese characters. The stroke animation is generated as follows. First, we take the basic character shape, represented as several contours, from a TTF (TrueType Font) file and obtain the stroke segmentation and stroke ordering from simple user input. Then, we decompose the whole character shape into strokes using a polygonal approximation technique. Next, the animation for each stroke is generated automatically by a scan line algorithm ordered by the stroke direction. Finally, the ordered scan lines are compressed into a few integers by reducing coordinate redundancy. As a result, the stroke animation of our system is even smaller than a GIF animation. Our method can be extended to the rendering and animation of Hangul or of general 2D shapes based on vector graphics. We plan to find a method that automates stroke segmentation and ordering without user input.

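A Comparative Study of Korean and Chinese Character Forms 2 - Focusing on the 900 High School Basic Chinese Characters for Classical Chinese Education (한중한자자형비교연구(韓中漢字字形比較硏究)2 - 한문(漢文) 교육용(敎育用) 기초한자(基礎漢字) 고등학교용(高等學校用) 900자(字)를 중심(中心)으로)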

  • Gang, Hye-Geun
    • 중국학논총 / no.62 / pp.1-25 / 2019
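  • The author compared the "900 high school basic Chinese characters for Classical Chinese education" (漢文敎育用基礎漢字高等學校用900字) designated by the Korean Ministry of Education with the standard Chinese character forms used in China. The results are as follows: (1) 424 characters (about 47%) have completely identical forms (marked "=" beside the character in the appendix "900 characters for high school"); (2) 86 characters (about 10%) have similar forms (marked "Δ" in the same appendix); (3) 389 characters (about 43%) have different forms (marked "×" in the same appendix). A similar form is not an identical form, so such characters should also be counted as having different forms; the characters in these two categories together number 475 (about 53%). The main sources of the differences between Korean and Chinese character forms are not only "simplified characters" and "new forms of inherited characters"; "standard forms selected from among variant characters" also differ from the forms of the characters commonly used in Korea.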

SEL-RefineMask: A Seal Segmentation and Recognition Neural Network with SEL-FPN

  • Dun, Ze-dong;Chen, Jian-yu;Qu, Mei-xia;Jiang, Bin
    • Journal of Information Processing Systems / v.18 no.3 / pp.411-427 / 2022
  • Extracting historical and cultural information from seals in ancient books is of great significance. However, ancient Chinese seal samples are scarce and carving methods are diverse, and traditional greyscale-based digital image processing methods have difficulty achieving good segmentation and recognition performance. Recently, some deep learning algorithms have been proposed to address this problem; however, current neural networks are difficult to train owing to the lack of datasets. To solve these problems, we propose SEL-RefineMask, which combines a selector of feature pyramid network (SEL-FPN) with RefineMask to segment and recognize seals. We designed the SEL-FPN to intelligently select a specific layer, representing a particular scale in the FPN, and to reduce the number of anchor frames. We ran experiments with several instance segmentation networks as baselines, and the top-1 segmentation result of 64.93% is 5.73% higher than that of humans. The top-1 result of the SEL-RefineMask network reached 67.96%, surpassing the baseline results. After segmentation, a vision transformer was used to recognize the segmentation output, and the accuracy reached 91%. Furthermore, a dataset of seals in ancient Chinese books (SACB) for segmentation and a small seal font (SSF) dataset for recognition were established and are publicly available on the website.

A historical study on the flexibility square-format typeface and the prospects - Focused on the three-pairs fonts of hangeul - (탈네모글꼴에 관한 역사적 연구와 전망 - 세벌식 한글 글꼴을 중심으로 -)

  • Yu, Jeong-Mi
    • Archives of design research / v.19 no.2 s.64 / pp.241-250 / 2006
  • Hangeul, the unique Korean script, was invented according to explicit character-making principles and on the basis of exhaustive scholarly research. While most of the world's scripts evolved naturally, Hangeul was invented from a precise linguistic analysis of its time, and it is therefore the most scientific and rational of the world's scripts. Nevertheless, Hangeul typeface designs do not seem to have properly inherited this scientific and rational spirit, because square forms have been used unchanged under the influence of the Chinese characters that prevailed at the time. Designing a single set of square characters requires as many as 11,172 glyphs, which means that the advantages of Hangeul are not fully used; Hangeul was invented to represent every sound through combinations of 28 vowels and consonants. The problems of such square fonts began to be recognized in the 1900s, when typewriters were first introduced from the West. Because a typewriter lays out the 28 letters on its keyboard, letters can easily be combined on it. The so-called flexibility square-format typeface was born in this way. In particular, the three-pairs fonts among these can combine up to 67 letters, including vowels and consonants. The three-pairs font system can help solve the problems arising from conventional square fonts and inherit the original spirit of the invention of Hangeul. This study reviews the history of the three-pairs font designs made possible by the mechanical encoding of Hangeul and, on that basis, suggests some desirable directions for future Hangeul fonts. Since the flexibility square-format typeface is expected to evolve further with the development of digital technology, it should serve our information age in terms of both function and convenience. Just as Hunminjeongeum sought literal independence from Chinese characters, so flexibility square-format typeface designs would help recover the identity of Hangeul font design.
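
The figure of 11,172 square glyphs matches the number of precomposed Hangul syllables in the Unicode Hangul Syllables block (U+AC00 to U+D7A3), which is 19 initial consonants × 21 medial vowels × 28 final positions (27 final consonants plus "no final"). The sketch below illustrates that arithmetic. The 19/21/28 breakdown and the index formula come from the Unicode standard, not from the abstract, and the function names are hypothetical.

```python
from typing import Tuple

HANGUL_BASE = 0xAC00  # code point of the first precomposed syllable
N_INITIAL, N_MEDIAL, N_FINAL = 19, 21, 28  # the 28 finals include "no final"
N_SYLLABLES = N_INITIAL * N_MEDIAL * N_FINAL  # 11,172 precomposed syllables


def compose(initial: int, medial: int, final: int = 0) -> str:
    """Build a precomposed syllable from zero-based jamo indices."""
    if not (0 <= initial < N_INITIAL
            and 0 <= medial < N_MEDIAL
            and 0 <= final < N_FINAL):
        raise ValueError("jamo index out of range")
    index = (initial * N_MEDIAL + medial) * N_FINAL + final
    return chr(HANGUL_BASE + index)


def decompose(syllable: str) -> Tuple[int, int, int]:
    """Split a precomposed syllable back into (initial, medial, final) indices."""
    index = ord(syllable) - HANGUL_BASE
    if not 0 <= index < N_SYLLABLES:
        raise ValueError("not a precomposed Hangul syllable")
    initial, rest = divmod(index, N_MEDIAL * N_FINAL)
    medial, final = divmod(rest, N_FINAL)
    return initial, medial, final


print(N_SYLLABLES)        # number of glyphs a square (precomposed) font needs
print(compose(18, 0, 4))  # initial index 18, medial index 0, final index 4
print(decompose("가"))    # first syllable of the block
```

A square font needs one glyph per syllable, whereas a component-based design only needs glyphs for the individual letters and composes syllables from them, which is the contrast the abstract draws.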
