• Title/Summary/Keyword: Unicode


An Anti-Forensic Technique for Hiding Data in NTFS Index Record with a Unicode Transformation (유니코드 변환이 적용된 NTFS 인덱스 레코드에 데이터를 숨기기 위한 안티포렌식 기법)

  • Cho, Gyu-Sang
    • Convergence Security Journal / v.15 no.7 / pp.75-84 / 2015
  • In the "NTFS Index Record Data Hiding" method, messages are hidden in file names. The Windows NTFS file naming convention forbids certain ASCII characters in file names. When the hidden data is Hangul mixed with the Roman alphabet, or binary data, any code that would be a forbidden file-name character is converted to a designated Unicode code point to avoid a file creation error. This paper fixes the file creation errors caused by inadmissible file-name characters in the index record data hiding method. For Hangul with the Roman alphabet, the characters that cause a file creation error are converted to arbitrary Unicode code points outside the Hangul and Roman alphabet areas. For binary data, all 256 codes are converted to a designated Unicode area that excludes the extended Unicode (surrogate pair) area and the ASCII code area. The results of the two cases, Hangul with the Roman alphabet and binary data, show the applicability of the proposed method.
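
The binary case can be illustrated with a minimal sketch. It assumes, purely for illustration, that the 256 byte values are mapped onto the Braille Patterns block (U+2800 to U+28FF), which lies outside both the ASCII and surrogate ranges. The abstract does not say which Unicode area the paper designates, and the names `BASE`, `bytes_to_name`, and `name_to_bytes` are hypothetical.

```python
# Assumption: Braille Patterns block (U+2800..U+28FF) as the target area.
# It has exactly 256 code points, none in ASCII or the surrogate range.
BASE = 0x2800

# Characters Windows rejects in file names, plus ASCII control codes 0-31.
FORBIDDEN = set('\\/:*?"<>|') | {chr(c) for c in range(32)}


def bytes_to_name(data: bytes) -> str:
    """Map every byte value to one code point in the assumed area."""
    return "".join(chr(BASE + b) for b in data)


def name_to_bytes(name: str) -> bytes:
    """Invert bytes_to_name to recover the hidden bytes."""
    return bytes(ord(ch) - BASE for ch in name)


payload = bytes(range(256))
name = bytes_to_name(payload)
assert name_to_bytes(name) == payload   # the mapping is lossless
assert not FORBIDDEN & set(name)        # no forbidden character remains
```

Because every byte moves to a code point above the ASCII range, payload bytes such as `/` or `:` no longer collide with the forbidden file-name characters.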

Development of a Font Processing System for GSM Mobile Phone (GSM 핸드폰을 위한 폰트 처리 시스템의 설계 및 구현)

  • Lee, Sang-Bum;Lee, Yong-Hun
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.3 / pp.951-957 / 2010
  • In this thesis, we propose a font development system that can handle various fonts efficiently on GSM mobile terminals. ASCII code was widely used to represent characters on early computers, but it can represent only a limited number of characters. Unicode was later developed to cover more characters, and research on code systems that represent characters more efficiently is still ongoing. Attempts to apply Unicode to mobile terminals have not worked efficiently because there are too many characters across the various languages. To solve these problems, we designed and developed a font system that shortens the processing time and effort needed to apply Unicode to mobile terminals. Our system saves processing time and effort because it reduces meaningless processing compared to other systems.

A Study on Auto-Generation of Dactylology and Chirology Animation from Text Inputs (텍스트 입력 기반 지화 및 수화 애니메이션 자동 생성에 관한 연구)

  • Lee, Geum-Yong
    • Proceedings of the Korea Information Processing Society Conference / 2002.04b / pp.1151-1154 / 2002
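  • What Unicode has in common with finger spelling (dactylology) and sign language (chirology) is that a unique form of expression is mapped 1:1 to each letter or word of a language. In Unicode, a unique hexadecimal code is assigned to each letter; in finger spelling and sign language, a hand gesture expressing a unique motion is assigned to each letter and each word. This paper introduces an algorithm, and its implementation, that produces an animation effect by successively rendering the finger-spelling and sign-language hand-gesture images that correspond to a text input.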
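
The 1:1 correspondence described above can be sketched as a lookup from letters to gesture frames. This is a minimal illustration and not the paper's algorithm: it assumes Hangul syllables are first split into their letters with Unicode canonical decomposition (NFD), and the frame file-name pattern `gesture_U+XXXX.png` is hypothetical.

```python
import unicodedata


def text_to_frames(text: str) -> list:
    """Return one (hypothetical) gesture image name per letter of the input."""
    frames = []
    # NFD splits each precomposed Hangul syllable into its conjoining jamo.
    for ch in unicodedata.normalize("NFD", text):
        if ch.isspace():
            continue  # this sketch renders no frame for whitespace
        frames.append(f"gesture_U+{ord(ch):04X}.png")
    return frames


# The syllable U+D55C decomposes into U+1112, U+1161, U+11AB.
print(text_to_frames("\ud55c"))
```

Rendering the returned frames in order would give the animation effect; a real system would also need per-word gestures for sign language, which this sketch leaves out.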


A study on Mapping the Unicode based Hangul-Hanja for prescription names in Korean Medicine (처방명 연계를 위한 유니코드 한자 기반의 한글-한자 매핑정보 구축에 관한 연구)

  • Jeon, Byoung-Uk;Kim, An-Na;Kim, Ji-Young;Oh, Yong-Taek;Kim, Chul;Song, Mi-Young;Jang, Hyun-Chul
    • Korean Journal of Oriental Medicine / v.18 no.3 / pp.133-139 / 2012
  • Objective: UMLS is an ontology that establishes a database for medical terminology by gathering various medical vocabularies representing the same fundamental concepts. Method: Although Chinese characters are represented in the Chinese part of the Korean Unicode system in a computer, the writing of Chinese characters varies depending on the Chinese input system and the writer's level of knowledge. As a result, the representation of Chinese writing in a computer can differ considerably from that in an old Chinese document. Therefore, a meaningful relationship between digital Chinese terminology and its Korean translation is necessary in order to build an ontology for Chinese medical terms from Oriental medical prescriptions in a computer system. Result: This research presents 1:1 mapping information among the Chinese characters used in Oriental medical prescriptions, with an analysis of 'same character, different sound' and 'same meaning, different shape' cases in the Chinese part of the Unicode system. Conclusions: Furthermore, the research provides a top-down menu of relationships between Chinese terms and Korean terms in medical prescriptions, on the assumption that each Oriental medical prescription has its own unique meaning.
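
One way the 'same character, different sound' case shows up in Unicode is through the CJK compatibility ideographs, where one Hanja has several code points that differ only by Korean reading. The sketch below groups such duplicates under their canonical code point using NFC normalization. It is an illustration under that assumption, not the mapping procedure of the paper, and `group_variants` is a hypothetical helper.

```python
import unicodedata
from collections import defaultdict


def group_variants(chars: str) -> dict:
    """Group code points by the character they normalize to under NFC."""
    groups = defaultdict(list)
    for ch in chars:
        canonical = unicodedata.normalize("NFC", ch)
        groups[canonical].append(f"U+{ord(ch):04X}")
    return dict(groups)


# U+6A02 and three compatibility ideographs that decompose to it.
print(group_variants("\u6a02\uf914\uf95c\uf9bf"))
```

A Hangul-Hanja mapping table could key on the canonical code point while still recording which variant appeared in the source text.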

New Text Steganography Technique Based on Part-of-Speech Tagging and Format-Preserving Encryption

  • Mohammed Abdul Majeed;Rossilawati Sulaiman;Zarina Shukur
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.1 / pp.170-191 / 2024
  • The transmission of confidential data using cover media is called steganography. The three requirements of any effective steganography system are high embedding capacity, security, and imperceptibility. The structure of a text file, which makes syntax and grammar more visually obvious than in other media, contributes to its poor imperceptibility. Text is regarded as the most challenging carrier for hiding secret data because it has little redundant data compared to other digital objects. Unicode characters, especially non-printing or invisible ones, are used to hide data by mapping a specific number of secret data bits to each character and inserting the character into the spaces of the cover text. These characters are known to offer limited space for embedding secret data. Current studies that use Unicode characters in text steganography have focused on increasing the data hiding capacity despite the insufficient redundant data in a text file. A sequential embedding pattern is often selected and applied at all available positions in the cover text. This embedding pattern negatively affects the imperceptibility and security of the text steganography system. This study therefore attempts to overcome these limitations by combining the part-of-speech (POS) tagging technique with randomization in data hiding. Combining the two techniques allows the Unicode characters to be inserted in randomized patterns at specific positions in the cover text, increasing data hiding capacity with minimal effect on imperceptibility and security. Format-preserving encryption (FPE) is also used to encrypt the secret message without changing its size before the embedding process. A comparison with existing techniques shows that the proposed technique fulfils the capacity, imperceptibility, and security requirements for the cover file.
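
To make the baseline concrete, the following sketch shows the simple sequential embedding that the abstract criticizes, not the proposed POS-tagging and FPE technique. It assumes two invisible characters, ZERO WIDTH SPACE (U+200B) for bit 0 and ZERO WIDTH NON-JOINER (U+200C) for bit 1, and hides one byte after each space of the cover text; the function names are hypothetical.

```python
ZERO, ONE = "\u200b", "\u200c"  # assumed carriers for bit 0 and bit 1


def embed(cover: str, secret: bytes) -> str:
    """Hide one secret byte (8 invisible characters) after each space."""
    words = cover.split(" ")
    if len(secret) > len(words) - 1:
        raise ValueError("cover text has too few spaces for the secret")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        payload = ""
        if i < len(secret):
            payload = "".join(ONE if bit == "1" else ZERO
                              for bit in f"{secret[i]:08b}")
        out.append(" " + payload + word)
    return "".join(out)


def extract(stego: str) -> bytes:
    """Collect the invisible characters and rebuild the secret bytes."""
    bits = "".join("1" if ch == ONE else "0"
                   for ch in stego if ch in (ZERO, ONE))
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


cover = "the quick brown fox jumps"
stego = embed(cover, b"hi")
assert extract(stego) == b"hi"
```

Because the payload always fills the first available gaps in order, the pattern is predictable, which is the weakness that randomized, POS-guided placement is meant to address.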

A Unicode based Deep Handwritten Character Recognition model for Telugu to English Language Translation

  • BV Subba Rao;J. Nageswara Rao;Bandi Vamsi;Venkata Nagaraju Thatha;Katta Subba Rao
    • International Journal of Computer Science & Network Security / v.24 no.2 / pp.101-112 / 2024
  • Telugu is considered the fourth most widely used language in India, especially in regions such as Andhra Pradesh, Telangana, and Karnataka, and it is also a growing spoken language internationally. The language comprises various dependent and independent vowels, consonants, and digits. Despite this, Telugu Handwritten Character Recognition (HCR) has not advanced far. HCR is a neural network technique for converting a document image into editable text that can be used in many other applications, which saves the time and effort of starting over from the beginning every time. In this work, a Unicode-based Handwritten Character Recognition (U-HCR) model is developed for translating handwritten Telugu characters into English. Using the Centre of Gravity (CG), the model can divide a compound character into individual characters with the help of Unicode values. Both online and offline Telugu character datasets were used for training. To extract features from the scanned image, we used a convolutional neural network (CNN) along with machine learning classifiers such as Random Forest and Support Vector Machine. The Stochastic Gradient Descent (SGD), Root Mean Square Propagation (RMS-P), and Adaptive Moment Estimation (ADAM) optimizers are used to enhance the performance of U-HCR and to reduce the loss function value of the CNN. On both online and offline datasets, the proposed model showed promising results, with accuracies of 90.28% for SGD, 96.97% for RMS-P, and 93.57% for ADAM.

Support on Ideograph Characters Search of Unicode Based Information System (정보 시스템의 유니코드 기반 한자 검색 지원)

  • Yoon, So-Young
    • Journal of the Korean Society for information Management / v.24 no.4 / pp.375-391 / 2007
  • The Unicode Han ideograph character set differs from our principle of phonetic-value ordering in that it follows the KangXi radical-stroke ordering of characters. Therefore, an information system should support ideograph search based on a precise analysis of materials that consist of Korean characters (Hangul) and ideograph characters (Hanja). The History Information System has been maintaining a Hanja (Chinese character) to Hangul dictionary, a terminology dictionary for composition, borrowing, and non-ideographic principles, a variant forms dictionary, and a list of recently discovered Chinese characters.

A Study on the Use of Hangeul Identifier in Java (Java에서 한글 식별자 사용에 관한 연구)

  • Yang, Dan-Hee
    • Journal of the Korea Convergence Society / v.8 no.10 / pp.53-60 / 2017
  • The use of 'Idumunja' in programs was inevitable before Unicode came out. However, even now that Unicode has been established as an international standard in both name and reality, some still insist that Hangeul identifiers should be avoided. This study surveyed students on why they prefer English identifiers and on notations that could substitute for the function of English capital letters when using Hangeul identifiers. We then discuss the weakness of the argument that using English identifiers is a global trend, and propose two notations for Korean identifiers. To improve software productivity, programmers' job satisfaction, and ease of maintenance, the use of Hangeul identifiers should quickly become common.

Study on Scrambling Occurrence in Line Coder for UTF-8 Hangul Syllable Code based on Unicode (유니코드 기반 UTF-8 한글글자마디 부호의 회선부호기내 스크램블링 발생에 관한 연구)

  • Hong, Wan-Pyo
    • The Journal of the Korea institute of electronic communication sciences / v.10 no.7 / pp.831-836 / 2015
  • This paper studies the occurrence of scrambling in the line coder for UTF-8 Hangul syllable codes based on Unicode. Scrambling occurs when four consecutive "0" bits enter the line coder from the source codes. ITU-T currently applies the HDB-3 scrambling method in the AMI line coder. According to the study, scrambling occurs for about 39% of the UTF-8 codes of Unicode Hangul syllables.
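
The triggering condition can be checked with a short sketch that scans the UTF-8 bytes of every precomposed Hangul syllable (U+AC00 to U+D7A3) for a run of four consecutive zero bits. It counts only runs that fall inside a single syllable's three bytes and ignores runs that span neighbouring characters; the paper's counting rule is not given in the abstract, so the fraction computed here need not match the reported figure. The function names are hypothetical.

```python
def has_zero_run(data: bytes, run: int = 4) -> bool:
    """True if the bit string of data contains `run` consecutive 0 bits."""
    bits = "".join(f"{b:08b}" for b in data)
    return "0" * run in bits


def hangul_zero_run_fraction() -> float:
    """Fraction of Hangul syllables whose UTF-8 code has a 0000 run."""
    first, last = 0xAC00, 0xD7A3  # precomposed Hangul syllables block
    hits = sum(has_zero_run(chr(cp).encode("utf-8"))
               for cp in range(first, last + 1))
    return hits / (last - first + 1)


# U+AC00 encodes as EA B0 80, whose bits contain a run of zeros.
assert has_zero_run("\uac00".encode("utf-8"))
print(f"{hangul_zero_run_fraction():.1%}")
```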

Designing Unicode-compliant Indic-script based Institutional Digital Repository with special reference to Bengali

  • Roy, Bijan Kumar;Biswas, Subal Chandra;Mukhopadhyay, Parthasarathi
    • International Journal of Knowledge Content Development & Technology / v.8 no.3 / pp.53-67 / 2018
  • A local-language information storage and retrieval system is essential for any online digital repository. This paper reports the development of an interface in Bengali that allows users to browse and search Indic-script documents and allows administrators to perform various system-level operations. It briefly describes the origin and key characteristics of Indic scripts and their encoding in the Unicode standard, with special reference to the Bengali language. It also demonstrates the development of a Bengali-script information representation and retrieval (IRR) system, BURA (Burdwan University Research Archive), using various open standards and open source software (OSS), and discusses the factors essential for building successful Indic-script multilingual digital libraries. The suggested strategies may help digital library developers design appropriate multi-script information access services for other Indic-script languages.