• Title/Summary/Keyword: Language Conversion

Organizing an in-class hackathon to correct PDF-to-text conversion errors of Genomics & Informatics 1.0

  • Kim, Sunho;Kim, Royoung;Nam, Hee-Jo;Kim, Ryeo-Gyeong;Ko, Enjin;Kim, Han-Su;Shin, Jihye;Cho, Daeun;Jin, Yurhee;Bae, Soyeon;Jo, Ye Won;Jeong, San Ah;Kim, Yena;Ahn, Seoyeon;Jang, Bomi;Seong, Jiheyon;Lee, Yujin;Seo, Si Eun;Kim, Yujin;Kim, Ha-Jeong;Kim, Hyeji;Sung, Hye-Lynn;Lho, Hyoyoung;Koo, Jaywon;Chu, Jion;Lim, Juwon;Kim, Youngju;Lee, Kyungyeon;Lim, Yuri;Kim, Meongeun;Hwang, Seonjeong;Han, Shinhye;Bae, Sohyeun;Kim, Sua;Yoo, Suhyeon;Seo, Yeonjeong;Shin, Yerim;Kim, Yonsoo;Ko, You-Jung;Baek, Jihee;Hyun, Hyejin;Choi, Hyemin;Oh, Ji-Hye;Kim, Da-Young;Park, Hyun-Seok
    • Genomics & Informatics / v.18 no.3 / pp.33.1-33.7 / 2020
  • This paper describes a community effort to improve earlier versions of the full-text corpus of Genomics & Informatics by semi-automatically detecting and correcting PDF-to-text conversion errors and optical character recognition errors during the first Genomics & Informatics Annotation Hackathon (GIAH) event. Extracting text from multi-column biomedical documents such as Genomics & Informatics is notoriously difficult. The hackathon was piloted as part of a coding competition of the ELTEC College of Engineering at Ewha Womans University so that researchers and students could create or annotate their own versions of the Genomics & Informatics corpus, gain and create knowledge about corpus linguistics, and at the same time acquire tangible and transferable skills. The projects proposed during the hackathon draw on an internal database containing different versions of the corpus and annotations.

Design and Application of XTML Script Language based on XML (XML을 이용한 스크립트 언어 XTML 의 설계 및 응용)

  • Jeong, Byeong-Hui;Park, Jin-U;Lee, Su-Yeon
    • Journal of KIISE:Computing Practices and Letters / v.5 no.6 / pp.816-833 / 1999
  • Output documents of existing word processors, which are organized around style information or presentation attributes, can be restructured as XML (Extensible Markup Language) documents that reflect a hierarchical logical structure such as title, abstract, chapter, and paragraph. Documents structured this way are easier to interchange and to use on the Internet. The conversion requires a complex process called auto-tagging, in which the elements of an output document are inferred from style information, the sequence of text, and similar cues; this differs from simple format conversion. In this paper, we define XTML (XML Transformation Markup Language) in DTD (Document Type Definition) form so that conversion scripts can map the flat structure of style-oriented documents onto the hierarchical logical structure of XML under a variety of DTDs. We wrote conversion scripts as instances of this DTD and applied them to auto-tagging. XTML and its DTD are expressed in XML syntax. To process XML documents efficiently, XTML builds and stores each document as a tree structure called a "GROVE" and provides functions and commands for manipulating it.

SDL-OPNET Model Conversion Technique for the Development of Communication Protocols with an Integrated Model Design Approach (통합 모델 설계 방식 기반 통신 프로토콜 개발을 위한 SDL-OPNET 모델 변환 기법)

  • Kim, Jae-Woo;Kim, Tae-Hyong
    • IEMEK Journal of Embedded Systems and Applications / v.5 no.2 / pp.67-76 / 2010
  • Although both functional verification and performance evaluation are necessary for developing effective and reliable communication systems, they have often been performed independently: the former by functional modeling with formal language tools, the latter by performance modeling with professional network performance evaluation tools. Modeling the same system separately and repeatedly, however, often increases cost and introduces inconsistency between the models. To overcome this problem, this paper proposes an integrated model design approach that evaluates the performance of a communication protocol designed in SDL through SDL-OPNET model conversion. The proposed technique generates OPNET skeleton code from the Tau-generated C code of the SDL model by analyzing the relations between SDL and OPNET models. The IEEE 802.2 LLC protocol was used as an example of model conversion to show the applicability and effectiveness of the proposed technique.

Statistical Approach to the Automatic Korean-English String Conversion (통계적 기법에 의한 한-영 문자열의 자동 전환)

  • Ahn, Young-Hoon;Kang, Seung-Shik
    • Annual Conference on Human and Language Technology / 2001.10d / pp.205-208 / 2001
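  • We propose a method that automatically determines whether an input string is Korean or English and converts it accordingly, so that the user does not have to switch the input mode manually when typing Korean or English. To compute the probability that a string is Korean, the method uses syllable composition requirements and syllable frequency information; to compute the probability that it is English, it uses English bigram and trigram information. For strings that mix Korean and English, it generates the mixed string by recognizing the boundary positions where the Korean and English probabilities cross.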
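One ingredient of the abstract above, the syllable composition requirement, can be sketched as a check on raw keystrokes. The sketch below assumes the standard Dubeolsik (2-set) keyboard layout and lowercase keys only; the name `is_composable` and the regular expression are illustrative and are not the authors' implementation. It tests whether a QWERTY key sequence can be segmented into Hangul syllables of the form initial consonant + vowel + optional final consonant.

```python
import re

# Assumption: Dubeolsik layout, unshifted keys only.
# Keys that produce consonants (usable as syllable initials or single finals).
INITIAL = "[qwertasdfgzxcv]"
# Single vowel keys, plus the two-key compound vowels (e.g. "hk" for the vowel "wa").
VOWEL = "(?:hk|ho|hl|nj|np|nl|ml|[yuiophjklbnm])"
# Optional final: a two-key consonant cluster or a single consonant key.
FINAL = "(?:rt|sw|sg|fr|fa|fq|ft|fx|fv|fg|qt|[qwertasdfgzxcv])?"

# One or more syllables; regex backtracking decides whether a consonant
# closes the current syllable or opens the next one.
SYLLABLES = re.compile(f"(?:{INITIAL}{VOWEL}{FINAL})+")


def is_composable(keys: str) -> bool:
    """Return True if the key sequence can be read as whole Hangul syllables."""
    return SYLLABLES.fullmatch(keys) is not None
```

With this sketch, `is_composable("dkssud")` is true (the keys for 안녕) and `is_composable("hello")` is false. `is_composable("the")` is also true, which shows why a composition check alone cannot decide the language and why the frequency and n-gram probabilities described in the abstract are needed.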

XML Conversion of HTML Documents Using Web Schema (웹 스키마를 이용한 HTML 문서의 XML 변환)

  • 오금용;박동문;황인준
    • Proceedings of the Korean Information Science Society Conference / 2001.04b / pp.175-177 / 2001
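  • The continued growth in Web use has sharply increased the amount of information available, so the Web now matters not only as a medium for exchanging information but also as a place for storing it. However, most Web pages are currently written in HTML (HyperText Markup Language), which has many shortcomings for information management. One way to address this is to author documents in, or convert them to, XML (Extensible Markup Language), which is emerging as a structured and functional language. This paper proposes a method for converting HTML documents into XML documents that analyzes the structure of the HTML document and uses a Web schema built from the analysis results to perform a structure-oriented conversion.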

A Study on Hangul Code Conversion Interface (한글 코드 변환 인터페이스에 관한 연구)

  • Yun, Ho-Sang;Baik, Doo-Kwon;Hwang, Chong-Sun
    • Annual Conference on Human and Language Technology / 1990.11a / pp.39-46 / 1990
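  • With recent advances in computing, computers are now used in many fields. However, because computers were developed with no consideration for Hangul, many problems have arisen in using Hangul. This paper examines the problems of using Hangul on computers and studies a Hangul code conversion interface to resolve them.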

Prospect of Treatment with Herb Medicine for Developmental Delay of Language and Intelligence Quotient (어지와 지능지수에 대한 한약치료의 전망)

  • Park, Jae-Hyung;Park, Jae-Hyun;Yun, Young-Ju;Jeong, Seul-Ki;Lim, Ja-Sung;Paeck, Eun-Kyung
    • Journal of Physiology & Pathology in Korean Medicine / v.21 no.4 / pp.1025-1029 / 2007
  • It is widely assumed that the Intelligence Quotient (IQ) is determined by inherent disposition and environmental factors. IQ is estimated from an age-converted score; it stabilizes around age 4, and adult IQ can be predicted after age 10. Although children with Mental Retardation (MR) show delayed language development from early infancy, they receive only special education, including speech and language therapy, and no specific medication. In traditional Korean medicine, the etiology and treatment of developmental language delay have been handed down for a long time, and some studies on herbs and prescriptions for improving language development have been undertaken recently. We found several cases of significant IQ elevation in children treated with long-term Korean herbal medication for language improvement. In these cases, performance IQ in particular showed significant change. We therefore suggest that Korean herbal medicine might improve cognitive development in children with MR.

Tester Structure Expression Language and Its Application to the Environment for VLSI Tester Program Development

  • Sato, Masayuki;Wakamatsu, Hiroki;Arai, Masayuki;Ichino, Kenichi;Iwasaki, Kazuhiko;Asakawa, Takeshi
    • Journal of Information Processing Systems / v.4 no.4 / pp.121-132 / 2008
  • VLSI chips have been tested using various automatic test equipment (ATE). Although each ATE has a similar structure, the language for each ATE is proprietary, and it is not easy to convert a test program for use across different ATE vendors. To address this difficulty, we propose a tester structure expression language, a tester language with a novel format. The developed language is called the general tester language (GTL). By developing an interpreter for each tester, a GTL program can be applied directly to the ATE without conversion. It is also possible to select a cost-effective ATE from the test program, because the program expresses the required ATE resources, such as pin counts, measurement accuracy, and memory capacity. We describe the prototype environment for the GTL and the tester selection tool. The prototype software is approximately 27,800 steps in size and required 15 man-months to develop. Using the tester selection tool, the number of man-hours required to select an ATE could be reduced to 1/10. A GTL program was successfully executed on an actual ATE.

Design and Implementation of Wired and Wireless Markup Language Content Conversion Module (무선인터넷 서비스를 위한 유무선 마크업 언어간의 컨텐츠 변환 모듈 설계 및 구현)

  • Kim, Eun-Soo;Kim, Seok-Hun;Yun, Seong-Il
    • Journal of the Korea Society of Computer and Information / v.9 no.4 s.32 / pp.149-155 / 2004
  • The wireless Internet services of the five domestic mobile carriers are currently limited to services built on specific markup languages, platforms, and content. Research and development on integrating content across different wireless Internet platforms is therefore necessary. For ease of maintenance, this paper designs a wireless Internet content converter that integrates wired and wireless platforms by analyzing HTML documents and automatically converting them to WML and C-HTML.

Fundamental Examination and Renaming of the Terminology of the Buddhist Pagoda -Based upon Conversion from Indian Stupa into Korean Pagoda- (탑 용어에 대한 근본 고찰 및 제안 -인도 스투파에서부터 한국 석탑으로의 변환을 바탕으로-)

  • Lee, Hee-Bong
    • Journal of architectural history / v.19 no.4 / pp.55-70 / 2010
  • Although scholarly terminology should carry clear meanings, Korean pagoda terminology has become jargon that communicates meanings far from those originally intended; it is sometimes written in a dead language, that is, old Chinese characters or Japanese-style Chinese characters. Nobody has questioned the terminology itself, which has been in common use for a century, since the period of Japanese rule. One of the main reasons for this error is that Indian Buddhist scriptures in Sanskrit have been translated into Chinese since the 3rd century A.D. with only a vague understanding of the form and meaning of the stupa. In contrast, the English-language terminology established by Indology scholars since the beginning of the 20th century uses easier language with clearer meanings. This paper examines misunderstandings and mistranslations of the original Indian stupa terms and suggests new terminology in current, easier language.