• Title/Summary/Keyword: Language Conversion

XML-based EDI Document Processing System with Binary Format Mapping Rules

  • Kim, Chang-Su;Jung, Hoe-Kyung
    • Journal of information and communication convergence engineering
    • /
    • v.10 no.3
    • /
    • pp.258-263
    • /
    • 2012
  • Recently, the volume of electronic data interchange (EDI) document processing for port logistics has increased abruptly. The existing system processes EDI documents in a script mode, but because of its complicated script preparation procedure and low document processing efficiency, it cannot meet demand as document traffic increases. In this paper, an EDI electronic document processing system was designed and implemented using a document scanner and mapper, which are binary-form electronic document processing tools that do not require script files during the conversion of extensible markup language (XML)-based electronic documents. The new system retains the merits of XML for reading and writing while offering improved speed, ease of use, and good portability across systems compared to conventional ones.

A Conversion System of HTML Document into OWL Ontology language (OWL 온톨로지 언어로의 HTML문서 변환 시스템)

  • Kwak Hyoun-Soo;Kim Su-Kyoung;Kim Yeong-Geun;Ahn Kee-Hong
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2004.11a
    • /
    • pp.539-542
    • /
    • 2004
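  • The current text-centered web has been used mainly with visual presentation in mind, so it poses many problems for users who want to extract information efficiently. The Semantic Web was therefore proposed, in which semantic information is attached to web documents through metadata so that computers and people can communicate. Moving toward a meaning-centered Semantic Web requires implementing ontologies, and this paper aims to implement a system that automatically converts HTML, the language used on the current web, into OWL, one of the ontology languages, without re-entering the content. Compared with the current web, the advantages of using an ontology are that the meaning and structure of a document can be captured so that machines can automatically infer information according to meaning, and that interoperability between heterogeneous systems is ensured. In addition, many documents on the current web are written with the same content; sharing and reusing the constructed ontologies can reduce the associated time and cost.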

A Novel Text to Image Conversion Method Using Word2Vec and Generative Adversarial Networks

  • LIU, XINRUI;Joe, Inwhee
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2019.05a
    • /
    • pp.401-403
    • /
    • 2019
  • In this paper, we propose a generative adversarial network (GAN) based text-to-image generation method. In many natural language processing tasks, word representations are determined by their term frequency-inverse document frequency scores. Word2Vec is a type of neural network model that, given an unlabeled corpus, produces vectors expressing the semantics of the words in the corpus; an image is then generated by GAN training according to the obtained vector. Thanks to this understanding of the words, we can generate higher-quality and more realistic images. Our GAN structure is based on deep convolutional neural networks and pixel recurrent neural networks. Comparing the generated images with the real images, we obtain about 88% similarity on the Oxford-102 flowers dataset.

Semi-Automatic Ontology Construction from HTML Documents: A conversion of Text-formed Information into OWL 2

  • Im, Chan jong;Kim, Do wan
    • International Journal of Contents
    • /
    • v.12 no.2
    • /
    • pp.24-30
    • /
    • 2016
  • Ontology is known to be one of the most important technologies for achieving the Semantic Web. It is critical because it represents knowledge in a machine-readable state. The World Wide Web Consortium (W3C) has been contributing to the development of ontology for the last several years. However, the W3C recommendation left out HTML despite the massive amount of information it contains. It is also difficult and time-consuming to keep up with all the technologies, especially when constructing an ontology. Thus, we propose a module and methods that reuse HTML documents, extract the necessary information from HTML tags, and map it to OWL 2. We combine two approaches: structural refinement for building an ontology skeleton, and a linguistic approach for adding detailed information onto the skeleton.

Text Classification on Social Network Platforms Based on Deep Learning Models

  • YA, Chen;Tan, Juan;Hoekyung, Jung
    • Journal of information and communication convergence engineering
    • /
    • v.21 no.1
    • /
    • pp.9-16
    • /
    • 2023
  • The natural language on social network platforms has a certain front-to-back dependency in structure, and the direct conversion of Chinese text into a vector makes the dimensionality very high, thereby resulting in the low accuracy of existing text classification methods. To this end, this study establishes a deep learning model that combines a big data ultra-deep convolutional neural network (UDCNN) and long short-term memory network (LSTM). The deep structure of UDCNN is used to extract the features of text vector classification. The LSTM stores historical information to extract the context dependency of long texts, and word embedding is introduced to convert the text into low-dimensional vectors. Experiments are conducted on the social network platforms Sogou corpus and the University HowNet Chinese corpus. The research results show that compared with CNN + rand, LSTM, and other models, the neural network deep learning hybrid model can effectively improve the accuracy of text classification.

A Study on Considerations in the Authority Control to Accommodate LRM Nomen (LRM 노멘을 수용하기 위한 전거제어시 고려사항에 관한 연구)

  • Lee, Mihwa
    • Journal of Korean Library and Information Science Society
    • /
    • v.52 no.1
    • /
    • pp.109-128
    • /
    • 2021
  • This paper explores considerations in authority control to accommodate LRM nomen entities through a literature review, an analysis of RDA rules, and an opinion survey of domestic catalog experts. As a result, considerations for authority control were proposed with respect to the nomen's attribute elements, catalog description, and the MARC authority format. First, it is necessary to describe in as much detail as possible the category, the scheme, the intended audience, the context of use, the reference source, the language, the script, and the script conversion as attributes of the nomen, with the status of identification, note, and undifferentiated name indicators added in RDA. Second, the description method for the attribute elements and relational elements of the nomen can be unstructured, structured, identifier, or IRI, as suggested in RDA, and the vocabulary encoding scheme (VES) and string encoding scheme (SES) should be recorded for structured descriptions. Also, cataloging rules for structuring authorized access points and preferred names/titles should be established. Third, an additional expansion plan based on Maxwell's expansion (draft) was proposed in order to prepare the MARC 21 authority format to reflect the LRM nomen. (1) The attribute must be described in 4XX and 5XX so that the attribute can be entered for each nomen, and the attributes of the nomen to be described in 1XX, 5XX and 4XX are presented separately. (2) In order to describe the nomen category, language, script, script conversion, context of use, and date of usage as nomen attributes, fields and subfields must be added to MARC 21. Accordingly, it was proposed to expand the subfields of 368, 381, and 377, and to add fields to describe the context of use and date of usage. The considerations in authority control for the LRM nomen proposed in this paper will be the basis for establishing an authority control plan that reflects LRM in Korea.

The Conversion Scheme of GML Document into Spatial Database using the Directed Schema Graph Mapping Rules (방향성 스키마 그래프 매핑 규칙을 이용한 GML 문서의 공간 데이터베이스 변환 기법)

  • Chung, Warn-Ill;Park, Soon-Young;Bae, Hae-Young
    • Journal of Korea Spatial Information System Society
    • /
    • v.7 no.1 s.13
    • /
    • pp.39-52
    • /
    • 2005
  • GML (Geography Markup Language) has become the widely adopted standard for the transport and storage of geographic information. Accordingly, topics such as modeling, storage, and querying have been studied to provide interoperability of geographic information in web environments. In particular, there is an increasing need to store semi-structured data such as GML documents efficiently. Therefore, in this paper, we design and implement a GML repository that stores GML documents on the basis of the GML schema using a spatial database system. The GML schema is converted into a directed GML schema graph, and a schema mapping technique from the directed schema graph to a spatial schema is presented. We also define conversion rules on the spatial schema to preserve the constraints of the GML schema. A GML repository using a spatial database system is useful for providing interoperability of geographic information and for storing and managing very large numbers of GML documents.

Implementation of Hangul to TeX conversion software (아래아 한글 파일의 텍 파일로의 변환 소프트웨어 구현)

  • Kim, Sung-Won;Lee, Han-Na;Park, Sang-Hoon;Oh, Chang-Hyuck
    • Journal of the Korean Data and Information Science Society
    • /
    • v.21 no.1
    • /
    • pp.99-107
    • /
    • 2010
  • This research implements software that converts Hangul format files to TeX format files. Hangul is a word processor that has been widely used in Korea. Hangul is known to make it relatively easy to type equations and tables when preparing a paper draft. TeX was developed as a computer programming language for preparing and publishing documents. Documents are first typed in a plain text editor with TeX commands and are then compiled and linked. The software implemented in this research converts Hangul format files written under the specific format of a journal to TeX format files using the given style file. It converts special symbols, text, tables, equations, and paragraph formats. We used the Hangul format of the Journal of the Korean Data & Information Science Society (JKDISS) and its TeX style file to beta-test the software.

Service-centric Object Fragmentation Model for Efficient Retrieval and Management of XML Documents (XML 문서의 효율적인 검색과 관리를 위한 SCOF 모델)

  • Jeong, Chang-Hoo
    • Proceedings of the Korea Contents Association Conference
    • /
    • 2007.11a
    • /
    • pp.595-598
    • /
    • 2007
  • The vast amount of XML documents raises interest in how they will be used and how far their usage can be expanded. This paper has two central goals: 1) easy and fast retrieval of XML documents or relevant elements; and 2) efficient and stable management of large XML documents. The keys to developing such a practical system are how to segment a large XML document into smaller fragments and how to store them. To achieve these goals, we designed the SCOF (Service-centric Object Fragmentation) model, a semi-decomposition method based on conversion rules provided by XML database managers. Keyword-based search using the SCOF model then retrieves specific elements or attributes of XML documents, just as a typical XML query language does. Although this approach relies on the judgment of the managers of an XML document collection, the SCOF model makes both retrieval and management of massive XML documents efficient.

Burke-Schumann analysis of silica formation by hydrolysis in an external chemical vapor deposition process (외부 화학증착 공정에서의 가수분해반응으로 인한 실리카 생성에 대한 버크-슈만 해석)

  • Song, Chang-Geol;Hwang, Jeong-Ho
    • Transactions of the Korean Society of Mechanical Engineers B
    • /
    • v.20 no.5
    • /
    • pp.1671-1678
    • /
    • 1996
  • In external chemical vapor deposition processes, including VAD and OVD, the distribution of flame-synthesized silica particles is determined by heat and mass transfer limitations on particle formation. Combustion gas flow velocities are such that the particle diffusion time scale is longer than that of gas flow convection in the zone of particle formation. As a consequence, the particles formed tend to remain along straight, smooth flow streamlines. Silica particles are formed by oxidation and hydrolysis. In hydrolysis, the particles are formed in diffuse bands, and particle formation thus requires the diffusion of SiCl$_4$ toward the CH$_4$/O$_2$ combustion zones to react with H$_2$O diffusing away from these same zones on the torch face. The conversion kinetics of hydrolysis are fast compared to diffusion, and the rate of conversion is thus diffusion-limited. In the language of combustion, the hydrolysis occurs as a Burke-Schumann process. For selected conditions, the reaction zone shape and temperature distributions predicted by the Burke-Schumann analysis are presented and compared with available experimental data. The calculated centerline temperatures inside the reaction zone agree well with the data, but the calculated values outside the reaction zone are slightly higher than the data because the analysis does not consider diffusion in the axial direction or mixing of the combustion products with ambient air. The temperatures along the radial direction agree with the data near the centerline but gradually diverge from the data as the distance from the centerline increases. This is caused by convection in the radial direction, which is not considered in the analysis. The spatial distribution of silica particles is affected by convection and diffusion, resulting in a Gaussian form in the radial direction.