• Title/Summary/Keyword: Text Reuse


A MVC Framework for Visualizing Text Data (텍스트 데이터 시각화를 위한 MVC 프레임워크)

  • Choi, Kwang Sun;Jeong, Kyo Sung;Kim, Soo Dong
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.39-58 / 2014
  • As big data and related technologies grow in importance across industry, visualizing the results of big data processing and analysis has become increasingly important. Visualization helps people understand analysis results effectively and clearly, and it also serves as the GUI (Graphical User Interface) that supports communication between people and analysis systems. To ease development and maintenance, these GUI parts should be loosely coupled from the parts that process and analyze data. Implementing such a loosely coupled architecture calls for design patterns such as MVC (Model-View-Controller), which is designed to minimize coupling between the UI and data processing. Big data can be classified into structured and unstructured data, and structured data is relatively easy to visualize compared with unstructured data. Nevertheless, as the use and analysis of unstructured data have spread, developers usually build a visualization system for each project separately to overcome the limitations of traditional visualization systems for structured data. Text data, which makes up a large part of unstructured data, is even more difficult to visualize. The difficulty stems from the complexity of the technologies used to analyze text, such as linguistic analysis, text mining, and social network analysis, and from the fact that these technologies are not standardized. This makes it harder to reuse the visualization system of one project in another. We assume the reason is that visualization systems lack a common design that takes extension to other systems into account. In this research, we suggest a common information model for visualizing text data and propose a comprehensive and reusable framework, TexVizu, for visualizing text data. First, we survey representative research in the text visualization area. We then identify common elements of text visualization and common patterns among its various cases, and we review and analyze these elements and patterns from three viewpoints: structural, interactive, and semantic. Next, we design an integrated model of text data that represents the elements for visualization. The structural viewpoint identifies structural elements of text documents, such as title, author, and body. The interactive viewpoint identifies the types of relations and interactions between text documents, such as post, comment, and reply. The semantic viewpoint identifies semantic elements that are extracted by linguistic analysis of text data and represented as tags classifying entity types such as people, place or location, time, and event. We then extract and select common requirements for visualizing text data. The requirements fall into four types: structure information, content information, relation information, and trend information. Each type of requirement comprises the required visualization techniques, data, and goal (what to know). These are the common, key requirements for designing a framework that keeps a visualization system loosely coupled from data processing and analysis systems. Finally, we design a common text visualization framework, TexVizu, which is reusable and extensible across visualization projects by collaborating with various Text Data Loaders and Analytical Text Data Visualizers via common interfaces such as ITextDataLoader and IATDProvider. TexVizu comprises the Analytical Text Data Model, Analytical Text Data Storage, and Analytical Text Data Controller. In this framework, external components are the specifications of the interfaces required for collaborating with the framework. As an experiment, we also apply the framework to two text visualization systems: a social opinion mining system and an online news analysis system.
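
The loose coupling this abstract describes can be illustrated with a minimal Python sketch. Only the names ITextDataLoader, IATDProvider, and the Analytical Text Data Model, Storage, and Controller come from the abstract; every method, field, and helper class below (`load`, `documents`, `InMemoryLoader`, `render_titles`) is a hypothetical stand-in, since the abstract does not give the real signatures.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AnalyticalTextData:
    """Model: one document seen from the three viewpoints (fields are assumed)."""
    title: str                      # structural viewpoint
    author: str                     # structural viewpoint
    body: str                       # structural viewpoint
    reply_to: Optional[str] = None  # interactive viewpoint (post/comment/reply)
    tags: Dict[str, List[str]] = field(default_factory=dict)  # semantic viewpoint


class ITextDataLoader(ABC):
    """Interface named in the abstract; the method is a hypothetical guess."""
    @abstractmethod
    def load(self) -> List[AnalyticalTextData]: ...


class IATDProvider(ABC):
    """Interface named in the abstract; the method is a hypothetical guess."""
    @abstractmethod
    def documents(self) -> List[AnalyticalTextData]: ...


class AnalyticalTextDataController(IATDProvider):
    """Controller: pulls data through a loader and serves it to any view."""
    def __init__(self, loader: ITextDataLoader) -> None:
        self._storage = loader.load()  # stands in for Analytical Text Data Storage

    def documents(self) -> List[AnalyticalTextData]:
        return list(self._storage)


class InMemoryLoader(ITextDataLoader):
    """Hypothetical loader used only to exercise the sketch."""
    def __init__(self, docs: List[AnalyticalTextData]) -> None:
        self._docs = docs

    def load(self) -> List[AnalyticalTextData]:
        return list(self._docs)


def render_titles(provider: IATDProvider) -> List[str]:
    """View: depends only on IATDProvider, never on how the data was loaded."""
    return [f"{d.title} ({d.author})" for d in provider.documents()]
```

Because the view only sees `IATDProvider`, a different loader or analysis backend could be swapped in without touching the visualization code, which is the reuse property the abstract argues for.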

WTO, an ontology for wheat traits and phenotypes in scientific publications

  • Nedellec, Claire;Ibanescu, Liliana;Bossy, Robert;Sourdille, Pierre
    • Genomics & Informatics / v.18 no.2 / pp.14.1-14.11 / 2020
  • Phenotyping is a major issue for wheat agriculture to meet the challenges of adaptation of wheat varieties to climate change and chemical input reduction in crop. The need to improve the reuse of observations and experimental data has led to the creation of reference ontologies to standardize descriptions of phenotypes and to facilitate their comparison. The scientific literature is largely under-exploited, although extremely rich in phenotype descriptions associated with cultivars and genetic information. In this paper we propose the Wheat Trait Ontology (WTO) that is suitable for the extraction and management of scientific information from scientific papers, and its combination with data from genomic and experimental databases. We describe the principles of WTO construction and show examples of WTO use for the extraction and management of phenotype descriptions obtained from scientific documents.

Multilingual Automatic Translation Based on UNL: A Case Study for the Vietnamese Language

  • Thuyen, Phan Thi Le;Hung, Vo Trung
    • IEIE Transactions on Smart Processing and Computing / v.5 no.2 / pp.77-84 / 2016
  • In the field of natural language processing, Universal Networking Language (UNL) has been used by various researchers as an inter-lingual approach to automatic machine translation. The UNL system consists of two main components, namely, EnConverter for converting text from a source language to UNL, and DeConverter for converting from UNL to a target language. Currently, many projects are researching how to apply UNL to different languages. In this paper, we introduce the tools that are UNL's applications and discuss how to reuse them to encode a Vietnamese sentence into UNL expressions and decode UNL expressions into a Vietnamese sentence. The testing was done with about 1,000 Vietnamese sentences (a dictionary that includes 4573 entries and 3161 rules). In addition, we compare the proportion of sentences translated based on a direct method (Google Translator) and another one based on UNL.

High Speed Local Text Reuse Detection using IR Approach (정보검색 기법을 이용한 부분 문서 재사용 고속 탐색)

  • Bae, Won-Sik;Jo, Myung-Rae;Cha, Jeong-Won
    • Annual Conference on Human and Language Technology / 2008.10a / pp.63-68 / 2008
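  • The growth of the Internet has led to an explosive increase in the reuse of knowledge. This is desirable in terms of spreading knowledge, but it brings the problem of knowledge being misappropriated. Demand is therefore growing for ways to determine whether all or part of a document has been reused. In this paper, we propose a method that uses information retrieval techniques to detect partial document reuse and plagiarism. For high-speed search over large document collections, the source documents and the target documents are indexed and used for retrieval. The method also offers several processing options suited to the linguistic characteristics of Korean, such as word-order-change comparison, function-word-omission comparison, and gap comparison. Experiments show that it detects document reuse faster and more accurately than existing systems. In particular, comparison time does not increase sharply as the number of compared documents grows, the method is robust for partial document reuse detection, which is known to be a weakness of information retrieval approaches, and reuse detection can be tuned flexibly through the processing options.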
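
The general idea of index-based reuse detection can be sketched in Python as follows. This is not the authors' system: it assumes whitespace tokenization and exact word trigram matching, and it omits the word-order, function-word, and gap handling that the abstract mentions. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def ngrams(text, n=3):
    """Split text on whitespace and return its word n-grams as tuples."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_index(sources, n=3):
    """Inverted index: word n-gram -> set of source document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in sources.items():
        for gram in ngrams(text, n):
            index[gram].add(doc_id)
    return index


def find_reuse(index, target, n=3):
    """Count, per source document, how many distinct target n-grams it shares."""
    hits = Counter()
    for gram in set(ngrams(target, n)):
        for doc_id in index.get(gram, ()):
            hits[doc_id] += 1
    return hits
```

Each target n-gram costs one dictionary lookup regardless of how many source documents are indexed, which is one plausible reading of why comparison time need not grow sharply with the collection size.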


Semi-Automatic Ontology Construction from HTML Documents: A conversion of Text-formed Information into OWL 2

  • Im, Chan jong;Kim, Do wan
    • International Journal of Contents / v.12 no.2 / pp.24-30 / 2016
  • Ontology is known to be one of the most important technologies for achieving the Semantic Web. It is critical because it represents knowledge in a machine-readable form. The World Wide Web Consortium (W3C) has been contributing to the development of ontology for the last several years. However, the W3C recommendation left out HTML despite the massive amount of information it contains. It is also difficult and time-consuming to keep up with all the technologies, especially when constructing an ontology. We therefore propose a module and methods that reuse HTML documents, extract the necessary information from HTML tags, and map it to OWL 2. We combine two approaches: structural refinement, which builds an ontology skeleton, and a linguistic approach, which adds detailed information to the skeleton.

The Analysis of the Level of Technological Maturity for the u-Learning of Public Education by Mobile Phone (휴대폰을 이용한 공교육 u-러닝의 기술 성숙도 분석)

  • Lee, Jae-Won;Na, Eun-Gu;Song, Gil-Ju
    • IE interfaces / v.19 no.4 / pp.306-315 / 2006
  • In this paper we analyze whether the mobile phone, which is widely distributed among the younger generation, can be used as a device for u-learning in Korean public education. For this purpose we examine technical maturity along three axes. First, we examine how readily content developers in public education can author mobile internet-based content such as text and motion pictures. We find that authoring text poses almost no difficulty, whereas authoring motion pictures shows some problems. Second, we examine whether u-learners can easily obtain and use u-content on both mobile phones and PCs. Our analysis shows that downloading motion picture content to mobile phones is very limited. We therefore discuss the usability and problems of various PC Sync tools and propose their standardization. Finally, we discuss the need to introduce a ubiquitous SCORM that would enable the reuse of u-content across the mobile phones of different Korean telcos, and we describe some functionality of both ubiquitous SCORM and u-LMS. This study appears to be among the first to examine the technological maturity needed to introduce u-learning by mobile phone in Korean public education, and it could serve as a reference for studies of u-learning based on wireless telecommunication other than mobile telecommunication.

A Text Reuse Measuring Model Using Circumference Sentence Similarity (주변 문장 유사도를 이용한 문서 재사용 측정 모델)

  • Choi, Sung-Won;Kim, Sang-Bum;Rim, Hae-Chang
    • Annual Conference on Human and Language Technology / 2005.10a / pp.179-183 / 2005
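  • Existing document reuse detection models judge reuse by comparing the words or n-grams within each document or sentence. However, document-level reuse checking is prone to error when only part of another document is reused, because the parts of the document that were not reused lower the reuse measure. Sentence-level checking, by contrast, compares the sentences within the compared documents, so even when only part of a document has been reused the comparison covers the sentences within the reused part, and it works more robustly than document-level checking in such cases. However, because sentence-level comparison operates on sentences that are much shorter than documents, its reliability suffers. To compensate for this weakness, this paper performs sentence-level reuse checking and then uses the reuse checking results of each sentence's neighboring sentences to reduce the errors that occur in sentence-level checking.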
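
One way to picture the neighboring-sentence idea is the Python sketch below. It is an illustration under assumptions, not the paper's model: sentence similarity is taken to be Jaccard word overlap, the neighborhood is one sentence on each side, and the blending `weight` of 0.5 is arbitrary. All names are hypothetical.

```python
def jaccard(a, b):
    """Word-set overlap between two sentences, in [0, 1]."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def sentence_scores(target_sentences, source_sentences):
    """Per target sentence, the best similarity to any source sentence."""
    return [
        max((jaccard(t, s) for s in source_sentences), default=0.0)
        for t in target_sentences
    ]


def smooth_with_neighbors(scores, weight=0.5):
    """Blend each sentence score with the mean score of its adjacent sentences."""
    smoothed = []
    for i, score in enumerate(scores):
        neighbors = scores[max(0, i - 1):i] + scores[i + 1:i + 2]
        # A sentence with no neighbors keeps its own score as context.
        context = sum(neighbors) / len(neighbors) if neighbors else score
        smoothed.append((1 - weight) * score + weight * context)
    return smoothed
```

With this blending, a short sentence that matches a source by chance is pulled down when its neighbors show no reuse, and a lightly edited sentence inside a reused passage is pulled up by its neighbors.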


A Development Plan for Co-creation-based Smart City through the Trend Analysis of Internet of Things (사물인터넷 동향분석을 통한 Co-creation기반 스마트시티 구축 방안)

  • Park, Ju Seop;Hong, Soon-Goo;Kim, Na Rang
    • Journal of Korea Society of Industrial Information Systems / v.21 no.4 / pp.67-78 / 2016
  • Many countries around the world are actively promoting smart city projects to address urban problems such as traffic congestion, housing shortages, and energy scarcity. The development of the Internet of Things (IoT) has made it possible to build sustainable, convenient, and environmentally friendly smart cities through the effective control and reuse of urban resources. The purpose of this study is to analyze the technical trends of the IoT and to present a development plan for the smart city, which is one of its applications. To this end, news articles from the Electronic Times published between 2013 and 2015 were analyzed using text mining, and smart city development cases in other countries were investigated. The analysis revealed close relationships between the smart city and big data, cloud, platforms, and sensors. For a smart city to be developed successfully, first, all interested parties in the city must work together to create new value throughout the entire value chain. Second, they must utilize big data and disclose public data more actively than they do now. As an academic contribution, this study presents a big data analysis method and can stimulate follow-up studies. As a practical contribution, its results provide useful data for smart city policy making by local governments and administrative agencies. The study is limited in how fully it captures overall trends, because only news articles from the Electronic Times were selected to analyze the technical trends of the IoT.

Documentation of the History of Ok-Cheon Catholic Church by standardized 2D CAD and 3D Digital Modeling (표준화된 2D CAD와 3D Digital Modeling을 이용한 옥천천주교회의 연혁 기록)

  • Kim, Myung-Sun;Choi, Soon-Yong
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.1 / pp.523-528 / 2011
  • Ok-Cheon Catholic Church has been altered four times since its first construction in 1955. The first three changes were minor ones to windows, doors, roof finish, and so on, but the last alteration extended its plan from a 一 shape to a long cross shape, changing its size, structure, and form. This history has been recorded not in drawings but only in text, with indistinct features left undocumented. This study produces new 2D CAD files, using layers matched to the changes, and 3D digital models, which contain not only present information but also change information about the church. They are useful data for its effective management, conservation, restoration, or possible reuse.

A Study on the Annotation of Digital Content (디지털 콘텐트의 어노테이션에 관한 연구)

  • Kwak, Seung-Jin
    • Journal of the Korean Society for Library and Information Science / v.40 no.4 / pp.267-286 / 2006
  • In a digital environment where information pours in, tools are needed to access information more effectively and to select it, and metadata and annotation are among the advanced techniques that make up these tools. Annotation additionally records marks that explain or emphasize a specific part of the original text, and it has more varied merits than metadata for the search and use of digital resources. This research aims to suggest ways in which annotation, which has a range of functions in today's digital environment including access to, reuse of, and sharing of information, can be applied to digital content such as web services, digital libraries, and electronic books. As the research method, case studies of annotation systems applied to web services and digital libraries were carried out, and the metadata composition of those systems was analyzed.