• Title/Summary/Keyword: Web Documents

A Case Study for Migration from SGML Document to XML Documents (SGML 문서를 XML 문서로 변환하는 사례 연구)

  • Cho, Min-Ho;Ryew, Sung-Yul;Park, Si-Hyoung
    • Journal of KIISE:Computing Practices and Letters / v.7 no.6 / pp.653-660 / 2001
  • Recently, the Internet-based information environment has spread beyond simple information provision into core business areas. In particular, with the spread of WWW technology, markup-language-based technology is emerging as an important part of Internet-based business. However, data written in SGML can be viewed only with an SGML browser, which causes problems for providing information over the Internet and for data compatibility between data sources. This study therefore proposes an essential architecture and technique for migrating from an SGML environment to an XML environment. As the migration test target, we use 600 MB of SGML data selected from a 3Tera SGML database. After migration, data display time is reduced and Internet-based mobile computing becomes possible. The same technique and ideas can be applied to larger SGML environments without change, so the study should help readers interested in migrating SGML documents to XML documents.

An Efficient Damage Information Extraction from Government Disaster Reports

  • Shin, Sungho;Hong, Seungkyun;Song, Sa-Kwang
    • Journal of Internet Computing and Services / v.18 no.6 / pp.55-63 / 2017
  • One of the purposes of Information Technology (IT) is to support the human response to natural and social problems, such as natural disasters and the spread of disease, and to improve the quality of human life. Climate change is occurring worldwide, natural disasters threaten the quality of life, and human safety is no longer guaranteed. IT must be able to support tasks related to disaster response and, more importantly, should be used to predict and minimize future damage. In South Korea, damage data is collected by each local government and then aggregated by the federal government. This data is included in the disaster reports that the federal government discloses for each disaster, but the raw damage data is difficult to obtain even for research purposes. To obtain the data, information extraction can be applied to the disaster reports. In the field of information extraction, most extraction targets are web documents, commercial reports, SNS text, and so on, and there is little research on information extraction from government disaster reports. These reports are mostly text, but their sentence structure differs considerably from that of news articles and commercial reports, so the features of government disaster reports must be considered carefully. This paper presents an information extraction method for South Korean government reports in the word format. The method is based on patterns and dictionaries and adds some ideas for tokenizing the damage expressions in the text. The experiment yields an F1 score of 80.2 on the test set, which is close to the best information extraction performance achieved before recent deep learning algorithms were applied.
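
The pattern-and-dictionary idea in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `DAMAGE_ITEMS` dictionary, the regular expression, and the English sample sentence are hypothetical and are not taken from the paper, whose actual patterns target Korean government reports.

```python
import re

# Hypothetical dictionary mapping surface forms to normalized damage items.
DAMAGE_ITEMS = {"houses": "house", "roads": "road", "bridges": "bridge"}

# Hypothetical pattern: a number (commas allowed) followed by a dictionary term.
PATTERN = re.compile(
    r"(\d[\d,]*)\s+(" + "|".join(map(re.escape, DAMAGE_ITEMS)) + r")\b"
)

def extract_damage(text):
    """Return (normalized item, count) pairs found in the text."""
    return [
        (DAMAGE_ITEMS[item], int(number.replace(",", "")))
        for number, item in PATTERN.findall(text)
    ]
```

For example, `extract_damage("Flooding damaged 1,250 houses and 12 roads; 3 bridges collapsed.")` yields one pair per matched damage expression, with the comma-separated number parsed as an integer.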

Target Word Selection for English-Korean Machine Translation System using Multiple Knowledge (다양한 지식을 사용한 영한 기계번역에서의 대역어 선택)

  • Lee, Ki-Young;Kim, Han-Woo
    • Journal of the Korea Society of Computer and Information / v.11 no.5 s.43 / pp.75-86 / 2006
  • Target word selection is one of the most important and difficult tasks in English-Korean machine translation, and it affects the translation accuracy of machine translation systems. In this paper, we present a new approach to selecting a Korean target word for an English noun with translation ambiguity, using multiple knowledge sources: verb frame patterns, sense vectors based on collocations, statistical Korean local context information, and co-occurring POS information. Verb frame patterns constructed from a dictionary and a corpus play an important role in resolving the sparseness problem of collocation data. A sense vector is the set of collocation data that applies when an English word with target-selection ambiguity is translated into a specific Korean target word. Statistical Korean local context information is N-gram information generated from a Korean corpus. The co-occurring POS information is a statistically significant POS clue that appears with the ambiguous word. The experiment showed promising results for diverse sentences from web documents.

Examining the Knowledge Structure in the Communication Field: Author Cocitation Analysis for the Editorial Board of the Journal of Communication, 2008 and 2011 (Journal of Communication의 편집위원회에 대한 저자동시인용분석을 이용한 언론학 분야의 지적구조와 사회적 배경 분석: 2008년과 2011년 비교)

  • Kim, Hyun-Jung
    • Journal of the Korean BIBLIA Society for library and Information Science / v.23 no.2 / pp.109-132 / 2012
  • This study examines the social network of scholars in the field of communication using author cocitation data. A matrix containing the number of cocited documents between pairs of authors is created for social network analysis of the scholars on the editorial board of the Journal of Communication, and the network map of these scholars is used to visualize the knowledge structure of the field by identifying groups of authors who are more central than others. In addition, the study compares the previous analysis performed in 2008 with the current analysis of the journal's editorial board, which grew from 146 to 254 scholars. Author cocitation data was collected from the Social Science Citation Index (SSCI) through the Web of Science database, and UCInet was used to create and visualize the author cocitation network and to analyze the correlation between the network and the factors that may have affected its structure.
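
The author cocitation matrix described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: the study itself used SSCI data and UCInet, whereas the function names `cocitation_counts` and `to_matrix` and the input format (one set of cited authors per citing document) are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(citing_docs):
    """Count, for each author pair, the documents that cite both authors.

    citing_docs: iterable of collections of author names, one per citing document.
    """
    counts = Counter()
    for cited_authors in citing_docs:
        # Sort so that each unordered pair is always stored as (a, b) with a < b.
        for a, b in combinations(sorted(set(cited_authors)), 2):
            counts[(a, b)] += 1
    return counts

def to_matrix(counts, authors):
    """Build a symmetric cocitation matrix in the given author order."""
    index = {author: i for i, author in enumerate(authors)}
    matrix = [[0] * len(authors) for _ in authors]
    for (a, b), n in counts.items():
        if a in index and b in index:
            matrix[index[a]][index[b]] = n
            matrix[index[b]][index[a]] = n
    return matrix
```

A matrix of this shape is the usual input to network analysis tools, which then compute centrality and draw the network map.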

A Review of Influencing Aronia Intake on Human Body in Korea (국내 아로니아 습취가 인체에 미치는 영향에 관한 문헌분석)

  • Nam, Soo-Tai;Yu, Ok-Kyeong;Jin, Chan-Yong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.05a / pp.149-152 / 2017
  • Big data analysis is an effective technique for analyzing unstructured data, such as Internet content, social network services, web documents generated in mobile environments, e-mail, and social data, as well as formal data that is well organized in databases. Meta-analysis is a statistical integration method that provides an overview of the overall result by integrating and analyzing many quantitative research results. Today, interest in leading a younger and healthier life is increasing regardless of gender and age, and a variety of functional health foods have been developed in response to this change. Aronia melanocarpa, called black chokeberry, is the fruit of a berry plant belonging to the Rosaceae family that originally grew in North America. This study reviews research on factors related to the quality characteristics and antioxidant activities of Aronia extracts, targeting only total sugar, acidity, polyphenol, anthocyanin, and antioxidant activity. We then present the theoretical and practical implications of the results.

Design and Implementation of XML Based Relational Database Metadata Repository (XML을 기반으로 한 관계형 데이터베이스 메타데이터 리파지토리 설계 및 구현)

  • Gwon, Eun-Jeong;Yong, Hwan-Seung
    • The KIPS Transactions:PartD / v.9D no.1 / pp.1-10 / 2002
  • Metadata is data about data that is used to manage the data itself. As DBMS-based applications increase, metadata models and metadata interchange models have been proposed to manage metadata in DBMSs. However, metadata in the form of XML (eXtensible Markup Language) documents is generally stored in an RDBMS. In this paper, as a method for storing RDBMS metadata in an OODBMS, we design a metadata model and a metadata interchange model and implement a new repository system. The RDBMS metadata is translated into XML documents and integrated into eXcelon, an XML data server on an OODBMS, and metadata information about the RDBMS is retrieved with XQL (XML Query Language), so metadata can be searched and edited. The metadata of XML documents stored in eXcelon can easily be rendered in a web browser by applying XSL (eXtensible Stylesheet Language), giving detailed information about the properties of metadata in the DBMS.
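
The step of translating relational metadata into an XML document can be illustrated with a small sketch. It deliberately swaps in SQLite and Python's `xml.etree.ElementTree` for the eXcelon server and XQL used in the paper, and the element and attribute names (`table`, `column`, `nullable`, `primaryKey`) are hypothetical, not the paper's metadata model.

```python
import sqlite3
import xml.etree.ElementTree as ET

def table_metadata_to_xml(conn, table):
    """Serialize the column metadata of one SQLite table as an XML string."""
    root = ET.Element("table", name=table)
    quoted = table.replace('"', '""')  # escape quotes in the identifier
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f'PRAGMA table_info("{quoted}")')
    for _cid, name, col_type, notnull, _default, pk in rows:
        ET.SubElement(
            root,
            "column",
            name=name,
            type=col_type,
            nullable=str(not notnull).lower(),
            primaryKey=str(bool(pk)).lower(),
        )
    return ET.tostring(root, encoding="unicode")
```

The resulting XML string could then be stored in a repository or styled for display in a browser, which is the role XSL plays in the described system.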

Design and Implementation of RFID based Tree History Information System for Cultural Heritage Restoration (RFID 기반 문화재 복원용 임목 이력 정보 시스템의 설계 및 구현)

  • Kim, Sam-Geun;Moon, Il-Hwan;Park, Jae-Pyo
    • The Journal of Korean Institute of Communications and Information Sciences / v.35 no.9B / pp.1360-1368 / 2010
  • Recently, as the development of Radio Frequency Identification (RFID) technology has become active, demand has increased for services that can electronically manage the history and location information of major trees, including trees for cultural heritage restoration and nurse trees. This information has been managed through separate drawings and documents, or by storing it in PDAs and then structuring data files through input and computation. However, these methods are limited in extensibility and scalability. This paper designs and implements an RFID based Tree History Information System (THIS) for cultural heritage restoration. The proposed system aims to support effective and consistent management of the historical information of major trees and to improve working processes by implementing mobile RFID services over wireless Internet or a Local Area Network (LAN) as the mobile communication network. The implementation confirms that the proposed system manages the historical information of major trees more effectively than conventional methods and also improves field working conditions.

k-Bitmap Clustering Method for XML Data based on Relational DBMS (관계형 DBMS 기반의 XML 데이터를 위한 k-비트맵 클러스터링 기법)

  • Lee, Bum-Suk;Hwang, Byung-Yeon
    • The KIPS Transactions:PartD / v.16D no.6 / pp.845-850 / 2009
  • Use of XML data has increased with the growth of the Web 2.0 environment. The advantages of XML are recognized through its use as the base technology of RSS and ATOM for transferring information from blogs and news feeds. Bitmap clustering is a method that keeps an index in main memory on top of a relational DBMS, and it performed better than other XML indexing methods in evaluation. However, the existing method generates too many clusters, which degrades search quality. This paper proposes a k-Bitmap clustering method that generates a user-defined number k of clusters to solve this problem. The proposed method also keeps an additional inverted index for searching terms that are excluded from the representative bits of the k-Bitmap. Our evaluation shows that users can control the number of clusters. The method also has high recall in single-term search, and by keeping two indices it guarantees that the search result includes all documents related to the query.
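
The two-index idea in the abstract above (a bitmap over representative terms plus an inverted index for the excluded terms) can be sketched as follows. This is not the paper's k-Bitmap clustering algorithm: cluster formation is omitted, and the class name `TwoIndexSearch`, its inputs, and the per-document bitmaps are assumptions made only to show why keeping both indices avoids missing documents in single-term search.

```python
from collections import defaultdict

class TwoIndexSearch:
    """Bitmap index for representative terms, inverted index for the rest."""

    def __init__(self, docs, representative_terms):
        # docs: mapping of document id -> set of terms (hypothetical input format).
        self.bit_of = {term: i for i, term in enumerate(representative_terms)}
        self.bitmaps = {}
        self.inverted = defaultdict(set)
        for doc_id, terms in docs.items():
            bits = 0
            for term in terms:
                if term in self.bit_of:
                    bits |= 1 << self.bit_of[term]   # set the term's bit
                else:
                    self.inverted[term].add(doc_id)  # excluded term
            self.bitmaps[doc_id] = bits

    def search(self, term):
        """Return the ids of all documents containing the term."""
        if term in self.bit_of:
            mask = 1 << self.bit_of[term]
            return {d for d, bits in self.bitmaps.items() if bits & mask}
        return set(self.inverted.get(term, set()))
```

Because every term lands in exactly one of the two indices, a single-term query always reaches every document that contains it, which is the recall property the abstract describes.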

Method Customizing From Web-based English-Korean MT System To English-Korean MT System for Patent Documents (웹 영한 번역기로부터 특허 영한 번역기로의 특화 방법)

  • Choi, Sung-Kwon;Kwon, Oh-Woog;Lee, Ki-Young;Roh, Yoon-Hyung;Park, Sang-Kyu
    • Annual Conference on Human and Language Technology / 2006.10e / pp.57-64 / 2006
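  • This paper describes a method for customizing an English-Korean machine translation system for a general domain such as the web into an English-Korean machine translation system for patents. The customization proceeds through the following steps: 1) analysis of the linguistic characteristics of a large volume of patent documents, 2) extraction of technical terms from a large volume of patent documents and construction of their target words, 3) customization of the target words in the existing translation dictionary, 4) extraction and construction of translation patterns specific to patent documents, 5) customization and improvement of the translation engine modules according to the linguistic analysis, and 6) evaluation of the translation rate achieved with the customized translation knowledge and translation engine modules. The patent English-Korean machine translation system built through these steps achieved an average translation rate of 81.03% across all fields in an evaluation by professional patent translators. By field, it achieved 80.54% in machinery, 81.58% in electrical and electronics, 79.92% in general chemistry, 80.79% in medical and hygiene, and 82.29% in computers, and it continues to be improved. The system described in this paper is currently used at the Patent Support Center of the Ministry of Commerce, Industry and Energy to provide Korean translation when patent attorneys and patent examiners search English patent documents in the electrical and electronics field (http://www.ipac.or.kr), and an English-Korean machine translation service for patent documents in all fields is planned for 2007.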

Problem and Improvement of the Computerization Practice Real - estate Registration

  • Youn, Sung-Ho;Kim, Moon-sung
    • Journal of the Korea Society of Computer and Information / v.21 no.1 / pp.161-168 / 2016
  • With the progress of the IT industry, new Internet-based information media have come to mediate between individuals in real estate marketing, leasing, auctions, and similar transactions. Real estate registration services should disclose the details recorded in the public register, and the problems this causes should be reasonably resolved by handling the large volume of registration work quickly and accurately, using Information Technology (IT) through an information system built to process register entries. Computerization of real estate registration raises the efficiency of registration, lets people view real estate information based on the published content without limits of time and place, and pursues the ideals and reliability of registration through improved web accessibility of the Internet Registry and permanent electronic storage of preserved documents. The impact on real estate transactions is very large, because registration processed through the Internet under a formalist approach to legal registration is more likely to result in incorrect registration than a written application. Effective management is also necessary: because real estate business has become more active and registration services have increased sharply, failing to respond quickly and accurately leads to problems such as delays or poor registration service. In this paper, we identify the problems of real estate registration work and suggest improvements.