• Title/Summary/Keyword: Web Parsing

Design and Implementation of a new XML-Signcryption scheme to protect the XML document (XML 문서 보안을 위한 새로운 XML-Signcryption scheme 설계 및 구현)

  • Han, Myung-Jin;Lee, Young-Kyung;Shin, Jung-Hwa;Rhee, Kyung-Hyung
    • The KIPS Transactions:PartC / v.10C no.4 / pp.405-412 / 2003
  • With XML approved as a standard language by the UN, work on XML security has progressed rapidly. In this paper, we design and implement "XML-Signcryption", a security mechanism for protecting XML documents that operates across different platforms. Under the W3C standard specifications, signature and encryption are performed as separate operations. In general, signature followed by encryption requires four modular exponentiations, whereas signcryption requires only three, which lowers the overall cost of the system. The scheme is also more convenient for the user, because signature and encryption are expressed in a single XML format, and because the document contains fewer tags, parsing time is reduced. In addition, based on a study of Web Services security, we apply XML-Signcryption to SOAP messages to provide security services. The XML-Signcryption scheme provides confidentiality, integrity, authentication, and non-repudiation simultaneously for XML documents and for Web Services.

Improving a Korean Spell/Grammar Checker for the Web-Based Language Learning System (웹기반 언어 학습시스템을 위한 한국어 철자/문법 검사기의 성능 향상)

  • 남현숙;김광영;권혁철
    • Korean Journal of Cognitive Science / v.12 no.3 / pp.1-18 / 2001
  • The goal of this paper is the pedagogical application of a Korean Spell/Grammar Checker to a web-based language learning system for Korean writing. To make instruction in our learning system, 'Urimal Baeumteo', as efficient as possible, we have to improve our Korean Spell/Grammar Checker. Today an NLP system's performance depends on its semantic processing capability. In our Korean Spell/Grammar Checker, the tasks accomplished at the semantic level are the detection and correction of misused derived and compound nouns in the spell-checking component, and the detection and correction of syntactic and semantic errors in the grammar-checking component. We describe a common approach to partial parsing that uses collocation rules based on dependency grammar. To provide more detailed semantic rules, we classified nouns according to their concepts and subcategorized verbs according to their syntactic and semantic features. Improving the Korean Spell/Grammar Checker makes our learning system active and intelligent in a web-based environment. We acknowledge the flaws in our system: classifying nouns by their meanings and concepts is a time-consuming task, and the unit of analysis in this study is principally limited to the phrases in a sentence, so the accurate parsing of embedded sentences remains a difficult problem. For a web-based language learning system, it is also critically important to consider the interface design and the structure of the contents.

Web Data Collection and Utilization using Content Syndication (콘텐츠 신디케이션을 이용한 웹 데이터 수집 및 활용)

  • Hwang, Sanghyun;Kim, Heewan
    • Journal of Service Research and Studies / v.5 no.2 / pp.83-92 / 2015
  • A great deal of data exists on the web, but it is not easy to collect the necessary data and process it into content in order to provide services. One reason is that there is no standardized way of providing the data. It is therefore very important for a site to distribute part or all of its content so that other services can use it. Representative syndication formats that make part or all of a site's content available to other services are the XML-based RSS, Atom, and OPML. The link through which content is provided in such a syndication format is called a feed address. With a feed address, data can be collected faster than with conventional HTML parsing, and the data provider has the advantage of being able to supply data to the outside easily. In this study, we implement a data collection system based on feed addresses and propose a method for processing and utilizing the collected data.
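
The contrast with HTML parsing can be illustrated with a minimal sketch of reading an Atom feed. This is not the authors' system; the feed text, the function name `parse_atom`, and the `example.com` links are hypothetical, and the feed is inlined so the sketch runs without network access.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom 1.0 documents.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical feed content; a real collector would download this
# from a site's feed address.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Site</title>
  <entry>
    <title>First post</title>
    <link href="http://example.com/1"/>
    <updated>2015-01-05T09:00:00Z</updated>
  </entry>
  <entry>
    <title>Second post</title>
    <link href="http://example.com/2"/>
    <updated>2015-01-06T09:00:00Z</updated>
  </entry>
</feed>"""


def parse_atom(xml_text):
    """Return one dict per <entry>, holding its title, link, and update time."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        link = entry.find("atom:link", ATOM_NS)
        entries.append({
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "link": link.get("href", "") if link is not None else "",
            "updated": entry.findtext("atom:updated", default="", namespaces=ATOM_NS),
        })
    return entries


if __name__ == "__main__":
    for item in parse_atom(SAMPLE_FEED):
        print(item["updated"], item["title"], item["link"])
```

Because every entry carries the same named fields, the collector needs no site-specific scraping rules, which is the practical advantage over HTML parsing that the abstract describes.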

A Distance Approach for Open Information Extraction Based on Word Vector

  • Liu, Peiqian;Wang, Xiaojie
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.6 / pp.2470-2491 / 2018
  • Web-scale open information extraction (Open IE) plays an important role in NLP tasks such as acquiring common-sense knowledge, learning selectional preferences, and automatic text understanding. A large number of Open IE approaches have been proposed in the last decade, and the majority of them are based on supervised learning or dependency parsing. In this paper, we present a novel method for web-scale open information extraction that uses cosine distance computed on Google word vectors as the confidence score of an extraction. The proposed method is a purely unsupervised learning algorithm that requires neither hand-labeled training data nor dependency parse features. We also present a mathematically rigorous proof for the new method using Bayesian inference and artificial neural network theory. It turns out that the proposed algorithm is equivalent to maximum likelihood estimation of the joint probability distribution over the elements of the candidate extraction. The proof itself also suggests, on theoretical grounds, a typical usage of word vectors for other NLP tasks. Experiments on three benchmark datasets show that the distance-based method improves on recently presented Open IE systems in both effectiveness and efficiency.
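
As a rough illustration of the scoring idea, the sketch below computes cosine distance between word vectors and ranks candidate word pairs by it. The abstract does not say how the distance is combined over the elements of an extraction, so the pairwise ranking here is an assumption, and the three-dimensional vectors are made-up stand-ins for real pretrained word vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """1 - cosine similarity; smaller means the words are closer."""
    return 1.0 - cosine_similarity(u, v)


# Toy vectors (hypothetical values, not taken from any real model).
VECTORS = {
    "paris": [0.9, 0.1, 0.3],
    "france": [0.8, 0.2, 0.4],
    "banana": [0.1, 0.9, 0.2],
}


def rank_pairs(pairs, vectors):
    """Sort candidate word pairs so the smallest distance comes first."""
    return sorted(pairs, key=lambda p: cosine_distance(vectors[p[0]], vectors[p[1]]))


if __name__ == "__main__":
    candidates = [("paris", "banana"), ("paris", "france")]
    for left, right in rank_pairs(candidates, VECTORS):
        print(left, right, round(cosine_distance(VECTORS[left], VECTORS[right]), 3))
```

No labeled data is involved at any point, which matches the unsupervised character of the method as the abstract describes it.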

System Design for Collecting Real-Time Product Information Using RSS (RSS를 이용한 실시간 상품정보 수집시스템의 설계)

  • Chuluun, Munkhzaya;Ko, Sun-Woo
    • Journal of Korean Society of Industrial and Systems Engineering / v.35 no.1 / pp.1-9 / 2012
  • It is well known that internet shoppers are very sensitive to sale prices. They visit various shopping malls and collect product information, including purchase conditions, to make purchase decisions. The need for information support has recently grown, because both the amount of information required and the complexity of the purchase decision-making process have increased. Comparison shopping agent systems satisfy shoppers' demand for information by providing price comparisons collected from various shopping malls. However, frequent price changes caused by keen price competition have become the primary reason for declining information quality on price comparison sites. RSS, a family of web feed formats used to publish frequently updated content, is now used even by online shopping malls. This paper develops an RSS product information collection system for obtaining real-time product information. The proposed system consists of (1) a web crawler module that automatically searches for shopping malls offering RSS feeds, (2) an RSS reader module that parses product information from RSS feed files, (3) a product DB, and (4) a product search module. When performance is defined as the volume of product information collected per unit time, the proposed system outperforms comparison shopping agent systems.
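
Modules (2) to (4) listed above can be sketched end to end in a few lines. This is an illustrative outline only, not the paper's implementation: the crawler module is omitted, the feed is an inlined hypothetical sample, and the table layout and the function names `read_products`, `store_products`, and `search_products` are assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed from a shopping mall, inlined so the
# sketch runs offline.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Hypothetical Mall</title>
    <item>
      <title>USB cable</title>
      <link>http://mall.example/p/1</link>
      <description>1 m, black</description>
    </item>
    <item>
      <title>Wireless mouse</title>
      <link>http://mall.example/p/2</link>
      <description>2.4 GHz receiver included</description>
    </item>
  </channel>
</rss>"""


def read_products(feed_xml):
    """RSS reader module: extract (title, link, description) per <item>."""
    root = ET.fromstring(feed_xml)
    return [
        (
            item.findtext("title", default=""),
            item.findtext("link", default=""),
            item.findtext("description", default=""),
        )
        for item in root.iter("item")
    ]


def store_products(conn, products):
    """Product DB: upsert rows keyed by link so re-reading a feed refreshes them."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS product "
        "(title TEXT, link TEXT PRIMARY KEY, description TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO product VALUES (?, ?, ?)", products)
    conn.commit()


def search_products(conn, keyword):
    """Product search module: substring match on the product title."""
    cursor = conn.execute(
        "SELECT title, link FROM product WHERE title LIKE ? ORDER BY title",
        (f"%{keyword}%",),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    store_products(connection, read_products(SAMPLE_FEED))
    print(search_products(connection, "cable"))
```

Keying the table on the item link means a repeated read of the same feed overwrites stale rows rather than duplicating them, which suits frequently changing product data.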

A data management system for microbial genome projects

  • Ki-Bong Kim;Hyeweon Nam;Hwajung Seo and Kiejung Park
    • Proceedings of the Korean Society for Bioinformatics Conference / 2000.11a / pp.83-85 / 2000
  • Many microbial genome sequencing projects have been under way in genome centers around the world since the first genome, Haemophilus influenzae, was sequenced in 1995. The deluge of microbial genome sequence data demands a new, highly automated data flow system that lets genome researchers manage and analyze their own bulky sequence data from the low level to the high level. To this end, we developed an automatic data management system for microbial genome projects, which consists mainly of a local database, analysis programs, and a user-friendly interface. We designed and implemented the local database for large-scale sequencing projects; it makes systematic and consistent data management and retrieval possible and is tightly coupled with the analysis programs and the web-based user interface. That is, the results of the analysis programs can be parsed and stored in the local database, and the user can retrieve data at any stage of processing through the web-based graphical user interface. The analysis programs in our system cover contig assembly, homology search, and ORF prediction, which are essential in genome projects. All but the contig assembly program are in the public domain. These programs are connected to one another by a number of utility programs. As a result, this system will maximize cost and time efficiency in genome research.

Phrase-based Indexing for Korean Information Retrieval System (한국어 정보검색 시스템을 위한 구 단위 색인)

  • 윤성희
    • Journal of the Korea Academia-Industrial cooperation Society / v.5 no.1 / pp.44-48 / 2004
  • This paper proposes an indexing system based on the phrase, a syntactic unit larger than a single keyword. Early information retrieval systems, whose indexing matches single keywords, are simple and popular. With single-keyword matching, however, it is very hard to represent the exact meaning of documents, and the set of retrieved documents is very large, so such systems cannot satisfy their users. Because Web documents include many syntactic errors, high-quality natural language parsing cannot be expected on the Web. Partial trees from fully bottom-up parsing, even when they do not form a full tree, are still useful for extracting phrases, and phrases are much more discriminative as index terms than single keywords. Phrase-based indexing also helps the information retrieval system improve its efficiency and reduce processing overhead.
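
The difference in discriminative power can be seen in a small inverted-index sketch. Note the substitution: the paper extracts phrases from partial parse trees, whereas this sketch simply uses adjacent word pairs as a stand-in for parsed phrases. The documents and the function name `build_index` are hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_index(docs, use_phrases):
    """Map each index term to the set of document ids that contain it.

    With use_phrases=True the terms are adjacent word pairs, a crude
    stand-in for the parser-derived phrases described in the abstract.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        if use_phrases:
            terms = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
        else:
            terms = tokens
        for term in terms:
            index[term].add(doc_id)
    return index


# Hypothetical three-document collection.
DOCS = {
    1: "information retrieval system for web documents",
    2: "retrieval of lost information",
    3: "web information system",
}

if __name__ == "__main__":
    keyword_index = build_index(DOCS, use_phrases=False)
    phrase_index = build_index(DOCS, use_phrases=True)
    # The single keyword matches every document; the phrase matches one.
    print(sorted(keyword_index["information"]))
    print(sorted(phrase_index["information retrieval"]))
```

The keyword "information" retrieves all three documents, while the phrase "information retrieval" retrieves only the first, which is the narrowing effect the abstract attributes to phrase-level index terms.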

ICU Real-Time Sign Information Transmission System using TMO in Distributed Network Systems (분산 네트워크 시스템에서 TMO를 이용한 ICU 실시간 생체정보 전송 시스템)

  • Oh, Seung-Jae
    • The Journal of the Korea institute of electronic communication sciences / v.4 no.3 / pp.230-235 / 2009
  • A TMO may contain two types of methods: time-triggered methods (also called spontaneous methods, or SpMs), which are clearly separated from conventional service methods (SvMs). SpM executions are triggered when real time reaches values specified at design time, whereas SvM executions are triggered by service request messages from clients. In this paper, we describe the application environment as a patient monitoring telemedicine system with a TMO structure. The vital sign information web viewer system also uses the standard protocol for medical images and their transfer. We design the system to obtain useful vital sign information, which is supplied by the central monitor of the ICU and generated in the data-parsing receiver module of the HIS with a TMO structure. To embrace new technologies such as telemedicine services, it is important to develop a standard protocol between the different systems in a hospital, as well as for communication with external hospital systems.

Customized Search System using Real-time Contexts of User (사용자의 실시간 상황정보를 이용한 사용자 맞춤 검색 시스템)

  • Kwon, Mi-Rim;Hong, Kwang-Jin;Jung, Kee-Chul
    • Journal of Korea Society of Industrial Information Systems / v.21 no.5 / pp.19-30 / 2016
  • These days, people easily get information from the internet. However, there is so much information that searching for data becomes disrupted and inefficient. We therefore need a user-customized web search system that provides appropriate information. In this paper, we propose a search system that semi-automatically collects user context such as weather, location, and time, and provides essential information to users. Using these context data, the proposed system can understand what information users want in specific situations and can provide more useful information than existing systems. The proposed system is based on the 'Production/Sharing Service of Personal Korean Contents with Voluntary Sharing Economy System', to which we add a data parsing algorithm in each of the input, storage, and search parts. In the experiments, we compare and analyze the results of an existing system and the proposed system using some general keywords.

Design and Implementation of USN Middleware using DTD Generation Technique (DTD 자동 생성 기법을 이용한 USN 미들웨어 설계 및 구현)

  • Nam, Si-Byung;Kwon, Ki-Hyeon;Yu, Myung-Han
    • Journal of the Korea Society of Computer and Information / v.17 no.3 / pp.41-50 / 2012
  • Monitoring systems based on web service applications face problems such as having to rewrite code, poor scalability, and difficult error recovery, all of which stem from frequent changes in data structure. We therefore propose a monitoring system technique based on automatic DTD (Document Type Definition) generation. The technique uses a dynamic server-side script to cope with changes in the sensor data structure and generates the DTD dynamically. It also adopts AJAX (Asynchronous JavaScript and XML) for XML data parsing, so it can support mass data transmission and exception handling for data loss and damage. Compared with the conventional XML method, the technique reduces recovery time by about 44.8 ms in the case of a temporary data failure.
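
One way to picture automatic DTD generation is to infer element declarations from a sample of the sensor XML. The sketch below is a simplified assumption of how that could look, written in Python instead of the server-side script the paper mentions. It ignores attributes and mixed content, and it emits a permissive `(a | b)*` content model. The sample document and the function name `generate_dtd` are hypothetical.

```python
import xml.etree.ElementTree as ET


def generate_dtd(xml_text):
    """Infer <!ELEMENT ...> declarations from one sample XML document.

    Elements with children get a permissive choice model over every
    child name seen; leaf elements are declared as (#PCDATA).
    """
    root = ET.fromstring(xml_text)
    children = {}  # element name -> distinct child names, in first-seen order

    def visit(elem):
        names = children.setdefault(elem.tag, [])
        for child in elem:
            if child.tag not in names:
                names.append(child.tag)
            visit(child)

    visit(root)

    lines = []
    for name, kids in children.items():
        model = "(" + " | ".join(kids) + ")*" if kids else "(#PCDATA)"
        lines.append(f"<!ELEMENT {name} {model}>")
    return "\n".join(lines)


# Hypothetical sensor payload in which the second reading carries a
# field the first one lacks, mimicking a change in data structure.
SAMPLE = """<sensors>
  <sensor><id>n1</id><temperature>21.5</temperature></sensor>
  <sensor><id>n2</id><humidity>40</humidity></sensor>
</sensors>"""

if __name__ == "__main__":
    print(generate_dtd(SAMPLE))
```

Regenerating the DTD whenever a new field such as `humidity` appears lets the schema follow the data, so no hand-edited definition has to be kept in step with the sensors.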