• Title/Summary/Keyword: Markup Pattern

Web Information Retrieval Exploiting Markup Pattern (마크업 패턴을 이용한 웹 검색)

  • Kim, Min-Soo;Kim, Min-Koo
    • Journal of KIISE:Computing Practices and Letters / v.13 no.6 / pp.407-411 / 2007
  • Over the years, great attention has been paid to exploiting the inherent semantics of HTML in web document retrieval. Although HTML is mainly presentation oriented, HTML tags implicitly carry useful semantics that can capture the meaning of text. Building on this idea, in this paper we define the 'markup pattern' and try to improve the performance of web document retrieval using markup patterns. A markup pattern reflects both the intent of the web document publisher and the internal semantics of the text in the document. To discover and exploit markup patterns, we suggest a new scheme for extracting concepts and weighting documents. For evaluation, we select two domains, the BBC and CNN web sites, and use their search engines to gather domain documents. We re-weight and re-score the documents using the proposed scheme and show a performance improvement in both domains.

Study for XML document retrieval to use XSL (XSL를 이용한 XML 문서 검색에 관한 연구)

  • 김충성;김용성
    • Proceedings of the Korean Information Science Society Conference / 1999.10a / pp.66-68 / 1999
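  • Recently, for document exchange between heterogeneous systems, XML (eXtensible Markup Language) documents rather than SGML (Standard Generalized Markup Language) documents have been establishing themselves as the standard on the Internet. Much of the information on the Internet will be XML-based in the future, which calls for a document information retrieval system. Structural retrieval can be based on the DTD (Document Type Definition), which expresses a document's logical structure. In this paper, however, we use the patterns that designate DTD elements in an XSL (eXtensible Stylesheet Language) document to represent document structure and attributes as a new tree, and the query language needed for retrieval likewise uses XSL patterns themselves. To give users a convenient and efficient retrieval environment, we propose a model of the retrieval interface.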

Automatically Converting HTML Documents with Similar Pattern into XML Documents (유사 패턴을 갖는 HTML 문서의 XML 자동 변환)

  • O, Geum-Yong;Hwang, In-Jun
    • The KIPS Transactions:PartD / v.9D no.3 / pp.355-364 / 2002
  • Recently, the WWW (World Wide Web) has become a source of a large amount of information, and it is now recognized not only as an information-sharing tool but also as an information repository. Currently, the majority of documents on the web are written in HTML (HyperText Markup Language). Although HTML is simple and easy to learn, its inherent inability to describe document structure makes it difficult to retrieve information effectively. One possible solution is to convert such HTML documents into XML (eXtensible Markup Language) documents. XML is a standard markup language for exchanging data on the web. It can describe a document structure freely by defining its own DTD (Document Type Definition), which makes it possible to integrate, store, and retrieve data on the web efficiently. In this paper, we propose a converter that automatically converts HTML documents with a similar pattern into XML documents by analyzing the document structure and recognizing its path information.
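
The abstract does not say how the converter represents "path information". As a rough illustration only, the sketch below records the chain of open tags above each piece of text using Python's standard `html.parser` module. The names `PathCollector`, `text_paths`, and `VOID_TAGS` are hypothetical, and this is not the authors' algorithm.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class PathCollector(HTMLParser):
    """Collects (tag path, text) pairs for every non-empty text node."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, outermost first
        self.paths = []   # collected (path, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.paths.append(("/".join(self.stack), text))


def text_paths(html):
    """Return the list of (tag path, text) pairs found in an HTML string."""
    parser = PathCollector()
    parser.feed(html)
    parser.close()
    return parser.paths
```

Text nodes that share a path, such as several list items under `html/body/ul/li`, are the kind of repeated structure that could plausibly be mapped to a single XML element name.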

XSL document authoring system using XSL Pattern (XSL패턴을 응용한 XSL 문서 편집 시스템)

  • 박진우;김성한;현득창;정회경
    • Proceedings of the Korean Information Science Society Conference / 2000.04b / pp.372-374 / 2000
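  • This paper concerns the design and implementation of an XSL document authoring system that applies XSL (eXtensible Stylesheet Language), a standard language that is usable on the Internet, highly extensible, and capable of transforming XML (eXtensible Markup Language) into other document formats as well as browsing and presenting it. To this end, the basic structural units of an XSL document are classified into patterns, so that template rules are generated automatically from the user's selections rather than written out by the user. The XSL element selection mechanism reads in an existing XML document so that the document's element information can be extended. In addition, the system provides a user interface that makes it easy to check the transformation of XML and XSL documents into HTML (HyperText Markup Language), and it is designed and implemented so that documents can be exchanged smoothly.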

A Design and Implementation of JiKU/XML Object-oriented Code Generator Using for Design Pattern (디자인 패턴을 이용한 JiKU/XML 객체지향코드 생성기 설계 및 구현)

  • Sun, Su-Kyun
    • The KIPS Transactions:PartD / v.11D no.4 / pp.907-916 / 2004
  • Existing code generation systems, developed for single-system use, make it hard for developers and maintainers to share pattern design information in a distributed environment. In this paper, we therefore design and implement JiKU/XML, an object-oriented code generator that uses pattern design, with XML as the basis for the web environment. We use UML to convert pattern designs into XML code, and we create code conforming to PIML commands to turn design information modeled in UML into XML code. The JiKU/XML object-oriented code generator produces 10-step code and can easily be applied to the web environment. It compensates for the shortcomings of the existing generator, F77/J++, and standardizes design because it uses UML and design pattern information. We compare it with the existing system through implementation cases and find that the proposed generator is more effective.

Simulation-Based Stochastic Markup Estimation System $(S^2ME)$ (시뮬레이션을 기반(基盤)으로 하는 영업이윤율(營業利潤率) 추정(推定) 시스템)

  • Yi, Chang-Yong;Kim, Ryul-Hee;Lim, Tae-Kyung;Kim, Wha-Jung;Lee, Dong-Eun
    • Proceedings of the Korean Institute of Building Construction Conference / 2007.11a / pp.109-113 / 2007
  • This paper introduces the Simulation-based Stochastic Markup Estimation System (S2ME), a system for estimating the optimum markup for a project. The system was designed and implemented to better represent the real-world system involved in construction bidding. Findings from an analysis of the assumptions used in previous quantitative markup estimation methods were incorporated to improve the accuracy and predictability of S2ME. The existing methods rest on four categories of assumption: (1) the number and identity of the competitors are known; (2) a typical, fictitious competitor is assumed for ease of computation; (3) the ratio of bid price to cost estimate (B/C) is assumed to follow a normal distribution; and (4) the deterministic output obtained from the probabilistic equations of existing models is assumed to be acceptable. However, these assumptions compromise the accuracy of prediction. In practice, the bidding patterns of bidders in competitive bidding are random. To compensate for the loss of accuracy caused by these assumptions, a bidding project was randomly selected from the bidding database in the simulation experiment. The probability of winning the bid was computed using the profile of the competitors that appeared in the selected bidding project record. The expected profit and the probability of winning were calculated by randomly selecting a bidding record in each iteration of the simulation experiment, under the assumption that the bidding patterns retained in the historical bidding DB recur.
The existing computation, which follows a deterministic procedure, was converted into a stochastic model using simulation modeling and analysis techniques as follows: (1) estimating the probability distribution functions of competitors' B/C ratios obtained from the historical bidding DB; (2) analyzing the sensitivity to increments of markup using both the normal distribution and the actual probability distribution estimated by distribution fitting; and (3) estimating the maximum expected profit and the optimum markup range. In the case study, the best-fitting probability distribution function was estimated using the historical bidding DB, which retains the competitors' bidding behavior, so that the reliability of the output obtained from the simulation experiment was improved.
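
The resampling idea in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the S2ME implementation. It assumes that the lowest bid wins, that our cost estimate is normalized to 1, and that profit equals the markup when the bid is won. The function names and the data layout are hypothetical.

```python
import random


def simulate_markup(historical_records, markups, iterations=10000, seed=0):
    """Estimate win probability and expected profit for each candidate markup.

    historical_records: list of lists; each inner list holds the competitors'
        bid-to-cost (B/C) ratios observed in one past bidding project.
    markups: candidate markups as fractions of cost (0.05 means 5%).
    Returns a dict mapping markup -> (win probability, expected profit),
    with expected profit expressed per unit of estimated cost.
    """
    rng = random.Random(seed)
    results = {}
    for m in markups:
        wins = 0
        for _ in range(iterations):
            # Draw one past project at random and bid against its competitors.
            competitors = rng.choice(historical_records)
            # Assumption: the lowest bid wins the project.
            if 1.0 + m < min(competitors):
                wins += 1
        p_win = wins / iterations
        results[m] = (p_win, m * p_win)
    return results


def best_markup(results):
    """Return the candidate markup with the highest expected profit."""
    return max(results, key=lambda m: results[m][1])
```

Sampling whole project records, instead of drawing each competitor from a fitted normal distribution, keeps the number and mix of competitors as they appeared in the historical data, which is the point the abstract makes against assumptions (1) to (3).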

XML Document Clustering Based on Sequential Pattern (순차패턴에 기반한 XML 문서 클러스터링)

  • Hwang, Jeong-Hee;Ryu, Keun-Ho
    • The KIPS Transactions:PartD / v.10D no.7 / pp.1093-1102 / 2003
  • As Internet use grows, the amount of information is increasing rapidly, and XML, a standard for web data, offers flexible data representation. Web-based electronic document systems such as EDMS (Electronic Document Management System) and ebXML (e-business eXtensible Markup Language) have therefore been adopting XML as the means of exchanging and standardizing documents. Research is thus needed on methods that can manage and search structured XML documents effectively. In this paper, we propose a clustering method based on structural similarity among many XML documents, using typical structures extracted from each document by sequential pattern mining in a pre-clustering step. The proposed algorithm improves clustering accuracy by computing a cost that considers cluster cohesion and inter-cluster similarity.

An Effective XML Schema Conversion Technique for Improving XML Document Reusability using Pattern List

  • Ko, Hye-Kyeong;Yang, Minho
    • International Journal of Internet, Broadcasting and Communication / v.9 no.2 / pp.11-19 / 2017
  • The growing use of the XML markup language has made large numbers of heterogeneous XML documents widely available on the Web. As the number of applications that use heterogeneous XML documents grows, the importance of XML document extraction increases greatly. In this paper, we propose an XML schema conversion technique that derives reusable XML schemas from XML documents. We convert the schema graph and use a reusability pattern list. The converted XML schema is evaluated in terms of cohesion, coupling, and reusability. The converted XML schema could be used to construct databases in various fields where XML serves as an intermediary for data exchange.

Implementation of an XML-Based Editor/Transformer for Large Volume of Similar Documents (XML 기반의 대용량 유사 문서 편집기/변환기 구현)

  • 황인준
    • The Journal of Society for e-Business Studies / v.9 no.1 / pp.21-38 / 2004
  • With its recent popularity, the Web is now regarded as a huge repository of information. Most documents on the web have been created using HTML (HyperText Markup Language). Even though HTML is simple and easy to learn, it has several features that are obstacles to efficient information retrieval. XML (eXtensible Markup Language) can provide a solution to such problems and has in fact already been used in many applications. XML is a standard markup language for exchanging data on the web. It can describe a document structure freely by defining its DTD, which enables efficient integration and retrieval of data on the web. In this paper, we propose a versatile and efficient XML document manager. Its features include (i) a form-based XML editor that enables easy creation of new XML documents, (ii) an automatic document converter that can transform HTML documents with similar structure into XML documents, and (iii) a GUI-based DTD editor.
