• Title/Summary/Keyword: Web Graph

Search Results: 218

Web Structure Mining by Extracting Hyperlinks from Web Documents and Access Logs (웹 문서와 접근로그의 하이퍼링크 추출을 통한 웹 구조 마이닝)

  • Lee, Seong-Dae;Park, Hyu-Chan
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.11 no.11
    • /
    • pp.2059-2071
    • /
    • 2007
  • If the correct structure of a Web site is known, the information provider can discover users' behavior patterns and characteristics to offer better services, and users can find useful information easily and exactly. It may be difficult, however, to extract the exact structure of a Web site, because documents on the Web tend to change frequently. This paper proposes a new method for extracting such Web structure automatically. The method consists of two phases. The first phase extracts the hyperlinks among Web documents and constructs a directed graph to represent the structure of the Web site. It cannot, however, discover hyperlinks embedded in Flash and Java applets. The second phase finds such hidden hyperlinks by using the Web access log: it first extracts click streams from the access log and then identifies the hidden hyperlinks by comparing the streams with the directed graph. Several experiments have been conducted to evaluate the proposed method.
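The second phase above amounts to a set comparison between logged click streams and the known link graph. A minimal sketch with assumed data structures, not the paper's implementation:

```python
def find_hidden_links(known_links, click_streams):
    """Return page transitions seen in click streams but absent from
    the crawled hyperlink graph (candidate hidden links)."""
    known = set(known_links)
    hidden = set()
    for stream in click_streams:
        # every consecutive pair in a click stream is a traversed link
        for src, dst in zip(stream, stream[1:]):
            if (src, dst) not in known:
                hidden.add((src, dst))
    return hidden
```

Any transition observed in a click stream but missing from the directed graph of phase one is reported as a candidate hyperlink hidden in Flash or a Java applet.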

A Study of Automatic Ontology Building by Web Information Extraction and Natural Language Processing (웹 문서 정보추출과 자연어처리를 통한 온톨로지 자동구축에 관한 연구)

  • Kim, Myung-Gwan;Lee, Young-Woo
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.9 no.3
    • /
    • pp.61-67
    • /
    • 2009
  • As the Internet proliferates and electronic documents multiply, information retrieval technology grows in importance. This research builds a more efficient and accurate knowledge base from unstructured text documents on the Web by extracting their core meaning with an LGG (Local Grammar Graph). We built an OWL (Web Ontology Language) ontology from grammar patterns extracted for the up/down movement patterns of particular stocks. This makes it possible for users to search for the meaningful, high-quality information they want.

  • PDF

Text Extraction and Summarization from Web News (웹 뉴스의 기사 추출과 요약)

  • Han, Kwang-Rok;Sun, Bok-Keun;Yoo, Hyoung-Sun
    • Journal of the Korea Society of Computer and Information
    • /
    • v.12 no.5
    • /
    • pp.1-10
    • /
    • 2007
  • Many types of information provided through the Web, including news content, contain unnecessary clutter. This clutter makes it difficult to build automated information processing systems for the summarization, extraction, and retrieval of documents. We propose a system that extracts and summarizes news content from the Web. The extraction system receives news content in HTML as input, builds an element tree similar to a DOM tree, and extracts text while removing clutter identified by the hyperlink attributes in the HTML tags. Text extracted by the extraction system is passed to the summarization system, which extracts key sentences. We implement the summarization system using a co-occurrence relation graph. The summarized sentences are expected to be transmissible to a PDA or cellular phone by message services such as SMS.

  • PDF
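The key-sentence extraction step can be illustrated with a toy co-occurrence relation graph over sentences. Word overlap stands in for the paper's co-occurrence statistics; the function names and scoring are assumptions:

```python
def summarize(sentences, k=2):
    """Rank sentences by their degree in a co-occurrence graph where two
    sentences are linked when they share a word, then return the top k
    in original order."""
    words = [set(s.lower().split()) for s in sentences]
    degree = [sum(1 for j, w in enumerate(words)
                  if j != i and words[i] & w)
              for i in range(len(words))]
    ranked = sorted(range(len(sentences)), key=lambda i: -degree[i])
    return [sentences[i] for i in sorted(ranked[:k])]
```

Sentences that share vocabulary with many others sit at the center of the graph and are chosen as key sentences.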

Web Site Construction Using Internet Information Extraction (인터넷 정보 추출을 이용한 웹문서 구조화)

Hierarchical Web Structuring Using Integer Programming

  • Lee Wookey;Kim Seung;Kim Hando;Kang Suk-Ho
    • Proceedings of the Korean Operations and Management Science Society Conference
    • /
    • 2004.10a
    • /
    • pp.51-67
    • /
    • 2004
  • The World Wide Web is nearly ubiquitous, and the tremendous growth of Web information strongly calls for a structuring framework in which an overview visualization of a Web site is provided as a visual surrogate for users. We view a Web site as a directed graph whose nodes correspond to Web pages and whose arcs correspond to hypertext links between the pages. The goal in this paper is not to derive a naive shortest path or a fast access method, but to generate an optimal structure based on context-centric weights. We model a Web site formally so that an integer programming model can be formulated. Even under changes such as modification of the query terms, the optimized Web site structure can be maintained through sensitivity analysis.

  • PDF
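As a rough illustration of the optimization view (a greedy stand-in, not the paper's integer program, which a solver would handle exactly), one can grow a hierarchy by repeatedly attaching the page reachable via the heaviest-weight link from the structure built so far:

```python
def hierarchy(root, arcs):
    """Greedy stand-in for context-centric Web structuring.
    arcs maps (src, dst) -> weight; returns a parent map (a tree)."""
    parent, placed = {}, {root}
    while True:
        # candidate arcs leading from the current tree to new pages
        cands = [(w, s, d) for (s, d), w in arcs.items()
                 if s in placed and d not in placed]
        if not cands:
            return parent
        w, s, d = max(cands)  # heaviest link wins
        parent[d] = s
        placed.add(d)
```

The integer program in the paper optimizes this choice globally; the greedy sketch only shows the weighted-graph-to-hierarchy idea.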

Automated Generation of Composite Web Services based on Functional Semantics (기능적 의미에 기반한 복합 웹 서비스 자동 구성)

  • Shin, Dong-Hoon;Lee, Kyong-Ho
    • Journal of Korea Multimedia Society
    • /
    • v.11 no.9
    • /
    • pp.1310-1323
    • /
    • 2008
  • Recently, many studies on the automated generation of composite Web services have been conducted. Most of these works compose Web services by chaining their inputs and outputs but do not consider functional semantics, so they may construct composite services that do not satisfy users' intentions. Furthermore, they have high time complexity, since every possible combination of available services must be considered. To resolve these problems, this paper proposes a composition method that explicitly specifies and uses the functional semantics of Web services. Specifically, a graph model is constructed to represent the functional semantics of Web services as well as the dependencies among inputs and outputs. On this graph, we search for core services, which provide the requested functionality, and additional services, which translate between the I/O types of the user request and those of the core services. Composite services are then built from combinations of the discovered services. The proposed method improves the semantic correctness of composite services through the functional semantics of Web services, and reduces time complexity by combining only functionally related services.

  • PDF
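The plain I/O chaining baseline that the paper improves on can be sketched as forward chaining over input/output types. Service names and types here are hypothetical; the paper's contribution, the functional-semantics graph, is layered on top of this and is not shown:

```python
def compose(services, have, want):
    """Forward-chain services whose inputs are already satisfied until
    all wanted output types are available.
    services: name -> (input_types, output_types)."""
    have, want = set(have), set(want)
    plan = []
    progress = True
    while progress and not want <= have:
        progress = False
        for name, (ins, outs) in sorted(services.items()):
            # run a service once its inputs exist and it adds something new
            if name not in plan and set(ins) <= have and not set(outs) <= have:
                plan.append(name)
                have |= set(outs)
                progress = True
    return plan if want <= have else None
```

Chaining on types alone is exactly what can produce semantically wrong compositions; the paper's functional-semantics graph filters candidates before this step.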

Remote Diagnosis of Hypertension through HTML-based Backward Inference

  • Song, Yong-Uk;Chae, Young-Moon;Cho, Kyoung-Won;Ho, Seung-Hee
    • Proceedings of the Korea Intelligent Information Systems Society Conference
    • /
    • 2001.01a
    • /
    • pp.496-507
    • /
    • 2001
  • An expert system for the diagnosis and indication of hypertension is implemented through HTML-based backward inference. HTML-based backward inference is performed using the hypertext function of HTML: many HTML files, hyperlinked to each other according to the backward rules, are prepared beforehand. The development and maintenance of the HTML files are automated using a decision graph. Still, drawing and entering the decision graph is a time-consuming and tedious job if done manually, so an automatic generator of the decision graph for the diagnosis and indication of hypertension was also implemented. HTML-based backward inference ensures the accessibility, multimedia facilities, fast response, stability, ease of use, and platform independence of the expert system. This research thus shows that the HTML-based inference approach can be used for many Web-based intelligent sites with fast and stable performance.

  • PDF
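The inference itself is ordinary backward chaining; in the paper each inference state is pre-compiled into a hyperlinked HTML page. The underlying logic can be sketched as follows (the rules and facts are illustrative, not the system's medical knowledge base):

```python
def backward_prove(goal, rules, facts):
    """Backward chaining: rules maps a conclusion to a list of
    alternative premise lists; facts are answers already known."""
    if goal in facts:
        return True
    # a goal holds if every premise of some rule for it can be proved
    return any(all(backward_prove(p, rules, facts) for p in premises)
               for premises in rules.get(goal, []))
```

In the paper's scheme, each recursive step here corresponds to following a hyperlink between pre-generated HTML pages, which is what makes the expert system a set of static files.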

Document Clustering with Relational Graph Of Common Phrase and Suffix Tree Document Model (공통 Phrase의 관계 그래프와 Suffix Tree 문서 모델을 이용한 문서 군집화 기법)

  • Cho, Yoon-Ho;Lee, Sang-Keun
    • The Journal of the Korea Contents Association
    • /
    • v.9 no.2
    • /
    • pp.142-151
    • /
    • 2009
  • NSTC, a previous document clustering method, measures the similarity between document pairs using TF-IDF during Web document clustering. In this paper, we propose a new similarity measure based on a relational graph of common phrases rather than TF-IDF. The method weights common phrases using a relational graph that represents the relationships among common phrases in the document collection. Experimental results indicate that the proposed method clusters document collections more effectively than NSTC.
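A toy version of the phrase-graph weighting might look like this. Documents are reduced to sets of phrases, and a phrase's degree among co-occurring phrases is a crude stand-in for the paper's relational-graph weight:

```python
def similarity(a, b, collection):
    """a, b, and each document in collection are sets of phrases.
    A shared phrase counts more when it co-occurs with many other
    phrases across the collection."""
    score = 0.0
    for p in a & b:
        # neighbors of p in the phrase graph: phrases appearing
        # alongside p in some document of the collection
        neighbors = set().union(*(d for d in collection if p in d)) - {p}
        score += 1 + len(neighbors)
    return score
```

Well-connected common phrases contribute more to the pairwise similarity, which is the intuition behind replacing plain TF-IDF with the relational graph.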

SGS: Splicing Graph Server

  • Bollina, Durgaprasad;Lee, Bernett T.K.;Ranganathan, Shoba
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2005.09a
    • /
    • pp.47-50
    • /
    • 2005
  • SGS (Splicing Graph Server) is a web application based on the MVC architecture on the Java platform. The specifics of the implemented design pattern are closely tied to the requirements of splicing graphs for analyzing alternative splice variants of a single gene. The paper presents the use of the MVC architecture with JavaBeans as the model, a JSP viewer, and a servlet as the controller for this bioinformatics web application, backed by the open-source Apache Tomcat application server and a MySQL database management system.

  • PDF

Comparisons of MMR, Clustering and Perfect Link Graph Summarization Methods (MMR, 클러스터링, 완전연결기법을 이용한 요약방법 비교)

  • 유준현;변동률;박순철
    • Proceedings of the IEEK Conference
    • /
    • 2003.07d
    • /
    • pp.1319-1322
    • /
    • 2003
  • We present a Web document summarizer for a search engine that is simpler and more condensed than existing ones. The summarizer generates summaries with a statistics-based summarization method, using clustering or the MMR technique to reduce redundancy in the results, and also generates summaries using a perfect link graph. We compare the results with summaries generated by human subjects, using the F-score for the comparison. Our experimental results verify the accuracy of the summarization methods.

  • PDF
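The MMR criterion mentioned above selects, at each step, the sentence maximizing λ·Sim(s, query) − (1 − λ)·max over already-selected t of Sim(s, t), trading relevance against redundancy. A minimal sketch using token-overlap (Jaccard) similarity rather than the authors' statistics-based scorer:

```python
def mmr_select(query, sentences, k=2, lam=0.7):
    """Maximal Marginal Relevance: pick k sentences balancing relevance
    to the query against redundancy with sentences already chosen."""
    def sim(a, b):
        ta, tb = set(a.lower().split()), set(b.lower().split())
        return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

    selected, pool = [], list(sentences)
    while pool and len(selected) < k:
        best = max(pool, key=lambda s: lam * sim(s, query)
                   - (1 - lam) * max((sim(s, t) for t in selected),
                                     default=0.0))
        selected.append(best)
        pool.remove(best)
    return selected
```

With a lower λ, a near-duplicate of an already-selected sentence loses to a less relevant but novel one, which is exactly the redundancy reduction the abstract describes.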