• Title/Summary/Keyword: graph search (그래프 검색)

Search Result 226, Processing Time 0.026 seconds

Developing RDF Meta data Graph for Transportation Open Data Platform (교통데이터 유통을 위한 RDF 메타 데이터 그래프 구축방안)

  • Park, Eun Mi;Kang, Jung Hyun
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.20 no.6 / pp.110-116 / 2021
  • The W3C established the DCAT metadata standard, based on RDF (Resource Description Framework), which is now widely accepted worldwide. To guarantee the interoperability and integrity of data from various sources, and even from various countries, transportation metadata should also follow the DCAT standard. However, to represent transportation-specific features, new properties and vocabularies must be defined in addition to the DCAT standard. This research identifies the additional properties and vocabularies needed for transportation metadata, considering the unique characteristics of transportation data. The revised RDF schema and RDF graph proposed in this research are expected to help revitalize the transportation open data platform.

Design and Implementation of High-Speed Pattern Matcher in Network Intrusion Detection System (네트워크 침입 탐지 시스템에서 고속 패턴 매칭기의 설계 및 구현)

  • Yoon, Yeo-Chan;Hwang, Sun-Young
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.11B / pp.1020-1029 / 2008
  • This paper proposes a high-speed pattern matching algorithm and its implementation. The pattern matcher checks patterns in real-time input packets. The proposed algorithm can find exact strings, ranges of string values, and combinations of string values in input packets at high speed. The given strings and rule set are modeled as a state transition graph that can find overlapping strings simultaneously, and the state transition graph is partitioned according to input implicants to reduce implementation complexity. The pattern matcher takes the transformed state transition graph and the input packet as its inputs. The pattern matcher was modeled and implemented in VHDL. Experimental results demonstrate the suitability of the proposed approach.

Fast Computation of All-pairs 2-step Random Walk on Large Graphs (큰 그래프에서의 모든 쌍에 대한 빠른 2 단계 랜덤 워크 계산 방법)

  • Park, Sung-Chan;Lee, Sang-Goo
    • Proceedings of the Korean Information Science Society Conference / 2012.06c / pp.125-127 / 2012
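  • Research on heterogeneous graphs is currently very active. In particular, recommendation and search have seen notable performance gains from using heterogeneous graphs. Heterogeneous graphs carry diverse information, and 2-step random walk probabilities in particular carry several kinds of useful information. Examples include "users who frequently watched the movies that a given user frequently watched" and "products frequently purchased by a given user's neighbors". However, such information is hard to compute in real time, and precomputing it is also time-consuming. This study therefore presents an algorithm that quickly precomputes 2-step random walks for all source-destination node pairs. Exploiting the fact that the random walk probabilities from two nodes that share many neighbors are similar to each other, it reuses previous computation results to reduce the number of random accesses to adjacency lists. We also introduce several open issues related to the algorithm.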
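
For orientation, the quantity being precomputed can be sketched as a naive baseline. This is not the paper's accelerated algorithm (the abstract does not give its details); it only illustrates what "all-pairs 2-step random walk probability" means, assuming an unweighted graph stored as adjacency lists and uniform transition probabilities. The function name `two_step_random_walk` and the toy user-movie graph are hypothetical.

```python
from collections import defaultdict


def two_step_random_walk(adj):
    """Naive baseline: for every start node u, return {w: P(reach w from u in exactly 2 steps)}.

    Assumes uniform transitions: from a node with k neighbors, each neighbor
    is chosen with probability 1/k.
    """
    result = {}
    for u, neighbors in adj.items():
        probs = defaultdict(float)
        for v in neighbors:
            p_uv = 1.0 / len(neighbors)      # probability of the first step u -> v
            second = adj.get(v, [])          # adjacency list of the intermediate node
            for w in second:
                probs[w] += p_uv / len(second)  # add P(u -> v) * P(v -> w)
        result[u] = dict(probs)
    return result


# Hypothetical toy graph: users u1, u2 and movies m1, m2 (edges listed in both directions).
graph = {
    "u1": ["m1", "m2"],
    "u2": ["m1"],
    "m1": ["u1", "u2"],
    "m2": ["u1"],
}
walks = two_step_random_walk(graph)
```

Every start node triggers one adjacency-list lookup per neighbor in this baseline; those repeated lookups are the random accesses that the abstract says the proposed method reduces by reusing results across nodes with many shared neighbors.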

Methods for Integration of Documents using Hierarchical Structure based on the Formal Concept Analysis (FCA 기반 계층적 구조를 이용한 문서 통합 기법)

  • Kim, Tae-Hwan;Jeon, Ho-Cheol;Choi, Joong-Min
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.63-77 / 2011
  • The World Wide Web is a very large distributed digital information space. Since its origins in 1991, the web has grown to encompass diverse information resources such as personal home pages, online digital libraries, and virtual museums. Some estimates suggest that the web currently includes over 500 billion pages in the deep web. The ability to search and retrieve information from the web efficiently and effectively is an enabling technology for realizing its full potential. With powerful workstations and parallel processing technology, efficiency is not a bottleneck. In fact, some existing search tools sift through gigabyte-size precompiled web indexes in a fraction of a second. But retrieval effectiveness is a different matter. Current search tools retrieve too many documents, of which only a small fraction are relevant to the user query. Furthermore, the most relevant documents do not necessarily appear at the top of the query output order. Also, current search tools cannot retrieve the documents related to a retrieved document from a gigantic collection of documents. The most important problem for many current search systems is to increase the quality of search, that is, to provide related documents and to reduce the number of unrelated documents in the search results as far as possible. To address this problem, CiteSeer proposed ACI (Autonomous Citation Indexing) of the articles on the World Wide Web. A "citation index" indexes the links between articles that researchers make when they cite other articles. Citation indexes are very useful for a number of purposes, including literature search and analysis of the academic literature. In more detail, references contained in academic articles are used to give credit to previous work in the literature and provide a link between the "citing" and "cited" articles. A citation index indexes the citations that an article makes, linking the articles with the cited works.
    Citation indexes were originally designed mainly for information retrieval. The citation links allow navigating the literature in unique ways. Papers can be located independent of language and of the words in the title, keywords, or document. A citation index allows navigation backward in time (the list of cited articles) and forward in time (which subsequent articles cite the current article?). But CiteSeer cannot index links between articles that researchers do not make, because it indexes only the links that researchers create when they cite other articles. For the same reason, CiteSeer does not scale easily. These problems motivated us to design a more effective search system. This paper presents a method that extracts the subject and predicate of each sentence in a document. Each document is converted into a table in which each extracted predicate is checked against its possible subjects and objects. We build a hierarchical graph of each document from the table and then integrate the graphs of the documents. From the graph of the entire document set, we calculate the area of each document relative to the integrated documents, and we mark the relations among documents by comparing these areas. The paper also proposes a method for structural integration of documents that retrieves documents from the graph, which makes it easier for the user to find information. We compared the performance of the proposed approaches with the Lucene search engine using its ranking formulas. As a result, the F-measure is about 60%, an improvement of about 15%.

A SNOMED CT Browser System Supporting Structural Search of Clinical Terminology (의학용어의 구조 검색을 지원하는 SNOMED CT 브라우저 시스템)

  • Ryu, Wooseok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.10a / pp.353-355 / 2015
  • The SNOMED CT browser is a search tool for searching and browsing the terminologies included in SNOMED CT. These terminologies form a structure through a variety of relationships. However, previous browsers merely list substring-matched search results rather than exploiting these structural characteristics. This paper proposes and implements a browser system that shows a sub-graph of the search results, enabling structural search of the results. The implementation includes substring-matching terminology search, tree-based graphical organization of the search results, and a history of concept views.

Indexing method with deduplication for efficient RDF data retrieving (효율적인 RDF 데이터 검색을 위한 중복 제거 색인 방법)

  • Jang, Hyeonggyu;Bang, Sungho;Oh, Sangyoon
    • Proceedings of the Korean Society of Computer Information Conference / 2020.01a / pp.61-62 / 2020
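  • As the use of RDF has grown, methods for storing RDF data have also been studied extensively. When graph-shaped RDF data is converted into tables, the same data is stored redundantly, which causes unnecessary operations during search. To reduce redundant storage and unnecessary searching, this paper proposes a technique that builds subject (S) and object (O) indexes plus a separate index of the values duplicated between them, and that checks the duplicate values at query time so that only the necessary indexes are searched. Experiments confirmed that the technique reduces unnecessary searches and thereby shortens the overall search time.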

Web Document Clustering based on Graph using Hyperlinks (하이퍼링크를 이용한 그래프 기반의 웹 문서 클러스터링)

  • Lee, Joon;Kang, Jin-Beom;Choi, Joong-Min
    • Proceedings of the HCI Society of Korea Conference (한국HCI학회:학술대회논문집) / 2009.02a / pp.590-595 / 2009
  • With the exponential growth of web documents on the internet, improving the performance of clustering methods for web documents has become important. Web document clustering techniques can offer accurate information and fast information retrieval by clustering web documents through their semantic relationships. The mesh-graph-based clustering method provides high recall by calculating similarity between documents, but it incurs a high computation cost. This paper proposes a clustering method that uses hyperlinks, a structural feature of web documents, in order to maintain effectiveness while reducing computation cost.

Learning Bayesian Networks for Text Documents Classification (텍스트 문서 분류를 위한 베이지안망 학습)

  • 황규백;장병탁;김영택
    • Proceedings of the Korean Information Science Society Conference / 2000.04b / pp.262-264 / 2000
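  • Text document classification is the task of sorting documents given in text form by category, and it is a foundational task applicable to areas such as web page search, newsgroup search, and mail filtering. Until now, various machine learning techniques such as k-NN and neural networks have been used to classify documents. This paper performs text document classification using Bayesian networks. A Bayesian network is a graphical model that represents probabilistic relationships among many variables; it consists of a network structure in the form of a DAG and a local probability distribution associated with each node. An advantage of using a graphical model is that the relationships among the attributes used in learning can be learned in a form that is easy for humans to understand. The experiments used the Reuters-21578 document classification data, and the performance of the Bayesian network was similar to that of the naive Bayes classifier.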

Deadlock Detection using Graph Technique in Puzzle Game Environment (퍼즐 게임 환경에서 그래프 기법을 이용한 교착상태의 발견)

  • Park, Moon-Kyoung;Choi, Yong-Suk
    • Proceedings of the Korean Information Science Society Conference / 2011.06c / pp.343-346 / 2011
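  • One of the important problems that can arise in most puzzle games is deadlock. To address deadlocks, this paper represents the puzzle game as a graph and then proposes Cycle Detection, a new deadlock detection technique based on that graph. Existing techniques either take too long to run or require a pre-built pattern database, which makes them problematic for detecting deadlocks in real time. To solve these problems, this paper applies Local search and Pruning techniques, which minimize the number of nodes to be explored, and proposes a technique that can detect deadlocks in real time while the puzzle game is being played. To evaluate performance, we ran the algorithm in a real puzzle game environment and confirmed the improvement by comparing the number of nodes searched and the search time against existing techniques.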
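
The abstract does not say how the puzzle is mapped to a graph or how cycles are found, so the following is only a generic sketch of cycle detection on a directed graph using depth-first search with three node states. The dependency-graph encoding ("piece a is blocked by piece b") and the function name `has_cycle` are assumptions for illustration, and the paper's Local search and Pruning steps are not modeled.

```python
def has_cycle(graph):
    """Return True if the directed graph {node: [successors]} contains a cycle.

    Generic DFS coloring: WHITE = unvisited, GRAY = on the current DFS path,
    BLACK = fully explored. Reaching a GRAY node means a back edge, i.e. a cycle.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:                  # back edge to the current path
                return True
            if state == WHITE and visit(nxt):  # explore unvisited successors
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


# Hypothetical "blocked by" relations between puzzle pieces.
deadlocked = {"a": ["b"], "b": ["c"], "c": ["a"]}   # a -> b -> c -> a
solvable = {"a": ["b", "c"], "b": ["c"], "c": []}   # no circular blocking
```

Under this assumed encoding, `has_cycle(deadlocked)` reports a circular dependency, while `has_cycle(solvable)` does not. The recursive form is adequate for small graphs; a very deep graph would need an explicit stack.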

Image Information Retrieval Using DTW(Dynamic Time Warping) (DTW(Dynamic Time Warping)를 이용한 영상 정보 검색)

  • Ha, Jeong-Yo;Lee, Na-Young;Kim, Gye-Young;Choi, Hyung-Il
    • Journal of Digital Contents Society / v.10 no.3 / pp.423-431 / 2009
  • There are various image retrieval methods using shape, color, and texture features. One of the most active areas uses shape and color information. A number of shape representations have been suggested to recognize shapes even under affine transformation. Among the many shape recognition methods, the best known are Fourier descriptors and moment invariants. Another method is CSS (Curvature Scale Space). The maxima of the curvature scale space image have already been used to represent 2-D shapes in different applications. Because the existing CSS method has several problems, this paper uses an improved CSS method for image retrieval. There are two kinds of color methods: one uses RGB color information features and the other uses HSI color information features. In this paper we use the HSI color model to represent the color histogram and then use it as a comparison measure. Similarity is first measured using Euclidean distance, and to improve search time and accuracy, we use DTW to measure similarity. Compared with the result of using Euclidean distance, efficiency improved.
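
As a reminder of what DTW computes, here is a minimal sketch of the classic dynamic-programming formulation. It assumes the features being compared are 1-D numeric sequences with absolute difference as the local cost; the abstract does not specify how the CSS or histogram features are fed to DTW, so the function name `dtw_distance` and the sample sequences are illustrative only.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])       # local cost (assumed: absolute difference)
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] aligned with an already matched b element
                cost[i][j - 1],      # b[j-1] aligned with an already matched a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# Sequences of different lengths that differ only by a repeated sample.
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

Unlike a point-by-point Euclidean comparison, DTW accepts sequences of different lengths and tolerates local stretching, which is why `stretched` comes out as zero here.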
