• Title/Summary/Keyword: Graph-structured Data


On the Organization of Object-Oriented Model Bases for Structured Modeling (구조적 모델링을 위한 객체지향적 모델베이스 조직화)

  • 정대율
    • The Journal of Information Systems
    • /
    • v.5
    • /
    • pp.149-173
    • /
    • 1996
  • This paper focuses on the development of object-oriented model bases for structured modeling. The model base is organized using object modeling techniques and a model typing concept analogous to data typing. Structured modeling formalizes the notion of a definitional system as a way of describing models. In object-oriented terms, a structured model can be represented as follows. Each group of similar elements (a genus) is represented by a composite class, and other types of genera can be represented in a similar manner. This hierarchical class composition gives rise to an acyclic class-composition graph that corresponds to the genus graph of the structured model. Nodes in this graph are instantiated to represent the elemental graph for a specific model. Taking the class composition process one step further, the classes are aggregated into higher-level composite classes that correspond to the structured modeling notion of a module. Finally, the model itself is represented by a composite class whose attributes each have as their domain a composite class representing one of the modules. The resulting class-composition graph represents the modular tree of the structured model.


Efficient Mining of Frequent Subgraph with Connectivity Constraint

  • Moon, Hyun-S.;Lee, Kwang-H.;Lee, Do-Heon
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2005.09a
    • /
    • pp.267-271
    • /
    • 2005
  • The goal of data mining is to extract new and useful knowledge from large-scale datasets. As the amount of available data grows explosively, it has become vitally important to develop faster data mining algorithms for various types of data. Recently, interest in data mining algorithms that operate on graphs has increased, and mining frequent patterns from structured data such as graphs has attracted many research groups. A graph is a highly adaptable representation scheme that is used in many domains, including chemistry, bioinformatics, and physics. For example, the chemical structure of a substance can be modelled as an undirected labelled graph in which each node corresponds to an atom and each edge corresponds to a chemical bond between atoms. The Internet can also be modelled as a directed graph in which each node corresponds to a web site and each edge corresponds to a hypertext link between web sites. In bioinformatics in particular, various kinds of newly discovered data, such as gene regulation networks and protein interaction networks, can be modelled as graphs. There have been a number of attempts to extract useful knowledge from such graph-structured data. One of the most powerful analysis tools for graph-structured data is frequent subgraph analysis, because recurring patterns in graph data can provide valuable insight into that data. However, finding recurring subgraphs is computationally very expensive. At the core of the problem are two computationally challenging tasks: 1) subgraph isomorphism and 2) enumeration of subgraphs. Problems related to the former are the subgraph isomorphism problem (does graph A contain graph B?) and the graph isomorphism problem (are two graphs A and B the same or not?). Even these simplified versions of the subgraph mining problem are known to be NP-complete or isomorphism-complete, and no polynomial-time algorithm is known so far. The latter is also a difficult problem: without any constraint, all $2^n$ subgraphs must be generated, where n is the number of vertices of the input graph. To find frequent subgraphs in a larger graph database, it is essential to impose appropriate constraints on the subgraphs to be found. Most current approaches focus on the frequency of a subgraph: the higher the frequency of a subgraph, the more attention it should receive. Recently, several algorithms that use level-by-level approaches to find frequent subgraphs have been developed. Some recently emerging applications suggest that other constraints, such as connectivity, could also be useful in mining subgraphs: more strongly connected parts of a graph are more informative. If the set of subgraphs to mine is restricted to the more strongly connected parts, the computational complexity can be decreased significantly. In this paper, we present an efficient algorithm to mine frequent subgraphs that are more strongly connected. An experimental study shows that the algorithm scales to larger graphs with more than ten thousand vertices.
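The effect of a connectivity constraint on the size of the search space can be illustrated with a small brute-force count. This is only a sketch under stated assumptions, not the algorithm of the paper: it uses plain connectivity of induced subgraphs as a simpler stand-in for the paper's "more strongly connected" criterion, and the function names (`is_connected`, `count_subsets`) and the example graph are hypothetical.

```python
from itertools import combinations


def is_connected(vertices, adj):
    """Return True if the subgraph induced by `vertices` is connected."""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            # Only follow edges that stay inside the chosen vertex subset.
            if v in vertices and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == vertices


def count_subsets(adj):
    """Count non-empty vertex subsets, and those inducing a connected subgraph."""
    nodes = sorted(adj)
    total = connected = 0
    for k in range(1, len(nodes) + 1):
        for subset in combinations(nodes, k):
            total += 1
            if is_connected(subset, adj):
                connected += 1
    return total, connected


# Hypothetical example: the path graph 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
total, connected = count_subsets(path)
print(total, connected)  # 15 non-empty subsets, 10 of them connected
```

Even on this four-vertex path, a third of the candidate subsets are discarded by the constraint; the brute-force enumeration itself is still exponential and is shown only to make the counting concrete.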


A Path Partitioning Technique for Indexing XML Data (XML 데이타 색인을 위한 경로 분할 기법)

  • 김종익;김형주
    • Journal of KIISE:Databases
    • /
    • v.31 no.3
    • /
    • pp.320-330
    • /
    • 2004
  • Query languages for XML use paths in a data graph to represent queries; paths are the basic building block of an XML query. Users can write more expressive queries by using patterns (e.g. regular expressions) for paths. Because of the nature of semi-structured data, a data graph contains many identical paths. Current XML indexing research exploits these identical paths, but such an index can grow larger than the source data graph and cannot guarantee an efficient access path. In this paper we propose a technique that partitions all the paths in a data graph. We develop an index graph that can efficiently find the appropriate partitions for a path query. The size of our index graph can be adjusted regardless of the source data, so the cost of index graph traversals can be reduced significantly. In the performance study, we show that our index is much faster than other graph-based indexes.

Improving Diversity of Keyword Search on Graph-structured Data by Controlling Similarity of Content Nodes (콘텐트 노드의 유사성 제어를 통한 그래프 구조 데이터 검색의 다양성 향상)

  • Park, Chang-Sup
    • The Journal of the Korea Contents Association
    • /
    • v.20 no.3
    • /
    • pp.18-30
    • /
    • 2020
  • As graph-structured data has come to be widely used in fields such as social networks and the Semantic Web, the need for effective and efficient search over large amounts of graph data has been increasing. Previous keyword-based search methods often find results by considering only their relevance to a given query. However, they are likely to produce semantically similar results by selecting answers that have high query relevance but share the same content nodes. To improve the diversity of search results, we propose a top-k search method that finds a set of subtrees which are not only relevant but also diverse in terms of their content nodes, by controlling the similarity of those nodes. We define a criterion for a set of diverse answer trees and design two kinds of diversified top-k search algorithms, based on incremental enumeration and on A* heuristic search, respectively. We also suggest an improvement to the A* search algorithm to enhance its performance. Experiments using real data sets show that the proposed heuristic search method can efficiently find relevant answers with diverse content nodes.
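The idea of trading relevance against shared content nodes can be sketched with a simple greedy filter. This is an illustrative assumption, not the enumeration or heuristic search algorithms of the paper: answers are taken in order of relevance and an answer is skipped when the Jaccard similarity of its content-node set to any already selected answer exceeds a threshold. The names `jaccard`, `diversified_top_k`, `max_sim`, and the sample answers are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def diversified_top_k(answers, k, max_sim):
    """Greedily pick up to k answers by relevance, bounding pairwise similarity.

    `answers` is a list of (name, relevance, content_nodes) tuples.
    An answer is kept only if its content nodes have Jaccard similarity
    of at most `max_sim` with every answer selected so far.
    """
    selected = []
    for name, score, nodes in sorted(answers, key=lambda a: a[1], reverse=True):
        if all(jaccard(nodes, chosen[2]) <= max_sim for chosen in selected):
            selected.append((name, score, nodes))
            if len(selected) == k:
                break
    return [name for name, _, _ in selected]


# Hypothetical answer trees with relevance scores and content-node sets.
answers = [
    ("t1", 0.90, {"a", "b"}),
    ("t2", 0.85, {"a", "b"}),  # same content nodes as t1
    ("t3", 0.70, {"a", "c"}),
    ("t4", 0.60, {"d", "e"}),
]
print(diversified_top_k(answers, k=2, max_sim=0.5))  # ['t1', 't3']
print(diversified_top_k(answers, k=2, max_sim=0.0))  # ['t1', 't4']
```

Lowering `max_sim` forces more diverse content nodes at the cost of relevance; a greedy pass like this gives no optimality guarantee for the selected set.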

Design of SGML Document Storage Management System using GROVE (GROVE를 이용한 SGML 문서 저장 관리 시스템 설계)

  • 정회경;안성옥;오일덕
    • The Journal of Information Technology
    • /
    • v.2 no.2
    • /
    • pp.269-279
    • /
    • 1999
  • SGML (Standard Generalized Markup Language) is a documentation standard for creating and interchanging structured document information, and it is well suited to viewing, modifying, and creating electronic documents. Accordingly, a study on the efficient storage and management of very large volumes of structured SGML document information is needed. This paper proposes a data modeling design based on GROVE (Graph Representation Of property ValuEs), defined in HyTime (Hypermedia/Time-based Structuring Language), and describes the design of an SGML document storage management system.


On XML Data Processing through Implementing A Deductive and Object-oriented Database Language (연역 객체 지향 데이터베이스 언어 구현을 통한 XML 데이터 처리에 관한 연구)

  • Kim, Seong-Gyu
    • The KIPS Transactions:PartD
    • /
    • v.9D no.6
    • /
    • pp.991-998
    • /
    • 2002
  • With the advent of XML and database languages armed with the object-oriented concept and deductive logic, the problem of efficient query processing for them has become a major issue. We describe a way of processing semi-structured XML data through an implementation of a Deductive and Object-oriented Database (DOODB) language with the explanation of query processing. We have shown how to convert an XML data model to a DOODB data model. We have then presented an efficient query processing method based on Connection Graph Resolution. We also present a knowledge-based query processing method that uses the homomorphism of objects in the database and the associative rule of substitutions.

A Method for Non-redundant Keyword Search over Graph Data (그래프 데이터에 대한 비-중복적 키워드 검색 방법)

  • Park, Chang-Sup
    • The Journal of the Korea Contents Association
    • /
    • v.16 no.6
    • /
    • pp.205-214
    • /
    • 2016
  • As large amounts of graph-structured data are used in applications such as social networks, the Semantic Web, and bioinformatics, keyword-based search over graph data has been attracting a lot of attention. In this paper, we propose an efficient method for keyword search over graph data that finds a set of top-k answers that are relevant as well as non-redundant in structure. We define a non-redundant answer structure for a keyword query and a relevance measure for an answer. We suggest a new indexing scheme on the relevant paths between nodes and keyword terms in the graph, and propose a query processing algorithm that finds top-k non-redundant answers efficiently by exploiting the pre-calculated indexes. An experiment on a real dataset demonstrates the effectiveness and efficiency of the proposed approach compared to the previous method.

Efficient Query Retrieval from Social Data in Neo4j using LIndex

  • Mathew, Anita Brigit
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.5
    • /
    • pp.2211-2232
    • /
    • 2018
  • Unstructured and semi-structured big data in social networks poses new challenges for query retrieval. Meeting them requires measures that improve retrieval time, such as indexing. Because of the huge volume of stored data, efficient index algorithms are needed to speed up query processing. However, conventional algorithms fail to index the huge amount of frequently arriving information in real time and fall short of providing a scalable indexing service. In this paper, a new algorithm, LIndex, a heuristic on top of Lucene, is built on the Neo4j HA architecture that holds the social network big data. LIndex is a flexible and simplified adaptive indexing scheme that uses decomposed shortest paths around term neighbors as its basic indexing unit. The new index proves to be effective at pruning the query space of the graph database Neo4j and scalable in index construction and deployment. A graph query is processed and optimized beyond traditional Lucene, moving from a time-based approach to a more efficient path-based method in LIndex. The algorithm significantly reduces query fetch time without compromising the quality of the results. Experiments are conducted to confirm the efficiency of the proposed query retrieval in the Neo4j NoSQL graph database.

Design & Implementation of Extractor for Design Sequence of DB tables using Data Flow Diagrams (자료흐름도를 사용한 테이블 설계순서 추출기의 설계 및 구현)

  • Lim, Eun-Ki
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.17 no.3
    • /
    • pp.43-49
    • /
    • 2012
  • Information obtained from DFDs (Data Flow Diagrams) is very important in system maintenance, because most legacy systems were analyzed using DFDs in structured analysis. In this paper, we design and implement an extractor that derives the design sequence of database tables from DFDs. The extractor takes DFDs as input, transforms them into a directed graph, and extracts the design sequence of the DB tables. We show the practicality of the extractor by applying it to a software system in operation.
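One plausible way to read "transform DFDs into a directed graph and extract a design sequence" is as a topological ordering of the graph. The sketch below assumes exactly that; the abstract does not state which ordering method the extractor uses. It applies Kahn's algorithm to a list of data flows, where each flow `(src, dst)` is taken to mean that table `src` should be designed before table `dst`. The function name `design_sequence` and the table names are hypothetical.

```python
import heapq
from collections import defaultdict


def design_sequence(flows):
    """Return a table order in which every flow source precedes its target.

    `flows` is a list of (src, dst) pairs. Ties are broken alphabetically
    so the result is deterministic. Raises ValueError on a cyclic flow graph.
    """
    successors = defaultdict(set)
    indegree = defaultdict(int)
    nodes = set()
    for src, dst in flows:
        nodes.update((src, dst))
        if dst not in successors[src]:  # ignore duplicate flows
            successors[src].add(dst)
            indegree[dst] += 1

    # Start with tables that receive no incoming flow.
    ready = [n for n in nodes if indegree[n] == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        node = heapq.heappop(ready)
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                heapq.heappush(ready, nxt)

    if len(order) != len(nodes):
        raise ValueError("data flows contain a cycle; no design sequence exists")
    return order


# Hypothetical flows extracted from a DFD.
flows = [("customer", "order"), ("product", "order"), ("order", "invoice")]
print(design_sequence(flows))  # ['customer', 'product', 'order', 'invoice']
```

Real DFDs often contain cycles (for example, a process that both reads and updates the same store), so a practical extractor would need a policy for them; this sketch simply reports the cycle.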

Developing Graphic Interface for Efficient Online Searching and Analysis of Graph-Structured Bibliographic Big Data (그래프 구조를 갖는 서지 빅데이터의 효율적인 온라인 탐색 및 분석을 지원하는 그래픽 인터페이스 개발)

  • You, Youngseok;Park, Beomjun;Jo, Sunhwa;Lee, Suan;Kim, Jinho
    • The Journal of Bigdata
    • /
    • v.5 no.1
    • /
    • pp.77-88
    • /
    • 2020
  • Recently, much research has been done on organizing and analyzing the complex relationships of the real world represented in the form of graphs. In particular, a computer science literature database such as DBLP is a representative example of graph data, composed of papers, their authors, and the citations among papers. Because graph data is very complex in storage structure and expression, it is a very difficult task to search, analyze, and visualize large-scale bibliographic big data. In this paper, we develop a graphical user interface tool, called EEUM, which visualizes bibliographic big data in the form of graphs. EEUM provides features for browsing bibliographic big data along the connected graph structure by visually displaying the graph data, and implements search, management, and analysis of the bibliographic big data. We also show that EEUM can be conveniently used to search, explore, and analyze data by applying it to the bibliographic graph big data provided by DBLP. With EEUM, users can easily find influential authors or papers in any research field and conveniently use it as a search and analysis tool for complex bibliographic big data, for example to get an overview of all the relationships among several authors and papers.