• Title/Summary/Keyword: Graph DB

An Efficient Traversal Algorithm for Large Hypergraphs and its Applications for Graph Analysis (대용량 하이퍼그래프에 대한 효율적인 탐색 기법과 분석에의 응용)

  • Ryu, Chungmo;Seo, Junghyuk;Kim, Myoung Ho
    • KIISE Transactions on Computing Practices / v.23 no.8 / pp.492-497 / 2017
  • A hypergraph consists of a set of nodes and a set of hyperedges, each of which connects an arbitrary number of nodes. Graph traversal algorithms such as BFS and DFS are used to analyze or explore hypergraph data. However, conventional BFS and DFS do not consider the structural characteristics of hyperedges. In this paper, we propose a method that records visited edges as well as visited nodes while traversing data stored in HyperGraphDB. In the experiments, we conduct various hypergraph analyses that rely on traversal algorithms and show that our method requires fewer database accesses and less processing time than the conventional one.
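
The idea of recording visited hyperedges in addition to visited nodes can be illustrated with a minimal in-memory sketch. This is not the authors' implementation and does not use the HyperGraphDB API; the `hyperedges` dictionary, the `hyper_bfs` function, and the `edge_reads` counter (a stand-in for database accesses) are all assumptions made for illustration.

```python
from collections import deque

def hyper_bfs(hyperedges, start):
    """BFS over a hypergraph that records visited hyperedges as well as nodes.

    hyperedges: dict mapping a hyperedge id to the set of nodes it connects
    (a hypothetical in-memory stand-in for data stored in a hypergraph database).
    Returns (visit order, number of hyperedge expansions).
    """
    # Build the node -> incident hyperedges index once.
    incident = {}
    for edge_id, nodes in hyperedges.items():
        for node in nodes:
            incident.setdefault(node, []).append(edge_id)

    visited_nodes = {start}
    visited_edges = set()
    order = []
    edge_reads = 0  # stands in for database accesses (assumption)
    queue = deque([start])

    while queue:
        node = queue.popleft()
        order.append(node)
        for edge_id in incident.get(node, []):
            if edge_id in visited_edges:
                continue  # all members of this hyperedge were already enqueued
            visited_edges.add(edge_id)
            edge_reads += 1
            for neighbor in sorted(hyperedges[edge_id]):
                if neighbor not in visited_nodes:
                    visited_nodes.add(neighbor)
                    queue.append(neighbor)
    return order, edge_reads
```

Because a hyperedge is expanded only once, each one is read a single time no matter how many of its member nodes are dequeued later. A traversal that tracks only visited nodes would re-read the same hyperedge from every member.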

Graph Database Benchmarking Systems Supporting Diversity (다양성을 지원하는 그래프 데이터베이스 벤치마킹 시스템)

  • Choi, Do-Jin;Baek, Yeon-Hee;Lee, So-Min;Kim, Yun-A;Kim, Nam-Young;Choi, Jae-Young;Lee, Hyeon-Byeong;Lim, Jong-Tae;Bok, Kyoung-Soo;Song, Seok-Il;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.21 no.12 / pp.84-94 / 2021
  • Graph databases have been developed to efficiently store and query graph data, which consist of vertices and edges that express relationships between objects. Because graph database queries have very different characteristics from those of traditional NoSQL databases, benchmarking tools suited to verifying graph database performance are needed. In this paper, we propose an efficient graph database benchmarking system that supports diversity in graph inputs and queries. The proposed system uses OrientDB to conduct benchmarking for graph databases. To support the diversity of input graphs and query graphs, we use LDBC, an existing graph data generation tool. We demonstrate the feasibility and effectiveness of the proposed scheme by analyzing benchmarking results. The performance evaluation shows that the proposed system can generate customizable synthetic graph data and that benchmarking can be performed on the generated data.

Knowledge graph-based knowledge map for efficient expression and inference of associated knowledge (연관지식의 효율적인 표현 및 추론이 가능한 지식그래프 기반 지식지도)

  • Yoo, Keedong
    • Journal of Intelligence and Information Systems / v.27 no.4 / pp.49-71 / 2021
  • Users who intend to use knowledge to actively solve given problems carry out their work by exploring, across topics and in sequence, associated knowledge that is related according to certain criteria, such as content relevance. A knowledge map is a diagram or taxonomy that gives an overview of the knowledge currently managed in a knowledge base, and it supports users' knowledge exploration based on certain relationships between pieces of knowledge. A knowledge map must therefore be expressed in a networked form by linking related knowledge according to certain types of relationships, and it should be implemented with technologies or tools specialized in defining and inferring those relationships. To this end, this study proposes a methodology for developing a knowledge graph-based knowledge map using a Graph DB, which is known to be well suited to expressing and inferring relationships between entities stored in a knowledge base. The procedures of the proposed methodology are modeling graph data; creating nodes, properties, and relationships; and composing knowledge networks by combining the identified links between pieces of knowledge. Among the various Graph DBs, Neo4j is used in this study because of its high credibility and its applicability, demonstrated by a wide variety of application cases. To examine the validity of the proposed methodology, a knowledge graph-based knowledge map is implemented with the Graph DB, and a performance comparison test is performed by applying data from previous research to check whether this study's knowledge map yields the same level of performance as the previous one. The previous research built a process-based knowledge map using ontology technology, which identifies links between related knowledge based on the sequences of tasks that produce or are activated by knowledge. 
In other words, since a task is not only activated by knowledge as an input but also produces knowledge as an output, input and output knowledge are linked as a flow by the task. Also, since a business process is composed of affiliated tasks that fulfill the purpose of the process, the knowledge networks within a business process can be derived from the sequences of the tasks composing the process. Therefore, using Neo4j, the processes, tasks, and knowledge under consideration, as well as the relationships among them, are defined as nodes and relationships so that knowledge links can be identified based on the sequences of tasks. The knowledge network that results from aggregating the identified knowledge links is a knowledge map that functions as a knowledge graph, and its performance therefore needs to be tested to see whether it matches the validation results of the previous research. The performance test examines two aspects, the correctness of knowledge links and the possibility of inferring new types of knowledge: the former is examined using 7 questions, and the latter is checked by extracting two new types of knowledge. As a result, the knowledge map constructed through the proposed methodology showed the same level of performance as the previous one, and it processed knowledge definition as well as knowledge relationship inference more efficiently. Furthermore, compared with the previous research's ontology-based approach, this study's Graph DB-based approach also showed more beneficial functionality in intensively managing only the knowledge of interest, dynamically defining knowledge and relationships by reflecting various meanings from situations to purposes, agilely inferring knowledge and relationships through Cypher-based queries, and easily creating a new relationship by aggregating existing ones. 
This study's artifacts can be applied to implement a user-friendly knowledge exploration function that reflects the user's cognitive process toward associated knowledge, and they can further underpin the development of an intelligent knowledge base that expands autonomously through the discovery of new knowledge and relationships by inference. Moreover, this study has an immediate effect on implementing the networked knowledge map that is essential for contemporary users who are eagerly searching for a way to find the proper knowledge to use.
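
The way knowledge links follow from task sequences can be sketched in plain Python. This is only an illustration under stated assumptions: it uses in-memory structures instead of Neo4j and Cypher, and the `Task` class, the `knowledge_links` and `reachable` functions, and the sample task names are hypothetical, not taken from the study.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    inputs: list = field(default_factory=list)   # knowledge that activates the task
    outputs: list = field(default_factory=list)  # knowledge the task produces

def knowledge_links(process):
    """Derive (source, task, target) knowledge links from an ordered list of tasks."""
    links = []
    for task in process:
        for src in task.inputs:
            for dst in task.outputs:
                links.append((src, task.name, dst))
    return links

def reachable(links, start):
    """Knowledge reachable from `start` by following links (a simple inference)."""
    adjacency = {}
    for src, _task, dst in links:
        adjacency.setdefault(src, []).append(dst)
    seen, stack = [], [start]
    while stack:
        current = stack.pop()
        for nxt in adjacency.get(current, []):
            if nxt not in seen:
                seen.append(nxt)
                stack.append(nxt)
    return seen

# Hypothetical two-task process: the output of one task is the input of the next.
process = [
    Task("review order", ["order form"], ["credit report"]),
    Task("approve order", ["credit report"], ["approval note"]),
]
links = knowledge_links(process)
```

Here `reachable(links, "order form")` follows the chain through both tasks, which mirrors how a path query over task-mediated relationships can surface knowledge that is only indirectly associated.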

Optimized Adoption of NVM Storage by Considering Workload Characteristics

  • Kim, Jisun;Bahn, Hyokyung
    • JSTS: Journal of Semiconductor Technology and Science / v.17 no.1 / pp.1-6 / 2017
  • This paper presents an optimized adoption of NVM for the storage system of heterogeneous applications. Our analysis shows that the bulk of I/O does not occur on a single storage partition but varies significantly across application categories. In particular, journaling I/O accounts for a dominant portion of total I/O in DB applications like OLTP, whereas swap I/O accounts for a large portion of I/O in graph visualization applications, and file I/O accounts for a large portion in web browsers and multimedia players. Based on these observations, we argue that the performance gain from NVM is maximized not by fixing it as a specific storage partition but by varying its use according to the application. Specifically, for graph visualization, DB, and multimedia player applications, using NVM as a swap, a journal, and a file system partition, respectively, performs well. Our optimized adoption of NVM improves the storage performance by 10-61%.

Construction of web-based material DB and comparison of material properties using 3D graph (웹기반 재료 DB 구축 및 3D 그래프를 사용한 물성비교)

  • Chun D.M.;Ahn S.H.
    • Proceedings of the Korean Society of Precision Engineering Conference / 2005.06a / pp.724-727 / 2005
  • Material selection is one of the important activities in design and manufacturing. The material selected at the conceptual design stage affects the material properties of the designed part as well as the manufacturability and cost of the final product. Unfortunately, there are not many accessible material databases that can be used for design. In this research, a web-based material database was constructed. To help designers compare different materials, two-dimensional and three-dimensional graphs are provided. With these graphical tools, multi-dimensional comparison becomes possible in a more intuitive manner. To provide information on the environmental safety of materials, the database includes National Fire Protection Association publication Standard No.704. The web-based tool is available at http://fab.snu.ac.kr/matdb.

ANIDS(Advanced Network Based Intrusion Detection System) Design Using Association Rule Mining (연관법칙 마이닝(Association Rule Mining)을 이용한 ANIDS (Advanced Network Based IDS) 설계)

  • Jeong, Eun-Hee;Lee, Byung-Kwan
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.12 / pp.2287-2297 / 2007
  • The proposed ANIDS (Advanced Network Intrusion Detection System) is a network-based IDS that uses Association Rule Mining: it collects packets on the network, analyzes the associations among the packets, generates a pattern graph from the highly associated packets, and detects intrusions by using the generated pattern graph. ANIDS consists of the PMM (Packet Management Module), which collects and manages packets; the PGGM (Pattern Graph Generate Module), which generates pattern graphs; and the IDM (Intrusion Detection Module), which detects intrusions. Specifically, the PGGM uses the Apriori algorithm to find candidate packets whose association rules have support larger than $Sup_{min}$, measures the confidence of each association rule, and generates a pattern graph from the association rules whose confidence is larger than $Conf_{min}$. ANIDS reduces false positives by using the pattern graph: even before a new pattern graph is finalized, the pattern graph being generated is compared with the existing one stored in the DB, and if they are the same, the traffic is judged to be an intrusion. Therefore, the proposed system can reduce intrusion detection time and false positives and increase the intrusion detection ratio.
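
The support and confidence thresholds used by the PGGM can be made concrete with a small sketch. It is a simplification restricted to rules between pairs of items and is not the paper's implementation; the function name `pattern_graph`, the representation of packets as sets of feature strings, and the choice to treat both thresholds as inclusive are assumptions.

```python
from itertools import combinations

def pattern_graph(transactions, sup_min, conf_min):
    """Return directed edges (a, b, confidence) for rules a -> b over item pairs.

    transactions: list of sets of packet features (hypothetical representation).
    Thresholds are treated as inclusive here, which is an assumption.
    """
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Apriori pruning: only items that are frequent on their own
    # can appear in a frequent pair.
    items = sorted({item for t in transactions for item in t})
    frequent = [i for i in items if support({i}) >= sup_min]

    edges = []
    for a, b in combinations(frequent, 2):
        pair_support = support({a, b})
        if pair_support < sup_min:
            continue
        for head, tail in ((a, b), (b, a)):
            # confidence(head -> tail) = support(head and tail) / support(head)
            confidence = pair_support / support({head})
            if confidence >= conf_min:
                edges.append((head, tail, confidence))
    return edges
```

For example, with four packets in which `"SYN"` and `"port80"` each appear three times and together twice, the pair has support 0.5 and each rule direction has confidence 2/3. With `sup_min=0.5` and `conf_min=0.6`, both directed edges enter the graph.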

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
    • Journal of Internet Computing and Services / v.14 no.6 / pp.71-84 / 2013
  • Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from carrying out computer system inspection and process optimization to providing customized user optimization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, the realization of flexible storage expansion functions for processing a massive amount of unstructured log data and executing a considerable number of functions to categorize and analyze the stored unstructured log data is difficult in existing computer environments. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for processing unstructured log data that are difficult to process using the existing computing infrastructure's analysis tools and management system. The proposed system uses the IaaS (Infrastructure as a Service) cloud environment to provide a flexible expansion of computing resources and includes the ability to flexibly expand resources such as storage space and memory under conditions such as extended storage or rapid increase in log data. Moreover, to overcome the processing limits of the existing analysis tool when a real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. 
Furthermore, because HDFS (Hadoop Distributed File System) stores data by replicating the blocks of the aggregated log data, the proposed system offers automatic restore functions that let the system continue operating after it recovers from a malfunction. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as MySQL have complex schemas that are inappropriate for processing unstructured log data. Further, strict schemas like those of relational databases make it difficult to add nodes when the stored data must be distributed across multiple nodes as the amount of data rapidly increases. NoSQL does not provide the complex computations that relational databases may provide, but it can easily expand the database through node dispersion when the amount of data increases rapidly; it is a non-relational database with a structure appropriate for processing unstructured data. NoSQL data models are usually classified as key-value, column-oriented, and document-oriented types. Of these, MongoDB, a representative document-oriented database with a schema-free structure, is used in the proposed system. MongoDB is introduced to the proposed system because it makes it easy to process unstructured log data through a flexible schema structure, facilitates flexible node expansion when the amount of data is rapidly increasing, and provides an Auto-Sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module. 
When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies the data according to the type of log data and distributes them to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis of the MongoDB module, the Hadoop-based analysis module, and the MySQL module per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require real-time analysis are stored in the MySQL module and provided in real time by the log graph generator module. The log data aggregated per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are processed in a parallel-distributed manner by the Hadoop-based analysis module. A comparative evaluation of log data insertion and query performance is carried out against a log data processing system that uses only MySQL; this evaluation demonstrates the proposed system's superiority. Moreover, an optimal chunk size is identified through the log data insert performance evaluation of MongoDB for various chunk sizes.
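
The routing performed by the log collector module might look roughly like the sketch below. It is an assumption-laden illustration, not the system described: plain Python lists stand in for the MySQL and MongoDB modules, the log type names in `REALTIME_TYPES` are invented, and one minute is assumed as the aggregation unit.

```python
from collections import Counter

# Hypothetical log types that need real-time analysis (assumption).
REALTIME_TYPES = {"transaction_error", "login_failure"}

def collect(records, realtime_store, aggregate_store):
    """Route real-time records to one store and per-minute counts to the other."""
    counts = Counter()
    for record in records:
        if record["type"] in REALTIME_TYPES:
            # Stands in for the MySQL module, which serves real-time queries.
            realtime_store.append(record)
        else:
            # Aggregate per unit time; one minute is an assumed unit.
            minute = record["timestamp"] // 60
            counts[(minute, record["type"])] += 1
    for (minute, log_type), count in sorted(counts.items()):
        # Schema-free documents stand in for the MongoDB module.
        aggregate_store.append({"minute": minute, "type": log_type, "count": count})
```

Keeping the two paths separate reflects the split in the abstract: records needed immediately go to the store queried in real time, while the rest are summarized per unit time before being stored as documents.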

Enterprise Network Weather Map System using SNMP (SNMP를 이용한 엔터프라이즈 Network Weather Map 시스템)

  • Kim, Myung-Sup;Kim, Sung-Yun;Park, Jun-Sang;Choi, Kyung-Jun
    • The KIPS Transactions:PartC / v.15C no.2 / pp.93-102 / 2008
  • The network weather map and the bandwidth time-series graph are widely used to understand the current and past traffic conditions of NSP, ISP, and enterprise networks. These systems collect traffic performance data from an SNMP agent running on network devices such as routers and switches, store the gathered information in a DB, and display the network performance status in the form of a time-series graph or a network weather map through a Web user interface. Most current enterprise networks are constructed as a hierarchical tree-like structure with multi-Gbps Ethernet links, which is quite different from the structure of national or worldwide backbone networks. This paper focuses on the network weather map for current enterprise networks. We start with the points to consider in developing a network weather map system suitable for an enterprise network. Based on these considerations, this paper proposes the best way of using SNMP in constructing a network weather map system. To prove our idea, we designed and developed a network weather map system for our campus network, which is also described in detail.
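
A weather map of this kind typically derives link utilization from two successive readings of an interface octet counter, such as `ifInOctets` (32-bit) or `ifHCInOctets` (64-bit) from the IF-MIB. The sketch below shows only that arithmetic and is not the paper's system: the function names and the color thresholds are assumptions, and no SNMP library is involved.

```python
def link_utilization(octets_prev, octets_now, interval_s, link_bps, counter_bits=32):
    """Fraction of link capacity used between two octet-counter samples."""
    modulus = 1 << counter_bits
    # Modular subtraction handles a single counter wrap between samples.
    delta_octets = (octets_now - octets_prev) % modulus
    return (delta_octets * 8) / (interval_s * link_bps)

def weather_color(utilization):
    """Map utilization to a display color (thresholds are illustrative only)."""
    if utilization < 0.3:
        return "green"
    if utilization < 0.7:
        return "yellow"
    return "red"
```

On multi-Gbps links a 32-bit octet counter can wrap within seconds, so passing `counter_bits=64` with the high-capacity counters is the safer choice when the device supports them.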

An Improved Depth-Based TDMA Scheduling Algorithm for Industrial WSNs to Reduce End-to-end Delay (산업 무선 센서 네트워크에서 종단 간 지연시간 감소를 위한 향상된 깊이 기반 TDMA 스케줄링 개선 기법)

  • Lee, Hwakyung;Chung, Sang-Hwa;Jung, Ik-Joo
    • Journal of KIISE / v.42 no.4 / pp.530-540 / 2015
  • Industrial WSNs need high performance and reliable communication. In industrial WSNs, a cluster structure reduces the cost of forming a network, and a reservation-based MAC is a more powerful and reliable protocol than a contention-based MAC. Depth-based TDMA assigns time slots to each sensor node in a cluster-based network and works in a distributed manner. DB-TDMA is a type of depth-based TDMA that guarantees scalability and energy efficiency. However, it cannot allocate time slots in parallel and cannot perfectly avoid collisions because each node does not know the total network information. In this paper, we suggest an improved distributed algorithm to reduce the end-to-end delay of DB-TDMA, and the proposed algorithm is compared with DRAND and DB-TDMA.

Morpheme Graph Generation with HMM based Continuous Speech Recognition (HMM에 기반한 연속음성인식에서의 형태소 그래프 생성)

  • Choi, Joon-Ki;Lee, Geun-Bae;Lee, Jong-Hyeok
    • Annual Conference on Human and Language Technology / 1997.10a / pp.500-504 / 1997
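  • In this paper, we define a morpheme graph and use it both as the output of Korean continuous speech recognition and as a knowledge representation for Korean natural language processing. We also introduce the Tree-Trellis search algorithm as an algorithm for efficiently generating morpheme graphs during continuous speech recognition. The Korean continuous speech recognizer uses an HMM recognizer, and the search algorithm likewise presupposes the use of an HMM phoneme recognizer. As the experimental DB, we used a 3,000-word trade consultation DB produced by the Communications Laboratory at KAIST.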
