• Title/Summary/Keyword: graph structure

NOGSEC: A NOnparametric method for Genome SEquence Clustering (녹섹(NOGSEC): A NOnparametric method for Genome SEquence Clustering)

  • 이영복;김판규;조환규
    • Korean Journal of Microbiology / v.39 no.2 / pp.67-75 / 2003
  • A major topic in comparative genomics is predicting functional annotation by classifying protein sequences. Computational approaches to function prediction include protein structure prediction, sequence alignment, and domain or binding-site prediction. This paper takes another computational approach: searching for sets of homologous sequences in a sequence similarity graph. Methods based on similarity graphs need no prior knowledge about the sequences, but they depend heavily on the researcher's subjective threshold settings. We propose a genome sequence clustering method based on iterative testing and graph decomposition, together with a simple method for calculating a strict threshold that has biochemical meaning. The proposed method was applied to known bacterial genome sequences, and the results are presented alongside those of the BAG algorithm. The resulting clusters are somewhat incomplete, but their confidence level is very high, and the method requires no user-defined thresholds.
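
To make the threshold dependence mentioned in the abstract concrete, here is a minimal Python sketch of conventional similarity-graph clustering: edges below a user-chosen cutoff are dropped and the connected components become clusters. This is not the NOGSEC method, whose iterative testing and threshold calculation are not detailed in the abstract. The function name `cluster_by_threshold` and the scores are hypothetical and invented for illustration.

```python
from collections import defaultdict

def cluster_by_threshold(similarities, threshold):
    """Group sequence IDs into connected components of the similarity graph,
    keeping only edges whose score is at least `threshold`."""
    adjacency = defaultdict(set)
    nodes = set()
    for (a, b), score in similarities.items():
        nodes.update((a, b))
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    clusters, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters

# Hypothetical pairwise similarity scores (illustration only, not real data).
scores = {("p1", "p2"): 0.9, ("p2", "p3"): 0.8, ("p3", "p4"): 0.2, ("p4", "p5"): 0.7}
clusters = cluster_by_threshold(scores, threshold=0.5)
```

Changing `threshold` changes the clusters, which is the subjectivity that a method without user-defined thresholds aims to avoid.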

A Visual Concurrent Programming Based on Extended State Transition Graph (확장 상태 전이 그래프에 기반을 둔 시각 병렬 프로그래밍)

  • Chung, Won-Ho;Hur, Hye-Jung
    • The Transactions of the Korea Information Processing Society / v.7 no.8 / pp.2430-2441 / 2000
  • A visual concurrent programming environment called ESTGVP is designed and implemented. It is easy to understand, highly portable, and able to represent parallel behaviors. For this purpose, a conventional state transition graph is extended to support both synchronous and asynchronous parallel operations; we call the result the extended state transition graph (ESTG). ESTGVP combines the ESTG with text for programming, which makes it easy to program both sequential and parallel behaviors. Because ESTGVP is a graph-based visual programming environment, the control structure of a program is also easy to understand. ESTGVP is written in Tcl and is therefore highly portable across operating systems. It consists of three major components: editing, transformation, and execution. If necessary, an ESTG can be transformed into C or Tcl, and its execution is based on Tcl.

Application Plan of Graph Databases in the Big Data Environment (빅데이터환경에서의 그래프데이터베이스 활용방안)

  • Park, Sungbum;Lee, Sangwon;Ahn, Hyunsup;Jung, In-Hwan
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2013.10a / pp.247-249 / 2013
  • Although relational databases are widely used in many enterprises, they do not manage the relations among entities effectively or efficiently. Analyzing Big Data requires expressing the various relations among entities in graph form. In this paper, we define graph databases and their structure. We then examine their characteristics, such as transactions, consistency, availability, retrieval functions, and expandability. We also identify subjects that are appropriate or inappropriate for the application of graph databases.

Processing of Multiple Regular Path Expressions using PID (경로 식별자를 이용한 다중 정규경로 처리기법)

  • Kim, Jong-Ik;Jeong, Tae-Seon;Kim, Hyeong-Ju
    • Journal of KIISE:Databases / v.29 no.4 / pp.274-284 / 2002
  • Queries on XML are based on paths in the data graph, which is represented as an edge-labeled graph model. All proposed query languages for XML express queries using regular expressions to traverse arbitrary paths in the data graph. A meaningful query usually contains several regular path expressions, but much recent research is concerned with optimizing a single path expression. In this paper, we present an efficient technique for processing multiple path expressions in a query. We developed a data structure called the path identifier (PID) to determine whether two given nodes lie on the same path in the data graph, and used the PID for efficient processing of multiple path expressions. We implement our technique and present preliminary performance results.
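
As a rough illustration of a regular path expression over an edge-labeled data graph, the Python sketch below enumerates label paths from a root and keeps the nodes whose path matches a regular expression. It is a naive baseline that evaluates each expression independently, not the PID technique, whose encoding is not described in the abstract. The function `match_paths`, the dot-joined path notation, and the sample graph are all assumptions made for this example.

```python
import re

def match_paths(graph, root, pattern, max_depth=10):
    """Return nodes reachable from `root` along a label path that fully
    matches `pattern`. A label path is written as dot-joined edge labels,
    e.g. "book.author.name". `max_depth` bounds traversal if cycles exist."""
    regex = re.compile(pattern)
    matches = []
    stack = [(root, [])]
    while stack:
        node, labels = stack.pop()
        if labels and regex.fullmatch(".".join(labels)):
            matches.append(node)
        if len(labels) < max_depth:
            for label, child in graph.get(node, []):
                stack.append((child, labels + [label]))
    return sorted(matches)

# Hypothetical edge-labeled data graph: node -> list of (edge label, child).
graph = {
    "root": [("book", "b1"), ("article", "a1")],
    "b1": [("title", "t1"), ("author", "au1")],
    "a1": [("author", "au2")],
    "au1": [("name", "n1")],
    "au2": [("name", "n2")],
}
authors = match_paths(graph, "root", r"(book|article)\.author")
names = match_paths(graph, "root", r".*\.name")
```

A query that needs both `authors` and `names` traverses the graph twice here; deciding cheaply whether two matched nodes lie on the same path is the kind of question the abstract says the PID is meant to answer.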

Efficient Construction of Over-approximated CFG on Esterel (Esterel에서 근사-제어 흐름그래프의 효율적인 생성)

  • Kim, Chul-Joo;Yun, Jeong-Han;Seo, Sun-Ae;Choe, Kwang-Moo;Han, Tai-Sook
    • Journal of KIISE:Computing Practices and Letters / v.15 no.11 / pp.876-880 / 2009
  • A control flow graph (CFG) is an essential data structure for program analyses based on graph theory and for control-flow and data-flow analyses. Esterel is an imperative synchronous language, and its synchronous parallelism makes it difficult to construct a CFG of an Esterel program. In this work, we present a method for constructing over-approximated CFGs for Esterel. Our method is intuitive, and the generated CFGs include not only exposed paths but also invisible ones. Although the CFGs may contain some inexecutable paths because of complex combinations of parallelism and exception handling, they are very useful for other program analyses.

Malware Detection with Directed Cyclic Graph and Weight Merging

  • Li, Shanxi;Zhou, Qingguo;Wei, Wei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.9 / pp.3258-3273 / 2021
  • Malware is a severe threat to computing systems, and there is a long history of battle between malware detection and anti-detection. Most traditional detection methods rely on static analysis with signature matching or on dynamic analysis focused on sensitive behaviors. However, these methods have only limited effect as malware evolves, so feature sets must be updated manually. Moreover, most of these methods match target samples against a general feature database, which ignores the characteristics of the sample itself. In this paper, we propose a new malware detection method that combines the features of a single sample with the general features of malware. First, a Directed Cyclic Graph (DCG) structure is adopted to extract features from samples. Then the sensitivity of each API call is computed with a Markov chain. Afterward, the graph is merged with the chain to obtain the final features. Finally, detectors based on machine learning or deep learning are devised for identification. Several experiments were conducted to evaluate the effectiveness and robustness of the approach. The results show that the proposed method performed well in most tests and remained stable as malware developed and grew.
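
To illustrate how a directed graph of API calls can carry Markov-chain weights, the following Python sketch builds edges between consecutive calls in a trace and assigns each edge its first-order transition probability. It is only a generic estimate under simplifying assumptions: the paper's actual sensitivity computation and weight-merging step are not specified in the abstract. The function `transition_probabilities` and the trace are hypothetical.

```python
from collections import Counter, defaultdict

def transition_probabilities(call_sequence):
    """Build a directed graph over API calls and weight each edge by the
    first-order Markov transition probability estimated from the sequence."""
    # Count how often each call is immediately followed by another call.
    edge_counts = Counter(zip(call_sequence, call_sequence[1:]))
    out_totals = defaultdict(int)
    for (src, _dst), count in edge_counts.items():
        out_totals[src] += count
    # Normalize each edge count by the total outgoing count of its source.
    return {
        (src, dst): count / out_totals[src]
        for (src, dst), count in edge_counts.items()
    }

# Hypothetical API call trace (illustration only, not from a real sample).
trace = ["open", "read", "write", "read", "write", "close"]
probs = transition_probabilities(trace)
```

The resulting graph contains the cycle `read` to `write` and back, which is why a cyclic rather than acyclic directed graph is the natural representation for call traces with loops.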

Entity Matching Method Using Semantic Similarity and Graph Convolutional Network Techniques (의미적 유사성과 그래프 컨볼루션 네트워크 기법을 활용한 엔티티 매칭 방법)

  • Duan, Hongzhou;Lee, Yongju
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.5 / pp.801-808 / 2022
  • Research on how to embed knowledge in large-scale Linked Data and apply neural network models to entity matching is relatively scarce. The most fundamental problem is that different labels lead to lexical heterogeneity. In this paper, we propose an extended GCN (Graph Convolutional Network) model that incorporates a re-align structure to solve this lexical heterogeneity problem. The proposed model improved performance by 53% and 40% compared with the existing embedding-based MTransE and BootEA models, respectively, and by 5.1% compared with the GCN-based RDGCN model.

Knowledge Graph of Administrative Codes in Korea: The Case for Improving Data Quality and Interlinking of Public Data

  • Haklae Kim
    • Journal of Information Science Theory and Practice / v.11 no.3 / pp.43-57 / 2023
  • Government codes are created and used to streamline and standardize government administrative procedures, and they are generally employed in government information systems. Because they appear in open datasets of public data, users must be able to understand them. However, the information needed to interpret administrative codes is lost when data is released from government systems, which makes the codes difficult for data consumers to grasp and limits the linking or convergence of different datasets that use the same code. This study proposes a way to employ the administrative code produced by the Korean government as a standard in a public data environment on a regular basis. Because consumers of public data are barred from accessing government systems, a means of universal access to administrative codes is required. An ontology model is used to represent the data structure and meaning of the administrative code, and the full administrative code is built as a knowledge graph. The resulting knowledge graph is used to assess the accuracy and connectivity of administrative codes in public data. The proposed method has the potential to increase both the quality of coded information in public data and data connectivity.

Numerical Analysis in Hydrograph Determination for Sluice Gate installed Levee (배수통문이 설치된 제방의 설계수위파형결정에 관한 수치해석)

  • Kim, Jin-Man;Choi, Bong-Hyuck;Oh, Eun-Ho;Cho, Won-Beom
    • Journal of the Korean Geosynthetics Society / v.14 no.4 / pp.1-9 / 2015
  • According to national regulations and their commentaries, such as the Rivers Design Criteria & Commentary (KWRA, 2009) and the Foundation Structure Guideline and its Commentary (MLTM, 2014 and KGS, 2009), the integrity evaluation of a river levee includes slope stability evaluation of both the riverside and the protected lowland, and piping stability evaluation of the foundation and levee body under the relevant water level conditions. The design hydrograph can therefore be the most important input factor for the integrity evaluation; however, the national regulations do not provide any proper method for determining it. The authors thus carried out an integrity evaluation of a sluice gate in a levee while varying each hydrograph factor, including the rising ordinary water level, the lasting flood water level, the falling water level, and the flood frequency, in order to suggest a method for determining a reasonable hydrograph. As a result, the authors suggest that at least 57 hours of rising ordinary water level and 53 hours of lasting flood water level should be considered for the design hydrograph of a sluice gate in a levee at Mun-san-jae.

Graph Topology Design for Generating Building Database and Implementation of Pattern Matching (건물 데이터베이스 구축을 위한 그래프 토폴로지 설계 및 패턴매칭 구현)

  • Choi, Hyo-Seok;Yom, Jae-Hong;Lee, Dong-Cheon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.31 no.5 / pp.411-419 / 2013
  • Research on algorithms for building modeling, such as extracting building outlines and segmenting roof patches from aerial images or LiDAR data, is active. However, the use of information from the resulting building models is not yet well developed. This study proposes a scheme for searching for buildings of identical or similar shape by graph topology pattern matching, under the assumption that either (1) buildings were modeled beforehand using imagery or LiDAR data, or (2) 3D building data from digital maps are available. Side walls, segmented roofs, and footprints were represented as nodes, and relationships among the nodes were defined using graph topology. A topology graph database was generated, and pattern matching was performed with buildings of various shapes. The results demonstrate the efficiency of the proposed method in terms of matching reliability and database structure. In addition, flexibility in the search was achieved by altering the conditions for pattern matching. Furthermore, the topology graph representation could be used as a scale- and rotation-invariant shape descriptor.
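
The idea of matching buildings by topology alone can be sketched in Python as a labeled-graph isomorphism test: nodes carry a type (footprint, wall, roof) and edges record adjacency, with no coordinates involved. This is a brute-force illustration for very small graphs, not the paper's database design or matching procedure, which the abstract does not describe. The function `topology_matches` and the three sample buildings are hypothetical.

```python
from itertools import permutations

def topology_matches(nodes_a, edges_a, nodes_b, edges_b):
    """Return True if two small labeled graphs are isomorphic.
    nodes_*: dict mapping node id -> type label ("roof", "wall", "footprint").
    edges_*: iterable of (u, v) undirected adjacency pairs."""
    # Quick rejections: different node-type multisets or edge counts.
    if sorted(nodes_a.values()) != sorted(nodes_b.values()):
        return False
    edge_set_a = {frozenset(e) for e in edges_a}
    edge_set_b = {frozenset(e) for e in edges_b}
    if len(edge_set_a) != len(edge_set_b):
        return False
    ids_a, ids_b = list(nodes_a), list(nodes_b)
    # Try every assignment of B's nodes to A's nodes (factorial cost).
    for perm in permutations(ids_b):
        mapping = dict(zip(ids_a, perm))
        if any(nodes_a[n] != nodes_b[mapping[n]] for n in ids_a):
            continue
        mapped = {frozenset(mapping[n] for n in e) for e in edge_set_a}
        if mapped == edge_set_b:
            return True
    return False

# Hypothetical buildings: A and B share a topology, C does not.
nodes_a = {"f": "footprint", "w1": "wall", "w2": "wall", "r": "roof"}
edges_a = [("f", "w1"), ("f", "w2"), ("w1", "r"), ("w2", "r")]
nodes_b = {"base": "footprint", "east": "wall", "west": "wall", "top": "roof"}
edges_b = [("base", "west"), ("base", "east"), ("west", "top"), ("east", "top")]
nodes_c = {"f": "footprint", "w1": "wall", "w2": "wall", "r": "roof"}
edges_c = [("f", "w1"), ("f", "w2"), ("w1", "w2"), ("w1", "r")]

same_ab = topology_matches(nodes_a, edges_a, nodes_b, edges_b)
same_ac = topology_matches(nodes_a, edges_a, nodes_c, edges_c)
```

Because only node types and adjacency are compared, the test is unaffected by a building's size or orientation, which is consistent with the abstract's remark about a scale- and rotation-invariant descriptor. A production system would use a graph database query or a dedicated subgraph-matching algorithm instead of enumerating permutations.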