• Title/Summary/Keyword: biological data exploration (생물학적 데이터 탐색)


Prediction of Protein Interactions using the Associative Feature Concept Space Mapping (연관속성개념공간으로의 사상을 이용한 단백질 상호작용 예측)

  • Eom Jae-Hong;Zhang Byoung-Tak
    • Proceedings of the Korean Information Science Society Conference / 2006.06a / pp.73-75 / 2006
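  • Proteins are the basic units that carry out important biological functions in living organisms, and extensive research on proteins and their interactions has produced protein interaction databases for a variety of organisms. This paper proposes a method for predicting new protein interactions using publicly available protein interaction data for yeast. We extend the associative concept space search method, originally proposed for efficiently finding associated information in the literature, and apply it to protein interaction prediction. Each protein is treated as a vector of its various features, and an interaction is represented as arising from the association between the proteins involved. The features of two interacting proteins are treated like co-occurring words: a protein interaction is represented by the elements of the two protein vectors, and to express the associations among those element features, the vectors are mapped into an associative feature concept space, where associated features are extracted based on distance in that space. Interactions are then predicted between the proteins that contain the largest number of extracted associated features. The proposed method achieved an average prediction accuracy of about 91.8% on yeast protein interaction prediction, confirming that the associative feature concept space approach can serve as another alternative for predicting protein interactions.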
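The co-occurrence idea in this abstract can be illustrated with a minimal sketch. It is not the authors' concept-space mapping: it swaps in raw co-occurrence counts for the paper's distance-based extraction, and the protein names, feature labels, and function names below are all hypothetical.

```python
from collections import Counter
from itertools import product


def cooccurrence_counts(interactions, features):
    """Count how often a feature of one protein co-occurs with a feature of its partner."""
    counts = Counter()
    for p, q in interactions:
        for a, b in product(features[p], features[q]):
            # Sort the pair so (a, b) and (b, a) share one counter entry.
            counts[tuple(sorted((a, b)))] += 1
    return counts


def score_pair(p, q, features, counts):
    """Score a candidate pair by summing the counts of its cross-protein feature pairs."""
    return sum(
        counts[tuple(sorted((a, b)))]
        for a, b in product(features[p], features[q])
    )


# Hypothetical toy data: each protein is a set of feature labels.
features = {
    "P1": {"kinase", "nucleus"},
    "P2": {"phosphatase", "nucleus"},
    "P3": {"kinase", "membrane"},
    "P4": {"phosphatase", "cytoplasm"},
}
known_interactions = [("P1", "P2")]

counts = cooccurrence_counts(known_interactions, features)
candidate_score = score_pair("P3", "P4", features, counts)
```

A higher score means the candidate pair shares more feature associations with known interacting pairs; a faithful implementation would replace the raw counts with distances in the mapped concept space.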


Integrated Model Design of Microarray Data Using miRNA, PPI, Disease Information (miRNA, PPI, 질병 정보를 이용한 마이크로어레이 데이터 통합 모델 설계)

  • Ha, Kyung-Sik;Lim, Jin-Muk;Kim, Hong-Gee
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.6 / pp.786-792 / 2012
  • A microarray is a collection of thousands of DNA or RNA molecules arranged on a substrate, and it lets researchers survey gene expression on a large scale. In practice, however, researchers rely on their own experimental designs to focus on particular phenotypes within the available mass of data. In this paper, we use microRNA (miRNA) and protein-protein interaction (PPI) databases to enrich and extend the meaning of microarray data. The extended data are then linked with Online Mendelian Inheritance in Man (OMIM) and the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10), in order to extract common genetic relationships between diseases. We expect this approach to provide new biological perspectives.

Least Square Prediction Error Expansion Based Reversible Watermarking for DNA Sequence (최소자승 예측오차 확장 기반 가역성 DNA 워터마킹)

  • Lee, Suk-Hwan;Kwon, Seong-Geun;Kwon, Ki-Ryong
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.11 / pp.66-78 / 2015
  • With the development of biocomputing technology, DNA watermarking, which uses DNA as an information medium, has recently been studied. Unlike multimedia data, however, DNA information is critical to biological function, so reversible DNA watermarking is required so that the host DNA information can be perfectly recovered. This paper presents a reversible DNA watermarking method for noncoding DNA sequences using least-squares-based prediction error expansion. The method has three features. First, it encodes the character string (A, T, C, G) of nucleotide bases in a noncoding region into integer code values by grouping n nucleotide bases. Second, it expands the least-squares (LS) prediction error by as many bits as are expandable. Third, it prevents false start codons through a comparison search over adjacent watermarked code values. Experimental results verified that the method has a higher embedding capacity than conventional methods and the mean prediction method, and that it prevents false start codons and preserves amino acids.
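The first step, grouping n nucleotide bases into one integer code value, can be sketched as a base-4 encoding. The abstract does not state which digit each base receives, so the mapping A=0, T=1, C=2, G=3 and the function names below are assumptions made for illustration.

```python
# Assumed base-to-digit mapping; the paper may use a different assignment.
BASE_TO_DIGIT = {"A": 0, "T": 1, "C": 2, "G": 3}
DIGIT_TO_BASE = {digit: base for base, digit in BASE_TO_DIGIT.items()}


def encode_groups(seq, n):
    """Encode a nucleotide string as integers, one base-4 value per group of n bases."""
    if len(seq) % n != 0:
        raise ValueError("sequence length must be a multiple of n")
    codes = []
    for start in range(0, len(seq), n):
        value = 0
        for base in seq[start:start + n]:
            value = value * 4 + BASE_TO_DIGIT[base]
        codes.append(value)
    return codes


def decode_groups(codes, n):
    """Invert encode_groups: turn each integer back into its n-base group."""
    groups = []
    for value in codes:
        bases = []
        for _ in range(n):
            value, digit = divmod(value, 4)
            bases.append(DIGIT_TO_BASE[digit])
        groups.append("".join(reversed(bases)))
    return "".join(groups)


codes = encode_groups("ATGC", 2)      # "AT" -> 1, "GC" -> 14
restored = decode_groups(codes, 2)    # back to "ATGC"
```

Because each group of n bases maps to exactly one integer in the range 0 to 4^n - 1, the encoding is lossless, which is a precondition for any reversible scheme built on top of it.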

Design and Implementation of Red Tide Monitoring System using Wireless Sensor Network (무선 센서 네트워크를 이용한 적조 모니터링 시스템의 설계 및 구현)

  • Heo, Min;Yim, Jae-Hong;Kim, Byoung-Chan
    • Journal of Navigation and Port Research / v.31 no.3 s.119 / pp.263-269 / 2007
  • Outbreaks of red tide were sporadic in the South Sea until 1994, but since 1995 they have become frequent and widespread across the coastal waters of the South Sea and East Sea. Many techniques, such as remote sensing, GIS, and fuzzy model systems, have been developed and applied to red tide monitoring. The purpose of this paper is to develop a red tide monitoring system that collects red tide data and biological oceanography parameters using a wireless sensor network. Wireless sensor networks have attracted attention as a core technology for realizing ubiquitous computing. In this paper, we design a red tide database using a wireless sensor network and propose red tide monitoring software and a web service for general users and biological oceanographers.

Microarray data analysis using relative hierarchical clustering (상대적 계층적 군집 방법을 이용한 마이크로어레이 자료의 군집분석)

  • Woo, Sook Young;Lee, Jae Won;Jhun, Myoungshic
    • Journal of the Korean Data and Information Science Society / v.25 no.5 / pp.999-1009 / 2014
  • Hierarchical clustering analysis makes it easy to explore massive microarray data and to understand biological phenomena through a dendrogram. However, because hierarchical clustering algorithms consider only absolute similarity, it is difficult for them to capture relative dissimilarity, which considers not only the distance between a pair of clusters but also how distant they are from the rest of the clusters. In this study, we introduce the relative hierarchical clustering method proposed by Mollineda and Vidal (2000) and compare the hierarchical clustering method with the relative hierarchical method using simulated and real data in various situations. The quality of the two hierarchical methods was evaluated using the percentage of incorrectly grouped points (PIGP), homogeneity, and separation.

De Novo Drug Design Using Self-Attention Based Variational Autoencoder (Self-Attention 기반의 변분 오토인코더를 활용한 신약 디자인)

  • Piao, Shengmin;Choi, Jonghwan;Seo, Sangmin;Kim, Kyeonghun;Park, Sanghyun
    • KIPS Transactions on Software and Data Engineering / v.11 no.1 / pp.11-18 / 2022
  • De novo drug design is the process of developing new drugs that can interact with biological targets such as protein receptors. The traditional process of de novo drug design consists of drug candidate discovery and drug development, but it takes more than 10 years to develop a new drug. Deep learning-based methods are being studied to shorten this period and to find chemical compounds for new drug candidates efficiently. Many existing deep learning-based drug design models use recurrent neural networks to generate chemical entities represented as SMILES strings, but recurrent networks have disadvantages, such as slow training and a poor grasp of complex molecular formula rules, which leaves room for improvement. To overcome these shortcomings, we propose a deep learning model for SMILES string generation using a variational autoencoder with a self-attention mechanism. Our proposed model decreased the training time by 1/26 compared to the latest drug design model, and it also generated valid SMILES strings more effectively.

Characterization of the Alzheimer's disease-related network based on the dynamic network approach (동적인 개념을 적용한 알츠하이머 질병 네트워크의 특성 분석)

  • Kim, Man-Sun;Kim, Jeong-Rae
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.6 / pp.529-535 / 2015
  • Biological networks have traditionally been treated as static. However, life phenomena in cells depend on the cellular state and the external environment, and only a few proteins and their interactions are selectively activated at any time. We should therefore adopt a dynamic network concept in which the structure of a biological network varies over time. This concept is effective for analyzing the progressive transition of a disease. In this paper, we apply the proposed method to Alzheimer's disease to analyze the structural and functional characteristics of the disease network. Using gene expression data and protein-protein interaction data, we construct sub-networks corresponding to the stages of disease progression (normal, early, middle, and late). Based on these, we analyze the structural properties of the network. Furthermore, we identify module structures in the network and analyze the functional properties of the sub-networks using Gene Ontology (GO) analysis. The results show that the functional characteristics of the dynamic network are consistent with the stage of the disease, which indicates that the network can be used to describe important biological events of the disease. With the proposed approach, it is possible to observe changes in the molecular network involved in disease progression, which are not generally investigated, and to understand the pathogenesis and progression mechanism of the disease at the molecular level.
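One way to picture stage-specific sub-networks is to keep only the PPI edges whose two endpoint genes are both active at a given stage. The abstract does not say how the sub-networks are actually derived, so the thresholding rule, the gene identifiers, and the expression values below are assumptions for illustration only.

```python
def stage_subnetwork(ppi_edges, expression, threshold):
    """Keep the PPI edges whose two endpoint genes both reach the expression threshold."""
    active = {gene for gene, level in expression.items() if level >= threshold}
    return {(a, b) for a, b in ppi_edges if a in active and b in active}


# Hypothetical PPI edges and per-stage expression levels.
ppi_edges = {("G1", "G2"), ("G2", "G3"), ("G3", "G4")}
expression_by_stage = {
    "normal": {"G1": 0.9, "G2": 0.8, "G3": 0.1, "G4": 0.2},
    "early":  {"G1": 0.9, "G2": 0.7, "G3": 0.6, "G4": 0.2},
    "late":   {"G1": 0.1, "G2": 0.2, "G3": 0.8, "G4": 0.9},
}

subnetworks = {
    stage: stage_subnetwork(ppi_edges, levels, threshold=0.5)
    for stage, levels in expression_by_stage.items()
}
```

Comparing the edge sets across stages shows which interactions appear or disappear as the disease progresses, which is the kind of structural change the dynamic network view is meant to expose.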

Consecutive Difference Expansion Based Reversible DNA Watermarking (연속적 차분 확장 기반 가역 DNA 워터마킹)

  • Lee, Suk-Hwan;Kwon, Ki-Ryong
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.7 / pp.51-62 / 2015
  • With growing interest in high-capacity DNA storage, DNA watermarking for DNA copyright protection, and DNA steganography for secret DNA communication, reversible DNA watermarking is much needed both to embed a watermark without changing the functionality of the organism and to recover the host DNA sequence perfectly. In this paper, we present two difference expansion (DE) based reversible DNA watermarking methods that use noncoding DNA sequences. Reversible DNA watermarking should take into account the string structure of a DNA sequence, the functionality of the organism, perfect recovery, and high embedding capacity. We convert the string of four characters in a noncoding region into decimal code values and embed the watermark bits into the code values in two ways: DE-based multiple bits embedding (DE-MBE), which uses pairs of neighboring code values, and consecutive DE-MBE (C-DE-MBE). Both methods perform a comparison search to prevent false start codons, which would produce false coding regions. Experimental results verified that the two methods have a higher embedding capacity than conventional methods, produce no false start codons, and perfectly recover the host sequence without the reference sequence. In particular, C-DE-MBE can embed twice as much as DE-MBE.
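The reversibility that difference expansion provides can be seen in the classic single-bit scheme on one pair of integers. This is a minimal sketch of that textbook form, not the authors' DE-MBE or C-DE-MBE; the function names are hypothetical, and the check that the expanded values stay inside the valid code range is omitted.

```python
def de_embed(x, y, bit):
    """Embed one bit into the pair (x, y) by expanding their difference."""
    average = (x + y) // 2           # integer average, unchanged by embedding
    difference = x - y
    expanded = 2 * difference + bit  # shift the difference left and append the bit
    return average + (expanded + 1) // 2, average - expanded // 2


def de_extract(x_marked, y_marked):
    """Recover the original pair and the embedded bit from a marked pair."""
    average = (x_marked + y_marked) // 2
    expanded = x_marked - y_marked
    bit = expanded % 2               # the appended bit is the lowest bit
    difference = expanded // 2       # floor division undoes the shift
    x = average + (difference + 1) // 2
    y = average - difference // 2
    return x, y, bit


marked = de_embed(10, 7, 1)
recovered = de_extract(*marked)
```

The integer average survives embedding unchanged, which is why both the bit and the original pair can be recovered exactly without a reference sequence.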