• Title/Summary/Keyword: protein-protein Interaction prediction

A New Approach to Find Orthologous Proteins Using Sequence and Protein-Protein Interaction Similarity

  • Kim, Min-Kyung;Seol, Young-Joo;Park, Hyun-Seok;Jang, Seung-Hwan;Shin, Hang-Cheol;Cho, Kwang-Hwi
    • Genomics & Informatics / v.7 no.3 / pp.141-147 / 2009
  • Existing proteome-scale ortholog and paralog prediction methods are mainly based on sequence similarity. However, the closest BLAST hit is often not the closest neighbor. For this reason, we added conserved interaction information to find orthologs. We propose a genome-scale, automated ortholog prediction method named OrthoInterBlast, which is based on both sequence and interaction similarity. When we applied this method to fly and yeast, 17% of the ortholog candidates differed from the results of Inparanoid. By adding protein-protein interaction information, proteins with low sequence similarity, which cannot easily be detected by sequence homology alone, can still be selected as orthologs.

Prediction of Protein-Protein Interaction Sites Based on 3D Surface Patches Using SVM (SVM 모델을 이용한 3차원 패치 기반 단백질 상호작용 사이트 예측기법)

  • Park, Sung-Hee;Hansen, Bjorn
    • The KIPS Transactions:PartD / v.19D no.1 / pp.21-28 / 2012
  • Prediction of protein interaction sites for monomer structures can reduce the search space for protein docking and is regarded as very significant for predicting unknown functions of proteins from interacting proteins whose functions are known. On the other hand, the prediction of interaction sites has been limited by the difficulty of crystallizing weakly interacting complexes, which are transient and do not form complexes stable enough for obtaining experimental structures by crystallization or even NMR for the most important protein-protein interactions. This work reports the calculation of 3D surface patches of complex structures and their properties, and a machine learning approach that uses a support vector machine (SVM) to build a predictive model for 3D surface patches in interaction and non-interaction sites. To overcome classification problems with class-imbalanced data, we employed an under-sampling technique. Nine properties of the patches were calculated from amino acid compositions and secondary structure elements. With 10-fold cross-validation, the SVM-based predictive model achieved an accuracy of 92.7% for classifying 3D patches in interaction and non-interaction sites from 147 complexes.
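
The under-sampling step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which under-sampling variant was used, so this assumes plain random under-sampling of the majority class; the function name `undersample`, the patch labels, and the toy data are hypothetical.

```python
import random


def undersample(samples, labels, seed=0):
    """Randomly drop majority-class samples until both classes are equal in size.

    Assumes binary labels: 1 = interaction-site patch, 0 = non-interaction patch.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Identify which class is the minority and which is the majority.
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority sample plus an equally sized random majority subset.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Toy data (made up): 3 interaction-site patches versus 9 non-interaction patches.
patches = [f"patch_{i}" for i in range(12)]
labels = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]

balanced_patches, balanced_labels = undersample(patches, labels)
print(sum(balanced_labels), len(balanced_labels))  # prints: 3 6
```

The balanced set would then be passed to the classifier; in a cross-validation setting, under-sampling is usually applied to the training folds only, so that the test folds keep the original class ratio.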

Computational approaches for prediction of protein-protein interaction between Foot-and-mouth disease virus and Sus scrofa based on RNA-Seq

  • Park, Tamina;Kang, Myung-gyun;Nah, Jinju;Ryoo, Soyoon;Wee, Sunghwan;Baek, Seung-hwa;Ku, Bokkyung;Oh, Yeonsu;Cho, Ho-seong;Park, Daeui
    • Korean Journal of Veterinary Service / v.42 no.2 / pp.73-83 / 2019
  • Foot-and-mouth disease (FMD) is a highly contagious transboundary viral disease caused by FMD virus (FMDV), which causes huge economic losses. FMDV infects cloven-hoofed (two-toed) mammals such as cattle, sheep, goats, pigs, and various wildlife species. To control FMDV, it is necessary to understand its life cycle and pathogenesis in the host. In particular, the protein-protein interactions between FMDV and the host will help to understand the survival cycle of the virus in host cells and to establish new therapeutic strategies. However, computational approaches to protein-protein interaction between FMDV and pig hosts have not been applied to studies of the onset mechanism of FMDV. In the present work, we predicted the pig proteins that interact with FMDV based on RNA-Seq data, protein sequence, and structure information. After identifying the virus-host interactions, we looked for meaningful pathways and anticipated changes in the host caused by infection with FMDV. A total of 78 pig proteins were predicted to interact with FMDV. The 156 interactions include 94 predicted by a sequence-based method and 62 predicted by a structure-based method using domain information. The protein interaction network contained integrin as well as STYK1, VTCN1, IDO1, CDH3, SLA-DQB1, FER, and FGFR2, which were related to the up-regulation of inflammation and the down-regulation of cell adhesion and host defense systems such as macrophages and leukocytes. These results provide clues to how FMDV affects the host cell.

Prediction of Implicit Protein - Protein Interaction Using Optimal Associative Feature Rule (최적 연관 속성 규칙을 이용한 비명시적 단백질 상호작용의 예측)

  • Eom, Jae-Hong;Zhang, Byoung-Tak
    • Journal of KIISE:Software and Applications / v.33 no.4 / pp.365-377 / 2006
  • Proteins are known to perform biological functions by interacting with other proteins or compounds. Since protein interaction is intrinsic to most cellular processes, prediction of protein interaction is an important issue in post-genomic biology, where abundant interaction data have been produced by many research groups. In this paper, we present an associative feature mining method to predict implicit protein-protein interactions of Saccharomyces cerevisiae from public protein interaction data. We discretized continuous-valued features using a maximal interdependence-based discretization approach. We also employed the feature dimension reduction filter (FDRF) method, which is based on information theory, to select optimal informative features, to boost prediction accuracy and overall mining speed, and to overcome the dimensionality problem of conventional data mining approaches. We used an association rule discovery algorithm for associative feature and rule mining to predict protein interaction. Using the discovered associative features, we predicted implicit protein interactions that had not been observed in the training data. According to the experimental results, the proposed method achieved about 96.5% prediction accuracy, with computation about 29.4% faster than the conventional method with no feature filter in association rule mining.
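
Association rule mining, as named in this abstract, ranks rules by support and confidence. The sketch below shows only those two standard measures on toy data; it is not the authors' algorithm, and the feature names (`same_localization`, `coexpressed`) and the transactions are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) as support(A and C) / support(A)."""
    base = support(transactions, antecedent)
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base if base else 0.0


# Each transaction holds the discretized features of one protein pair (made up),
# plus the item "interacts" when the pair is a known interaction.
transactions = [
    {"same_localization", "coexpressed", "interacts"},
    {"same_localization", "coexpressed", "interacts"},
    {"same_localization", "interacts"},
    {"coexpressed"},
    {"same_localization", "coexpressed"},
]

rule_if = {"same_localization", "coexpressed"}
rule_support = support(transactions, rule_if | {"interacts"})
rule_confidence = confidence(transactions, rule_if, {"interacts"})
print(round(rule_support, 3), round(rule_confidence, 3))  # prints: 0.4 0.667
```

A rule with high confidence could then be applied to protein pairs that share the antecedent features but have no recorded interaction, which is one way to read "implicit" interactions.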

Web-Based Computational System for Protein-Protein Interaction Inference

  • Kim, Ki-Bong
    • Journal of Information Processing Systems / v.8 no.3 / pp.459-470 / 2012
  • Recently, high-throughput technologies such as the two-hybrid system, protein chips, mass spectrometry, and phage display have furnished a large amount of data on protein-protein interactions (PPIs), but the data have not been accurate so far and the quantity has also been limited. In this respect, computational techniques for the prediction and validation of PPIs have been developed. However, existing computational methods do not take into account the fact that a PPI actually originates from the interactions of the domains that each protein contains. In this work, therefore, information on the domain modules of individual proteins has been used to find protein interaction relationships. The system developed here, WASPI (Web-based Assistant System for Protein-protein interaction Inference), has been implemented to provide functional insights into protein interactions and their domains. To achieve those objectives, several preprocessing steps were taken. First, the domain module information of interacting proteins was extracted using the InterPro database, which includes protein families, domains, and functional sites. The InterProScan program was used in this step. Second, homology comparison with GO (Gene Ontology) and COG (Clusters of Orthologous Groups), with E-values of $10^{-5}$ and $10^{-3}$, respectively, was employed to obtain information on the function and annotation of each interacting protein of a secondary PPI database in WASPI. The BLAST program was used for the homology comparison.
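
The idea that a PPI originates from interactions between the domains of each protein can be sketched as a simple lookup. This is only an illustration of the general principle, not the WASPI implementation; the function `infer_interactions`, the protein names, and the domain labels (`DomA` and so on) are hypothetical placeholders rather than real InterPro entries.

```python
from itertools import combinations, product


def infer_interactions(protein_domains, interacting_domain_pairs):
    """Predict a protein pair as interacting if any of their domains
    form a domain pair that is already known to interact."""
    # Store domain pairs as frozensets so that (A, B) and (B, A) are equivalent.
    known = {frozenset(pair) for pair in interacting_domain_pairs}
    predicted = []
    for a, b in combinations(sorted(protein_domains), 2):
        domain_pairs = product(protein_domains[a], protein_domains[b])
        if any(frozenset(dp) in known for dp in domain_pairs):
            predicted.append((a, b))
    return predicted


# Hypothetical domain annotations, e.g. as a domain scan might report them.
protein_domains = {
    "P1": {"DomA", "DomB"},
    "P2": {"DomC"},
    "P3": {"DomD"},
}
# Hypothetical set of domain pairs known to interact.
interacting_domain_pairs = [("DomC", "DomA")]

predicted_pairs = infer_interactions(protein_domains, interacting_domain_pairs)
print(predicted_pairs)  # prints: [('P1', 'P2')]
```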

Prediction of hub genes of Alzheimer's disease using a protein interaction network and functional enrichment analysis

  • Wee, Jia Jin;Kumar, Suresh
    • Genomics & Informatics / v.18 no.4 / pp.39.1-39.8 / 2020
  • Alzheimer's disease (AD) is a chronic, progressive brain disorder that slowly destroys affected individuals' memory and reasoning faculties, and consequently, their ability to perform the simplest tasks. This study investigated the hub genes of AD. Proteins interact with other proteins and non-protein molecules, and these interactions play an important role in understanding protein function. Computational methods are useful for understanding biological problems, in particular, network analyses of protein-protein interactions. Through a protein network analysis, we identified the following top 10 hub genes associated with AD: PTGER3, C3AR1, NPY, ADCY2, CXCL12, CCR5, MTNR1A, CNR2, GRM2, and CXCL8. Through gene enrichment, it was identified that most gene functions could be classified as integral to the plasma membrane, G-protein coupled receptor activity, and cell communication under gene ontology, as well as involvement in signal transduction pathways. Based on the convergent functional genomics ranking, the prioritized genes were NPY, CXCL12, CCR5, and CNR2.
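
Hub identification in a protein network is often based on node degree. The abstract does not state which centrality measure was used, so the sketch below assumes plain degree ranking; the function `top_hubs` and the gene labels `G1` to `G4` are hypothetical, and the edges are made up.

```python
from collections import Counter


def top_hubs(edges, k):
    """Return the k nodes with the highest degree in an undirected network.

    Self-loops are ignored and duplicate edges are counted once.
    """
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for edge in unique_edges:
        for node in edge:
            degree[node] += 1
    # Sort by descending degree, breaking ties alphabetically for determinism.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:k]


# Toy undirected network; ("G3", "G2") duplicates ("G2", "G3").
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G3", "G2")]

hubs = top_hubs(edges, 2)
print(hubs)  # prints: [('G1', 3), ('G2', 2)]
```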

Construction of a Protein-Protein Interaction Network for Chronic Myelocytic Leukemia and Pathway Prediction of Molecular Complexes

  • Zhou, Chao;Teng, Wen-Jing;Yang, Jing;Hu, Zhen-Bo;Wang, Cong-Cong;Qin, Bao-Ning;Lv, Qing-Liang;Liu, Ze-Wang;Sun, Chang-Gang
    • Asian Pacific Journal of Cancer Prevention / v.15 no.13 / pp.5325-5330 / 2014
  • Background: Chronic myelocytic leukemia is a disease that threatens both adults and children. Great progress has been achieved in treatment, but the protein-protein interaction networks underlying chronic myelocytic leukemia are less well known. Objective: To develop a protein-protein interaction network for chronic myelocytic leukemia based on gene expression and to predict biological pathways underlying molecular complexes in the network. Materials and Methods: Genes involved in chronic myelocytic leukemia were selected from the OMIM database. Literature mining was performed with the Agilent Literature Search plugin, and a protein-protein interaction network of chronic myelocytic leukemia was established in Cytoscape. The molecular complexes in the network were detected with the Clusterviz plugin, and pathway enrichment of the molecular complexes was performed with DAVID online. Results and Discussion: There are seventy-nine chronic myelocytic leukemia genes in the Mendelian Inheritance In Man Database. The protein-protein interaction network of chronic myelocytic leukemia contained 638 nodes, 1830 edges and perhaps 5 molecular complexes. Among them, complex 1 is involved in pathways related to cytokine secretion, cytokine-receptor binding, and cytokine receptor signaling, while complex 3 is related to the biological behavior of tumors, which can provide a bioinformatic foundation for further understanding the mechanisms of chronic myelocytic leukemia.

Expression, Purification and Characterization of the BLM binding region of human Fanconi Anemia Group J Protein

  • Yeom, Kyuho;Park, Chin-Ju
    • Journal of the Korean Magnetic Resonance Society / v.20 no.1 / pp.22-26 / 2016
  • FANCJ is a DNA helicase that contributes to genome stability by resolving G-quadruplex DNA in the 5' to 3' direction. In addition to the main ATPase helicase core, FANCJ has a protein-binding region in its C-terminal part. BRCA1 and BLM are binding partners of FANCJ, and these protein-protein interactions contribute to genomic stability and the proper response to replication stress. As a first attempt at studying the FANCJ-BLM interaction, we prepared the BLM-binding region of FANCJ and characterized it with CD and NMR spectroscopy. FANCJ (881-941) with an N-terminal 6xHis tag was purified as an oligomer. Secondary structure prediction based on CD data revealed that FANCJ (881-941) is composed of $\beta$-sheet, turn, and coil. $^1$H-$^{15}$N HSQC spectra showed nonhomogeneous peak intensities, with fewer peaks than the number of amino acids in the construct. This indicates that optimization is necessary for further detailed structural studies.

Review of Biological Network Data and Its Applications

  • Yu, Donghyeon;Kim, MinSoo;Xiao, Guanghua;Hwang, Tae Hyun
    • Genomics & Informatics / v.11 no.4 / pp.200-210 / 2013
  • Studying biological networks, such as protein-protein interactions, is key to understanding complex biological activities. Various types of large-scale biological datasets have been collected and analyzed with high-throughput technologies, including DNA microarray, next-generation sequencing, and the two-hybrid screening system, for this purpose. In this review, we focus on network-based approaches that help in understanding biological systems and identifying biological functions. Accordingly, this paper covers two major topics in network biology: reconstruction of gene regulatory networks and network-based applications, including protein function prediction, disease gene prioritization, and network-based genome-wide association study.

Protein Function Finding Systems through Domain Analysis on Protein Hub Network (단백질 허브 네트워크에서 도메인분석을 통한 단백질 기능발견 시스템)

  • Kang, Tae-Ho;Ryu, Jea-Woon;Kim, Hak-Yong;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.8 no.1 / pp.259-271 / 2008
  • We propose a protein function finding algorithm that predicts specific molecular functions for unannotated proteins through domain analysis of a protein-protein network. To do this, we first construct a protein-protein interaction (PPI) network in Saccharomyces cerevisiae from the MIPS databases. The PPI network (3,637 proteins; 10,391 interactions) shows the characteristics of a scale-free network and a hierarchical network: proteins with a large number of interactions are few, and protein clusters show inherent modularity. Protein-protein interaction databases obtained from a Y2H (yeast two-hybrid) screen or a composite data set include random false positives. To filter the database, we reconstruct the PPI network based on cellular localization. We then analyze hub proteins and the network structure in the reconstructed network and define structural modules from the network. We analyze protein domains in the structural modules and derive functional modules from them. From the derived functional modules with high certainty, we find tentative functions for unannotated proteins.
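
The localization-based filtering step can be sketched as follows. The abstract does not give the exact rule, so this assumes that an interaction is kept only when both proteins are annotated with at least one shared cellular compartment; the function `filter_by_localization`, the protein names, and the annotations are hypothetical.

```python
def filter_by_localization(interactions, localization):
    """Keep an interaction only if both proteins share at least one compartment.

    Proteins without a localization annotation are treated as having none,
    so their interactions are dropped under this (assumed) rule.
    """
    kept = []
    for a, b in interactions:
        shared = localization.get(a, set()) & localization.get(b, set())
        if shared:
            kept.append((a, b))
    return kept


# Toy PPI list and made-up localization annotations.
interactions = [("A", "B"), ("A", "C"), ("B", "D")]
localization = {
    "A": {"nucleus"},
    "B": {"nucleus", "cytoplasm"},
    "C": {"mitochondrion"},
    # "D" has no annotation.
}

filtered = filter_by_localization(interactions, localization)
print(filtered)  # prints: [('A', 'B')]
```

Dropping unannotated proteins is a conservative choice; a real pipeline might instead keep such interactions and flag them as unverified.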