• Title/Abstract/Keywords: protein interaction prediction


A Protein Function Prediction in Interaction Maps

  • 정재영;최재훈;박종민;박선희
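    • Korean Institute of Information Scientists and Engineers: Conference Proceedings
    • /
    • Proceedings of the Korean Institute of Information Scientists and Engineers 2004 Fall Conference, Vol.31 No.2 (2)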
    • /
    • pp.286-288
    • /
    • 2004
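  • In current bioinformatics, protein interaction data are used in computational proteomics models that predict the functions of uncharacterized proteins with high reliability. In general, these protein function prediction algorithms are developed on large-scale two-dimensional protein-protein interaction maps and are based on the Guilt-by-Association concept. In this paper, we developed a graph-based protein function prediction model that uses protein-protein interaction data. In particular, the model has the advantage of making accurate function predictions from large volumes of interaction data. We evaluated the model using the protein interaction map for yeast, Homology, and Interaction Generality.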
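
The Guilt-by-Association idea named in this abstract can be illustrated with a minimal sketch: an unannotated protein is assigned the function that is most common among its direct interaction partners. This is a plain neighbor majority vote, not the model from the paper (which also draws on Homology and Interaction Generality). The function name `predict_function`, the protein IDs, and the annotations are all hypothetical.

```python
from collections import Counter

def predict_function(interactions, annotations, protein):
    """Return the most common function among the direct neighbors of `protein`.

    interactions: iterable of (protein_a, protein_b) pairs (undirected edges)
    annotations:  dict mapping a protein ID to a set of known functions
    Returns None when no neighbor carries an annotation.
    """
    neighbors = set()
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        if a == protein:
            neighbors.add(b)
        elif b == protein:
            neighbors.add(a)

    votes = Counter()
    for neighbor in neighbors:
        votes.update(annotations.get(neighbor, ()))

    if not votes:
        return None
    best = max(votes.values())
    # Break ties alphabetically so the result is deterministic.
    return min(func for func, count in votes.items() if count == best)

# Toy data (made up for illustration, not taken from any yeast data set).
toy_interactions = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
toy_annotations = {
    "P2": {"transport"},
    "P3": {"transport", "signaling"},
    "P4": {"metabolism"},
}

predicted = predict_function(toy_interactions, toy_annotations, "P1")
```

A majority vote is the simplest possible choice here; weighting each neighbor by the reliability of the interaction would be a natural refinement.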


Challenges and New Approaches in Genomics and Bioinformatics

  • Park, Jong Hwa;Han, Kyung Sook
    • Genomics & Informatics
    • /
    • Vol.1 No.1
    • /
    • pp.1-6
    • /
    • 2003
  • In conclusion, the seemingly fuzzy and disorganized data of biology, with thousands of different layers ranging from molecules to the Internet, have so far refused to be mapped precisely and predicted successfully by mathematicians, physicists, or computer scientists. Genomics and bioinformatics are the fields that process such complex data. Insights into the nature of biological entities as complex interaction networks are opening a door toward a generalized representation of biological entities. The main challenge of genomics and bioinformatics now lies in 1) how to mine the networks of the domains of bioinformatics, namely the literature, metabolic pathways, and proteome and structures, in terms of interaction; and 2) how to generalize the networks in order to integrate the information into computable genomic data regardless of the level of layer. Once bioinformaticians succeed in finding a general principle for how components interact with each other to form any organic interaction network at genomic scale, true simulation and prediction of life in silico will be possible.

Analysis of a Large-scale Protein Structural Interactome: Ageing Protein structures and the most important protein domain

  • Bolser, Dan;Dafas, Panos;Harrington, Richard;Schroeder, Michael;Park, Jong
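    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Proceedings of the 2nd Annual Conference of the Korean Society for Bioinformatics and Systems Biology, 2003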
    • /
    • pp.26-51
    • /
    • 2003
  • Large-scale protein interaction maps provide a new, global perspective with which to analyse protein function. PSIMAP, the Protein Structural Interactome Map, is a database of all the structurally observed interactions between superfamilies of protein domains with known three-dimensional structure in the PDB. PSIMAP incorporates both functional and evolutionary information into a single network. It makes it possible to age protein domains in terms of taxonomic diversity, interaction and function. One consequence is the ability to predict the most important protein domain structure in evolution. We present a global analysis of PSIMAP using several distinct network measures relating to centrality, interactivity, fault-tolerance, and taxonomic diversity. We found the following results. Centrality: we show that the center and barycenter of PSIMAP do not coincide, and that the superfamilies forming the barycenter relate to very general functions, while those constituting the center relate to enzymatic activity. Interactivity: we identify the P-loop and immunoglobulin superfamilies as the most highly interactive. We successfully use connectivity and cluster index, which characterise the connectivity of a superfamily's neighbourhood, to discover superfamilies of complex I and II. This is particularly significant as the structure of complex I is not yet solved. Taxonomic diversity: we found that highly interactive superfamilies are in general taxonomically very diverse and are thus amongst the oldest. This led to the prediction of the oldest and most important protein domain in the evolution of life. Fault-tolerance: we found that the network is very robust, since for the majority of superfamilies, removal from the network will not break it up. Overall, we can single out the P-loop containing nucleotide triphosphate hydrolases superfamily as it is the most highly connected and has the highest taxonomic diversity. In addition, this superfamily has the highest interaction rank, is the barycenter of the network (it has the shortest average path to every other superfamily in the network), and is an articulation vertex, whose removal will disconnect the network. More generally, we conclude that the graph-theoretic and taxonomic analysis of PSIMAP is an important step towards the understanding of protein function and could be an important tool for tracing the evolution of life at the molecular level.
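
The three graph notions used in this abstract (center, barycenter, articulation vertex) can be made concrete with a small sketch. It assumes an undirected, connected graph stored as an adjacency dictionary. The toy graph and its node names are invented for illustration and have nothing to do with the actual PSIMAP data. Here the center is the set of vertices with minimal eccentricity, the barycenter is the set with minimal total distance to all other vertices, and articulation vertices are found by brute-force removal.

```python
from collections import deque

def bfs_distances(graph, source, skip=None):
    """Shortest hop counts from `source`, optionally ignoring vertex `skip`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor != skip and neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def center_and_barycenter(graph):
    """Center: minimal eccentricity. Barycenter: minimal sum of distances."""
    eccentricity, total = {}, {}
    for node in graph:
        dist = bfs_distances(graph, node)
        eccentricity[node] = max(dist.values())
        total[node] = sum(dist.values())
    min_ecc = min(eccentricity.values())
    min_total = min(total.values())
    center = sorted(n for n in graph if eccentricity[n] == min_ecc)
    barycenter = sorted(n for n in graph if total[n] == min_total)
    return center, barycenter

def articulation_vertices(graph):
    """Vertices whose removal disconnects the rest (simple O(V * E) check)."""
    found = []
    for removed in graph:
        remaining = [n for n in graph if n != removed]
        if not remaining:
            continue
        reached = bfs_distances(graph, remaining[0], skip=removed)
        if len(reached) < len(remaining):
            found.append(removed)
    return sorted(found)

# Toy graph: a path a-b-c-d-e with three extra leaves attached to b.
toy_graph = {
    "a": ["b"], "b": ["a", "c", "x", "y", "z"], "c": ["b", "d"],
    "d": ["c", "e"], "e": ["d"], "x": ["b"], "y": ["b"], "z": ["b"],
}

center, barycenter = center_and_barycenter(toy_graph)
cut_vertices = articulation_vertices(toy_graph)
```

In this toy graph the center (`c`) and the barycenter (`b`) differ, which mirrors the abstract's observation that the two need not coincide. For large networks, a linear-time depth-first search would be a better way to find articulation vertices than the brute-force check used here.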


Partial AUC maximization for essential gene prediction using genetic algorithms

  • Hwang, Kyu-Baek;Ha, Beom-Yong;Ju, Sanghun;Kim, Sangsoo
    • BMB Reports
    • /
    • Vol.46 No.1
    • /
    • pp.41-46
    • /
    • 2013
  • Identifying genes indispensable for an organism's life and their characteristics is one of the central questions in current biological research, and hence it would be helpful to develop computational approaches towards the prediction of essential genes. The performance of a predictor is usually measured by the area under the receiver operating characteristic curve (AUC). We propose a novel method by implementing genetic algorithms to maximize the partial AUC that is restricted to a specific interval of lower false positive rate (FPR), the region relevant to follow-up experimental validation. Our predictor uses various features based on sequence information, protein-protein interaction network topology, and gene expression profiles. A feature selection wrapper was developed to alleviate the over-fitting problem and to weigh each feature's relevance to prediction. We evaluated our method using the proteome of budding yeast. Our implementation of genetic algorithms maximizing the partial AUC below 0.05 or 0.10 of FPR outperformed other popular classification methods.
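
The quantity being maximized in this abstract, the partial AUC over a low-FPR interval, can be sketched as follows. This covers only the objective, not the genetic algorithm or the feature selection wrapper. The sketch assumes binary labels (1 = essential, 0 = non-essential) with at least one example of each class, and it returns the raw, unnormalized area under the ROC curve between FPR 0 and `max_fpr`. The function names and toy scores are hypothetical.

```python
def roc_points(labels, scores):
    """ROC curve as a list of (FPR, TPR) points, tied scores grouped together."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        # Consume every example sharing the current score before adding a point.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points

def partial_auc(labels, scores, max_fpr):
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr]."""
    points = roc_points(labels, scores)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Cut the last segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Toy predictions, already ordered from the highest to the lowest score.
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]

full_auc = partial_auc(toy_labels, toy_scores, max_fpr=1.0)
low_fpr_auc = partial_auc(toy_labels, toy_scores, max_fpr=0.10)
```

Because the raw partial AUC can never exceed `max_fpr`, values at different thresholds are not directly comparable. Dividing by `max_fpr` is one simple way to rescale them.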

Backbone 1H, 15N, and 13C Resonance Assignment and Secondary Structure Prediction of HP0495 from Helicobacter pylori

  • Seo, Min-Duk;Park, Sung-Jean;Kim, Hyun-Jung;Seok, Seung-Hyeon;Lee, Bong-Jin
    • BMB Reports
    • /
    • Vol.40 No.5
    • /
    • pp.839-843
    • /
    • 2007
  • HP0495 (Swiss-Prot ID: Y495_HELPY) is an 86-residue hypothetical protein from Helicobacter pylori strain 26695. The function of HP0495 cannot be identified based on sequence homology, and HP0495 is included in a fairly unique sequence family. Here, we report the sequence-specific backbone resonance assignments of HP0495. About 97% of all the $^1H_N$, $^{15}N$, $^{13}C_{\alpha}$, $^{13}C_{\beta}$, and $^{13}CO$ resonances were assigned unambiguously. We could predict the secondary structure of HP0495 by analyzing the deviation of the $^{13}C_{\alpha}$ and $^{13}C_{\beta}$ chemical shifts from their respective random coil values. Secondary structure prediction shows that HP0495 consists of two $\alpha$-helices and four $\beta$-strands. This study is a prerequisite for determining the solution structure of HP0495 and investigating the protein-protein interaction between HP0495 and other Helicobacter pylori proteins.
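
The idea of reading secondary structure from deviations of $^{13}C_{\alpha}$ and $^{13}C_{\beta}$ shifts from random coil values can be sketched per residue. The sketch relies on the general trend that helical residues tend to show positive $C_{\alpha}$ and negative $C_{\beta}$ deviations, while strand residues show the opposite. The combined score, the 1.0 ppm cutoff, the function name, and every shift value below are assumptions chosen for illustration. They are not the procedure or data from the paper.

```python
def classify_residue(ca_obs, cb_obs, ca_coil, cb_coil, cutoff=1.0):
    """Classify one residue from its C-alpha and C-beta secondary shifts.

    All shifts are in ppm. `cutoff` is an arbitrary illustrative threshold.
    """
    delta_ca = ca_obs - ca_coil   # secondary shift of C-alpha
    delta_cb = cb_obs - cb_coil   # secondary shift of C-beta
    score = delta_ca - delta_cb   # positive leans helix, negative leans strand
    if score > cutoff:
        return "helix"
    if score < -cutoff:
        return "strand"
    return "coil"

# Made-up shifts for three residues, all compared to the same made-up
# random coil reference (C-alpha 52.5 ppm, C-beta 19.0 ppm).
example_residues = [
    (55.0, 18.0),  # C-alpha up, C-beta down
    (51.0, 21.0),  # C-alpha down, C-beta up
    (52.6, 19.0),  # close to the reference
]
calls = [classify_residue(ca, cb, 52.5, 19.0) for ca, cb in example_residues]
```

In practice the random coil reference depends on the residue type, and isolated single-residue calls are usually smoothed by requiring several consecutive residues to agree before assigning a helix or strand.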

Assessment of the Performance of B2PLYP-D for Describing Intramolecular π-π and σ-π Interactions

  • Choi, Tae-Hoon;Han, Young-Kyu
    • Bulletin of the Korean Chemical Society
    • /
    • Vol.32 No.12
    • /
    • pp.4195-4198
    • /
    • 2011
  • Intramolecular ${\pi}-{\pi}$ and ${\sigma}-{\pi}$ interactions underlie numerous energetic and structural phenomena in nature, and the exact description of these nonbonding interactions plays an important role in the accurate prediction of three-dimensional structures in processes such as protein folding and polymer shaping. We have selected two prototype molecular systems for benchmark calculations of intramolecular ${\pi}-{\pi}$ and ${\sigma}-{\pi}$ interactions. Accurately describing the conformational energy of such systems requires highly elaborate but very expensive ab initio methods such as coupled cluster singles, doubles, and (triples) (CCSD(T)). Our calculations reveal that a double-hybrid density functional incorporating dispersion correction (B2PLYP-D) agrees excellently with the CCSD(T) results, indicating that B2PLYP-D can serve as a practical method of choice.

Evaluation and interpretation of transcriptome data underlying heterogeneous chronic obstructive pulmonary disease

  • Ham, Seokjin;Oh, Yeon-Mok;Roh, Tae-Young
    • Genomics & Informatics
    • /
    • Vol.17 No.1
    • /
    • pp.2.1-2.12
    • /
    • 2019
  • Chronic obstructive pulmonary disease (COPD) is a type of progressive lung disease characterized by airflow obstruction. Recently, a comprehensive analysis of the transcriptome in lung tissue of COPD patients was performed, but the heterogeneity of the samples was not seriously considered in characterizing the mechanistic dysregulation of COPD. Here, we established a new transcriptome analysis pipeline that uses a deconvolution process to reduce the heterogeneity, and we clearly identified that these transcriptome data originated from patients with mild or moderate COPD. Differentially expressed or co-expressed genes in the protein interaction subnetworks were linked with mitochondrial dysfunction and the immune response, as expected. Computational protein localization prediction revealed that 19 proteins showing changes in subcellular localization were mostly related to mitochondria, suggesting that mislocalization of mitochondria-targeting proteins plays an important role in COPD pathology. Our extensive evaluation of COPD transcriptome data could provide guidelines for analyzing heterogeneous gene expression profiles and classifying potential candidate genes that are responsible for the pathogenesis of COPD.

Backbone assignment of the intrinsically disordered N-terminal region of Bloom syndrome protein

  • Min June Yang;Chin-Ju Park
    • Journal of the Korean Magnetic Resonance Society
    • /
    • Vol.27 No.3
    • /
    • pp.17-22
    • /
    • 2023
  • Bloom syndrome protein (BLM) is a pivotal RecQ helicase necessary for genetic stability through DNA repair processes. Our investigation focuses on the N-terminal region of BLM, which has been considered an intrinsically disordered region (IDR). This IDR plays a critical role in DNA metabolism by interacting with other proteins. In this study, we performed triple resonance experiments on BLM220-300 and present the backbone chemical shifts. The secondary structure prediction based on the chemical shifts of the backbone atoms shows that the region is disordered. Our data could help further interaction studies between BLM220-300 and its binding partners using NMR.

Protein Function Prediction by Constructing Interaction Network Dictionary

  • 진희정;조환규
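    • Korean Institute of Information Scientists and Engineers: Conference Proceedings
    • /
    • Proceedings of the Korean Institute of Information Scientists and Engineers 2005 Fall Conference, Vol.32 No.2 (2)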
    • /
    • pp.238-240
    • /
    • 2005
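  • The proteome varies dynamically with the environment a cell is in and with the tissue, and it expresses the actual functions of the cell. For this reason, proteomics research, which seeks an integrated understanding at the whole-protein level of the phenomena that actually occur within cells, is being actively pursued. Determining the functions of unknown proteins is one of the most fundamental and important parts of proteomics. In this paper, we introduce a new methodology that predicts protein function by constructing a "Protein Interaction Network Dictionary (PIND)".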


Backbone 1H, 15N and 13C Resonance Assignment and Secondary Structure Prediction of HP0062 (O24902_HELPY) from Helicobacter pylori

  • Jang, Sun-Bok;Ma, Chao;Park, Sung-Jean;Kwon, Ae-Ran;Lee, Bong-Jin
    • Journal of the Korean Magnetic Resonance Society
    • /
    • Vol.13 No.2
    • /
    • pp.117-125
    • /
    • 2009
  • HP0062 is an 86-residue hypothetical protein from Helicobacter pylori strain 26695. HP0062 was identified as an ESAT-6/WXG100 superfamily protein based on structure and sequence alignment, and it also contains a leucine zipper domain sequence. Here, we report the sequence-specific backbone resonance assignment of HP0062. About 97.7% of all $^1H_N$, $^{15}N$, $^{13}C_{\alpha}$, $^{13}C_{\beta}$, and $^{13}C=O$ resonances were assigned unambiguously. We could predict the secondary structure of HP0062 by analyzing the deviation of the $^{13}C_{\alpha}$ and $^{13}C_{\beta}$ chemical shifts from their respective random coil values. Secondary structure prediction shows that HP0062 consists of two ${\alpha}$-helices. This study is a prerequisite for determining the solution structure of HP0062 and can be used to study the interaction between HP0062 and DNA and other Helicobacter pylori proteins.