• Title/Summary/Keyword: Protein structure prediction

Computational Approaches for Structural and Functional Genomics

  • Brenner, Steven-E.
    • Proceedings of the Korean Society for Bioinformatics Conference / 2000.11a / pp.17-20 / 2000
  • Structural genomics aims to provide a good experimental structure or computational model of every tractable protein in a complete genome. Underlying this goal is the immense value of protein structure, especially in permitting recognition of distant evolutionary relationships for proteins whose sequence analysis has failed to find any significant homolog. A considerable fraction of the genes in all sequenced genomes have no known function, and structure determination provides a direct means of revealing homology that may be used to infer their putative molecular function. The solved structures will be similarly useful for elucidating the biochemical or biophysical role of proteins that have previously been ascribed only phenotypic functions. More generally, knowledge of an increasingly complete repertoire of protein structures will aid structure prediction methods, improve understanding of protein structure, and ultimately lend insight into molecular interactions and pathways. We use computational methods to select families whose structures cannot be predicted and which are likely to be amenable to experimental characterization. The methods employed include modern sequence analysis and clustering algorithms. A critical component is consultation of the PRESAGE database for structural genomics, which records the community's experimental work underway and computational predictions. The protein families are ranked according to several criteria, including taxonomic diversity and known functional information. Individual proteins, often homologs from hyperthermophiles, are selected from these families as targets for structure determination. The solved structures are examined for structural similarity to other proteins of known structure. Homologous proteins in sequence databases are computationally modeled to provide a resource of protein structure models complementing the experimentally solved protein structures.

Backbone 1H, 15N, and 13C Resonance Assignment and Secondary Structure Prediction of HP0495 from Helicobacter pylori

  • Seo, Min-Duk;Park, Sung-Jean;Kim, Hyun-Jung;Seok, Seung-Hyeon;Lee, Bong-Jin
    • BMB Reports / v.40 no.5 / pp.839-843 / 2007
  • HP0495 (Swiss-Prot ID: Y495_HELPY) is an 86-residue hypothetical protein from Helicobacter pylori strain 26695. The function of HP0495 cannot be identified based on sequence homology, and HP0495 belongs to a fairly unique sequence family. Here, we report the sequence-specific backbone resonance assignments of HP0495. About 97% of all the $^1H_N$, $^{15}N$, $^{13}C_{\alpha}$, $^{13}C_{\beta}$, and $^{13}CO$ resonances were assigned unambiguously. We could predict the secondary structure of HP0495 by analyzing the deviation of the $^{13}C_{\alpha}$ and $^{13}C_{\beta}$ chemical shifts from their respective random coil values. Secondary structure prediction shows that HP0495 consists of two $\alpha$-helices and four $\beta$-strands. This study is a prerequisite for determining the solution structure of HP0495 and investigating the protein-protein interaction between HP0495 and other Helicobacter pylori proteins.
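The chemical-shift analysis described in this abstract can be illustrated with a minimal sketch. It assumes the common convention that a positive combined deviation (ΔCα − ΔCβ) from random coil values suggests a helix and a negative one suggests a strand. The random coil table holds approximate literature values for a few residue types only, and the function names, the 1.0 ppm threshold, and the three-residue smoothing window are hypothetical choices for illustration, not the authors' procedure.

```python
from typing import Dict, List, Optional, Tuple

# Approximate random-coil shifts in ppm as (C-alpha, C-beta).
# Illustrative values for a few residue types; not taken from the paper.
RANDOM_COIL: Dict[str, Tuple[float, Optional[float]]] = {
    "A": (52.5, 19.1),
    "G": (45.1, None),  # glycine has no C-beta
    "L": (55.1, 42.4),
    "V": (62.2, 32.9),
    "K": (56.2, 33.1),
    "E": (56.6, 29.9),
}

Residue = Tuple[str, float, Optional[float]]  # (one-letter code, C-alpha, C-beta)


def secondary_shift(residue: str, ca: float, cb: Optional[float]) -> float:
    """Return dCa - dCb, the combined deviation from random-coil values."""
    rc_ca, rc_cb = RANDOM_COIL[residue]
    d_ca = ca - rc_ca
    # A missing C-beta (glycine or unassigned) contributes no deviation.
    d_cb = (cb - rc_cb) if (cb is not None and rc_cb is not None) else 0.0
    return d_ca - d_cb


def classify(residues: List[Residue], threshold: float = 1.0, window: int = 3) -> str:
    """Label each residue H (helix), E (strand) or C (coil).

    The combined deviation is averaged over a sliding window before
    thresholding, so isolated outliers do not flip a label.
    """
    raw = [secondary_shift(*r) for r in residues]
    half = window // 2
    labels = []
    for i in range(len(raw)):
        lo, hi = max(0, i - half), min(len(raw), i + half + 1)
        avg = sum(raw[lo:hi]) / (hi - lo)
        if avg > threshold:
            labels.append("H")
        elif avg < -threshold:
            labels.append("E")
        else:
            labels.append("C")
    return "".join(labels)
```

Calling `classify` on a list of `(residue, Cα shift, Cβ shift)` tuples returns one label per residue. Published tools such as CSI or TALOS+ use fuller reference tables and more elaborate rules than this sketch.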

Implementation of Protein Motif Prediction System Using Integrated Motif Resources

  • Lee, Bum-Ju;Choi, Eun-Sun;Ryu, Keun-Ho
    • The KIPS Transactions:PartD / v.10D no.4 / pp.679-688 / 2003
  • Motif databases are used to predict the function and structure of proteins emerging from the rapid release of raw data by genome sequencing projects, and their use has been increasing steadily. However, existing motif databases were developed and extended independently and were integrated mainly through web-based cross-references, so they suffer from heterogeneous search results, complex query processes, and duplicate database entries. In this paper, we therefore propose physical integration of motif resources and describe an integrated, family-based search method for protein prediction that addresses these problems. Finally, we evaluate our implementation of the integrated motif database and the protein motif prediction system.

Structural Analysis of Recombinant Human Preproinsulins by Structure Prediction, Molecular Dynamics, and Protein-Protein Docking

  • Jung, Sung Hun;Kim, Chang-Kyu;Lee, Gunhee;Yoon, Jonghwan;Lee, Minho
    • Genomics & Informatics / v.15 no.4 / pp.142-146 / 2017
  • More effective production of human insulin is important because insulin is the main medication used to treat multiple types of diabetes and because many people suffer from diabetes. The current system of insulin production is based on recombinant DNA technology, and the expression vector contains a preproinsulin sequence that is a fused form of an artificial leader peptide and the native proinsulin. It has been reported that the sequence of the leader peptide affects the production of insulin. To analyze structurally how the leader peptide affects the maturation of insulin, we applied several in silico simulations to 13 artificial proinsulin sequences. Three-dimensional structures of the models were predicted and compared. Although their sequences had few differences, the predicted structures were somewhat different. The structures were refined by molecular dynamics simulation, and the energy of each model was estimated. Then, protein-protein docking between the models and trypsin was carried out to compare how efficiently the protease could access the cleavage sites of the proinsulin models. The results showed some concordance with previously reported experimental results, so we expect that our analysis can be used to predict the optimized sequence of artificial proinsulin for more effective production.

Introduction to Gene Prediction Using HMM Algorithm

  • Kim, Keon-Kyun;Park, Eun-Sik
    • Journal of the Korean Data and Information Science Society / v.18 no.2 / pp.489-506 / 2007
  • Gene structure prediction, which predicts the protein-coding regions in a given nucleotide sequence, is the most important step in annotating genes and greatly affects gene analysis and genome annotation. Because eukaryotic genes have more complicated structures in their DNA sequences than prokaryotic genes, analysis programs for eukaryotic gene structure prediction use more diverse and more complicated computational models. Gene prediction methods for eukaryotic genes include ab initio, similarity-based, and ensemble methods, and each method uses various algorithms. This paper introduces how to predict genes using the HMM (Hidden Markov Model) algorithm and presents the process of gene prediction with well-known gene prediction programs.
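As a rough illustration of how an HMM can label coding regions, the sketch below runs the Viterbi algorithm on a toy two-state model. The state names, the probabilities, and the assumption that coding sequence is GC-rich are all hypothetical values chosen for the example. They are not taken from the paper or from any real gene prediction program, which would use far richer models (codon structure, splice sites, length distributions).

```python
import math
from typing import Dict, List


def viterbi(seq: str,
            states: List[str],
            start: Dict[str, float],
            trans: Dict[str, Dict[str, float]],
            emit: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the most probable state path for seq, computed in log space."""
    if not seq:
        return []
    # score[s] = best log-probability of any path that ends in state s
    score = {s: math.log(start[s]) + math.log(emit[s][seq[0]]) for s in states}
    back: List[Dict[str, str]] = []  # back[i][s] = best predecessor of s at step i+1
    for symbol in seq[1:]:
        new_score: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for s in states:
            prev = max(states, key=lambda p: score[p] + math.log(trans[p][s]))
            new_score[s] = (score[prev] + math.log(trans[prev][s])
                            + math.log(emit[s][symbol]))
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy model (hypothetical numbers): coding regions emit mostly G/C,
# noncoding regions emit mostly A/T, and states tend to persist.
STATES = ["coding", "noncoding"]
START = {"coding": 0.5, "noncoding": 0.5}
TRANS = {
    "coding": {"coding": 0.9, "noncoding": 0.1},
    "noncoding": {"coding": 0.1, "noncoding": 0.9},
}
EMIT = {
    "coding": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "noncoding": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}
```

For example, `viterbi("ATATATGCGCGC", STATES, START, TRANS, EMIT)` returns one state label per nucleotide. Working in log space avoids numerical underflow on long sequences.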

Prediction of the Secondary Structure of the AgfA Subunit of Salmonella enteritidis Overexpressed as an MBP-Fused Protein

  • Won, Mi-Sun;Kim, So-Youn;Lee, Seung-Hwan;Kim, Chul-Jung;Kim, Hyun-Su;Jun, Moo-Hyung;Song, Kyung-Bin
    • Journal of Microbiology and Biotechnology / v.11 no.1 / pp.164-166 / 2001
  • To examine the characteristics of the recombinant thin aggregative fimbriae of Salmonella, the AgfA subunit gene was amplified from Salmonella enteritidis using PCR. The maltose binding protein (MBP)-AgfA fusion protein was overproduced in E. coli and purified. The secondary structure of AgfA was then elucidated from the difference CD spectra. An estimation of the secondary structure of AgfA using the self-consistent method revealed a mostly $\beta$-sheet structure.

Mainchain NMR Assignments and secondary structure prediction of the C-terminal domain of BldD, a developmental transcriptional regulator from Streptomyces coelicolor A3(2)

  • Kim, Jeong-Mok;Won, Hyung-Sik;Kang, Sa-Ouk
    • Journal of the Korean Magnetic Resonance Society / v.17 no.1 / pp.59-66 / 2013
  • BldD, a developmental transcription factor from Streptomyces coelicolor, is a homodimeric, DNA-binding protein with 167 amino acids in each subunit. Each monomer consists of two structurally distinct domains: the N-terminal domain (BldD-NTD), responsible for DNA binding and dimerization, and the C-terminal domain (BldD-CTD). In contrast to the BldD-NTD, whose crystal structure has been solved, the BldD-CTD has been characterized neither in structure nor in function. Thus, in terms of structural genomics, a structural study of the BldD-CTD has been conducted in solution, and in the present work, mainchain NMR assignments of the recombinant BldD-CTD (residues 80-167 of BldD) were achieved by a series of heteronuclear multidimensional NMR experiments on a [$^{13}C/^{15}N$]-enriched protein sample. Finally, secondary structure prediction by CSI and TALOS+ analysis using the assigned chemical shift data identified a $\beta$-$\alpha$-$\alpha$-$\beta$-$\alpha$-$\alpha$-$\alpha$ topology of the domain. The results provide fundamental data for a more detailed approach to the atomic structure of the BldD-CTD, which would be essential for a complete understanding of the molecular function of BldD.

In silico annotation of a hypothetical protein from Listeria monocytogenes EGD-e unfolds a toxin protein of the type II secretion system

  • Maisha Tasneem;Shipan Das Gupta;Monira Binte Momin;Kazi Modasser Hossain;Tasnim Binta Osman;Fazley Rabbi
    • Genomics & Informatics / v.21 no.1 / pp.7.1-7.11 / 2023
  • The gram-positive bacterium Listeria monocytogenes is an important foodborne intracellular pathogen that is widespread in the environment. The functions of hypothetical proteins (HPs) from various pathogenic bacteria have been successfully annotated using a variety of bioinformatics strategies. In this study, the HP lmo0888 (NP_464414.1) from the Listeria monocytogenes EGD-e strain was annotated using several bioinformatics tools. Various tools, including CELLO, PSORTb, and SOSUIGramN, identified the candidate protein as cytoplasmic. Domain and motif analysis revealed that the target protein is a PemK/MazF-like toxin protein of the type II toxin-antitoxin system (TAS), which was consistent with BLASTp analysis. Secondary structure analysis showed the random coil to be the most frequent element. The AlphaFold 2 Protein Structure Prediction Database was used to determine the three-dimensional (3D) structure of the HP using the template structure of a type II TAS PemK/MazF family toxin protein (DB ID_AFDB: A0A4B9HQB9) with 99.1% sequence identity. Various quality evaluation tools, such as PROCHECK, ERRAT, Verify 3D, and QMEAN, were used to validate the 3D structure. Following energy minimization with YASARA, the target protein's 3D structure became more stable. The active site of the modeled 3D structure was determined with the CASTp server. Most pathogens that harbor TAS pose a serious risk to human health. Our annotation of the HP lmo0888 found in Listeria could offer a chance to understand bacterial pathogenicity and identify a number of potential targets for drug development.

Backbone 1H, 15N and 13C Resonance Assignment and Secondary Structure Prediction of HP0062 (O24902_HELPY) from Helicobacter pylori

  • Jang, Sun-Bok;Ma, Chao;Park, Sung-Jean;Kwon, Ae-Ran;Lee, Bong-Jin
    • Journal of the Korean Magnetic Resonance Society / v.13 no.2 / pp.117-125 / 2009
  • HP0062 is an 86-residue hypothetical protein from Helicobacter pylori strain 26695. HP0062 was identified as an ESAT-6/WXG100 superfamily protein based on structure and sequence alignment, and it also contains a leucine zipper domain sequence. Here, we report the sequence-specific backbone resonance assignment of HP0062. About 97.7% of all $^1H_N$, $^{15}N$, $^{13}C_{\alpha}$, $^{13}C_{\beta}$, and $^{13}C=O$ resonances were assigned unambiguously. We could predict the secondary structure of HP0062 by analyzing the deviation of the $^{13}C_{\alpha}$ and $^{13}C_{\beta}$ chemical shifts from their respective random coil values. Secondary structure prediction shows that HP0062 consists of two $\alpha$-helices. This study is a prerequisite for determining the solution structure of HP0062 and can be used to study the interaction of HP0062 with DNA and with other Helicobacter pylori proteins.