• Title/Summary/Keyword: Protein prediction

Simulation Methods for Prediction of Membrane Protein Structure

  • Son, Hyeon-S.
    • Proceedings of the Korean Biophysical Society Conference / 1998.06a / pp.10-10 / 1998
  • Integral membrane proteins (IMPs) are important to cells in functions such as transport, energy transduction, and signalling. Three-dimensional molecular structures of such proteins at the atomic level are needed to understand these processes. Predicting such structures (and functions) is necessary, especially because only a small number of membrane protein structures have been determined at atomic resolution. (omitted)

Protein Tertiary Structure Prediction Method based on Fragment Assembly

  • Lee, Julian;Kim, Seung-Yeon;Joo, Kee-Hyoung;Kim, Il-Soo;Lee, Joo-Young
    • Proceedings of the Korean Society for Bioinformatics Conference / 2004.11a / pp.250-261 / 2004
  • A novel method for ab initio prediction of protein tertiary structures, PROFESY (PROFile Enumerating SYstem), is introduced. The method combines secondary structure prediction with fragment assembly. Secondary structure is predicted with the PREDICT method, which uses PSI-BLAST to generate profiles and a distance measure in the pattern space. To predict the tertiary structure of a protein sequence, we assemble fragments from the fragment library constructed as a byproduct of PREDICT. The tertiary structure is obtained by minimizing the potential energy with the conformational space annealing method, which makes it possible to sample diverse low-lying minima of the energy function. We apply PROFESY to several proteins with known structures, where it performs well. We also participated in CASP5 and applied PROFESY to new fold targets for blind prediction. The results were quite promising, even though PROFESY was at an early stage of development. In particular, the PROFESY result was the best for the hardest target, T0161.

Development of a Constituent Prediction Model of Domestic Rice Using Near Infrared Reflectance Analyzer(I) -Constituent Prediction Model of Brown and Milled Rice- (근적외선분석계를 이용한 국내산 쌀의 성분예측모델 개발(I) -현미와 백미의 성분예측모델-)

  • 한충수;동하원강
    • Journal of Biosystems Engineering / v.21 no.2 / pp.198-207 / 1996
  • To measure the moisture content, protein content, and viscosity of brown and milled rice with a near-infrared reflectance (NIR) analyzer, data from chemical analysis and from the NIR analyzer were compared and analyzed. The purpose of this study is to obtain the fundamental data required for predicting rice quality and taste rank, and to develop a method for measuring the constituents and physical characteristics of domestic rice with an NIR analyzer. The main results are as follows. 1. The $r^2$ and standard error of calibration (SEC) of the moisture calibration were 0.87 and 0.09 for brown rice powder, and 0.95 and 0.08 for milled rice powder. 2. The $r^2$ and SEC of the protein calibration were 0.83 and 0.20 for brown rice powder, and 0.86 and 0.20 for milled rice powder. 3. The $r^2$ and SEC of the viscosity calibration were 0.36 and 15.50 for brown rice powder, and 0.55 and 12.98 for milled rice powder. Because these $r^2$ values are too small for practical use, further study, including wavelength selection, is required to develop a better prediction model for viscosity. 4. The regression equation for one rice variety nearly coincided with those for the others. Therefore, the prediction model should be developed for all rice samples together.
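
The two calibration statistics quoted in this abstract can be computed as in the following minimal sketch. It assumes a single-term linear calibration and the common definition SEC = sqrt(SS_res / (n - p - 1)), where p is the number of calibration terms; the abstract states neither the regression form nor the exact SEC formula. The function names are illustrative, and no data from the paper are reproduced.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibration_stats(nir_signal, reference, n_terms=1):
    """Return (r2, SEC) for a linear calibration of reference values on an NIR signal.

    Assumption: SEC = sqrt(SS_res / (n - n_terms - 1)).
    """
    slope, intercept = fit_line(nir_signal, reference)
    predicted = [slope * s + intercept for s in nir_signal]
    # Residual and total sums of squares against the chemical reference values.
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    mean_ref = sum(reference) / len(reference)
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    r2 = 1.0 - ss_res / ss_tot
    sec = math.sqrt(ss_res / (len(reference) - n_terms - 1))
    return r2, sec
```

Calling `calibration_stats(signal, chemical_values)` with paired measurements returns the coefficient of determination and the SEC in the units of the reference method.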

A New Approach to Find Orthologous Proteins Using Sequence and Protein-Protein Interaction Similarity

  • Kim, Min-Kyung;Seol, Young-Joo;Park, Hyun-Seok;Jang, Seung-Hwan;Shin, Hang-Cheol;Cho, Kwang-Hwi
    • Genomics & Informatics / v.7 no.3 / pp.141-147 / 2009
  • Existing proteome-scale ortholog and paralog prediction methods are mainly based on sequence similarity. However, it is known that even the closest BLAST hit is often not the closest neighbor. For this reason, we added conserved interaction information to find orthologs. We propose a genome-scale, automated ortholog prediction method named OrthoInterBlast, which is based on both sequence and interaction similarity. When we applied this method to fly and yeast, 17% of the ortholog candidates differed from the results of Inparanoid. By adding protein-protein interaction information, proteins with low sequence similarity, which cannot be easily detected by sequence homology alone, can still be selected as orthologs.

The Grammatical Structure of Protein Sequences

  • Bystroff, Chris
    • Proceedings of the Korean Society for Bioinformatics Conference / 2000.11a / pp.28-31 / 2000
  • We describe a hidden Markov model, HMMSTR, for general protein sequences based on the I-sites library of sequence-structure motifs. Unlike the linear HMMs used to model individual protein families, HMMSTR has a highly branched topology and captures recurrent local features of protein sequences and structures that transcend protein family boundaries. The model extends the I-sites library by describing the adjacencies of different sequence-structure motifs as observed in the database, and achieves a great reduction in parameters by representing overlapping motifs in a much more compact form. The HMM attributes a considerably higher probability to coding sequence than does an equivalent dipeptide model, predicts secondary structure with an accuracy of 74.6% and backbone torsion angles better than any previously reported method, and predicts the structural context of beta strands and turns with an accuracy that should be useful for tertiary structure prediction. HMMSTR has been incorporated into a public, fully automated protein structure prediction server.

Review of Biological Network Data and Its Applications

  • Yu, Donghyeon;Kim, MinSoo;Xiao, Guanghua;Hwang, Tae Hyun
    • Genomics & Informatics / v.11 no.4 / pp.200-210 / 2013
  • Studying biological networks, such as protein-protein interactions, is key to understanding complex biological activities. Various types of large-scale biological datasets have been collected and analyzed with high-throughput technologies, including DNA microarray, next-generation sequencing, and the two-hybrid screening system, for this purpose. In this review, we focus on network-based approaches that help in understanding biological systems and identifying biological functions. Accordingly, this paper covers two major topics in network biology: reconstruction of gene regulatory networks and network-based applications, including protein function prediction, disease gene prioritization, and network-based genome-wide association study.

Prediction of Protein Secondary Structure Content Using Amino Acid Composition and Evolutionary Information

  • Lee, So-Young;Lee, Byung-Chul;Kim, Dong-Sup
    • Proceedings of the Korean Society for Bioinformatics Conference / 2004.11a / pp.244-249 / 2004
  • There have been many attempts to predict the secondary structure content of a protein from its primary sequence, which serves as the first step in a series of bioinformatics processes to gain knowledge of the structure and function of a protein. Most of them assumed that prediction relying on the amino acid composition of a protein can be successful. Several approaches expanded the amount of information by including the pair composition of two adjacent residues. Recent methods achieved a remarkable improvement in prediction accuracy by using this expanded composition information. The overall average errors of two successful methods were 6.1% and 3.4%. This work was motivated by the observation that evolutionarily related proteins share similar structures. After manipulating the values of the frequency matrix obtained by running PSI-BLAST, the inputs of an artificial neural network were constructed by taking the ratio of the amino acid composition of the proteins evolutionarily related to a query protein to the background probability. Although we did not use the expanded composition information of amino acid pairs, we obtained comparable accuracy, with an overall average error of 3.6%.
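
The composition features this abstract refers to can be illustrated with a minimal sketch. It computes the 20-term amino acid composition, the 400-term adjacent-pair composition, and a composition-to-background ratio. The function names are hypothetical, the background frequencies must be supplied by the caller, and the PSI-BLAST frequency-matrix processing and the neural network described in the abstract are not reproduced here.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Fraction of each of the 20 standard amino acids in seq."""
    counts = Counter(seq)
    # Non-standard symbols (e.g. X) count toward the length but get no entry.
    return {aa: counts[aa] / len(seq) for aa in AMINO_ACIDS}


def pair_composition(seq):
    """Fraction of each ordered pair of adjacent residues (400 terms)."""
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = len(seq) - 1
    return {a + b: pairs[a + b] / total for a in AMINO_ACIDS for b in AMINO_ACIDS}


def ratio_to_background(comp, background):
    """Element-wise ratio of a composition to caller-supplied background frequencies."""
    return {aa: comp[aa] / background[aa] for aa in comp}
```

The single-residue vector has 20 entries while the pair vector has 400, which is the "expanded" information the abstract says its method does without.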

Prediction of Metal Ion Binding Sites in Proteins from Amino Acid Sequences by Using Simplified Amino Acid Alphabets and Random Forest Model

  • Kumar, Suresh
    • Genomics & Informatics / v.15 no.4 / pp.162-169 / 2017
  • In metal-binding proteins, or metalloproteins, metal ions are important for the stability of the protein and also serve as cofactors in various functions such as controlling metabolism, regulating signal transport, and metal homeostasis. In structural genomics, prediction of metal-binding proteins helps in selecting a suitable growth medium for overexpression studies and in obtaining the functional protein. Computational prediction using machine learning has been widely used in various fields of bioinformatics, based on the premise that all the information is contained in the amino acid sequence. In this study, random forest prediction systems were used with simplified amino acid alphabets to predict the binding sites of individual major metal ions: copper, calcium, cobalt, iron, magnesium, manganese, nickel, and zinc.
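
As a rough illustration of what a simplified amino acid alphabet looks like, the sketch below maps the 20 residues onto six broad physicochemical groups and counts group occurrences in a window around a candidate site. The grouping, the single-letter group codes, the window size, and the function names are all assumptions chosen for illustration; the abstract does not specify which reduced alphabets or features the study used. Feature vectors of this kind could then be passed to a random forest classifier.

```python
# Hypothetical six-group reduction by broad physicochemical class.
GROUPS = {
    "h": "AVLIMC",  # aliphatic / hydrophobic
    "a": "FWYH",    # aromatic
    "o": "STNQ",    # polar, uncharged
    "+": "KR",      # positively charged
    "-": "DE",      # negatively charged
    "s": "GP",      # small / conformationally special
}
# Residue -> group code lookup built from the table above.
REDUCE = {aa: code for code, members in GROUPS.items() for aa in members}


def reduce_sequence(seq, unknown="x"):
    """Rewrite a protein sequence in the reduced alphabet."""
    return "".join(REDUCE.get(aa, unknown) for aa in seq.upper())


def window_features(seq, center, half_width=3):
    """Count each group code in a window around position `center` (0-based)."""
    reduced = reduce_sequence(seq)
    start = max(0, center - half_width)
    window = reduced[start:center + half_width + 1]
    return [window.count(code) for code in GROUPS]
```

Reducing the alphabet shrinks the feature space from 20 symbols to 6, at the cost of discarding distinctions within each group.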

Sequence driven features for prediction of subcellular localization of proteins

  • Kim, Jong-Kyoung;Bang, Sung-Yang;Choi, Seung-Jin
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.237-242 / 2005
  • Predicting the cellular location of an unknown protein gives valuable information for inferring the possible function of the protein. For a more accurate prediction system, we need a good feature extraction method that transforms the raw sequence data into a numerical feature vector while minimizing information loss. In this paper, we propose new methods of extracting underlying features from the sequence data alone by computing pairwise sequence alignment scores. In addition, we use composition-based features to improve prediction accuracy. To construct an SVM ensemble from separately trained SVM classifiers, we propose specificity-based weighted majority voting. The overall prediction accuracy evaluated by 5-fold cross-validation reached 88.53% for the eukaryotic animal data set. By comparing the prediction accuracy of various feature extraction methods, we gained biological insight into the location of targeting information. Our numerical experiments confirm that the new feature extraction methods are very useful for predicting the subcellular localization of proteins.
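
One plausible reading of specificity-based weighted majority voting is sketched below: each classifier casts a vote for its predicted location, weighted by that classifier's specificity for the predicted class as estimated on held-out data. This is an assumption made for illustration; the abstract does not give the weighting formula, and the function name and data layout are hypothetical.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, specificity):
    """Combine class predictions from several classifiers.

    predictions[i]        -- label predicted by classifier i
    specificity[i][label] -- classifier i's specificity for that label,
                             assumed to be estimated on validation data
    """
    scores = defaultdict(float)
    for i, label in enumerate(predictions):
        # Each vote counts in proportion to how trustworthy this classifier
        # is when it predicts this particular label.
        scores[label] += specificity[i][label]
    return max(scores, key=scores.get)
```

With this weighting, a single highly specific classifier can outvote two weaker ones, which a plain majority vote would not allow.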

A Protein Sequence Prediction Method by Mining Sequence Data (서열 데이타마이닝을 통한 단백질 서열 예측기법)

  • Cho, Sun-I;Lee, Do-Heon;Cho, Kwang-Hwi;Won, Yong-Gwan;Kim, Byoung-Ki
    • The KIPS Transactions:PartD / v.10D no.2 / pp.261-266 / 2003
  • A protein, which is a linear polymer of amino acids, is one of the most important biomolecules composing biological structures and regulating biochemical reactions. Since the characteristics and functions of proteins are, in principle, determined by their amino acid sequences, protein sequence determination is the starting point of protein function studies. This paper proposes a protein sequence prediction method based on data mining techniques, which can overcome the limitations of previous biochemical sequencing methods. After applying multiple proteases to acquire overlapping protein fragments, we identify candidate fragment sequences by comparing fragment mass values with peptide databases. We propose a method that constructs a multipartite graph and searches for maximal paths to determine the protein sequence by assembling suitable candidate sequences. In addition, experimental results based on the SWISS-PROT database that show the validity of the proposed method are presented.
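
The candidate-identification step, comparing a measured fragment mass with peptides in a database, can be sketched as follows. The sketch uses standard monoisotopic residue masses and a simple absolute tolerance; the tolerance value, the function names, and the in-memory list standing in for a peptide database are assumptions, and the multipartite graph search from the paper is not implemented.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # added once per peptide for the free termini


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS


def match_candidates(observed_mass, peptides, tolerance=0.01):
    """Return the peptides whose mass lies within `tolerance` Da of the observation."""
    return [p for p in peptides if abs(peptide_mass(p) - observed_mass) <= tolerance]
```

Mass alone cannot separate peptides with the same composition in a different order, or leucine from isoleucine, so one observed mass generally yields several candidates. That ambiguity is why the overlapping fragments from multiple proteases are needed to assemble a single sequence.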