• Title/Summary/Keyword: Protein prediction


Enhanced Chemical Shift Analysis for Secondary Structure prediction of protein

  • Kim, Won-Je;Rhee, Jin-Kyu;Yi, Jong-Jae;Lee, Bong-Jin;Son, Woo Sung
    • Journal of the Korean Magnetic Resonance Society / v.18 no.1 / pp.36-40 / 2014
  • Predicting the secondary structure of a protein from assigned backbone chemical shifts is widely used because of its convenience and flexibility. Despite its usefulness, chemical-shift-based analysis has drawbacks, including isotopic shifts and solvent interactions. Here we present a corrected chemical shift analysis for protein secondary structure. It corrects the chemical shifts for the deuterium isotope effect and calculates the chemical shift index using probability-based methods. The enhanced method was applied successfully to a protein from Mycobacterium tuberculosis. The results suggest that correcting the chemical shift analysis could increase the accuracy of secondary structure prediction for proteins and small molecules in solution.
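The abstract above does not give the correction values or the probability-based scoring, so the following is only a minimal sketch of the classic threshold-style chemical shift index on Cα shifts, not the authors' method. The `isotope_correction` parameter, the 0.7 ppm threshold default, and the random-coil values in the usage note are illustrative assumptions.

```python
from typing import Dict, List, Tuple


def secondary_shifts(
    observed: List[Tuple[str, float]],
    random_coil: Dict[str, float],
    isotope_correction: float = 0.0,
) -> List[float]:
    """Return corrected observed shift minus random-coil shift, per residue.

    `observed` holds (residue_type, C-alpha shift in ppm) pairs.
    `isotope_correction` is a hypothetical additive offset standing in for
    a deuterium isotope correction; its real value is not in the abstract.
    """
    return [shift + isotope_correction - random_coil[res] for res, shift in observed]


def csi_labels(deltas: List[float], threshold: float = 0.7) -> List[str]:
    """Label each residue: 'H' (helix-like), 'E' (strand-like) or 'C' (coil).

    For C-alpha, a clearly positive secondary shift points toward helix and
    a clearly negative one toward strand. A full chemical shift index also
    requires runs of consecutive residues, which this sketch omits.
    """
    labels = []
    for delta in deltas:
        if delta > threshold:
            labels.append("H")
        elif delta < -threshold:
            labels.append("E")
        else:
            labels.append("C")
    return labels
```

For example, with illustrative random-coil values `{"ALA": 52.5, "GLY": 45.1}`, the observed shifts `[("ALA", 54.0), ("ALA", 51.0), ("GLY", 45.3)]` are labelled helix-like, strand-like, and coil respectively.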

Prediction of Protein Secondary Structure Using the Weighted Combination of Homology Information of Protein Sequences

  • Chi, Sang-mun
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.9 / pp.1816-1821 / 2016
  • Protein secondary structure is important for the study of protein evolution and of the structure and function of proteins, which play crucial roles in most biological processes. This paper aims to extract protein secondary structure information effectively from a large protein structure database in order to predict the secondary structure of a query protein sequence. To find more remote homologs of the query sequence in the protein database, we used PSI-BLAST, which performs gapped iterative searches and uses profiles built from sequences homologous to the query protein. The secondary structures of the homologous sequences are combined into the secondary structure prediction, each weighted by its relative degree of similarity to the query sequence. When homologous sequences were used together with a neural network predictor, the accuracies were higher than those of current state-of-the-art techniques, achieving a Q3 accuracy of 92.28% and a Q8 accuracy of 88.79%.
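The weighted combination described above can be sketched as a per-position weighted vote. This is a minimal illustration under assumptions: the weights, the gap symbol `-`, the fallback to coil, and the function names are hypothetical, and the paper's actual weighting scheme and neural network predictor are not reproduced here.

```python
from collections import defaultdict
from typing import List, Tuple


def combine_homolog_structures(homologs: List[Tuple[float, str]], length: int) -> str:
    """Combine aligned homolog secondary structures by weighted voting.

    Each homolog is (weight, aligned_ss), where aligned_ss has one state per
    query position ('H', 'E', 'C') or '-' where the homolog has a gap.
    The weight stands in for similarity to the query (an assumption).
    """
    prediction = []
    for pos in range(length):
        votes = defaultdict(float)
        for weight, aligned_ss in homologs:
            state = aligned_ss[pos]
            if state != "-":
                votes[state] += weight
        # Fall back to coil when no homolog covers this position.
        prediction.append(max(votes, key=votes.get) if votes else "C")
    return "".join(prediction)


def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of positions whose three-state label matches."""
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)
```

With homologs `[(0.9, "HHHC"), (0.5, "EEHC"), (0.3, "EE-C")]`, the single closest homolog outweighs the two weaker ones at the first two positions, so the combined prediction is `"HHHC"`.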

NOGSEC: A NOnparametric method for Genome SEquence Clustering

  • 이영복;김판규;조환규
    • Korean Journal of Microbiology / v.39 no.2 / pp.67-75 / 2003
  • One major topic in comparative genomics is predicting functional annotation by classifying protein sequences. Computational approaches to function prediction include protein structure prediction, sequence alignment, and domain or binding site prediction. This paper addresses another computational approach: searching for sets of homologous sequences in a sequence similarity graph. Methods based on similarity graphs need no prior knowledge about the sequences, but depend largely on the researcher's subjective threshold settings. In this paper, we propose a genome sequence clustering method based on iterative testing and graph decomposition, together with a simple method for calculating a strict threshold that has biochemical meaning. The proposed method was applied to known bacterial genome sequences, and the results are presented alongside those of the BAG algorithm. The resulting clusters lack some completeness, but the confidence level is very high and the method does not need user-defined thresholds.
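For context, the threshold-dependent baseline that the abstract criticizes can be sketched as connected components of a similarity graph. This is not NOGSEC itself: the fixed `threshold` argument is exactly the user-defined setting the paper sets out to avoid, and the function name and input layout are assumptions.

```python
from typing import List, Tuple


def cluster_by_similarity(
    n: int, scored_pairs: List[Tuple[int, int, float]], threshold: float
) -> List[List[int]]:
    """Cluster sequences 0..n-1 as connected components of a similarity graph.

    An edge (i, j) exists when the pair's similarity score is >= threshold.
    Components are found with a union-find structure.
    """
    parent = list(range(n))

    def find(x: int) -> int:
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j, score in scored_pairs:
        if score >= threshold:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    clusters = {}
    for node in range(n):
        clusters.setdefault(find(node), []).append(node)
    return sorted(clusters.values())
```

Running it on the same scores with two different thresholds shows how strongly the clusters depend on that choice: the pairs `[(0, 1, 0.9), (1, 2, 0.8), (3, 4, 0.4)]` give three clusters at threshold 0.5 and two at threshold 0.3.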

Sequence driven features for prediction of subcellular localization of proteins

  • Kim, Jong-Kyoung;Choi, Seung-Jin
    • Proceedings of the Korean Information Science Society Conference / 2005.07b / pp.226-228 / 2005
  • Predicting the cellular location of an unknown protein gives valuable information for inferring the possible function of the protein. A more accurate prediction system needs a good feature extraction method that transforms the raw sequence data into a numerical feature vector while minimizing information loss. In this paper we propose new methods that extract underlying features from the sequence data alone by computing pairwise sequence alignment scores. In addition, we use composition-based features to improve prediction accuracy. To construct an SVM ensemble from separately trained SVM classifiers, we propose specificity-based weighted majority voting. The overall prediction accuracy evaluated by 5-fold cross-validation reached 88.53% for the eukaryotic animal data set. By comparing the prediction accuracy of various feature extraction methods, we gained biological insight into the location of targeting information. Our numerical experiments confirm that the new feature extraction methods are very useful for predicting subcellular localization of proteins.
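The abstract does not define its specificity-based weighted majority voting in detail, so the following is one plausible reading, offered as a sketch: each classifier's vote for a class counts in proportion to that classifier's specificity for the class. The function name, the data layout, and the example class labels are assumptions.

```python
from typing import Dict, List


def weighted_majority_vote(
    predictions: List[str], specificity: List[Dict[str, float]]
) -> str:
    """Pick the class with the largest specificity-weighted vote total.

    predictions[k] is the class chosen by classifier k, and
    specificity[k][c] is classifier k's (assumed, pre-computed) specificity
    for class c; a missing entry counts as zero weight.
    """
    totals: Dict[str, float] = {}
    for label, spec in zip(predictions, specificity):
        totals[label] = totals.get(label, 0.0) + spec.get(label, 0.0)
    return max(totals, key=totals.get)
```

Unlike a plain majority vote, one highly specific classifier can outweigh two weak ones: predictions `["nuclear", "cytoplasmic", "cytoplasmic"]` with weights 0.95, 0.40, and 0.45 yield `"nuclear"`.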


Prediction of subcellular localization of proteins using pairwise sequence alignment and support vector machine

  • Kim, Jong-Kyoung;Raghava, G. P. S.;Kim, Kwang-S.;Bang, Sung-Yang;Choi, Seung-Jin
    • Proceedings of the Korean Society for Bioinformatics Conference / 2004.11a / pp.158-166 / 2004
  • Predicting the destination of a protein in a cell gives valuable information for annotating the function of the protein. Recent technological breakthroughs have led to more accurate methods for predicting the subcellular localization of proteins. The most important factor in the accuracy of these methods is how useful features are extracted from protein sequences. We propose a new method that extracts appropriate features from the sequence data alone by computing pairwise sequence alignment scores. A support vector machine (SVM) is used as the classifier. The overall prediction accuracy evaluated by the jackknife validation technique reached 94.70% for the eukaryotic non-plant data set and 92.10% for the eukaryotic plant data set, the highest prediction accuracy among methods reported so far on these data sets. Our numerical results confirm that the feature extraction method based on pairwise sequence alignment is useful for this classification problem.
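The feature extraction idea, representing a protein by its alignment scores against a set of reference sequences, can be sketched as follows. The abstract does not name the alignment algorithm or scoring scheme, so the global (Needleman-Wunsch style) alignment and the match, mismatch, and gap values here are assumptions chosen for brevity. The resulting vectors would then be passed to a classifier such as an SVM.

```python
from typing import List


def global_alignment_score(
    a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2
) -> int:
    """Score of the best global alignment of a and b (linear gap penalty)."""
    # prev[j] holds the best score for aligning the processed prefix of a
    # with b[:j]; only one previous row of the table is kept.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def alignment_features(query: str, references: List[str]) -> List[int]:
    """Feature vector: one alignment score per reference sequence."""
    return [global_alignment_score(query, ref) for ref in references]
```

For instance, `alignment_features("ACD", ["ACD", "AD"])` gives one score per reference, with the identical sequence scoring highest.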


Structural Analysis of Recombinant Human Preproinsulins by Structure Prediction, Molecular Dynamics, and Protein-Protein Docking

  • Jung, Sung Hun;Kim, Chang-Kyu;Lee, Gunhee;Yoon, Jonghwan;Lee, Minho
    • Genomics & Informatics / v.15 no.4 / pp.142-146 / 2017
  • More effective production of human insulin is important because insulin is the main medication used to treat multiple types of diabetes and because many people suffer from diabetes. The current system of insulin production is based on recombinant DNA technology, and the expression vector contains a preproinsulin sequence that is a fusion of an artificial leader peptide and native proinsulin. It has been reported that the sequence of the leader peptide affects the production of insulin. To analyze structurally how the leader peptide affects the maturation of insulin, we applied several in silico simulations to 13 artificial proinsulin sequences. Three-dimensional structures of the models were predicted and compared. Although their sequences differed only slightly, the predicted structures were somewhat different. The structures were refined by molecular dynamics simulation, and the energy of each model was estimated. Protein-protein docking between the models and trypsin was then carried out to compare how efficiently the protease could access the cleavage sites of the proinsulin models. The results showed some concordance with previously reported experimental results, so we expect that our analysis can be used to predict an optimized artificial proinsulin sequence for more effective production.

Computational approaches for prediction of protein-protein interaction between Foot-and-mouth disease virus and Sus scrofa based on RNA-Seq

  • Park, Tamina;Kang, Myung-gyun;Nah, Jinju;Ryoo, Soyoon;Wee, Sunghwan;Baek, Seung-hwa;Ku, Bokkyung;Oh, Yeonsu;Cho, Ho-seong;Park, Daeui
    • Korean Journal of Veterinary Service / v.42 no.2 / pp.73-83 / 2019
  • Foot-and-mouth disease (FMD) is a highly contagious transboundary viral disease caused by FMD virus (FMDV), which causes huge economic losses. FMDV infects cloven-hoofed (two-toed) mammals such as cattle, sheep, goats, pigs, and various wildlife species. To control FMDV, it is necessary to understand its life cycle and pathogenesis in the host. In particular, the protein-protein interactions between FMDV and the host will help to understand the survival cycle of the virus in host cells and to establish new therapeutic strategies. However, computational approaches to protein-protein interaction between FMDV and pig hosts have not been applied to studies of the onset mechanism of FMDV. In the present work, we predicted the pig proteins that interact with FMDV based on RNA-Seq data, protein sequence, and structure information. After identifying the virus-host interactions, we looked for meaningful pathways and anticipated changes in the host caused by FMDV infection. A total of 78 pig proteins were predicted to interact with FMDV. The 156 predicted interactions comprise 94 predicted by a sequence-based method and 62 predicted by a structure-based method using domain information. The protein interaction network contained integrin as well as STYK1, VTCN1, IDO1, CDH3, SLA-DQB1, FER, and FGFR2, which were related to the up-regulation of inflammation and the down-regulation of cell adhesion and host defense systems such as macrophages and leukocytes. These results provide clues to the mechanism by which FMDV affects the host cell.

A QUADRATIC APPROXIMATION FOR PROTEIN SEQUENCE TO STRUCTURE MAPPING

  • Oh, Se-Young;Yun, Jae-Heon;Chung, Sei-Young
    • Journal of applied mathematics & informatics / v.12 no.1_2 / pp.155-164 / 2003
  • A method is proposed to predict the distances between given residue pairs (between $C_{\alpha}$ atoms) of a protein using a sequence-to-structure mapping by indefinite quadratic approximation. The prediction technique requires data fitting in three-dimensional space with the coordinates of the residues of proteins of known structure, and leads to a numerical representation of the 20 amino acids by iteratively minimizing a large least norm. These approximations are used in distance prediction for given residue pairs. Some computational experience on a test set of small proteins from the Brookhaven Protein Data Bank is given.

Real Protein Prediction in an Off-Lattice BLN Model via Annealing Contour Monte Carlo

  • Cheon, Soo-Young
    • The Korean Journal of Applied Statistics / v.22 no.3 / pp.627-634 / 2009
  • Recently, annealing contour Monte Carlo (ACMC) was proposed by Liang (2004) as a space-annealing version of the general contour Monte Carlo algorithm for optimization problems. The algorithm can be applied successfully to determine ground configurations in protein folding prediction. In this approach, we use the distances between consecutive $C_{\alpha}$ atoms along the peptide chain and a mapping from the 20-letter amino acid alphabet to a coarse-grained three-letter code. The algorithm was tested on real proteins. The comparison showed that the algorithm made a significant improvement over simulated annealing (SA) and the Metropolis Monte Carlo method in determining the ground configurations.
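For reference, the simulated annealing baseline mentioned above can be sketched generically: a Metropolis acceptance rule with a temperature that decreases over time. This is not ACMC and not the BLN energy model; the geometric cooling schedule, parameter defaults, and the `energy` and `propose` callables are assumptions for illustration.

```python
import math
import random
from typing import Callable, Tuple, TypeVar

State = TypeVar("State")


def simulated_annealing(
    energy: Callable[[State], float],
    propose: Callable[[State, random.Random], State],
    x0: State,
    t_start: float = 1.0,
    t_end: float = 0.01,
    steps: int = 2000,
    seed: int = 0,
) -> Tuple[State, float]:
    """Minimize `energy` by simulated annealing; returns the best state seen.

    Uses a geometric cooling schedule from t_start down to t_end and
    requires steps >= 2.
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for k in range(steps):
        temperature = t_start * (t_end / t_start) ** (k / (steps - 1))
        candidate = propose(x, rng)
        e_candidate = energy(candidate)
        # Metropolis rule: always accept downhill moves, and accept uphill
        # moves with probability exp(-dE / T).
        if e_candidate <= e or rng.random() < math.exp(-(e_candidate - e) / temperature):
            x, e = candidate, e_candidate
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e
```

A toy usage, with a one-dimensional quadratic standing in for a protein energy function: `simulated_annealing(lambda x: (x - 3) ** 2, lambda x, rng: x + rng.uniform(-0.5, 0.5), 0.0)`.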