• Title/Abstract/Keywords: protein structure

Search results: 1,715 items (processing time: 0.029 s)

Protein Backbone Torsion Angle-Based Structure Comparison and Secondary Structure Database Web Server

  • Jung, Sunghoon;Bae, Se-Eun;Ahn, Insung;Son, Hyeon S.
    • Genomics & Informatics
    • /
    • Vol. 11, No. 3
    • /
    • pp.155-160
    • /
    • 2013
  • Structural information has been a major concern for biological and pharmaceutical studies for its intimate relationship to the function of a protein. Three-dimensional representation of the positions of protein atoms is utilized among many structural information repositories that have been published. The reliability of the torsional system, which represents the native processes of structural change in the structural analysis, was partially proven with previous structural alignment studies. Here, a web server providing structural information and analysis based on the backbone torsional representation of a protein structure is newly introduced. The web server offers functions of secondary structure database search, secondary structure calculation, and pair-wise protein structure comparison, based on a backbone torsion angle representation system. Application of the implementation in pair-wise structural alignment showed highly accurate results. The information derived from this web server might be further utilized in the field of ab initio protein structure modeling or protein homology-related analyses.
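
The backbone torsion angle representation described in this abstract can be illustrated with a minimal sketch. It assumes atom coordinates are already available as (x, y, z) tuples. The function names `dihedral`, `angle_diff`, and `torsion_distance` are hypothetical, and the mean angular difference used here is only an illustrative comparison score, not the scoring scheme of the web server.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four 3D points.

    For phi the four atoms are C(i-1), N(i), CA(i), C(i);
    for psi they are N(i), CA(i), C(i), N(i+1).
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)  # normals of the two planes
    b2_len = math.sqrt(dot(b2, b2))
    # atan2 form keeps the sign and is stable near 0 and 180 degrees.
    return math.degrees(math.atan2(b2_len * dot(b1, n2), dot(n1, n2)))

def angle_diff(a, b):
    """Smallest signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def torsion_distance(angles_a, angles_b):
    """Mean absolute angular difference over aligned (phi, psi) pairs.

    Hypothetical score for illustration; assumes the two lists are
    already aligned residue by residue.
    """
    if len(angles_a) != len(angles_b) or not angles_a:
        raise ValueError("need two non-empty, equal-length angle lists")
    total = 0.0
    for (phi_a, psi_a), (phi_b, psi_b) in zip(angles_a, angles_b):
        total += abs(angle_diff(phi_a, phi_b)) + abs(angle_diff(psi_a, psi_b))
    return total / (2 * len(angles_a))
```

Wrapping the difference matters because torsion angles are periodic: 170° and -170° differ by 20°, not 340°.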

Prediction of Protein Secondary Structure Using the Weighted Combination of Homology Information of Protein Sequences

  • 지상문
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • Vol. 20, No. 9
    • /
    • pp.1816-1821
    • /
    • 2016
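  • Proteins play critical roles in most biological processes, so much research is devoted to understanding protein evolution, structure, and function, and protein secondary structure is important basic information for this research. This study aims to predict the secondary structure of an unknown protein sequence by effectively extracting secondary structure information from large-scale protein structure data. To find as many sequences in the protein structure data as possible that are homologous to the query sequence, we used PSI-BLAST, which builds its search profile from sequences similar to the query and supports iterative, gapped searches. The secondary structures of the homologous proteins were weighted according to the strength of their homology to the query sequence and contributed to the prediction accordingly. In prediction experiments that classify secondary structure into three and eight states, using the homologous sequences together with a neural network achieved accuracies of 93.28% and 88.79%, respectively, an improvement over existing methods.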
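
The weighting step described in this abstract can be sketched as a per-residue weighted vote. This is a simplified illustration, not the paper's method: it assumes each homologous hit has already been aligned to the query and assigned a weight (for example, one derived from its PSI-BLAST score), and it omits the neural network component. The function name `weighted_vote` and the input layout are hypothetical.

```python
from collections import defaultdict

def weighted_vote(query_len, hits):
    """Predict one secondary structure label per query position.

    hits: list of (weight, aligned) pairs, where `aligned` maps a
    0-based query position to the label observed in the homolog
    (e.g. 'H', 'E', 'C'). Positions with no aligned homolog get '-'.
    """
    scores = [defaultdict(float) for _ in range(query_len)]
    for weight, aligned in hits:
        for pos, label in aligned.items():
            scores[pos][label] += weight  # stronger homologs count more
    return "".join(max(s, key=s.get) if s else "-" for s in scores)

# Hypothetical example: three homologs covering parts of a 5-residue query.
example_hits = [
    (0.9, {0: "H", 1: "H", 2: "H"}),
    (0.4, {1: "E", 2: "E", 3: "E"}),
    (0.6, {2: "E", 3: "C"}),
]
prediction = weighted_vote(5, example_hits)
```

At position 2 the two weaker hits together (0.4 + 0.6) outweigh the single strong hit (0.9), so the vote there goes to 'E'.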

The Study of Protein Structure Visualization and Rendering Speed Using the Geometry Instancing

  • 박찬용;황치정
    • The KIPS Transactions: Part A
    • /
    • Vol. 16A, No. 3
    • /
    • pp.153-158
    • /
    • 2009
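  • Structural bioinformatics studies proteins through their three-dimensional structures, and visualization of protein 3D structure is one of its important areas. As instruments for determining protein 3D structure have advanced, the size and number of solved proteins have grown, and with them the need for high-performance protein visualization systems. Existing protein structure visualization systems, however, were not optimized for 3D graphics hardware and did not perform well enough to visualize very large proteins. The protein 3D structure visualization system proposed in this paper renders very large proteins quickly by using geometry instancing, one of the optimization techniques for 3D graphics hardware. In performance experiments covering seven proteins of different sizes and four visualization methods, we compared the rendering performance of the proposed system with that of existing systems, and the proposed system performed better in most cases.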

Structure-based Functional Discovery of Proteins: Structural Proteomics

  • Jung, Jin-Won;Lee, Weon-Tae
    • BMB Reports
    • /
    • Vol. 37, No. 1
    • /
    • pp.28-34
    • /
    • 2004
  • The discovery of biochemical and cellular functions of unannotated gene products begins with a database search of proteins with structure/sequence homologues based on known genes. Very recently, a number of frontier groups in structural biology proposed a new paradigm to predict biological functions of an unknown protein on the basis of its three-dimensional structure on a genomic scale. Structural proteomics (genomics), a research area for structure-based functional discovery, aims to complete the protein-folding universe of all gene products in a cell. It would lead us to a complete understanding of a living organism from protein structure. Two major complementary experimental techniques, X-ray crystallography and NMR spectroscopy, combined with recently developed high throughput methods have played a central role in structural proteomics research; however, an integration of these methodologies together with comparative modeling and electron microscopy would speed up the goal for completing a full dictionary of protein folding space in the near future.

Computational Approaches for Structural and Functional Genomics

  • Brenner, Steven-E.
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, 2000 International Symposium on Bioinformatics
    • /
    • pp.17-20
    • /
    • 2000
  • Structural genomics aims to provide a good experimental structure or computational model of every tractable protein in a complete genome. Underlying this goal is the immense value of protein structure, especially in permitting recognition of distant evolutionary relationships for proteins whose sequence analysis has failed to find any significant homolog. A considerable fraction of the genes in all sequenced genomes have no known function, and structure determination provides a direct means of revealing homology that may be used to infer their putative molecular function. The solved structures will be similarly useful for elucidating the biochemical or biophysical role of proteins that have been previously ascribed only phenotypic functions. More generally, knowledge of an increasingly complete repertoire of protein structures will aid structure prediction methods, improve understanding of protein structure, and ultimately lend insight into molecular interactions and pathways. We use computational methods to select families whose structures cannot be predicted and which are likely to be amenable to experimental characterization. Methods to be employed included modern sequence analysis and clustering algorithms. A critical component is consultation of the presage database for structural genomics, which records the community's experimental work underway and computational predictions. The protein families are ranked according to several criteria including taxonomic diversity and known functional information. Individual proteins, often homologs from hyperthermophiles, are selected from these families as targets for structure determination. The solved structures are examined for structural similarity to other proteins of known structure. Homologous proteins in sequence databases are computationally modeled, to provide a resource of protein structure models complementing the experimentally solved protein structures.

The Grammatical Structure of Protein Sequences

  • Bystroff, Chris
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, 2000 International Symposium on Bioinformatics
    • /
    • pp.28-31
    • /
    • 2000
  • We describe a hidden Markov model, HMMSTR, for general protein sequence based on the I-sites library of sequence-structure motifs. Unlike the linear HMMs used to model individual protein families, HMMSTR has a highly branched topology and captures recurrent local features of protein sequences and structures that transcend protein family boundaries. The model extends the I-sites library by describing the adjacencies of different sequence-structure motifs as observed in the database, and achieves a great reduction in parameters by representing overlapping motifs in a much more compact form. The HMM attributes a considerably higher probability to coding sequence than does an equivalent dipeptide model, predicts secondary structure with an accuracy of 74.6% and backbone torsion angles better than any previously reported method, and predicts the structural context of beta strands and turns with an accuracy that should be useful for tertiary structure prediction. HMMSTR has been incorporated into a public, fully-automated protein structure prediction server.
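
The probability that an HMM attributes to a sequence, as mentioned in this abstract, is conventionally computed with the forward algorithm. The sketch below shows that calculation on a toy two-state model. The states, the two-letter alphabet, and all parameter values are hypothetical and bear no relation to the actual HMMSTR topology or parameters.

```python
def forward_probability(obs, states, start_p, trans_p, emit_p):
    """Total probability of `obs` under an HMM, summed over all state paths."""
    # Initialisation: start in each state and emit the first symbol.
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    # Recursion: extend every partial path by one transition and one emission.
    for symbol in obs[1:]:
        alpha = {
            s: emit_p[s][symbol] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    # Termination: any state may be the last one (no explicit end state).
    return sum(alpha.values())

# Toy model (all values are illustrative assumptions).
STATES = ("helix", "strand")
START_P = {"helix": 0.6, "strand": 0.4}
TRANS_P = {
    "helix": {"helix": 0.8, "strand": 0.2},
    "strand": {"helix": 0.3, "strand": 0.7},
}
EMIT_P = {
    "helix": {"A": 0.7, "V": 0.3},
    "strand": {"A": 0.2, "V": 0.8},
}

p_seq = forward_probability("AAV", STATES, START_P, TRANS_P, EMIT_P)
```

For long sequences the raw probabilities underflow, so practical implementations work in log space or rescale `alpha` at each step; this sketch skips that for brevity.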

A Protein Structure Comparison System based on PSAML

  • 김진홍;안건태;변상희;이수현;이명준
    • Journal of KIISE: Computing Practices and Letters
    • /
    • Vol. 11, No. 2
    • /
    • pp.133-148
    • /
    • 2005
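  • Understanding the similarities and specificities of protein structures plays an important role in determining protein function, so many systems for comparing protein structures have been developed. Each of these systems, however, must preprocess the data provided by the PDB to suit its own comparison algorithm. Moreover, as the amount of data stored in the PDB grows, systems that search a large protein structure database for substructures similar to a given protein require ever more computation. This paper introduces WS4E (A Web-Based Searching Substructures of Secondary Structure Elements), a protein structure comparison system built on a PSAML database that serves PSAML documents from the XML database eXist. PSAML (Protein Structure Abstraction Markup Language) is an XML-based representation that describes a protein structure in a formalized way using its secondary structure elements and the relationships among them. Using this PSAML database, WS4E provides a web service that finds similar substructures in protein structures expressed in PSAML. In addition, to reduce the number of proteins to be compared in the PSAML database, the system uses the topology string, a technique that represents a protein structure using the spatial information of its secondary structures.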

Reviving GOR method in protein secondary structure prediction: Effective usage of evolutionary information

  • Lee, Byung-Chul;Lee, Chang-Jun;Kim, Dong-Sup
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, Proceedings of the 2nd Annual Conference, 2003
    • /
    • pp.133-138
    • /
    • 2003
  • The prediction of protein secondary structure has been an important bioinformatics tool that is an essential component of the template-based protein tertiary structure prediction process. It has been known that the predicted secondary structure information improves both the fold recognition performance and the alignment accuracy. In this paper, we describe several novel ideas that may improve the prediction accuracy. The main idea is motivated by an observation that the protein's structural information, especially when it is combined with the evolutionary information, significantly improves the accuracy of the predicted tertiary structure. From the non-redundant set of protein structures, we derive the 'potential' parameters for the protein secondary structure prediction that contains the structural information of proteins, by following the procedure similar to the way to derive the directional information table of GOR method. Those potential parameters are combined with the frequency matrices obtained by running PSI-BLAST to construct the feature vectors that are used to train the support vector machines (SVM) to build the secondary structure classifiers. Moreover, the problem of huge model file size, which is one of the known shortcomings of SVM, is partially overcome by reducing the size of training data by filtering out the redundancy not only at the protein level but also at the feature vector level. A preliminary result measured by the average three-state prediction accuracy is encouraging.
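
The three-state prediction accuracy mentioned in this abstract is commonly reported as the percentage of residues whose predicted state matches the observed one. A minimal sketch of that calculation follows. The eight-to-three state reduction shown (H, G, I to H; E, B to E; everything else to C) is one widely used convention and is an assumption here, since the abstract does not state which reduction the authors applied. The names `reduce_to_three` and `q3_accuracy` are hypothetical.

```python
# One common eight-to-three state reduction (an assumption, see lead-in).
EIGHT_TO_THREE = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}

def reduce_to_three(ss8):
    """Map an eight-state secondary structure string to three states (H/E/C)."""
    return "".join(EIGHT_TO_THREE.get(label, "C") for label in ss8)

def q3_accuracy(predicted, observed):
    """Percentage of positions where the predicted three-state label is correct."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)

# Hypothetical example: five of six residues are predicted correctly.
example_q3 = q3_accuracy("HHHECC", reduce_to_three("HHEETS"))
```

An average over a test set can be taken per protein or per residue; the two give different numbers when protein lengths vary, so reports should say which one is used.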

Minimally Complex Problem Set for an Ab initio Protein Structure Prediction Study

  • Kim RyangGug;Choi Cha-Yong
    • Biotechnology and Bioprocess Engineering:BBE
    • /
    • Vol. 9, No. 5
    • /
    • pp.414-418
    • /
    • 2004
  • A 'minimally complex problem set' for ab initio protein structure prediction has been proposed. As well as consisting of non-redundant and crystallographically determined high-resolution protein structures, without disulphide bonds, modified residues, unusual connectivities and heteromolecules, it is more importantly a collection of protein structures with a high probability of being the same in the crystal form as in solution. To our knowledge, this is the first attempt at this kind of dataset. Considering the lattice constraint in crystals, and the possible flexibility in solution of crystallographically determined protein structures, our dataset is thought to be the safest starting point for an ab initio protein structure prediction study.

In silico annotation of a hypothetical protein from Listeria monocytogenes EGD-e unfolds a toxin protein of the type II secretion system

  • Maisha Tasneem;Shipan Das Gupta;Monira Binte Momin;Kazi Modasser Hossain;Tasnim Binta Osman;Fazley Rabbi
    • Genomics & Informatics
    • /
    • Vol. 21, No. 1
    • /
    • pp.7.1-7.11
    • /
    • 2023
  • The gram-positive bacterium Listeria monocytogenes is an important foodborne intracellular pathogen that is widespread in the environment. The functions of hypothetical proteins (HP) from various pathogenic bacteria have been successfully annotated using a variety of bioinformatics strategies. In this study, the HP lmo0888 (NP_464414.1) from the Listeria monocytogenes EGD-e strain was annotated using several bioinformatics tools. Various subcellular localization tools, including CELLO, PSORTb, and SOSUIGramN, identified the candidate protein as cytoplasmic. Domain and motif analysis revealed that the target protein is a PemK/MazF-like toxin protein of the type II toxin-antitoxin system (TAS), which was consistent with BLASTp analysis. Secondary structure analysis showed that the random coil was the most frequent element. The AlphaFold 2 Protein Structure Prediction Database was used to determine the three-dimensional (3D) structure of the HP using the template structure of a type II TAS PemK/MazF family toxin protein (DB ID_AFDB: A0A4B9HQB9) with 99.1% sequence identity. Various quality evaluation tools, such as PROCHECK, ERRAT, Verify 3D, and QMEAN, were used to validate the 3D structure. Following the YASARA energy minimization method, the target protein's 3D structure became more stable. The active site of the developed 3D structure was determined by the CASTp server. Most pathogens that harbor TAS pose a serious risk to human health. Our annotation of the HP lmo0888 found in Listeria could offer a chance to understand bacterial pathogenicity and identify a number of potential targets for drug development.