• Title/Abstract/Keywords: protein sequences

Search results: 1,067 items (processing time: 0.028 s)

Trend and Technology of Gene and Genome Research

  • 이진성;김기환;서동상;강석우;황재삼
    • 한국잠사곤충학회지 / Vol. 42, No. 2 / pp.126-141 / 2000
  • A major step toward understanding the genetic basis of an organism is determining the complete sequence of all genes in the target genome. The nucleotide sequence encoded in the genome contains the information that specifies the amino acid sequence of every protein and the sequence of every functional RNA molecule. In principle, this makes it possible to identify every protein responsible for the structure and function of the target organism. The pattern of expression in different cell types specifies where and when each protein is used. The amino acid sequence of the protein encoded by each gene is derived from conceptual translation of the nucleotide sequence. Comparing these sequences with those of known proteins stored in databases suggests an approximate function for many proteins. This mini-review describes the development of new sequencing methods and the optimization of sequencing strategies for whole-genome, cDNA, and genomic analysis.
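
The conceptual translation mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard genetic code, a single forward reading frame starting at the first base, and translation that stops at the first stop codon. The names `CODON_TABLE` and `translate` are illustrative and do not come from the review.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 0 until the first stop codon.

    Codons containing characters other than A, C, G, T map to 'X'.
    A trailing partial codon is ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCTAA"))  # MA
```

The resulting protein string is what would then be compared against a protein database to suggest a function.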

Cloning and Nucleotide Sequencing of the secE Gene from Streptomyces lividans

  • 김순옥;서주원
    • 한국미생물·생명공학회지 / Vol. 25, No. 3 / pp.253-257 / 1997
  • The secE gene of Streptomyces lividans TK24 was cloned by the polymerase chain reaction method with synthetic oligonucleotide primers designed on the basis of the nucleotide sequence of the Streptomyces coelicolor secE-nusG-rplK operon. The deduced amino acid sequence of SecE was highly homologous to those of other known SecE proteins, showing 36.8%, 30.4%, 80.0%, and 80.9% similarity to the SecE proteins of E. coli, Bacillus subtilis, Streptomyces griseus, and Streptomyces virginiae, respectively, and was identical to that of Streptomyces coelicolor SecE. This suggests that, in spite of evolutionary differences, the genes for the protein translocation machinery are highly conserved in eubacteria. The gene organization of secE-nusG-rplK is also similar to that of E. coli, B. subtilis, and other streptomycetes.

Algorithm for Predicting Functionally Equivalent Proteins from BLAST and HMMER Searches

  • Yu, Dong Su;Lee, Dae-Hee;Kim, Seong Keun;Lee, Choong Hoon;Song, Ju Yeon;Kong, Eun Bae;Kim, Jihyun F.
    • Journal of Microbiology and Biotechnology / Vol. 22, No. 8 / pp.1054-1058 / 2012
  • To predict biologically significant attributes such as function from protein sequences, searching large databases for homologous proteins is common practice. In particular, BLAST and HMMER are widely used in a variety of biological fields. However, sequence-homologous proteins determined by BLAST and proteins having the same domains predicted by HMMER are not always functionally equivalent, even though their sequences align with high similarity. Thus, accurate assignment of functionally equivalent proteins from aligned sequences remains a challenge in bioinformatics. We have developed the FEP-BH algorithm to predict functionally equivalent proteins from protein-protein pairs identified by BLAST and from protein-domain pairs predicted by HMMER. When examined against domain classes of the Pfam-A seed database, FEP-BH showed 71.53% accuracy, whereas BLAST and HMMER showed 57.72% and 36.62%, respectively. We expect that the FEP-BH algorithm will be effective in predicting functionally equivalent proteins from BLAST and HMMER outputs and will also suit biologists who want to find functionally equivalent proteins among sequence-homologous proteins.

Draft Genome of Toxocara canis, a Pathogen Responsible for Visceral Larva Migrans

  • Kong, Jinhwa;Won, Jungim;Yoon, Jeehee;Lee, UnJoo;Kim, Jong-Il;Huh, Sun
    • Parasites, Hosts and Diseases / Vol. 54, No. 6 / pp.751-758 / 2016
  • This study aimed to construct a draft genome of the adult female worm Toxocara canis using next-generation sequencing (NGS) and de novo assembly, and to find new genes after annotation using functional genomics tools. Using an NGS machine, we produced DNA read data of T. canis. The de novo assembly of the read data was performed using SOAPdenovo. RNA read data were assembled using Trinity. Structural annotation, homology search, functional annotation, classification of protein domains, and KEGG pathway analysis were carried out. In addition, recently developed tools such as MAKER, PASA, Evidence Modeler, and Blast2GO were used. For the scaffold DNA obtained, the N50 was 108,950 bp and the overall length was 341,776,187 bp. The N50 of the transcriptome was 940 bp, and its length was 53,046,952 bp. The GC content of the entire genome was 39.3%. The total number of genes was 20,178, and the total number of protein sequences was 22,358. Of the 22,358 protein sequences, 4,992 were newly observed in T. canis. The following previously unknown proteins were found: E3 ubiquitin-protein ligase cbl-b and antigen T-cell receptor, zeta chain for T-cell and B-cell regulation; endoprotease bli-4 for cuticle metabolism; mucin 12Ea and polymorphic mucin variant C6/1/40r2.1 for mucin production; tropomodulin-family protein and ryanodine receptor calcium release channels for muscle movement. We were able to find new hypothetical polypeptide sequences unique to T. canis, and the findings of this study can serve as a basis for extending our biological understanding of T. canis.
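
The N50 figures quoted in this abstract follow a standard assembly statistic. Sort the sequences from longest to shortest and accumulate their lengths; N50 is the length of the sequence at which the running total first reaches half of the total assembly length. The sketch below assumes that common definition. The function name `n50` and the toy lengths are illustrative, and the authors' tools may handle edge cases differently.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("n50() needs at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy example: total length 54, and 10 + 9 + 8 = 27 reaches half of it.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # 8
```

A higher N50 at a similar total length generally indicates a more contiguous assembly. That is why the scaffold N50 (108,950 bp) is reported alongside the overall length.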

Structural Analysis of Recombinant Human Preproinsulins by Structure Prediction, Molecular Dynamics, and Protein-Protein Docking

  • Jung, Sung Hun;Kim, Chang-Kyu;Lee, Gunhee;Yoon, Jonghwan;Lee, Minho
    • Genomics & Informatics / Vol. 15, No. 4 / pp.142-146 / 2017
  • More effective production of human insulin is important because insulin is the main medication used to treat multiple types of diabetes, and many people suffer from diabetes. The current system of insulin production is based on recombinant DNA technology, and the expression vector carries a preproinsulin sequence that is a fusion of an artificial leader peptide and native proinsulin. It has been reported that the sequence of the leader peptide affects the production of insulin. To analyze structurally how the leader peptide affects the maturation of insulin, we applied several in silico simulations to 13 artificial proinsulin sequences. Three-dimensional structures of the models were predicted and compared. Although their sequences had few differences, the predicted structures were somewhat different. The structures were refined by molecular dynamics simulation, and the energy of each model was estimated. Then, protein-protein docking between the models and trypsin was carried out to compare how efficiently the protease could access the cleavage sites of the proinsulin models. The results showed some concordance with previously reported experimental results, so we expect that our analysis can be used to predict the optimized sequence of artificial proinsulin for more effective production.

N-Terminal Amino Acid Sequences of Receptor-Like Proteins that Bind to preS1 of HBV in HepG2 Cells

  • Lee, Dong-Gun;Liu, Ming-Zhu;Kim, Kil-Lyong;Hahm, Kyung-Soo
    • BMB Reports / Vol. 29, No. 2 / pp.180-182 / 1996
  • One of the essential functions of virus surface proteins is the recognition of specific receptors on target cell membranes, and cellular receptors play an important role in viral pathogenesis. However, the earliest steps of hepatitis B virus (HBV) infection, such as the interaction of hepatocyte receptors with the virus, are poorly understood. Previous work has suggested an important role for the preS1 region of the HBV envelope protein in mediating viral binding to hepatocytes. Although HBV infection appears to be initiated by specific binding of virions to cell membrane structures via one or potentially several viral surface proteins, data showing the identification or isolation of the HBV receptor(s) are not yet available. The receptor-like proteins on the plasma membrane surface of HepG2 cells that bind to preS1 were separated and identified using affinity chromatography, and the amino-terminal amino acid sequences of the receptor-like proteins were determined.

Antibiotic resistance in Neisseria gonorrhoeae: broad-spectrum drug target identification using subtractive genomics

  • Umairah Natasya Mohd Omeershffudin;Suresh Kumar
    • Genomics & Informatics / Vol. 21, No. 1 / pp.5.1-5.13 / 2023
  • Neisseria gonorrhoeae is a Gram-negative aerobic diplococcus bacterium that primarily causes sexually transmitted infections through direct human sexual contact. It is a major public health threat due to its impact on reproductive health, the widespread presence of antimicrobial resistance, and the lack of a vaccine. In this study, we used a bioinformatics approach and performed subtractive genomic methods to identify potential drug targets against the core proteome of N. gonorrhoeae (12 strains). In total, 12,300 protein sequences were retrieved, and paralogous proteins were removed using CD-HIT. The remaining sequences were analyzed for non-homology against the human proteome and gut microbiota, and screened for broad-spectrum analysis, druggability, and anti-target analysis. The proteins were also characterized for unique interactions between the host and pathogen through metabolic pathway analysis. Based on the subtractive genomic approach and subcellular localization, we identified one cytoplasmic protein, 2Fe-2S iron-sulfur cluster binding domain-containing protein (NGFG_RS03485), as a potential drug target. This protein could be further exploited for drug development to create new medications and therapeutic agents for the treatment of N. gonorrhoeae infections.

Analysis of Partial cDNA Sequence from Human Fetal Liver

  • Kim, Jae-Wha;Song, Jae-Chan;Lee, In-Ae;Lee, Young-Hee;Nam, Myoung-Soo;Hahn, Yoon-Soo;Chung, Jae-Hoon;Choe, In-Seong
    • BMB Reports / Vol. 28, No. 5 / pp.402-407 / 1995
  • Single-run partial cDNA sequencing was conducted on 1,592 randomly selected human fetal liver cDNA clones of Korean origin to isolate novel genes related to liver functions. Each partial cDNA sequence determined was analyzed by comparing it with the databases GenBank, Protein Information Resource (PIR), and SWISS-PROT Protein Sequence Data Bank. From the set of 1,592 cDNA clones reported here, 1,433 (90.0% of the total) were informative cDNA sequences. The other 159 clones were identified as DNA sequences that had originated from the cloning vector. Among the 1,433 informative partial cDNA sequences, 851 (59.3%) clones were identical to known human genes. These known genes have been classified into 225 different kinds of genes. In addition, 340 clones (23.7%) showed various degrees of homology to previously known human genes. Ninety-four (6.6%) clones contained various repeated sequences. Twenty-four (1.7%) partial cDNA sequences were found to have considerable homology to known genes from evolutionarily distant organisms such as yeast, rice, Arabidopsis, mouse, and rat, based on database matches, whereas 124 (8.7%) had no significant matches. Human homologues of functionally characterized genes from different organisms could be classified as candidates for novel human genes of similar functions. Information from the partial cDNA sequences in this study may facilitate the analysis of genes expressed in human fetal liver.
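
The clone counts in this abstract can be cross-checked with a short sketch. The category labels below are paraphrases of the abstract, not the authors' terms. Shares are rounded to one decimal place, so a recomputed value may differ from the printed percentage in the last digit.

```python
total_clones = 1592
vector_derived = 159
informative = total_clones - vector_derived  # clones carrying usable cDNA sequence

# Breakdown of the informative clones, as counted in the abstract.
categories = {
    "identical to known human genes": 851,
    "homologous to known human genes": 340,
    "repeated sequences": 94,
    "homologous to genes of distant organisms": 24,
    "no significant match": 124,
}

# The five categories should account for every informative clone.
assert sum(categories.values()) == informative

shares = {
    name: round(100 * count / informative, 1)
    for name, count in categories.items()
}
informative_share = round(100 * informative / total_clones, 1)

for name, share in shares.items():
    print(f"{name}: {share}%")
print(f"informative clones: {informative} ({informative_share}% of total)")
```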

IMPLEMENTATION OF SUBSEQUENCE MAPPING METHOD FOR SEQUENTIAL PATTERN MINING

  • Trang, Nguyen Thu;Lee, Bum-Ju;Lee, Heon-Gyu;Ryu, Keun-Ho
    • 대한원격탐사학회: Conference Proceedings / 대한원격탐사학회 2006 Proceedings of ISRS 2006 PORSEC Volume II / pp.627-630 / 2006
  • Sequential pattern mining addresses the problem of discovering the maximal frequent sequences in a given database. In daily and scientific life, sequential data are available and used everywhere in forms such as text, weather data, satellite data streams, business transactions, telecommunications records, experimental runs, DNA sequences, and medical histories. Discovering sequential patterns can assist users or scientists in predicting upcoming activities, interpreting recurring phenomena, or extracting similarities. To that end, the core of sequential pattern mining is finding the sequences that occur frequently across the data sequences. Besides discovering frequent itemsets, sequential pattern mining requires arranging those itemsets into sequences and discovering which of those sequences are frequent. So, before mining sequences, the main task is checking whether one sequence is a subsequence of another sequence in the database. In this paper, we implement the subsequence matching method as the preprocessing step for sequential pattern mining. The matched sequences in our implementation are normalized sequences in the form of number chains. The result given by this method is an overview of the matching information between the input mapped sequences.

Implementation of Subsequence Mapping Method for Sequential Pattern Mining

  • Trang Nguyen Thu;Lee Bum-Ju;Lee Heon-Gyu;Park Jeong-Seok;Ryu Keun-Ho
    • 대한원격탐사학회지 / Vol. 22, No. 5 / pp.457-462 / 2006
  • Sequential pattern mining addresses the problem of discovering the maximal frequent sequences in a given database. In daily and scientific life, sequential data are available and used everywhere in forms such as text, weather data, satellite data streams, business transactions, telecommunications records, experimental runs, DNA sequences, and medical histories. Discovering sequential patterns can assist users or scientists in predicting upcoming activities, interpreting recurring phenomena, or extracting similarities. To that end, the core of sequential pattern mining is finding the sequences that occur frequently across the data sequences. Besides discovering frequent itemsets, sequential pattern mining requires arranging those itemsets into sequences and discovering which of those sequences are frequent. So, before mining sequences, the main task is checking whether one sequence is a subsequence of another sequence in the database. In this paper, we implement the subsequence matching method as the preprocessing step for sequential pattern mining. The matched sequences in our implementation are normalized sequences in the form of number chains. The result given by this method is an overview of the matching information between the input mapped sequences.
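
The subsequence check described in this abstract can be sketched as follows. This is a generic greedy test and not the authors' mapping method. It assumes each sequence is an ordered list of itemsets whose items have already been mapped to integers. A candidate is a subsequence if its itemsets can be matched, in order, to supersets in the data sequence. The name `is_subsequence` is illustrative.

```python
def is_subsequence(candidate, sequence):
    """Return True if `candidate` is a subsequence of `sequence`.

    Both arguments are ordered lists of itemsets (sets of integers).
    Each candidate itemset must be contained in a distinct element of
    `sequence`, and the matched elements must appear in the same order.
    """
    pos = 0
    for itemset in candidate:
        # Advance to the earliest remaining element that contains itemset.
        while pos < len(sequence) and not set(itemset) <= set(sequence[pos]):
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next itemset must match a strictly later element
    return True


data = [{1, 4}, {5}, {2, 3, 6}]
print(is_subsequence([{1}, {2, 3}], data))  # True
print(is_subsequence([{2, 3}, {1}], data))  # False
```

Matching each itemset to the earliest possible element is sufficient here, because an earlier match never reduces the options left for the itemsets that follow.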