NOGSEC: A NOnparametric method for Genome SEquence Clustering  

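이영복 (Graphics Application Laboratory, Department of Computer Science, Pusan National University; Biological Research Information Center, Pohang University of Science and Technology)
김판규 (Graphics Application Laboratory, Department of Computer Science, Pusan National University; Research Institute of Computer and Information Communication)
조환규 (Graphics Application Laboratory, Department of Computer Science, Pusan National University; Research Institute of Computer and Information Communication)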
Publication Information
Korean Journal of Microbiology / v.39, no.2, 2003, pp. 67-75
Abstract
A central problem in comparative genomics is predicting functional annotation by classifying protein sequences. Computational approaches to function prediction include protein structure prediction, sequence alignment, domain prediction, and binding-site prediction. This paper takes another computational approach: searching for sets of homologous sequences in a sequence similarity graph. Methods based on a similarity graph need no prior knowledge about the sequences, but they depend heavily on thresholds that the researcher sets subjectively. We propose a genome sequence clustering method based on iterative testing and graph decomposition, together with a simple method for calculating a strict threshold that has biochemical meaning. The proposed method was applied to known bacterial genome sequences, and the results are presented alongside those of the BAG algorithm. The resulting clusters lack some completeness, but their confidence level is very high and the method requires no user-defined thresholds.
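The threshold dependence of similarity-graph methods can be illustrated with a minimal sketch. This is not the NOGSEC procedure or the BAG algorithm; it is a generic baseline, written under stated assumptions, that keeps only edges whose score meets a user-chosen cutoff and reports the connected components as clusters. The function name `cluster_by_threshold`, the score dictionary, and the toy sequence IDs are all hypothetical.

```python
from collections import defaultdict


def cluster_by_threshold(scores, threshold):
    """Return connected components of a thresholded similarity graph.

    scores: dict mapping (id_a, id_b) to a pairwise similarity score
            (hypothetical input; any alignment score would do).
    threshold: minimum score for an edge to be kept (user-chosen here,
               which is exactly the subjectivity the abstract points out).
    """
    adjacency = defaultdict(set)
    nodes = set()
    for (a, b), score in scores.items():
        nodes.update((a, b))
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    clusters = []
    for start in sorted(nodes):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(component))
    return clusters


# Toy scores, invented for illustration only.
toy_scores = {
    ("p1", "p2"): 90.0,
    ("p2", "p3"): 85.0,
    ("p3", "p4"): 20.0,
    ("p4", "p5"): 75.0,
}

for cutoff in (10.0, 50.0, 80.0):
    print(cutoff, cluster_by_threshold(toy_scores, cutoff))
```

On the toy data the three cutoffs give one, two, and three clusters respectively, so the grouping is decided by the cutoff rather than by the sequences alone. A method that derives its threshold from the data avoids this choice.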
Keywords
function prediction; genome sequence clustering; graph decomposition