• Title/Summary/Keyword: DNA Sequence Classification

Could Decimal-binary Vector be a Representative of DNA Sequence for Classification?

  • Sanjaya, Prima;Kang, Dae-Ki
    • International journal of advanced smart convergence / v.5 no.3 / pp.8-15 / 2016
  • In recent years, the Deep Belief Network (DBN), a deep learning model formed by stacking restricted Boltzmann machines in a greedy fashion, has been widely used for classification and recognition. With its ability to extract features at a high level of abstraction and to handle high-dimensional data, this model has achieved outstanding results in image and speech recognition. In this research, we assess the applicability of deep learning to DNA classification. Since the training phase of a DBN is expensive, especially for DNA sequences with thousands of variables, we introduce a new encoding method that uses a decimal-binary vector to represent the sequence as input to the model, and compare it with one-hot vector encoding on two datasets. We evaluated the proposed model with different contrastive algorithms and achieved a significant improvement in training speed with comparable classification results. This result shows the potential of using decimal-binary vectors with DBNs on DNA sequences to solve other sequence problems in bioinformatics.
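The abstract above does not spell out the decimal-binary scheme, so the following is only a minimal sketch of why a denser encoding shrinks the input layer. It assumes each base is mapped to a decimal index (A=0, C=1, G=2, T=3) and then written as two bits; that mapping and the function names are illustrative assumptions, not the authors' method.

```python
import numpy as np

BASES = "ACGT"  # assumed alphabet; ambiguous bases such as N are not handled


def one_hot_encode(seq):
    """Encode each base as 4 bits with a single 1 (4 * len(seq) inputs)."""
    index = {base: i for i, base in enumerate(BASES)}
    out = np.zeros((len(seq), 4), dtype=np.uint8)
    for pos, base in enumerate(seq.upper()):
        out[pos, index[base]] = 1
    return out.ravel()


def decimal_binary_encode(seq):
    """Hypothetical compact encoding: base -> decimal 0..3 -> 2 bits."""
    index = {base: i for i, base in enumerate(BASES)}
    bits = []
    for base in seq.upper():
        d = index[base]
        bits.extend([(d >> 1) & 1, d & 1])  # high bit, then low bit
    return np.array(bits, dtype=np.uint8)


if __name__ == "__main__":
    seq = "ACGTAC"
    print(len(one_hot_encode(seq)), len(decimal_binary_encode(seq)))
```

Under this assumed mapping the compact vector is half the length of the one-hot vector, which is the kind of reduction in visible units that would lower DBN training cost.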

Negative Selection Algorithm for DNA Sequence Classification

  • Lee, Dong Wook;Sim, Kwee-Bo
    • International Journal of Fuzzy Logic and Intelligent Systems / v.4 no.2 / pp.231-235 / 2004
  • As the DNA sequences of humans and other living things are revealed, demand is increasing for new computational methods that make use of DNA sequence information. In this paper we propose a classification algorithm based on negative selection in the immune system to classify DNA patterns. Negative selection is the process that determines the antigenic receptors that recognize antigens, that is, nonself cells. Immune cells use these antigen receptors to judge whether a cell is self or not. If one composes n groups of antigenic receptors for n different patterns, the groups can classify inputs into n patterns. The proposed pattern classification algorithm applies negative selection at the nucleotide base level and at the amino acid level.
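The idea of one receptor group per pattern can be sketched as follows. This is a generic negative-selection sketch, not the authors' algorithm: the position-wise matching rule, the `min_match` threshold, and the rule "assign to the class whose detectors fire least" are all assumptions made for illustration.

```python
import random

BASES = "ACGT"


def matches(detector, seq, min_match):
    """Assumed matching rule: agreement in at least min_match positions."""
    return sum(a == b for a, b in zip(detector, seq)) >= min_match


def train_detectors(self_seqs, n_detectors, length, min_match, rng,
                    max_tries=100_000):
    """Keep random candidates that match none of the self sequences."""
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        cand = "".join(rng.choice(BASES) for _ in range(length))
        if not any(matches(cand, s, min_match) for s in self_seqs):
            detectors.append(cand)
    return detectors


def classify(seq, detector_groups, min_match):
    """Assign seq to the class whose detectors react to it the least,
    i.e. the class that most plausibly treats seq as 'self'."""
    scores = {
        label: sum(matches(d, seq, min_match) for d in dets)
        for label, dets in detector_groups.items()
    }
    return min(scores, key=scores.get)


if __name__ == "__main__":
    rng = random.Random(0)
    classes = {"A": ["ACGTACGT", "ACGTACGA"], "B": ["TTGGCCAA", "TTGGCCAT"]}
    groups = {k: train_detectors(v, 50, 8, 6, rng) for k, v in classes.items()}
    print(classify("ACGTACGT", groups, 6))
```

By construction, no detector in a group reacts to that group's own training sequences, so a training sequence always scores zero against its own class.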

Phylogenetic Relationships of the Aphyllophorales Inferred from Sequence analysis of Nuclear Small Subunit Ribosomal DNA

  • Kim, Seon-Young;Jung, Hack-Sung
    • Journal of Microbiology / v.38 no.3 / pp.122-131 / 2000
  • Phylogenetic classification of the Aphyllophorales was conducted based on analysis of nuclear small subunit ribosomal DNA (nuc SSU rDNA) sequences. Based on phylogenetic groupings and taxonomic characters, 16 families were recognized and discussed. Although many of the characters showed some degree of homoplasy, microscopic characters such as the mitic system and clamp, spore amyloidity and rot type appeared to be important in the classification of the Aphyllophorales. Phylogenetically significant families were newly defined to improve the classification of the order Aphyllophorales.

Classification of DNA Pattern Using Negative Selection (부정 선택을 이용한 DNA의 패턴 분류)

  • Sim, Kwee-Bo;Lee, Dong-Wook
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.5 / pp.551-556 / 2004
  • As the DNA sequences of humans and other living things are revealed, demand is increasing for new computational methods that make use of DNA sequence information. In this paper we propose a classification algorithm based on negative selection in the immune system to classify DNA patterns. Negative selection is the process that determines the antigenic receptors that recognize antigens, that is, nonself cells. Immune cells use these antigen receptors to judge whether a cell is self or not. If one composes n groups of antigenic receptors for n different patterns, the groups can classify inputs into n patterns. The proposed pattern classification algorithm applies negative selection at the nucleotide base level and at the amino acid level.

Negative Selection Algorithm for DNA Pattern Classification

  • Lee, Dong-Wook;Sim, Kwee-Bo
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2004.08a / pp.190-195 / 2004
  • We propose a pattern classification algorithm that uses the self-nonself discrimination principle of immune cells and apply it to the DNA pattern classification problem. Pattern classification is an important and frequently encountered problem in bioinformatics. The proposed algorithm is based on negative selection in the immune system. Negative selection is the process that determines the antigenic receptors that recognize antigens, that is, nonself cells. Immune cells use these antigen receptors to judge whether a cell is self or not. If one composes $n$ groups of antigenic receptors for $n$ different patterns, these receptor groups can classify inputs into $n$ patterns. The algorithm applies negative selection at the nucleotide base level and at the amino acid level. To show the validity of the algorithm, experimental results on RNA group classification are also presented.

Phylogeny of Flavobacteria Group Isolated from Freshwater Using Multilocus Sequencing Analysis

  • Mun, Seyoung;Lee, Jungnam;Lee, Siwon;Han, Kyudong;Ahn, Tae-Young
    • Genomics & Informatics / v.11 no.4 / pp.272-276 / 2013
  • Sequence analysis of the 16S rRNA gene has been widely used for the classification of microorganisms. However, we were unable to clearly identify five Flavobacterium species isolated from freshwater using this gene as a single marker, because the evolutionary history is incomplete and the pace of DNA substitutions is relatively rapid in these bacteria. In this study, we tried to classify the Flavobacterium species through multilocus sequence analysis (MLSA), which is a practical and reliable technique for the identification or classification of bacteria. The five Flavobacterium species isolated from freshwater and 37 other strains were classified based on six housekeeping genes: gyrB, dnaK, tuf, murG, atpA, and glyA. The genes were amplified by PCR and subjected to DNA sequencing. Based on the combined DNA sequence (4,412 bp) of the six housekeeping genes, we analyzed the phylogenetic relationships among the Flavobacterium species. The results indicated that MLSA based on the six housekeeping genes is a trustworthy method for the identification of closely related Flavobacterium species.
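The step of combining several housekeeping genes into one sequence before comparison can be illustrated with a small sketch. It assumes each locus has already been aligned across strains and uses the simple p-distance (proportion of differing sites) as a stand-in for whatever distance or tree method the study used; the function names and the toy sequences are hypothetical.

```python
def concatenate_loci(loci, order):
    """Join the aligned sequences of one strain in a fixed gene order."""
    return "".join(loci[gene] for gene in order)


def p_distance(a, b):
    """Proportion of differing sites, ignoring positions with a gap ('-')."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    order = ["gyrB", "dnaK", "tuf"]  # subset of the six genes, toy data only
    strain1 = {"gyrB": "ACGT", "dnaK": "GGCA", "tuf": "TTAG"}
    strain2 = {"gyrB": "ACGA", "dnaK": "GGCA", "tuf": "TTCG"}
    s1 = concatenate_loci(strain1, order)
    s2 = concatenate_loci(strain2, order)
    print(p_distance(s1, s2))
```

Using the same gene order for every strain keeps the concatenated sequences position-compatible, so a pairwise distance matrix can then be passed to any tree-building method.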

Sequence comparisons of 28S ribosomal DNA and mitochondrial cytochrome c oxidase subunit I of Metagonimus yokogawai, M. takahashii and M. miyatai

  • Lee, Soo-Ung;Huh, Sun;Sohn, Woon-Mok;Chai, Jong-Yil
    • Parasites, Hosts and Diseases / v.42 no.3 / pp.129-135 / 2004
  • We compared the DNA sequences of three species of the genus Metagonimus: M. yokogawai, M. takahashii, and M. miyatai. We obtained 28S D1 ribosomal DNA (rDNA) and mitochondrial cytochrome c oxidase subunit I (mtCOI) fragments from adult worms by PCR, and these were cloned and sequenced. Phylogenetic relationships were inferred from the nucleotide sequences of the 28S D1 rDNA and the mtCOI gene. M. takahashii and M. yokogawai were placed in the same clade, supported by DNA sequence and phylogenetic tree analysis of the 28S D1 rDNA and mtCOI gene regions. These findings indicate that M. takahashii is genetically closer to M. yokogawai than to M. miyatai. The phylogenetic data also support the nomination of M. miyatai as a separate species.

DNA Sequence Classification Using a Generalized Regression Neural Network and Random Generator (난수발생기와 일반화된 회귀 신경망을 이용한 DNA 서열 분류)

  • 김성모;김근호;김병환
    • The Transactions of the Korean Institute of Electrical Engineers D / v.53 no.7 / pp.525-530 / 2004
  • A classifier was constructed by combining a generalized regression neural network (GRNN) with a random generator (RG) and was applied to classify DNA sequences. The three data sets evaluated are eukaryotic and prokaryotic sequences (Data-I), eukaryotic sequences (Data-II), and prokaryotic sequences (Data-III). For each data set, classifier performance was examined in terms of the total classification sensitivity (TCS), individual classification sensitivity (ICS), total prediction accuracy (TPA), and individual prediction accuracy (IPA). For a given spread, the RG generated a number of sets of spreads for the Gaussian functions in the pattern layer. Compared to the GRNN, the RG-GRNN significantly improved the TCS by more than 50%, 60%, and 40% for Data-I, Data-II, and Data-III, respectively. The RG-GRNN also demonstrated improved TPA for all data types. In conclusion, the proposed RG-GRNN can be used effectively to classify large, multivariable promoter sequences.
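A GRNN prediction is a Gaussian-weighted average of the training targets, with the spread controlling the kernel width in the pattern layer. The sketch below shows that standard form and, as an assumption about the RG's role, draws one random spread per pattern unit around a base value; `random_spreads` and its sampling range are hypothetical and are not taken from the paper.

```python
import numpy as np


def grnn_predict(x_train, y_train, x_query, spread):
    """Gaussian-weighted average of training targets (standard GRNN form).

    spread may be a scalar or one value per training pattern.
    Returns an array of shape (n_query, n_outputs).
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float).reshape(len(x_train), -1)
    x_query = np.atleast_2d(np.asarray(x_query, dtype=float))
    spread = np.broadcast_to(np.asarray(spread, dtype=float),
                             (x_train.shape[0],))
    # Squared Euclidean distances, shape (n_query, n_train)
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * spread ** 2))
    # Note: if every weight underflows to 0 the result is NaN.
    return (w @ y_train) / w.sum(axis=1, keepdims=True)


def random_spreads(base_spread, n_patterns, rng):
    """Hypothetical RG step: one spread per pattern unit around a base value."""
    return base_spread * rng.uniform(0.5, 1.5, size=n_patterns)


if __name__ == "__main__":
    x = [[0.0], [1.0]]
    y = [[1, 0], [0, 1]]  # one-hot class targets
    rng = np.random.default_rng(0)
    print(grnn_predict(x, y, [[0.2]], random_spreads(0.3, len(x), rng)))
```

With one-hot targets, each output row sums to 1 and the predicted class is the column with the largest value.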

Use of DNA Methylation for Cancer Detection and Molecular Classification

  • Zhu, Jingde;Yao, Xuebiao
    • BMB Reports / v.40 no.2 / pp.135-141 / 2007
  • Conjugation of a methyl group at the fifth carbon of cytosines within the palindromic dinucleotide 5'-CpG-3' sequence (DNA methylation) is the best-studied epigenetic mechanism. It acts together with other epigenetic entities (histone modification, chromatin remodeling and microRNAs) to shape the chromatin structure of DNA according to its functional state. The cancer genome is frequently characterized by hypermethylation of specific genes concurrent with an overall decrease in the level of 5-methylcytosine, and the pathological implications of this for the cancerous state are well established. While the latest genome-wide technologies have been applied to classify and interpret the epigenetic layer of gene regulation in physiological and disease states, epigenetic testing has also been seriously explored in clinical practice for early detection, refinement of tumor staging and prediction of disease recurrence. This critique reviews the latest research findings on the use of DNA methylation in cancer diagnosis, prognosis and staging/classification.

Multi-locus Phylogeny Analysis of Korean Isolates of Phytophthora Species Based on Sequence of Ribosomal and Mitochondrial DNA (핵 및 미토콘드리아 DNA 염기서열을 이용한 국내 Phytophthora 속의 Multi-locus phylogeny 분석)

  • Seo, Mun-Won;Song, Jeong-Young;Kim, Hong-Gi
    • The Korean Journal of Mycology / v.38 no.1 / pp.40-47 / 2010
  • To investigate interspecific and intraspecific genetic relationships among 14 Korean Phytophthora species, sequence analyses of nuclear DNA (ypt gene and rDNA-IGS region) and mitochondrial DNA (Cox gene, $\beta$-tubulin gene, and EF1A gene) were performed. All 14 Korean Phytophthora species clustered clearly with foreign isolates of the same species. As with the foreign isolates, the Korean isolates showed no correlation between molecular classification and morphological classification. P. palmivora KACC 40167, previously reported among the genetic groups of Phytophthora species in Korea, was not consistent with the classification system and therefore requires re-examination in the genetic group analysis. The Korean isolate P. drechsleri KACC 40195 was very closely related to P. cryptogea KACC 40161, with a bootstrap value above 94%, within the P. cryptogea-P. drechsleri complex group. The identification of these isolates is still unclear, because P. cryptogea and P. drechsleri were not differentiated in this study. On the other hand, P. parasitica and P. nicotianae clustered into one group at 99 to 100% sequence homology, which indicates that these two species should be unified. Compared with the sequences of foreign isolates, the Korean isolates were newly divided into ten groups in the phylogenetic system. These results provide useful information for understanding the genetic diversity of Phytophthora species in Korea.