• Title/Summary/Keyword: sequence length

Mining Maximal Frequent Contiguous Sequences in Biological Data Sequences (생물학적 데이터 서열들에서 빈번한 최대길이 연속 서열 마이닝)

  • Kang, Tae-Ho;Yoo, Jae-Soo
    • The KIPS Transactions:PartD / v.15D no.2 / pp.155-162 / 2008
  • Biological sequences such as DNA sequences and amino acid sequences typically contain a large number of items. They contain contiguous sequences that ordinarily consist of hundreds of frequent items. In biological sequence analysis (BSA), frequent contiguous sequence search is one of the most important operations. Many studies have addressed the efficient mining of sequential patterns, and most existing methods are based on the Apriori algorithm. In particular, the PrefixSpan algorithm is one of the most efficient sequential pattern mining schemes based on the Apriori algorithm. However, because it expands sequential patterns starting from frequent patterns of length 1, it is not suitable for biological datasets with long frequent contiguous sequences. More recently, the MacosVSpan algorithm was proposed, building on the idea of PrefixSpan, to significantly reduce its recursive process. However, it is still inefficient for mining frequent contiguous sequences from long biological data sequences. In this paper, we propose an efficient method for mining maximal frequent contiguous sequences in large biological data sequences by constructing a spanning tree with a fixed length. To verify the superiority of the proposed method, we perform experiments in various environments. The experiments show that the proposed method is much more efficient than MacosVSpan in terms of retrieval performance.
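
To make the task in this abstract concrete, here is a brute-force Python sketch of mining maximal frequent contiguous subsequences. It is not the authors' fixed-length spanning-tree method, nor PrefixSpan or MacosVSpan. The function names, the `min_support` parameter, and the toy DNA strings are assumptions made for this illustration. Support is taken to mean the number of input sequences that contain a pattern.

```python
from collections import defaultdict


def frequent_contiguous(sequences, min_support, length):
    """Substrings of the given length found in at least min_support sequences."""
    support = defaultdict(set)
    for idx, seq in enumerate(sequences):
        for start in range(len(seq) - length + 1):
            # Record which sequences contain this substring.
            support[seq[start:start + length]].add(idx)
    return {sub for sub, ids in support.items() if len(ids) >= min_support}


def maximal_frequent_contiguous(sequences, min_support):
    """Frequent contiguous patterns that are not part of a longer frequent one."""
    frequent = set()
    length = 1
    while True:
        level = frequent_contiguous(sequences, min_support, length)
        if not level:
            # Every substring of a frequent pattern is itself frequent,
            # so an empty level means no longer pattern can be frequent.
            break
        frequent |= level
        length += 1
    return {p for p in frequent
            if not any(p != q and p in q for q in frequent)}


if __name__ == "__main__":
    toy = ["ACGTAC", "TACGTT", "GGACGT"]  # hypothetical input
    print(maximal_frequent_contiguous(toy, min_support=3))  # {'ACGT'}
```

This sketch grows patterns one item at a time from length 1, which is the behaviour the abstract identifies as costly when frequent contiguous sequences are hundreds of items long.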

Phylogenetic Diversity of Bacteria in an Earth-Cave in Guizhou Province, Southwest of China

  • Zhou, Jun-Pei;Gu, Ying-Qi;Zou, Chang-Song;Mo, Ming-He
    • Journal of Microbiology / v.45 no.2 / pp.105-112 / 2007
  • The objective of this study was to analyze the phylogenetic composition of the bacterial community in the soil of an earth-cave (Niu Cave) using a culture-independent molecular approach. 16S rRNA genes were amplified directly from soil DNA with universally conserved and Bacteria-specific rRNA gene primers and cloned. The clone library was screened by restriction fragment length polymorphism (RFLP), and representative rRNA gene sequences were determined. A total of 115 bacterial sequence types were found in 190 analyzed clones. Phylogenetic sequence analyses revealed novel 16S rRNA gene sequence types and a high diversity in the putative bacterial community. Members of this community included Proteobacteria (42.6%), Acidobacteria (18.6%), Planctomycetes (9.0%), Chloroflexi (Green nonsulfur bacteria, 7.5%), Bacteroidetes (2.1%), Gemmatimonadetes (2.7%), Nitrospirae (8.0%), Actinobacteria (High G+C Gram-positive bacteria, 6.4%) and candidate divisions (including OP3, GN08, and SBR1093, 3.2%). Thirty-five clones were affiliated with bacteria related to nitrogen, sulfur, iron or manganese cycles. Comparison of the present data with data obtained previously from caves by 16S rRNA gene analysis revealed similarities in the bacterial community components, especially the high abundance of Proteobacteria and Acidobacteria. Furthermore, this study provided novel evidence for the presence of Gemmatimonadetes, Nitrosomonadales, Oceanospirillales, and Rubrobacterales in a karstic hypogean environment.

Distribution of Soil Series in Jeju Island by Proximity and Altitude (해발고도 및 인접성에 의한 제주도 토양통 분포특성)

  • Moon, Kyung-Hwan;Lim, Han-Cheol;Hyun, Hae-Nam
    • Korean Journal of Soil Science and Fertilizer / v.40 no.3 / pp.221-228 / 2007
  • A quantitative analysis of the distribution characteristics of soils in Jeju Island was conducted using geographic information system (GIS) technology. Soil series could be classified into 5 groups by cluster analysis of the proximity ratios among soil series, that is, the ratios of the boundary lengths of other soils to the total boundary length. Classification by proximity alone was similar to the conventional classification system of the detailed soil map, although the conventional system was built from several criteria such as soil color, altitude and chemical characteristics of soils. An altitudinal sequence of soil series was also suggested from their representative altitudes, which could be found from the areal distribution curve along altitude. The sequence was brown forest soils - black soils - very dark brown soils - dark brown soils from the peak of Halla Mt. to the coast on all sides, which may be related to the pedogenesis process in Jeju Island.

Development of Restriction Fragment Length Polymorphism (RFLP) Markers in Silkworm, Bombyx mori (누에 RFLP(제한단편 다형현상)마커 개발)

  • 고승주;김태산;이영승;황재삼;이상몽
    • Korean journal of applied entomology / v.36 no.1 / pp.96-104 / 1997
  • A silkworm (Bombyx mori) genomic DNA library was constructed from the polyphagous J111 strain and the non-polyphagous $C_3$ strain to advance genomic studies using DNA markers. Genomic DNAs of the two strains were digested with the restriction enzyme EcoRI and ligated into pUC18. The ligated plasmids were transferred into the E. coli host strain DH5$\alpha$. When the genomic DNAs were hybridized with insert DNAs from the transformants, the clones could be categorized by hybridization pattern into three groups: highly repetitive, moderately repetitive, and low-copy-number sequences. A total of 219 clones containing single or low-copy-number sequence inserts were examined for polymorphisms between the two strains, J111 and $C_3$. Forty-six clones showed RFLPs, and 10 of these clones were used as probes to analyze the $F_2$ population derived from a cross between the J111 and $C_3$ strains. The genetic inheritance tested with each clone will be an important tool for constructing the genetic map of the silkworm, Bombyx mori.

Prospective Changes of English Digital Textbook Based on the Universal Design for Learning (보편적 학습 설계에 근거한 영어과 디지털 교과서 개선 방안)

  • Kim, Jeong-ryeol
    • The Journal of the Korea Contents Association / v.15 no.7 / pp.674-683 / 2015
  • One of the issues with textbooks pertinent to the current study is whether the Universal Design for Learning (UDL) factors have been addressed so as to satisfy students with different aptitudes in learning the core objectives of the lessons. This study adapts the cross-curricular UDL analysis criteria to language teaching and learning and uses the modified criteria to analyze the sequence of digital English textbooks and to obtain descriptive statistics of the UDL factors in the new textbooks. The results show that the textbooks are designed most favorably for students with linguistic aptitude and less favorably for students with other types of aptitudes. The sequence analysis shows that sentence/word length and the appearance of new words are incrementally sequenced as students advance to upper grades. However, the syntactic complexity of the middle school textbooks rises steeply, unlike that of the elementary school textbooks. The UDL analysis will provide learning factors to consider when designing digital English textbooks to cover groups with different aptitudes.

Molecular Cloning and Expression of DMRT Gene in Protogynous Wrasse, Halichoeres tenuispinis

  • Jeong, Hyung-Bok;Park, Ji-Gweon;Park, Jin-Young;Jin, Young-Jun;Yang, Myung-Cheon;Hyun, Kyung-Man;Kim, Gi-Ok;Kim, Se-Jae
    • Proceedings of the Korean Society of Developmental Biology Conference / 2003.10a / pp.64-64 / 2003
  • The sex differentiation of fishes occurs under the control of genetic and various environmental factors. DM-domain containing genes are novel zinc finger transcription factors and play key roles in sex determination. In order to isolate the wrasse DMRT (wDMRT) cDNA from the protogynous wrasse (Halichoeres tenuispinis), the wrasse testis cDNA library was screened using $^{32}$P-labeled PCR products, which were amplified with degenerate primers from conserved DM-domain regions of several DMRT genes. Among a few positives obtained through screening, the full-length wDMRT cDNA of 2.9 kb, encoding a predicted 300 amino acid residues, was isolated. Sequence analysis showed 60% and 43% sequence identity with rainbow trout and tilapia DMRT1, respectively. RT-PCR assay showed that wDMRT was expressed specifically in the testis. Also, the wDMRT gene was strongly expressed in May, during the reproductive season, when the reproductivity of wrasse is most active. These results suggest that the wDMRT gene functions in testis differentiation. The conserved DM-domain regions were amplified by PCR from DMRT genes of several species among Labridae, and their sequences were determined. The sequence of the DM-domain region of Halichoeres tenuispinis was identical to those of Pseudolabrus japonicus and Pteragogus flagellifera, and showed 94% identity with that of Halichoeres poecilopterus.

Phylogenetic Relationships among Groupers (Genus Epinephelus) Based on Mitochondrial Cytochrome b DNA Sequences

  • KANG Geo Young;SONG Choon Bok
    • Korean Journal of Fisheries and Aquatic Sciences / v.37 no.5 / pp.414-422 / 2004
  • To infer phylogenetic relationships among Epinephelus species inhabiting coastal regions of the Korean peninsula, mitochondrial cytochrome b genes from 9 species belonging to the subfamily Epinephelinae were PCR-amplified, cloned and sequenced. Aligned cytochrome b sequences of 10 species, including one additional sequence from GenBank, were 1,140 base pairs in length, with 439 variable and 330 parsimony-informative sites. The cytochrome b genes of the 10 species, like those of other vertebrates studied to date, exhibit unequal base compositions: a consistently low G content ($15.2 \pm 0.3\%$ on average) and almost equal T, C and A contents ($29.3 \pm 0.8\%$, $30.7 \pm 1.0\%$, and $24.8 \pm 0.5\%$ on average, respectively). In third codon positions, transitional substitutions, especially between Epinephelus species and outgroup species, are almost certainly saturated or near saturation. Phylogenetic analyses were performed with sequence data from 8 Epinephelus species and 2 outgroup species (Cephalopholis urodeta and Variola louti) using distance-based (neighbor-joining and minimum evolution) and parsimony-based (maximum parsimony) methods. The results showed that the monophyly of the genus Epinephelus was supported by relatively high bootstrap values. However, phylogenetic relationships among E. areolatus, E. moara, E. septemfasciatus, and Epinephelus sp. were poorly resolved. Within the genus Epinephelus, three resolved monophyletic groups were found: clade 1 included E. akaara and E. awoara; clade 2 included E. fasciatus and E. merra; and clade 3 included E. akaara, E. awoara, E. fasciatus, E. merra, E. areolatus, E. moara, E. septemfasciatus and Epinephelus sp.
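
The base-composition figures in this abstract are per-base percentages averaged over the sequenced species. The Python sketch below shows one way such values could be computed. The function names and toy sequences are hypothetical. The spread is computed as a sample standard deviation, which is an assumption, since the abstract does not say whether its ± values are standard deviations or standard errors.

```python
from statistics import mean, stdev


def base_composition(seq):
    """Percentage of A, C, G and T among the unambiguous bases of seq."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * n / total for base, n in counts.items()}


def summarize_composition(sequences):
    """Map each base to (mean %, sample standard deviation %) across sequences.

    Needs at least two sequences, because stdev is undefined for one value.
    """
    per_seq = [base_composition(s) for s in sequences]
    return {
        base: (mean(c[base] for c in per_seq), stdev(c[base] for c in per_seq))
        for base in "ACGT"
    }


if __name__ == "__main__":
    toy = ["AAGT", "ACGT"]  # hypothetical sequences, not data from the study
    for base, (avg, sd) in summarize_composition(toy).items():
        print(f"{base}: {avg:.1f} +/- {sd:.1f} %")
```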

Unraveling Haplotype Diversity of the Apical Membrane Antigen-1 Gene in Plasmodium falciparum Populations in Thailand

  • Lumkul, Lalita;Sawaswong, Vorthon;Simpalipan, Phumin;Kaewthamasorn, Morakot;Harnyuttanakorn, Pongchai;Pattaradilokrat, Sittiporn
    • Parasites, Hosts and Diseases / v.56 no.2 / pp.153-165 / 2018
  • Development of an effective vaccine is critically needed for the prevention of malaria. One of the key antigens for malaria vaccines is the apical membrane antigen 1 (AMA-1) of the human malaria parasite Plasmodium falciparum, the surface protein for erythrocyte invasion of the parasite. The gene encoding AMA-1 has been sequenced from populations of P. falciparum worldwide, but the haplotype diversity of the gene in P. falciparum populations in the Greater Mekong Subregion (GMS), including Thailand, remains to be characterized. In the present study, the AMA-1 gene was PCR amplified and sequenced from the genomic DNA of 65 P. falciparum isolates from 5 endemic areas in Thailand. The nearly full-length 1,848 nucleotide sequence of AMA-1 was subjected to molecular analyses, including nucleotide sequence diversity, haplotype diversity and deduced amino acid sequence diversity and neutrality tests. Phylogenetic analysis and pair-wise population differentiation ($F_{st}$ indices) were performed to infer the population structure. The analyses identified 60 single nucleotide polymorphic loci, predominately located in domain I of AMA-1. A total of 31 unique AMA-1 haplotypes were identified, which included 11 novel ones. The phylogenetic tree of the AMA-1 haplotypes revealed multiple clades of AMA-1, each of which contained parasites of multiple geographical origins, consistent with the $F_{st}$ indices indicating genetic homogeneity or gene flow among geographically distinct populations of P. falciparum in Thailand's borders with Myanmar, Laos and Cambodia. In summary, the study revealed novel haplotypes and population structure needed for the further advancement of AMA-1-based malaria vaccines in the GMS.
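
For readers unfamiliar with haplotype diversity, one common estimator is Nei's $h = \frac{n}{n-1}\left(1 - \sum_i p_i^2\right)$, where $n$ is the number of sampled sequences and $p_i$ is the frequency of haplotype $i$. The Python sketch below implements it. That this is the estimator used in the study is an assumption, since the abstract does not name one. The function name and toy sequences are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity for a list of haplotype labels or sequences."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)  # identical sequences are one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


if __name__ == "__main__":
    toy = ["AAA", "AAA", "AAT", "ATT"]  # hypothetical, not data from the study
    print(len(set(toy)), "unique haplotypes")
    print(round(haplotype_diversity(toy), 4))
```

The value is 0 when every isolate carries the same haplotype and 1 when every isolate is unique.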

Molecular cloning of a novel cecropin-like peptide gene from the swallowtail butterfly, Papilio xuthus

  • Kim, Seong-Ryul;Choi, Kwang-Ho;Kim, Sung-Wan;Hwang, Jae-Sam;Goo, Tae-Won;Kim, Iksoo
    • International Journal of Industrial Entomology and Biomaterials / v.31 no.2 / pp.79-84 / 2015
  • A new cecropin-like antimicrobial peptide (Px-CLP) gene was isolated from immune-challenged larvae of the swallowtail butterfly, Papilio xuthus, by annealing control primer (ACP)-based GeneFishing PCR. The full-length cDNA of Px-CLP is 310 nucleotides long and encodes a 70 amino acid precursor that contains a putative 22-residue signal peptide, a 4-residue propeptide, a presumed 37-residue mature peptide, and an uncommon 7-residue acidic pro-region at the C-terminus. The deduced amino acid sequence of Px-CLP showed significant identity with other lepidopteran cecropin D type peptides. RT-PCR revealed that the Px-CLP transcript was detected at a significant level after injection with bacterial lipopolysaccharide (LPS). Peptides with and without the C-terminal acidic sequence region were synthesized on solid phase and submitted to an antibacterial activity assay. The synthetic 37-mer peptide (Px-CLPa), which lacks the C-terminal acidic sequence region, showed antibacterial activity against E. coli ML35, whereas the 44-mer peptide (Px-CLPb) with the C-terminal acidic peptide region was not active. This result suggests that Px-CLP is produced as a larger precursor containing a C-terminal pro-region that is subsequently removed by C-terminal modification.

Antitumor Toxic Protein Abrin and Abrus Agglutinin

  • Liu, Chao-Lin;Lin, Jung-Yaw
    • Toxicological Research / v.17 / pp.109-115 / 2001
  • Abrus agglutinin was purified from the kernels of Abrus precatorius by Sepharose 4B affinity column chromatography followed by Sephadex G-100 gel filtration column chromatography. About 1.25 g of abrus agglutinin was obtained from 1 kg of the kernels. The LD$_{50}$ of abrus agglutinin is 5 mg/kg of body weight, which makes it less toxic than abrin (20 $\mu$g/kg of body weight). The amino acid sequence of abrus agglutinin was determined by protein sequencing techniques and deduced from the nucleotide sequence of a cDNA clone encoding full-length abrus agglutinin. There are 258 residues, 2 residues and 267 residues in the A-chain, the linker peptide and the B-chain of abrus agglutinin, respectively. Abrus agglutinin had high homology to abrin-a (77.8%). The 13 amino acid residues involved in catalytic function, which are highly conserved among abrin and ricin, were also conserved within abrus agglutinin. The protein synthesis inhibitory activity of abrus agglutinin ($IC_{50}$ = 3.5 nM) was weaker than that of abrin-a (0.05 nM). Molecular modeling followed by site-directed mutagenesis showed that Pro199 of the abrus agglutinin A-chain, located in amphipathic helix H and corresponding to Asn200 of the abrin A-chain, can induce bending of helix H. This bending would presumably affect the binding of the abrus agglutinin A-chain to its target sequence GpApGpAp in the tetraloop structure of 28S rRNA, and this could be one of the major factors contributing to the relatively weak protein synthesis inhibitory activity and toxicity of abrus agglutinin.
