• Title/Summary/Keyword: NCBI nucleotide database

The List of Korean Organisms Registered in the NCBI Nucleotide Database for Environmental DNA Research (환경유전자 연구를 위한 NCBI Nucleotide 데이터베이스에 등록된 국내 생물 목록 현황)

  • Ihn-Sil Kwak;Chang Woo Ji;Won-Seok Kim;Dongsoo Kong
    • Korean Journal of Ecology and Environment / v.55 no.4 / pp.352-359 / 2022
  • With recent advances in genetic technology, interest is growing in environmental DNA (eDNA) as a molecular approach to studying biodiversity. eDNA has many advantages over traditional methods for surveying biological communities in the environment, but it depends heavily on an established reference sequence database. This study comprehensively analyzed, at the genus level, the habitat status and classification of Korean registered taxon groups (phytoplankton, zooplankton, macroinvertebrates, and fish) for the genes mainly used in eDNA research (12S rRNA, 16S rRNA, 18S rRNA, COI, and CYTB). Phytoplankton and zooplankton showed the highest taxon proportion in 18S rRNA, and macroinvertebrates showed the highest proportion in COI. In fish, all genes except 18S rRNA showed a high taxon ratio. For the top 20 genera by biological density in each Korean registered taxon group, most phytoplankton were registered under 18S rRNA, and macroinvertebrates had the largest number of COI nucleotide sequences. In addition, nucleotide sequences for the top 20 fish genera were confirmed to exist for 12S rRNA, 16S rRNA, and CYTB. These results provide comprehensive information on the genes suitable for eDNA research in each taxon group.
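
The per-marker taxon proportion described in this abstract can be illustrated with a minimal sketch. It assumes the input is a list of surveyed genera plus (genus, marker) pairs taken from database records; the function name `marker_coverage` and the toy records are hypothetical and are not the study's data or results.

```python
from collections import defaultdict


def marker_coverage(surveyed_genera, records):
    """Return {marker: fraction of surveyed genera with at least one record}.

    surveyed_genera: iterable of genus names (assumed input format).
    records: iterable of (genus, marker) pairs, one per database sequence.
    """
    surveyed = set(surveyed_genera)
    if not surveyed:
        raise ValueError("surveyed_genera must not be empty")
    genera_per_marker = defaultdict(set)
    for genus, marker in records:
        # Ignore records for genera outside the surveyed list.
        if genus in surveyed:
            genera_per_marker[marker].add(genus)
    return {m: len(g) / len(surveyed) for m, g in genera_per_marker.items()}


# Toy, made-up input for illustration only.
genera = ["Zacco", "Carassius", "Rhinogobius", "Pungtungia"]
records = [
    ("Zacco", "12S rRNA"),
    ("Zacco", "COI"),
    ("Carassius", "12S rRNA"),
    ("Carassius", "COI"),
    ("Rhinogobius", "COI"),
    ("Homo", "COI"),  # not in the surveyed list, so it is skipped
]
coverage = marker_coverage(genera, records)
print(coverage)
```

Counting distinct genera rather than sequences keeps a heavily sequenced genus from inflating a marker's apparent coverage.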

Nucleotide and protein researches on anaerobic fungi during four decades

  • Chang, Jongsoo;Park, Hyunjin
    • Journal of Animal Science and Technology / v.62 no.2 / pp.121-140 / 2020
  • Anaerobic fungi inhabit the gastrointestinal tract of foregut and hindgut fermenters and degrade fibrous plant biomass through hydrolysis by a wide variety of cellulolytic enzymes and through physical penetration of the fiber matrix by their rhizoids. To date, seventeen genera have been described in family Neocallimasticaceae, class Neocallimastigomycetes, phylum Neocallimastigomycota, and one genus has been described in phylum Neocallimastigomycota. In the National Center for Biotechnology Information (NCBI) database (DB), 23,830 nucleotide sequences and 59,512 protein sequences have been deposited, most of them originating from Piromyces, Neocallimastix and Anaeromyces. Most of the protein sequences (44,025) were obtained with the PacBio next-generation sequencing system. The whole genome sequences of Anaeromyces robustus, Neocallimastix californiae, Pecoramyces ruminantium, Piromyces finnis and Piromyces sp. E2 are available in the Joint Genome Institute (JGI) database. According to protein prediction results, average isoelectric points (pIs) ranged from 5.88 (Anaeromyces) to 6.57 (Piromyces) and average molecular weights ranged from 38.7 kDa (Orpinomyces) to 56.6 kDa (Piromyces). In the Carbohydrate-Active enZYmes (CAZY) database, glycoside hydrolases (36), carbohydrate-binding modules (11), carbohydrate esterases (8), glycosyltransferases (5) and polysaccharide lyases (3) from anaerobic fungi were registered. Over four decades, 1,031 research articles about anaerobic fungi were published, and 444 and 719 articles were available in the PubMed (PM) and PubMed Central (PMC) DBs, respectively.

Development of DNA Markers for Trehalose Synthesis Genes in Brassica rapa L. (배추 trehalose 합성 유전자와 연관된 DNA 마커 개발)

  • Jeong, Ye-Sol;Lim, Yong-Pyo;Hur, Yoon-kang;Chung, Sang-Min
    • Journal of Life Science / v.19 no.5 / pp.639-643 / 2009
  • High temperature stress can affect the yield and quality of Chinese cabbage. To develop cultivars resistant to high temperature stress, we developed polymorphic DNA markers for trehalose synthesis genes related to abiotic stress resistance. A total of 28 Brassica rapa ESTs homologous to trehalose synthesis genes of Arabidopsis were found in the NCBI database. Polymorphic DNA sequences were searched for between two Chinese cabbage cultivars: Chiifu, which is relatively susceptible to high temperature stress, and Kenshin, which is tolerant to it. Among the 28 ESTs, we found 10 that have an insertion/deletion and/or a single nucleotide polymorphism between the two cultivars. Those polymorphic sites were then targeted for the development of 10 PCR-based markers. These molecular markers related to trehalose genes could be used not only to test their relationship with abiotic stress resistance in Chinese cabbage, but also to develop abiotic stress resistant cultivars using marker-assisted selection (MAS).

Metagenome Analysis of Protein Domain Collocation within Cellulase Genes of Goat Rumen Microbes

  • Lim, SooYeon;Seo, Jaehyun;Choi, Hyunbong;Yoon, Duhak;Nam, Jungrye;Kim, Heebal;Cho, Seoae;Chang, Jongsoo
    • Asian-Australasian Journal of Animal Sciences / v.26 no.8 / pp.1144-1151 / 2013
  • In this study, protein domains with cellulase activity in goat rumen microbes were investigated using metagenomic and bioinformatic analyses. After the complete genome of goat rumen microbes was obtained using a shotgun sequencing method, 217,892,109 paired reads were filtered using METAIDBA, retaining only those with 70% identity, 100-bp matches, and E-value thresholds below $10^{-10}$. The filtered contigs were assembled and annotated using blastN against the NCBI nucleotide database. As a result, a microbial community structure with 1,431 species was analyzed, among which Prevotella ruminicola 23 and Butyrivibrio proteoclasticus B316 were the dominant groups. In parallel, 201 sequences related to cellulase activity (EC 3.2.1.4) were obtained through BLAST searches using the enzyme.dat file provided by the NCBI database. After translating the nucleotide sequences into protein sequences using InterProScan, 28 protein domains with cellulase activity were identified using the HMMER package with E-value thresholds below $10^{-5}$. Cellulase activity protein domain profiling showed that major protein domains such as lipase GDSL, cellulase, and Glyco hydro 10 were present in bacterial species with strong cellulase activities. Furthermore, correlation plots clearly displayed a strong positive correlation between some protein domain groups, which was indicative of microbial adaptation in the goat rumen based on feeding habits. This is the first metagenomic analysis of cellulase activity protein domains from the goat rumen using bioinformatics.
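
The filtering thresholds mentioned in this abstract (70% identity, 100-bp matches, E-value below $10^{-10}$) can be sketched as a filter over BLAST tabular output. This is one possible reading of those thresholds, not the authors' pipeline: it assumes the standard 12-column tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), and the function name and sample rows are hypothetical.

```python
def filter_blast_hits(lines, min_identity=70.0, min_length=100, max_evalue=1e-10):
    """Keep tabular BLAST hits that pass identity, length and E-value cutoffs.

    Assumes the standard 12-column tab-separated layout; the default
    cutoffs are an interpretation of the thresholds in the abstract.
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        pident = float(fields[2])    # percent identity
        length = int(fields[3])      # alignment length in bp
        evalue = float(fields[10])   # expectation value
        if pident >= min_identity and length >= min_length and evalue < max_evalue:
            kept.append((fields[0], fields[1], pident, length, evalue))
    return kept


# Made-up rows for illustration; the IDs are hypothetical.
sample = [
    "contig_1\tsubject_A\t95.2\t250\t12\t0\t1\t250\t100\t349\t1e-100\t400",
    "contig_2\tsubject_B\t65.0\t300\t105\t0\t1\t300\t1\t300\t1e-20\t150",  # low identity
    "contig_3\tsubject_C\t88.0\t80\t9\t1\t1\t80\t5\t84\t1e-15\t90",        # too short
    "contig_4\tsubject_D\t72.5\t120\t33\t0\t1\t120\t1\t120\t1e-5\t60",     # weak E-value
]
hits = filter_blast_hits(sample)
print(hits)
```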

Transcriptome analysis of internal and external stress mechanisms in Aster spathulifolius Maxim.

  • Sivagami, Jean Claude;Park, SeonJoo
    • Proceedings of the Plant Resources Society of Korea Conference / 2019.04a / pp.35-35 / 2019
  • Aster spathulifolius Maxim. belongs to the Asteraceae family and is distributed only in Korea and Japan. It is recognized as a traditional medicinal plant and is economically valuable as an ornamental. However, within the Asteraceae, the genus Aster lacks genomic resources and information on molecular function. We therefore used high-throughput RNA-sequencing transcriptome data from A. spathulifolius to investigate function at the molecular level. De novo assembly produced 98,660 unigenes with an N50 of 1,126 bp. The unigenes were functionally annotated against NCBI databases, such as the plant nucleotide (Nt) and non-redundant protein (Nr) databases, as well as Pfam, UniProt, KEGG and transcription factor (TF) databases. In addition, the distribution of SSR markers was analyzed for future perspectives. Furthermore, comparing A. spathulifolius with two other Asteraceae species, Karelinia caspica and Chrysanthemum morifolium, shows the number of genes regulated under internal and external stress (salt tolerance, and heat and drought stress, respectively), which helps in understanding the molecular basis of responses to different environmental stresses.

Mollusks Sequence Database: Version II (연체동물 전용 BLAST 서버 업데이트 (Version II))

  • Kang, Se Won;Hwang, Hee Ju;Park, So Young;Wang, Tae Hun;Park, Eun Bi;Lee, Tae Hee;Hwang, Ui Wook;Lee, Jun-Sang;Park, Hong Seog;Han, Yeon Soo;Lim, Chae Eun;Kim, Soonok;Lee, Yong Seok
    • The Korean Journal of Malacology / v.30 no.4 / pp.429-431 / 2014
  • Since we reported a BLAST server for mollusks in 2004, no work has reported on the usability or modification of the server. To improve its usability, the mollusk BLAST server has been updated to version II (http://www.malacol.or.kr/blast) in the present study. The database was constructed using the Intel server Platform ZSS130 with dual Xeon 3.20 GHz CPUs, the Linux CentOS system, and the NCBI WebBLAST package. We downloaded the mollusk nucleotide, amino acid, EST, GSS and mitochondrial genome sequences that are accessible through NCBI web BLAST and used them to build the database. The updated database consists of 520,977 nucleotide sequences, 229,857 amino acid sequences, 586,498 EST sequences, 23,112 GSS and 565 mitochondrial genome sequences. The total database size is 1.2 GB. Furthermore, we have added repeat sequences, Escherichia coli sequences and vector sequences to facilitate data validation. The updated mollusk BLAST server will be useful to many malacological researchers, as it will save time in identifying and studying various molluscan genes.

Construction of a Full-length cDNA Library from Korean Stewartia (Stewartia koreana Nakai) and Characterization of EST Dataset (노각나무(Stewartia koreana Nakai)의 cDNA library 제작 및 EST 분석)

  • Im, Su-Bin;Kim, Joon-Ki;Choi, Young-In;Choi, Sun-Hee;Kwon, Hye-Jin;Song, Ho-Kyung;Lim, Yong-Pyo
    • Horticultural Science & Technology / v.29 no.2 / pp.116-122 / 2011
  • In this study, we report the generation and analysis of 1,392 expressed sequence tags (ESTs) from Korean Stewartia (Stewartia koreana Nakai). A cDNA library was generated from young leaf tissue and a total of 1,392 cDNAs were partially sequenced. EST and unigene sequence quality was determined by computational filtering, manual review, and BLAST analyses. Finally, 1,301 ESTs were retained after removal of the vector sequence and filtering for a minimum length of 100 nucleotides. A total of 893 unigenes, consisting of 150 contigs and 743 singletons, were identified after assembly. We also identified 95 new microsatellite-containing sequences from the unigenes and classified their structure according to repeat unit. According to a homology search with BLASTX against the NCBI database, 65% of ESTs were homologous to sequences with known function and 11.6% of ESTs matched sequences with putative or unknown function. The remaining 23.2% of ESTs showed no significant similarity to any protein sequence in the public database. Annotation-based searches against multiple databases, including wine grape and Populus sequences, helped to identify putative functions of ESTs and unigenes. Gene ontology (GO) classification showed that the most abundant GO terms were transport, nucleotide binding, and plastid in the biological process, molecular function and cellular component categories, respectively. The sequence data will be used to characterize potential roles of new genes in Stewartia and will provide useful tools as a genetic resource.
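
Classifying microsatellites by repeat unit, as described in this abstract, can be sketched with a regular expression. The abstract does not state its detection criteria, so the minimum of five tandem copies and the di- to hexanucleotide unit range below are assumptions, and `find_ssrs` is a hypothetical helper rather than the tool the authors used.

```python
import re


def find_ssrs(seq, min_repeats=5):
    """Return (start, unit, copies) for di- to hexanucleotide tandem repeats.

    min_repeats is an assumed cutoff; real SSR tools often use different
    minimums per unit length.
    """
    # Lazy {2,6}? tries the shortest unit first, so ATATAT... reports
    # the unit AT rather than ATAT.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Made-up sequence: six copies of AT flanked by non-repeat bases.
example = "GGC" + "AT" * 6 + "CCG"
ssrs = find_ssrs(example)
print(ssrs)
```

Grouping the returned hits by `len(unit)` gives the di-, tri-, tetra-, penta- and hexanucleotide classes.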

Loss of Heterozygosity at the Calcium Regulation Gene Locus on Chromosome 10q in Human Pancreatic Cancer

  • Long, Jin;Zhang, Zhong-Bo;Liu, Zhe;Xu, Yuan-Hong;Ge, Chun-Lin
    • Asian Pacific Journal of Cancer Prevention / v.16 no.6 / pp.2489-2493 / 2015
  • Background: Loss of heterozygosity (LOH) in chromosomal regions is crucial in tumor progression, and this study aimed to identify genome-wide LOH in pancreatic cancer. Materials and Methods: Single-nucleotide polymorphism (SNP) profiling data (GSE32682) from human pancreatic samples snap-frozen during surgery were downloaded from the Gene Expression Omnibus database. Genotype console software was used to perform data processing. Candidate genes with LOH were screened based on the genotype calls, SNP loci of LOH and the dbSNP database. Gene annotation was performed to identify the functions of candidate genes using the NCBI (National Center for Biotechnology Information) database, followed by Gene Ontology, INTERPRO, PFAM and SMART annotation and UCSC Genome Browser tracks for the unannotated genes using DAVID (the Database for Annotation, Visualization and Integrated Discovery). Results: The candidate genes with LOH identified in this study were MCU, MICU1 and OIT3 on chromosome 10. MCU was found to encode a calcium transporter, and MICU1 could encode an essential regulator of mitochondrial $Ca^{2+}$ uptake. The annotation analyses suggested that OIT3 may be associated with calcium binding and is regulated by a large number of transcription factors, including STAT, SOX9, CREB, NF-kB, PPARG and p53. Conclusions: Global genomic analysis of SNPs identified MICU1, MCU and OIT3 with LOH on chromosome 10, implying involvement of these genes in the progression of pancreatic cancer.
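
Screening for LOH from genotype calls can be illustrated with a minimal sketch. It assumes paired normal and tumor calls per SNP coded as AA, AB, BB or NoCall, and flags loci that are heterozygous in the normal sample but homozygous in the tumor. The abstract does not say whether paired samples or this rule were used, so this shows the general idea only; the function name and SNP IDs are hypothetical.

```python
def call_loh(normal_calls, tumor_calls):
    """Return sorted SNP IDs that are AB in normal and AA or BB in tumor.

    Both arguments map SNP ID -> genotype call ("AA", "AB", "BB", "NoCall").
    This pairing scheme is an assumption, not the study's stated method.
    """
    loh_snps = []
    for snp_id, normal in normal_calls.items():
        tumor = tumor_calls.get(snp_id)
        # Only an informative (heterozygous) normal locus can show LOH.
        if normal == "AB" and tumor in ("AA", "BB"):
            loh_snps.append(snp_id)
    return sorted(loh_snps)


# Made-up calls with hypothetical SNP IDs.
normal = {"snp_1": "AB", "snp_2": "AB", "snp_3": "AA", "snp_4": "AB"}
tumor = {"snp_1": "AA", "snp_2": "AB", "snp_3": "AA", "snp_4": "NoCall"}
loh = call_loh(normal, tumor)
print(loh)
```

Loci with a missing or failed tumor call are left out rather than guessed, which keeps the candidate list conservative.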

Hot Pepper Functional Genomics: Monitoring of Global Gene Expression Profiles During Non-Host Resistance Reactions in Hot Pepper Plant (Capsicum annuum)

  • Lee, Sanghyeob;Chung, Eun-Joo;Park, Doil
    • Proceedings of the Korean Society of Plant Pathology Conference / 2003.10a / pp.80.2-81 / 2003
  • As hot peppers (Capsicum annuum L.) gain a reputation as an important source of vitamins and medicines, among other uses, their consumption and cultivation are increasing worldwide. Despite this usefulness, little attention has been given to the hot pepper plant. To date, fewer than 500 nucleotide sequences, including redundant ones, have been identified in the NCBI database. Because of the large genome size of hot pepper ($2.7$ to $3.3 \times 10^{9}$ bp), we started an EST sequencing project for initial characterization of the genome. To date, a set of 10,000 non-redundant genes has been identified by EST sequencing for microarray-based gene expression studies. At present, cDNA microarrays containing 4,685 unigene clones are used for hybridization with labeled targets derived from pathogen-infected and uninoculated leaf tissues. Monitoring of gene expression profiles during hot pepper interactions with the soybean pustule pathogen (Xag; Xanthomonas axonopodis pv. glycines) will be presented.

Construction of EST Database for Comparative Gene Studies of Acanthamoeba

  • Moon, Eun-Kyung;Kim, Joung-Ok;Xuan, Ying-Hua;Yun, Young-Sun;Kang, Se-Won;Lee, Yong-Seok;Ahn, Tae-In;Hong, Yeon-Chul;Chung, Dong-Il;Kong, Hyun-Hee
    • Parasites, Hosts and Diseases / v.47 no.2 / pp.103-107 / 2009
  • The genus Acanthamoeba can cause severe infections in humans, such as granulomatous amebic encephalitis and amebic keratitis. However, little genomic information on Acanthamoeba has been reported. Here, we constructed an Acanthamoeba expressed sequence tag (EST) database (Acanthamoeba EST DB) derived from our four Acanthamoeba cDNA libraries. The Acanthamoeba EST DB contains 3,897 ESTs generated from amebae under various conditions (long-term in vitro culture, mouse brain passage, or encystation), together with Acanthamoeba data downloaded from the National Center for Biotechnology Information (NCBI) and the Taxonomically Broad EST Database (TBestDB). Almost all reported cDNA/genomic sequences of Acanthamoeba are provided through a stand-alone BLAST system with nucleotide (BLAST NT) and amino acid (BLAST AA) sequence databases. In the BLAST results, each gene links to relevant information, including sequence data, gene orthology annotations, relevant references, and a BlastX result. This is the first attempt to construct an Acanthamoeba database with genes expressed under diverse conditions. These data were integrated into a database (http://www.amoeba.or.kr).