• Title/Summary/Keyword: Gene Database


Design and Implementation of a Gene Sequence Database with Streptomyces Data

  • Kim, Jin;Kim, Bun-Joon;Kim, Jeong-Mi;Kim, Dong-Hoi
    • Proceedings of the Korean Information Science Society Conference / 2001.04b / pp.160-162 / 2001
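  • As gene sequences and related information grow explosively, there is an increasing need for gene information services for users, efficient online analysis of sequence information, efficient management of sequence information, and information sharing among researchers in the field. This paper discusses the design and implementation of a system that efficiently manages Streptomyces gene data on the Internet while providing useful services to users. Users can download the gene information they want from the system. They can also compare a gene they wish to analyze against the genes in the Streptomyces database to infer useful information.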


ChimerDB - Database of Chimeric Sequences in the GenBank

  • Kim, Namshin;Shin, Seokmin;Cho, Kwang-Hwi;Lee, Sanghyuk
    • Genomics & Informatics / v.2 no.2 / pp.61-66 / 2004
  • Fusion proteins resulting from chimeric sequences are excellent targets for therapeutic drug development. We developed a database of chimeric sequences by examining the genomic alignment of mRNA and EST sequences in GenBank. We identified 688 chimeric mRNA and 20,998 chimeric EST sequences. Including EST sequences greatly expands the scope of chimeric sequences, even though it inevitably introduces many artifacts. Chimeric sequences are clustered according to the ECgene ID so that the user can easily find chimeric sequences related to a specific gene. Alignments of chimeric sequences are displayed as custom tracks in the UCSC genome browser. ChimerDB, available at http://genome.ewha.ac.kr/ECgene/ChimerDB/, should be a valuable resource for finding drug targets to treat cancers.
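
The idea of calling a transcript chimeric from its genomic alignment can be sketched as follows. This is a minimal illustration, not the ChimerDB pipeline: the `Block` record, the `min_block_len` threshold, and the gene identifiers are hypothetical, and the abstract does not state the actual criteria used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One aligned block of a transcript on the genome (hypothetical record)."""
    q_start: int   # start on the transcript, 0-based
    q_end: int     # end on the transcript, exclusive
    gene_id: str   # gene cluster the block falls in (e.g. an ECgene-style ID)


def is_chimeric(blocks, min_block_len=50):
    """Flag a transcript whose sufficiently long blocks map to 2+ genes.

    min_block_len is an assumed filter against short spurious alignments.
    """
    genes = {b.gene_id for b in blocks if b.q_end - b.q_start >= min_block_len}
    return len(genes) >= 2


# Hypothetical transcripts: one spanning two genes, one within a single gene.
fusion = [Block(0, 400, "GENE_A"), Block(400, 900, "GENE_B")]
normal = [Block(0, 400, "GENE_A"), Block(400, 900, "GENE_A")]
noisy = [Block(0, 800, "GENE_A"), Block(800, 820, "GENE_B")]
```

Here `is_chimeric(fusion)` is true, while `normal` and `noisy` are not flagged. A length filter of this kind is one simple way to limit artifacts, which the abstract notes are common among EST sequences.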

Molecular genetic analysis of phytochelatin synthase genes in Arabidopsis

  • Ha, Suk-Bong
    • Proceedings of the Botanical Society of Korea Conference / 2002.04a / pp.62-72 / 2002
  • This study investigated the biosynthesis and function of the heavy metal-binding peptides, the phytochelatins (PCs), in plants. PCs are synthesised enzymatically from glutathione by the enzyme PC synthase in the presence of heavy metal ions. Using Arabidopsis thaliana as a model organism, cadmium-sensitive, phytochelatin-deficient mutants were isolated and characterised in previous studies. The cad1 mutants have wild-type levels of glutathione, are PC-deficient, and lack PC synthase activity. Thus, the CAD1 gene was proposed to encode PC synthase. The CAD1 gene was isolated by a positional cloning strategy. The gene was mapped and a candidate identified. Each of four cad1 mutants had a single base pair change in the candidate gene, and the cadmium-sensitive cad1 phenotype was complemented by the candidate gene. This demonstrated that the CAD1 gene had been cloned. A homologous gene in the fission yeast Schizosaccharomyces pombe was identified through database searches. A targeted deletion mutation of this gene was constructed, and the mutant, like the cad1 mutants of Arabidopsis, was cadmium-sensitive and PC-deficient. A comparison of the predicted amino acid sequences reveals a highly conserved N-terminal region, presumed to be the catalytic domain, and a variable C-terminal region containing multiple Cys residues proposed to be involved in activation of the enzyme by metal ions. Similar genes were also identified in animal species. The Arabidopsis CAD1/AtPCS1 and S. pombe SpbPCS genes were expressed in E. coli and shown to be sufficient for glutathione-dependent, heavy metal-activated PC synthesis in vitro, thus demonstrating that these genes encode PC synthase enzymes. Using RT-PCR, AtPCS1 expression appeared to be independent of Cd exposure. However, at higher levels of Cd exposure an AtPCS1-GUS reporter gene construct appeared to be more highly expressed. Using the reporter gene construct, AtPCS1 was found to be expressed in most tissues. Expression appeared to be greater in younger tissues, and somewhat higher levels of expression were observed in some regions, including carpels and the base of siliques. AtPCS2 was a functional gene encoding an active PC synthase. However, its pattern of expression and the phenotype of a mutant (or antisense line) have not been determined. Assuming the gene is functional, it has clearly been maintained through evolution and must provide some selective advantage. This implies that, at least in some cells or tissues, it is likely to be the dominant PC synthase expressed. This remains to be determined.


The Brassica rapa Tissue-specific EST Database

  • Yu, Hee-Ju;Park, Sin-Gi;Oh, Mi-Jin;Hwang, Hyun-Ju;Kim, Nam-Shin;Chung, Hee;Sohn, Seong-Han;Park, Beom-Seok;Mun, Jeong-Hwan
    • Horticultural Science & Technology / v.29 no.6 / pp.633-640 / 2011
  • Brassica rapa is an A genome model species for Brassica crop genetics, genomics, and breeding. With the sequencing of the B. rapa genome complete, functional analysis of the genome is the next task. Expressed sequence tags (ESTs) are fundamental resources supporting annotation and functional analysis of the genome, including identification of tissue-specific genes and promoters. As of July 2011, 147,217 ESTs from 39 cDNA libraries of B. rapa had been reported in the public database. However, little information can be retrieved from the sequences because organized databases are lacking. To leverage the sequence information and maximize the use of publicly available EST collections, the Brassica rapa tissue-specific EST database (BrTED) was developed. BrTED includes sequence information for 23,962 unigenes assembled by the StackPack program. The unigene set is used as a query unit for various analyses such as BLAST against the TAIR gene model, functional annotation using MIPS and UniProt, gene ontology analysis, and prediction of tissue-specific unigene sets based on statistical tests. The database is composed of two main units: an EST sequence processing and information retrieval unit, and a tissue-specific expression profile analysis unit. Information and data in both units are tightly interconnected through a web-based browsing system. RT-PCR evaluation of 29 selected unigene sets successfully produced amplicons from the target tissues of B. rapa. BrTED allows the user to identify and analyze the expression of genes of interest and aids efforts to interpret the B. rapa genome through functional genomics. In addition, it can be used as a public resource providing reference information for studying the genus Brassica and other closely related crucifer crops.

Retrieving Protein Domain Encoding DNA Sequences Automatically Through Database Cross-referencing

  • Choi, Yoon-Sup;Yang, Jae-Seong;Ryu, Sung-Ho;Kim, Sang-Uk
    • Bioinformatics and Biosystems / v.1 no.2 / pp.95-98 / 2006
  • Recent proteomic studies of protein domains require high-throughput and systematic approaches. Since most experiments using protein domains, the modules of protein-protein interactions, require gene cloning, the first experimental step should be retrieving the DNA sequences of domain-encoding regions from databases. For large-scale proteomic research, however, it is laborious to extract a large number of domain sequences manually from several interlinked databases. We present a new methodology to retrieve the DNA sequences of domain-encoding regions through automatic database cross-referencing. To extract protein domain-encoding regions, it traverses several interconnected databases with a validation process. We applied this method to retrieve all the EGF domain-encoding DNA sequences of Homo sapiens. The algorithm was implemented using the Python library PAMIE, which enables automatic cross-referencing across distinct databases.
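
The final step of such a retrieval, turning a domain's amino acid coordinates into the matching stretch of coding DNA, can be sketched as below. This is only an illustration under stated assumptions: the function names and the toy sequence are hypothetical, the domain coordinates are taken to be 1-based and inclusive, and the CDS is assumed to start at the first codon. The cross-referencing and validation steps described in the abstract are not reproduced.

```python
def domain_to_cds_range(aa_start, aa_end):
    """Map 1-based inclusive residue positions to a 0-based half-open CDS slice."""
    if aa_start < 1 or aa_end < aa_start:
        raise ValueError("expected 1 <= aa_start <= aa_end")
    # Each residue is encoded by three nucleotides.
    return (aa_start - 1) * 3, aa_end * 3


def extract_domain_dna(cds, aa_start, aa_end):
    """Return the nucleotides of `cds` that encode residues aa_start..aa_end."""
    start, end = domain_to_cds_range(aa_start, aa_end)
    if end > len(cds):
        raise ValueError("domain extends past the end of the coding sequence")
    return cds[start:end]


# Toy CDS (hypothetical): ATG GCC TGT GAA TAA, i.e. M A C E stop.
toy_cds = "ATGGCCTGTGAATAA"
domain_dna = extract_domain_dna(toy_cds, 2, 3)  # codons for residues 2 and 3
```

A simple validation in the same spirit would be to translate the extracted DNA and compare it with the domain's protein sequence before accepting the result.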


Prediction of Mammalian MicroRNA Targets - Comparative Genomics Approach with Longer 3' UTR Databases

  • Nam, Seungyoon;Kim, Young-Kook;Kim, Pora;Kim, V. Narry;Shin, Seokmin;Lee, Sanghyuk
    • Genomics & Informatics / v.3 no.3 / pp.53-62 / 2005
  • MicroRNAs play an important role in regulating gene expression, but identifying their targets is difficult because of their short length and imperfect complementarity. Burge and coworkers developed a program called TargetScan that allowed imperfect complementarity and established a procedure favoring targets with multiple binding sites conserved in multiple organisms. We improved their algorithm in two major aspects: (i) using a well-defined UTR (untranslated region) database, and (ii) examining the extent of conservation specifically inside the 3' UTR. The average length in our UTR database, based on the ECgene annotation, is more than twice that in Ensembl. TargetScan was then used to identify putative binding sites. The extent of conservation varies significantly inside the 3' UTR. We used the 'tight' tracks in the UCSC genome browser to select the binding sites conserved in multiple species. By combining the longer 3' UTR data, TargetScan, and tightly conserved blocks of genomic DNA, we identified 107 putative target genes with multiple binding sites conserved in multiple species, of which 85 are novel.
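
To make the notion of a binding site concrete, the sketch below scans a 3' UTR for perfect matches to a microRNA seed. It is a deliberately simplified stand-in, not TargetScan: the seed is assumed to be nucleotides 2 to 8 of the miRNA, only exact complementarity is accepted, and no conservation filter is applied. The function names and the example UTR are hypothetical.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def to_rna(seq):
    """Upper-case a sequence and convert DNA letters to RNA."""
    return seq.upper().replace("T", "U")


def seed_site(mirna, seed_start=2, seed_end=8):
    """Return the UTR motif that pairs with the miRNA seed (reverse complement)."""
    seed = to_rna(mirna)[seed_start - 1:seed_end]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_sites(utr, mirna):
    """Return 0-based start positions of all perfect seed matches in the UTR."""
    utr, site = to_rna(utr), seed_site(mirna)
    positions, i = [], utr.find(site)
    while i != -1:
        positions.append(i)
        i = utr.find(site, i + 1)
    return positions


let7a = "UGAGGUAGUAGGUUGUAUAGUU"          # mature let-7a sequence
toy_utr = "AAACUACCUCAAAGGCUACCUCUU"      # hypothetical UTR with two sites
sites = find_seed_sites(toy_utr, let7a)   # positions of the motif CUACCUC
has_multiple_sites = len(sites) >= 2
```

A gene would pass the "multiple binding sites" criterion in this toy setting when `has_multiple_sites` is true. Restricting the hits to conserved blocks would need alignment data that is outside this sketch.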

A Meta-Analysis of Fecal Bacterial Diversity in Dogs

  • Jeong, Jin Young;Kim, Minseok
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.1 / pp.141-147 / 2017
  • In this study, a meta-analysis of fecal bacteria in dogs was conducted using 16S rRNA gene sequences that have been recovered from cloning and Sanger sequencing. For this meta-analysis, we retrieved all 16S rRNA gene sequences recovered from fecal bacteria in dogs in the RDP database (Release 11, Update 3). A total of 420 sequences were identified from the RDP database, 42 of which were also recovered from cultured isolates. The 420 sequences were assigned to five phyla, of which Firmicutes was the most predominant phylum, accounting for 55.2% of all 420 sequences. Bacteroidetes was the second most predominant phylum, accounting for 32.1% of the 420 sequences, followed by Actinobacteria (6.4%), Fusobacteria (3.8%), and Proteobacteria (2.4%). The genus Bacteroides within Bacteroidetes was the largest, representing 30.0% of all 420 sequences, while the putative genus Clostridium XI within Firmicutes was the second largest, representing 27.4% of all 420 sequences. A total of 82 operational taxonomic units (OTUs) that are putative species were identified from the retrieved sequences. The results of this study will improve understanding of the diversity of fecal bacteria in dogs and guide future studies on the health and well-being of dogs.
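
As a quick arithmetic check, the reported phylum percentages can be converted back into approximate sequence counts. The counts below are back-calculated by rounding and are not figures reported in the abstract; the variable and function names are illustrative.

```python
TOTAL_SEQUENCES = 420

# Percentages as stated in the abstract.
phylum_percent = {
    "Firmicutes": 55.2,
    "Bacteroidetes": 32.1,
    "Actinobacteria": 6.4,
    "Fusobacteria": 3.8,
    "Proteobacteria": 2.4,
}


def approx_counts(percentages, total):
    """Back-calculate rounded sequence counts from percentages of a total."""
    return {name: round(p / 100 * total) for name, p in percentages.items()}


counts = approx_counts(phylum_percent, TOTAL_SEQUENCES)
percent_sum = round(sum(phylum_percent.values()), 1)
count_sum = sum(counts.values())
```

The percentages sum to 99.9 because of rounding, and the back-calculated counts add up to the full 420 sequences, so the five phyla account for the whole data set.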

The role of RNA epigenetic modification-related genes in the immune response of cattle to mastitis induced by Staphylococcus aureus

  • Yue Xing;Yongjie Tang;Quanzhen Chen;Siqian Chen;Wenlong Li;Siyuan Mi;Ying Yu
    • Animal Bioscience / v.37 no.7 / pp.1141-1155 / 2024
  • Objective: RNA epigenetic modifications play an important role in regulating the immune response of mammals. Bovine mastitis induced by Staphylococcus aureus (S. aureus) is a threat to the health of dairy cattle. There are numerous RNA modifications, and how the associated enzymes systematically coordinate their immunomodulatory effects during bovine mastitis is not well reported. Therefore, the role of common RNA modification-related genes (RMRGs) in bovine S. aureus mastitis was investigated in this study. Methods: In total, 80 RMRGs were selected for this study. Four public RNA-seq data sets on bovine S. aureus mastitis were collected and one additional RNA-seq data set was generated in this study. First, a quantitative trait locus (QTL) database, a transcriptome-wide association study (TWAS) database, and differential expression analyses were employed to characterize the potential functions of the selected enzyme genes in bovine S. aureus mastitis. Correlation analysis and weighted gene co-expression network analysis (WGCNA) were used to further investigate the relationships among RMRGs of different types at the mRNA expression level. Interference experiments targeting the m6A demethylase FTO, together with a public MeRIP-seq data set from bovine Mac-T cells, were used to investigate the potential interaction mechanisms among various RNA modifications. Results: The cattle QTL and TWAS databases revealed associations between RMRGs and immune-related complex traits. S. aureus-challenged and control groups were effectively distinguished by principal component analysis based on the expression of the selected RMRGs. WGCNA and correlation analysis identified modules grouping different RMRGs with highly correlated mRNA expression. The m6A modification gene FTO showed significant effects on the expression of m6A and other RMRGs (such as NSUN2, CPSF2, and METTLE), indicating complex co-expression relationships among different RNA modifications in the regulation of bovine S. aureus mastitis. Conclusion: RNA epigenetic modification genes play important immunoregulatory roles in bovine S. aureus mastitis, and there are extensive interactions in mRNA expression among different RMRGs. Future work should investigate the interactions between RNA modification genes that regulate complex traits.

Structure-based Functional Discovery of Proteins: Structural Proteomics

  • Jung, Jin-Won;Lee, Weon-Tae
    • BMB Reports / v.37 no.1 / pp.28-34 / 2004
  • The discovery of the biochemical and cellular functions of unannotated gene products begins with a database search for structure or sequence homologues among proteins of known genes. Very recently, a number of leading groups in structural biology proposed a new paradigm: predicting the biological functions of an unknown protein from its three-dimensional structure on a genomic scale. Structural proteomics (genomics), a research area for structure-based functional discovery, aims to complete the protein-folding universe of all gene products in a cell. It would lead us to a complete understanding of a living organism from protein structure. Two major complementary experimental techniques, X-ray crystallography and NMR spectroscopy, combined with recently developed high-throughput methods, have played a central role in structural proteomics research; however, integrating these methodologies with comparative modeling and electron microscopy would accelerate progress toward completing a full dictionary of protein folding space in the near future.

Genomic Tree of Gene Contents Based on Functional Groups of KEGG Orthology

  • Kim Jin-Sik;Lee Sang-Yup
    • Journal of Microbiology and Biotechnology / v.16 no.5 / pp.748-756 / 2006
  • We propose a genome-scale clustering approach to identify whole-genome relationships using the functional groups given by the Kyoto Encyclopedia of Genes and Genomes Orthology (KO) database. The metabolic capabilities of each organism were defined by the number of genes in each functional category. The archaeal, bacterial, and eukaryotic genomes were compared by applying a two-step clustering method, comprised of a self-organizing tree algorithm followed by unsupervised hierarchical clustering. The clustering results were consistent with various phenotypic characteristics of the organisms analyzed and, additionally, showed a different aspect of the relationships between genomes from those previously established through rRNA-based comparisons. The proposed approach of collecting and clustering the metabolic functional capabilities of organisms should be a useful tool for predicting relationships among organisms.
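
The gene-content idea above can be sketched as follows: each genome becomes a vector of gene counts per functional category, and genomes are grouped by the distance between those vectors. This is a minimal illustration only. The organism names and counts are hypothetical, the self-organizing tree step is omitted, and the normalization, Euclidean distance, and average linkage are assumptions, since the abstract does not specify them.

```python
from itertools import combinations
from math import dist


def normalize(counts):
    """Scale category counts to fractions so genome size does not dominate."""
    total = sum(counts)
    return [c / total for c in counts]


def average_linkage(profiles):
    """Naive agglomerative clustering; returns the merges in order."""
    clusters = [(name,) for name in profiles]
    merges = []

    def cluster_distance(a, b):
        # Mean pairwise Euclidean distance between members of two clusters.
        pairs = [(x, y) for x in a for y in b]
        return sum(dist(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda p: cluster_distance(*p))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append((a, b))
    return merges


# Hypothetical gene counts in three functional categories per genome.
raw_counts = {
    "bacterium_1": [40, 10, 5],
    "bacterium_2": [38, 12, 6],
    "archaeon_1": [10, 30, 20],
}
profiles = {name: normalize(c) for name, c in raw_counts.items()}
merges = average_linkage(profiles)
```

With these toy counts the two bacterial genomes merge first and the archaeal genome joins last, which is the kind of grouping by functional content that the abstract describes.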