• Title/Abstract/Keywords: Exons

266 search results (processing time 0.026 s)

Identification of Causal and/or Rare Genetic Variants for Complex Traits by Targeted Resequencing in Population-based Cohorts

  • Kim, Yun-Kyoung;Hong, Chang-Bum;Cho, Yoon-Shin
    • Genomics & Informatics
    • /
    • Vol. 8, No. 3
    • /
    • pp.131-137
    • /
    • 2010
  • Genome-wide association studies (GWASs) have greatly contributed to the identification of common variants responsible for numerous complex traits. There are, however, unavoidable limitations in detecting causal and/or rare variants for traits with this approach, which depends on an LD-based tagging SNP microarray chip. In an effort to detect potential causal and/or rare variants for complex traits, such as type 2 diabetes (T2D) and triglycerides (TGs), we conducted targeted resequencing of loci identified by the Korea Association REsource (KARE) GWAS. The target regions for resequencing comprised whole exons, exon-intron boundaries, and regulatory regions of genes that appeared within 1 Mb of the GWA signal boundary. From 124 individuals selected from population-based cohorts, a total of 0.7 Mb of target regions was captured by the NimbleGen sequence capture 385K array. Subsequent sequencing, carried out on the Roche 454 Genome Sequencer FLX, generated about 110,000 sequence reads per individual. Sequence reads were mapped to the human reference genome using the SSAHA2 program. An average of 62.2% of total reads mapped to targets, with an average 22-fold coverage. A total of 5,983 SNPs (average 846 SNPs per individual) were called and annotated by GATK software, with 96.5% accuracy as estimated by comparison with Affymetrix 5.0 genotype data from the same individuals. About 51% of total SNPs were singletons that can be considered possible rare variants in the population. Among SNPs that appeared in exons, which account for about 20% of total SNPs, 304 nonsynonymous singletons were tested with PolyPhen to predict the protein damage caused by mutation. In total, we were able to detect 9 and 6 potentially functional rare SNPs for T2D and triglycerides, respectively, calling for a further step of replication genotyping in independent populations to confirm their bona fide relevance to the traits.

Molecular cloning, sequence polymorphism and genomic organization of far eastern catfish (Silurus asotus) GH gene

  • Park, Byul-Nim;Bang, In-Chul;Kim, Dong-Soo;Nam, Yoon-Kwon
    • Korean Aquaculture Society: Conference Proceedings
    • /
    • Korean Aquaculture Society 2003 Autumn Conference, Abstracts
    • /
    • pp.42-42
    • /
    • 2003
  • The far eastern catfish (Silurus asotus) growth hormone (GH) gene was cloned and characterized. The complete nucleotide sequences of the genomic GH gene and a catfish GH cDNA were obtained by RT-PCR and gene filter screening. The GH cDNA and genomic gene span 1.0 and 1.8 kb from the start codon to the polyadenylation signal, respectively. Sequence polymorphism, including various silent mutations, was detected in both the cDNA and genomic GH sequences. The genomic GH gene comprises only four exons and three introns, a novel type of fish GH gene structure. The evolutionary relationship of the catfish GH gene was inferred from a comparative phylogenetic analysis using the gene structures and sequences.

Molecular Cloning and Characterization of a Lipocalin in the Bumblebee Bombus Ignitus

  • Hu, Zhigang;Yoon, Hyung-Joo;Sohn, Hung-Dae;Jin, Byung-Rae
    • International Journal of Industrial Entomology and Biomaterials
    • /
    • Vol. 19, No. 2
    • /
    • pp.229-235
    • /
    • 2009
  • We have cloned and characterized a lipocalin from the bumblebee Bombus ignitus (Bi-lipocalin). The Bi-lipocalin gene spans 2284 bp and consists of four exons coding for 270 amino acid residues. Sequence analysis revealed that Bi-lipocalin possesses three structurally conserved regions (SCRs) that characterize lipocalins. Recombinant Bi-lipocalin, expressed as a 37 kDa protein in baculovirus-infected insect cells, was N-glycosylated, indicating that the carbohydrate moieties are necessary for secretion. Tissue distribution analysis revealed ubiquitous expression of Bi-lipocalin in all tissues examined. Bi-lipocalin transcripts were upregulated by stress, such as wounding, $H_2O_2$ exposure, and external temperature shock. These results indicate that Bi-lipocalin is a stress-inducible protein that responds to wounding, $H_2O_2$ overexposure and temperature stimulation.

CONVIRT: A web-based tool for transcriptional regulatory site identification using a conserved virtual chromosome

  • Ryu, Tae-Woo;Lee, Se-Joon;Hur, Cheol-Goo;Lee, Do-Heon
    • BMB Reports
    • /
    • Vol. 42, No. 12
    • /
    • pp.823-828
    • /
    • 2009
  • Techniques for analyzing protein-DNA interactions on a genome-wide scale have recently established regulatory roles for distal enhancers. However, the large sizes of higher eukaryotic genomes have made identification of these elements difficult. Information regarding sequence conservation, exon annotation and repetitive regions can be used to reduce the size of the search region. However, previously developed resources are inadequate for consolidating such information. CONVIRT is a web resource for the identification of transcription factor binding sites and also features comparative genomics. Genomic information on ortholog-independent conserved regions, exons, repeats and sequences is integrated into the virtual chromosome, and statistically over-represented single or combinations of transcription factor binding sites are sought. CONVIRT provides regulatory network analysis for several organisms with long promoter regions and permits inter-species genome alignments. CONVIRT is freely available at http://biosoft.kaist.ac.kr/convirt.

A Method for Identifying Splice Sites and Translation Start Sites in Human Genomic Sequences

  • Kim, Ki-Bong;Park, Kie-Jung;Kong, Eun-Bae
    • BMB Reports
    • /
    • Vol. 35, No. 5
    • /
    • pp.513-517
    • /
    • 2002
  • We describe a new method for identifying the sequences that signal the start of translation, and the boundaries between exons and introns (donor and acceptor sites) in human mRNA. According to the mandatory keyword, ORGANISM, and feature key, CDS, a large set of standard data for each signal site was extracted from the ASCII flat file, gbpri.seq, in the GenBank release 108.0. This was used to generate the scoring matrices, which summarize the sequence information for each signal site. The scoring matrices take into account the independent nucleotide frequencies between adjacent bases in each position within the signal site regions, and the relative weight on each nucleotide in proportion to their probabilities in the known signal sites. Using a scoring scheme that is based on the nucleotide scoring matrices, the method has great sensitivity and specificity when used to locate signals in uncharacterized human genomic DNA. These matrices are especially effective at distinguishing true and false sites.
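The scoring-matrix idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it builds a plain position-specific log-odds matrix (with an assumed pseudocount and a uniform 0.25 background) and ignores the adjacent-base information the abstract mentions. The training sites in `DONOR_SITES` are hypothetical toy strings, not data extracted from GenBank, and all function names are illustrative.

```python
import math

BASES = "ACGT"

def build_scoring_matrix(sites, pseudocount=1.0, background=0.25):
    """Return one dict per position mapping base -> log2-odds score.

    Assumption: a uniform background frequency and an additive pseudocount.
    """
    length = len(sites[0])
    matrix = []
    for pos in range(length):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {b: math.log2(counts[b] / total / background) for b in BASES}
        )
    return matrix

def score(matrix, window):
    """Sum the per-position scores of a window as long as the matrix."""
    return sum(column[base] for column, base in zip(matrix, window))

def best_site(matrix, sequence):
    """Slide the matrix along the sequence; return (best score, start index)."""
    width = len(matrix)
    candidates = [
        (score(matrix, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(candidates)

# Hypothetical toy donor-site examples (exon end followed by GT...).
DONOR_SITES = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "GAGGTAAGT", "CAGGTGAGT"]

matrix = build_scoring_matrix(DONOR_SITES)
top_score, top_pos = best_site(matrix, "TTTTCCCCAGGTAAGTCCCTTT")
print(top_pos, round(top_score, 2))
```

In practice a score threshold chosen on held-out true and false sites would decide which windows are reported; the sketch only returns the single best window.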

RGISS: Rice (Oryza sativa L. ssp. japonica) Genome Information Service System

  • Lee, Dae-Sang;Seo, Hwa-Jung;Hahn, Jang-Ho;Kong, Eun-Bae;Park, Kie-Jung
    • Genomics & Informatics
    • /
    • Vol. 5, No. 4
    • /
    • pp.194-195
    • /
    • 2007
  • We have constructed the Rice Genome Information Service System (RGISS), which is an information service system of the Oryza sativa L. ssp. japonica (rice) genome, using the released version of rice Build 3.0 pseudomolecules based on the Ensembl architecture. The nonredundant library, composed of 3,360 clones of BACs, PACs, and fosmids, was used to construct supercontigs. RGISS contains 50,717 annotated genes from GenBank, 56,161 predicted genes from FgeneSH, and information on 9,587 markers, which includes STS, SSR, and EST-based RFLP. The 20,180 ESTs sequenced by the Korea National Institute of Agricultural Biotechnology (NIAB) were aligned and mapped into 168,792 exons. By gene ontology analysis, the classified protein numbers in the rice genome were 6,158, 4,531, and 12,364 proteins, which were mapped to molecular function, cellular component, and biological process, respectively.

Analyzing Exon Structure with PCA and ICA of Short-Time Fourier Transform

  • Hwang Changha;Sohn Insuk
    • Korean Statistical Society: Conference Proceedings
    • /
    • Korean Statistical Society 2004 Conference Proceedings
    • /
    • pp.79-84
    • /
    • 2004
  • We use principal component analysis (PCA) to identify exons of a gene and further analyze their internal structures. The PCA is conducted on the short-time Fourier transform (STFT) based on the 64 codon sequences and the 4 nucleotide sequences. By comparison with independent component analysis (ICA), we can differentiate between the exon and intron regions and examine how they are correlated in terms of the squared magnitudes of the STFTs. The experiment is done on the gene F56F11.4 on chromosome III of C. elegans. For these data, the nucleotide-based PCA identifies the exon and intron regions clearly. The codon-based PCA reveals a weak internal structure in some exon regions but not in others. The result of ICA shows that the nucleotides thymine (T) and guanine (G) carry almost all the information about the exon and intron regions for these data. We hypothesize the existence of complex exon structures that deserve more detailed analysis.
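As a rough illustration of the nucleotide-based variant described here, the sketch below computes squared STFT magnitudes of the four base indicator sequences and projects them onto their first principal component. Several choices are assumptions not stated in the abstract: the STFT is evaluated only at frequency 1/3 (the period-3 signal commonly associated with coding regions), a rectangular window of 60 bases with step 6 is used, and the principal component is found by simple power iteration. The toy sequence is synthetic, with a strictly periodic stand-in for an exon, and is not the F56F11.4 gene.

```python
import cmath
import random

BASES = "ACGT"

def stft_power(seq, window=60, step=6):
    """Squared STFT magnitude at frequency 1/3 for each base indicator.

    Returns one row [P_A, P_C, P_G, P_T] per window position.
    """
    rows = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        row = []
        for base in BASES:
            coeff = sum(cmath.exp(-2j * cmath.pi * n / 3)
                        for n, ch in enumerate(chunk) if ch == base)
            row.append(abs(coeff) ** 2)
        rows.append(row)
    return rows

def first_principal_component(rows, iterations=200):
    """Leading eigenvector of the covariance matrix via power iteration.

    Returns the unit eigenvector and the projection score of every row.
    """
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    vec = [1.0] * dim
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    scores = [sum(c[j] * vec[j] for j in range(dim)) for c in centered]
    return vec, scores

def random_dna(length):
    """Uniformly random bases, standing in for a non-coding region."""
    return "".join(random.choice(BASES) for _ in range(length))

# Synthetic toy data: random flanks around a strictly period-3 "exon".
random.seed(0)
toy = random_dna(300) + "GCT" * 100 + random_dna(300)
rows = stft_power(toy)
component, scores = first_principal_component(rows)
```

Windows inside the periodic stretch and windows inside the random flanks fall on opposite sides of zero along the first component; the sign itself is arbitrary, as with any eigenvector.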


Identification of a Dysferlin Gene Mutation in One Patient Showing Clinical Manifestation of Miyoshi Myopathy

  • 지명구;김남희;김대성;최영철
    • Annals of Clinical Neurophysiology
    • /
    • Vol. 11, No. 2
    • /
    • pp.59-63
    • /
    • 2009
  • Miyoshi myopathy (MM) is caused by mutations in the dysferlin gene (DYSF), which impair the function of the dysferlin protein and cause muscle membrane dysfunction. We report a patient showing the MM phenotype who has a sister with the LGMD 2B phenotype, along with the results of immunohistochemical and molecular analyses of the DYSF gene. Immunohistochemical analysis showed negative immunoreactivity against dysferlin. Direct DNA sequencing of all exons of the DYSF gene revealed heterozygous nonsense mutations (c.610C>T + c.2494C>T). To our knowledge, this is the first reported MM case with this particular combination of heterozygous mutations.


Cloning and Characterization of hydroxypyruvate isomerase (EC 5.3.1.22) gene in silkworm Bombyx mori

  • Lv, HongGang;Chen, KePing;Yao, Qin;Wang, Lin
    • International Journal of Industrial Entomology and Biomaterials
    • /
    • Vol. 17, No. 2
    • /
    • pp.189-195
    • /
    • 2008
  • The sequence of the hydroxypyruvate isomerase gene was obtained from NCBI. In this study, the hydroxypyruvate isomerase gene of Bombyx mori was identified and annotated with bioinformatics tools. The result was confirmed by RT-PCR, prokaryotic expression, mass spectrometric analysis and subcellular localization. The hydroxypyruvate isomerase cDNA contains a 783 bp ORF and has 4 exons. The deduced protein has 260 amino acid residues with a predicted molecular weight of 29169.30 Da and an isoelectric point of 6.10, and contains conserved PRK09997 and Hfi domains. The hydroxypyruvate isomerases of Nasonia vitripennis and Bombyx mori show high homology. RT-PCR analysis showed that this transcript was present in testis, ovary, hemolymph, fat body, midgut, silk gland and Malpighian tubules. Immunohistochemistry located the protein in the cytoplasm. We submitted the cloned gene under the accession number EU344910. The enzyme is classified under EC number 5.3.1.22.

Bridging a Gap between DNA sequences and expression patterns of genes

  • Morishita, Shinichi
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, 2000 International Symposium on Bioinformatics
    • /
    • pp.69-70
    • /
    • 2000
  • The completion of human genome sequencing motivates us to map millions of human cDNAs onto the unique ruler, the "genome sequence", in order to identify the exact address of each cDNA together with its exons, its promoter region, and its alternative splicing patterns. The expression patterns of some cDNAs could then be associated with these precise gene addresses, which would further accelerate studies mining correlations between promoter motifs and gene expression in tissues. Toward this goal, we have developed time- and space-efficient software named SQUALL that can map one cDNA sequence a few thousand bases long onto a genome sequence thirty million bases long in a couple of minutes on average. Using SQUALL, we have mapped twenty thousand of our Bodymap (http://bodymap.ims.u-tokyo.ac.jp) cDNAs onto the genome sequences of chromosomes 21 and 22. In this talk, I will report the status of this ongoing project.
