• Title/Summary/Keyword: Seq2Seq(Sequence to Sequence)

Efficiency to Discovery Transgenic Loci in GM Rice Using Next Generation Sequencing Whole Genome Re-sequencing

  • Park, Doori;Kim, Dongin;Jang, Green;Lim, Jongsung;Shin, Yun-Ji;Kim, Jina;Seo, Mi-Seong;Park, Su-Hyun;Kim, Ju-Kon;Kwon, Tae-Ho;Choi, Ik-Young
    • Genomics & Informatics / v.13 no.3 / pp.81-85 / 2015
  • Molecular characterization of genetically modified organisms, together with the way transgenic biotechnologies are developed, now requires full transparency to assess the risk to living modified and non-modified organisms. Next generation sequencing (NGS) has been suggested as an effective means of genome characterization and of detecting transgenic insertion locations. In the present study, we applied NGS to locate transgenic insertion loci, specifically for the epidermal growth factor (EGF) transgene, in genetically modified rice cells. A total of 29.3 Gb (~72× coverage) was sequenced with a 2 × 150 bp paired-end method on the Illumina HiSeq2500, and the reads were mapped in turn to the rice genome and the T-vector sequence. Compatible read pairs were successfully mapped to 10 loci spanning the rice chromosomes and the vector sequence, and these candidate insertion locations were checked by polymerase chain reaction (PCR) amplification. The EGF transgenic site was confirmed only on chromosome 4 by PCR. These results demonstrate that NGS data can be used to characterize the rice genome. Bioinformatics analyses must be developed alongside NGS data to identify transgenic sites with high accuracy.

In silico approaches to discover the functional impact of non-synonymous single nucleotide polymorphisms in selective sweep regions of the Landrace genome

  • Shin, Donghyun;Won, Kyung-Hye;Song, Ki-Duk
    • Asian-Australasian Journal of Animal Sciences / v.31 no.12 / pp.1980-1990 / 2018
  • Objective: The aim of this study was to discover the functional impact of non-synonymous single nucleotide polymorphisms (nsSNPs) found in selective sweep regions of the Landrace genome. Methods: Whole-genome re-sequencing data generated with the Illumina HiSeq 2000 platform were obtained from 40 pigs, including 14 Landrace, 16 Yorkshire, and 10 wild boars. The nsSNPs in the selective sweep regions of the Landrace genome were identified, and the impacts of these variations on protein function were predicted to reveal their potential association with traits of the Landrace breed, such as reproductive capacity. Results: A total of 53,998 nsSNPs were identified in the mapped regions of the pig genomes, and 345 of them were located in previously reported selective sweep regions of the Landrace genome. The genes carrying these nsSNPs fell into various functional categories, such as reproductive capacity or growth and development during the perinatal period. The impacts of nsSNP-induced amino acid changes on protein function were predicted using two in silico SNP prediction algorithms, Sorting Intolerant From Tolerant (SIFT) and Polymorphism Phenotyping v2 (PolyPhen-2), to reveal their potential roles in biological processes that might be associated with the reproductive capacity of the Landrace breed. Conclusion: The findings elucidated the domestication history of the Landrace breed and illustrated how Landrace domestication led to patterns of genetic variation related to superior reproductive capacity. Our novel findings will help in understanding the process of Landrace domestication at the genome level and provide SNPs that are informative for breeding.

Structural analysis of expressed sequence tags in immature seed of Oryza sativa L. (벼 미숙종자의 발현유전자 구조특성분석)

  • Yoon, Ung-Han;Lee, Gang-Seob;Lee, Jung-Sook;Hahn, Jang-Ho;Kim, Chang-Kug;Kikuch, Shoshi;Satoh, Kouji;Kim, Jin-A;Lee, Jeong-Hwa;Lee, Tae-Ho;Kim, Yong-Hwan
    • Journal of Plant Biotechnology / v.36 no.2 / pp.130-136 / 2009
  • Rice (Oryza sativa) is the most important staple crop in Korea. With its small genome size of 389 Mb, rice is a model plant for genome research. We analyzed expressed sequence tag (EST) clones from immature seeds of rice (cv. Ilpum) at 20 days after heading. The 25,668 EST clones were clustered using the SeqMan program, and 7,509 clones were selected as unique clones. We compared the 7,509 unique clones with the KOME database, which includes 32,127 full-length cDNA (FL-cDNA) clones of rice. Of these, 4,990 clones were homologous and 2,519 clones were non-homologous to FL-cDNA clones. In addition, we mapped the 7,509 cDNA clones using TIGR rice pseudomolecule version 5. In total, 7,347 clones showed significant matches to the TIGR rice pseudomolecules, whereas 162 clones were unmapped. To cluster genes into orthologous groups, we further analyzed the 7,509 EST clones from immature seeds using the NCBI clusters of orthologous groups database. Among them, 4,968 clones were categorized into information storage and processing, cellular processes and signaling, metabolism, and poorly characterized genes, with 799 (14.89%), 1,536 (28.3%), 1,148 (21.2%), and 1,936 (35.7%) clones assigned to these four categories, respectively.

Transcriptome analysis of the livers of ducklings hatched normally and with assistance

  • Liu, Yali;He, Shishan;Zeng, Tao;Du, Xue;Shen, Junda;Zhao, Ayong;Lu, Lizhi
    • Asian-Australasian Journal of Animal Sciences / v.30 no.6 / pp.773-780 / 2017
  • Objective: "Hatchability" is an important economic trait in domestic poultry. Studies on poultry hatchability focus mainly on genetic background, egg quality, and incubation conditions, whereas the molecular mechanisms that explain why some ducklings fail to break their eggshells are poorly understood. Methods: In this study, the transcriptional differences between the livers of normally hatched and assisted ducklings were systematically analyzed. Results: The clean reads were de novo assembled with Trinity into 161,804 and 159,083 unigenes (≥200 bp long), with average lengths of 1,206 bp and 882 bp, respectively. Unigenes with an absolute log2 fold-change ≥1 and a false discovery rate ≤0.05 were considered significantly differentially expressed. As a result, 1,629 differentially expressed unigenes were identified; the assisted ducklings showed 510 significantly upregulated and 1,119 significantly downregulated unigenes. In general, the metabolic rate in the livers of the assisted ducklings was lower than that in the normal ducklings; however, compared to normal ducklings, glucose-6-phosphatase and ATP synthase subunit alpha 1, which are associated with energy metabolism, were significantly upregulated in the assisted group. Genes involved in immune defense, such as major histocompatibility complex (MHC) class I antigen alpha chain and MHC class II beta chain 1, were downregulated in the assisted ducklings. Conclusion: These data provide abundant sequence resources for studying the functional genome of the liver in ducks and other poultry. In addition, our study provides insight into the molecular mechanism by which the phenomenon of weak embryos is regulated.
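
The selection rule in the abstract above (absolute log2 fold-change of at least 1 and false discovery rate of at most 0.05) can be illustrated with a minimal sketch. The `Unigene` class, the `classify` function, and the toy values are hypothetical and are not taken from the study; the study's actual pipeline is not described in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Unigene:
    name: str
    log2_fold_change: float  # assisted vs. normal (assumed direction)
    fdr: float               # false discovery rate


def classify(genes, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Split genes that pass both cutoffs into up- and downregulated lists."""
    up, down = [], []
    for g in genes:
        # Skip genes that fail either the significance or effect-size cutoff.
        if g.fdr > fdr_cutoff or abs(g.log2_fold_change) < lfc_cutoff:
            continue
        (up if g.log2_fold_change > 0 else down).append(g.name)
    return up, down


# Made-up illustrative values, not data from the study.
toy = [
    Unigene("u1", 2.3, 0.001),   # passes, upregulated
    Unigene("u2", -1.0, 0.05),   # passes exactly at both cutoffs, downregulated
    Unigene("u3", 0.8, 0.0001),  # fails the fold-change cutoff
    Unigene("u4", -3.1, 0.2),    # fails the FDR cutoff
]
up, down = classify(toy)
print(up, down)
```

Applied to the reported counts, the two groups add up as stated: 510 upregulated plus 1,119 downregulated unigenes gives 1,629.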

Blood transcriptome resources of chinstrap (Pygoscelis antarcticus) and gentoo (Pygoscelis papua) penguins from the South Shetland Islands, Antarctica

  • Kim, Bo-Mi;Jeong, Jihye;Jo, Euna;Ahn, Do-Hwan;Kim, Jeong-Hoon;Rhee, Jae-Sung;Park, Hyun
    • Genomics & Informatics / v.17 no.1 / pp.5.1-5.9 / 2019
  • The chinstrap (Pygoscelis antarcticus) and gentoo (P. papua) penguins are distributed throughout Antarctica and the sub-Antarctic islands. In this study, high-quality de novo assemblies of blood transcriptomes from these penguins were generated using the Illumina MiSeq platform. A total of 22.2 and 21.8 raw reads were obtained from chinstrap and gentoo penguins, respectively. These reads were assembled using the Oases assembly platform and resulted in 26,036 and 21,854 contigs with N50 values of 929 and 933 base pairs, respectively. Functional gene annotations through pathway analyses of the Gene Ontology, EuKaryotic Orthologous Groups, and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases were performed for each blood transcriptome, resulting in a similar compositional order between the two transcriptomes. Ortholog comparisons with previously published transcriptomes from the Adélie (P. adeliae) and emperor (Aptenodytes forsteri) penguins revealed that a high proportion of the four penguins' transcriptomes had significant sequence homology. Because blood and tissues of penguins have been used to monitor pollution in Antarctica, immune parameters in blood could be important indicators for understanding the health status of penguins and other Antarctic animals. In the blood transcriptomes, KEGG analyses detected many essential genes involved in the major innate immunity pathways, which are key metabolic pathways for maintaining homeostasis against exogenous infections or toxins. Blood transcriptome studies such as this may be useful for checking the immune and health status of penguins without sacrifice.
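
The N50 values quoted in the abstract above summarize assembly contiguity: N50 is the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the total assembled length. A minimal sketch of that calculation follows; the function name `n50` and the toy contig lengths are illustrative assumptions, not output from the Oases assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running total to half defines N50.
        if running >= half:
            return length


# Made-up contig lengths (bp), not data from the study.
toy_lengths = [2000, 1500, 1000, 900, 600, 400, 300]
print(n50(toy_lengths))
```

In the toy example the total is 6,700 bp, so the half-way point of 3,350 bp is reached within the second-longest contig.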

Molecular Phylogenetic Study of the Endangered Land Snail Satsuma myomphala Based on Metallothionein Gene. (Metallothionein 유전자를 기초로 한 멸종위기 육상 달팽이 Satsuma myomphala (거제외줄달팽이) 의 분자계통학적 연구)

  • Sang, Min Kyu;Kang, Se Won;Hwang, Hee-Ju;Chung, Jong Min;Song, Dae Kwon;Min, Hye Rin;Park, Jie Eun;Ha, Hee Cheol;Lee, Hyun Jun;Hong, Chan Eui;Ahn, Young Mo;Park, So Young;Park, Young-Su;Park, Hong Seog;Han, Yeon Soo;Lee, Jun Sang;Lee, Yong Seok
    • The Korean Journal of Malacology / v.32 no.4 / pp.263-268 / 2016
  • The metallothionein (MT) family of metal-binding proteins is involved in maintaining homeostasis and in heavy metal poisoning. Recently, MT has been considered as a biomarker that can identify a particular species, much like the cytochrome oxidase I (COI) gene. The land snail species Satsuma myomphala has been reported from North-East Asia, including South Korea and Japan. In particular, this land snail species is known from only a limited area of Geoje Island in Gyeongsangnam-do province, South Korea. Genetic studies of S. myomphala have been limited, with only 6 nucleotide and 2 protein sequences registered on the NCBI server. To elucidate the genetic information of S. myomphala, we conducted RNA sequencing analysis using the Illumina HiSeq 2500 next-generation platform. We screened the MT gene from the RNA-Seq database to confirm the molecular phylogenetic relationship. After sequencing, de novo analysis and clustering generated 103,774 unigenes. After annotation against the PANM database using the BLAST program, we obtained an MT sequence of 74 amino acid residues with a coding region of 222 bp. Based on this sequence, we found about 53 sequences using the BLAST program in the NCBI nr database. Using ClustalX alignment and the Maximum-Likelihood tree method of the MEGA program, we confirmed molecular phylogenetic relationships that showed similarity with mollusks such as Helix pomatia, H. aspersa, and Megathura crenulata.

Novel splice isoforms of pig myoneurin and their diverse mRNA expression patterns

  • Guo, Xiaohong;Li, Meng;Gao, Pengfei;Cao, Guoqing;Cheng, Zhimin;Zhang, Wanfeng;Liu, Jianfeng;Liu, Xiaojun;Li, Bugao
    • Asian-Australasian Journal of Animal Sciences / v.31 no.10 / pp.1581-1590 / 2018
  • Objective: The aim of this study was to clone alternative splicing isoforms of pig myoneurin (MYNN), predict the structure and function of the encoded proteins, and study the temporal and spatial expression characteristics of each transcript. Methods: Alternative splice isoforms of MYNN were identified using RNA sequencing (RNA-seq) and cloning techniques. Quantitative real-time polymerase chain reaction (qPCR) was employed to detect expression patterns in 11 tissues of Large White (LW) and Mashen (MS) pigs, and to study developmental expression patterns in cerebellum (CE), stomach (ST), and longissimus dorsi (LD). Results: MYNN had two alternatively spliced isoforms, MYNN-1 (GenBank accession number: KY470829) and MYNN-2 (GenBank accession number: KY670835). The MYNN-1 coding sequence (CDS) is composed of 1,830 bp encoding 609 AA, whereas the MYNN-2 CDS is composed of 1,746 bp encoding 581 AA. MYNN-2 was 84 bp shorter than MYNN-1 and lacked the sixth exon. MYNN-2 was found to have one fewer C2H2-type zinc finger domain than MYNN-1. The two variants were ubiquitously expressed in all pig tissues, and expression differed significantly among tissues (p<0.05; p<0.01). The expression of MYNN-1 was significantly higher than that of MYNN-2 in almost all tissues (p<0.05; p<0.01), indicating that MYNN-1 is the main variant. In ST and CE, the expression of the two isoforms decreased gradually with age in MS pigs, whereas it increased gradually in LW pigs. In LD, the expression of the two isoforms first increased and then decreased with age in MS pigs, and decreased gradually in LW pigs. Conclusion: Two transcripts of pig MYNN were successfully cloned, and MYNN-1 was the main variant. MYNN was highly expressed in ST, CE, and LD, and its expression followed regular patterns. We speculate that MYNN plays important roles in digestion/absorption and skeletal muscle growth, whereas the specific mechanisms require further elucidation.
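
The CDS and protein lengths reported in the abstract above are consistent with each other if one assumes that each CDS length includes the stop codon, which the abstract does not state explicitly. A minimal sketch of that arithmetic, with a hypothetical helper named `protein_length`:

```python
def protein_length(cds_bp):
    """Amino acids encoded by a CDS whose length includes the stop codon.

    Assumption: the reported CDS length counts the terminal stop codon,
    which does not encode an amino acid.
    """
    if cds_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    return cds_bp // 3 - 1


mynn1_bp, mynn2_bp = 1830, 1746  # CDS lengths reported in the abstract

print(protein_length(mynn1_bp))     # 609 AA for MYNN-1
print(protein_length(mynn2_bp))     # 581 AA for MYNN-2
print(mynn1_bp - mynn2_bp)          # 84 bp difference
print((mynn1_bp - mynn2_bp) // 3)   # 28 codons, i.e. 609 - 581 AA
```

The 84 bp difference is a multiple of three, so loss of this segment shortens the protein by 28 residues without shifting the reading frame.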