• Title/Summary/Keyword: genomic visualization


High-performance computing for SARS-CoV-2 RNAs clustering: a data science-based genomics approach

  • Oujja, Anas;Abid, Mohamed Riduan;Boumhidi, Jaouad;Bourhnane, Safae;Mourhir, Asmaa;Merchant, Fatima;Benhaddou, Driss
    • Genomics & Informatics / v.19 no.4 / pp.49.1-49.11 / 2021
  • Genomic data is one of the fastest-growing datasets in the world. By 2025, it is expected to become the fourth largest source of Big Data, which mandates adequate high-performance computing (HPC) platforms for processing. Given the recent unprecedented and unpredictable mutations in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the research community urgently needs ICT tools to process SARS-CoV-2 RNA data, e.g., by classifying (i.e., clustering) it, and thus to assist in tracking virus mutations and predicting future ones. In this paper, we present an HPC-based SARS-CoV-2 RNA clustering tool. We adopt a data science approach, from data collection, through analysis, to visualization. In the analysis step, we show how our clustering approach leverages HPC and the longest common subsequence (LCS) algorithm. The approach uses the Hadoop MapReduce programming paradigm and adapts the LCS algorithm to efficiently compute the length of the LCS for each pair of SARS-CoV-2 RNA sequences. The sequences are extracted from the U.S. National Center for Biotechnology Information (NCBI) Virus repository. The computed LCS lengths are used to measure the dissimilarities between RNA sequences in order to identify existing clusters. In addition, we present a comparative study of the LCS algorithm's performance under variable workloads and different numbers of Hadoop worker nodes.
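
The pairwise LCS step described in this abstract can be illustrated with a small single-machine sketch. This is not the authors' Hadoop MapReduce implementation; it only shows the standard dynamic-programming computation of LCS length. The normalization in `dissimilarity` (one minus LCS length over the longer sequence length) and the toy sequences are assumptions made for illustration, since the abstract does not state the exact dissimilarity formula.

```python
from itertools import combinations


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, keeping only two DP rows."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def dissimilarity(a: str, b: str) -> float:
    """Assumed normalization (not from the paper): 1 - LCS / max length."""
    longest = max(len(a), len(b))
    return 0.0 if longest == 0 else 1.0 - lcs_length(a, b) / longest


def pairwise_dissimilarities(seqs: dict) -> dict:
    """Dissimilarity for every unordered pair of named sequences."""
    return {
        (x, y): dissimilarity(seqs[x], seqs[y])
        for x, y in combinations(sorted(seqs), 2)
    }


if __name__ == "__main__":
    # Hypothetical toy RNA fragments, not real SARS-CoV-2 data.
    toy = {"s1": "AUGGCU", "s2": "AUGCU", "s3": "GGGAAA"}
    for pair, d in pairwise_dissimilarities(toy).items():
        print(pair, round(d, 3))
```

Each pair is independent of the others, which is what makes the computation a natural fit for a map-style distribution across worker nodes; the resulting dissimilarity values could then feed any distance-based clustering method.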

A Rapid and Sensitive Detection of Aflatoxin-producing Fungus Using an Optimized Polymerase Chain Reaction (PCR)

  • Bintvihok, Anong;Treebonmuang, Supitchaya;Srisakwattana, Kitiya;Nuanchun, Wisut;Patthanachai, Koranis;Usawang, Sungworn
    • Toxicological Research / v.32 no.1 / pp.81-87 / 2016
  • Aflatoxin B1 (AFB1) is produced by Aspergillus flavus growing in feedstuffs. Early detection of maize contamination by aflatoxigenic fungi is advantageous since aflatoxins exert adverse health effects. In this study, we report the development of an optimized conventional PCR for AFB1 detection and a rapid, sensitive, and simple screening real-time PCR (qPCR) with SYBR Green and two pairs of primers targeting the aflR genes, which are involved in aflatoxin biosynthesis. AFB1-contaminated maize samples were divided into three groups by toxin concentration. Genomic DNA was extracted from those samples. The target genes for A. flavus were tested by conventional PCR, and the PCR products were analyzed by electrophoresis. A conventional PCR was carried out as nested PCR to verify the gene amplicon sizes. PCR-RFLP patterns obtained with Hinc II and Pvu II enzyme analysis showed differences that distinguish aflatoxin-producing fungi. However, these methods are not quantitative and require separation of the products on a gel and their visualization under UV light. In contrast, qPCR allows the reaction to be monitored as it progresses. It does not require post-PCR handling, which reduces the risk of cross-contamination and handling errors, and it results in much faster throughput. We found that the optimal primer annealing temperature was $65^{\circ}C$. The optimized template and primer concentrations were $1.5{\mu}L\;(50ng/{\mu}L)$ and $3{\mu}L\;(10{\mu}M/{\mu}L)$, respectively. SYBR Green qPCR of four genes demonstrated amplification curves, with melting peaks for the tub1, aflM, aflR, and aflD genes at $88.0^{\circ}C$, $87.5^{\circ}C$, $83.5^{\circ}C$, and $89.5^{\circ}C$, respectively. Consequently, the four primers were found to have elevated annealing temperatures; this is nevertheless desirable, since it enhances the DNA binding specificity of the dye. The new qPCR protocol could be employed for the determination of aflatoxin content in feedstuff samples.

Construction of Gene Network System Associated with Economic Traits in Cattle (소의 경제형질 관련 유전자 네트워크 분석 시스템 구축)

  • Lim, Dajeong;Kim, Hyung-Yong;Cho, Yong-Min;Chai, Han-Ha;Park, Jong-Eun;Lim, Kyu-Sang;Lee, Seung-Su
    • Journal of Life Science / v.26 no.8 / pp.904-910 / 2016
  • Complex traits are determined by the combined effects of many loci and are affected by gene networks or biological pathways. Systems biology approaches play an important role in identifying candidate genes related to complex diseases or traits at the system level. Gene network analysis has been performed with diverse methods, such as gene co-expression, gene regulatory relationships, protein-protein interaction (PPI), and genetic networks. Moreover, network-based methods for predicting gene functions have been described, such as graph-theoretic methods, neighborhood-counting-based methods, and weighted functions. However, only a limited number of such studies exist in livestock. The present study systematically analyzed genes associated with 102 types of economic traits based on the Animal Trait Ontology (ATO) and identified their relationships based on the gene co-expression network and PPI network in cattle. We then constructed two types of gene network databases and a network visualization system (http://www.nabc.go.kr/cg). The gene co-expression network was generated from the expression values of bovine genes. The PPI network was constructed from the Human Protein Reference Database based on the orthologous relationship between human and cattle. Finally, candidate genes and their network relationships were identified for each trait. They were topologically central, with large degree and betweenness centrality (BC) values in the gene network. The ontle program was applied to generate the database and to visualize the gene network results. This information would serve as a valuable resource for exploring genomic functions that influence economically and agriculturally important traits in cattle.
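
The two centrality measures named in this abstract, degree and betweenness centrality, can be sketched on a toy undirected network as follows. This is a minimal illustration and not the system described in the paper: the gene names and edges are hypothetical placeholders, and betweenness is computed with Brandes' algorithm for unweighted graphs, reported unnormalized.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency sets from an iterable of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree(adj):
    """Number of neighbours of each node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes, unweighted, undirected)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes in order of BFS discovery
        preds = {v: [] for v in adj}    # shortest-path predecessors
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):       # accumulate dependencies backwards
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in bc.items()}


if __name__ == "__main__":
    # Hypothetical gene names and interactions, for illustration only.
    edges = [("geneA", "geneB"), ("geneB", "geneC"), ("geneB", "geneD")]
    adj = build_adjacency(edges)
    print(degree(adj))
    print(betweenness(adj))
```

In this toy network, the hub node has both the largest degree and the largest betweenness, which is the kind of topological position the abstract associates with candidate genes.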