• Title/Abstract/Keywords: genome annotation

Search results: 179 items (processing time: 0.023 seconds)

De novo assembly, annotation and gene expression profiles of gonads of Cytorace-3, a hybrid lineage of Drosophila nasuta nasuta and D. n. albomicans

  • Ponnanna, Koushik;DSouza, Stafny M.;Ramachandra, Nallur B.
    • Genomics & Informatics / Vol. 19, No. 1 / pp. 8.1-8.12 / 2021
  • Cytorace-3 is a laboratory-evolved hybrid lineage of Drosophila nasuta nasuta males and Drosophila nasuta albomicans females, currently at ~850 generations. To assess the effects of interracial hybridization on gene expression in Cytorace-3, we profiled the transcriptomes of mature ovaries and testes using Illumina sequencing technology and de novo transcriptome assembly strategies. We found 26% of ovarian genes and 14% of testis genes to be differentially expressed in Cytorace-3 relative to the genes expressed in the parental gonadal transcriptomes. About 5% of genes exhibited an additive expression pattern in the ovary and 3% in the testis, while the remaining genes were misexpressed in Cytorace-3. Nearly 772 of these misexpressed genes in the ovary and 413 in the testis were either over- or under-dominant. In the ovary, about twice as many genes followed D. n. nasuta dominance (270 genes) as D. n. albomicans dominance (133 genes). In contrast, only 105 genes showed D. n. nasuta dominance and 207 showed D. n. albomicans dominance in the testis transcriptome. Of the six expression inheritance patterns, the conserved pattern was predominant in both the ovary (73%) and the testis (85%) of Cytorace-3. This study is the first to provide an overview of the expression divergence and inheritance patterns of the transcriptomes in an independently evolving, distinct hybrid lineage of Drosophila. The recorded expression divergence in Cytorace-3 surpasses that between the parental lineages, illustrating the strong impact of hybridization in driving rapid gene expression changes.
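The abstract names six expression inheritance patterns (conserved, additive, dominance of either parent, over-dominant, under-dominant) but does not state the decision rule used. The following is a minimal sketch of one commonly used rule, assuming each gene has already been tested for differential expression between the hybrid and each parent. The function name, the `-1/0/+1` encoding, and the labels "P1" and "P2" for the two parental races are hypothetical and are not taken from the study.

```python
def classify_inheritance(hybrid_vs_p1: int, hybrid_vs_p2: int) -> str:
    """Assign one of six inheritance patterns to a gene.

    Each argument encodes the outcome of a differential-expression test
    (an assumed encoding, not the paper's):
      +1  hybrid significantly higher than that parent
       0  no significant difference
      -1  hybrid significantly lower than that parent
    """
    if hybrid_vs_p1 == 0 and hybrid_vs_p2 == 0:
        return "conserved"          # hybrid matches both parents
    if hybrid_vs_p1 == 0:
        return "P1-dominant"        # hybrid matches parent 1 only
    if hybrid_vs_p2 == 0:
        return "P2-dominant"        # hybrid matches parent 2 only
    if hybrid_vs_p1 > 0 and hybrid_vs_p2 > 0:
        return "over-dominant"      # hybrid above both parents
    if hybrid_vs_p1 < 0 and hybrid_vs_p2 < 0:
        return "under-dominant"     # hybrid below both parents
    return "additive"               # hybrid lies between the parents


if __name__ == "__main__":
    # Toy outcomes for illustration only.
    for pair in [(0, 0), (0, 1), (-1, 0), (1, 1), (-1, -1), (1, -1)]:
        print(pair, classify_inheritance(*pair))
```

Under this rule a gene counts as additive only when the hybrid differs significantly from both parents in opposite directions; other studies may use fold-change cutoffs instead of significance calls.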

A bioinformatics approach to characterize a hypothetical protein Q6S8D9_SARS of SARS-CoV

  • Md Foyzur Rahman;Rubait Hasan;Mohammad Shahangir Biswas;Jamiatul Husna Shathi;Md Faruk Hossain;Aoulia Yeasmin;Mohammad Zakerin Abedin;Md Tofazzal Hossain
    • Genomics & Informatics / Vol. 21, No. 1 / pp. 3.1-3.10 / 2023
  • Characterizing hypothetical proteins and predicting their secondary and tertiary structures in silico from amino acid sequences deposited in databases are critical issues in computational biology. Severe acute respiratory syndrome-associated coronavirus (SARS-CoV), which is responsible for pneumonia-like diseases, possesses a wide range of proteins, many of which are still uncharacterized. The current study was conducted to reveal the physicochemical characteristics and structures of an uncharacterized protein, Q6S8D9_SARS, of SARS-CoV. Following the common workflow for characterizing a hypothetical protein, several computational tools (e.g., ExPASy ProtParam, CD Search, SOPMA, PSIPRED, and HHpred) were employed to discover the functions and structures of Q6S8D9_SARS. After the secondary and tertiary structures of the protein were delineated, quality evaluation tools (e.g., PROCHECK and ProSA-web) were used to assess the structures, and the active site was then identified with CASTp v.3.0. The protein contains more negatively charged residues than positively charged residues and has a high aliphatic index, features that make the protein more stable. The 2D and 3D structures modeled by several bioinformatics tools showed that the protein contains a domain, indicating that it is a functional protein able to interfere with host antiviral inflammatory cytokine and interferon production pathways. Moreover, an active site where a ligand could bind was found in the protein. The study aimed to unveil the features and structures of an uncharacterized protein of SARS-CoV that could be a therapeutic target for the development of vaccines against the virus. Further research is needed to accomplish this task.
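The two physicochemical properties mentioned in the abstract (charged-residue counts and the aliphatic index) can be computed directly from a sequence. The sketch below uses the standard aliphatic index formula, AI = X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), with X in mole percent, and counts Asp + Glu as negatively charged and Arg + Lys as positively charged. The function names and the toy sequence are illustrative assumptions; the sequence is not that of Q6S8D9_SARS, and the study itself used ExPASy ProtParam.

```python
def aliphatic_index(seq: str) -> float:
    """Aliphatic index from mole percentages of Ala, Val, Ile, and Leu."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    def mole_percent(residue: str) -> float:
        return 100.0 * seq.count(residue) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def charge_counts(seq: str) -> tuple[int, int]:
    """Return (negative, positive) residue counts: (Asp+Glu, Arg+Lys)."""
    seq = seq.upper()
    negative = seq.count("D") + seq.count("E")
    positive = seq.count("R") + seq.count("K")
    return negative, positive


if __name__ == "__main__":
    toy = "MDEELLKAVIDR"  # made-up peptide, for illustration only
    print("aliphatic index:", round(aliphatic_index(toy), 2))
    print("(negative, positive):", charge_counts(toy))
```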

In silico annotation of a hypothetical protein from Listeria monocytogenes EGD-e unfolds a toxin protein of the type II secretion system

  • Maisha Tasneem;Shipan Das Gupta;Monira Binte Momin;Kazi Modasser Hossain;Tasnim Binta Osman;Fazley Rabbi
    • Genomics & Informatics / Vol. 21, No. 1 / pp. 7.1-7.11 / 2023
  • The gram-positive bacterium Listeria monocytogenes is an important foodborne intracellular pathogen that is widespread in the environment. The functions of hypothetical proteins (HPs) from various pathogenic bacteria have been successfully annotated using a variety of bioinformatics strategies. In this study, an HP, Imo0888 (NP_464414.1), from the Listeria monocytogenes EGD-e strain was annotated using several bioinformatics tools. Various tools, including CELLO, PSORTb, and SOSUIGramN, identified the candidate protein as cytoplasmic. Domain and motif analysis revealed that the target protein is a PemK/MazF-like toxin protein of the type II toxin-antitoxin system (TAS), which was consistent with BLASTp analysis. Secondary structure analysis showed the random coil to be the most frequent element. The AlphaFold 2 Protein Structure Prediction Database was used to determine the three-dimensional (3D) structure of the HP using the template structure of a type II TAS PemK/MazF family toxin protein (DB ID_AFDB: A0A4B9HQB9) with 99.1% sequence identity. Various quality evaluation tools, such as PROCHECK, ERRAT, Verify 3D, and QMEAN, were used to validate the 3D structure. Following the YASARA energy minimization method, the target protein's 3D structure became more stable. The active site of the developed 3D structure was determined by the CASTp server. Most pathogens that harbor TAS pose a serious risk to human health. Our annotation of the HP Imo0888 found in Listeria could offer a chance to understand bacterial pathogenicity and identify a number of potential targets for drug development.

Transcriptome Analysis Reveals the Putative Polyketide Synthase Gene Involved in Hispidin Biosynthesis in Sanghuangporus sanghuang

  • Jiansheng Wei;Liangyan Liu;Xiaolong Yuan;Dong Wang;Xinyue Wang;Wei Bi;Yan Yang;Yi Wang
    • Mycobiology / Vol. 51, No. 5 / pp. 360-371 / 2023
  • Hispidin is an important styrylpyrone produced by Sanghuangporus sanghuang. To analyze hispidin biosynthesis in S. sanghuang, the transcriptomes of hispidin-producing and non-producing S. sanghuang were determined by Illumina sequencing. Five PKSs were identified using genome annotation. Comparative analysis with the reference transcriptome showed that two PKSs (ShPKS3 and ShPKS4) had low expression levels in four types of media. The gene expression pattern of only ShPKS1 was consistent with the yield variation of hispidin. The combined analyses of gene expression with qPCR and hispidin detection by liquid chromatography-mass spectrometry coupled with ion-trap and time-of-flight technologies (LCMS-IT-TOF) showed that ShPKS1 was involved in hispidin biosynthesis in S. sanghuang. ShPKS1 is a partially reducing PKS gene with extra AMP and ACP domains before the KS domain. The domain architecture of ShPKS1 was AMP-ACP-KS-AT-DH-KR-ACP-ACP. Phylogenetic analysis shows that ShPKS1 and other PKS genes from Hymenochaetaceae form a unique monophyletic clade closely related to the clade containing Agaricales hispidin synthase. Taken together, our data indicate that ShPKS1 is a novel PKS of S. sanghuang involved in hispidin biosynthesis.

Expression of anoctamin 7 (ANO7) is associated with poor prognosis and mucin 2 (MUC2) in colon adenocarcinoma: a study based on TCGA data

  • Chen, Chen;Siripat Aluksanasuwan;Keerakarn Somsuan
    • Genomics & Informatics / Vol. 21, No. 4 / pp. 46.1-46.10 / 2023
  • Colon adenocarcinoma (COAD) is the predominant type of colorectal cancer. Early diagnosis and treatment can significantly improve the prognosis of COAD patients. Anoctamin 7 (ANO7), an anion channel protein, has been implicated in prostate cancer and other types of cancer. In this study, we analyzed the expression of ANO7 and its correlation with clinicopathological characteristics among COAD patients using the Gene Expression Profiling Interactive Analysis 2 (GEPIA2) and the University of Alabama at Birmingham CANcer (UALCAN) databases. The GEPIA2, Kaplan-Meier plotter, and the Survival Genie platform were employed for survival analysis. The co-expression network and potential function of ANO7 in COAD were analyzed using GeneFriends, the Database for Annotation, Visualization and Integrated Discovery (DAVID), GeneMANIA, and Pathway Studio. Our data analysis revealed a significant reduction in ANO7 expression levels within COAD tissues compared to normal tissues. Additionally, ANO7 expression was found to be associated with race and histological subtype. The COAD patients exhibiting low ANO7 expression had lower survival rates compared to those with high ANO7 expression. The genes correlated with ANO7 were significantly enriched in proteolysis and mucin type O-glycan biosynthesis pathway. Furthermore, ANO7 demonstrated a direct interaction and a positive co-expression correlation with mucin 2 (MUC2). In conclusion, our findings suggest that ANO7 might serve as a potential prognostic biomarker and potentially plays a role in proteolysis and mucin biosynthesis in the context of COAD.

Antioxidant capacity in seedling of colored-grain wheat under water deficit condition

  • Kim, Dae Yeon;Hong, Min Jeong;Jung, Woo Joo;Seo, Yong Weon
    • Korean Society of Crop Science: Conference Proceedings / Korean Society of Crop Science, 2017, 9th Asian Crop Science Association Conference / p. 140 / 2017
  • Nutritious and functional foods from crops have received great attention in recent years. Colored-grain wheat contains high levels of phenolic compounds and a large number of flavonoids. Anthocyanin and polyphenol synthesis and accumulation are generally stimulated in response to biotic or abiotic stresses. Here, we analyzed genome-wide transcripts in seedlings of colored-grain wheat in response to ABA and PEG treatment. About 900 and 1500 transcripts (p-value < 0.05) from the ABA and PEG treatments, respectively, were aligned to the IWGSC1+popseq DB, which is composed of over 110,000 transcripts including 100,934 coding genes. NR protein sequences of Poaceae from NCBI and protein sequences of transcription factors from 83 species in the plant transcription factor database v3.0 were used to annotate putative transcripts. Gene ontology analysis was conducted and KEGG mapping was performed to show the expression patterns of biosynthesis genes related to the flavonoid, isoflavonoid, flavone, and anthocyanin pathways. DroughtDB (http://pgsb.helmholtz-muenchen.de/droughtdb/) was used to detect DEGs in order to explain physiological and molecular drought avoidance through drought tolerance mechanisms. DEGs related to drought response pathways, such as ABA signaling, water and ion channels, detoxification signaling, enzymes of osmolyte biosynthesis, phospholipid metabolism, signal transduction, and transcription factors, were selected to explain the response mechanism under water deficit conditions. Anthocyanin content, phenolic compound content, and DPPH radical scavenging activity were measured, and antioxidant enzyme activity assays were conducted to show biochemical adaptation under water deficit conditions. Several MYB and bHLH transcription factors were up-regulated under both ABA and PEG treatment, which indicates that highly expressed MYB and bHLH transcription factors enhanced the expression of genes in the biosynthesis pathways of flavonoids, such as anthocyanin and dihydroflavonols, in colored wheat seedlings. Subsequently, accumulation of total anthocyanin and phenol contents was observed in colored wheat seedlings, and antioxidant capacity was promoted by upregulation of genes involved in maintaining the redox state and by activation of antioxidant scavengers, such as CAT, APX, POD, and SOD, in colored wheat seedlings under water deficit conditions. This work may provide valuable basic information for further investigation of the molecular responses of colored-grain wheat to water deficit stress and for further gene-based studies.

Stage specific transcriptome analysis of liver tissue from a crossbred Korean Native Pig (KNP × Yorkshire)

  • Kumar, Himansu;Srikanth, Krishnamoorthy;Park, Woncheol;Lee, Kyung-Tai;Choi, Bong-Hwan;Kim, Jun-Mo;Lim, Dajeong;Park, Jong-Eun
    • Journal of Biomedical and Translational Research / Vol. 19, No. 4 / pp. 116-124 / 2018
  • The Korean Native Pig (KNP) has a uniform black coat color, excellent meat quality, white fat, solid fat structure, and good marbling. However, its growth performance is low, whereas the Yorkshire pig, of western origin, has high growth performance. To take advantage of the distinct strengths of the two breeds, we raised crossbreeds (KNP × Yorkshire) to make use of the heterotic effect. We then analyzed the liver transcriptome, as the liver plays an important role in fat metabolism. We sampled at two stages, 10 weeks and 26 weeks, chosen to correspond to the change in feeding system. A total of 16 pigs (8 from each stage) were sampled and RNA sequencing was performed. The reads were mapped to the reference genome and differential expression analysis was performed with the edgeR package. A total of 324 genes were found to be significantly differentially expressed ($|\log_2 \mathrm{FC}| > 1$ and $q < 0.01$), of which 180 genes were up-regulated and 144 genes were down-regulated. Principal Component Analysis (PCA) showed that the samples clustered according to stage. Functional annotation of the significant differentially expressed genes (DEGs) showed that GO terms such as DNA replication, cell division, protein phosphorylation, regulation of signal transduction by p53 class mediator, ribosome, focal adhesion, DNA helicase activity, and protein kinase activity were enriched. KEGG pathway analysis showed that the DEGs functioned in the cell cycle, Ras signaling pathway, p53 signaling pathway, MAPK signaling pathway, and others. Twenty-nine transcripts were also part of the DEGs; these were predominantly members of the Cys2His2-like fold group (C2H2) family of zinc fingers. A protein-protein interaction (PPI) network analysis showed three highly interconnected clusters, suggesting an enrichment of genes with similar biological function. This study presents the first report of liver tissue-specific gene regulation in a crossbred Korean pig.
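The significance rule quoted in the abstract (|log2FC| > 1 and q < 0.01) amounts to a simple filter over a results table. Below is a minimal sketch of that filter in plain Python. The function name, the tuple layout, and the gene names and values in the toy table are hypothetical; the study itself ran edgeR, and the direction of "up" here simply follows the sign of the log2 fold change.

```python
def split_degs(results, lfc_cutoff=1.0, q_cutoff=0.01):
    """Split genes into up- and down-regulated DEGs.

    `results` is an iterable of (gene, log2_fold_change, q_value) tuples.
    A gene is kept only if |log2FC| > lfc_cutoff and q < q_cutoff.
    """
    up, down = [], []
    for gene, log2fc, q_value in results:
        if q_value < q_cutoff and abs(log2fc) > lfc_cutoff:
            (up if log2fc > 0 else down).append(gene)
    return up, down


if __name__ == "__main__":
    # Made-up rows for illustration only.
    toy_results = [
        ("geneA", 2.3, 0.0004),   # passes, up
        ("geneB", -1.7, 0.002),   # passes, down
        ("geneC", 0.8, 0.0001),   # fold change too small
        ("geneD", -3.1, 0.04),    # q-value too large
    ]
    up, down = split_degs(toy_results)
    print("up:", up)
    print("down:", down)
```

Applied to the full edgeR output, the lengths of the two lists would correspond to the up- and down-regulated counts reported in the abstract (180 and 144).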

Genome-wide identification and analysis of long noncoding RNAs in longissimus muscle tissue from Kazakh cattle and Xinjiang brown cattle

  • Yan, Xiang-Min;Zhang, Zhe;Liu, Jian-Bo;Li, Na;Yang, Guang-Wei;Luo, Dan;Zhang, Yang;Yuan, Bao;Jiang, Hao;Zhang, Jia-Bao
    • Animal Bioscience / Vol. 34, No. 11 / pp. 1739-1748 / 2021
  • Objective: In recent years, long noncoding RNAs (lncRNAs) have been identified in many species, and some of them have been shown to play important roles in muscle development and myogenesis. However, the differences in lncRNAs between Kazakh cattle and Xinjiang brown cattle remain undefined; therefore, we aimed to confirm whether lncRNAs are differentially expressed in the longissimus dorsi between these two types of cattle and whether differentially expressed lncRNAs regulate muscle differentiation. Methods: We used RNA-seq technology to identify lncRNAs in longissimus muscles from these cattle. The expression of lncRNAs was analyzed using StringTie (1.3.1) in terms of the fragments per kilobase of transcript per million mapped reads (FPKM) values of the encoding genes. The differential expression of the transcripts in the two samples was analyzed using the DESeq R software package. The resulting false discovery rate was controlled by the Benjamini-Hochberg approach. KOBAS software was utilized to measure the expression of different genes in Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. We randomly selected eight lncRNA genes and validated them by quantitative reverse transcription polymerase chain reaction (RT-qPCR). Results: We found that 182 lncRNA transcripts, including 102 upregulated and 80 downregulated transcripts, were differentially expressed between Kazakh cattle and Xinjiang brown cattle. The results of RT-qPCR were consistent with the sequencing results. Enrichment analysis and functional annotation of the target genes revealed that the differentially expressed lncRNAs were associated with the mitogen-activated protein kinase, Ras, and phosphatidylinositol 3-kinase (PI3k)/Akt signaling pathways. We also constructed a lncRNA/mRNA coexpression network for the PI3k/Akt signaling pathway. Conclusion: Our study provides insights into cattle muscle-associated lncRNAs and will contribute to a more thorough understanding of the molecular mechanism underlying muscle growth and development in cattle.
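The Methods mention false discovery rate control by the Benjamini-Hochberg approach. A minimal plain-Python sketch of that adjustment follows: each p-value is scaled by m/rank, then monotonicity is enforced from the largest rank downward. The function name and the toy p-values are illustrative assumptions and do not come from the study, which relied on the DESeq R package.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value (rank m) down to the smallest (rank 1),
    # keeping a running minimum so adjusted values never decrease with rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    toy_p = [0.01, 0.04, 0.03, 0.005]  # made-up values for illustration
    for p, q in zip(toy_p, benjamini_hochberg(toy_p)):
        print(f"p={p:.3f}  adjusted={q:.3f}")
```

Transcripts whose adjusted value falls below the chosen threshold would then be reported as differentially expressed.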

Organizing an in-class hackathon to correct PDF-to-text conversion errors of Genomics & Informatics 1.0

  • Kim, Sunho;Kim, Royoung;Nam, Hee-Jo;Kim, Ryeo-Gyeong;Ko, Enjin;Kim, Han-Su;Shin, Jihye;Cho, Daeun;Jin, Yurhee;Bae, Soyeon;Jo, Ye Won;Jeong, San Ah;Kim, Yena;Ahn, Seoyeon;Jang, Bomi;Seong, Jiheyon;Lee, Yujin;Seo, Si Eun;Kim, Yujin;Kim, Ha-Jeong;Kim, Hyeji;Sung, Hye-Lynn;Lho, Hyoyoung;Koo, Jaywon;Chu, Jion;Lim, Juwon;Kim, Youngju;Lee, Kyungyeon;Lim, Yuri;Kim, Meongeun;Hwang, Seonjeong;Han, Shinhye;Bae, Sohyeun;Kim, Sua;Yoo, Suhyeon;Seo, Yeonjeong;Shin, Yerim;Kim, Yonsoo;Ko, You-Jung;Baek, Jihee;Hyun, Hyejin;Choi, Hyemin;Oh, Ji-Hye;Kim, Da-Young;Park, Hyun-Seok
    • Genomics & Informatics / Vol. 18, No. 3 / pp. 33.1-33.7 / 2020
  • This paper describes a community effort to improve earlier versions of the full-text corpus of Genomics & Informatics by semi-automatically detecting and correcting PDF-to-text conversion errors and optical character recognition errors during the first hackathon of the Genomics & Informatics Annotation Hackathon (GIAH) event. Extracting text from multi-column biomedical documents such as Genomics & Informatics is notoriously difficult. The hackathon was piloted as part of a coding competition of the ELTEC College of Engineering at Ewha Womans University in order to enable researchers and students to create or annotate their own versions of the Genomics & Informatics corpus, to gain and create knowledge about corpus linguistics, and simultaneously to acquire tangible and transferable skills. The projects proposed during the hackathon harness an internal database containing different versions of the corpus and annotations.