• Title/Abstract/Keywords: genome annotation

Search results: 179 items (processing time: 0.023 s)

Enabling a fast annotation process with the Table2Annotation tool

  • Larmande, Pierre;Jibril, Kazim Muhammed
    • Genomics & Informatics, Vol. 18, No. 2, pp. 19.1-19.6, 2020
  • In semantic annotation, semantic concepts are linked to natural language. Semantic annotation improves the ability to search and access resources and can be used in information retrieval systems to augment user queries. In the research described in this paper, we aimed to identify ontological concepts in scientific text contained in spreadsheets. We developed a tool that can handle various types of spreadsheets. Furthermore, we used the NCBO Annotator API provided by BioPortal to extend the semantic annotation functionality to spreadsheet data. Table2Annotation performs well on criteria such as speed, error handling, and complex concept matching.
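
The spreadsheet-to-annotator step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the Table2Annotation implementation: the endpoint URL, the `text`, `ontologies`, and `apikey` parameter names, the ontology acronyms, and the helper names `cell_texts` and `build_request` are assumptions made for the example, and no network request is sent.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed NCBO Annotator endpoint; check the BioPortal documentation before use.
ANNOTATOR_URL = "https://data.bioontology.org/annotator"


def cell_texts(csv_text):
    """Yield (row, column, text) for every non-empty cell of a CSV sheet."""
    reader = csv.reader(io.StringIO(csv_text))
    for r, row in enumerate(reader):
        for c, cell in enumerate(row):
            text = cell.strip()
            if text:
                yield r, c, text


def build_request(text, api_key, ontologies=()):
    """Build (but do not send) an annotator request URL for one cell."""
    params = {"text": text, "apikey": api_key}
    if ontologies:
        params["ontologies"] = ",".join(ontologies)
    return ANNOTATOR_URL + "?" + urlencode(params)


# Toy sheet with one empty cell; the content is illustrative only.
sheet = "trait,tissue\nplant height,leaf\n,root\n"
cells = list(cell_texts(sheet))
urls = [build_request(text, "YOUR_API_KEY", ["PO", "TO"]) for _, _, text in cells]

for (row, col, text), url in zip(cells, urls):
    print(row, col, text, url)
```

Keeping the cell coordinates next to each request makes it possible to write the returned concepts back to the matching spreadsheet position.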

Integration of the PubAnnotation ecosystem in the development of a web-based search tool for alternative methods

  • Neves, Mariana
    • Genomics & Informatics, Vol. 18, No. 2, pp. 18.1-18.5, 2020
  • Finding publications that propose alternative methods to animal experiments is an important but time-consuming task, since researchers need to run various queries against literature databases and screen many articles to assess two important aspects: the relevance of the article to the research question, and whether the article's proposed approach qualifies as an alternative method. We are currently developing a Web application to support finding alternative methods to animal experiments. The current (under development) version of the application uses external tools and resources for document processing, and relies on the PubAnnotation ecosystem for annotation querying, annotation storage, dictionary-based tagging of cell lines, and annotation visualization. Currently, our two PubAnnotation repositories for discourse elements contain annotations for more than 110k PubMed documents. Further, we created an annotator for cell lines that contains more than 196k terms from Cellosaurus. Finally, we are experimenting with TextAE for annotation visualization and for user feedback.
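
Dictionary-based tagging of cell lines, as mentioned in this abstract, can be illustrated with a small sketch. It is not the annotator the author built: the function name `tag_cell_lines`, the toy dictionary, and the `EX:` identifiers are hypothetical, and the span/object output shape is only assumed to resemble PubAnnotation denotations.

```python
import re


def tag_cell_lines(text, dictionary):
    """Return span annotations for dictionary terms found in text.

    Longer terms are tried first and matches must sit on word
    boundaries, so "HEK293T" is not reported as "HEK293".
    """
    terms = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(t) for t in terms) + r")(?!\w)",
        re.IGNORECASE,
    )
    lookup = {term.lower(): ident for term, ident in dictionary.items()}
    return [
        {"span": {"begin": m.start(), "end": m.end()},
         "obj": lookup[m.group(0).lower()]}
        for m in pattern.finditer(text)
    ]


# Hypothetical identifiers; a real dictionary would map to Cellosaurus accessions.
toy_dictionary = {"HeLa": "EX:0001", "HEK293": "EX:0002", "HEK293T": "EX:0003"}
sentence = "HeLa and HEK293T cells were used."
denotations = tag_cell_lines(sentence, toy_dictionary)
print(denotations)
```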

Annotation of Genes Having Candidate Somatic Mutations in Acute Myeloid Leukemia with Whole-Exome Sequencing Using Concept Lattice Analysis

  • Lee, Kye Hwa;Lim, Jae Hyeun;Kim, Ju Han
    • Genomics & Informatics, Vol. 11, No. 1, pp. 38-45, 2013
  • In cancer genome studies, the annotation of newly detected oncogene/tumor suppressor gene candidates is a challenging process. We propose using concept lattice analysis for the annotation and interpretation of genes having candidate somatic mutations in whole-exome sequencing in acute myeloid leukemia (AML). We selected 45 highly mutated genes with whole-exome sequencing in 10 normal matched samples of the AML-M2 subtype. To evaluate these genes, we performed concept lattice analysis and annotated these genes with existing knowledge databases.
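
The idea behind concept lattice analysis (formal concept analysis) can be shown on a toy gene-by-annotation table. This is a generic sketch of the technique, not the authors' pipeline; the gene names and annotation terms are hypothetical placeholders. Each formal concept pairs a set of genes (the extent) with the exact set of annotations they all share (the intent).

```python
def formal_concepts(context):
    """Return all (extent, intent) pairs for a gene -> annotation-set mapping."""
    all_attributes = frozenset().union(*context.values())
    # Intents are the attribute sets closed under intersection of gene rows.
    intents = {all_attributes}
    for attributes in context.values():
        for intent in list(intents):
            intents.add(intent & attributes)
    concepts = []
    for intent in intents:
        # The extent holds every gene that carries all attributes of the intent.
        extent = frozenset(g for g, attrs in context.items() if intent <= attrs)
        concepts.append((extent, intent))
    return sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0])))


# Hypothetical genes and annotations, for illustration only.
context = {
    "GENE_A": {"kinase", "signaling"},
    "GENE_B": {"kinase", "cell_cycle"},
    "GENE_C": {"transcription", "cell_cycle"},
}
concepts = formal_concepts(context)
for extent, intent in concepts:
    print(sorted(extent), sorted(intent))
```

Ordering the concepts by extent size walks the lattice from the most specific concept (no genes, all annotations) to the most general one (all genes, shared annotations only).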

Functional annotation of uncharacterized proteins from Fusobacterium nucleatum: identification of virulence factors

  • Kanchan Rauthan;Saranya Joshi;Lokesh Kumar;Divya Goel;Sudhir Kumar
    • Genomics & Informatics, Vol. 21, No. 2, pp. 21.1-21.14, 2023
  • Fusobacterium nucleatum is a gram-negative bacterium associated with diverse infections such as appendicitis and colorectal cancer. It mainly attacks the epithelial cells in the oral cavity and throat of the infected individual. It has a single circular genome of 2.7 Mb. Many proteins in the F. nucleatum genome are listed as "Uncharacterized." Annotation of these proteins is crucial for obtaining new facts about the pathogen and deciphering its gene regulation, functions, and pathways, along with the discovery of novel target proteins. In light of new genomic information, an armoury of bioinformatic tools was used for predicting physicochemical parameters, domain and motif search, pattern search, and localization of the uncharacterized proteins. Receiver operating characteristic analysis put the efficacy of the databases employed for predicting the different parameters at 83.6%. Functions were successfully assigned to 46 uncharacterized proteins, which included enzymes, transporter proteins, membrane proteins, binding proteins, etc. Apart from function prediction, the proteins were also subjected to STRING analysis to reveal their interacting partners. The annotated proteins were also put through homology-based structure prediction and modeling using the Swiss PDB and Phyre2 servers. Two probable virulence factors were also identified, which could be investigated further for potential drug-related studies. Assigning functions to uncharacterized proteins has shown that some of these proteins are important for cell survival inside the host and can act as effective drug targets.

An empirical evaluation of electronic annotation tools for Twitter data

  • Weissenbacher, Davy;O'Connor, Karen;Hiraki, Aiko T.;Kim, Jin-Dong;Gonzalez-Hernandez, Graciela
    • Genomics & Informatics, Vol. 18, No. 2, pp. 24.1-24.7, 2020
  • Despite a growing number of natural language processing shared tasks dedicated to the use of Twitter data, there is currently no annotation tool designed specifically for this purpose. During the 6th edition of the Biomedical Linked Annotation Hackathon (BLAH), after a short review of 19 generic annotation tools, we adapted GATE and TextAE for annotating Twitter timelines. Although none of the tools reviewed allows the annotation of all information inherent in Twitter timelines, a few may be suitable provided that annotators are willing to compromise on some functionality.

SFannotation: A Simple and Fast Protein Function Annotation System

  • Yu, Dong Su;Kim, Byung Kwon
    • Genomics & Informatics, Vol. 12, No. 2, pp. 76-78, 2014
  • Owing to the generation of vast amounts of sequencing data by using cost-effective, high-throughput sequencing technologies with improved computational approaches, many putative proteins have been discovered after assembly and structural annotation. Putative proteins are typically annotated using a functional annotation system that uses extant databases, but the expansive size of these databases often causes a bottleneck for rapid functional annotation. We developed SFannotation, a simple and fast functional annotation system that rapidly annotates putative proteins against four extant databases, Swiss-Prot, TIGRFAMs, Pfam, and the non-redundant sequence database, by using a best-hit approach with BLASTP and HMMSEARCH.
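
A best-hit approach of the kind named in this abstract can be sketched as below. This is an assumption-laden illustration rather than SFannotation's code: it supposes BLASTP results in the default 12-column tabular layout (`-outfmt 6`), picks the hit with the highest bit score per query (ties broken by lower e-value), and uses made-up query and subject identifiers.

```python
def best_hits(lines):
    """Map each query ID to its best (subject, evalue, bitscore) hit."""
    best = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        current = best.get(query)
        # Prefer the higher bit score; on a tie prefer the lower e-value.
        if current is None or (bitscore, -evalue) > (current[2], -current[1]):
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical tabular rows: qseqid sseqid pident length mismatch gapopen
# qstart qend sstart send evalue bitscore
rows = [
    "prot1\tsubjB\t60.0\t180\t72\t2\t5\t184\t10\t189\t2e-50\t180",
    "prot1\tsubjA\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-120\t410",
    "prot2\tsubjC\t75.0\t120\t30\t0\t1\t120\t1\t120\t3e-40\t150",
]
hits = best_hits(rows)
for query, (subject, evalue, bitscore) in sorted(hits.items()):
    print(query, subject, evalue, bitscore)
```

Because only one record per query is retained, memory use grows with the number of proteins rather than the number of hits, which suits large result files.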

Construction of a Horse Database (HorseDB; an Integrated Horse Resource and Web Service)

  • 김대수;조운종;허재원;최은상;조병욱;김희수
    • Journal of Life Science, Vol. 16, No. 3, pp. 472-476, 2006
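  • We built a horse database by analyzing biological and genomic data on the horse from public databases. The horse database aims to analyze the horse's biological and genomic data with bioinformatic methods and to provide these data in integrated form. The database consists of horse biological data, genome analysis data, and an interface that provides bioinformatic analysis programs. For user convenience, we designed an easy-to-use web menu that offers a variety of information about the horse. The horse database is available at http://www.primate.or.kr/horse.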

High quality genome sequence of Treponema phagedenis KS1 isolated from bovine digital dermatitis

  • Espiritu, Hector M.;Mamuad, Lovelia L.;Jin, Su-jeong;Kim, Seon-ho;Lee, Sang-suk;Cho, Yong-il
    • Journal of Animal Science and Technology, Vol. 62, No. 6, pp. 948-951, 2020
  • Treponema phagedenis KS1, a fastidious anaerobe, was isolated from bovine digital dermatitis (BDD)-infected dairy cattle in Chungnam, Korea. Initial data indicated that T. phagedenis KS1 exhibited putative virulent phenotypic characteristics. This study reports the whole genome assembly and annotation of T. phagedenis KS1 (KCTC14157BP) to assist in the identification of putative pathogenicity-related factors. The whole genome of T. phagedenis KS1 was sequenced using the PacBio RSII and Illumina HiSeqXTen platforms. The assembled T. phagedenis KS1 genome comprises 16 contigs with a total size of 3,769,422 bp and an overall guanine-cytosine (GC) content of 40.03%. Annotation revealed 3,460 protein-coding genes, as well as 49 transfer RNA- and 6 ribosomal RNA-coding genes. The results of this study provide insight into the pathogenicity of T. phagedenis KS1.
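
Assembly summary figures such as contig count, total size, and overall GC content can be computed from a FASTA file along the following lines. This is a generic sketch, not the authors' workflow; the helper names `read_fasta` and `assembly_stats` and the toy sequences are illustrative assumptions.

```python
def read_fasta(text):
    """Parse FASTA text into a {contig_name: sequence} dictionary."""
    contigs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep the ID, drop the description
            contigs[name] = []
        elif name is not None:
            contigs[name].append(line.upper())
    return {n: "".join(parts) for n, parts in contigs.items()}


def assembly_stats(contigs):
    """Return contig count, total length in bp, and overall GC percentage."""
    total = sum(len(seq) for seq in contigs.values())
    gc = sum(seq.count("G") + seq.count("C") for seq in contigs.values())
    gc_percent = round(100.0 * gc / total, 2) if total else 0.0
    return {"contigs": len(contigs), "total_bp": total, "gc_percent": gc_percent}


# Toy two-contig assembly; real input would be the assembled genome FASTA.
toy_fasta = ">contig1 example\nATGC\nGGCC\n>contig2\nAATT\n"
stats = assembly_stats(read_fasta(toy_fasta))
print(stats)
```

GC content is taken over all contigs together, so long contigs weigh more than short ones, which matches an "overall" figure rather than a per-contig average.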