
MediScore: MEDLINE-based Interactive Scoring of Gene and Disease Associations  

Cho, Hye-Young (Division of Epidemiology and Bioinformatics, National Genome Research Institute, National Institute of Health)
Oh, Bermseok (Division of Epidemiology and Bioinformatics, National Genome Research Institute, National Institute of Health)
Lee, Jong-Keuk (Division of Epidemiology and Bioinformatics, National Genome Research Institute, National Institute of Health)
Kim, Kuchan (Division of Epidemiology and Bioinformatics, National Genome Research Institute, National Institute of Health)
Koh, InSong (Division of Epidemiology and Bioinformatics, National Genome Research Institute, National Institute of Health)
Abstract
MediScore is an information retrieval system that helps users search for the set of genes associated with a specific disease, or the set of diseases associated with a specific gene. Despite recent improvements in natural language processing (NLP) and other text mining approaches to finding disease-associated genes, many false positive results still arise from the diversity of exceptional cases and from ambiguities in gene names. To overcome these weaknesses of current text mining approaches, MediScore introduces two mechanisms: statistical normalization based on the normal approximation to the binomial distribution, which corrects inaccurate scores caused by common words that do not represent genes, and interactive rescoring by the user, which removes false positive results. Interactive rescoring includes scoring each gene alias individually to remove false gene synonyms, consulting MEDLINE abstracts, and cross-referencing OMIM and other related information.
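The abstract does not give the normalization formula, so the following is only a minimal sketch of what a normal approximation to a binomial count could look like. It assumes a gene alias appears in `k` of `n` disease-related abstracts and has a background frequency `p` across MEDLINE. The function name, the parameters, and the counts are all hypothetical and are not taken from MediScore itself.

```python
import math


def binomial_z_score(k: int, n: int, p: float) -> float:
    """Z-score of k hits in n trials under Binomial(n, p), using the
    normal approximation: mean n*p, variance n*p*(1-p).

    Hypothetical helper for illustration; not MediScore's actual formula.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    mean = n * p
    std = math.sqrt(n * p * (1.0 - p))
    return (k - mean) / std


# Assumed counts: a common English word that doubles as a gene alias
# shows up often, but no more often than its background rate predicts.
common_word = binomial_z_score(k=50, n=100, p=0.5)

# Assumed counts: a rare, specific gene symbol shows up far more often
# in the disease abstracts than its background rate predicts.
specific_symbol = binomial_z_score(k=20, n=100, p=0.01)

print(common_word, specific_symbol)
```

Under these assumed counts, the common word scores at the background level while the specific symbol stands out. This is the kind of correction the abstract describes for common words that do not represent genes.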
Keywords
interactive scoring; MEDLINE; text mining