A bio-text mining system using keywords and patterns in a grid environment

  • Kwon, Hyuk-Ryul (Dept. of Information Industrial Engineering, Chungbuk National University)
  • Jung, Tae-Sung (Dept. of Information Industrial Engineering, Chungbuk National University)
  • Kim, Kyoung-Ran (Dept. of Management Information Systems, Chungbuk National University)
  • Jahng, Hye-Kyoung (Dept. of Management Information Systems, Chungbuk National University)
  • Cho, Wan-Sup (Dept. of Management Information Systems, Chungbuk National University)
  • Yoo, Jae-Soo (Dept. of Computer and Communication Engineering, Chungbuk National University)
  • Published : 2007.02.08

Abstract

As a huge amount of literature containing biological data has been generated in the post-genome era, it has become difficult for researchers to find useful knowledge in biological databases. Bio-text mining and the related natural language processing techniques are key to intelligent knowledge retrieval from these databases. We propose a bio-text mining technique for biologists who need to find knowledge in this large body of literature. First, a web robot extracts and transforms the relevant literature from remote databases. To improve retrieval speed, we generate an inverted file for the keywords in the literature. Then, a text mining system extracts the given knowledge patterns and keywords. Finally, we construct a grid computing environment to guarantee processing speed in text mining even for huge literature databases. In an experiment on 10,000 biological literature documents, the system achieved 95% precision and 98% recall.
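The keyword and pattern steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the tokenization rule, the example pattern "X interacts with Y", and the three sample sentences are all assumptions made for illustration. It builds an inverted file over keywords, uses it to narrow the documents that need pattern matching, and computes precision and recall in the standard way.

```python
import re
from collections import defaultdict


def build_inverted_file(docs):
    """Map each lowercase keyword to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(doc_id)
    return index


def candidates(index, keywords):
    """Return ids of documents that contain every given keyword."""
    postings = [index.get(k.lower(), set()) for k in keywords]
    return set.intersection(*postings) if postings else set()


def match_pattern(docs, doc_ids, pattern):
    """Apply a two-group regex pattern only to the candidate documents."""
    found = []
    for doc_id in sorted(doc_ids):
        for m in pattern.finditer(docs[doc_id]):
            found.append((doc_id, m.group(1), m.group(2)))
    return found


def precision_recall(extracted, relevant):
    """Standard precision and recall over sets of extracted items."""
    tp = len(extracted & relevant)
    precision = tp / len(extracted) if extracted else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy corpus; not taken from the paper's data.
docs = {
    1: "BRCA1 interacts with RAD51 in DNA repair.",
    2: "TP53 regulates apoptosis.",
    3: "RAD51 interacts with BRCA2 during recombination.",
}
index = build_inverted_file(docs)
hits = candidates(index, ["interacts", "RAD51"])

# Hypothetical knowledge pattern: "<entity> interacts with <entity>".
pattern = re.compile(r"(\w+) interacts with (\w+)")
facts = match_pattern(docs, hits, pattern)
```

Because the inverted file already rules out document 2, the regular expression runs on only two of the three documents. In a grid setting, one plausible design is to partition the corpus across nodes and run the same lookup and matching on each partition, although the abstract does not specify how the work is distributed.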

Keywords