http://dx.doi.org/10.3745/KIPSTD.2004.11D.5.1049

An Efficient Web Search Method Based on a Style-based Keyword Extraction and a Keyword Mining Profile  

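Joo, Kil-Hong (Department of Computer Science, Graduate School, Yonsei University)
Lee, Jun-Hwl (Softgram Technology Research Institute)
Lee, Won-Suk (Department of Computer Science, Yonsei University)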
Abstract
With the popularization of the World Wide Web (WWW), the quantity of web information has increased. An efficient search system is therefore needed to offer users exact results from diverse information. For this reason, it is important to extract and analyze user requirements in a distributed information environment. Conventional search methods use only keywords for web searching. The search method proposed in this paper, in contrast, adds the context information of each keyword to make searching more effective. It extracts keywords with a new keyword extraction method, also proposed in this paper, and executes the web search based on a keyword mining profile generated from the extracted keywords. Unlike conventional methods, which search for information by a representative word alone, the proposed method is more efficient and exact because it searches with an example-based query that includes content information as well as a representative word. Moreover, the method builds a domain keyword list in order to perform searches quickly; a domain keyword is a representative word of a specific domain. The performance of the proposed algorithm is analyzed through a series of experiments to identify its various characteristics.
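The abstract does not spell out how the style-based keyword extraction works. As a rough illustration only, the sketch below assumes that "style" refers to HTML markup such as titles, headings, and bold text, and that words inside such markup count more than plain body text. The tag weights, the stopword list, and the names `STYLE_WEIGHTS`, `StyleKeywordParser`, and `extract_keywords` are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical weights: words inside these tags count more than body text.
STYLE_WEIGHTS = {"title": 5.0, "h1": 4.0, "h2": 3.0, "b": 2.0, "strong": 2.0}
# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "is", "by"}


class StyleKeywordParser(HTMLParser):
    """Accumulates a style-weighted score for every word in an HTML page."""

    def __init__(self):
        super().__init__()
        self.open_tags = []      # currently open tags that carry a weight
        self.scores = Counter()  # word -> accumulated weighted score

    def handle_starttag(self, tag, attrs):
        if tag in STYLE_WEIGHTS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Close the most recently opened matching tag.
            idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[idx]

    def handle_data(self, data):
        # Plain body text gets weight 1.0; styled text gets the largest
        # weight among the enclosing styled tags.
        weight = max((STYLE_WEIGHTS[t] for t in self.open_tags), default=1.0)
        for word in re.findall(r"[a-z]+", data.lower()):
            if word not in STOPWORDS:
                self.scores[word] += weight


def extract_keywords(html, top_n=5):
    """Return the top_n words ranked by style-weighted score."""
    parser = StyleKeywordParser()
    parser.feed(html)
    parser.close()
    return [word for word, _ in parser.scores.most_common(top_n)]
```

Under these assumptions, a word that appears once in a page title outranks a word that appears a few times in body text. The extracted keywords could then feed a keyword profile, but how the paper builds its keyword mining profile is not stated in the abstract, so that step is not sketched here.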
Keywords
Keyword Extraction; Data Mining; Web Information Searching; Web Document; Pattern Analysis; Profile Analysis;