A Study on the Design and Development of an Information Collection System Based on Web Page Comparison and Merging

A Study on Design and Development of Web Information Collection System Based Compare and Merge Method

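  • Jang Jin-Wook (Division of Internet Media Engineering, College of Information and Communication, Konkuk University)
  • Received: 2013.12.30
  • Reviewed: 2014.03.04
  • Published: 2014.03.31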


Recently, the amount of information accessible on the Internet has increased dramatically, and searching the Web for useful information has become correspondingly difficult. Much research has therefore been done on web robots that filter Internet information according to user interest. Once a web site that a user wants to visit is found, its content is searched by following the search list or the site's links in order. This search takes longer as the number of pages or sites increases, so its performance needs to be improved. To minimize unnecessary searching by web robots, this paper proposes an efficient information collection system based on a compare-and-merge method. In the proposed system, a web robot initially collects information from the web sites that users register. On each subsequent visit, the web robot compares what it has collected with what the web sites currently contain and, if they differ, updates its collected copy. Only the updated web page information is classified by subject and provided to users, so that users can access updated information quickly.
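
The compare-and-merge cycle described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`digest`, `compare_and_merge`, `classify`), the use of a SHA-256 fingerprint to detect changes, and the URL-to-subject mapping are all assumptions made for illustration, and page fetching is left out so that the example stays self-contained.

```python
import hashlib


def digest(content: str) -> str:
    """Return a SHA-256 fingerprint of a page's content (assumed change test)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def compare_and_merge(store: dict, fetched: dict) -> list:
    """Merge freshly fetched pages into the store and return the URLs that changed.

    store:   url -> fingerprint of the content collected on the previous visit
    fetched: url -> content currently served by the registered site
    """
    updated = []
    for url, content in fetched.items():
        new_digest = digest(content)
        if store.get(url) != new_digest:  # first visit, or content differs
            store[url] = new_digest       # update the collected copy
            updated.append(url)
    return updated


def classify(urls: list, subjects: dict) -> dict:
    """Group updated URLs by subject using a hypothetical url -> subject mapping."""
    grouped = {}
    for url in urls:
        grouped.setdefault(subjects.get(url, "uncategorized"), []).append(url)
    return grouped


# Hypothetical registered pages; the URLs and contents are placeholders.
store = {}
first_visit = compare_and_merge(store, {"site/a": "v1", "site/b": "v1"})
next_visit = compare_and_merge(store, {"site/a": "v1", "site/b": "v2"})
by_subject = classify(next_visit, {"site/b": "news"})
```

On the first visit every registered page is reported, since nothing has been collected yet; on the next visit only `site/b` is reported, because its content changed. Storing fingerprints rather than full pages keeps the comparison cheap, though a real system might also keep the content itself so that the differences can be shown to users.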


