A Minimum Missing Aggregation Policy for RSS Services

  • Published : 2008.10.15

Abstract

RSS is an XML-based format for syndicating web content, and users collect RSS feeds with RSS feed aggregators. Aggregating RSS feeds effectively requires an aggregation policy. In this paper, we first propose an aggregation policy that minimizes the number of postings missed during aggregation. Second, we analyze our policy and compare it with existing aggregation policies. Our analysis shows that the proposed policy reduces aggregation missing by approximately 23% relative to the other policies, while increasing aggregation delay by only 6%.
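To make the two metrics in the abstract concrete, the following is a minimal sketch of how they could be measured for a single feed. It is not the policy proposed in the paper. It assumes that a feed exposes only its newest `window` postings, that a posting counts as missed when it falls out of that window before the next poll, and that delay is the time between publication and retrieval. The function name `simulate` and its parameters are hypothetical.

```python
from bisect import bisect_right


def simulate(post_times, poll_times, window):
    """Return (missed, mean_delay) for one feed.

    Assumption (not from the paper): the feed exposes only the newest
    `window` postings, so older unretrieved postings are lost.
    """
    post_times = sorted(post_times)
    seen_upto = 0   # number of postings published up to the previous poll
    missed = 0      # postings that left the feed before being retrieved
    delays = []     # retrieval time minus publication time, per retrieved posting

    for t in sorted(poll_times):
        # Count postings published at or before this poll.
        published = bisect_right(post_times, t)
        new = published - seen_upto
        # Anything beyond the newest `window` postings is no longer in the feed.
        lost = max(0, new - window)
        missed += lost
        for p in post_times[seen_upto + lost:published]:
            delays.append(t - p)
        seen_upto = published

    mean_delay = sum(delays) / len(delays) if delays else 0.0
    return missed, mean_delay


if __name__ == "__main__":
    posts = [1, 2, 3, 4, 5, 6, 20]
    # Two polls: a burst of six postings overflows a window of three.
    print(simulate(posts, [10, 30], window=3))
    # Three polls timed around the burst retrieve every posting.
    print(simulate(posts, [3, 6, 30], window=3))
```

In this toy setting, moving polls closer to the burst of postings removes the misses. The abstract reports a different trade-off for the proposed policy (fewer misses for a small increase in delay), so the sketch only illustrates how the two quantities are counted, not how the policy schedules polls.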
