Fig. 1. Domestic data solution market share
Fig. 2. Architecture of web crawler
Fig. 3. Typical Web Crawl algorithm
Fig. 4. Architecture of Apache Kafka
Fig. 5. Architecture of R-WCMS Agent
Fig. 6. Product Listing Page URL Pattern
Fig. 7. Product detail page URL pattern
Fig. 8. Messages to the R-WCMS Manager
Fig. 9. Architecture of R-WCMS Manager
Fig. 10. Find messages in the R-WCMS
Fig. 11. Real-time behavioral model
Fig. 12. Code List Web Crawling
Fig. 13. Estimated update time of R-WCMS
Fig. 14. The accuracy of searched data within the same time frame
Fig. 15. The time it took to process the updated search data
Table 1. The Performance data
Table 2. Equation parameters
Table 3. The OSS Ver. and OSS List
References
- K. Y. Kim, W. Lee, M. H. Lee, H. M. Yoon & S. H. Shin. (2011). Development of Web Crawler for Archiving Web Resources. International Journal of Contents, 11(9), 9-16. DOI: 10.5392/JKDA
- J. H. Cho & H. Garcia-Molina. (2009). Parallel crawlers. Proceedings of the 11th International Conference on World Wide Web. Honolulu, Hawaii, USA: ACM. pp. 124-135. DOI: 10.1145/511446.511464
- H. J. Kim, J. Y. Lee & S. S. Shin. (2017). Multi-threaded Web Crawling Design using Queues. Journal of Convergence for Information Technology, 7(2), 43-51. https://doi.org/10.14801/jaitc.2017.7.2.43
- H. J. Mun. (2015). Polling Method based on Weight Table for Efficient Monitoring. Journal of Convergence for Information Technology, 5(4), 5-10. https://doi.org/10.22156/CS4SMB.2015.5.4.005
- C. Olston et al. (2010). Foundations and Trends® in Information Retrieval, 4(3), 17. DOI: 10.1561/1500000017
- Y. S. Jeong. (2015). Business Process Model for Efficient SMB using Big Data. Journal of Convergence for Information Technology, 5(4), 11-16. https://doi.org/10.22156/CS4SMB.2015.5.4.011
- J. H. Ku. (2018). A Study on Adaptive Learning Model for Performance Improvement of Stream Analytics. Journal of Convergence for Information Technology, 8(1), 201-206. https://doi.org/10.22156/CS4SMB.2018.8.1.201
- G. Pant & F. Menczer. (2002). MySpiders: Evolve your own intelligent Web crawlers. Autonomous Agents and Multi-Agent Systems, 5(2), 221-229. https://doi.org/10.1023/A:1014853428272
- E. J. Shin, Y. R. Kim, H. S. Heo & K. Y. Whang. (2008). Implementation of a Parallel Web Crawler for the Odysseus Large-Scale Search Engine. Journal of Computing Science and Engineering, 14(6), 567-581.
- M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker & I. Stoica. (2010). Spark: Cluster Computing with Working Sets. Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, 10(10-10), 95.
- Kafka. https://kafka.apache.org/intro