참고문헌
- Wikipedia. Web Crawler [Internet] Available: https://en.wikipedia.org/wiki/Web_crawler.
- H. Karau, A. Konwinski, P. Wendell, and M. Zaharia, Learning Spark, 1st ed. Sebastopol, CA: O'Reilly Media, pp.1-9, 2015.
- F. Pant, P. Srinivasn, F. Menczer, "Crawling the Web" in Web Dynamics, 1st ed. Berlin, Germany: Springer-Verlag, pp.153-177, 2003.
- M. S. Ahuja , J. Singh, and B. Varnica. "Web Crawler: Extracting the Web Data," International Journal of Computer Trends and Technology(IJCTT), vol. 13, no. 3, pp.132-137, Jul. 2014. https://doi.org/10.14445/22312803/IJCTT-V13P128
- Pycon. Web Scraper in 30 Minutes [Online]. Available: https://www.pycon.kr/2014/program/15.
- H. C. Kim and S. H. Chae. "Design and Implementation of a High Performance Web Crawler," Journal of Digital Contents Society, vol. 4, no. 2, pp.127-137, Dec. 2003.
- D. M. Seo and H. M. Jung. "Intelligent Web Crawler for Supporting Big Data Analysis Services," The Journal of the Korea Contents Association, vol. 13, no. 12, pp.575-584, Dec. 2013. https://doi.org/10.5392/JKCA.2013.13.12.575
- K. Y. Kim, W. G. Lee, H. M. Yoon, S. H. Shin, and M. H. Lee. "Development of Web Crawler for Archiving Web Resources," The Journal of the Korea Contents Association, vol. 11, no. 9, pp.9-16, Sep. 2011. https://doi.org/10.5392/JKCA.2011.11.9.009
- S. H. Hong, "An Implementation of Smart Price Tracker System Using Web Crawling," M.S. Thesis, Seoul National University of Science and Technology, Seoul, Korea, 2015.
- V. K. Vavilapalli, et al., "Apache hadoop yarn: Yet another resource negotiator," in ACM Proceedings of the 4th annual Symposium on Cloud Computing, Santa Clara: CA, 2013.
- Y. K. Lee, "The Comparison Between Hadoop MapReduce and Spark Device's Machine Learning Performance," M.S. Thesis, Soongsil University, Seoul, Korea, 2015.
- B. S. Kim, "Performance Evaluation of HDFS Based SQLOn-Hadoop," M.S. Thesis, Chungbuk National University, Cheongju, Korea, 2015.
- J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp.107-113, Jan. 2008. https://doi.org/10.1145/1327452.1327492
- USCDataScience. Sparkler [Internet]. Available: https://github.com/USCDataScience/sparkler.
- H. O. Song, A. Y. Kim, and H. K. Jung. "Implement on Search Machine using Open Source Framework," Journal of the Korea Institute of Information and Communication Engineering, vol. 19, no. 3, pp.552-557, Mar. 2015. https://doi.org/10.6109/jkiice.2015.19.3.552
- M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica. "Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing," in Proceedings of the 9th USENIX conference on networked systems design and implementation, San Jose: CA, pp.2-2, 2012.
- C. Klaussne, J. Nioch. (2013, September). Nutch fight! 1.7 vs 2.2.1 [Internet]. Available: http://digitalpebble.blogspot.co.uk/2013/09/nutch-fight-17-vs-221.html.
피인용 문헌
- 빅데이터 분석 기반의 정보 검색을 위한 웹 크롤러 서비스 구현 vol.18, pp.5, 2017, https://doi.org/10.9728/dcs.2017.18.5.933
- K Nearest Neighbor Joins for Big Data Processing based on Spark vol.21, pp.9, 2017, https://doi.org/10.6109/jkiice.2017.21.9.1731