Design and Implementation of a System for Recommending Related Content Using NoSQL

  • Ko, Eun-Jeong (School of Computer and Information Engineering, Kwangwoon University)
  • Kim, Ho-Jun (School of Computer and Information Engineering, Kwangwoon University)
  • Park, Hyo-Ju (School of Computer and Information Engineering, Kwangwoon University)
  • Jeon, Young-Ho (School of Computer and Information Engineering, Kwangwoon University)
  • Lee, Ki-Hoon (School of Computer and Information Engineering, Kwangwoon University)
  • Shin, Saim (Korea Electronics Technology Institute)
  • Received : 2017.06.26
  • Accepted : 2017.07.19
  • Published : 2017.09.30

Abstract

The growing volume of multimedia content offered to users calls for content recommendation. In this paper, we propose a system that recommends content related to the content the user is watching. The proposed system derives relationship information between content items from relationship information between their representative keywords, which in turn is generated by analyzing keyword collocation frequencies in an Internet news corpus. To handle the large corpus, we design an architecture consisting of a distributed search engine and a distributed data processing engine. Furthermore, to handle the large volume of relationship data, we store the keyword-to-keyword and keyword-to-content relationship information in a NoSQL store. Because NoSQL query optimizers are not as mature as those of RDBMSs, we propose query optimization techniques to efficiently process the complex queries required for recommendation. Experimental results show that the proposed techniques improve performance by up to 69 times, especially when the number of requested related keywords is small.
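
As a rough illustration of the keyword-based idea described in the abstract, the following single-machine Python sketch counts how often keyword pairs appear together in documents and uses those counts to rank content related to a watched item. It is not the system's implementation: the function names (`collocation_counts`, `related_keywords`, `recommend`), the scoring rule (summed pair frequencies), and the in-memory dictionaries standing in for the distributed engines and the NoSQL store are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def collocation_counts(docs):
    """Count how often each unordered keyword pair appears in the same document."""
    counts = Counter()
    for keywords in docs:
        # Deduplicate and sort so each pair is stored once as (smaller, larger).
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts


def related_keywords(counts, keyword, k):
    """Return up to k keywords that co-occur most often with `keyword`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == keyword:
            scores[b] += n
        elif b == keyword:
            scores[a] += n
    # Descending frequency, then alphabetical order to keep ties stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [kw for kw, _ in ranked[:k]]


def recommend(counts, content_keywords, watching, k):
    """Rank other content by summed collocation frequency (an assumed scoring rule)."""
    scores = Counter()
    for kw in content_keywords[watching]:
        related = set(related_keywords(counts, kw, k))
        for content_id, kws in content_keywords.items():
            if content_id == watching:
                continue
            for r in related & set(kws):
                scores[content_id] += counts[tuple(sorted((kw, r)))]
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [content_id for content_id, _ in ranked]


# Hypothetical toy data: each inner list holds the keywords of one news article.
docs = [
    ["election", "vote", "president"],
    ["election", "vote"],
    ["soccer", "goal"],
    ["election", "president"],
]
# Hypothetical mapping from content IDs to representative keywords.
content_keywords = {
    "c1": ["election"],
    "c2": ["vote", "president"],
    "c3": ["soccer"],
    "c4": ["president"],
}

counts = collocation_counts(docs)
top_related = related_keywords(counts, "election", 2)
recommended = recommend(counts, content_keywords, "c1", 2)
```

Here `k` plays the role of the number of requested related keywords. Limiting the candidate keywords before matching them against content keeps the intermediate result small, which is the kind of consideration that matters when the same lookup is expressed as a query over a large store.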
