Big Data based Tourist Attractions Recommendation - Focus on Korean Tourism Organization Linked Open Data -

Implementation of a Big Data-Based Tourist Attraction Recommendation System - Focusing on Korean Tourism Organization LOD -

  • Ahn, Jinhyun (Dental Research Institute, Seoul National University) ;
  • Kim, Eung-Hee (BK21 Plus Dental Life Science, Seoul National University) ;
  • Kim, Hong-Gee (Healthcare Management and Informatics, Department of Dentistry, Seoul National University)
  • Received : 2017.10.13
  • Accepted : 2017.11.09
  • Published : 2017.11.30

Abstract

Conventional exhibition management information systems recommend tourist attractions that are close to the venue where an exhibition is held. Such location-based recommendations can be meaningless when the recommended attractions have nothing to do with the exhibition's topic. Our goal is to recommend attractions that are related to the content presented in the exhibition, which we call content-based recommendation. Human exhibition curators can do this, but the quality is limited by the manual nature of the task and by each curator's knowledge. We propose an automatic way of discovering attractions relevant to an exhibition of interest. Language resources are incorporated to discover more meaningful attractions. Because a typical single machine cannot process such large-scale language resources efficiently, we implemented the algorithm on top of Apache Spark, a well-known distributed computing framework. As a user interface prototype, we implemented a web-based system that shows users a list of relevant attractions while they browse exhibition information, available at http://bike.snu.ac.kr/WARP. We carried out a case study based on Korean Tourism Organization Linked Open Data with Korean Wikipedia as a language resource. Experimental results demonstrate the efficiency and effectiveness of the proposed system; effectiveness was evaluated on well-known exhibitions. We expect the proposed approach to contribute to the development of both the exhibition and tourism industries by motivating exhibition visitors to become active tourists.

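Existing exhibition information services recommend tourist attractions near the venue where an exhibition is held. Such location-based recommendation has the limitation that it may recommend attractions unrelated to the exhibition's content. Recommending attractions related to the exhibition's content would help visitors experience, at those attractions, the knowledge they acquired at the exhibition. Exhibition curators can find and recommend related attractions one by one, but because this is manual work, recommendations are limited to the scope of the curator's background knowledge. Manual work can also introduce errors, so an automated method is needed. This study proposes a method that uses large-scale language resources to automatically recommend attractions related to an exhibition's content. The language resources used include Korean Tourism Organization LOD (Linked Open Data), Wikipedia, and the National Institute of Korean Language dictionary. Because a single computer cannot process such large language resources efficiently, the method was implemented on Apache Spark, a cloud computing framework. We also implemented a web interface that shows the attractions recommended by the algorithm when a user views exhibition information in a web browser (http://bike.snu.ac.kr/WARP). Experts evaluated the accuracy of attraction recommendations for major exhibitions. The proposed method achieved higher accuracy than the existing method. Because this work can reduce the manual effort of exhibition curators and naturally guide exhibition visitors to tourist attractions, it can benefit both the exhibition and tourism industries.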
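The abstract does not state how relevance between an exhibition and an attraction is scored, so the following is only a minimal single-machine sketch of content-based ranking, assuming TF-IDF term weights and cosine similarity. The function names, the smoothed IDF formula, and the toy token lists are all hypothetical illustrations, not the authors' algorithm or their Apache Spark implementation.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document.

    Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1, so that terms
    appearing in every document still get a non-zero weight.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_attractions(exhibition_tokens, attractions):
    """Rank attractions by textual similarity to the exhibition description.

    `attractions` maps an attraction name to its list of description tokens.
    """
    names = list(attractions)
    vectors = tf_idf_vectors([exhibition_tokens] + [attractions[n] for n in names])
    query, candidates = vectors[0], vectors[1:]
    scored = [(name, cosine(query, vec)) for name, vec in zip(names, candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data; real input would be tokenized exhibition and
# attraction descriptions drawn from the language resources.
exhibition = ["ceramics", "celadon", "goryeo", "pottery"]
attractions = {
    "Celadon Museum": ["celadon", "goryeo", "kiln", "museum"],
    "Theme Park": ["rides", "roller", "coaster"],
    "Pottery Village": ["pottery", "kiln", "craft"],
}
ranking = rank_attractions(exhibition, attractions)
```

Unlike a location-based ranking, the ordering here depends only on shared vocabulary, so an attraction with no terms in common with the exhibition scores zero no matter how near it is.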
