A New Semantic Distance Measurement Method using TF-IDF in Linked Open Data

  • Cho, Jung-Gil (조정길) (Department of Computer Engineering, Sungkyul University)
  • Received : 2020.08.22
  • Accepted : 2020.10.20
  • Published : 2020.10.28

Abstract

Linked Data allows structured data to be published in a standard way so that datasets from various domains can be interlinked. With the rapid evolution of Linked Open Data (LOD), researchers are exploiting it to solve particular problems such as semantic similarity assessment. In this paper, we build on the basic concept of Linked Data Semantic Distance (LDSD) and propose a method for calculating the semantic distance between resources that can be used in LOD-based recommender systems. The proposed semantic distance model is based on a similarity measure that combines the LOD-based semantic distance with a new link weight computed using TF-IDF, a weighting scheme well known in the field of information retrieval. To verify the effectiveness of the approach, its performance was evaluated in the context of an LOD-based recommender system using a combined dataset from DBpedia and MovieLens. Experimental results show that the proposed method achieves higher accuracy than other similar methods. In addition, expanding the range of the semantic distance calculation contributed to improving the accuracy of the recommender system.
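
The idea of combining a link-based semantic distance with a TF-IDF-style link weight can be illustrated with a minimal sketch. The code below is not the formula proposed in the paper, which the abstract does not state. It assumes a toy set of triples with hypothetical resource and link names, an IDF-like weight per link type (rarer link types count more), and a simplified distance of the form 1 / (1 + weighted link count) over direct links and shared outgoing links.

```python
import math
from collections import defaultdict

# Toy (subject, link type, object) triples; every name here is hypothetical.
TRIPLES = [
    ("Film_A", "director", "Person_X"),
    ("Film_B", "director", "Person_X"),
    ("Film_A", "starring", "Person_Y"),
    ("Film_B", "starring", "Person_Y"),
    ("Film_C", "starring", "Person_Y"),
    ("Film_A", "sequel", "Film_B"),
]


def link_idf(triples):
    """IDF-like weight per link type: log(1 + N / df).

    N is the number of distinct subjects and df is the number of
    subjects that use the link type (an assumed weighting, not the paper's).
    """
    subjects = {s for s, _, _ in triples}
    users = defaultdict(set)
    for s, link, _ in triples:
        users[link].add(s)
    n = len(subjects)
    return {link: math.log(1 + n / len(subs)) for link, subs in users.items()}


def semantic_distance(a, b, triples, weights):
    """Simplified weighted distance in (0, 1]; smaller means more related."""
    score = 0.0
    out_a, out_b = set(), set()
    for s, link, o in triples:
        # Direct links between the two resources, in either direction.
        if (s, o) in ((a, b), (b, a)):
            score += weights[link]
        if s == a:
            out_a.add((link, o))
        if s == b:
            out_b.add((link, o))
    # Indirect links: same link type pointing to the same object.
    for link, _ in out_a & out_b:
        score += weights[link]
    return 1.0 / (1.0 + score)


WEIGHTS = link_idf(TRIPLES)
d_ab = semantic_distance("Film_A", "Film_B", TRIPLES, WEIGHTS)
d_ac = semantic_distance("Film_A", "Film_C", TRIPLES, WEIGHTS)
print(f"A-B: {d_ab:.3f}  A-C: {d_ac:.3f}")
```

In this toy data, Film_A and Film_B share a director, an actor, and a direct link, so they end up closer than Film_A and Film_C, which share only the common "starring" link. Two resources with no direct or shared links get the maximum distance of 1.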
