Ontology Matching Method for Solving Ontology Heterogeneity Issue


  • Hongzhou Duan (Dept. of Computer Science and Engineering, Kyungpook National University) ;
  • Yongju Lee (Dept. of Computer Science and Engineering, Kyungpook National University)
  • Received : 2024.04.13
  • Accepted : 2024.06.12
  • Published : 2024.06.30

Abstract

Ontologies are created by domain experts, but because experts understand domain knowledge differently, the same content may be expressed in different ways. Because ontology standardization is still lacking, multiple ontologies can exist within the same domain, a phenomenon known as ontology heterogeneity. To address this issue, we propose a novel ontology matching method that combines the SCBOW (Siamese Continuous Bag of Words) and BERT (Bidirectional Encoder Representations from Transformers) models. Ontologies are represented as graphs, and the SimRank algorithm is used to resolve the one-to-many problem that can arise in ontology matching. Experimental results show that our approach improves performance by about 8% over a traditional matching algorithm. The proposed method can enhance and refine the alignment techniques used in ontology matching.
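As a rough illustration of the SimRank step mentioned in the abstract, the following minimal sketch computes SimRank scores on a toy graph and uses them to choose among several candidate matches for one concept. The abstract does not describe how the graph is built or how SimRank scores are combined with the SCBOW and BERT similarities, so the graph, the concept names (`Author`, `Writer`, `Editor`, `Thing`, `Staff`), the decay factor, and the "pick the highest score" rule are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import product


def simrank(in_neighbors, decay=0.8, iterations=10):
    """Iteratively compute SimRank scores for every pair of nodes.

    in_neighbors maps each node to the list of nodes that point to it.
    Every node that appears in a list must also be a key of the mapping.
    """
    nodes = list(in_neighbors)
    # Start from the identity: a node is fully similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}
    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                # Nodes without in-neighbors have no structural context.
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity of all in-neighbor pairs, scaled by the decay.
            total = sum(sim[(i, j)] for i in ia for j in ib)
            new_sim[(a, b)] = decay * total / (len(ia) * len(ib))
        sim = new_sim
    return sim


# Hypothetical merged graph: "Author" and "Writer" share the parent "Thing",
# while "Editor" hangs under an unrelated parent "Staff".
in_neighbors = {
    "Thing": [],
    "Staff": [],
    "Author": ["Thing"],
    "Writer": ["Thing"],
    "Editor": ["Staff"],
}

scores = simrank(in_neighbors)

# One-to-many case (assumed): "Author" has two candidate matches;
# keep the candidate with the highest structural similarity.
candidates = ["Writer", "Editor"]
best = max(candidates, key=lambda c: scores[("Author", c)])
print(best, scores[("Author", best)])
```

In this toy graph, `Writer` is preferred because it shares an in-neighbor with `Author`, whereas `Editor` shares none. In practice a structural score like this would presumably be weighed together with the embedding-based similarity rather than used alone.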


Keywords

Acknowledgement

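This work was supported by the Ministry of Education and the National Research Foundation of Korea through the fourth-stage BK21 program (Intelligent Convergence Software Education and Research Group, School of Computer Science and Engineering, Kyungpook National University) (4120240214871). This work was also carried out under the Basic Science Research Program of the National Research Foundation of Korea, funded by the government (Ministry of Education) in 2016 (No. 2016R1D1A1B02008553).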
