Evaluation of Knowledge Graph for Interoperating Digital Records

디지털 기록의 상호운용을 위한 지식그래프의 평가

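  • Haram Park (Library and Information Science Major, Department of Library and Information Science, Graduate School, Chung-Ang University)
  • Haklae Kim (Department of Library and Information Science, College of Social Sciences, Chung-Ang University)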
  • Received : 2023.10.17
  • Accepted : 2023.11.20
  • Published : 2023.11.30

Abstract

A digital archive is an online platform for preserving and using digital records that merit long-term preservation. However, digital archives in Korea share no common standards for functionality, metadata, or data description principles, which makes it difficult to link distributed digital records. This study proposes a common vocabulary for digital archives to improve the interoperability of digital records and evaluates the interoperability of a digital archive built with that vocabulary. We collect and analyze data from the digital archive on the Korean financial crisis of 1997, construct a knowledge graph from it, and compare its interoperability with that of a knowledge graph built with RiC-O. Both the archive and the knowledge graph are evaluated using an evaluation framework based on the FAIR data principles. The constructed knowledge graph links the various entities in the archive and provides contextual information that aids understanding of the records. The results show that a knowledge graph built with a common vocabulary improves the linkage, search, and interoperability of digital records compared with the existing archive.

디지털 아카이브는 지속적으로 보존할 가치가 있는 디지털 기록을 보존하고 활용하기 위한 온라인 플랫폼이다. 그러나 국내에서 운영되고 있는 디지털 아카이브는 기능, 메타데이터, 데이터의 기술원칙과 관련된 공통 원칙이 존재하지 않는다. 이는 분산적으로 존재하는 디지털 기록을 연계하기 힘들게 만드는 요인이 된다. 본 연구는 디지털 기록의 상호운용을 개선하기 위한 방안으로 디지털 아카이브를 위한 공통 어휘를 제안하고, 공통 어휘로 구축된 디지털 아카이브의 상호운용성을 평가한다. 1997 외환위기 아카이브의 데이터를 수집·분석하여 지식그래프를 구축하고, RiC-O로 구축된 지식그래프와 상호운용성을 비교한다. FAIR 데이터 원칙의 평가 프레임워크는 1997 외환위기 아카이브와 지식그래프를 평가하는 데 활용된다. 구축된 지식그래프는 기록의 다양한 개체가 서로 연계되고, 기록의 이해에 도움이 되는 맥락 정보를 제공한다. 검증 결과, 공통 어휘로 구축된 지식그래프는 기존 아카이브에 비해 디지털 기록의 연계와 검색, 상호운용 관점에서 향상된 결과를 보인다.

Acknowledgement

This research was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea in 2023.
