Title/Summary/Keyword: DBpedia

Building a Schema of the Korean DBpedia Ontology (한글 DBpedia 온톨로지 스키마 구축)

  • Kang, Min-Seo; Kim, Jae-Sung; Kim, Sun-Dong; Lee, Jae-Gil
    • Annual Conference on Human and Language Technology / 2014.10a / pp.139-142 / 2014
  • Ontologies are a tool for implementing the Semantic Web; they are used to connect knowledge concepts semantically. The English DBpedia ontology, built from the English Wikipedia, has both a well-constructed schema (in the form of an OWL file) and well-constructed instances, and each English DBpedia class carries a Korean label. However, more than half of the English DBpedia classes have no Korean label, so building a schema composed solely of Korean classes is worthwhile. Such a schema would be useful, for instance, in experiments on class-matching algorithms between two Korean ontologies or in research on automatic augmentation of Korean ontologies. The Korean DBpedia ontology schema constructed in this paper is based on the class hierarchy of the English DBpedia ontology and on mapping information between Korean and English classes. In addition, Korean properties of the existing Korean DBpedia ontology classes were entered as properties of the Korean classes, both those with English mapping information and those without it.
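
A rough illustration of this schema-building step (a sketch assuming the rdflib library and a hypothetical ko.dbpedia.org namespace; the paper's own tooling and URIs are not specified here): create one Korean-labeled OWL class and position it using the English hierarchy plus a Korean-English class mapping.

```python
# Minimal sketch (assumptions: rdflib, hypothetical Korean namespace;
# not the paper's actual tooling): one Korean-labeled OWL class placed
# according to the English DBpedia hierarchy and a class mapping.
from rdflib import Graph, Literal, Namespace, RDF, RDFS
from rdflib.namespace import OWL

DBO = Namespace("http://dbpedia.org/ontology/")
KO = Namespace("http://ko.dbpedia.org/ontology/")  # hypothetical namespace

g = Graph()
g.bind("dbo", DBO)
g.bind("owl", OWL)

actor_ko = KO["배우"]  # Korean class mapped from dbo:Actor (illustrative)
g.add((actor_ko, RDF.type, OWL.Class))
g.add((actor_ko, RDFS.label, Literal("배우", lang="ko")))
g.add((actor_ko, OWL.equivalentClass, DBO.Actor))
# The parent follows the English hierarchy (dbo:Actor is a subclass of
# dbo:Artist); the Korean parent class "예술가" would be created the same way.
g.add((actor_ko, RDFS.subClassOf, KO["예술가"]))

print(g.serialize(format="turtle"))
```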

Development of Location-based DBpedia Mobile Browser (위치 기반 DBpedia 모바일 브라우저 개발)

  • Lee, Suhyoung; Duan, HongZhou; Jung, Eunmi; Sun, YuXiang; Lee, Yongju
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.1047-1048 / 2017
  • This paper describes the development of a location-based DBpedia mobile browser that mashes up Google Map and DBpedia around the user's current location, displays nearby DBpedia entities, and lets the user explore additional RDF semantic information through links. DBpedia is a knowledge base that extracts structured data from Wikipedia and stores it in RDF form; it has emerged as the core of Linked Open Data, which is growing into big data of enormous scale. DBpedia holds information on about 4.58 million entities, including about 730,000 places and regions, and also provides several kinds of location-based data sets. Using the smartphone's location service, the browser developed in this study displays nearby places and buildings from these data sets on a map, and provides a brief summary of each entity together with links for retrieving further semantic information.
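
The kind of query such a browser issues can be sketched as follows (assuming the public DBpedia SPARQL endpoint and the SPARQLWrapper library; this is not the paper's own code): fetch entities whose WGS84 coordinates fall inside a bounding box around the user's position.

```python
# Minimal sketch (assumptions: public DBpedia endpoint, SPARQLWrapper;
# not the paper's code): find DBpedia entities near a coordinate by
# filtering on WGS84 latitude/longitude.
from SPARQLWrapper import SPARQLWrapper, JSON

lat, lng, box = 37.5665, 126.9780, 0.05  # around Seoul City Hall (illustrative)
sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery(f"""
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?place ?lat ?long WHERE {{
  ?place geo:lat ?lat ; geo:long ?long .
  FILTER (?lat  > {lat - box} && ?lat  < {lat + box} &&
          ?long > {lng - box} && ?long < {lng + box})
}} LIMIT 20
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["place"]["value"], row["lat"]["value"], row["long"]["value"])
```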

Constructing a Large Interlinked Ontology Network for the Web of Data (데이터의 웹을 위한 상호연결된 대규모 온톨로지 네트워크 구축)

  • Kang, Sin-Jae
    • Journal of Korea Society of Industrial Information Systems / v.15 no.1 / pp.15-23 / 2010
  • This paper presents a method of constructing a large interlinked ontology network for the Web of Data by mapping among representative ontologies. The more openly an ontology is published and the more easily it is shared and used, the greater its value becomes. By linking CoreOnto, an IT core ontology constructed in Korea, to the worldwide ontology network, CoreOnto can be opened to the international community and its usability enhanced. YAGO is an ontology constructed by combining the category information of Wikipedia with the taxonomy of WordNet, and it serves as the backbone of DBpedia, an ontology constructed by analyzing the structure of Wikipedia. We therefore suggest a mapping method that links CoreOnto to YAGO and DBpedia through WordNet synsets.
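
A toy sketch of synset-based candidate matching (assuming NLTK's WordNet interface and English class labels; the actual CoreOnto mapping procedure is more involved): two ontology classes become candidate matches when their labels share a WordNet synset.

```python
# Toy sketch (assumptions: NLTK WordNet, English labels; the real CoreOnto
# mapping is more involved): classes whose labels share a synset are
# candidate links between ontologies.
from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

def candidate_match(label_a: str, label_b: str) -> bool:
    """True if the two labels share at least one WordNet synset."""
    return bool(set(wn.synsets(label_a)) & set(wn.synsets(label_b)))

print(candidate_match("car", "automobile"))  # shared synset -> candidate link
print(candidate_match("car", "building"))    # no shared synset
```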

Knowledge Extraction Methodology and Framework from Wikipedia Articles for Construction of Knowledge-Base (지식베이스 구축을 위한 한국어 위키피디아의 학습 기반 지식추출 방법론 및 플랫폼 연구)

  • Kim, JaeHun; Lee, Myungjin
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.43-61 / 2019
  • Development of artificial intelligence technologies has accelerated with the Fourth Industrial Revolution, and AI research is actively conducted in fields such as autonomous vehicles, natural language processing, and robotics. Since the 1950s this research has focused on cognitive problems related to human intelligence, such as learning and problem solving, and thanks to recent interest and work on various algorithms the field has advanced more than ever. The knowledge-based system is a sub-domain of artificial intelligence; it aims to enable AI agents to make decisions using machine-readable, processable knowledge constructed from the complex, informal knowledge and rules of various human domains. A knowledge base is used to optimize information collection, organization, and retrieval, and recently it is used together with statistical AI such as machine learning. More recently, knowledge bases aim to express, publish, and share knowledge on the web by describing and connecting web resources such as pages and data, and they support intelligent processing across AI applications such as the question-answering systems of smart speakers. However, building a useful knowledge base is time-consuming and still requires a great deal of expert effort. In recent years much knowledge-based AI research has used DBpedia, one of the largest knowledge bases, which aims to extract structured content from the various information in Wikipedia. DBpedia contains information extracted from Wikipedia such as titles, categories, and links, but its most useful knowledge comes from Wikipedia infoboxes, user-created summaries of some unifying aspect of an article. This knowledge is generated by mapping rules between infobox structures and the DBpedia ontology schema defined in the DBpedia Extraction Framework; because it is generated from semi-structured, user-created infobox data, DBpedia can expect high reliability in terms of knowledge accuracy. However, since only about 50% of all pages in the Korean Wikipedia contain an infobox, DBpedia is limited in terms of knowledge scalability. This paper proposes a method for extracting knowledge from text documents according to the ontology schema using machine learning. To demonstrate the appropriateness of the method, we describe a knowledge extraction model, trained on Wikipedia infoboxes, that follows the DBpedia ontology schema. The model consists of three steps: classifying a document into ontology classes, classifying the sentences suitable for triple extraction, and selecting values and transforming them into RDF triples. Wikipedia infobox structures are defined by infobox templates that provide standardized information across related articles, and the DBpedia ontology schema can be mapped to these templates. Based on these mapping relations, we first classify the input document into infobox categories, which correspond to ontology classes. After determining the document's class, we classify the appropriate sentences according to the attributes belonging to that class. Finally, we extract knowledge from the sentences classified as appropriate and convert it into triples.
To train the models, we generated a training data set from a Wikipedia dump by adding BIO tags to sentences, covering about 200 classes and about 2,500 relations. We also ran comparative experiments with CRF and Bi-LSTM-CRF models for the knowledge extraction step. Through the proposed process, structured knowledge can be extracted from text documents according to the ontology schema, and the methodology can significantly reduce the expert effort required to construct instances conforming to the schema.
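
A minimal sketch of the BIO-tagging/CRF step described above (assuming the sklearn-crfsuite library, toy features, and a single illustrative sentence; the paper's feature set and 200-class setup are not reproduced here):

```python
# Minimal sketch (assumptions: sklearn-crfsuite, toy data; not the authors'
# pipeline): train a CRF to BIO-tag tokens so attribute values can later be
# lifted into RDF triples.
import sklearn_crfsuite

def token_features(tokens, i):
    """Simple per-token features; the paper's actual feature set is not given."""
    return {
        "word": tokens[i],
        "is_first": i == 0,
        "prev": tokens[i - 1] if i > 0 else "<BOS>",
        "next": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
    }

# One toy sentence, BIO-tagged for a hypothetical birthPlace attribute.
sentences = [["Kim", "was", "born", "in", "Seoul", "."]]
labels = [["O", "O", "O", "O", "B-birthPlace", "O"]]

X = [[token_features(s, i) for i in range(len(s))] for s in sentences]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, labels)
print(crf.predict(X))  # predicted BIO tags per token
```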

Developing Responsive Web Application for Location Based DBpedia Retrieval (위치기반 DBpedia 검색을 위한 반응형 웹 애플리케이션 개발)

  • Lee, Su-hyoung; Lee, Yong-ju
    • Proceedings of the Korea Information Processing Society Conference / 2017.11a / pp.975-977 / 2017
  • This paper presents a responsive web application that retrieves DBpedia entities, a kind of Linked Open Data, by location and displays them on a map; when one of the displayed entities is selected, its RDF data is parsed to provide basic information, photographs, and external links. The application also offers a filtering function that selectively displays entities on the map according to specified properties. It was built with the open-source web framework Ruby on Rails and implemented as a responsive web application using HTML5 and the Google Map API.

DBpedia Web Search Application using Google Cloud Natural Language API (구글 클라우드 자연어 API를 이용한 DBpedia 웹 검색 애플리케이션)

  • Lee, Suhyoung; Kim, Taeyoung; Park, Sunjae; Lee, Yongju
    • Proceedings of the Korea Information Processing Society Conference / 2018.05a / pp.509-511 / 2018
  • This paper describes an application that searches DBpedia entities, a kind of Linked Open Data, using natural language. The Google Cloud Natural Language API is used to analyze the natural-language input; based on the analysis, a query in SPARQL, the retrieval language for RDF (Resource Description Framework), is composed, and the results are returned as a web page. This gives non-experts easy access to Linked Open Data and opens up a range of possible applications.
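
A minimal sketch of this pipeline (assuming the google-cloud-language client with credentials configured, the public DBpedia endpoint, and a naive entity-to-label SPARQL template; the paper's actual query construction is not shown):

```python
# Minimal sketch (assumptions: google-cloud-language client, public DBpedia
# endpoint, naive label lookup; not the paper's pipeline): extract entities
# from a question, then look each one up in DBpedia.
from google.cloud import language_v1
from SPARQLWrapper import SPARQLWrapper, JSON

client = language_v1.LanguageServiceClient()
doc = language_v1.Document(
    content="Who directed the movie Parasite?",
    type_=language_v1.Document.Type.PLAIN_TEXT,
)
entities = client.analyze_entities(request={"document": doc}).entities

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
for entity in entities:
    # Naive mapping of the entity name to a DBpedia resource label.
    sparql.setQuery(f"""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?s WHERE {{ ?s rdfs:label "{entity.name}"@en }} LIMIT 5
    """)
    sparql.setReturnFormat(JSON)
    for row in sparql.query().convert()["results"]["bindings"]:
        print(entity.name, "->", row["s"]["value"])
```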

Design of a Question-answering System Based on SPARQL (SPARQL 기반의 질의응답 시스템 설계)

  • Ahn, HyeokJu; Lee, SungHee; Kim, HarkSoo
    • Annual Conference on Human and Language Technology / 2014.10a / pp.153-155 / 2014
  • To answer user queries, this paper proposes a Q&A system that stores the triple structures provided by DBpedia in TDB, finds triples in the user's query sentence, infers rules for that sentence to generate a SPARQL query, and finally outputs the result using Fuseki. In generating the SPARQL query, the system produces flexible queries by exploiting the fact that each question has a target that constitutes its answer and that the query may be transformed by Korean particles and adverbs. Because words that do not occur in DBpedia can appear in a query, a cleanup step for such words is also required. Since Korean word order is not fixed and sentence meaning shifts with particles and adverbs, we plan to address these aspects to raise precision as the system is developed further.
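
A minimal sketch of the final querying step (assuming a local Fuseki server with a TDB dataset at the default /ds endpoint and illustrative DBpedia URIs; the rule-based Korean query generation itself is not reproduced):

```python
# Minimal sketch (assumptions: local Fuseki at the default /ds endpoint,
# loaded with DBpedia triples; illustrative URIs): answer a parsed question
# with a templated SPARQL query.
from SPARQLWrapper import SPARQLWrapper, JSON

def ask(subject_uri: str, predicate_uri: str):
    sparql = SPARQLWrapper("http://localhost:3030/ds/sparql")
    sparql.setQuery(f"SELECT ?o WHERE {{ <{subject_uri}> <{predicate_uri}> ?o }}")
    sparql.setReturnFormat(JSON)
    rows = sparql.query().convert()["results"]["bindings"]
    return [r["o"]["value"] for r in rows]

# The triple pattern a rule-based parser might produce for
# "Who is the author of The Old Man and the Sea?" (illustrative URIs).
print(ask("http://dbpedia.org/resource/The_Old_Man_and_the_Sea",
          "http://dbpedia.org/ontology/author"))
```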

A New Semantic Distance Measurement Method using TF-IDF in Linked Open Data (링크드 오픈 데이터에서 TF-IDF를 이용한 새로운 시맨틱 거리 측정 기법)

  • Cho, Jung-Gil
    • Journal of the Korea Convergence Society / v.11 no.10 / pp.89-96 / 2020
  • Linked Data allows structured data to be published in a standard way so that datasets from various domains can be interlinked. With the rapid evolution of Linked Open Data (LOD), researchers are exploiting it to solve particular problems such as semantic similarity assessment. In this paper, we propose a method, built on the basic concept of Linked Data Semantic Distance (LDSD), for calculating the semantic distance between resources that can be used in an LOD-based recommender system. The proposed semantic distance model combines the LOD-based semantic distance with a new link weight using TF-IDF, which is well known in the field of information retrieval. To verify the effectiveness of this approach, performance was evaluated in the context of an LOD-based recommender system using mixed data from DBpedia and MovieLens. Experimental results show that the proposed method achieves higher accuracy than comparable methods. In addition, it improves recommender accuracy by expanding the range of semantic distance calculation.
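
An illustrative sketch of the idea (toy data and a simplified distance; not the paper's exact LDSD formulation): weight each (predicate, object) link of a resource with a TF-IDF-style score, and let heavily weighted shared links shrink the distance between two resources.

```python
# Illustrative sketch (simplified; not the paper's formula): TF-IDF-style
# weights on outgoing links, turned into a semantic distance via the
# weighted link overlap of two resources.
import math
from collections import Counter

def tfidf_weights(links, corpus):
    """links: (predicate, object) pairs of one resource.
    corpus: link lists of all resources (document-frequency base)."""
    tf = Counter(links)
    n = len(corpus)
    weights = {}
    for link, count in tf.items():
        df = sum(1 for doc in corpus if link in doc)
        # Smoothed IDF (sklearn-style) so shared links keep nonzero weight.
        weights[link] = (count / len(links)) * (math.log((1 + n) / (1 + df)) + 1.0)
    return weights

def semantic_distance(a, b, corpus):
    wa, wb = tfidf_weights(a, corpus), tfidf_weights(b, corpus)
    overlap = sum(wa[l] + wb[l] for l in set(wa) & set(wb))
    return 1.0 / (1.0 + overlap)  # heavier shared links -> smaller distance

movies = [
    [("director", "Bong_Joon-ho"), ("genre", "Thriller")],
    [("director", "Bong_Joon-ho"), ("genre", "Drama")],
    [("director", "Someone_Else"), ("genre", "Comedy")],
]
print(semantic_distance(movies[0], movies[1], movies))  # shared director
print(semantic_distance(movies[0], movies[2], movies))  # nothing shared -> 1.0
```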

Conflict Resolution of Patterns for Generating Linked Data From Tables (테이블로부터 링크드 데이터 생성을 위한 패턴 충돌 해소)

  • Han, Yong-Jin; Kim, Kweon Yang; Park, Se Young
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.3 / pp.285-291 / 2014
  • Recently, many researchers have paid attention to generating new linked data from tables using linked open data technologies (e.g., RDF, OWL). This paper proposes a new method for such generation. A pattern-based method intrinsically suffers from conflicts among patterns; for instance, several patterns that map a single table header to different linked-data properties conflict with each other. Existing studies either sacrificed precision by applying the statistically dominant pattern or ignored conflicting patterns to increase precision. The proposed method finds appropriate patterns for all headers in a given table by connecting the patterns applied to the headers. Experiments using DBpedia and Wikipedia showed that the proposed method effectively resolves pattern conflicts.
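
A toy sketch of joint pattern selection (illustrative scores and compatibility data only; not the paper's algorithm): rather than picking each header's dominant pattern independently, score whole assignments so that mutually compatible patterns win.

```python
# Toy sketch (illustrative data; not the paper's algorithm): choose the
# joint header->property assignment that maximizes per-header scores plus
# a bonus for pattern pairs known to co-occur in tables.
from itertools import product

candidates = {  # header -> {candidate property: pattern score}
    "Born":  {"dbo:birthDate": 0.6, "dbo:birthPlace": 0.5},
    "Place": {"dbo:birthPlace": 0.7, "dbo:deathPlace": 0.6},
}
# Property pairs frequently observed together in one table (assumed data).
compatible = {frozenset({"dbo:birthDate", "dbo:birthPlace"})}

def joint_score(assignment):
    score = sum(candidates[h][p] for h, p in assignment.items())
    props = list(assignment.values())
    for i in range(len(props)):
        for j in range(i + 1, len(props)):
            if frozenset({props[i], props[j]}) in compatible:
                score += 0.3  # bonus for a known-compatible pattern pair
    return score

headers = list(candidates)
best = max(
    (dict(zip(headers, combo)) for combo in product(*(candidates[h] for h in headers))),
    key=joint_score,
)
print(best)  # {'Born': 'dbo:birthDate', 'Place': 'dbo:birthPlace'}
```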

An Experimental Study on the Automatic Interlinking of Meaning for the LOD Construction of Record Information (기록정보 LOD 구축을 위한 의미 상호연결 자동화 실험 연구)

  • Ha, Seung-rok; An, Dae-Jin; Yim, Jin-hee
    • Journal of Korean Society of Archives and Records Management / v.17 no.4 / pp.177-200 / 2017
  • In a new technological environment shaped by big data and AI, LOD will link record information resources with various data from both inside and outside an institution. At the heart of this connection is interlinking technology, and interlinked LOD will realize the opening of record information at the highest level of open data. Given the ever-increasing volume of records, automation through interlinking algorithms is essential when building LOD. This paper therefore analyzes the structure of interlinking record information with external data and the characteristics of record information that must be considered when interlinking. After collecting samples from the CAMS data of the National Archives, we constructed an LOD of record information. We then ran a test bed that automatically interlinks person-related metadata of the records with DBpedia, confirming the automatic interlinking process and the performance and accuracy of the automation technology. From the implications of the test bed, we identified the considerations for record information resources in the LOD interlinking process.
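
A minimal sketch of one automatic interlinking step (assuming the public DBpedia endpoint, SPARQLWrapper, and a simple string-similarity threshold; the paper's matching criteria are richer): link a person name from record metadata to DBpedia candidates.

```python
# Minimal sketch (assumptions: public DBpedia endpoint, label search with a
# similarity threshold; the paper's matching is richer): find DBpedia Person
# candidates for a name taken from record metadata.
from difflib import SequenceMatcher
from SPARQLWrapper import SPARQLWrapper, JSON

def link_person(name: str, threshold: float = 0.85):
    sparql = SPARQLWrapper("https://dbpedia.org/sparql")
    sparql.setQuery(f"""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?s ?label WHERE {{
      ?s a <http://dbpedia.org/ontology/Person> ;
         rdfs:label ?label .
      FILTER (lang(?label) = "ko" && CONTAINS(?label, "{name}"))
    }} LIMIT 10
    """)
    sparql.setReturnFormat(JSON)
    rows = sparql.query().convert()["results"]["bindings"]
    # Keep only candidates whose label is close enough to the record value.
    return [(r["s"]["value"], r["label"]["value"])
            for r in rows
            if SequenceMatcher(None, name, r["label"]["value"]).ratio() >= threshold]

print(link_person("김구"))  # illustrative person name from record metadata
```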