Search | Korea Science

A Design and Implementation of Heterogeneous Metadata Searching System using Ontology (Ontology를 이용한 이종 메타데이터 검색 시스템의 설계 및 구현)

Choe, Hyun-Jong;Kim, Tae-Young
- Journal of The Korean Association of Information Education / v.8 no.3 / pp.353-360 / 2004
The World Wide Web is no longer a meaningless sea of information; it is becoming the Semantic Web, which provides users with meaningful information. XML and metadata are the starting point, and RDF is an intermediate step that provides a technique for relating arbitrary web resources. The semantics and logic of web resources can now be expressed in an ontology. Many educational multimedia web resources in Korea have metadata produced with KERIS's KEM (Korea Educational Metadata), so Korea needs to begin studying the semantics and logic of web resources. However, many researchers in Korea are more eager to study the Dublin Core (DC) and SCORM's LOM metadata specifications than KEM. A method for sharing and integrating these three metadata specifications should therefore be studied before the semantics and logic of web resources in Korea. We design an ontology that integrates the three metadata specifications and implement a prototype system using it. Some elements of the three specifications share both label and meaning, while others have different labels but the same meaning. To match the differently labeled elements that share a meaning, we adopted a one-to-one mapping technique in designing our ontology. The resulting ontology was imported as an "integrated schema" into our prototype search system to integrate the three kinds of metadata stored in databases. We also found that a more specific design of class properties in the ontology is needed to give users more informative search results, such as synonyms, antonyms, hierarchies, and associations.
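The one-to-one mapping idea in this abstract can be illustrated with a minimal sketch. The element labels, the `INTEGRATED_SCHEMA` table, and the `normalize` and `search` helpers below are hypothetical and are not taken from the paper; they only show how differently labeled elements with the same meaning could be renamed to one integrated label before searching.

```python
# Hypothetical one-to-one mapping from each specification's element labels
# to integrated labels. The labels are illustrative assumptions only.
INTEGRATED_SCHEMA = {
    "DC": {"title": "title", "creator": "author", "subject": "keyword"},
    "LOM": {"title": "title", "contribute": "author", "keyword": "keyword"},
    "KEM": {"title": "title", "contributor": "author", "keyword": "keyword"},
}


def normalize(spec, record):
    """Rename a record's fields to the integrated labels; drop unmapped fields."""
    mapping = INTEGRATED_SCHEMA[spec]
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def search(records, field, term):
    """Case-insensitive substring search over one integrated field."""
    term = term.lower()
    return [r for r in records if term in str(r.get(field, "")).lower()]


records = [
    normalize("DC", {"title": "Intro to RDF", "creator": "Kim"}),
    normalize("LOM", {"title": "XML Basics", "contribute": "Lee"}),
    normalize("KEM", {"title": "Ontology Primer", "contributor": "Kim"}),
]
hits = search(records, "author", "kim")
```

A single query on the integrated `author` field then reaches records that were originally labeled `creator`, `contribute`, or `contributor`.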

A 3-Layered Information Integration System based on MDRs and Ontology (MDR과 온톨로지를 결합한 3계층 정보 통합 시스템)

Baik, Doo-Kwon;Choi, Yo-Han;Park, Sung-Kong;Lee, Jeong-Oog;Jeong, Dong-Won
- The KIPS Transactions:PartD / v.10D no.2 / pp.247-260 / 2003
To share and standardize information, especially in database environments, an MDR (Metadata Registry) can be used to integrate heterogeneous databases within a particular domain. However, because data element representations differ between organizations, global information integration is not easy. In addition, users searching for integrated information on the Web have limited access to schema information for the underlying source databases. To solve these problems, this paper presents a 3-layered Information Integration System (LI2S) based on MDRs and an ontology. The purpose of the proposed architecture is to define an information integration model that combines the standard specification of MDRs with the ontology's ability to represent concepts and relations. Applying agent technology to the proposed model plays a key role in supporting a hierarchical and independent information integration architecture. The ontology serves as a semantic network that is used to extract concepts from the user query and to establish relationships between MDRs for data elements. The MDR and a knowledge base are used to resolve discrepancies in data element representation between MDRs. LI2S was designed and implemented based on this architectural concept.
https://doi.org/10.3745/KIPSTD.2003.10D.2.247

A Study on Distributed Parallel SWRL Inference in an In-Memory-Based Cluster Environment (인메모리 기반의 클러스터 환경에서 분산 병렬 SWRL 추론에 대한 연구)

Lee, Wan-Gon;Bae, Seok-Hyun;Park, Young-Tack
- Journal of KIISE / v.45 no.3 / pp.224-233 / 2018
Recently, there have been many studies on SWRL reasoning engines that apply user-defined rules to large-scale ontologies in a distributed environment. Unlike schema-based axiom rules, SWRL rules do not allow an efficient inference order to be defined. In addition, unnecessary iterative processes produce a large volume of shuffled data on the network. To solve these problems, we propose a method that uses a Map-Reduce algorithm and a distributed in-memory framework to infer multiple rules simultaneously and to minimize the volume of data shuffled between the machines in the cluster. For the experiment, we use the WiseKB ontology, composed of 200 million triples, and 36 user-defined rules. We found that the proposed reasoner completes inference in 16 minutes and is 2.7 times faster than previous reasoning systems that used the LUBM benchmark dataset.
https://doi.org/10.5626/JOK.2018.45.3.224

A Study on the Conceptual Modeling and Implementation of a Semantic Search System (시맨틱 검색 시스템의 개념적 모형화와 그 구현에 대한 연구)

Han, Dong-Il;Kwon, Hyeong-In;Chong, Hak-Jin
- Journal of Intelligence and Information Systems / v.14 no.1 / pp.67-84 / 2008
This paper proposes a design and implementation of a semantic search system. The proposed model comprises three architecture layers, conceptually named Knowledge Acquisition, Knowledge Representation, and Knowledge Utilization. The three layers are designed to work together interactively to best satisfy users' information needs. The Knowledge Acquisition Layer covers the indexing and storage of semantic metadata from various sources of web content (e.g., text, images, and multimedia). The Knowledge Representation Layer contains the ontology schema and instances and supports semantic search through ontology-based query expansion. Finally, the Knowledge Utilization Layer lets users enter search queries intuitively and obtain results without knowledge of semantic web languages or ontologies. With respect to the design and implementation of a semantic search site, the proposed semantic search system offers useful implications for researchers and practitioners seeking to move from research to commercial use.

Design of DatawareHouse Real-Time Cleansing System using XMDR (XMDR을 이용한 데이터웨어하우스 실시간 데이터 정제 시스템 설계)

Song, Hong-Youl;Jung, Kye-Dong;Choi, Young-Keum
- Journal of the Korea Institute of Information and Communication Engineering / v.14 no.8 / pp.1861-1867 / 2010
A data warehouse is generally used in organizations for decision and policy making. In a distributed environment, adding a new system requires considerable time and cost because of the differences between systems. To address this problem, heterogeneous data structures are first handled by creating abstract queries according to a standard schema and by separating the queries using XMDR. Second, a metadata dictionary that defines metadata synonyms and methods for data expression is used to overcome differences in the definition and expression of data. In particular, the work presented here uses XMDR to provide standardized information for data integration and to minimize the effects of integration on local systems in discrete environments, so that data warehouse information can be created in real time.
https://doi.org/10.6109/jkiice.2010.14.8.1861

Spatio-Temporal Semantic Sensor Web based on SSNO (SSNO 기반 시공간 시맨틱 센서 웹)

Shin, In-Su;Kim, Su-Jeong;Kim, Jeong-Joon;Han, Ki-Joon
- Spatial Information Research / v.22 no.5 / pp.9-18 / 2014
With the recent development of the ubiquitous computing environment, the use of spatio-temporal data from GPS-equipped sensors is increasing, and studies on the Semantic Sensor Web that use spatio-temporal data to provide various kinds of services are being actively conducted. In particular, the W3C developed the SSNO (Semantic Sensor Network Ontology), which builds on sensor-related standards such as the SWE (Sensor Web Enablement) of the OGC and defines classes and properties for expressing sensor data. Because these studies address query processing for non-spatio-temporal sensor data, they are hard to apply to spatio-temporal sensor data processing, which uses spatio-temporal data types and operators. Therefore, in this paper, we developed the SWE based on SSNO, which supports spatio-temporal sensor data types and operators by extending the spatial data types and operators in the OGC's "OpenGIS Simple Feature Specification for SQL". The system receives SensorML (Sensor Model Language) and O&M (Observations and Measurements) schemas and converts the data into SSNO. It also performs efficient query processing that supports spatio-temporal operators and reasoning rules. In addition, we showed that the system can be used for web services by applying it to a virtual scenario.
https://doi.org/10.12672/ksis.2014.22.5.009

A Dynamic Management Method for FOAF Using RSS and OLAP cube (RSS와 OLAP 큐브를 이용한 FOAF의 동적 관리 기법)

Sohn, Jong-Soo;Chung, In-Jeong
- Journal of Intelligence and Information Systems / v.17 no.2 / pp.39-60 / 2011
Since the introduction of Web 2.0 technology, social network services have been recognized as the foundation of an important future information technology. The advent of Web 2.0 has changed who creates content: on the earlier web, content creators were service providers, whereas on the recent web they are service users. Users share experiences with other users and thereby improve content quality, which has increased the importance of social networks. As a result, diverse forms of social network services have emerged from the relations and experiences of users. A social network is a network that constructs and expresses social relations among people who share interests and activities. Today's social network services are not confined to showing user interactions; they have developed to a level at which content generation and evaluation interact with each other. As the volume of content generated by social network services and the number of connections between users have increased drastically, social network extraction has become more complicated. Consequently, the following problems arise for social network extraction. The first problem is the insufficient representational power of objects in the social network. The second is the inability to express the diverse connections among users. The third is the difficulty of reflecting dynamic change in the social network caused by changes in user interests. The last is the lack of a method capable of integrating and processing data efficiently in a heterogeneous distributed computing environment. The first and last problems can be solved by using FOAF, a tool for describing ontology-based user profiles for the construction of social networks. However, solving the second and third problems requires a novel technology that reflects dynamic changes in user interests and relations.
In this paper, we propose a novel method that overcomes the above problems of existing social network extraction methods by applying FOAF (a tool for describing user profiles) and RSS (a mechanism for publishing web content) to an OLAP system in order to dynamically update and manage FOAF. We rely on data interoperability, an important characteristic of FOAF. We use RSS to reflect changes such as the passage of time and shifts in user interests. RSS provides a standard vocabulary for distributing web sites and content in the form of RDF/XML. We collect the personal information and relations of users by using FOAF, and we collect user content by using RSS. The collected data is then inserted into a database organized as a star schema. The proposed system generates an OLAP cube from the data in the database, and the 'Dynamic FOAF Management Algorithm' processes the generated OLAP cube. The Dynamic FOAF Management Algorithm consists of two functions: find_id_interest() and find_relation(). find_id_interest() extracts user interests during the input period, and find_relation() extracts users with matching interests. Finally, the proposed system reconstructs FOAF by reflecting the extracted relationships and interests of users. To validate the proposed idea, we present the implementation results together with their analysis. We used the C# language and an MS-SQL database, with FOAF and RSS data collected from livejournal.com as input. The results show that the foaf:interest entries of users increased by an average of 19 percent over four weeks. In proportion to the change in foaf:interest, the number of foaf:knows entries of users grew by an average of 9 percent over four weeks.
Because we use FOAF and RSS, which are widely supported in Web 2.0 and social network services, as the basic data, the method has a clear advantage in using user data distributed across diverse web sites and services regardless of language and computer type. With the method proposed in this paper, better services can be provided that cope with rapid changes in user interests through the automatic application of FOAF.
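The two functions named in this abstract can be sketched roughly as follows. This is not the authors' C# and OLAP implementation: the in-memory `FACTS` rows stand in for the star-schema fact table, and the function signatures and the "share at least one interest" matching rule are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical fact rows (user, interest, date) standing in for the
# star-schema fact table; the values are made up for illustration.
FACTS = [
    ("alice", "ontology", date(2011, 1, 3)),
    ("bob", "ontology", date(2011, 1, 5)),
    ("bob", "olap", date(2011, 1, 6)),
    ("carol", "olap", date(2011, 2, 1)),
]


def find_id_interest(facts, start, end):
    """Collect each user's interests observed within [start, end]."""
    interests = defaultdict(set)
    for user, interest, day in facts:
        if start <= day <= end:
            interests[user].add(interest)
    return dict(interests)


def find_relation(interests):
    """Pair up users who share at least one interest (assumed matching rule)."""
    users = sorted(interests)
    return [
        (a, b)
        for i, a in enumerate(users)
        for b in users[i + 1:]
        if interests[a] & interests[b]
    ]


interests = find_id_interest(FACTS, date(2011, 1, 1), date(2011, 1, 31))
relations = find_relation(interests)
```

In a fuller system, the extracted interests and pairs would be written back as foaf:interest and foaf:knows entries when the FOAF profiles are reconstructed.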
https://doi.org/10.13088/jiis.2011.17.2.039

Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

Choi, Hyunseung;Kim, Mintae;Kim, Wooju;Shin, Dongwook;Lee, Yong Hun
- Journal of Intelligence and Information Systems / v.24 no.4 / pp.111-136 / 2018
In this paper, we propose a methodology for extracting answer information for queries from various types of unstructured documents collected from multiple sources on the web in order to expand a knowledge base. The proposed methodology is divided into the following steps. 1) Collect relevant documents from Wikipedia, Naver encyclopedia, and Naver news sources for queries separated into "subject-predicate" form, and classify the suitable documents. 2) Determine whether each sentence is suitable for information extraction and derive a confidence. 3) Based on the predicate feature, extract the information from the suitable sentences and derive the overall confidence of the information extraction result. To evaluate the performance of the information extraction system, we selected 400 queries from SK Telecom's artificial intelligence speaker. The proposed model showed a higher performance index than the baseline model. The contribution of this study is a sequence tagging model based on bi-directional LSTM-CRF that uses the predicate feature of the query; with it, we developed a robust model that maintains high recall even on various types of unstructured documents collected from multiple sources. Information extraction for knowledge base expansion must take into account the heterogeneous characteristics of source-specific document types. The proposed methodology extracted information effectively from various types of unstructured documents compared with the baseline model. Previous research has the limitation that performance is poor when extracting information from document types that differ from the training data.
In addition, this study prevents unnecessary extraction attempts on documents that do not contain the answer by predicting the suitability of documents and sentences for information extraction before the extraction step. It is meaningful that we provide a method that maintains precision even in an actual web environment. Information extraction for knowledge base expansion cannot guarantee that a document contains the correct answer, because it targets unstructured documents on the real web. When question answering is performed on the real web, previous machine reading comprehension studies show low precision because they frequently attempt to extract an answer even from documents that contain no correct answer. The policy of predicting the suitability of documents and sentences for information extraction is meaningful in that it helps maintain extraction performance even in a real web environment. The limitations of this study and future research directions are as follows. The first is a problem related to data preprocessing. In this study, the unit of knowledge extraction is determined through morphological analysis based on the open-source KoNLPy Python package, and extraction can fail when the morphological analysis is not performed properly. To improve the information extraction results, a more advanced morpheme analyzer needs to be developed. The second is a problem of entity ambiguity. The information extraction system of this study cannot distinguish identical names that refer to different entities. If several people with the same name appear in the news, the system may not extract information about the person intended by the query.
Future research needs to take measures to distinguish people with the same name. The third is a problem of evaluation query data. In this study, we selected 400 user queries collected from SK Telecom's interactive artificial intelligence speaker to evaluate the performance of the information extraction system. We developed an evaluation data set using 800 documents (400 questions * 7 articles per question (1 Wikipedia, 3 Naver encyclopedia, 3 Naver news)) by judging whether each contains a correct answer. To ensure the external validity of the study, it is desirable to use more queries to determine the performance of the system, but this is a costly activity that must be done manually. Future research needs to evaluate the system on more queries. It is also necessary to develop a Korean benchmark data set for information extraction from multi-source web documents so that results can be evaluated more objectively.
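The suitability check described in this abstract (skip sentences unlikely to contain an answer, then extract and score the rest) can be sketched as below. The abstract does not say how the overall confidence is derived, so multiplying the two scores is an assumption, and the `toy_suitability` and `toy_extract` functions are hypothetical stand-ins for the paper's trained models.

```python
def extract_with_gating(sentences, suitability, extract, threshold=0.5):
    """Run extraction only on sentences whose suitability passes the threshold."""
    results = []
    for sentence in sentences:
        score = suitability(sentence)
        if score < threshold:
            continue  # skip sentences unlikely to contain an answer
        answer, conf = extract(sentence)
        # Assumption: overall confidence is the product of the two scores.
        results.append((answer, score * conf))
    return sorted(results, key=lambda r: r[1], reverse=True)


def toy_suitability(sentence):
    """Hypothetical stand-in for a trained sentence suitability classifier."""
    return 0.9 if "born" in sentence else 0.1


def toy_extract(sentence):
    """Hypothetical stand-in for a tagger: returns the last word and a fixed score."""
    return sentence.split()[-1].rstrip("."), 0.8


sentences = ["Kim was born in Seoul.", "The weather was mild."]
ranked = extract_with_gating(sentences, toy_suitability, toy_extract)
```

Gating before extraction is what keeps the extractor from producing an answer for text that has none, which is the precision concern the abstract raises.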
https://doi.org/10.13088/jiis.2018.24.4.111
