Search | Korea Science

Semiautomatic Pattern Mining for Training a Relation Extraction Model (관계추출 모델 학습을 위한 반자동 패턴 마이닝)

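Choi, GyuHyeon;Nam, Sangha;Choi, Key-Sun
- Annual Conference on Human and Language Technology / 2016.10a / pp.257-262 / 2016
This paper describes research on relation extraction, which identifies structured triples expressing the relation between two entities in unstructured natural-language sentences. Compared with rule-based approaches, in which a person performs linguistic analysis and manually enters the forms in which triples are expressed, machine-learning-based approaches, in which the machine learns those forms from data, can cover a wider variety of expressions. Machine learning requires training data, and approaches can be classified as supervised learning, distant supervision, and so on, according to how that data is collected. Supervised learning has the drawback of requiring considerable human effort, because people must create the training data, but the high quality of the data makes it easier to build a high-performance relation extraction model. Distant supervision can produce training data without human effort, but because the data quality is lower, high relation extraction performance is hard to expect. This study proposes a training method that offers a compromise by having supervised learning and distant supervision compensate for each other's weaknesses when training a relation extraction model.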
PDF
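
The trade-off described in the abstract above can be illustrated with a minimal sketch of distant-supervision labeling. This is not the paper's method: the function name `distant_label`, the toy knowledge base, and the substring matching are all assumptions made for illustration. It shows why automatically labeled data tends to be noisy: any sentence mentioning both entities is labeled with the relation, whether or not it expresses it.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def distant_label(sentences: List[str], kb: List[Triple]) -> List[Tuple[str, Triple]]:
    """Pair each sentence with every KB triple whose subject and object both occur in it.

    Hypothetical helper: plain substring matching stands in for entity linking.
    """
    labeled = []
    for sentence in sentences:
        for subj, rel, obj in kb:
            if subj in sentence and obj in sentence:
                labeled.append((sentence, (subj, rel, obj)))
    return labeled


# Toy data, invented for this example.
kb = [("Seoul", "capitalOf", "South Korea")]
sentences = [
    "Seoul is the capital of South Korea.",
    "Seoul hosted a summit attended by South Korea and Japan.",  # noisy match
    "Busan is a port city.",
]

labeled = distant_label(sentences, kb)
for sentence, triple in labeled:
    print(triple, "<-", sentence)
```

The second sentence is labeled `capitalOf` even though it does not express that relation, which is the kind of error a human-in-the-loop step would be expected to filter out.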

Integration of Ontology Open-World and Rule Closed-World Reasoning (온톨로지 Open World 추론과 규칙 Closed World 추론의 통합)

Choi, Jung-Hwa;Park, Young-Tack
- Journal of KIISE:Software and Applications / v.37 no.4 / pp.282-296 / 2010
OWL is an ontology language for the Semantic Web and is suited to modelling the knowledge of a specific real-world domain. An ontology can also be used to infer new implicit knowledge from explicit knowledge. However, the modeled knowledge cannot be complete, because human common sense cannot be represented in full. Ontologies do not handle the non-monotonic reasoning needed to detect incomplete modeling, such as integrity constraints and exceptions. A default rule can handle an exception concerning a specific class in an ontology. An integrity constraint makes explicit which relationships, and how many, the instances of a class must hold. In this paper, we propose a practical reasoning system for open- and closed-world reasoning that supports a novel hybrid integration of ontologies based on the open-world assumption (OWA) and non-monotonic rules based on the closed-world assumption (CWA). The system uses a method that solves the problems arising when incomplete knowledge is handled under the OWA. The method uses answer set programming (ASP) to find a solution. ASP is a form of logic programming that can be seen as the computational embodiment of non-monotonic reasoning, and it enables CWA-based queries against a description-logic knowledge base (KB). Our system not only finds practical cases that require non-monotonic reasoning from examples in Protege, but also estimates novel reasoning results for those cases based on a KB that realizes a transparent integration of rules and ontologies supported by some well-known projects.
PDF KSCI
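
The difference between the two assumptions named in the abstract above can be shown with a small sketch. It is only an illustration of OWA versus CWA query answering over a set of ground facts, not the ASP-based system the paper proposes; the function names `ask_cwa` and `ask_owa` and the toy facts are assumptions.

```python
from typing import Optional, Set, Tuple

Fact = Tuple[str, str, str]  # (subject, predicate, object)


def ask_cwa(kb: Set[Fact], query: Fact) -> bool:
    """Closed world: anything not stated in the KB is taken to be false."""
    return query in kb


def ask_owa(kb: Set[Fact], negated: Set[Fact], query: Fact) -> Optional[bool]:
    """Open world: a fact is false only if explicitly negated, otherwise unknown (None)."""
    if query in kb:
        return True
    if query in negated:
        return False
    return None


# Toy knowledge, invented for this example.
kb = {("Tweety", "isA", "Bird")}
negated = {("Tweety", "isA", "Penguin")}
query = ("Tweety", "isA", "Pet")

print(ask_cwa(kb, query))           # False: not stated, so assumed false
print(ask_owa(kb, negated, query))  # None: not stated, so unknown
```

Under the CWA the missing fact yields a definite "no", which is what makes integrity constraints and default rules checkable; under the OWA the same query stays undecided.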

Efficient Image Retrieval using Minimal Spatial Relationships (최소 공간관계를 이용한 효율적인 이미지 검색)

Lee, Soo-Cheol;Hwang, Een-Jun;Byeon, Kwang-Jun
- Journal of KIISE:Databases / v.32 no.4 / pp.383-393 / 2005
Retrieval of images from image databases by spatial relationship can be performed effectively through visual interface systems. In these systems, representing an image with 2D strings, which are derived from symbolic projections, provides an efficient and natural way to construct an image index and is also an ideal representation for visual queries. With this approach, retrieval is reduced to matching two symbolic strings. However, 2D-string representations may not specify the spatial relationships between the objects in an image exactly, and ambiguities arise when retrieving images of 3D scenes. To remove ambiguous descriptions of object spatial relationships, this paper describes images by considering spatial relationships using the spatial location algebra for 3D image scenes. We also remove repetitive spatial relationships using several reduction rules. A reduction mechanism based on these rules can be used in query processing systems that retrieve images by content, which could give better precision and flexibility in image retrieval.
PDF KSCI
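
The 2D-string idea mentioned in the abstract above can be sketched as follows. This is a simplified illustration, assuming each object is reduced to a single point and only the operators `<` (strictly before) and `=` (same coordinate) are used; the function names and the toy scene are hypothetical, and the paper's spatial location algebra and reduction rules are not reproduced here.

```python
from typing import Dict, Tuple


def projection_string(positions: Dict[str, Tuple[int, int]], axis: int) -> str:
    """Order objects along one axis, joining them with '<' or '=' by coordinate."""
    ordered = sorted(positions.items(), key=lambda item: (item[1][axis], item[0]))
    parts = []
    previous = None
    for name, coords in ordered:
        if previous is not None:
            parts.append("=" if coords[axis] == previous else "<")
        parts.append(name)
        previous = coords[axis]
    return " ".join(parts)


def two_d_string(positions: Dict[str, Tuple[int, int]]) -> Tuple[str, str]:
    """Return the (x-projection, y-projection) pair of symbolic strings."""
    return projection_string(positions, 0), projection_string(positions, 1)


# Toy scene: object name -> (x, y) position, invented for this example.
scene = {"A": (0, 2), "B": (1, 0), "C": (1, 1)}
u, v = two_d_string(scene)
print(u)  # A < B = C
print(v)  # B < C < A
```

Once images are indexed this way, a query picture can be converted to the same pair of strings and compared symbolically, which is the sense in which retrieval reduces to string matching.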

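A New Method for Improving Performance in ACE Relation Detection and Characterization (ACE 관계 추출과 특징화 과정에서 성능 향상을 위한 새로운 방법(1))

Kim, Kyung-Duk;Kim, Seok-Hwan;Lee, Gray Geun-Bae;Cha, Jeong-Won
- Annual Conference on Human and Language Technology / 2005.10a / pp.1-6 / 2005
With the rapid growth of text-based documents, information extraction technology is becoming increasingly important. In particular, extraction of relations between entities, an area of active recent research, can be applied in many fields such as information retrieval and question answering. This paper concerns a system that introduces the WHISK algorithm to improve the recall of an existing feature-based relation extraction system. The WHISK algorithm automatically learns rules that extract, from a sentence, the entity pairs participating in a relation. The system then uses a maximum entropy model to determine the relation type appropriate for each entity pair extracted by WHISK. This paper describes the WHISK algorithm and the maximum entropy model used in the system, and examines how much performance improves when WHISK is actually used to extract the related entity pairs.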
PDF

Semantic Inference System Using Backward Chaining (후방향 추론기법을 이용한 시멘틱 추론 시스템)

함영경;박영택
- Proceedings of the Korean Information Science Society Conference / 2003.10a / pp.97-99 / 2003
Because most web documents are written in HTML or XML, which represent information on the basis of syntactic structure, there is a limit to how well software can process that information. HTML is a tag-based representation intended only for displaying documents, and XML is a representation proposed to make document structure easy for people to understand. Web agents that provide services from information expressed in HTML and XML have therefore had to perform a great deal of manual work offline to offer users meaningful services. To overcome this problem, research on the Semantic Web is being actively pursued in the United States and Europe. Unlike the existing web, the Semantic Web represents information in a machine-processable form that software can understand and process, so agents can understand and handle many of the tasks that used to be performed offline. However, the 3I problems of information (incorrect, incomplete, inconsistent) inevitably arise in the process of building an ontology, and the results of a service depend on the ontology. The inference engine using backward chaining proposed in this paper makes the following proposals. First, it proposes an automated system of software agents built on the Semantic Web. Second, it proposes a semantic inference engine that uses rule-based backward chaining to overcome the limitations of ontology information. The proposed semantic inference system takes a user's query as input and performs backward chaining over the ontology and the information in Semantic Web documents, with the aim of mitigating the incompleteness of web information, reducing dependence on the ontology, and thereby improving the quality of web services.
PDF
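
Backward chaining, the inference strategy named in the abstract above, starts from the query and works back through rules to known facts. The following is a minimal propositional sketch under stated assumptions: facts and rule heads are plain strings, the `prove` function and the toy rule base are hypothetical, and nothing here reflects the paper's actual engine or its ontology handling.

```python
from typing import Dict, FrozenSet, List, Set

# Each goal maps to a list of alternative rule bodies; a body is a list of subgoals.
Rules = Dict[str, List[List[str]]]


def prove(goal: str, facts: Set[str], rules: Rules,
          visiting: FrozenSet[str] = frozenset()) -> bool:
    """Return True if the goal is a known fact or follows from some rule body."""
    if goal in facts:
        return True
    if goal in visiting:
        return False  # goal is already being proved further up: avoid cycles
    for body in rules.get(goal, []):
        if all(prove(sub, facts, rules, visiting | {goal}) for sub in body):
            return True
    return False


# Toy facts and rules, invented for this example.
facts = {"is_professor(kim)", "teaches(kim, ai)"}
rules = {
    "is_faculty(kim)": [["is_professor(kim)"]],
    "can_advise(kim)": [["is_faculty(kim)", "teaches(kim, ai)"]],
}

print(prove("can_advise(kim)", facts, rules))  # True
print(prove("can_advise(lee)", facts, rules))  # False
```

Only the rules relevant to the query are explored, which is why backward chaining suits query-driven inference over a large rule base.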

A Study on the Acquisition of Usage Statistics based on SUSHI Project (SUSHI 기반 학술정보 이용통계 수집 모델 연구)

Kim, Sun-Tae;Lim, Seok-Jong
- Proceedings of the Korea Contents Association Conference / 2007.11a / pp.35-39 / 2007
Usage statistics have recently become widely available from online content providers. However, the statistics are not yet available in a consistent data container, and the administrative cost of individual provider-by-provider downloads is high. The Standardized Usage Statistics Harvesting Initiative (SUSHI) is developing an automated request and response protocol for moving Project COUNTER (Counting Online Usage of Networked Electronic Resources) Code of Practice usage statistics from providers to library electronic repositories. SUSHI will help libraries make better decisions by reducing the administrative overhead of using Project COUNTER statistics. Project COUNTER assists publishers in the recording and exchange of usage statistics for electronic resources, initially journals and databases. By following COUNTER's Code of Practice, vendors can provide library customers with Excel or CSV (comma-delimited) files of usage data using COUNTER's standardized formats and data elements. The result is a consistent, credible, and compatible set of usage data from multiple content providers. In this study, we propose a SUSHI-based model for acquiring usage data for KESLI, a consortium in Korea for overseas electronic journals.
PDF

Service-centric Object Fragmentation Model for Efficient Retrieval and Management of XML Documents (XML 문서의 효율적인 검색과 관리를 위한 SCOF 모델)

Jeong, Chang-Hoo
- Proceedings of the Korea Contents Association Conference / 2007.11a / pp.595-598 / 2007
The vast number of XML documents raises interest in how they will be used and how far their usage can be expanded. This paper has two central goals: 1) easy and fast retrieval of XML documents or relevant elements; and 2) efficient and stable management of large XML documents. The keys to developing such a practical system are how to segment a large XML document into smaller fragments and how to store them. To achieve these goals, we designed the SCOF (Service-centric Object Fragmentation) model, a semi-decomposition method based on conversion rules provided by XML database managers. Keyword-based search using the SCOF model then retrieves specific elements or attributes of XML documents, just as a typical XML query language does. Even though this approach relies on the judgment of the managers of an XML document collection, the SCOF model makes both retrieval and management of massive XML documents efficient.
PDF

An Agent System for Efficient VOD Services on Web (효율적 웹 기반 VOD 서비스를 위한 에이전트 시스템)

Lee, Kyung-Hee;Han, Jeong-Hye;Kim, Dong-Ho
- Journal of Digital Contents Society / v.2 no.1 / pp.73-79 / 2001
Most existing algorithms try to disseminate the multimedia contents of an Internet service provider (ISP) without taking into account the needs and characteristics of specific websites, including e-learning systems with web-based educational contents. Sometimes the client must select the best one among the replicated repositories. However, this is a less reliable approach because clients' selections are made without prior information on server load capacity. In this paper, we propose an agent system motivated by the need to improve the QoS of delivering web-based educational multimedia contents without incurring long access delays. The agent system consists of three components: an Analyzer, a Knowledge Base, and an Automaton that embeds the capacity algorithm. It analyzes traffic information collected from the individual replicated servers in response to learners' requests, selects a server that is available and is expected to provide the lowest latency and the lowest load, and achieves high performance by dynamically replicating web resources among multiple repositories.
PDF

An Efficient Spatial Join Method Using DOT Index (DOT 색인을 이용한 효율적인 공간 조인 기법)

Back, Hyun;Yoon, Jee-Hee;Won, Jung-Im;Park, Sang-Hyun
- Journal of KIISE:Databases / v.34 no.5 / pp.420-436 / 2007
The choice of an effective indexing method is crucial to guarantee the performance of the spatial join operator, which is heavily used in geographical information systems. The $R^*$-tree based method is renowned as one of the most representative indexing methods. In this paper, we propose an efficient spatial join technique based on the DOT (Double Transformation) index and compare it with the spatial join technique based on the $R^*$-tree index. The DOT index transforms the MBR of a spatial object into a single numeric value using a space-filling curve, and builds a $B^+$-tree from the set of numeric values so obtained. The DOT index can be employed as a primary index for spatial objects. The proposed spatial join technique exploits the regularities in the moving patterns of space-filling curves to divide a query region into a set of maximal sub-regions within which the curves traverse without interruption. This division reduces the number of spatial transformations required to perform the spatial join and thus improves the performance of join processing. Experiments with data sets of various distributions and sizes showed that the proposed join technique is up to three times faster than the spatial join method based on the $R^*$-tree index.
PDF KSCI
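
The core idea of mapping spatial data to a single numeric key with a space-filling curve can be sketched as follows. The abstract above does not say which curve the DOT index uses, so this example assumes a Z-order (Morton) curve purely for illustration, and it maps a 2D point rather than a full MBR; the function name `interleave_bits` and the sample points are hypothetical.

```python
from typing import List, Tuple


def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Z-order (Morton) key: interleave the low `bits` bits of x and y."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return code


# Toy points, invented for this example. Sorting by key gives the order in
# which a one-dimensional index such as a B+-tree would store them.
points: List[Tuple[int, int]] = [(3, 5), (0, 0), (1, 0), (0, 1), (2, 2)]
by_key = sorted(points, key=lambda p: interleave_bits(p[0], p[1]))

for p in by_key:
    print(p, interleave_bits(p[0], p[1]))
```

Because nearby keys tend to correspond to nearby locations, a range of keys covers a contiguous stretch of the curve, which is the kind of regularity the abstract says the join technique exploits.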

Search Result 73, Processing Time 0.025 seconds
