Search | Korea Science

Improvement of Transformation Rule-Based Korean Part-Of-Speech Tagger (변형 규칙 기반 한국어 품사 태거의 개선)

Lim, Heui-Seok;Kim, Jin-Dong;Rim, Hae-Chang
- Annual Conference on Human and Language Technology / 1996.10a / pp.216-221 / 1996
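A transformation rule-based part-of-speech tagger has several advantages: it can learn tagging rules automatically from a corpus, it is robust, and its tagging results are easy to understand and analyze. Accordingly, a transformation rule-based Korean part-of-speech tagger that takes the characteristics of Korean into account was recently developed. However, because this system does not use the lexical information of erroneous eojeols (Korean word phrases), transformation rules for correctable errors are not learned properly, and new errors are introduced while the transformation rules are applied. This paper therefore proposes an improvement to the transformation rule-based Korean part-of-speech tagger that extracts fine-grained transformation rules able to refer to the lexical information of erroneous eojeols. A fine-grained transformation rule that can refer to lexical information has the form: in a specific context C, change the eojeol tag $\alpha$ of eojeol W to the eojeol tag $\beta$. The proposed method learned 57 fine-grained rules from a training corpus of about 100,000 eojeols and, when applied to a test corpus of 20,000 eojeols, achieved 95.6% accuracy, improving the accuracy of the existing transformation rule-based part-of-speech tagger by about 15.4%.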
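The rule form described in this abstract (in context C, change tag α of eojeol W to tag β) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `LexicalRule` class, the choice of "tag of the preceding token" as the context C, and the English toy tokens and tags are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class LexicalRule:
    """Hypothetical lexicalized rule: change tag `alpha` of token `word` to `beta`."""
    word: str                 # W: the specific token the rule refers to
    alpha: str                # tag to be replaced
    beta: str                 # replacement tag
    prev_tag: Optional[str]   # assumed context C: tag of the preceding token (None = any)


def apply_rule(tagged: List[Tuple[str, str]], rule: LexicalRule) -> List[Tuple[str, str]]:
    """Return a new tagged sequence with the rule applied left to right."""
    out = list(tagged)
    for i, (word, tag) in enumerate(out):
        prev = out[i - 1][1] if i > 0 else None
        context_ok = rule.prev_tag is None or prev == rule.prev_tag
        if word == rule.word and tag == rule.alpha and context_ok:
            out[i] = (word, rule.beta)
    return out


# Toy data (English placeholders, not from the paper).
sentence = [("a", "DET"), ("run", "VERB"), ("ends", "VERB")]
rule = LexicalRule(word="run", alpha="VERB", beta="NOUN", prev_tag="DET")
print(apply_rule(sentence, rule))
```

Because the rule names the token itself, it only fires on that token, which is one plausible reading of how lexical information could keep a correction from introducing errors elsewhere.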

Trainable Interface Agents for Information Extraction (정보추출을 위한 학습 가능한 인터페이스 에이전트)

김용기;양재영;최중민
- Proceedings of the Korean Information Science Society Conference / 2001.10b / pp.61-63 / 2001
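The aim of this paper is to develop an interface agent that can learn patterns of information extraction rules using machine learning methods. An interface agent is an intelligent agent that can interact with the user. The user interacts with the interface agent, and through this interaction the agent learns the information extraction rules the user wants. The user marks the location of the desired information in a web document, thereby training the interface agent on that data. The interface agent then extracts the information the user wants using the learned extraction rules.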

An N-version Learning Approach to Enhance the Prediction Accuracy of Classification Systems in Genetics-based Learning Environments (유전학 기반 학습 환경하에서 분류 시스템의 성능 향상을 위한 엔-버전 학습법)

Kim, Yeong-Jun;Hong, Cheol-Ui
- The Transactions of the Korea Information Processing Society / v.6 no.7 / pp.1841-1848 / 1999
DELVAUX is a genetics-based inductive learning system that learns a rule-set consisting of Bayesian classification rules from sets of examples for classification tasks. One problem DELVAUX faces in rule-set learning is that the process occasionally ends at a local optimum without finding the best rule-set. Another is that it occasionally ends with a rule-set that performs well on the training examples but not on unseen test examples. This paper describes efforts to alleviate these two problems, centering on the N-version learning approach, in which multiple rule-sets are learned and a classification system is constructed from them to improve overall performance. To implement the N-version learning approach, we propose a decision-making scheme that draws a decision from multiple rule-sets and a genetic algorithm approach that finds a good combination of rule-sets from a set of learned rule-sets. We also present empirical results that evaluate the effect of the N-version learning approach in the DELVAUX learning environment.
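The idea of drawing one decision from several learned rule-sets can be sketched as follows. The abstract does not say which decision-making scheme the paper uses, so plurality voting here is an assumption, and `n_version_predict` and the toy rule-sets are hypothetical names rather than anything from DELVAUX.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence

# A rule-set is modeled as any callable mapping an example to a class label (assumption).
Classifier = Callable[[dict], Hashable]


def n_version_predict(rule_sets: Sequence[Classifier], example: dict) -> Hashable:
    """Combine several rule-sets by plurality vote (assumed scheme, not the paper's)."""
    votes = Counter(rs(example) for rs in rule_sets)
    # most_common keeps first-seen order among equal counts,
    # so a tie goes to the label voted for by the earliest rule-set.
    return votes.most_common(1)[0][0]


# Three toy rule-sets over hypothetical features "f" and "g".
rs_a = lambda x: "pos" if x["f"] > 0 else "neg"
rs_b = lambda x: "pos" if x["g"] > 0 else "neg"
rs_c = lambda x: "neg"

print(n_version_predict([rs_a, rs_b, rs_c], {"f": 1, "g": 2}))
```

A vote of this kind lets a rule-set that overfits the training examples be outvoted by the others, which is one way multiple versions could help with the two problems the abstract names.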

A GA-based Inductive Learning System for Extracting the PROSPECTOR's Classification Rules (프러스펙터의 분류 규칙 습득을 위한 유전자 알고리즘 기반 귀납적 학습 시스템)

Kim, Yeong-Jun
- Journal of KIISE:Software and Applications / v.28 no.11 / pp.822-832 / 2001
We have implemented an inductive learning system that learns PROSPECTOR-rule-style classification rules from sets of examples. In our approach, a genetic algorithm is used in which a population consists of rule-sets, and rule-sets generate offspring by exchanging rules through genetic operators such as crossover, mutation, and inversion. In this paper, we describe our learning environment, centering on the syntactic structure and meaning of classification rules, the structure of a population, and the implementation of the genetic operators. We also present a method for evaluating the performance of rules and a heuristic approach to generating rules, both developed to implement mutation operators more efficiently. In addition, we explain a method for constructing a classification system from multiple learned rule-sets to enhance its performance. The performance of our learning system is compared with that of other learning algorithms, such as neural networks and decision tree algorithms, on various data sets.
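As a rough illustration of rule-sets producing offspring by exchanging rules, the sketch below applies a one-point crossover to two lists of rules. The abstract does not specify the operator's details, so the cut-point scheme, the `crossover` function, and the string placeholders standing in for rules are assumptions.

```python
import random
from typing import List, Tuple


def crossover(parent_a: List[str], parent_b: List[str],
              rng: random.Random) -> Tuple[List[str], List[str]]:
    """One-point crossover on rule-sets (assumed operator; rules are opaque strings here)."""
    cut_a = rng.randint(0, len(parent_a))  # inclusive bounds, so a child may take all or none
    cut_b = rng.randint(0, len(parent_b))
    child_1 = parent_a[:cut_a] + parent_b[cut_b:]
    child_2 = parent_b[:cut_b] + parent_a[cut_a:]
    return child_1, child_2


# Placeholder rule names; a real system would store structured rules.
a = ["rule_a1", "rule_a2", "rule_a3"]
b = ["rule_b1", "rule_b2"]
c1, c2 = crossover(a, b, random.Random(0))
print(c1, c2)
```

Every rule of the two parents ends up in exactly one child, so this operator recombines existing rules without creating new ones; creating new rules would be the job of a mutation operator.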

A New Rule-Generation Algorithm (새로운 규칙 생성 알고리즘)

Kim Sang-kwi;Yoon Chung-hwa
- Proceedings of the Korean Information Science Society Conference / 2005.11b / pp.721-723 / 2005
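MBR (Memory Based Reasoning), a technique widely used for pattern classification, computes the distance between the training patterns stored in memory and a test pattern and assigns the test pattern to the class of the nearest training pattern. As a result, it cannot explain the criteria by which a test pattern is classified. This paper uses the RPA (Recursive Partition Averaging) technique to generate IF-THEN rules that can explain the classification criteria. It also proposes a rule pruning algorithm that removes unnecessary conditions to improve the generalization performance of the generated rules, and an incremental rule extraction algorithm that can reduce the number of rules generated.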
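The MBR classification step described in this abstract (assign the class of the nearest stored training pattern) can be sketched in a few lines of Python. Euclidean distance is an assumption, since the abstract only says "distance", and `mbr_classify` and the toy patterns are illustrative names and data.

```python
import math
from typing import List, Sequence, Tuple


def mbr_classify(memory: List[Tuple[Sequence[float], str]],
                 query: Sequence[float]) -> str:
    """Return the class of the stored pattern closest to `query` (Euclidean distance assumed)."""
    _, label = min(memory, key=lambda item: math.dist(item[0], query))
    return label


# Toy training patterns held in memory: (feature vector, class label).
memory = [((0.0, 0.0), "A"), ((5.0, 5.0), "B")]
print(mbr_classify(memory, (1.0, 0.5)))
```

The function returns only a label, with no statement of which feature values led to it, which is the lack of explanation that the abstract's IF-THEN rules are meant to address.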

Dependency Structure Analysis System for Korean Using Automatically Acquired Transformation Rules (변환 규칙 학습기를 이용한 한국어 의존 구조 분석기)

Lee, Song-Wook;Seo, Jung-Yun
- Annual Conference on Human and Language Technology / 1997.10a / pp.360-363 / 1997
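To analyze Korean dependency structure by directly using the linguistic rules contained in a corpus, the proposed Korean dependency structure analyzer uses a transformation rule learner to learn rules automatically from a corpus annotated with dependency structures, and then analyzes Korean dependency structure by applying those rules. To this end, we studied rule templates suited to Korean dependency structure, which differ from the rule templates of the previously studied phrase structure grammars, and devised an operation that prevents the crossing structures that can arise in a dependency structure.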
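To make "crossing structure" concrete, the sketch below checks whether any two dependency arcs in a sentence cross. It only detects crossings; the operation the paper devised to prevent them is not described in the abstract and is not reproduced here. The head-array encoding and the function names are assumptions.

```python
from typing import List, Tuple


def arcs_cross(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True if the two arcs interleave, i.e. exactly one endpoint of b lies strictly inside a."""
    (l1, r1), (l2, r2) = sorted(a), sorted(b)
    return l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1


def has_crossing(heads: List[int]) -> bool:
    """heads[i] is the index of the head of word i, or -1 for the root (assumed encoding)."""
    arcs = [(i, h) for i, h in enumerate(heads) if h >= 0]
    return any(arcs_cross(arcs[i], arcs[j])
               for i in range(len(arcs))
               for j in range(i + 1, len(arcs)))


print(has_crossing([1, 3, 3, -1]))  # arcs 0-1, 1-3, 2-3
print(has_crossing([2, 3, -1, 2]))  # arcs 0-2 and 1-3 interleave
```

A rule-application step could call a check like this before committing a new arc and reject the arc if it would cross an existing one.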

A Study on Generation of Adaptive Rule Base and its Dynamic Application (적응하는 규칙베이스의 생성 및 이의 동적 활용에 관한 연구)

조선영
- Journal of the Korean Institute of Intelligent Systems / v.4 no.1 / pp.50-63 / 1994
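Most existing knowledge-based systems represent and process their knowledge as rules. These rules are generally supplied from outside by people, and the given rules change form as learning proceeds. However, most real-world tasks are not carried out by a fixed, limited number of given rules alone; rather, the number of rules and their scope of application change dynamically through repeated execution or incremental learning. This paper proposes a system that discovers relationships among data obtained from outside, generates new rules from those relationships, and actively changes the number and scope of its rules as learning continues. To verify the usefulness of the dynamic rule generation system, we apply it to the problem of making the free end of a rod of three connected segments reach a desired position while the other end is fixed, and show that it can contribute to the automatic control of robot arms and to the automation of machine learning.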
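The test problem in this abstract, a rod of three connected segments with one end fixed, can be made concrete with a short forward-kinematics sketch. It only computes where the free end is and how far it is from a target; it does not implement the paper's rule generation. A planar arm, relative joint angles in radians, and the function names are all assumptions.

```python
import math
from typing import Sequence, Tuple


def end_position(lengths: Sequence[float], angles: Sequence[float]) -> Tuple[float, float]:
    """Free-end (x, y) of a planar chain anchored at the origin.

    Each angle is measured relative to the previous segment (assumed convention).
    """
    x = y = total = 0.0
    for length, angle in zip(lengths, angles):
        total += angle                  # absolute direction of this segment
        x += length * math.cos(total)
        y += length * math.sin(total)
    return x, y


def distance_to_target(lengths: Sequence[float], angles: Sequence[float],
                       target: Tuple[float, float]) -> float:
    """Distance between the free end and the desired position."""
    x, y = end_position(lengths, angles)
    return math.hypot(x - target[0], y - target[1])


print(end_position([1.0, 1.0, 1.0], [0.0, 0.0, 0.0]))
print(distance_to_target([1.0, 1.0, 1.0], [math.pi / 2, 0.0, 0.0], (0.0, 3.0)))
```

A learning system for this task would adjust the three angles to drive the distance toward zero, so this distance is a natural error signal for evaluating any generated rule.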

Learning Syntactic Constraints for Improving the Efficiency of Korean Parsing (한국어 구문분석의 효율성을 개선하기 위한 구문제약규칙의 학습)

Park, So-Young;Kwak, Yong-Jae;Chung, Hoo-Jung;Hwang, Young-Sook;Rim, Hae-Chang
- Journal of KIISE:Software and Applications / v.29 no.10 / pp.755-765 / 2002
In this paper, we examine various kinds of syntactic information for Korean parsing and propose a method that learns constraints and uses them to improve the efficiency of a parsing model. The proposed method has three characteristics. First, it improves parsing efficiency because the constraints prevent the parser from generating unsuitable candidates. Second, it is robust for a given Korean sentence because the attributes for the constraints are selected based on the syntactic and lexical idiosyncrasies of Korean. Third, constraints are easy to acquire automatically from a treebank using a decision tree learning algorithm. The experimental results show that the parser using the acquired constraints reduces the number of overgenerated candidates to 1/2 to 1/3 of the candidates and runs 2 to 3 times faster than the parser without constraints.

Rule Generation by Search Space Division Learning Method using Genetic Algorithms (유전자알고리즘을 이용한 탐색공간분할 학습방법에 의한 규칙 생성)

Jang, Su-Hyun;Yoon, Byung-Joo
- The Transactions of the Korea Information Processing Society / v.5 no.11 / pp.2897-2907 / 1998
Generating production rules from training examples is a hard problem with a large search space and many locally optimal solutions. Many learning methods have been proposed for production-rule generation, and genetic algorithms are one alternative. However, traditional genetic algorithms are known to have difficulty converging to the region of the global solution, and the production rules they generate show poor efficiency. In this paper, we propose a production-rule generation method that uses genetic algorithm learning. By analyzing the optimal sub-solutions captured by genetic algorithm learning, our method takes advantage of their schema structure and thus generates a relatively small rule set.

TAKTAG: Two phase learning method for hybrid statistical/rule-based part-of-speech disambiguation (TAKTAG: 통계와 규칙에 기반한 2단계 학습을 통한 품사 중의성 해결)

Shin, Sang-Hyun;Lee, Geun-Bae;Lee, Jong-Hyeok
- Annual Conference on Human and Language Technology / 1995.10a / pp.169-174 / 1995
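Part-of-speech tagging removes the ambiguity that remains after morphological analysis, and both statistical and rule-based methods are widely used for it. However, each approach has its limitations. The Hidden Markov Model, a statistical method, is flexible, but for Korean, an agglutinative language, its limited window sometimes prevents it from properly referring to the words or parts of speech that provide the clues for disambiguation. Rule-based methods, on the other hand, depend on the parts of speech themselves and therefore do not offer flexibility or accuracy for new tagsets or languages. To overcome the limitations of these two approaches, this paper proposes a Korean tagging model that integrates statistics and rules. After a statistical model is built through statistical learning, rules are learned automatically in a second stage, generating rules for the cases the statistical model cannot handle. By passing through these two stages of automatic learning, statistical and then rule-based, we were able to develop a highly accurate Korean tagger that compensates for the weaknesses of both models.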

