• Title/Summary/Keyword: subcategorization

The Strength of the Relationship between Semantic Similarity and the Subcategorization Frames of the English Verbs: a Stochastic Test based on the ICE-GB and WordNet (영어 동사의 의미적 유사도와 논항 선택 사이의 연관성 : ICE-GB와 WordNet을 이용한 통계적 검증)

  • Song, Sang-Houn; Choe, Jae-Woong
    • Language and Information / v.14 no.1 / pp.113-144 / 2010
  • The primary goal of this paper is to find a feasible way to answer the question: does similarity in meaning between verbs relate to similarity in their subcategorization? To answer this question concretely for a large set of English verbs, the study draws on several language resources, tools, and statistical methods. We first compiled a list of 678 verbs selected from the most frequent and second most frequent word lists of the Collins Cobuild English Dictionary that also appear in WordNet 3.0. We calculated similarity measures between all pairs of these verbs using the 'jcn' algorithm (Jiang and Conrath, 1997) as implemented in the WordNet::Similarity module (Pedersen, Patwardhan, and Michelizzi, 2004). Clustering followed: we built similarity matrices from the similarity values, drew dendrograms from the matrices, and finally obtained 177 meaningful clusters (covering 437 verbs) that passed a threshold set by z-score. The subcategorization frames and their frequencies were taken from the ICE-GB. To calculate the Selectional Preference Strength (SPS) of the relationship between a verb and its subcategorization frames, we relied on the Kullback-Leibler divergence model (Resnik, 1996). The SPS values of the verbs in the same cluster were compared with each other, yielding statistics that indicate how much the SPS values overlap between the subcategorization frames of the verbs. Our final analysis shows that the degree of overlap, that is, the strength of the relationship between semantic similarity and the subcategorization frames of English verbs, is spread evenly from 'very strongly related' to 'very weakly related'. Some semantically similar verbs share much of their subcategorization frames, others show an average degree of strength in the relationship, and still others, though semantically similar, tend to share little in their subcategorization frames.
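
The Kullback-Leibler formulation of SPS mentioned in the abstract above can be illustrated with a minimal sketch. It assumes SPS is computed as the divergence between a verb's frame distribution, P(frame | verb), and the overall frame distribution, P(frame), following Resnik (1996). The function name, the frame labels, and the counts are hypothetical and are not taken from the paper or from the ICE-GB.

```python
import math


def selectional_preference_strength(verb_frame_counts, all_frame_counts):
    """KL divergence (in bits) between P(frame | verb) and P(frame).

    Both arguments map a frame label to a raw frequency count.
    Every frame seen with the verb must also occur in all_frame_counts.
    """
    verb_total = sum(verb_frame_counts.values())
    all_total = sum(all_frame_counts.values())
    sps = 0.0
    for frame, count in verb_frame_counts.items():
        if count == 0:
            continue  # a zero-probability term contributes nothing to the sum
        p_frame_given_verb = count / verb_total
        p_frame = all_frame_counts[frame] / all_total
        sps += p_frame_given_verb * math.log2(p_frame_given_verb / p_frame)
    return sps


# Hypothetical toy counts: two frames that are equally common overall.
corpus_counts = {"intransitive": 50, "transitive": 50}
picky_verb = {"transitive": 10}                        # only ever transitive
neutral_verb = {"intransitive": 5, "transitive": 5}    # mirrors the corpus

picky_sps = selectional_preference_strength(picky_verb, corpus_counts)
neutral_sps = selectional_preference_strength(neutral_verb, corpus_counts)
```

In this toy setting a verb whose frame distribution matches the corpus-wide distribution gets an SPS of 0, while a verb restricted to a single frame gets a positive value; how the paper compares SPS values within a cluster is not specified here.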

A Design of Korean Language Parsing based on Subcategorization (하위범주화에 의한 한국어 파싱 설계)

  • Lee, Ho-Suk
    • Proceedings of the Korean Information Science Society Conference / 2008.06c / pp.242-247 / 2008
  • This paper discusses a design for Korean language parsing based on subcategorization. First, we discuss important elements of Korean grammar such as syntactic categories, josa, omi-conjugation, syntactic affixes, and dependent nouns, and then discuss subcategorization and expression patterns. Next, we show the basic structure of the Korean parsing process. The first stage scans the input sentence and processes articles, noun phrases, numerals, josa, affixes, dependent nouns, adjectives, omi-conjugation, adverbs, and auxiliary verbs. The second stage deals with subcategorization patterns and expression patterns. The third stage processes clauses, and the fourth stage deals with SEA (Sentence Ending + Auxiliary).

Case Ambiguity Resolution using Thesaurus and Subcategorization Information (시소러스와 하위범주화 사전을 이용한 격모호성 해결)

  • Yang, Jae-Hyeong; Sim, Gwang-Seop
    • Journal of KIISE:Software and Applications / v.26 no.9 / pp.1132-1140 / 1999
  • An algorithm is proposed for resolving the case ambiguity caused by auxiliary postpositions in Korean. The algorithm draws on syntactic and semantic knowledge: a subcategorization dictionary of predicates, selectional restrictions between a predicate and the nominals that serve as its complements, and a thesaurus that provides semantic information about nominals, together with a small set of heuristic rules. It is well suited to Korean, where required complements are often omitted. The algorithm performed satisfactorily in an experiment using a medium-sized subcategorization dictionary and thesaurus.

A Design & Implementation of Korean Parser using Subcategorization: I (하위범주화에 의한 한국어 파서의 설계와 구현 : I)

  • Lee, Ho Suk
    • Annual Conference on Human and Language Technology / 2008.10a / pp.1-4 / 2008
  • We present and discuss a Korean language parser based on dependency grammar, subcategorization, and the analysis of viable postfixes such as josa and omi. We employ an extended form of BNF (Backus-Naur Form) to define the dependency grammar and the form of subcategorization. We present the conceptual form of the Korean language parser in a C program style. We discuss the structure of the Korean parser as currently implemented and show the execution results.

Constructing a Korean Subcategorization Dictionary with Semantic Roles using Thesaurus and Predicate Patterns (시소러스와 술어 패턴을 이용한 의미역 부착 한국어 하위범주화 사전의 구축)

  • Yang, Seung-Hyun; Kim, Young-Sum; Woo, Yo-Sub; Yoon, Deok-Ho
    • Journal of KIISE:Computing Practices and Letters / v.6 no.3 / pp.364-372 / 2000
  • Subcategorization, which defines the dependency relations between predicates and their complements, is an important source of knowledge for resolving the syntactic and semantic ambiguities that arise in sentence analysis. This paper describes a Korean subcategorization dictionary annotated with the semantic roles of complements, coupled with a thesaural semantic hierarchy as well as syntactic dependencies. For role annotation, we defined 25 semantic roles associated with surface case markers, which can be used to derive semantic structures directly from syntactic ones. In addition, we used more than 120,000 thesaurus entries to specify the concept markers of noun complements, and used 47 predicate patterns for verbs and 17 for adjectives to express the dependency relations between predicates and their complements. Using a full-fledged thesaurus to specify concept markers makes it possible to build an effective selectional restriction mechanism coupled with the subcategorization dictionary, and using standard predicate patterns to specify dependency relations makes it possible to avoid inconsistency in the results and to reduce the cost of constructing the dictionary. On this basis, we built a Korean subcategorization dictionary for 13,000 frequently used predicates found in corpora, with the aid of a tool specially designed to support this task. An experimental result shows that this dictionary can provide 72.7% of the predicates in corpora with appropriate subcategorization information.

Dynamic Expansion of Semantic Dictionary for Topic Extraction in Automatic Summarization (자동요약의 주제어 추출을 위한 의미사전의 동적 확장)

  • Choo, Kyo-Nam; Woo, Yo-Seob
    • Journal of IKEEE / v.13 no.2 / pp.241-247 / 2009
  • This paper proposes methods for expanding a semantic dictionary that take Korean semantic features into account. These methods are intended for extracting practical topic words in automatic summarization. The first method constructs a synonym dictionary to improve the performance of semantic-marker analysis. The second extracts probabilistic information from the subcategorization dictionary to resolve syntactic and semantic ambiguity. The third predicts the subcategorization patterns of unregistered predicates in order to resolve affix-derived predicates.

A study on vocabulary instruction to improve English communicative competence: Focus on English verbs (의사소통 능력을 높여주는 어휘 지도에 대한 연구: 동사를 중심으로)

  • Kim, Bu-Ja
    • English Language & Literature Teaching / v.12 no.1 / pp.131-158 / 2006
  • The purpose of the present study is to explore an effective way of teaching English vocabulary that is geared toward improving students' English communicative competence. The study focuses on English verbs, which are followed by particular patterns according to their subcategorization. Learning a verb must include learning its patterns as well as its meaning in order to improve the ability to use verbs receptively and productively, that is, communicative competence. On the basis of the language progression proposed by Willis (2003), a teaching strategy that helps learners learn English verb patterns effectively and systematically was proposed, and its effect was investigated. The subjects in the experimental group, who learned English verb patterns intentionally through the proposed teaching strategy, significantly improved their ability to use the patterns receptively and productively. This result shows that a teaching strategy including improvisation, recognition, rehearsal, system building, exploration, and consolidation helps improve communicative competence.

Korean Semantic Role Labeling Using Case Frame Dictionary and Subcategorization (격틀 사전과 하위 범주 정보를 이용한 한국어 의미역 결정)

  • Kim, Wan-Su; Ock, Cheol-Young
    • Journal of KIISE / v.43 no.12 / pp.1376-1384 / 2016
  • To process sentences as human beings do, computers need the capability to analyze and process the full range of human expression. Linguistic information processing thus forms the initial basis. When analyzing a sentence syntactically, it is necessary to divide the sentence into components, find the obligatory arguments of each predicate, identify the sentence core, and understand the semantic relations between the arguments and predicates. In this study, we applied a case frame dictionary based on The Korean Standard Dictionary of The National Institute of the Korean Language; in addition, for semantic role labeling we used a CRF model with subcategorization information for predicates, built from the Korean Lexical Semantic Network (UWordMap), as features. Semantic roles were tagged automatically by the CRF model, which used information on words, predicates, the case-frame dictionary, and hypernyms of words as features. This method outperformed the existing method, with an accuracy of 83.13% compared to 81.2%.

Subcategorization of Dependent Nouns for NLP (자연어 처리를 위한 의존 명사 하위 범주 분류)

  • Yu, Jae-Won
    • Annual Conference on Human and Language Technology / 1997.10a / pp.136-142 / 1997
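  • Because a dependent noun and the adnominal that modifies it form a syntactically tight linguistic unit, subcategorizing dependent nouns is important for Korean natural language processing. Existing Korean grammars, however, have not treated this problem consistently. In this paper, we attempt a consistent subcategorization of the roughly 600 dependent nouns listed in a Korean dictionary (조재수 1997), supplementing the classification criteria of 허웅 (1996). In addition, unit nouns of quantity are further subdivided according to the type of numeral that precedes them.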