• Title/Summary/Keyword: Korean parsing

Exploiting Chunking for Dependency Parsing in Korean (한국어에서 의존 구문분석을 위한 구묶음의 활용)

  • Namgoong, Young;Kim, Jae-Hoon
    • KIPS Transactions on Software and Data Engineering / v.11 no.7 / pp.291-298 / 2022
  • In this paper, we present a method for dependency parsing with chunking in Korean. Dependency parsing is the task of determining the governor of every word in a sentence. In Korean, parsers have generally determined the syntactic governor, and the resulting syntactic structure must then be transformed into a semantic structure for further processing such as semantic analysis. Deciding whether a word's governor should be the syntactic or the semantic one is a notorious problem. For example, the syntactic governor of the word "먹고 (eat)" in the sentence "밥을 먹고 싶다 (would like to eat)" is "싶다 (would like to)", which is an auxiliary verb and therefore cannot be a semantic governor. To mitigate this problem, we propose performing Korean dependency parsing after chunking, a process that segments a sentence into constituents. A constituent is a word or a group of words that functions as a single unit within a dependency structure and is called a chunk in this paper. Compared to traditional dependency parsing, the proposed method has two advantages: (1) the number of input units for parsing is reduced, so parsing can be faster; (2) the effectiveness of parsing can be improved by considering the relation between the head words of two chunks. In experiments on the Sejong dependency corpus, the UAS and LAS of the proposed method are 86.48% and 84.56%, respectively, and the number of input units is reduced by about 22%p.
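
The two scores reported in the abstract above are the standard attachment metrics: the unlabeled attachment score (UAS) counts words whose predicted governor is correct, and the labeled attachment score (LAS) additionally requires the dependency label to match. The following is a minimal sketch of how they are usually computed, not the authors' evaluation code. The function name `attachment_scores`, the label strings, and the toy prediction are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# One arc per word: (index of the governor, dependency label).
# Index 0 stands for the artificial root; words are numbered from 1.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as fractions in [0, 1] for one or more sentences."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and equally long")
    # UAS: the predicted governor matches, regardless of the label.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS: both the governor and the label match.
    las_hits = sum(g == p for g, p in zip(gold, pred))
    return uas_hits / len(gold), las_hits / len(gold)


# Toy example for "밥을 먹고 싶다" with syntactic governors:
# 밥을 -> 먹고, 먹고 -> 싶다, 싶다 -> root. Labels are illustrative only.
gold = [(2, "NP_OBJ"), (3, "VP"), (0, "VP")]
# Hypothetical parser output: all governors right, one label wrong.
pred = [(2, "NP_SBJ"), (3, "VP"), (0, "VP")]

uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2%}, LAS = {las:.2%}")
```

In this toy case every governor is correct but one label is not, so LAS falls below UAS, which mirrors the gap between the two figures in the abstract.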

Structural Disambiguation of Korean Adverbs Based on Correlative Relation and Morphological Context

  • Seo, Young-Ae;Park, Sang-Kyu;Choi, Key-Sun
    • ETRI Journal / v.28 no.6 / pp.803-806 / 2006
  • This letter presents a structural disambiguation method for Korean adverbs based on the correlative relation constraints between adverbs and their modifiees and on the morphological context information of sentences. Using the proposed method, we improved the dependency parsing accuracy of adverbs from 79.2% to 89%. The experimental results show that the proposed method is especially effective at parsing adverbs that can modify multiple word classes or that have a long-distance dependency relation to their modifiees.

A Conditional Unification Based Parsing for Korean Using Sentence-Type Information (문장 형태 정보를 이용한 조건단일화 기반 한국어 파싱)

  • Yang Seungweon
    • Journal of Korea Society of Industrial Information Systems / v.9 no.4 / pp.1-7 / 2004
  • In this paper, we introduce a parsing method that uses postposition information in Korean to obtain an accurate parse tree. To implement this method, we classified the categories of predicates and defined sentence types based on these categories. The parser identifies the grammatical roles of the noun phrases that must be present in each sentence type. The parser control mechanism uses heuristics based on linguistic frames, and the analysis is implemented with conditional unification. The proposed method reduces ambiguity because it helps prune unnecessary branches.

Robust Syntactic Annotation of Corpora and Memory-Based Parsing

  • Hinrichs, Erhard W.
    • Proceedings of the Korean Society for Language and Information Conference / 2002.02a / pp.1-1 / 2002
  • This talk provides an overview of current work in my research group on the syntactic annotation of the Tübingen corpus of spoken German and of the German Reference Corpus (Deutsches Referenzkorpus: DEREKO) of written texts. Morpho-syntactic and syntactic annotation, as well as annotation of function-argument structure, is performed automatically for these corpora by a hybrid architecture that combines robust symbolic parsing using finite-state methods ("chunk parsing" in the sense of Abney) with memory-based parsing (in the sense of Daelemans). The resulting robust annotations can be used by theoretical linguists, who are interested in large-scale empirical data, and by computational linguists, who need training material for a wide range of language technology applications. To aid retrieval of annotated trees from the treebank, a query tool, VIQTORYA, with a graphical user interface and a logic-based query language has been developed. VIQTORYA allows users to query the treebanks for linguistic structures at the word level, at the level of individual phrases, and at the clausal level.

Biaffine Dependency Parser for Korean (Biaffine 한국어 의존파서)

  • Shadikhodjaev, Uygun;Min, Tae Hong;Youn, Junyoung;Lee, Jae Sung
    • Annual Conference on Human and Language Technology / 2018.10a / pp.678-681 / 2018
  • Dependency parsing is an important task in natural language processing whose results are used in many downstream tasks, such as machine translation, information retrieval, relation extraction, and question answering. Most of the dependency parsing literature focuses on using end-to-end and sequence-to-sequence neural architectures as the core of the system. One such system, the Biaffine dependency parser, is explored in this paper for effective dependency parsing of Korean.

Transition and Parsing State and Incrementality in Dynamic Syntax

  • Kobayashi, Masahiro;Yoshimoto, Kei
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.249-258 / 2007
  • This paper presents an implementation of a grammar of Dynamic Syntax for Japanese. Dynamic Syntax is a grammar formalism that enables a parser to process a sentence incrementally while building its semantic representation. Currently, lexical rules and transition rules in Dynamic Syntax are applied arbitrarily, which leads to inefficient parsing. This paper provides a rule application algorithm and a partitioned parsing state for efficient parsing, with special reference to processing Japanese, a head-final language. At the present stage the parser is still small, but it can parse scrambled sentences, relative clause constructions, and embedded clauses. The parser is written in Prolog, and this paper shows that it can process null arguments in complex Japanese sentences.

Constructing a Single State Parsing Automaton (단일 상태 파싱 오토마톤의 생성)

  • Lee, Gyung-Ok
    • Journal of KIISE:Software and Applications / v.35 no.11 / pp.701-704 / 2008
  • A general automaton allows multiple incoming transitions to a state, so special treatment is required when the history of transitions is needed. An LR automaton keeps past transitions on a stack so that they can be used during parsing. If, on the other hand, each state of an automaton itself contains the past transition history, the overhead of tracing past transitions is unnecessary. This paper proposes a single state parsing automaton that does not depend on past transitions. The applicable grammar class is smaller than the class of LR grammars, but because each state in the new automaton contains the past information, the history does not need to be traced as it does in an LR automaton.

Syntactic Category Prediction for Improving Parsing Accuracy in English-Korean Machine Translation (영한 기계번역에서 구문 분석 정확성 향상을 위한 구문 범주 예측)

  • Kim Sung-Dong
    • The KIPS Transactions:PartB / v.13B no.3 s.106 / pp.345-352 / 2006
  • A practical English-Korean machine translation system should be able to translate long sentences quickly and accurately. An intra-sentence segmentation method has previously been proposed and has contributed to speeding up syntactic analysis. This paper proposes a syntactic category prediction method that uses decision trees to obtain accurate parsing results. In parsing with segmentation, each segment is parsed separately and the results are combined to generate the sentence structure. Syntactic category prediction helps select more accurate analysis structures after the partial parsing and can thus improve parsing accuracy. We construct features for predicting syntactic categories from the parsed Wall Street Journal corpus and generate decision trees. In the experiments, we compare the performance with predictions made by human-built rules, trigram probabilities, and neural networks. We also show how much the category prediction contributes to improving translation quality.

An Improved Incremental LL(1) Parsing Method (개선된 점진적 LL(1) 파싱 방법)

  • Lee, Gyung-Ok
    • Journal of KIISE:Software and Applications / v.37 no.6 / pp.486-490 / 2010
  • Incremental parsing has been studied with the aim of reusing the parse result of the original string when parsing a new string. This paper proposes an improvement to a previous incremental LL(1) parser that uses nonterminal lookahead symbols. The previous method is time-inefficient because it repeatedly performs unnecessary steps when an error occurs. This paper presents a solution to that problem.