• Title/Summary/Keyword: syntactic analysis


Determination of Thematic Roles according to Syntactic Relations Using Rules and Statistical Models in Korean Language Processing (한국어 전산처리에서 규칙과 확률을 이용한 구문관계에 따른 의미역 결정)

  • 강신재;박정혜
    • Journal of Korea Society of Industrial Information Systems / v.8 no.1 / pp.33-42 / 2003
  • This paper presents an efficient method for determining thematic roles from syntactic relations using rules and a statistical model in Korean language processing. This process is one of the core tasks of semantic analysis and an important open issue in natural language processing. Writing rules for thematic role determination from general linguistic knowledge and experience alone is problematic: the final result may differ according to the subjective views of researchers, and it is impossible to construct rules that cover all cases. Our hybrid method, in contrast, is objective and efficient because it draws on large corpora, which contain practical usages of Korean, and on case frames in the Sejong Electronic Lexicon of Korean, which is being developed by dozens of Korean linguistic researchers. To determine thematic roles more accurately, our system uses syntactic relations, semantic classes, morpheme information, and the position of double subjects. The use of semantic classes in particular increases the applicability of our system.


An Abstraction Method for State Minimization based on Syntactic and Semantic Patterns in the Execution Space of Real-Time Systems (실시간 시스템의 실행 공간상에서 구문 및 의미패턴에 기반한 상태 최소화를 위한 추상화 방법)

  • 박지연;조기환;이문근
    • Journal of KIISE:Software and Applications / v.30 no.1_2 / pp.103-116 / 2003
  • State explosion, caused by composing the spaces of data, temporal, and locational values, is a well-known critical problem that makes it difficult to understand and analyse real-time systems specified with state-based formal methods. To overcome this problem, this paper presents an abstraction method for state minimization based on an abstraction in system specification and an abstraction in system execution. The first is named the syntactic abstraction, through which the patterns of the unconditionally internalized computation and the repetition and selection structures are abstracted. The latter is named the semantic abstraction, through which the patterns of the execution space represented with data are abstracted. Through the abstractions, the components of a system in the specification and execution models are hierarchically organized. The system can be analyzed briefly at the upper level in a skeletal manner with low complexity. The system, however, can be [...] abstraction method for state minimization and the decrease in analysis complexity through the abstraction, with examples.

A Comparative Study on Korean Connective Morpheme '-myenseo' to the Chinese expression - based on Korean-Chinese parallel corpus (한국어 연결어미 '-면서'와 중국어 대응표현의 대조연구 -한·중 병렬 말뭉치를 기반으로)

  • YI, CHAO
    • Cross-Cultural Studies / v.37 / pp.309-334 / 2014
  • Based on a Korean-Chinese parallel corpus, this study contrasts the Korean connective morpheme '-myenseo' with its corresponding Chinese expressions. Learners of Korean often struggle with connective morphemes, especially when there is a lexical gap with their mother tongue. '-myenseo' is one of the most frequently used Korean connective morphemes, and it is usually matched with a Chinese coordinating conjunction. According to the corpus, however, the Chinese expressions corresponding to '-myenseo' go beyond coordinating conjunctions. Because the parallel corpus reveals this variety of Chinese expressions related to '-myenseo', the study can help Chinese learners of Korean learn '-myenseo' more easily. The study first discusses the semantic features and syntactic characteristics of '-myenseo'. Its significant semantic features are 'simultaneous' and 'conflict', and the study uses usage examples to analyse its specific uses. It then analyses the syntactic characteristics of '-myenseo' in terms of subject, predicate, temporal, mood, and negation constraints, and summarizes them in a table. The most important part of the study is Chapter 4, which contrasts '-myenseo' with Chinese expressions by analysing the Korean-Chinese parallel corpus. As a result of the analysis, the frequencies of the Chinese expressions corresponding to '-myenseo' are summarized in a table. The table shows that the most common Chinese counterpart of '-myenseo' is the non-marker pattern. That is, Korean can connect clauses with a connective morpheme that serves as an explicit linguistic marker, whereas Chinese often connects them through their intrinsic logical relationships. The chapter therefore concludes that '-myenseo' can correspond to Chinese conjunctions, expressions, non-marker patterns, and liberal translation patterns, a wider range than the Chinese conjunctions identified previously. The last chapter concludes the study by summarizing it and suggesting its limitations and directions for future research.

    English Predicate Inversion: Towards Data-driven Learning

    • Kim, Jong-Bok;Kim, Jin-Young
      • Journal of English Language & Literature / v.56 no.6 / pp.1047-1065 / 2010
    • English inversion constructions are not only hard for non-native speakers to learn but also difficult to teach, mainly because of their intriguing grammatical and discourse properties. This paper addresses grammatical issues in learning or teaching the so-called 'predicate inversion (PI)' construction (e.g., "Equally important in terms of forest depletion is the continuous logging of the forests"). In particular, we chart the grammatical (distributional, syntactic, semantic, pragmatic) properties of the PI construction and argue for data-driven teaching of English grammar. To depart from the armchair style of grammar teaching, which relies on simple author-made sentences, our teaching method introduces data-driven teaching. In a grammar-related class, a total of 25 university students together analyzed the British Component of the International Corpus of English (ICE-GB), which contains about one million words distributed across a variety of textual categories. We identified a total of 290 PI sentences (206 from spoken and 87 from written texts). The preposed syntactic categories of the PI involve five main types: AdvP, PP, VP(ed/ing), NP, AP, and so, all of which function as the complement of the copula. In terms of discourse, our observations support Birner and Ward's (1998) observation that these preposed phrases represent more familiar information than the postposed subject. The corpus examples gave us three possible types. The preposed element can be discourse-old whereas the postposed one is discourse-new, as in "Putting wire mesh over a few bricks is a good idea." Both preposed and postposed elements can also be discourse-new, as in "But a fly in the ointment is inflation." These two elements can also both be discourse-old, as in "Racing with him on the near-side is Rinus." The dominant occurrence of the PI in the spoken texts also supports the view that balance (or scene-setting) in information structure is the main trigger for the use of the PI construction. After being exposed to the real data and to an in-depth syntactic and information-structure analysis of the PI construction, the students proved to have a far clearer understanding of the construction in question and realized that grammar does not stand on its own but interacts tightly with other important grammatical components such as information structure. The study directs us toward grammar teaching that is both data-driven and interactive.

    Noun and affix extraction using conjunctive information (결합정보를 이용한 명사 및 접사 추출)

    • 서창덕;박인칠
      • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.5 / pp.71-81 / 1997
    • This paper proposes noun and affix extraction methods that use conjunctive information to build an automatic indexing system through morphological and syntactic analysis. The Korean language has a distinctive word-spacing rule, different from that of other languages, and the conjunctive information extracted from this rule can reduce the number of multiple parts of speech at minimal cost. The proposed algorithms also solve the problem of a single word being separated by a newline character. We show the efficiency of the proposed algorithms through the process of morphological analysis.


    A Pregroup Analysis of Japanese Causatives

    • Cardinal, Kumi
      • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.96-104 / 2007
    • We explore a computational algebraic approach to grammar via pregroups. We examine how the structures of Japanese causatives can be treated in the framework of a pregroup grammar. In our grammar, the dictionary assigns one or more syntactic types to each word, and the grammar rules are used to infer types for strings of words. We developed a practical parser that implements our pregroup grammar and validates our analysis.

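The type inference described in the pregroup abstract above can be illustrated with a minimal sketch. It is not the paper's parser: the lexicon is a hypothetical English-like toy rather than the Japanese causative data, each word gets a single type, and the names `LEXICON`, `reduce_types`, and `is_sentence` are invented for illustration. A simple type is encoded as a pair `(base, z)`, where `z = 0` is the plain type, `z = -1` a left adjoint, and `z = 1` a right adjoint; the only rule applied is the contraction of a type with the next-higher adjoint of the same base.

```python
from typing import Dict, List, Tuple

# (base, z): z = 0 plain type, z = -1 left adjoint (x^l), z = 1 right adjoint (x^r).
SimpleType = Tuple[str, int]

# Hypothetical toy lexicon (English-like; NOT taken from the paper).
LEXICON: Dict[str, List[SimpleType]] = {
    "Alice": [("n", 0)],
    "Bob": [("n", 0)],
    "sleeps": [("n", 1), ("s", 0)],           # n^r s
    "sees": [("n", 1), ("s", 0), ("n", -1)],  # n^r s n^l
}


def reduce_types(types: List[SimpleType]) -> List[SimpleType]:
    """Greedily apply contractions x^(z-1) x^(z) -> 1 with a stack.

    This covers x^l x -> 1 and x x^r -> 1. Greedy reduction is a
    simplification and is not a complete pregroup parsing algorithm.
    """
    stack: List[SimpleType] = []
    for base, z in types:
        if stack and stack[-1] == (base, z - 1):
            stack.pop()  # adjacent pair contracts to the empty type
        else:
            stack.append((base, z))
    return stack


def is_sentence(words: List[str]) -> bool:
    """Return True if the concatenated word types reduce to the sentence type s."""
    types = [t for word in words for t in LEXICON[word]]
    return reduce_types(types) == [("s", 0)]


print(is_sentence(["Alice", "sees", "Bob"]))  # True
print(is_sentence(["sees", "Alice"]))         # False
```

A dictionary that assigns several types per word, as the abstract describes, would require trying each combination of type assignments before reduction.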

    A Genotypical Analysis of Korean REMCs and Generation of Base Line Data for the Analysis and Evaluation for Future (REMCs) Designs Using Space Syntax

    • Ullah, Ubaid;Park, Jae Seung
      • Journal of The Korea Institute of Healthcare Architecture / v.22 no.1 / pp.17-28 / 2016
    • Purpose: The purpose of this paper is to analyze the spatial configurations of a sample of Korean regional emergency medical centers (REMCs), to explore their underlying genotypes, and thus to produce baseline data for the analysis and evaluation of future REMC designs using space syntax theory. Methods: Space syntax analysis was used as the major tool for the analysis and exploration of genotypes. The measures of integration (overall integration with and without the exterior, as well as the integration of individual clinical spaces for each center), base difference factor (DF), and space link ratio were calculated for a sample of seven Korean REMCs. Results: The results show a strikingly similar pattern of syntactic measures across the sample. The mean integration of the sample ranges from 0.82 to 0.99 with the exterior (taking the exterior space as the root) and from 0.81 to 1.01 without the exterior (considering only the connections between interior spaces, with no outside connection). The base difference factor (DF) of the sample varies from 0.60 to 0.81 with the exterior and from 0.59 to 0.82 without the exterior. Case 1 was identified as a non-genotype, with a differing order of syntactic values. Although the genotype cases differed in form, layout, and even size, these results cannot be explained by phenotypical comparisons. Implications: This study will contribute to the configurational analysis and evaluation of existing and future Korean REMC designs and to the practice of the emergency healthcare delivery system in Korea.

    Verification of the Usefulness of the Mock TOEIC Test using Corpus Indices : Focusing on the Analysis of Difficulty and Discrimination (코퍼스 지표를 활용한 모의 토익시험의 유용성 검증 : 난이도와 변별도 분석을 중심으로)

    • Lee, Yena
      • The Journal of the Korea Contents Association / v.21 no.10 / pp.576-593 / 2021
    • To investigate the factors that affect the correct answer rate and the degree of discrimination of the TOEIC test, this study performed a regression analysis, for each part, of corpus indicators against the correct answer rate and the degree of discrimination derived from the item analysis. The basic count word_length, consistency index LSA_overlap_adjacent_sentences, lexical diversity MTLD_VOCD, conjunction All_logical_causal_connectives_incidence, situational model casual_particles_causal_verbs_Ratio, syntactic complexity Left_embeddedness, and syntactic pattern density Infinitive_density were found to have negative effects. These factors, which lower the correct answer rate, can be utilized when setting learning goals. Vocabulary diversity index MTLD_VOCD, conjunction Additive_connectives_incidence, syntactic pattern density Infinitive_density, and lexical information person1_2_pronoun_incidence were found to have a positive effect. These factors, which increase discrimination, may provide important information for developing a learning program.

    Exploiting Chunking for Dependency Parsing in Korean (한국어에서 의존 구문분석을 위한 구묶음의 활용)

    • Namgoong, Young;Kim, Jae-Hoon
      • KIPS Transactions on Software and Data Engineering / v.11 no.7 / pp.291-298 / 2022
    • In this paper, we present a method for dependency parsing with chunking in Korean. Dependency parsing is the task of determining a governor for every word in a sentence. In Korean, the syntactic governor is generally determined first, and the syntactic structure must then be transformed into a semantic structure for further processing such as semantic analysis in natural language processing. Deciding between the syntactic and the semantic governor is a notoriously difficult problem. For example, the syntactic governor of the word "먹고 (eat)" in the sentence "밥을 먹고 싶다 (would like to eat)" is "싶다 (would like to)", which is an auxiliary verb and therefore cannot be a semantic governor. To mitigate this problem, we propose Korean dependency parsing after chunking, which is the process of segmenting a sentence into constituents. A constituent is a word or a group of words that functions as a single unit within a dependency structure and is called a chunk in this paper. Compared to traditional dependency parsing, the proposed method has two advantages: (1) the number of input units in parsing is reduced, so parsing can be faster; (2) the effectiveness of parsing can be improved by considering the relation between the head words of two chunks. Through experiments on the Sejong dependency corpus, we show that the UAS and LAS of the proposed method are 86.48% and 84.56%, respectively, and that the number of input units is reduced by about 22%p.
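The two scores reported in the abstract above are presumably the standard unlabeled and labeled attachment scores (UAS and LAS). As a minimal sketch of how such scores are commonly computed, assuming each input unit is annotated with a `(head_index, relation_label)` pair: UAS counts units whose predicted head matches the gold head, and LAS additionally requires the relation label to match. The function name `attachment_scores` and the labels in the example are hypothetical and are not taken from the paper or from the Sejong corpus tag set.

```python
from typing import List, Tuple

# One arc per input unit: (index of the governor, relation label); 0 marks the root.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) for one sentence or for a concatenated corpus."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must have the same length")
    if not gold:
        raise ValueError("cannot score an empty analysis")
    # UAS: the head alone must match.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS: both the head and the label must match.
    labeled_hits = sum(g == p for g, p in zip(gold, pred))
    total = len(gold)
    return head_hits / total, labeled_hits / total


# Hypothetical four-unit example with made-up labels.
gold = [(2, "SBJ"), (0, "ROOT"), (4, "MOD"), (2, "OBJ")]
pred = [(2, "SBJ"), (0, "ROOT"), (4, "OBJ"), (3, "OBJ")]
print(attachment_scores(gold, pred))  # (0.75, 0.5)
```

When parsing operates on chunks rather than words, the same computation applies with chunks as the input units, which is one reason a reduced number of input units changes what the scores are averaged over.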

    Korean Character processing: Part I. Theoretical Foundation (한글문자의 컴퓨터 처리: I. 이론)

    • 정원량
      • Journal of the Korean Institute of Telematics and Electronics / v.16 no.3 / pp.1-8 / 1979
    • This is Part I of a two-part article on Korean character processing by a computer. In Part I, the problems in Korean character processing are identified and a theoretical foundation is laid out as a viable solution to them. The one- and two-dimensional syntactic structures of Korean characters are formally defined by means of BNF and "Patternal structure", respectively. Lexical and syntactic algorithms for character conversion are discussed formally. This character conversion algorithm is applicable to both input and output. For device independence and implementation independence, the concept of a "cardinal symbol set" is introduced. A historical survey of Korean character processing and a discussion of implementation problems for the above algorithm will be presented in Part II.

