• Title/Abstract/Keywords: the syntactic rules

78 search results (processing time: 0.025 seconds)

A Study on Some Types of Separable Syntactic Atoms in Korean

  • 이호승
    • 비교문화연구
    • /
    • Vol. 38
    • /
    • pp.433-459
    • /
    • 2015
  • This paper aims at a better understanding of the concept of the Korean separable syntactic atom, whose inner parts are separable in syntax, and examines whether this concept applies to derivatives, functional complex constructions, and idiomatic expressions in Korean. I define a syntactic atom as a minimal unit that is drawn directly from the lexicon and then subjected to syntactic rules. I argue that the so-called 'lexical island constraint' has some problems and that syntactic rules can apply to the inner parts of a syntactic atom, provided those rules are irrelevant to the formation of new syntactic atoms. Most derivatives are non-separable syntactic atoms, but derivatives such as '반짝거리다', '죄송스럽다', and '칭얼대다' are separable syntactic atoms. Their degree of separability differs with respect to the insertion of Korean particles or negative adverbs and the omission of the root of the syntactic atom. 'X-적' derivatives whose roots are regular nominal roots permit a syntactic link between roots and the syntactic combination of the root with its argument; derivatives of this kind are separable syntactic atoms. 'Bracket paradox' derivatives and 'X-답-' derivatives are also separable syntactic atoms. Not all functional complex constructions are separable syntactic atoms: depending on the degree of grammaticalization, the inner parts of some are separable and those of others are not. Separable functional complex constructions permit only the switching of endings or particles (josa), not the application of other syntactic rules. All idiomatic expressions composed of two or more syntactic atoms are separable syntactic atoms. Some are so strongly separable that they allow the insertion of a syntactic atom, adverbial or adnominal modification, and relativization in which the noun of the idiomatic expression becomes the head of the relative clause. Other idiomatic expressions, which are only weakly separable, permit only substitution by an interrogative or a change of form in part of the expression.

Syntactic ambiguity and phonological structure

  • 임운
    • 대한음성학회지:말소리
    • /
    • No. 42
    • /
    • pp.57-69
    • /
    • 2001
  • Syntactic ambiguity can usually be resolved from context, especially in reading and writing. Because phonological structure, including stress, intonation, and phonological phenomena, is realized differently for different syntactic structures, syntactic ambiguity can be resolved through phonological structure in listening and speaking. The objective of this study was to survey how Korean teachers of English apply phonological structure to resolve syntactic ambiguity. The results are as follows. First, the teachers applied the Compound Stress Rule well when the second word was not branched, but not when it was branched. Second, several teachers did not apply the Nuclear Stress Rule well; they usually put the strongest stress on the first word. Third, the teachers did not distinguish the contexts in which palatalization is appropriate; they applied palatalization both within a single phonological phrase and across separate phonological phrases. Fourth, the teachers did not apply stress shift when a stress clash occurred and, as a result, put the strongest stress on an inappropriate syllable.

Three-Phase English Syntactic Analysis for Improving the Parsing Efficiency

  • 김성동
    • 정보처리학회논문지:소프트웨어 및 데이터공학
    • /
    • Vol. 5, No. 1
    • /
    • pp.21-28
    • /
    • 2016
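  • The English syntactic analyzer is the component that most strongly affects the performance of an English-Korean machine translation system. The analyzer described in this paper is part of a rule-based English-Korean machine translation system; it relies on a large set of syntactic rules and performs syntactic analysis by chart parsing. Because the number of rules is large, many structures are generated during parsing, which slows parsing down and requires a large amount of memory, reducing the practicality of translation. In addition, long sentences containing commas have very high parsing complexity, so time and space efficiency drops and it is very difficult to produce an accurate translation. This paper proposes a three-phase syntactic analysis method that applies sentence segmentation in order to translate efficiently the long sentences that occur in real-world text. Each phase applies its own independent set of syntactic rules, with the aim of reducing parsing complexity. To this end, the syntactic rules are divided into three classes and a three-phase parsing algorithm that uses them is devised. In particular, the third class consists of rules for sentence structures formed with commas, and a method is proposed for acquiring these rules through corpus analysis so that parsing coverage can be improved continuously. Experiments confirmed that the proposed method is superior in time and space efficiency to the existing two-phase syntactic analysis method, which applies sentence segmentation only, while maintaining similar translation quality.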
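
The efficiency argument in this abstract can be illustrated with a rough sketch. It assumes that the work of a chart parser grows with the number of spans in its chart, and it compares one chart over the whole sentence with separate charts per comma-delimited segment plus one small chart over the segments. The function names and the span-counting cost model are illustrative assumptions, not the paper's algorithm or its rule classes.

```python
def split_on_commas(tokens):
    """Split a token list into comma-delimited segments (commas dropped)."""
    segments, current = [], []
    for tok in tokens:
        if tok == ",":
            if current:
                segments.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        segments.append(current)
    return segments


def chart_spans(n):
    """Number of spans (i, j) with i < j in a chart over n items."""
    return n * (n + 1) // 2


def compare_cost(tokens):
    """Return (spans for one whole-sentence chart, spans with segmentation).

    Assumed cost model: each segment gets its own chart, and a final chart
    combines the segments, treating each one as a single item.
    """
    words = [t for t in tokens if t != ","]
    segments = split_on_commas(tokens)
    whole = chart_spans(len(words))
    segmented = sum(chart_spans(len(s)) for s in segments) + chart_spans(len(segments))
    return whole, segmented


tokens = "When the test ended , the operator , who was tired , left the room".split()
print(compare_cost(tokens))  # (78, 35)
```

Under this simplified model the segmented analysis touches fewer than half as many spans for the sample sentence, which is the kind of saving that sentence segmentation is meant to provide.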

Prediction of Prosodic Break Using Syntactic Relations and Prosodic Features

  • 정영임;조선호;윤애선;권혁철
    • 인지과학
    • /
    • Vol. 19, No. 1
    • /
    • pp.89-105
    • /
    • 2008
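  • To predict natural Korean prosodic phrase boundaries, this paper (1) subcategorizes sentence constituents, (2) extracts syntactic phrases using the dependency relations between the subcategorized constituents, and (3) establishes prosodic phrase boundary prediction rules according to the type of extracted syntactic phrase. In addition, (4) beyond syntactic information, it judges variable prosodic phrase boundaries according to the length of the syntactic phrase and of the sentence, the position of the syntactic phrase within the sentence, and the semantic information of the context, resulting in a more natural Korean prosodic phrase boundary prediction system. The system achieved recall and precision above 90% in predicting strong prosodic phrase boundaries, which correlate highly with syntactic phrase boundaries, and in predicting non-boundaries inside prosodic phrases, and it achieved performance above 87% in predicting prosodic phrase boundaries overall.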

The Parallel Corpus Approach to Building the Syntactic Tree Transfer Set in the English-to-Vietnamese Machine Translation

  • Dien Dinh;Ngan Thuy;Quang Xuan;Nam Chi
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 2004년도 ICEIC The International Conference on Electronics Informations and Communications
    • /
    • pp.382-386
    • /
    • 2004
  • Recently, following the machine learning trend, most machine translation systems around the world use syntactic tree sets of the two languages concerned to learn syntactic tree transfer rules. For the English-Vietnamese language pair, however, this approach is not feasible, because no Vietnamese syntactic tree set corresponding to the English one exists so far. Building a very large corresponding Vietnamese syntactic tree set (thousands of trees) requires a great deal of work and the involvement of specialists in linguistics. To take advantage of our available English-Vietnamese Corpus (EVC), which has been tagged with word alignments, we choose the SITG (Stochastic Inversion Transduction Grammar) model to construct English-Vietnamese syntactic tree sets automatically. This model is used to parse the two languages at the same time and then carry out the syntactic tree transfer. The resulting English-Vietnamese bilingual syntactic tree set is the basic training data for automatically transferring English syntactic trees to Vietnamese ones with machine learning models. We tested the syntactic analysis on over 10,000 of the 500,000 sentences in our English-Vietnamese bilingual corpus, and the first stage gave encouraging results (about 80% analyzed) [5]. We have used the TBL (Transformation-Based Learning) algorithm to carry out automatic transformations from English syntactic trees to Vietnamese ones based on that parallel syntactic tree transfer set [6].

Determination of Thematic Roles according to Syntactic Relations Using Rules and Statistical Models in Korean Language Processing

  • 강신재;박정혜
    • 한국산업정보학회논문지
    • /
    • Vol. 8, No. 1
    • /
    • pp.33-42
    • /
    • 2003
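  • This paper presents a method for mapping syntactic relations to thematic roles using rules and probabilities in Korean language processing. Determining thematic roles is one of the core tasks of semantic analysis and one of the most important problems to be solved in natural language processing. When thematic-role rules are written from general linguistic knowledge and experience alone, the results can vary greatly with the subjectivity of the person writing them, and it is impossible to build rules that cover every case. The hybrid method presented in this paper, however, analyzes a large raw corpus to reflect the variety of actual language use and also takes into account the case frame information of the Sejong Electronic Dictionary, which dozens of Korean linguists are building in depth, so it can be regarded as a more objective and efficient method. To determine thematic roles more accurately, feature information such as syntactic relations, semantic classes, morpheme information, and the position of double subjects was used; in particular, the use of semantic classes improved coverage.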
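
The combination of rules and probabilities described in this abstract can be sketched as a rule lookup with a statistical fallback. The rule table, the role counts, and the function name `assign_role` are hypothetical placeholders chosen for illustration; they are not taken from the paper or from the Sejong Electronic Dictionary.

```python
from collections import Counter

# Hypothetical hand-written rules: (syntactic relation, semantic class) -> role.
RULES = {
    ("subject", "human"): "AGENT",
    ("adverbial", "place"): "LOCATION",
}

# Hypothetical corpus counts of the roles observed for each syntactic relation.
ROLE_COUNTS = {
    "subject": Counter({"AGENT": 70, "THEME": 25, "EXPERIENCER": 5}),
    "object": Counter({"THEME": 90, "GOAL": 10}),
}


def assign_role(relation, sem_class):
    """Apply a rule if one matches; otherwise fall back to the most frequent role."""
    rule_role = RULES.get((relation, sem_class))
    if rule_role is not None:
        return rule_role
    counts = ROLE_COUNTS.get(relation)
    if counts:
        # Most frequent role seen with this relation in the (hypothetical) counts.
        return counts.most_common(1)[0][0]
    return None  # neither a rule nor any statistics are available


print(assign_role("subject", "human"))  # AGENT (from a rule)
print(assign_role("object", "food"))    # THEME (from the counts)
```

Checking rules first keeps hand-verified decisions stable, while the counts cover relations that no rule anticipates; a real system would condition the statistics on more features than the relation alone.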

A Constraint-based Approach to English Gerunds

  • Kim, Yong-Beom
    • 한국언어정보학회지:언어와정보
    • /
    • Vol. 7, No. 2
    • /
    • pp.117-137
    • /
    • 2003
  • This paper attempts to provide an alternative analysis of categorical issues related to English gerunds. In particular, it rejects Maulof's approach, which creates a new syntactic category, gerund, by mixing nominal and verbal categories. The paper identifies two syntactic structures in English gerunds: nominal gerunds and verbal gerunds. This distinction is based on the syntactic and semantic characteristics of each type and is intended to account for the external distribution and endocentricity of the construction. Treating verbal gerunds syntactically as verbal categories, the paper proposes that English verbal gerunds act like other verbal categories such as infinitives, whereas nominal gerunds behave much like derived nominals. It proposes a few lexical rules that handle the two types of gerunds. The proposal can be extended to prepositional complements as well as sentential subject positions. It not only resolves the issues involving the distributional properties of the gerund construction but also captures the syntactic parallelism observable between gerunds and other verbal constructions in English.

A comparison of grammatical error detection techniques for an automated English scoring system

  • Lee, Songwook;Lee, Kong Joo
    • Journal of Advanced Marine Engineering and Technology
    • /
    • Vol. 37, No. 7
    • /
    • pp.760-770
    • /
    • 2013
  • Detecting grammatical errors in a text is a long-standing application. In this paper, we compare the performance of two grammatical error detection techniques, which are implemented as a sub-module of an automated English scoring system. One uses a full syntactic parser, which has not only grammatical rules but also extra-grammatical rules in order to detect syntactic errors while parsing. The other uses a finite state machine, which can identify an error covering a small range of the input. To compare the two approaches, grammatical errors are divided into three groups: errors that can be handled by both approaches, errors that can be handled only by a full parser, and errors that can be handled only by a finite state machine. In this way, we can identify the strengths and weaknesses of each approach. The evaluation results show that the full parsing approach detects more errors than the finite state machine, while its accuracy is lower. We conclude that a full parser is suitable for detecting grammatical errors involving long-distance dependencies, whereas a finite state machine works well on sentences with multiple grammatical errors.
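
A finite state machine that flags an error within a small window of the input, as this abstract describes, can be sketched as follows. The example detects mismatches between the articles "a"/"an" and the following word, using the spelling of the next word as a stand-in for its pronunciation. The states, the function name, and the choice of error type are assumptions made for illustration, not the system evaluated in the paper.

```python
VOWELS = set("aeiou")


def detect_article_errors(tokens):
    """Scan tokens once with a three-state machine: START, SAW_A, SAW_AN.

    Returns (index_of_article, article, following_word) for each mismatch.
    Spelling-based approximation: cases such as "an hour" are misjudged.
    """
    errors = []
    state = "START"
    for i, tok in enumerate(tokens):
        word = tok.lower()
        starts_with_vowel = word[:1] in VOWELS
        # Check the current word against the state left by the previous one.
        if state == "SAW_A" and starts_with_vowel:
            errors.append((i - 1, "a", tok))
        elif state == "SAW_AN" and word[:1].isalpha() and not starts_with_vowel:
            errors.append((i - 1, "an", tok))
        # State transition depends only on the current word.
        if word == "a":
            state = "SAW_A"
        elif word == "an":
            state = "SAW_AN"
        else:
            state = "START"
    return errors


print(detect_article_errors("She ate a apple and an banana".split()))
# [(2, 'a', 'apple'), (5, 'an', 'banana')]
```

Because the machine only remembers the previous word, it keeps working after the first error and reports both mismatches in the sample sentence, but it cannot relate words that are far apart.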

Syntactic Structured Framework for Resolving Reflexive Anaphora in Urdu Discourse Using Multilingual NLP

  • Nasir, Jamal A.;Din, Zia Ud.
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 15, No. 4
    • /
    • pp.1409-1425
    • /
    • 2021
  • In a wide-ranging information society, fast and easy access to information in the language of one's choice is indispensable, and it can be provided by various multilingual Natural Language Processing (NLP) applications. Natural language text contains references among different language elements, called anaphoric links. Resolving anaphoric links is a key problem in NLP and an essential part of NLP applications, because these links must be interpreted properly for a clear understanding of natural language. A mechanism is therefore needed for identifying and resolving these naturally occurring anaphoric links. In this paper, a framework based on Hobbs' syntactic approach and a system developed by Lappin & Leass is proposed for resolving reflexive anaphoric links in Urdu text documents. In general, the anaphora resolution process takes three main steps: identification of the anaphor, location of the candidate antecedent(s), and selection of the appropriate antecedent. The proposed framework explores the syntactic structure of reflexive anaphors to find features for constructing heuristic rules, which are used to develop an algorithm for resolving these anaphoric references. The system takes Urdu text containing reflexive anaphors as input and outputs Urdu text with resolved reflexive anaphoric links. Despite the scarcity of Urdu resources, our results are encouraging. The proposed framework can be utilized in multilingual NLP (m-NLP) applications.
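
The three general steps named in this abstract (identify the anaphor, locate candidate antecedents, select one) can be sketched on a toy English example. The input format of (word, part-of-speech) pairs, the tag names, and the "nearest preceding noun" selection heuristic are all assumptions for illustration; the paper's Urdu-specific heuristic rules are not reproduced here.

```python
REFLEXIVES = {"himself", "herself", "itself", "themselves"}
NOUN_TAGS = {"NOUN", "PROPN"}  # assumed tag names for common and proper nouns


def resolve_reflexives(tagged):
    """Return (anaphor, antecedent) pairs for one POS-tagged sentence."""
    links = []
    for i, (word, _pos) in enumerate(tagged):
        # Step 1: identify the anaphor.
        if word.lower() not in REFLEXIVES:
            continue
        # Step 2: locate candidate antecedents (nouns before the anaphor).
        candidates = [w for w, p in tagged[:i] if p in NOUN_TAGS]
        # Step 3: select an antecedent (assumed heuristic: the nearest one).
        antecedent = candidates[-1] if candidates else None
        links.append((word, antecedent))
    return links


sentence = [("Ali", "PROPN"), ("saw", "VERB"), ("himself", "PRON"),
            ("in", "ADP"), ("the", "DET"), ("mirror", "NOUN")]
print(resolve_reflexives(sentence))  # [('himself', 'Ali')]
```

A syntax-based framework would replace step 3 with constraints over the parse tree, since the nearest noun is not always the correct antecedent.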

Syntactic and semantic information extraction from NPP procedures utilizing natural language processing integrated with rules

  • Choi, Yongsun;Nguyen, Minh Duc;Kerr, Thomas N. Jr.
    • Nuclear Engineering and Technology
    • /
    • Vol. 53, No. 3
    • /
    • pp.866-878
    • /
    • 2021
  • Procedures play a key role in ensuring safe operation at nuclear power plants (NPPs). Development and maintenance of a large number of procedures reflecting the best knowledge available in all relevant areas is a complex job. This paper introduces a newly developed methodology and the implemented software, called iExtractor, for the extraction of syntactic and semantic information from NPP procedures utilizing natural language processing (NLP)-based technologies. The steps of the iExtractor integrated with sets of rules and an ontology for NPPs are described in detail with examples. Case study results of the iExtractor applied to selected procedures of a U.S. commercial NPP are also introduced. It is shown that the iExtractor can provide overall comprehension of the analyzed procedures and indicate parts of procedures that need improvement. The rich information extracted from procedures could be further utilized as a basis for their enhanced management.