• Title/Summary/Keyword: Syntactic Analysis

Implementation of A Plagiarism Detecting System with Sentence and Syntactic Word Similarities (문장 및 어절 유사도를 이용한 표절 탐지 시스템 구현)

  • Maeng, Joosoo;Park, Ji Su;Shon, Jin Gon
    • KIPS Transactions on Software and Data Engineering / v.8 no.3 / pp.109-114 / 2019
  • Most plagiarism detecting systems measure similarity by the frequency of shared words, based on morphological analysis. This method cannot measure the degree of similarity accurately, especially when similar words about the same topic are used, when sentences are excerpted partially and separately, or when the postpositions and endings of words are similar. To overcome this problem, we have designed and implemented a plagiarism detecting system that provides more reliable similarity information by measuring sentence similarity and syntactic word similarity in addition to the conventional word similarity. We compared our system with a conventional system that uses only word similarity. The comparative experiment shows that our system detects both the plagiarized documents that the conventional system can detect and those that it cannot.
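
The contrast drawn in this abstract, between document-level word frequency and sentence-level matching, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, whitespace tokens stand in for morphological analysis, and Jaccard overlap is an assumed choice of sentence measure.

```python
from collections import Counter
import math
import re

def tokens(text):
    # Lowercased word tokens; an assumed stand-in for morphological analysis.
    return re.findall(r"\w+", text.lower())

def word_similarity(doc_a, doc_b):
    """Cosine similarity of word-frequency vectors over whole documents."""
    a, b = Counter(tokens(doc_a)), Counter(tokens(doc_b))
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def sentences(text):
    # Naive split on sentence-final punctuation.
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]

def jaccard(s, t):
    x, y = set(tokens(s)), set(tokens(t))
    return len(x & y) / len(x | y) if x | y else 0.0

def sentence_similarity(doc_a, doc_b):
    """Average, over sentences of doc_a, of the best Jaccard match in doc_b."""
    sents_a, sents_b = sentences(doc_a), sentences(doc_b)
    if not sents_a or not sents_b:
        return 0.0
    return sum(max(jaccard(s, t) for t in sents_b) for s in sents_a) / len(sents_a)
```

A sentence copied verbatim into an otherwise unrelated document scores 1.0 for that sentence under `sentence_similarity`, even when the document-level word overlap is low, which is the kind of partial excerpt the abstract says word frequency alone handles poorly.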

Interactions between Morpho-Syntax and Semantics in English Agreement

  • Kim, Jong-Bok
    • Language and Information / v.7 no.1 / pp.55-68 / 2003
  • Most previous approaches to English agreement phenomena have relied upon only one component of the grammar (e.g., syntax, semantics, or pragmatics). This paper argues that interrelationships among different grammatical components play crucial roles in such phenomena too (cf. Kathol 1999 and Hudson 1999). The paper proposes that, contrary to traditional wisdom, English determiner-noun agreement is morpho-syntactic whereas subject-verb and pronoun-antecedent agreement are reflections of index agreement (cf. Pollard and Sag 1994). The present hybrid analysis shows the importance of the interaction of different components of the grammar in accounting for English agreement phenomena. In particular, once we allow morphology to interact tightly with syntax, semantics, or even pragmatics, we can provide a solution to some puzzling English agreement phenomena. This allows a more principled theory of English agreement.

A Syntactic Account of the Properties of Bare Nominals in Discourse

  • Ahn, Hee-Don;Cho, Sung-Eun
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.57-66 / 2007
  • Case markers in Korean are omissible in colloquial speech. Previous discourse studies of Caseless bare NPs in Korean show that the information structure of zero Nominative differs not only from that of overt Nominative but also from that of zero Accusative in many respects. This paper aims to provide a basis for these semantic/pragmatic properties of Caseless NPs through the syntactic difference between bare subjects and bare objects: namely, the former are left-dislocated NPs, whereas the latter form complex predicates with the subcategorizing verbs. Our analysis accounts for the facts that (i) the distribution of bare subject NPs is more restricted than that of bare object NPs; (ii) bare subject NPs must be specific or topical; (iii) Acc-marked NPs in canonical position tend to be focalized.

An Analysis of Syntactic and Semantic Relations between Negative Polarity Items and Negatives in Korean (결합범주문법을 이용한 한국어 부정극어와 부정어의 통사 및 의미적 관계 분석)

  • 김정재;박정철
    • Language and Information / v.8 no.1 / pp.53-76 / 2004
  • Negative polarity items (NPIs), which function as quantifiers, are licensed in a syntactically strict way by negatives, which function as qualifiers, and the pair yields a universal negating interpretation. We present a proposal that explains these phenomena, in which syntax and semantics are closely related, using Combinatory Categorial Grammar. For this purpose, we first adopt the usual approach to scrambling, but control its overgeneration with markers, taking into account the complex syntactic phenomena involving NPIs and scrambling in Korean. We also propose polarity intensity as a novel feature, in order to account for the universal negating interpretations that arise when NPIs are combined with negatives. Our proposal also explains the difference in readings when other quantifiers or qualifiers intervene between the NPI and the related negative.

An Analysis of Students' Understanding of Mathematical Concepts and Proving - Focused on the concept of subspace in linear algebra - (대학생들의 증명 구성 방식과 개념 이해에 대한 분석 - 부분 공간에 대한 증명 과정을 중심으로 -)

  • Cho, Jiyoung;Kwon, Oh Nam
    • School Mathematics / v.14 no.4 / pp.469-493 / 2012
  • The purpose of this study is to find the relation between students' concepts and their types of proof construction. To this end, four undergraduate students majoring in mathematics education were evaluated to examine how they understand mathematical concepts and apply those concepts in proving. Investigating students' proofs together with their concepts is important for finding implications about how students have to understand formal concepts in order to succeed in proving. The participants' proof productions were classified into syntactic proof productions and semantic proof productions. By comparing syntactic provers and semantic provers, we found that the two groups took different approaches to finding an idea for a proof. The syntactic provers utilized procedural knowledge accumulated from their proving experience. The semantic provers, on the other hand, used their concept images to understand why the given statements were true and to get a key idea for the proof in the process. These differences in approach were related to the students' concepts. Both types of provers had accurate formal concepts, but the syntactic provers also knew how to apply formal concepts in proving, whereas the semantic provers had concept images that captured the details and meaning of the formal concepts well. The semantic provers were therefore able to use their concept images to get an idea for a proof and to express that idea in formal mathematical language. This study leads to two suggestions for helping students prove. First, undergraduate students should develop concept images that contain the meanings and details of formal concepts in order to produce a meaningful proof. Second, formal concepts combined with procedural knowledge could be essential for developing informal reasoning into mathematical proof.

Boolean Query Formulation From Korean Natural Language Queries using Syntactic Analysis (구문분석에 기반한 한글 자연어 질의로부터의 불리언 질의 생성)

  • Park, Mi-Hwa;Won, Hyeong-Seok;Lee, Geun-Bae
    • Journal of KIISE:Software and Applications / v.26 no.10 / pp.1219-1229 / 1999
  • There is considerable evidence that trained users can achieve good search effectiveness with boolean queries, because a structured boolean query containing operators such as AND, OR, and NOT can represent a user's information need more accurately. However, ordinary users are not accustomed to expressing the information they want in boolean form. In this paper, we propose a boolean query formulation method that automatically transforms a user's natural language query into an extended boolean query, for both effectiveness and user convenience. First, the natural language query is syntactically analyzed using a KCCG (Korean Combinatory Categorial Grammar) parser, and the resulting syntactic trees are structurally simplified using a tree-simplifying mechanism in order to capture the logical relationships between keywords. Next, plausible noun phrases are identified in the simplified tree and added to it as additional keywords, and the keywords are assigned weights. Finally, the simplified syntactic tree is automatically converted into a boolean query using mapping rules and linguistic heuristics. We also propose an N-BEST average method that generates a boolean query from each of the top N syntactic trees, to limit the damage from a single incorrect top-ranked tree. In experiments on the KTSET2.0 test collection, the proposed method outperformed a traditional vector space model by 23% and, surprisingly, manually constructed boolean queries by 8%.
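
The last two steps of this abstract, converting a simplified tree into a boolean query and combining the results of the top N parses, can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: the nested-tuple tree encoding, both function names, and the rule that a document missing from a result list counts as score 0 are all hypothetical.

```python
def to_boolean_query(node):
    """Render a simplified tree as a boolean query string.

    A node is either a keyword (str) or a tuple (operator, child, ...),
    where operator is "AND", "OR", or "NOT" (assumed encoding).
    """
    if isinstance(node, str):
        return node
    op, *children = node
    if op == "NOT":
        return "NOT " + to_boolean_query(children[0])
    return "(" + f" {op} ".join(to_boolean_query(c) for c in children) + ")"

def n_best_average(score_lists):
    """Average per-document scores over the result lists of the top N parses.

    Each element of score_lists maps document id -> retrieval score for one
    parse; a document absent from a list contributes 0 (an assumption).
    """
    totals = {}
    for scores in score_lists:
        for doc, score in scores.items():
            totals[doc] = totals.get(doc, 0.0) + score
    return {doc: total / len(score_lists) for doc, total in totals.items()}
```

Averaging over several parses means that a document ranked highly only because of one mis-parsed tree is pulled down by the other parses, which is the compensating effect the abstract describes.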

Automatic Error Detection of Morpho-syntactic Errors of English Writing Using Association Rule Analysis Algorithm (연관 규칙 분석 알고리즘을 활용한 영작문 형태.통사 오류 자동 발견)

  • Kim, Dong-Sung
    • Annual Conference on Human and Language Technology / 2010.10a / pp.3-8 / 2010
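  • This study generates association rules from refined data on English writing error types collected in a series of earlier studies, and automatically detects morpho-syntactic errors in English writing data using association rules whose utility has been verified through learning. Finding morpho-syntactic errors in English writing data takes a great deal of time and resources, so automation is essential. Whereas previous studies concentrate either on lexical errors using statistical models or on syntactic processing grounded in linguistic theory, this study generates association rules from refined data through data mining, validates them, and then detects morpho-syntactic errors. Previous studies require building large knowledge bases, such as rules fitted to a theoretical framework or large corpora for building language models, whereas this study uses a small amount of refined data. The Apriori algorithm was used to generate morpho-syntactic association rules for English writing error types. Because some of the rules generated by the algorithm may be incorrect, only valid rules were learned, using statistical tests of rule utility such as correlation tests and cosine similarity. The association rules accumulated in this way were used in an experiment on automatically detecting English writing errors.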

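
The rule-validation step in this abstract relies on standard association-rule measures, which can be sketched as below. This is a minimal illustration, not the author's pipeline: it scores one given rule and does not implement Apriori's candidate generation, and the function names and the representation of each writing sample as a set of feature tags are assumptions.

```python
import math

def support(transactions, itemset):
    """Fraction of transactions (sets of tags) that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def rule_measures(transactions, antecedent, consequent):
    """Support, confidence, and cosine for the rule antecedent -> consequent."""
    s_a = support(transactions, antecedent)
    s_b = support(transactions, consequent)
    s_ab = support(transactions, set(antecedent) | set(consequent))
    confidence = s_ab / s_a if s_a else 0.0
    # Cosine measure: joint support normalized by the geometric mean of the
    # individual supports; values near 1 indicate strong co-occurrence.
    cosine = s_ab / math.sqrt(s_a * s_b) if s_a and s_b else 0.0
    return {"support": s_ab, "confidence": confidence, "cosine": cosine}
```

A rule with high confidence but low cosine typically has a consequent that is frequent everywhere, so filtering on both measures discards rules that look strong only because of base rates.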

Analysis of Web Browser Security Configuration Options

  • Jillepalli, Ananth A.;de Leon, Daniel Conte;Steiner, Stuart;Alves-Foss, Jim
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.12 / pp.6139-6160 / 2018
  • For ease of use and access, web browsers are now being used to access and modify sensitive data and systems, including critical control systems. Due to their computational capabilities and network connectivity, browsers are vulnerable to several types of attacks, even when fully updated. Browsers are also the main target of phishing attacks. Many browser attacks, including phishing, could be prevented or mitigated by using site-, user-, and device-specific security configurations. However, we discovered that all major browsers expose disparate security configuration procedures, option names, values, and semantics. This results in a web browsing ecosystem that is extremely hard to secure. We analyzed more than 1000 browser security configuration options in three major browsers and found that only 13 configuration options had both syntactic and semantic similarity, while 4 configuration options had semantic similarity but not syntactic similarity. We: a) describe the results of our in-depth analysis of browser security configuration options; b) demonstrate the complexity of policy-based configuration of web browsers; c) describe a knowledge-based solution that would enable organizations to implement highly granular, policy-level secure configurations for their information and operational technology browsing infrastructures at enterprise scale; and d) argue for the necessity of developing a common language and semantics for web browser configurations.

Determination of Thematic Roles according to Syntactic Relations Using Rules and Statistical Models in Korean Language Processing (한국어 전산처리에서 규칙과 확률을 이용한 구문관계에 따른 의미역 결정)

  • 강신재;박정혜
    • Journal of Korea Society of Industrial Information Systems / v.8 no.1 / pp.33-42 / 2003
  • This paper presents an efficient method for determining thematic roles from syntactic relations using rules and a statistical model in Korean language processing. This process is a core part of semantic analysis and an important open issue in natural language processing. Writing rules for determining thematic roles from general linguistic knowledge and experience alone is problematic, since the final result may vary with the subjective views of researchers, and it is impossible to construct rules that cover all cases. Our hybrid method, by contrast, is objective and efficient because it draws on large corpora, which contain practical usages of the Korean language, and on case frames in the Sejong Electronic Lexicon of Korean, which is being developed by dozens of Korean linguistic researchers. To determine thematic roles more accurately, our system uses syntactic relations, semantic classes, morpheme information, and the position of double subjects. In particular, using semantic classes increases the applicability of our system.
