• Title/Summary/Keyword: syntactic constituent

Text Watermarking Based on Syntactic Constituent Movement (구문요소의 전치에 기반한 문서 워터마킹)

  • Kim, Mi-Young
    • The KIPS Transactions:PartB / v.16B no.1 / pp.79-84 / 2009
  • This paper explores a method of text watermarking for agglutinative languages and develops a syntactic tree-based syntactic constituent movement scheme. Agglutinative languages are well suited to syntactic tree-based natural language watermarking because their constituent order is relatively free. The proposed method consists of seven steps. First, we construct a syntactic dependency tree of the unmarked text. Second, we perform clausal segmentation on the syntactic tree. Third, we choose target syntactic constituents, which will move within their clauses. Fourth, we determine the movement direction of the target constituents. Fifth, we embed a watermark bit for each target constituent. Sixth, if the watermark bit does not coincide with the direction of the target constituent movement, we displace the target constituent in the syntactic tree. Finally, we obtain the marked text from the modified syntactic tree. Experimental results show that the coverage of our method is 91.53% and that the rate of unnatural sentences in the marked text is 23.16%, which is better than that of previous systems. The results also show that the marked text keeps the same style and conveys the same information without semantic distortion.
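
The embedding step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes each clause is already segmented into an ordered list of constituents. The names `embed_bit` and `extract_bit`, the `anchor` reference constituent, and the bit convention (0 = target precedes the anchor, 1 = target follows it) are hypothetical choices made only for this example.

```python
from typing import List


def embed_bit(clause: List[str], target: int, anchor: int, bit: int) -> List[str]:
    """Return a copy of `clause` whose target/anchor order encodes `bit`.

    Assumed convention (hypothetical): bit 0 means the target constituent
    precedes the anchor constituent, bit 1 means it follows the anchor.
    """
    if target == anchor:
        raise ValueError("target and anchor must be different constituents")
    current = 0 if target < anchor else 1
    out = list(clause)
    if current == bit:
        # The existing order already encodes the bit, so nothing moves.
        return out
    item = out.pop(target)
    # Removing the target shifts the anchor left if the target was before it.
    anchor_idx = anchor - 1 if target < anchor else anchor
    out.insert(anchor_idx if bit == 0 else anchor_idx + 1, item)
    return out


def extract_bit(clause: List[str], target_word: str, anchor_word: str) -> int:
    """Read the bit back from the relative order of the two constituents."""
    return 0 if clause.index(target_word) < clause.index(anchor_word) else 1


# Example clause: "나는 어제 학교에 갔다" (I went to school yesterday).
clause = ["나는", "어제", "학교에", "갔다"]
marked = embed_bit(clause, target=1, anchor=0, bit=0)  # front "어제"
recovered = extract_bit(marked, "어제", "나는")
```

The predicate stays clause-final in this sketch. Only the adverbial moves relative to the subject, which is the kind of reordering that relatively free constituent order permits.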

A Phonetic Study of Vowel Raising: A Closer Look at the Realization of the Suffix {-go} (모음 상승 현상의 음성적 고찰: 어미 {-고}의 실현을 중심으로)

  • LEE, HYANG WON;Shin, Jiyoung
    • Korean Linguistics / v.81 / pp.267-297 / 2018
  • Vowel raising in Korean has been primarily treated as a phonological, categorical change. This study aims to show how the Korean connective suffix {-go} is realized in various environments and to propose a principle of vowel raising based on both acoustic and perceptual data. To that end, we used a corpus of spoken Korean to analyze the types of syntactic constructions, the realization of prosodic boundaries (IP and PP), and the types of boundary tone associated with {-go}. The vowel tends to be raised most frequently in utterance-final position, while in utterance-medial position it was raised more when the syntactic and prosodic distance between {-go} and the following constituent was smaller. The results for boundary tone also showed a correlation between vowel raising and the discourse function of the boundary tone. In conclusion, we propose that vowel raising is not simply an optional phenomenon, but rather a type of phonetic reduction related to the comprehension of the following constituent.

An Implementation of Syntactic Constituent Recognizer Using Connectionism (Connectionism을 이용한 부분 구문 인식기의 구현)

  • Jung, Han-Min;Yuh, Sang-Hwa;Kim, Tae-Wan;Park, Dong-In
    • Annual Conference on Human and Language Technology / 1996.10a / pp.479-483 / 1996
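  • This paper describes the design and implementation of a partial syntactic constituent recognizer based on connectionism, with the aim of improving parser performance by reducing the search space of syntactic analysis. The recognizer identifies noun-subject parts and predicate parts in morphologically analyzed sentences, thereby dividing the overall search space into several parts and reducing the parsing problem. The connectionist model is an improved perceptron structure consisting of an input layer and an output layer, with connection weights linking nodes between the input and output layers as well as nodes within the input layer. Syntactic tags for noun-subject and predicate parts are applied to the connectionist model, and an improved backpropagation algorithm is used for training. In partial syntactic recognition experiments on a training corpus of 112 sentences and a test corpus of 46 sentences, the recognizer correctly identified noun-subject and predicate parts in 85.7% and 80.4% of cases, and correctly identified the boundary between noun-subject and predicate parts in 94.6% and 95.7% of cases, respectively.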

Korean Parsing Model using Various Features of a Syntactic Object (문장성분의 다양한 자질을 이용한 한국어 구문분석 모델)

  • Park So-Young;Kim Soo-Hong;Rim Hae-Chang
    • The KIPS Transactions:PartB / v.11B no.6 / pp.743-748 / 2004
  • In this paper, we propose a probabilistic Korean parsing model that uses a syntactic feature, a functional feature, a content feature, and a size feature of a syntactic object for effective syntactic disambiguation. The model restricts grammar rules to a binary-oriented form to deal with properties of Korean such as variable word order and constituent ellipsis. In experiments, we analyze the parsing performance of each feature combination. The results show that combining different kinds of features is preferable to combining similar features. Notably, the functional feature is more useful than the combination of the content feature and the size feature.

A Structure of Passive Constructions in Korean and their meaning 'Potential' (한국어 피동문의 구조와 가능(potential)의 의미 해석 -대조적 관점에서-)

  • Mok, Jung-Soo;Kim, Yeong-Jung
    • Lingua Humanitatis / v.8 / pp.369-387 / 2006
  • Which syntactic function should we assign to the 'ga-type' constituent that occurs in morphological passive constructions in Korean, [N0-neun N1-i Vpass-ending]? This problem is important in two respects. First, a small change in the status of the particle '-i/ga' can exert an overall influence on Korean grammar. Second, the particle '-i/ga' cannot guarantee that 'ga-type' constituents are the subject of the sentence, so the concept of syntactic category should be distinguished from that of syntactic function. This paper claims that sentence analysis has long focused on the structure of the proposition, namely the argument structure, and that the direction of analysis should turn to the 'person structure', which is revealed at the pragmatic level. On this basis, the paper suggests that this specific type of morphological passive construction in Korean, [N0-neun N1-i Vpass-ending], should be analysed in line with psych-verb constructions, and that the modal meaning 'potential' of the passive constructions is correlated with sentence pattern and 'person structure'.

The Idiom, the Lexicon, and the Formation of a Sentence (관용 표현과 어휘부, 그리고 문장의 형성)

  • Hwang, Hwa-sang
    • Korean Linguistics / v.65 / pp.295-320 / 2014
  • The idiom is listed in the lexicon because its meaning cannot be inferred from its constituents, and it is a single semantic unit. Thus the idiom is inserted into the syntax as a word. However, the idiom is not always inserted into the syntax as a word. In the process of generating a sentence, we can recognize the categorial property of the idiom, namely that it is formally equal to a syntactic phrase. Each constituent of the idiom can then be inserted into the syntax. This is why syntactic operations (such as modification, topicalization, and relativization) can be applied to a constituent of the idiom. In this respect, the idiom is a flexible construction as a listeme of the lexicon. This flexibility is related to the dynamicity of the lexicon, and the formal or semantic transformation of idioms is a good example of that dynamicity.

Exploiting Chunking for Dependency Parsing in Korean (한국어에서 의존 구문분석을 위한 구묶음의 활용)

  • Namgoong, Young;Kim, Jae-Hoon
    • KIPS Transactions on Software and Data Engineering / v.11 no.7 / pp.291-298 / 2022
  • In this paper, we present a method for dependency parsing with chunking in Korean. Dependency parsing is the task of determining the governor of every word in a sentence. In Korean, parsers generally determine the syntactic governor, and the syntactic structure must then be transformed into a semantic structure for further processing such as semantic analysis. Deciding between the syntactic and the semantic governor is a notorious problem. For example, the syntactic governor of the word "먹고 (eat)" in the sentence "밥을 먹고 싶다 (would like to eat)" is "싶다 (would like to)", which is an auxiliary verb and therefore cannot be a semantic governor. To mitigate this, we propose performing Korean dependency parsing after chunking, which is the process of segmenting a sentence into constituents. A constituent is a word or a group of words that functions as a single unit within a dependency structure, and is called a chunk in this paper. Compared to traditional dependency parsing, the proposed method has two advantages: (1) the number of input units for parsing is reduced, so parsing can be faster; (2) the effectiveness of parsing can be improved by considering the relation between two head words in chunks. In experiments on the Sejong dependency corpus, the UAS and LAS of the proposed method are 86.48% and 84.56%, respectively, and the number of input units is reduced by about 22%p.
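
The two attachment scores reported in this abstract are standard dependency parsing metrics: the unlabeled attachment score (UAS) is the share of tokens whose predicted governor is correct, and the labeled attachment score (LAS) additionally requires the correct relation label. The sketch below shows how they are usually computed. The function name `attachment_scores`, the `(head, label)` encoding, and the relation labels are assumptions made for illustration and are not taken from the paper or the Sejong corpus.

```python
from typing import List, Tuple

Arc = Tuple[int, str]  # (index of governor, relation label); 0 marks the root


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) for one sentence or a concatenated corpus."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equally long")
    n = len(gold)
    # UAS: only the governor has to match.
    uas_hits = sum(1 for (g_head, _), (p_head, _) in zip(gold, pred)
                   if g_head == p_head)
    # LAS: governor and label both have to match.
    las_hits = sum(1 for g, p in zip(gold, pred) if g == p)
    return uas_hits / n, las_hits / n


# "밥을 먹고 싶다": 밥을 -> 먹고, 먹고 -> 싶다, 싶다 -> root.
# The labels are hypothetical placeholders.
gold = [(2, "OBJ"), (3, "AUX_HEAD"), (0, "ROOT")]
pred = [(2, "SBJ"), (3, "AUX_HEAD"), (0, "ROOT")]  # one wrong label
uas, las = attachment_scores(gold, pred)
```

In this example every governor is predicted correctly, so UAS is 1.0, while the single label error lowers LAS to two thirds.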

Two-Phase Shallow Semantic Parsing based on Partial Syntactic Parsing (부분 구문 분석 결과에 기반한 두 단계 부분 의미 분석 시스템)

  • Park, Kyung-Mi;Mun, Young-Song
    • The KIPS Transactions:PartB / v.17B no.1 / pp.85-92 / 2010
  • A shallow semantic parsing system analyzes the relationship that a syntactic constituent of the sentence has with a predicate. It identifies semantic arguments representing agent, patient, instrument, etc. of the predicate. In this study, we propose a two-phase shallow semantic parsing model which consists of the identification phase and the classification phase. We first find the boundary of semantic arguments from partial syntactic parsing results, and then assign appropriate semantic roles to the identified semantic arguments. By taking the sequential two-phase approach, we can alleviate the unbalanced class distribution problem, and select the features appropriate for each task. Experiments show the relative contribution of each phase on the test data.

The Phonetic Realization of intermediate phrase in French Intonation (프랑스어 억양구조에서 중간구의 음성적 실현 양상)

  • Yuh, Hea-Oak;Lee, Eun-Yung
    • Speech Sciences / v.9 no.3 / pp.185-200 / 2002
  • The current study confirmed the existence of an ip prosodic level in French intonation structure, as previously proposed by Sun-Ah Jun & Cécile Fougeron (2000). However, in contrast to the previous suggestion that the plateau is realized in an ip in several syntactic structures, the current study supposes that the plateau does not come from different types of syntactic structure but arises from the unspecified syllables without any PA in an ip. If we limited the ip phrasal tone to syntactic structure, it would be difficult to find more general reasons for the ip level. Besides /Hi/ and /$H^*$/, we also used /$Hi^*$/ for the focused syllable in the current study. In emphasized sentences, in general, /$Hi^*$/ appeared in the first or second syllable of a leftward AP in an ip and /$H^*$/ in the final syllable of the rightmost AP of an ip. In contrast to these PAs, /$Hi^*$/ might appear in any syllable in an ip, but not too far from /$H^*$/, because the duration and length of the plateau realized between /$Hi^*$/ and /$H^*$/ or /Hi/ and /$H^*$/ would make an essential harmonious rhythmic unit. Therefore, the current study determined the duration and the number of syllables realized in each plateau at an ip level composed of more than one AP. As a phrase constituent structure, there is a practical need for intermediate prosodic units to allow for generalization over the many possible combinations of prosodic patterns that can occur. Further evidence is still needed to analyze and relate the different pitch ranges of the plateau of an ip according to the syntactic structure, and to identify the considerable character in the French prosodic hierarchy.

A Morpheme-unit Korean Feature-Based Grammar (KFG) with the X-bar Theoretic Notion of Headedness (X-바 이론의 중심어 개념을 도입한 형태소 단위의 한국어 자질 기반 문법)

  • Park, So-Yeong;Hwang, Yeong-Suk;Im, Hae-Chang
    • Journal of KIISE:Software and Applications / v.26 no.10 / pp.1247-1259 / 1999
  • In this paper, we propose a Korean feature-based grammar (KFG) which adopts the X-bar theoretic notion of headedness for a precise representation of Korean syntactic structure. In order to explain various language phenomena in a given sentence, we use not the word but the morpheme as the constituent unit of KFG. We use features manifesting both the syntactic information and the semantic information of Korean syntactic categories, and feature operations based on the association relationship between two categories. In addition, we restrict feature operations to a binary CNF (Chomsky Normal Form) form, which provides a robust representation for properties of Korean such as frequent ellipsis and partially free word order. The KFG is intuitive, simple, and versatile in representing most Korean sentences. Experimental results show 94%~99% coverage on 746 sentences extracted from SERI Test Suites 97 and newspaper articles.
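
One practical consequence of restricting rules to binary CNF form is that a chart algorithm such as CKY applies directly. The sketch below is a plain CKY recognizer over morpheme tokens. It is only an illustration of binary rule combination and does not implement KFG or its feature operations. The categories (`N`, `P`, `V`, `E`, `NP`, `VP`, `S`), the toy lexicon, and the rules are hypothetical and not taken from the paper.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

Lexicon = Dict[str, Set[str]]
BinaryRules = Dict[Tuple[str, str], Set[str]]


def cky_recognize(tokens: List[str], lexical: Lexicon,
                  binary: BinaryRules, start: str = "S") -> bool:
    """Return True if `tokens` can be derived from `start` with binary rules."""
    n = len(tokens)
    # chart[i][j] holds every category that spans tokens[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, tok in enumerate(tokens):
        chart[i][i + 1] = set(lexical.get(tok, set()))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between two sub-spans
                for left, right in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= binary.get((left, right), set())
    return start in chart[0][n]


# Toy morpheme-level grammar (hypothetical): noun, particle, verb stem, ending.
LEXICAL: Lexicon = {"밥": {"N"}, "을": {"P"}, "먹": {"V"}, "다": {"E"}}
BINARY: BinaryRules = {
    ("N", "P"): {"NP"},
    ("NP", "V"): {"VP"},
    ("VP", "E"): {"S"},
}
# One extra binary rule lets a bare noun combine with the verb,
# which covers the sentence with the particle dropped.
WITH_ELLIPSIS: BinaryRules = {**BINARY, ("N", "V"): {"VP"}}

full = cky_recognize(["밥", "을", "먹", "다"], LEXICAL, BINARY)
dropped = cky_recognize(["밥", "먹", "다"], LEXICAL, BINARY)
dropped_ok = cky_recognize(["밥", "먹", "다"], LEXICAL, WITH_ELLIPSIS)
```

Because every rule combines exactly two categories, handling particle ellipsis in this toy grammar takes a single added rule rather than a new multi-way production.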