• Title/Summary/Keyword: noun phrases

Automatic Construction of a Concept Hierarchy from Coordinated Noun Phrases

  • No, Yong-Kyoon
    • Language and Information / v.11 no.1 / pp.39-52 / 2007
  • Noun phrase coordination is an extremely productive phenomenon. Based on the observation that conjuncts tend to denote semantically related concepts, we collect four hundred thousand pairs of conjuncts from the British National Corpus in an attempt to build an is-a hierarchy of English noun concepts. The modifiedness patterns of the two words in these pairs point to three distinct semantic relations: sibling, cousin, and ancestor-or-ancestor's sibling. We discuss how these relations are found, how the pairs are used to motivate groups of quasi-synonyms, and how the hypernyms are then located.
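
The pair-collection step described in this abstract can be illustrated with a minimal sketch. It assumes the input is already part-of-speech tagged with the hypothetical coarse tags `NOUN`, `ADJ`, and `CONJ`; the function name `conjunct_pairs` and the rule that "modified" means "directly preceded by an adjective" are illustrative assumptions, not the author's procedure.

```python
def conjunct_pairs(tagged):
    """Collect (left, right, left_modified, right_modified) tuples
    for every 'NOUN and/or (ADJ)* NOUN' pattern in a tagged sentence.

    `tagged` is a list of (word, tag) pairs using the assumed tags
    NOUN, ADJ and CONJ.
    """
    pairs = []
    for i in range(1, len(tagged) - 1):
        word, tag = tagged[i]
        if tag != "CONJ" or word.lower() not in {"and", "or"}:
            continue
        # Left conjunct: the noun immediately before the conjunction.
        if tagged[i - 1][1] != "NOUN":
            continue
        # Right conjunct: skip adjectives until the head noun.
        j = i + 1
        while j < len(tagged) and tagged[j][1] == "ADJ":
            j += 1
        if j >= len(tagged) or tagged[j][1] != "NOUN":
            continue
        left_modified = i >= 2 and tagged[i - 2][1] == "ADJ"
        right_modified = j > i + 1
        pairs.append((tagged[i - 1][0], tagged[j][0], left_modified, right_modified))
    return pairs


sentence = [("red", "ADJ"), ("apples", "NOUN"), ("and", "CONJ"), ("pears", "NOUN")]
print(conjunct_pairs(sentence))  # [('apples', 'pears', True, False)]
```

Recording which conjunct is modified is what would let a later step separate the modifiedness patterns the abstract mentions; how those patterns map to sibling, cousin, or ancestor relations is not specified here and is left out of the sketch.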

The Acquisition of External Sandhi in a Second Language: Production of Obstruent Nasalization by Chinese Learners of Korean

  • Han, Jeong-Im
    • Phonetics and Speech Sciences / v.3 no.1 / pp.77-83 / 2011
  • The present study reports the results of an acoustic study of nasal assimilation at word boundaries in Chinese-Korean interlanguage. Twelve Chinese learners of Korean and four Korean native speakers recorded obstruent#nasal sequences in noun compounds and verb phrases, and their production patterns were examined in detail. While the Chinese learners nasalized the word-final obstruents in only 11.7% of the obstruent#nasal sequences, the Korean native speakers showed complete nasalization of those sequences. However, there was a small but consistent effect of learning on the production of external sandhi in L2: the rate of nasalization differed between the two proficiency groups of Chinese participants. On average, the intermediate-level learners nasalized the target stops at a rate of 16%, and the beginning-level learners at a rate of 7%. In addition, the context (noun compounds versus verb phrases) was found not to influence the nasalization pattern across word boundaries.

Range Detection of Wa/Kwa Parallel Noun Phrase using a Probabilistic Model and Modification Information (확률모형과 수식정보를 이용한 와/과 병렬사구 범위결정)

  • Choi, Yong-Seok;Shin, Ji-Ae;Choi, Key-Sun
    • Journal of KIISE:Software and Applications / v.35 no.2 / pp.128-136 / 2008
  • Recognizing parallel structures at an early stage of sentence parsing can reduce the complexity of parsing. In this paper, we propose an unsupervised, language-independent probabilistic model for the recognition of parallel noun structures. The proposed model is based on the idea of swapping constituents, which relies on two properties of parallel structures: symmetry (two or more identical constituents are repeated) and reversibility (the order of the constituents is interchangeable). Non-symmetric patterns that cannot be captured by the general symmetry rule are additionally resolved using modifier information. In particular, this paper shows how the proposed model is applied to recognize Korean parallel noun phrases connected by the "wa/kwa" particle. Our model is compared with other models, including supervised models, and performs better on the recognition of parallel noun phrases.

Text Classification for Patents: Experiments with Unigrams, Bigrams and Different Weighting Methods

  • Im, ChanJong;Kim, DoWan;Mandl, Thomas
    • International Journal of Contents / v.13 no.2 / pp.66-74 / 2017
  • Patent classification is becoming more critical as patent filings have been increasing over the years. Despite comprehensive studies in the area, several issues remain in classifying patents at the IPC hierarchical levels. Both the structural complexity and the shortage of patents at the lower levels of the hierarchy degrade classification performance. Therefore, we propose a new classification method based on different criteria: categories defined by domain experts in trend analysis reports, i.e., Patent Landscape Reports (PLR). Several experiments were conducted to identify the types of features and the weighting methods that lead to the best classification performance using a Support Vector Machine (SVM). Two types of features (nouns and noun phrases) and five different weighting schemes (TF-idf, TF-rf, TF-icf, TF-icf-based, and TF-idcef-based) were tested.
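
As a rough illustration of the first of the weighting schemes named above, the following sketch computes plain TF-idf weights over unigram and bigram features. It is a minimal sketch under stated assumptions: tokens are obtained by whitespace splitting rather than by noun or noun-phrase extraction, the idf form is log(N/df), and the helper names `ngrams`, `features`, and `tfidf` are hypothetical. The other four schemes and the SVM step are not shown.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token sequences joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def features(text):
    """Unigram plus bigram features from a whitespace-tokenized text."""
    tokens = text.lower().split()
    return ngrams(tokens, 1) + ngrams(tokens, 2)


def tfidf(docs):
    """Return one {feature: weight} dict per document, weight = tf * log(N / df)."""
    counts = [Counter(features(d)) for d in docs]
    df = Counter()
    for c in counts:
        df.update(c.keys())  # each document counts once per feature
    n = len(docs)
    return [{t: tf * math.log(n / df[t]) for t, tf in c.items()} for c in counts]


weights = tfidf(["solar cell panel", "solar battery"])
print(weights[0]["solar"])  # 0.0, because "solar" occurs in every document
```

The resulting dictionaries could be turned into sparse vectors and passed to any SVM implementation; that step depends on the library chosen and is outside this sketch.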

Study on the grammatical characteristics and fallacy of translation in the sentences of Donguibogam by Heo Jun - Focused on Tangaekpean(湯液篇) in Donguibogam "東醫寶鑑" - ("동의보감(東醫寶鑑)"에 쓰여진 허준(許浚) 문장(文章)의 문법적(文法的) 특성(特性)과 번역서(飜譯書)의 오류(誤謬) - "탕액편(湯液篇)"을 중심(中心)으로 -)

  • Kim, Yong-Han;Kim, Eun-Ha
    • Journal of Korean Medical Classics / v.24 no.6 / pp.111-124 / 2011
  • The objectives of this study are to examine the grammatical characteristics of the sentences and to identify misinterpretations in the published translations. 1. Sentence characteristics: 1) Grammatical elements such as conjunctions, postpositions, particles, and coverbs are frequently omitted, and the sentences focus on elements that carry practical meaning, such as nouns, pronouns, verbs, and adjectives. 2) Some predicates are omitted in later phrases whose concepts contrast with those of the preceding phrases. 3) Native Korean word order shows through, especially in complements. 2. Translation fallacies: 1) Fallacies occur in sentences in which paratactic conjunctions are omitted, as follows: (1) mistranslation based on confusing an equal (coordinate) relation with a subordinate relation in context, (2) failure to place the sentence break (period) correctly, and (3) misunderstanding an equal relation as a causal relation. 2) Some single phrases expressing a conditional relation were analyzed as multiple phrases in sentences in which the connecting conjunction is omitted. 3) Ellipsis of postpositions obscures the distinction between the modifier and the modified element in some sentences. 4) Some phrases in a causal relation were translated as being in an equal relation because the ellipsis of coverbs was not recognized.

A Study on the Similarity of Compound Nouns and Noun Phrases in Sentences (문장의 복합명사와 명사구의 유사정도에 대한 고찰)

  • 이태영
    • Proceedings of the Korean Society for Information Management Conference / 1999.08a / pp.43-46 / 1999
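  • We studied the degree of similarity between sentences and the identification of similar groups among noun phrases and compound words. Similarity was measured for noun phrases in terms of morpheme substitution and omission, and for sentences in terms of full and partial matches between clauses. Cases with a similarity of 50% or higher were accepted as similar.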
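
The abstract does not give the similarity formula, so the following is only a sketch of how a 50% threshold over morpheme sequences might be applied. It assumes that phrases are already split into morphemes and swaps in the matching ratio of Python's standard `difflib.SequenceMatcher` as the measure, since that ratio drops when morphemes are substituted or omitted; the names `similarity` and `is_similar` are hypothetical.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.5  # the 50% cutoff stated in the abstract


def similarity(morphemes_a, morphemes_b):
    """Matching ratio in [0, 1] between two morpheme sequences."""
    return SequenceMatcher(None, morphemes_a, morphemes_b).ratio()


def is_similar(morphemes_a, morphemes_b, threshold=THRESHOLD):
    """Accept the pair as similar when the ratio reaches the threshold."""
    return similarity(morphemes_a, morphemes_b) >= threshold


full = ["information", "retrieval", "system"]
short = ["information", "system"]  # one morpheme omitted
print(similarity(full, short))     # 0.8
print(is_similar(full, short))     # True
```

The same function could be applied to sequences of clauses to approximate the sentence-level comparison, although the full and partial clause matching in the study may be defined differently.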

Analyzer to Identify Phrases and the Functional Roles in Sentences: Its Architectural Aspects

  • Alam, Yukiko Sasaki
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.67-75 / 2007
  • This paper presents the architectural aspects of a phrase analyzer that attempts to recognize phrases and identify their functional roles in the sentences of formal Japanese documents. Since the object of interest is a phrase, the current system, designed with an object-oriented architecture, contains the Phrase class. It makes use of a linguistic generalization about languages with Case markers: a phrase, whether a noun phrase, a verb phrase, a postposition (or preposition) phrase, or a clause phrase, can be separated into a content component and a function component. The system works without a dictionary, drawing instead on orthographic information about the words to be parsed. It also contains a class that identifies the types of characters, a class representing the grammar, and a class playing the role of a controller. The system has a simple and intuitive structure, externally and internally, and is therefore easy to modify and extend.

Integrated Indexing Method using Compound Noun Segmentation and Noun Phrase Synthesis (복합명사 분할과 명사구 합성을 이용한 통합 색인 기법)

  • Won, Hyung-Suk;Park, Mi-Hwa;Lee, Geun-Bae
    • Journal of KIISE:Software and Applications / v.27 no.1 / pp.84-95 / 2000
  • In this paper, we propose an integrated indexing method that combines compound noun segmentation and noun phrase synthesis. Statistical information is used in the compound noun segmentation, and natural language processing techniques are used in the noun phrase synthesis. First, we choose index terms from simple words using the results of morphological analysis and part-of-speech tagging. Second, noun phrases are automatically synthesized from the syntactic analysis results. If syntactic analysis fails, only the morphological analysis and tagging results are applied. Third, we select compound nouns from the tagging results and then segment and re-synthesize them using statistical information. In this way, segmented and synthesized terms are used together as index terms to supplement the single terms. We demonstrate the effectiveness of the proposed integrated indexing method for Korean compound noun processing using KTSET2.0 and KRIST SET, which are standard test collections for Korean information retrieval.
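
The abstract says only that statistical information drives the compound noun segmentation, so the sketch below swaps in one common technique: choosing the split whose units have the highest product of relative frequencies, found by dynamic programming. The frequency table, the function name `segment`, and the unspaced English example are illustrative assumptions, not the authors' data or algorithm.

```python
import math


def segment(compound, freq):
    """Split `compound` into units from `freq`, maximizing the sum of
    log relative frequencies. Returns [compound] if no split is found."""
    total = sum(freq.values())
    n = len(compound)
    # best[k] holds (score, units) for the best split of compound[:k].
    best = [(-math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(end):
            piece = compound[start:end]
            if piece not in freq or best[start][0] == -math.inf:
                continue
            score = best[start][0] + math.log(freq[piece] / total)
            if score > best[end][0]:
                best[end] = (score, best[start][1] + [piece])
    return best[n][1] or [compound]


# Hypothetical unit frequencies, for illustration only.
freq = {"information": 50, "retrieval": 30, "in": 200, "formation": 10}
units = segment("informationretrieval", freq)
print(units)  # ['information', 'retrieval']

# Segmented units and the whole compound can be indexed together.
index_terms = units + ["informationretrieval"]
```

Because each unit contributes a negative log probability, splits with fewer and more frequent units are preferred, which is why "information" wins over "in" plus "formation" in this example.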

The Acoustic Characteristics of Focus Associated with the Korean Particle '-man' (한국어 특수조사 ‘-만’에 연계된 초점의 음향음성학적 특성)

  • Choe, J.W.;Jeon, Y.S.;C., Y.;Park, S.B.;Kim, K.H.
    • Speech Sciences / v.5 no.2 / pp.77-91 / 1999
  • The purpose of this paper is to investigate the phonetic characteristics of the 'focus' phrases associated with the particle '-man' in Korean. The particle '-man' is a bound morpheme which, like other postpositions such as the subject marker '-ka' and the object marker '-lil', the so-called 'case markers' in Korean, typically attaches to a noun (phrase). The semantics of '-man' roughly corresponds to that of 'only', its counterpart in English, and it is thus classified as a 'delimiter' (Yang 1973). It is assumed in this paper that '-man', like 'only' in English, should have a 'focus' associated with it (von Stechow 1991, Rooth 1992). In general, '-man' attached phrases get the focus, but sometimes the association is not clear-cut, especially in cases of emphatic use of '-man' or when the context strongly favors another phrase as the focus (Choe 1996). In this paper, we compare the phonetic characteristics of the '-man' marked phrases with those to which '-ka'/'-lil' is attached, and conclude that the focused '-man' phrases show higher fundamental frequencies than their equally focused 'case'-marked counterparts. However, when the context clearly forces the focus to fall on phrases other than the '-man' or '-ka'/'-lil' attached ones, there is no meaningful difference in fundamental frequency between the '-man' and '-ka'/'-lil' attached phrases. We also compare the phonetic characteristics of the regular use of '-man' with those of the emphatic '-man'. According to our experiments, the emphatic '-man' does not produce its phonetic effect, namely higher fundamental frequencies, on the '-man' attached words or phrases. Instead, the effect appears in various other ways, such as higher fundamental frequencies in '-man', lengthening of the following word-initial syllable, or the inclusion of the following word in the same accentual phrase. Finally, it is claimed that '-man' associated focus phenomena, especially the emphatic use of '-man', show some typical acoustic characteristics of another well-known focus phenomenon, namely wh-interrogatives.
