• Title/Summary/Keyword: Korean nouns


Disambiguation of Homograph Suffixes using Lexical Semantic Network(U-WIN) (어휘의미망(U-WIN)을 이용한 동형이의어 접미사의 의미 중의성 해소)

  • Bae, Young-Jun;Ock, Cheol-Young
    • KIPS Transactions on Software and Data Engineering / v.1 no.1 / pp.31-42 / 2012
  • To process Korean suffix-derived nouns, most Korean language processing systems register them in a dictionary. This approach is limited, however, because the suffixes are highly productive, so unregistered suffix-derived nouns must be analyzed semantically. In this paper, we propose a method that disambiguates homograph suffixes using the Korean lexical semantic network U-WIN, for the purpose of semantic analysis of suffix-derived nouns. The experiments used 33,104 suffix-derived nouns containing homograph suffixes from the morphologically and semantically tagged Sejong Corpus. We first semantically tagged the homograph suffixes, extracted the root of each suffix-derived noun, and mapped the root to nodes in U-WIN. We then assigned distance weights to the U-WIN nodes that can combine with each homograph suffix and used these weights to disambiguate the suffixes. Of the 49 homograph suffixes in a Korean dictionary, 35 occur in the Sejong Corpus; on these, the method achieved 91.01% accuracy.
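
The distance-weight idea in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy hypernym tree stands in for U-WIN, the sense labels are hypothetical, and the weight 1 / (1 + distance) is one plausible choice, since the abstract does not state the actual weighting formula.

```python
# Toy hypernym tree (child -> parent); a hypothetical stand-in for U-WIN.
PARENT = {
    "person": "entity", "place": "entity",
    "artist": "person", "scholar": "person",
    "building": "place", "region": "place",
}

def ancestors(node):
    """Return the path from node up to the root, node first."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def distance(a, b):
    """Number of edges between a and b through their lowest common ancestor."""
    up_a, up_b = ancestors(a), ancestors(b)
    depth_in_b = {n: i for i, n in enumerate(up_b)}
    for i, n in enumerate(up_a):
        if n in depth_in_b:
            return i + depth_in_b[n]
    raise ValueError("nodes are not in the same tree")

def disambiguate(root_node, sense_nodes):
    """Pick the suffix sense whose known root nodes lie closest to root_node.

    sense_nodes maps each sense label to the nodes observed to combine with it.
    The weight 1 / (1 + distance) is an assumed formula, not the paper's.
    """
    def weight(nodes):
        return max(1.0 / (1 + distance(root_node, n)) for n in nodes)
    return max(sense_nodes, key=lambda sense: weight(sense_nodes[sense]))

# Hypothetical senses of one homograph suffix and the nodes seen with each.
SENSES = {"sense_person": ["artist"], "sense_place": ["building"]}
print(disambiguate("scholar", SENSES))
print(disambiguate("region", SENSES))
```

A root mapped near "person" nodes selects the person sense, and a root mapped near "place" nodes selects the place sense; a real system would derive both the tree and the sense-to-node table from tagged corpus data.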

Korean Unknown-noun Recognition using Strings Following Nouns in Words (명사후문자열을 이용한 미등록어 인식)

  • Park, Ki-Tak;Seo, Young-Hoon
    • The Journal of the Korea Contents Association / v.17 no.4 / pp.576-584 / 2017
  • Unknown nouns, which are not in the dictionary, cause problems not only for morphological analysis but for almost every area of natural language processing. This paper describes a method for recognizing Korean unknown nouns using the strings that follow nouns, such as a postposition, a suffix plus a postposition, or a suffix plus an eomi (ending). We collect and sort words that contain nouns from documents, then divide each word containing an unknown noun into two parts, a candidate noun and the string following it, by finding the same prefix morphemes in more than two unknown words. The final decision on an unknown noun uses information about noun-following strings extracted from the Sejong Corpus. For unknown nouns that occur in more than two forms in news from two portal sites, we obtain 99.64% precision and 99.46% recall.
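
The prefix-splitting step described above can be sketched roughly as follows. This is an illustrative approximation, not the authors' implementation: the set of noun-following strings is a tiny hypothetical sample rather than the list extracted from the Sejong Corpus, the function name and the `min_forms` threshold are assumptions, and the comparison works on characters rather than on analyzed morphemes.

```python
from os.path import commonprefix  # compares strings character by character

# Hypothetical sample of noun-following strings (postpositions and the like).
FOLLOWING_STRINGS = {"은", "는", "이", "가", "을", "를", "에서", "으로"}

def recognize_unknown_nouns(words, following=FOLLOWING_STRINGS, min_forms=3):
    """Return candidate nouns that appear as a shared prefix of several words.

    A prefix is accepted when at least min_forms distinct strings follow it
    and every one of those strings is a known noun-following string.
    """
    unique = sorted(set(words))
    nouns = set()
    # After sorting, words that share a prefix sit next to each other.
    for left, right in zip(unique, unique[1:]):
        prefix = commonprefix([left, right])
        if not prefix:
            continue
        tails = {w[len(prefix):] for w in unique if w.startswith(prefix)}
        tails.discard("")  # a bare noun has no following string to check
        if len(tails) >= min_forms and tails <= following:
            nouns.add(prefix)
    return nouns

print(recognize_unknown_nouns(["챗봇이", "챗봇을", "챗봇에서", "학교에"]))
```

Requiring every tail to be a known noun-following string is what rejects accidental shared prefixes, such as two unrelated words that merely begin with the same syllable.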

Parallels between Korean Verbs and Nouns in Subcategorization (한국어 동사와 명사사이의 하위범주화에 있어서의 평행성)

  • 노용균
    • Language and Information / v.1 / pp.27-65 / 1997
  • Nouns in the Korean language are subcategorized for various frames (called SUBCAT lists) in much the same way as verbs are. Assuming a monostratal grammar and building on analyses of various 'little elements' as clitics, such as those given by No (1991), Chae (1995, 1996), and Oh (1991), I delineate the ranges of SUBCAT lists for Korean verbs and nouns and show that the two word classes have heavily overlapping frames. Twenty-five SUBCAT lists are identified for verbs and twenty-four for nouns, of which twenty-three find associated lexical items in both. By way of justification, I offer analyses of noun-verb collocations in terms of the new five-valued syntactic feature COLLOC along with SUBCAT, which subsume 'light verb' constructions. It is hoped that this work will give clear syntactic underpinnings to those concerned with practical lexicography.

Noun Sense Identification of Korean Nominal Compounds Based on Sentential Form Recovery

  • Yang, Seong-Il;Seo, Young-Ae;Kim, Young-Kil;Ra, Dong-Yul
    • ETRI Journal / v.32 no.5 / pp.740-749 / 2010
  • In a machine translation system, word sense disambiguation has an essential role in the proper translation of words when the target word can be translated differently depending on the context. Previous research on sense identification has mostly focused on adjacent words as context information. Therefore, in the case of nominal compounds, sense tagging of unit nouns mainly depended on other nouns surrounding the target word. In this paper, we present a practical method for the sense tagging of Korean unit nouns in a nominal compound. To overcome the weakness of traditional methods regarding the data sparseness problem, the proposed method adopts complement-predicate relation knowledge that was constructed for machine translation systems. Our method is based on a sentential form recovery technique, which recognizes grammatical relationships between unit nouns. This technique makes use of the characteristics of Korean predicative nouns. To show that our method is effective on text in general domains, the experiments were performed on a test set randomly extracted from article titles in various newspaper sections.

A Compound Term Retrieval Model Using Statistical Information (통계적 정보를 이용한 복합명사 검색 모델)

  • 박영찬;최기선
    • Korean Journal of Cognitive Science / v.6 no.3 / pp.65-81 / 1995
  • Compound nouns, as compositions of multiple nouns, exhibit diverse occurrence patterns in texts and have varying degrees of meaning coherence. The problem of compound nouns in information retrieval is to find a method to represent and identify the compositional patterns of the constituent words. This paper explains how cooccurrence patterns are related to the meaning of each compound noun, and how information about such relations, which can be mechanically acquired from texts, is used in ranking the candidate documents for a given query. The main theme of the paper is that compound nouns can be categorized according to the occurrence patterns of their simple nouns, and that these occurrence patterns can be formalized by statistical analysis without a large dictionary or complex compositional rules. Our suggested model achieved about 7.75% improvement over the best precision of the other methods at each recall level on a Korean test collection.

A Reverse Segmentation Algorithm of Compound Nouns Using Affix Information and Preference Pattern (접사정보 및 선호패턴을 이용한 복합명사의 역방향 분해 알고리즘)

  • Ryu, Bang;Baek, Hyun-Chul;Kim, Sang-Bok
    • Journal of Korea Multimedia Society / v.7 no.3 / pp.418-426 / 2004
  • This paper proposes a reverse segmentation algorithm that uses affix information and preference-pattern information for Korean compound nouns. Korean compound nouns are mostly derived from Chinese characters and exhibit preference patterns, which this paper uses as segmentation rules. To evaluate the accuracy of the proposed algorithm, an experiment was performed with 36,061 compound nouns. The algorithm segmented 99.3% of them correctly and compared favorably with other algorithms in a comparative experiment; in particular, most of the four- and five-syllable compound nouns were segmented successfully.

Exploring the Microscopic Textual Characteristics of Japanese Prime Ministers' Diet Addresses by Measuring the Quantity and Diversity of Nouns

  • Suzuki, Takafumi;Kageura, Kyo
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.459-470 / 2007
  • This study explores the textual characteristics, more precisely the quantity and diversity of nouns, of Japanese prime ministers' Diet addresses. In the field of stylistics, textual characteristics independent of the content have been examined with the aim of detecting the authors, genres, and chronological variations of texts. This study focuses instead on textual characteristics related to the content of texts, namely the quantity and diversity of nouns, because our aim is to analyze texts to better understand two political phenomena: (a) the difference between the two types of Diet addresses delivered by Japanese prime ministers, and (b) the perceived changes made to these addresses by two powerful prime ministers. It is a case study of the microscopic characterization of texts, which has become more and more important with the expansion in the scope of stylistics and the production of a wide variety of new types of texts following the advent of the Web.

Stress Patterns of Compound Nouns in English (영어 복합명사의 강세형)

  • Lee Yeong-Kil
    • MALSORI / no.42 / pp.25-36 / 2001
  • Stress assignment has been much discussed in the literature on English compound nouns. The general view of the stress pattern of English compound nouns is that a main stress falls on the first element and a secondary stress on the second element; however, a stress pattern is often employed that provides counterevidence to the traditional pedagogical approach. A new idea is suggested by Ladd (1984) that 'compound stress represents the deaccenting of the head of the compound.' Recent studies show that initial stressing does not indicate compounds, and syntactic phrases are not always characterized by final stressing. In his pilot test Pennanen comments on the frequent variation of stress patterns on individual items, on the basis of which Bauer confirms Pennanen's results with different informants. This paper is an attempt to justify Bauer's analysis with the same data as Bauer's and different subjects. It turns out that the competences of native-speaker informants do not provide clear-cut answers. Some factors should be taken into account in assigning appropriate stress to compound nouns.

Effective Thematic Words Extraction from a Book using Compound Noun Phrase Synthesis Method

  • Ahn, Hee-Jeong;Kim, Kee-Won;Kim, Seung-Hoon
    • Journal of the Korea Society of Computer and Information / v.22 no.3 / pp.107-113 / 2017
  • Most online bookstores provide users with bibliographic information about books rather than concrete information such as thematic words and atmosphere. Thematic words in particular help a user understand books and cast a wide net. In this paper, we propose an efficient method for extracting thematic words from book text by applying a compound noun and noun phrase synthesis method. Compound nouns represent the characteristics of a book in more detail than single nouns. The proposed method extracts thematic words from book text by recognizing two types of noun phrases, such as a single noun and a compound noun combined with single nouns. The recognized single nouns, compound nouns, and noun phrases are scored with TF-IDF weights and extracted as main words. In addition, this paper suggests a method that counts the frequencies of subject, object, and other roles separately in the TF-IDF calculation, rather than simply summing the frequencies of all nouns. Experiments are carried out in the field of economic management, and the extracted thematic words are verified through a survey and book search. In 9 of the 10 experimental results used in this study, the thematic words extracted by the proposed method were more effective for understanding the content. The thematic words extracted by the proposed method were also confirmed to yield better book search results.
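
One way to read the role-separated frequency idea is sketched below. This is a hedged illustration only: the role weights are hypothetical values (the abstract does not report any), the function name and data layout are assumptions, and the IDF term uses the plain log(N / df) form rather than whatever variant the paper uses.

```python
import math

# Hypothetical role weights; the abstract does not give the values it uses.
ROLE_WEIGHTS = {"subject": 2.0, "object": 1.5, "other": 1.0}

def role_weighted_tfidf(corpus, index, weights=ROLE_WEIGHTS):
    """Score the terms of corpus[index] with a role-weighted term frequency.

    corpus is a list of documents, each a dict {term: {role: count}}.
    """
    n_docs = len(corpus)
    scores = {}
    for term, by_role in corpus[index].items():
        # Weight each grammatical role separately instead of summing raw counts.
        tf = sum(weights[role] * count for role, count in by_role.items())
        # Document frequency is at least 1 because the term occurs in corpus[index].
        df = sum(1 for doc in corpus if term in doc)
        scores[term] = tf * math.log(n_docs / df)
    return scores

# Toy corpus of two "books" with per-role counts for each noun or compound noun.
CORPUS = [
    {"interest rate": {"subject": 2, "object": 1}, "market": {"other": 3}},
    {"market": {"subject": 1}},
]
SCORES = role_weighted_tfidf(CORPUS, 0)
print(max(SCORES, key=SCORES.get))
```

In this toy corpus the compound noun that appears mostly as a subject in one book outranks a noun that appears in every book, which is the behavior the role-separated count is meant to encourage.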

Processing Nominal Suffixes in Korean: Evidence from Priming Experiments

  • Ahn, Hee-Don;An, Duk-Ho;Choi, Jung-Yun;Hwang, Jong-Bai;Jeon, Moon-Gee;Kim, Ji-Hyon
    • Language and Information / v.15 no.1 / pp.1-12 / 2011
  • This study investigates morphologically complex nouns in Korean through a series of priming studies. Two experiments examined whether morphological affixes on Korean nouns were decomposed or processed as a whole. Two types of morphological affixes were examined: morpho-syntactic case markers and the plural marker '-tul'. Results showed that priming occurred for the plural marker with SOAs of 80 ms and 160 ms, but no priming occurred for the morpho-syntactic case markers. These results suggest that the morphological processing of these two types of affixes differs. We argue that Korean nouns with the plural suffix are decomposed into the stem and affix, supporting the Decomposition Model (Pinker & Ullman, 2002). We suggest that while plural markers are truly morphological affixes, case markers in Korean are morpho-syntactic, and thus presuppose the existence of other syntactic elements, such as the matrix verb, hence the lack of priming effects.
