• Title/Summary/Keyword: Noun Phrase

Range Detection of Wa/Kwa Parallel Noun Phrase using a Probabilistic Model and Modification Information (확률모형과 수식정보를 이용한 와/과 병렬사구 범위결정)

  • Choi, Yong-Seok;Shin, Ji-Ae;Choi, Key-Sun
    • Journal of KIISE:Software and Applications / v.35 no.2 / pp.128-136 / 2008
  • Recognition of parallel structure at an early stage of sentence parsing can reduce the complexity of parsing. In this paper, we propose an unsupervised, language-independent probabilistic model for recognition of parallel noun structures. The proposed model is based on the idea of swapping constituents, which relies on the properties of symmetry (two or more identical constituents are repeated) and of reversibility (the order of constituents is interchangeable) in parallel structures. The non-symmetric patterns that cannot be captured by the general symmetry rule are resolved additionally by the modifier information. In particular, this paper shows how the proposed model is applied to recognize Korean parallel noun phrases connected by the "wa/kwa" particle. Our model is compared with other models, including supervised models, and performs better on recognition of parallel noun phrases.

Anaphoric Reference Resolution in Expository Text: The Effects of Ellipsis (설명문의 대용어 참조해결과정: 대용어와 지시사 생략 효과)

  • Lee, Jae-Ho
    • Korean Journal of Cognitive Science / v.21 no.2 / pp.253-282 / 2010
  • Two experiments were conducted to explore the effects of anaphora and demonstrative ellipsis on reference resolution. This study assumed that the two types of ellipsis could be sensitive to the antecedents' saliency: the reverse typicality and mention order of antecedents. The multi-task approach measured the antecedent's activation level and processing load in order to resolve conflicts among theories of anaphoric resolution. In Experiment 1, using ellipsis for anaphora, participants read a series of sentence pairs at their own pace and performed a probe recognition test. The results showed main effects of the antecedent's typicality and mention order in both tasks. In Experiment 2, using a noun phrase without a demonstrative for anaphora, participants read a series of sentence pairs at their own pace and performed a probe recognition test. The results showed a main effect of the mention order of antecedents for the probe recognition task only: the first antecedent was recognized faster than the second one. The results of the two experiments suggest that anaphora type and the antecedent's saliency interact dynamically in reference resolution for Korean.

The Acoustic Characteristics of Focus Associated with the Korean Particle '-man' (한국어 특수조사 ‘-만’에 연계된 초점의 음향음성학적 특성)

  • Choe, J.W.;Jeon, Y.S.;C., Y.;Park, S.B.;Kim, K.H.
    • Speech Sciences / v.5 no.2 / pp.77-91 / 1999
  • The purpose of this paper is to investigate the phonetic characteristics of the 'focus' phrases associated with the particle '-man' in Korean. The particle '-man' is a bound morpheme which, like other postpositions such as the subject marker '-ka' and the object marker '-lil', the so-called 'case markers' in Korean, typically attaches to a noun (phrase). The semantics of '-man' roughly corresponds to that of only, its counterpart in English, and it is thus classified as a 'delimiter' (Yang 1973). It is assumed in this paper that '-man', like only in English, should have a 'focus' associated with it (von Stechow 1991, Rooth 1992). In general, '-man' attached phrases get the focus, but sometimes the association is not clear-cut, especially in cases of emphatic use of '-man' or when the context strongly favors another phrase as the focus (Choe 1996). In this paper, we compare the phonetic characteristics of the '-man' marked phrases with those to which '-ka'/'-lil' is attached, and conclude that the focused '-man' phrases show higher fundamental frequencies than their equally focused 'case'-marked counterparts. However, when the context clearly forces the focus to fall on phrases other than the '-man' or '-ka'/'-lil' attached ones, there is no meaningful difference in fundamental frequency between the '-man' and '-ka'/'-lil' attached phrases. We also compare the phonetic characteristics of the regular use of '-man' with those of the emphatic '-man'. According to our experiments, the emphatic '-man' does not bring forth its phonetic effects, namely higher fundamental frequencies, on the '-man' attached words or phrases, but rather in various other ways, such as higher fundamental frequencies in '-man' itself, lengthening of the following word-initial syllable, or the inclusion of the following word in the same accentual phrase. Finally, it is claimed that '-man' associated focus phenomena, especially the emphatic use of '-man', show some typical acoustic characteristics of another well-known focus phenomenon, namely wh-interrogatives.

Comparison of vowel lengths of articles and monosyllabic nouns in Korean EFL learners' noun phrase production in relation to their English proficiency (한국인 영어학습자의 명사구 발화에서 영어 능숙도에 따른 관사와 단음절 명사 모음 길이 비교)

  • Park, Woojim;Mo, Ranm;Rhee, Seok-Chae
    • Phonetics and Speech Sciences / v.12 no.3 / pp.33-40 / 2020
  • The purpose of this research was to examine the relation between Korean learners' English proficiency and the ratio of the length of the stressed vowel in a monosyllabic noun to that of the unstressed vowel in the article of a noun phrase (e.g., "a cup", "the bus"). Generally, the vowels in monosyllabic content words are phonetically more prominent than those in monosyllabic function words because the former carry phrasal stress, which makes the vowels in content words longer in duration, higher in pitch, and greater in amplitude. This study, based on speech samples from the Korean-Spoken English Corpus (K-SEC) and the Rated Korean-Spoken English Corpus (Rated K-SEC), examined 879 English noun phrases, each composed of an article and a monosyllabic noun, taken from sentences rated on 4 levels of proficiency. The lengths of the vowels in these 879 target NPs were measured, and the ratio of the vowel length in the noun to that in the article was calculated. The higher the proficiency level, the greater the mean ratio of the vowels in nouns to the vowels in articles, confirming the study's hypothesis. This research thus concluded that, for Korean learners of English, the higher the English proficiency level, the more conspicuous the length difference they produce between stressed and unstressed vowels.

A Review of the Opinion Target Extraction using Sequence Labeling Algorithms based on Features Combinations

  • Aziz, Noor Azeera Abdul;Maarof, Mohd Aizaini;Zainal, Anazida;Alkawaz, Mohammed Hazim
    • Journal of Internet Computing and Services / v.17 no.5 / pp.111-119 / 2016
  • In recent years, opinion analysis has become one of the key research fronts across domains. Opinion target extraction is an essential step in opinion analysis. A target usually refers to a noun or noun phrase in an entity that is discussed by the opinion holder. Extracting the opinion target makes opinion analysis more precise and also helps to identify opinion polarity, i.e., users can see the opinion on a target in detail, including all its features. One of the most commonly employed algorithms is a sequence labeling algorithm called Conditional Random Fields. In the present article, recent opinion target extraction approaches based on sequence labeling algorithms and their feature combinations are reviewed by analyzing and comparing these approaches. A good selection of feature combinations tends to give better accuracy. Selecting feature combinations is an essential process that can be used to identify and remove unneeded, irrelevant, and redundant attributes that do not contribute to the accuracy of a predictive model or may in fact decrease it. In general, this review contributes to opinion analysis approaches and, in particular, assists researchers working on opinion target extraction.

An Index System using Restrictive Distance (거리 제한을 이용한 색인 시스템)

  • Park, Chan-Ee;Kim, Sang-Bok
    • Journal of the Korea Society of Computer and Information / v.11 no.1 s.39 / pp.273-282 / 2006
  • In this paper, we propose an indexing method that introduces the concept of distance between words into word weighting. The index terms that most often represent query words and documents are compound nouns, two or more adjacent nouns, or noun phrases, and the method aims to lower the selection ratio of an index term as the distance between its nouns increases. It chooses index-term candidates with an existing weighting method and then makes the final selection according to the distance between candidates within 3 sentences. Applied to 100 kinds of documents, including newspaper articles, scientific papers, and web documents, the method showed accuracy rates of 92.03% for newspaper articles, 95% for scientific papers, and 73.33% for web documents.

A Model-Based Method for Information Alignment: A Case Study on Educational Standards

  • Choi, Namyoun;Song, Il-Yeol;Zhu, Yongjun
    • Journal of Computing Science and Engineering / v.10 no.3 / pp.85-94 / 2016
  • We propose a model-based method for information alignment using educational standards as a case study. Discrepancies and inconsistencies in educational standards across different states/cities hinder the retrieval and sharing of educational resources. Unlike existing educational standards alignment systems that only give binary judgments (either "aligned" or "not-aligned"), our proposed system classifies each pair of educational standard statements into one of seven levels of alignment: Strongly Fully-aligned, Weakly Fully-aligned, Partially-aligned***, Partially-aligned**, Partially-aligned*, Poorly-aligned, and Not-aligned. Such a 7-level categorization extends the notion of binary alignment and provides a finer-grained system for comparing educational standards that can broaden categories of resource discovery and retrieval. This study continues our previous use of mathematics education as a domain, because of its generally unambiguous concepts. We adopt a materialization pattern (MP) model developed in our earlier work to represent each standard statement as a verb-phrase graph and a noun-phrase graph; we align a pair of statements using graph matching based on Bloom's Taxonomy, WordNet, and a taxonomy of mathematics concepts. Our experiments on data sets of mathematics educational standards show that our proposed system can provide alignment results with a high degree of agreement with domain experts' judgments.

Word Network Analysis based on Mutual Information for Ontology of Korean Rural Planning (한국농촌계획 온톨로지 구축을 위한 상호정보 기반 단어연결망 분석)

  • Lee, Jemyung
    • Journal of Korean Society of Rural Planning / v.23 no.3 / pp.37-51 / 2017
  • There has been growing interest in ontology, especially in the recent knowledge-based industry, and defining a field-customized semantic word network is essential for building one. In this paper, a word network for an ontology is established from 785 publications of the Korean Society of Rural Planning (KSRP), from 1995 to 2017. Semantic relationships between words in the publications were quantitatively measured with 'normalized pointwise mutual information', which is based on information theory. Appearance and co-appearance frequencies of nouns and adjectives in phrases are analyzed on the assumption that a 'noun phrase' represents a single 'concept'. The word network of KSRP was compared with that of WordNet™, a worldwide thesaurus network, for verification. The results show that the KSRP word network established in this paper provides words' semantic relationships based on the common concepts of the Korean rural planning research field. With these results, it is expected that the established word network can offer the field of Korean rural planning more opportunities to prepare for the fourth industrial revolution.
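
The normalized pointwise mutual information named in this abstract is commonly defined as NPMI(x, y) = PMI(x, y) / (-log p(x, y)), where PMI(x, y) = log(p(x, y) / (p(x) p(y))). The sketch below is a minimal illustration of that standard definition, not the paper's implementation: the function name `npmi_scores`, the use of one phrase as the co-occurrence window, and the toy phrases are all assumptions made for the example.

```python
import math
from collections import Counter
from itertools import combinations


def npmi_scores(phrases):
    """Return {(w1, w2): NPMI} for word pairs that co-occur in a phrase.

    Assumption: each phrase (a list of words) is one co-occurrence window,
    and probabilities are estimated as the fraction of phrases that contain
    a word or a word pair. Pairs that never co-occur are omitted.
    """
    n = len(phrases)
    word_counts = Counter()
    pair_counts = Counter()
    for phrase in phrases:
        words = sorted(set(phrase))  # count each word once per phrase
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))

    scores = {}
    for (w1, w2), count in pair_counts.items():
        p_xy = count / n
        if p_xy == 1.0:
            # Both words occur in every phrase; NPMI is taken as 1 here
            # to avoid dividing by -log(1) = 0.
            scores[(w1, w2)] = 1.0
            continue
        p_x = word_counts[w1] / n
        p_y = word_counts[w2] / n
        pmi = math.log(p_xy / (p_x * p_y))
        scores[(w1, w2)] = pmi / -math.log(p_xy)
    return scores


# Hypothetical toy phrases, not data from the paper.
phrases = [
    ["rural", "planning"],
    ["rural", "planning"],
    ["rural", "village"],
    ["urban", "policy"],
]
scores = npmi_scores(phrases)
```

NPMI ranges from -1 to 1: in the toy data, "policy" and "urban" only ever appear together and reach the maximum of 1, while "planning" and "rural" score lower because "rural" also appears without "planning".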

Base Noun Phrase Recognition in Korean using Rule-based Learning (규칙 기반 학습에 의한 한국어의 기반 명사구 인식)

  • Yang, Jae-Hyeong
    • Journal of KIISE:Software and Applications / v.27 no.10 / pp.1062-1071 / 2000
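  • We propose a non-statistical, rule-based learning technique for recognizing Korean base noun phrases, that is, non-recursive simple noun phrases. Given a training corpus annotated with initial predictions of base noun phrases and a target corpus in which the correct base noun phrases are marked as tags, rule-based learning first generates candidate rules for correcting base noun phrase tags, using rule templates that express various kinds of grammatical information about neighboring morphemes. From these candidates it then successively acquires a sequence of rules that transform the training corpus to be as close as possible to the target corpus. In experiments with the 150,000-word tree-tagged corpus of the Korean Information Base (국어정보베이스), 386 transformation rules were obtained, and with them a base noun phrase recognition accuracy of over 90% can be achieved.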

Measurement of Document Similarity using Word and Word-Pair Frequencies (단어 및 단어쌍 별 빈도수를 이용한 문서간 유사도 측정)

  • 김혜숙;박상철;김수형
    • Proceedings of the IEEK Conference / 2003.07d / pp.1311-1314 / 2003
  • In this paper, we propose a method to measure document similarity. First, we exploited a single-term method that extracts nouns with a lexical analyzer as a preprocessing step, matching one index to one noun. With this method, document similarity is likely to be inflated even when the documents are unrelated. For this reason, a term-phrase method has been reported, which uses the co-occurrence of two words as an index for measuring document similarity. In this paper, we try another method that combines these two methods to compensate for the weaknesses of each. Six types of features are extracted from two input documents and fed into a neural network to calculate the final document similarity value. The reliability of our method was demonstrated in a document retrieval experiment.
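
The contrast this abstract draws between single-term and word-pair indexing can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' method: the abstract does not specify the six features or the neural network, so the sketch only computes two cosine similarities, one over noun counts and one over adjacent noun pairs. The names `cosine` and `similarity_features` and the toy noun lists are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_features(nouns_a, nouns_b):
    """Return (single-term similarity, word-pair similarity).

    Assumption: each document is already reduced to its nouns in order,
    and a "word pair" is two adjacent nouns in that order.
    """
    terms_a, terms_b = Counter(nouns_a), Counter(nouns_b)
    pairs_a = Counter(zip(nouns_a, nouns_a[1:]))
    pairs_b = Counter(zip(nouns_b, nouns_b[1:]))
    return cosine(terms_a, terms_b), cosine(pairs_a, pairs_b)


# Hypothetical toy documents: same nouns, different order.
features = similarity_features(
    ["noun", "phrase", "parsing"],
    ["phrase", "noun", "parsing"],
)
```

On these toy documents the single-term similarity is at its maximum because both contain the same nouns, while the word-pair similarity is zero because no adjacent pair is shared. This is the kind of disagreement that motivates combining both signals rather than relying on one.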
