• Title/Summary/Keyword: ambiguous words

Representation of ambiguous word in Latent Semantic Analysis (LSA모형에서 다의어 의미의 표상)

  • 이태헌;김청택
    • Korean Journal of Cognitive Science / v.15 no.2 / pp.23-31 / 2004
  • Latent Semantic Analysis (LSA; Landauer & Dumais, 1997) is a technique for representing the meanings of words using co-occurrence information about words appearing in the same context, which is usually a sentence or a document. In LSA, a word is represented as a point in a multidimensional space where each axis represents a context, and a word's meaning is determined by its frequency in each context. The space is reduced by singular value decomposition (SVD). The present study extends LSA to the representation of ambiguous words. The proposed LSA applies a rotation of axes in the document space, which makes it possible to interpret the meaning of un. A simulation study was conducted to illustrate the performance of LSA in representing ambiguous words. In the simulation, the texts containing an ambiguous word were first extracted, and LSA with rotation was performed. By comparing the loading matrix, we categorized the texts according to meaning. The first meaning of an ambiguous word was represented by LSA with the matrix excluding the vectors for the other meaning. The other meanings were represented in the same way. The simulation showed that this way of representing an ambiguous word can identify the meanings of the word. This result suggests that LSA with axis rotation can be applied to the representation of ambiguous words. We discussed how the use of rotation makes it possible to represent multiple meanings of ambiguous words, and how this technique can be applied to web searching.
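
The pipeline described in this abstract (truncated SVD of a term-by-document matrix, rotation of the document axes, grouping texts by their loadings) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the rotation, so varimax is assumed here, and the toy matrix, the function names `lsa_document_loadings` and `varimax`, and the "largest absolute loading" grouping rule are all hypothetical.

```python
import numpy as np


def lsa_document_loadings(term_doc, rank):
    """Truncated SVD of a term-by-document matrix.

    Returns a documents-by-rank matrix of loadings (right singular
    vectors scaled by their singular values).
    """
    _, singular_values, vt = np.linalg.svd(term_doc, full_matrices=False)
    return vt[:rank].T * singular_values[:rank]


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation (an assumed choice of rotation)."""
    n_rows, n_cols = loadings.shape
    rotation = np.eye(n_cols)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / n_rows
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt  # product of orthogonal matrices, so still orthogonal
        current = s.sum()
        if previous != 0.0 and current / previous < 1.0 + tol:
            break
        previous = current
    return loadings @ rotation


# Toy counts for texts that all contain the ambiguous word "bank".
# Rows are terms, columns are documents (made-up data for illustration).
term_doc = np.array([
    [1, 1, 1, 1, 1],  # bank
    [1, 2, 0, 0, 0],  # money
    [1, 0, 1, 0, 0],  # loan
    [0, 0, 0, 1, 2],  # river
    [0, 0, 0, 1, 0],  # water
    [0, 0, 0, 0, 1],  # fish
], dtype=float)

doc_loadings = lsa_document_loadings(term_doc, rank=2)
rotated_loadings = varimax(doc_loadings)

# Hypothetical grouping rule: assign each text to the rotated axis
# on which it has the largest absolute loading.
doc_groups = np.argmax(np.abs(rotated_loadings), axis=1)
print(doc_groups)
```

Because the rotation is orthogonal, it changes how each text's loading is distributed across axes but not the distances between texts; the point of rotating is only to make individual axes easier to read as one meaning each.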

Korean Document Classification Using Extended Vector Space Model (확장된 벡터 공간 모델을 이용한 한국어 문서 분류 방안)

  • Lee, Samuel Sang-Kon
    • The KIPS Transactions:PartB / v.18B no.2 / pp.93-108 / 2011
  • We propose an extended vector space model that uses ambiguous words and disambiguous words to improve the results of a Korean document classification method. In this paper we study how to enhance the precision of the vector space model, and we propose a new axis that represents a weight value. Conventional classification methods without the weight value had problems in vector comparison. After calculating the mutual information value between a term and its classification field, we define a word that has the same axis of the weight value as an ambiguous word. We define a word that resolves the ambiguous meaning as a disambiguous word. We determine the strength of a disambiguous word among the several words that occur with an ambiguous word in the same document. Finally, we propose a new classification method based on extending the vector dimension with ambiguous and disambiguous words.
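
The mutual information step mentioned in this abstract can be illustrated with pointwise mutual information (PMI) between a term and a class, estimated from document counts. This is only a sketch under assumptions: the abstract does not give the exact formula or the criterion for calling a word ambiguous, so the function names `pmi_by_class` and `is_ambiguous`, the `max_spread` threshold, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def pmi_by_class(docs):
    """docs is a list of (class_label, terms) pairs.

    Returns {term: {class_label: PMI}} with probabilities estimated
    from document counts: PMI = log(P(t, c) / (P(t) * P(c))).
    """
    n_docs = len(docs)
    class_counts = Counter(label for label, _ in docs)
    term_counts = Counter()
    joint_counts = Counter()
    for label, terms in docs:
        for term in set(terms):  # count each term once per document
            term_counts[term] += 1
            joint_counts[(term, label)] += 1
    scores = {}
    for (term, label), joint in joint_counts.items():
        pmi = math.log((joint * n_docs) / (term_counts[term] * class_counts[label]))
        scores.setdefault(term, {})[label] = pmi
    return scores


def is_ambiguous(class_scores, n_classes, max_spread=0.5):
    """Hypothetical rule: a term is ambiguous if it occurs in every class
    and its PMI values differ by less than max_spread."""
    if len(class_scores) < n_classes:
        return False
    values = class_scores.values()
    return max(values) - min(values) < max_spread


# Made-up labelled documents for illustration.
docs = [
    ("sports", ["bat", "ball"]),
    ("sports", ["bat", "game"]),
    ("animals", ["bat", "cave"]),
    ("animals", ["bat", "wing"]),
]
pmi_scores = pmi_by_class(docs)
print(pmi_scores["bat"], is_ambiguous(pmi_scores["bat"], n_classes=2))
print(pmi_scores["ball"], is_ambiguous(pmi_scores["ball"], n_classes=2))
```

In this toy data "bat" is equally associated with both classes, so it carries no class information on its own, while "ball" appears only with "sports" and could serve as a cue for resolving "bat".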

Word Sense Classification Using Support Vector Machines (지지벡터기계를 이용한 단어 의미 분류)

  • Park, Jun Hyeok;Lee, Songwook
    • KIPS Transactions on Software and Data Engineering / v.5 no.11 / pp.563-568 / 2016
  • The word sense disambiguation problem is to find the correct sense, in a given sentence, of an ambiguous word that has multiple senses in a dictionary. We regard this problem as a multi-class classification problem and classify the ambiguous word using Support Vector Machines. Context words of the ambiguous word, which are extracted from the Sejong sense-tagged corpus, are represented in two kinds of vector space. One vector space is composed of context word vectors with binary weights. The other vector space has vectors in which the context words are mapped by a word embedding model. In our experiments, we obtained an accuracy of 87.0% with context word vectors and 86.0% with the word embedding model.
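
The first of the two representations, context word vectors with binary weights, can be sketched as below. This covers only the feature construction, under assumed details: the vocabulary ordering, the handling of unseen words, and the names `build_vocabulary` and `binary_context_vector` are hypothetical, and the toy contexts are English stand-ins rather than Sejong corpus data. The resulting vectors would then be passed to a multi-class SVM (for example scikit-learn's `sklearn.svm.SVC`), which is not shown here.

```python
def build_vocabulary(contexts):
    """Map every context word seen in training to a fixed vector index."""
    vocabulary = sorted({word for context in contexts for word in context})
    return {word: index for index, word in enumerate(vocabulary)}


def binary_context_vector(context, vocabulary):
    """Return a 0/1 vector: 1 if the vocabulary word occurs in the context.

    Repeated words still give 1, and words outside the vocabulary are ignored.
    """
    vector = [0] * len(vocabulary)
    for word in context:
        index = vocabulary.get(word)
        if index is not None:
            vector[index] = 1
    return vector


# Made-up contexts around one ambiguous word, one per sense.
training_contexts = [
    ["river", "water"],          # context for one sense
    ["money", "loan", "money"],  # context for another sense
]
vocabulary = build_vocabulary(training_contexts)
features = [binary_context_vector(c, vocabulary) for c in training_contexts]
print(vocabulary)
print(features)
```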

The Prosodic Characteristics of Utterance of Sentences with Ambiguous Word in Patients with Neurogenic Communication Disorders (어휘적 중의성 문장 발화 시 신경언어장애인의 운율 특성)

  • Lee, Myoung-Soon;Kwon, Do-Ha
    • Phonetics and Speech Sciences / v.1 no.1 / pp.87-91 / 2009
  • The purpose of this study was to examine the prosodic characteristics of ambiguous sentences uttered by patients with neurogenic communication disorders. Ambiguous words on which prosody may have an impact were used to investigate this matter. Tone duration, pitch, and intensity were analyzed to examine the characteristics of prosody in patients with lesions in the left or right hemisphere and in normal controls. The whole process was recorded using Praat 4.3.14, and for statistical analysis, two-way ANOVA and multiple comparison analyses were carried out using SPSS 10.0 for Windows. The conclusions of this study are as follows. The length of the vowel in Korean homographs differed depending on the meaning, and vowel duration was longest in patients with lesions in the left hemisphere. This agrees with the finding that these patients have a problem with the timing of prosody (Danly & Shapiro, 1982). On the other hand, patients with lesions in the right hemisphere were found to have a deficiency in pitch variability. Among various acoustic parameters, this study focused on duration, which is closely related to the suprasegmental characteristics of prosody. More acoustic parameters should be taken into account in future studies.

Korean Part-of-Speech Tagging System Using Resolution Rules for Individual Ambiguous Word (어절별 중의성 해소 규칙을 이용한 혼합형 한국어 품사 태깅 시스템)

  • Park, Hee-Geun;Ahn, Young-Min;Seo, Young-Hoon
    • Journal of KIISE:Computing Practices and Letters / v.13 no.6 / pp.427-431 / 2007
  • In this paper we describe a Korean part-of-speech tagging approach that uses resolution rules for individual ambiguous words together with statistical information. Our tagging approach resolves lexical ambiguities by common rules, rules for individual ambiguous words, and a statistical approach. Common rules are rules for idioms and phrases in common use, including phrases composed of main and auxiliary verbs. To enhance tagging accuracy, we built resolution rules for each word that has several distinct morphological analysis results. Each rule may refer to morphemes, morphological tags, and/or word senses of not only the ambiguous word itself but also the words around it. A statistical approach based on an HMM is then applied to ambiguous words that are not resolved by the rules. Experiments show that the part-of-speech tagging approach has high accuracy and broad coverage.
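
The order of resolution described in this abstract (common rules first, then rules for the individual ambiguous word, then a statistical model for whatever is left) can be sketched as a simple cascade. Everything concrete below is an assumption for illustration: the English toy words, the tag names, the rule functions, and `tag_word` are hypothetical, and the statistical stage is a most-frequent-tag lookup standing in for the paper's HMM, which is not implemented here.

```python
def common_fixed_phrase(word, left, right):
    """Hypothetical common rule: tag the first word of the phrase "in fact"."""
    if word == "in" and right == "fact":
        return "ADV"
    return None


def can_after_determiner(left, right):
    """Hypothetical word-specific rule: "can" after a determiner is a noun."""
    return "NOUN" if left in {"a", "the"} else None


def can_after_pronoun(left, right):
    """Hypothetical word-specific rule: "can" after a pronoun is an auxiliary."""
    return "AUX" if left in {"i", "you", "we", "they"} else None


COMMON_RULES = [common_fixed_phrase]
WORD_RULES = {"can": [can_after_determiner, can_after_pronoun]}


def most_frequent_tag(word, left, right):
    """Stand-in for the statistical stage (the paper uses an HMM)."""
    return {"can": "AUX", "in": "ADP"}.get(word, "NOUN")


def tag_word(word, left=None, right=None):
    """Return (tag, stage) from the first stage that resolves the word."""
    for rule in COMMON_RULES:
        tag = rule(word, left, right)
        if tag is not None:
            return tag, "common rule"
    for rule in WORD_RULES.get(word, []):
        tag = rule(left, right)
        if tag is not None:
            return tag, "word rule"
    return most_frequent_tag(word, left, right), "statistical"


print(tag_word("in", right="fact"))
print(tag_word("can", left="the", right="is"))
print(tag_word("can", left="rust", right="spread"))
```

Ordering the stages this way means the statistical model only sees the cases that no hand-written rule covers, which is how a rule set with narrow coverage can still be combined with a model that always produces an answer.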

Effects of orthographic and morphological frequency of a syllable in Korean word recognition (한국어 음절의 표기빈도와 형태소빈도가 단어인지에 미치는 효과)

  • Yi, Kwang-Oh;Bae, Sung-Bong
    • Korean Journal of Cognitive Science / v.20 no.3 / pp.309-333 / 2009
  • Two experiments were conducted to examine the role of Kulja and morpheme in processing two-syllable Sino-Korean words. In Experiment 1, the effects of morphemic frequency were not significant at the initial and final positions of a word while Kulja frequency and Kulja-morpheme correspondence at both positions in a word had a significant impact on the processing of nonwords. Lexical decision times were longer for nonwords with high frequency Kulja and for nonwords with ambiguous Kulja-morpheme correspondence whose Kulja can go with many different morphemes. In Experiment 2 Kulja-morpheme correspondence was examined for words as well as nonwords. Lexical decisions were slower for stimuli with ambiguous Kulja-morpheme correspondence. The effect was more stable for nonwords, which replicated the result of Experiment 1. In sum, the results of this study suggest that words with ambiguous Kulja-morpheme correspondence activate many different morphemes and competition among these morphemic candidates slows down the lexical selection process. Kulja frequency, Kulja neighborhood, morphemic frequency, morphological neighborhood, and Kulja-morpheme correspondence in Korean word recognition were also discussed.

Ontology-based Automated Metadata Generation Considering Semantic Ambiguity (의미 중의성을 고려한 온톨로지 기반 메타데이타의 자동 생성)

  • Choi, Jung-Hwa;Park, Young-Tack
    • Journal of KIISE:Software and Applications / v.33 no.11 / pp.986-998 / 2006
  • There is an increasing need for Semantic Web-based metadata that helps computers efficiently understand and manage the information that grows with the Internet. However, it seems inevitable that some semantically ambiguous information will be encountered when metadata is generated, so a solution to this problem is needed. This paper proposes a new method for automated metadata generation based on the concept of a class, in which ambiguous words embedded in information such as documents are semantically more related to certain other words, using a probability model of consecutive words. We consider ambiguities among the concepts defined in an ontology and use the Hidden Markov Model to recognize the parts of a named entity. First, we construct Markov models for a better understanding of the named entities of each class defined in the ontology. Next, we generate the appropriate context from a text to understand the meaning of a semantically ambiguous word, and we resolve ambiguities during metadata generation by searching for the optimal Markov model corresponding to the sequence of words in the context. We experiment with seven semantically ambiguous words extracted from computer science theses. The experimental results demonstrate successful performance: accuracy improved by about 18% compared with SemTag, which is known as an effective application for assigning a specific meaning to an ambiguous word based on its context.
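
The idea of building one Markov model per ontology class and picking the model that best matches the context word sequence can be sketched as follows. Note the simplification: this uses a plain first-order Markov chain over observed words with add-one smoothing, not a hidden Markov model, and the class labels, training sequences, and the names `MarkovChain` and `best_class` are all hypothetical.

```python
import math
from collections import Counter


class MarkovChain:
    """First-order Markov chain over words with add-one smoothing."""

    def __init__(self, sequences, vocab_size):
        self.vocab_size = vocab_size
        self.pair_counts = Counter()
        self.prev_counts = Counter()
        for sequence in sequences:
            for prev, cur in zip(sequence, sequence[1:]):
                self.pair_counts[(prev, cur)] += 1
                self.prev_counts[prev] += 1

    def log_likelihood(self, sequence):
        """Sum of smoothed log transition probabilities along the sequence."""
        total = 0.0
        for prev, cur in zip(sequence, sequence[1:]):
            numerator = self.pair_counts[(prev, cur)] + 1
            denominator = self.prev_counts[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total


def best_class(models, context):
    """Pick the class whose chain gives the context the highest likelihood."""
    return max(models, key=lambda label: models[label].log_likelihood(context))


# Made-up training sequences for two senses of the ambiguous word "java".
training = {
    "ProgrammingLanguage": [["java", "code", "runs"], ["write", "java", "code"]],
    "Island": [["java", "island", "coast"], ["visit", "java", "island"]],
}
vocab = {word for seqs in training.values() for seq in seqs for word in seq}
models = {label: MarkovChain(seqs, len(vocab)) for label, seqs in training.items()}

print(best_class(models, ["write", "java", "code"]))
print(best_class(models, ["visit", "java", "island"]))
```

Smoothing matters here because a context will usually contain word pairs that one of the class models never saw; without it a single unseen pair would give that class zero probability regardless of the rest of the context.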

Integrated Knowledge Bases of Semantic Networks for Automatic Translation of Ambiguous Words (단어의 자동번역을 위한 의미 네트워크의 통합 지식베이스)

  • Yoo-Jin Moon;Young-Ho Hwang
    • Journal of Information Technology Applications and Management / v.9 no.2 / pp.71-80 / 2002
  • Automatic language translation has advanced greatly due to increased user needs and information retrieval on the WWW. This paper utilizes integrated knowledge bases of noun and verb networks for the automatic translation of ambiguous words in Korean sentences, through the selectional restriction relations in the sentences. It also presents a method for verifying the validity of the Korean noun semantic networks used to construct the selectional restriction relations, by applying the networks to syntactic and semantic properties. Integration of Korean Noun Networks into the SENKOV system will provide accurate and efficient knowledge bases for the semantic analysis of Korean NLP.

Information Structure and Intonation Realization of Ambiguous Sentences with Focus Particle 'Only' (정보구조에 따른 중의적 문장의 억양실현 양상 -초점부사 only를 중심으로-)

  • Kim, So-Hee;Kong, Eun-Jong;Kang, Sun-Mi;Kim, Kee-Ho
    • Speech Sciences / v.8 no.4 / pp.275-288 / 2001
  • Sentences with the same surface word order may be realized with pragmatically different meanings, depending on the contexts in which they appear. Semantically, these meaning differences have been explained in terms of different information structures (Steedman 2000), whereas prosodically, they can be explained in terms of different compositions of intonational components, each of which makes its own semantic contribution (Pierrehumbert and Hirschberg 1990). In other words, the different intonational realizations of sentences with the same word order reflect different information structures. In this paper, we investigate the relationship between information structure and intonational meaning by analysing the production of sentences with ambiguous scopes of the English focus particle 'only'. In contrast to previous quantitative approaches to the scope of the focus particle 'only', two independent levels of information structure (Steedman 2000), theme/rheme and focus/background, make it possible to explain the intonational phenomena consistently.

Morphological Analysis with Adjacency Attributes and Phrase Dictionary (접속 특성과 말마디 사전을 이용한 형태소 분석)

  • Im, Gwon-Muk;Song, Man-Seok
    • The Transactions of the Korea Information Processing Society / v.1 no.1 / pp.129-139 / 1994
  • This paper presents a morphological analysis method for the Korean language. The characteristics and adjacency information of words can be obtained from sentences in a large corpus. Generally, a word can be analyzed into a single result by applying the adjacency attributes and rules. For ambiguous words, however, one result has to be chosen from several. The collected adjacency attributes of morphemes and their relations with neighboring words are recorded in well-designed dictionaries. With this information, abbreviated words as well as ambiguous words can almost always be analyzed successfully. The efficiency of the morphological analyzer depends on the information in the dictionaries. A morpheme dictionary and a phrase dictionary have been designed with a lexical database, and the necessary information extracted from the corpus is stored in the dictionaries.
