• Title/Summary/Keyword: Lexicon


An Optimality Theoretic Approach to the Feature Model for Speech Understanding

  • Kim, Kee-Ho
    • Speech Sciences / v.2 / pp.109-124 / 1997
  • This paper shows how a distinctive feature model can be implemented effectively in speech understanding within the framework of Optimality Theory (OT): how distinctive features can be optimally extracted from given speech signals, and how the optimal segments can be chosen from among plausible candidates. It also shows how the resulting sequence of segments can be matched with the optimal words in a lexicon.


Combinatory Categorial Grammar for Korean

  • Han, Sung-Kook;Park, Chan-Gon
    • Annual Conference on Human and Language Technology / 1990.11a / pp.164-171 / 1990
  • A commutative productive category is proposed as an extension to current CCG for the syntactic analysis of free word order languages such as Korean. Introducing this sort of category is quite natural for a categorial lexicon and functional operations. We present the theoretical basis of the productive category and examine its linguistic applicability through typical syntactic structures of Korean.


Structure Analysis of Multilingual Lexicon (전문용어 대역사전의 구조와 배열에 관한 연구)

  • 김세주
    • Proceedings of the Korean Society for Information Management Conference / 2001.08a / pp.35-40 / 2001
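  • Among terminology dictionaries, we selected bilingual terminology dictionaries, which present translation equivalents or transliterated terms rather than conceptual information, and analyzed their structure and arrangement. The descriptive structure of the elements that make up these dictionaries turned out to be highly varied, and the arrangement methods also differ from one dictionary to another. These characteristics cause considerable inconvenience to dictionary users and make it difficult to meet the demand for standardized electronic dictionaries, so a consistent method of description is required.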


An Automatic Expansion of Sentiment Lexicon by Using Corpus (코퍼스를 이용한 감성 사전 자동 확장)

  • Lee, Kong Joo;Seo, Hyung-Won;Kim, Jae-Hoon
    • Annual Conference on Human and Language Technology / 2010.10a / pp.158-161 / 2010
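  • This study proposes a method that uses a basic sentiment lexicon and a large corpus to automatically extract the expanded sentiment expressions used in a target corpus. The target corpus consists of posts from viewer bulletin boards operated by broadcasters. With this method, we were able to extract the specific sentiment patterns used in the target corpus.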


An Automatic Extraction of English-Korean Bilingual Terms by Using Word-level Presumptive Alignment (단어 단위의 추정 정렬을 통한 영-한 대역어의 자동 추출)

  • Lee, Kong Joo
    • KIPS Transactions on Software and Data Engineering / v.2 no.6 / pp.433-442 / 2013
  • A set of bilingual terms is one of the most important resources for building language-related applications such as machine translation systems and cross-lingual information systems. In this paper, we introduce a new approach that automatically extracts candidate English-Korean bilingual terms by using a bilingual parallel corpus and a basic English-Korean lexicon. The approach can be useful even when the parallel corpus is small. Sentence alignment is first performed on the document-level parallel corpus. Words within each pair of aligned sentences are then aligned by referencing the basic bilingual lexicon. For words that remain unaligned within a pair of aligned sentences, several assumptions are applied in order to align bilingual term candidates across the two languages. Examples of these assumptions include the location of a sentence, the relation between words, and linguistic information about the two languages. An experiment shows approximately 71.7% accuracy for the English-Korean bilingual term candidates automatically extracted from a bilingual parallel corpus of size 1,000.
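The lexicon-referencing step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `align_with_lexicon`, the greedy first-match strategy, and the toy data are all assumptions made for illustration. The sketch links words that the seed lexicon can explain and returns the leftover words, which are the ones the abstract's additional assumptions would then have to handle.

```python
def align_with_lexicon(en_tokens, ko_tokens, lexicon):
    """Greedily link each English token to the first unused Korean token
    that the seed lexicon lists as one of its translations."""
    links = []        # (english_index, korean_index) pairs
    used_ko = set()   # Korean positions that are already linked
    for i, en in enumerate(en_tokens):
        translations = lexicon.get(en.lower(), set())
        for j, ko in enumerate(ko_tokens):
            if j not in used_ko and ko in translations:
                links.append((i, j))
                used_ko.add(j)
                break
    used_en = {i for i, _ in links}
    aligned = [(en_tokens[i], ko_tokens[j]) for i, j in links]
    unaligned_en = [t for i, t in enumerate(en_tokens) if i not in used_en]
    unaligned_ko = [t for j, t in enumerate(ko_tokens) if j not in used_ko]
    return aligned, unaligned_en, unaligned_ko


# Hypothetical toy data: a two-entry seed lexicon and one aligned sentence pair.
seed_lexicon = {"parser": {"파서"}, "lexicon": {"사전", "어휘집"}}
en_sentence = ["The", "parser", "uses", "a", "lexicon"]
ko_sentence = ["파서", "는", "사전", "을", "사용한다"]

aligned, rest_en, rest_ko = align_with_lexicon(en_sentence, ko_sentence, seed_lexicon)
print(aligned)           # word pairs explained by the seed lexicon
print(rest_en, rest_ko)  # leftovers that would need further assumptions
```

Tracking positions rather than word strings keeps repeated tokens from being linked twice to the same target word.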

Enhancing Performance of Bilingual Lexicon Extraction through Refinement of Pivot-Context Vectors (중간언어 문맥벡터의 정제를 통한 이중언어 사전 구축의 성능개선)

  • Kwon, Hong-Seok;Seo, Hyung-Won;Kim, Jae-Hoon
    • Journal of KIISE:Software and Applications / v.41 no.7 / pp.492-500 / 2014
  • This paper presents a performance enhancement of automatic bilingual lexicon extraction by refining pivot-context vectors under the standard pivot-based approach, which is a very effective method for low-resource language pairs. We improve the performance step by step through two refinements of the pivot-context vectors: the first filters out unhelpful elements of the vectors and revises their values using bidirectional translation probabilities estimated by Anymalign, and the second removes non-noun elements from the original vectors. Experiments were conducted in both directions on two language pairs, Korean-Spanish and Korean-French. The results show that, for high-frequency words, our method achieves at least 48.5% at top 1 and up to 88.5% at top 20, and for low-frequency words at least 43.3% at top 1 and up to 48.9% at top 20.
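As a rough illustration of the pivot-based idea, the sketch below compares a source word and candidate target words through context vectors expressed over shared pivot-language words, then ranks the candidates by cosine similarity. Everything here is assumed for illustration: the function names, the `keep` filter (a stand-in for removing non-noun elements), and the toy weights. It does not reproduce the paper's Anymalign-based reweighting.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_k(source_vec, target_vecs, k=1, keep=None):
    """Rank target words by similarity of their pivot-context vectors.

    If `keep` is given, only those pivot words are retained in every
    vector before comparison (a simple form of vector refinement).
    """
    def refine(vec):
        return vec if keep is None else {p: w for p, w in vec.items() if p in keep}

    src = refine(source_vec)
    scored = [(cosine(src, refine(vec)), word) for word, vec in target_vecs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first
    return [word for _, word in scored[:k]]


# Hypothetical toy vectors over English pivot words.
korean_word_vec = {"word": 0.6, "book": 0.3, "the": 0.9}
spanish_vecs = {
    "diccionario": {"word": 0.5, "book": 0.4},
    "el": {"the": 1.0},
}

print(top_k(korean_word_vec, spanish_vecs, k=1))
print(top_k(korean_word_vec, spanish_vecs, k=1, keep={"word", "book"}))
```

In this toy setup the function-word pivot "the" dominates the unrefined vector, so dropping it changes which candidate ranks first.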

Fusion Approach to Targeted Opinion Detection in Blogosphere (블로고스피어에서 주제에 관한 의견을 찾는 융합적 의견탐지방법)

  • Yang, Kiduk
    • Journal of Korean Library and Information Science Society / v.46 no.1 / pp.321-344 / 2015
  • This paper presents a fusion approach to sentiment detection that combines multiple sources of evidence to retrieve blogs that contain opinions on a specific topic. Our approach to finding opinionated blogs on a topic first applies traditional information retrieval methods to retrieve blogs on the given topic and then boosts the ranks of opinionated blogs based on opinion scores computed by multiple sentiment detection methods. The central idea of our sentiment detection strategy is to rely on a variety of complementary evidence rather than to optimize the use of a single source of evidence. The strategy includes a High Frequency module, which identifies opinions based on the frequency of opinion terms (i.e., terms that occur frequently in opinionated documents); a Low Frequency module, which makes use of uncommon or rare terms (e.g., "sooo good") that express strong sentiments; an IU module, which leverages n-grams with IU (I and you) anchor terms (e.g., I believe, You will love); a Wilson's lexicon module, which uses a collection-independent opinion lexicon constructed from Wilson's subjectivity terms; and an Opinion Acronym module, which utilizes a small set of opinion acronyms (e.g., imho). The results of our study show that combining multiple sources of opinion evidence is an effective method for improving opinion detection performance.
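The two-stage idea in this abstract (retrieve on topic, then boost opinionated results) can be sketched as follows. The abstract does not give the fusion formula, so the weighted linear combination, the `boost` parameter, the module keys, and all scores below are assumptions chosen only to make the reranking step concrete.

```python
def fused_score(topic_score, module_scores, weights, boost=0.5):
    """Add a weighted sum of per-module opinion scores to the topical score."""
    opinion = sum(weights[name] * score for name, score in module_scores.items())
    return topic_score + boost * opinion


def rerank(docs, weights, boost=0.5):
    """Return document ids ordered by fused score, highest first."""
    ranked = sorted(
        docs,
        key=lambda d: fused_score(d["topic"], d["modules"], weights, boost),
        reverse=True,
    )
    return [d["id"] for d in ranked]


# Hypothetical scores: blog A is slightly more on-topic, blog B is more opinionated.
weights = {"high_freq": 1 / 3, "low_freq": 1 / 3, "iu": 1 / 3}
blogs = [
    {"id": "A", "topic": 0.80, "modules": {"high_freq": 0.1, "low_freq": 0.0, "iu": 0.0}},
    {"id": "B", "topic": 0.70, "modules": {"high_freq": 0.6, "low_freq": 0.5, "iu": 0.4}},
]

print(rerank(blogs, weights))
print(rerank(blogs, weights, boost=0.0))  # topical ranking only
```

Setting `boost=0.0` recovers the purely topical ranking, which makes it easy to see how much the opinion evidence changes the order.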

The Influence of Negative Emotions on Customer Contribution to Organizational Innovation in an Online Brand Community (온라인 브랜드 커뮤니티 내 부정적 감정들이 기업 혁신을 위한 고객 기여에 미치는 영향)

  • Jung, Suyeon;Lee, Hanjun;Suh, Yongmoo
    • Journal of Internet Computing and Services / v.14 no.4 / pp.91-100 / 2013
  • In recent years, online brand communities, in which firms and customers interact freely, have become an emerging trend, because the customer opinions collected in these communities can help firms innovate effectively. In this study, we examined whether negative emotions in customer opinions influence the adoption of those opinions for organizational innovation. To that end, we first classified negative emotions into five detailed categories: Fear, Anger, Shame, Sadness, and Frustration. We then developed a lexicon for each category of negative emotion using WordNet and SentiWordNet. From 81,543 customer opinions collected from MyStarbucksIdea.com, Starbucks' brand community, we extracted the terms that belong to each lexicon. We conducted an experiment to examine whether the existence, frequency, and strength of negative-emotion terms in each category affect the adoption of customer opinions for organizational innovation. The experiment statistically verified a positive relationship between customer ideas containing negative emotions and their adoption for innovation. In particular, Frustration and Sadness, out of the five emotions, have a significant influence on organizational innovation.
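The three measures named in this abstract (existence, frequency, and strength of lexicon terms per emotion category) can be computed along the following lines. This is only a sketch: the tokenizer, the function name `emotion_features`, the lexicon entries, and the strength values are hypothetical and are not taken from the study's WordNet- or SentiWordNet-derived lexicons.

```python
import re


def emotion_features(text, lexicons):
    """Compute existence, frequency, and summed strength of lexicon terms
    for each emotion category found in a piece of text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    features = {}
    for category, terms in lexicons.items():
        hits = [t for t in tokens if t in terms]
        features[category] = {
            "exists": bool(hits),
            "frequency": len(hits),
            "strength": sum(terms[t] for t in hits),
        }
    return features


# Hypothetical lexicons: term -> strength score (values are made up).
lexicons = {
    "Frustration": {"frustrated": 0.75, "annoying": 0.5},
    "Sadness": {"sad": 0.5},
}
opinion = "I am frustrated that the app is so annoying, really annoying."

for category, values in emotion_features(opinion, lexicons).items():
    print(category, values)
```

Each category yields one small feature record per opinion, which could then feed whatever statistical test is used to relate emotions to adoption.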

Rule Construction for Determination of Thematic Roles by Using Large Corpora and Computational Dictionaries (대규모 말뭉치와 전산 언어 사전을 이용한 의미역 결정 규칙의 구축)

  • Kang, Sin-Jae;Park, Jung-Hye
    • The KIPS Transactions:PartB / v.10B no.2 / pp.219-228 / 2003
  • This paper presents an efficient method for constructing rules that determine thematic roles from syntactic relations in Korean language processing. This process is one of the core tasks of semantic analysis and an important open issue in natural language processing. Describing rules for determining thematic roles using only general linguistic knowledge and experience is problematic, since the final result may differ according to the subjective views of researchers, and it is impossible to construct rules that cover all cases. In contrast, our method is objective and efficient because it draws on large corpora, which contain practical usages of the Korean language, and on case frames in the Sejong Electronic Lexicon of Korean, which is being developed by dozens of Korean linguistic researchers. To determine thematic roles more accurately, our system uses syntactic relations, semantic classes, morpheme information, and the position of double subjects. In particular, using semantic classes increases the applicability of the rules.

Lexical Access in the Bilinguals and the Category-specific Semantic System (이중언어의 어휘접근과 범주 특수적 의미체계)

  • Lee, Seung-Bok;Jung, Hyo-Sun;Jo, Seong-Woo
    • Korean Journal of Cognitive Science / v.21 no.4 / pp.505-534 / 2010
  • The purpose of this study was to compare lexical access and the representation of the semantic system in bilinguals. The participants (late Korean-English bilinguals) performed a word-picture matching task, in which they decided whether a picture presented after a word (a basic-level category) represented the meaning of the Korean (L1) or English (L2) word. The stimuli consisted of common objects belonging to four different categories (animal, part of body, clothes, tool). To control translation strategies, the SOA (stimulus onset asynchrony) was set to 650 ms (Exp. 1) and 200 ms (Exp. 2). In both experiments, RTs were faster in the L1 condition. In the L1 condition, decision times for the part-of-body category were shorter than those for the animal category. In the L2 condition, responses to clothes were faster than responses to tools. These differences in lexical access time imply that the bilingual semantic system seems to be structured by sub-level categories rather than by the super-level distinction between living and non-living things, and that the ways of accessing the bilingual lexicon may differ according to the language.
