• Title/Summary/Keyword: word dictionary

Vocabulary Generation Method by Optical Character Recognition (광학 문자 인식을 통한 단어 정리 방법)

  • Kim, Nam-Gyu;Kim, Dong-Eon;Kim, Seong-Woo;Kwon, Soon-Kak
    • Journal of Korea Multimedia Society / v.18 no.8 / pp.943-949 / 2015
  • Readers often spend considerable time looking up unknown words in a dictionary, on the internet, or in smartphone applications. In this paper, we propose a method to address this drawback. The proposed method builds a vocabulary list by recognizing a word or group of words captured with a smartphone camera. Its features include organizing and editing the captured words, searching the dictionary data using the bisection method, listening to pronunciation through a speech synthesizer, and building and editing the vocabulary stored in a database. A smartphone application for organizing English words was implemented. The proposed method significantly reduces the time needed to organize unknown English words and increases English learning efficiency.
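
The bisection search mentioned in this abstract can be illustrated with a minimal sketch. It assumes the dictionary data is a list of (headword, definition) pairs sorted by lowercase headword; the names `ENTRIES` and `lookup` and the sample entries are hypothetical and are not taken from the paper.

```python
# Hypothetical sample data: (headword, definition) pairs sorted by headword.
ENTRIES = [
    ("apple", "a round fruit"),
    ("banana", "a yellow fruit"),
    ("cherry", "a small red fruit"),
]


def lookup(entries, word):
    """Return the definition of `word`, or None if it is not in `entries`.

    `entries` must be sorted by headword; each step halves the search range.
    """
    key = word.lower()
    lo, hi = 0, len(entries)
    while lo < hi:
        mid = (lo + hi) // 2
        if entries[mid][0] < key:
            lo = mid + 1  # the key can only be in the upper half
        else:
            hi = mid      # the key is at mid or in the lower half
    if lo < len(entries) and entries[lo][0] == key:
        return entries[lo][1]
    return None
```

Because the range is halved on every step, a lookup takes on the order of log2(n) comparisons, which is why a sorted word list suits this kind of search on a phone.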

Emotion Analysis System for Social Media using Sentiment Dictionary including newly created word (신조어 감성사전 기반의 소셜미디어 감성분석 시스템)

  • Shin, Panseop;Oh, Hanmin
    • Proceedings of the Korean Society of Computer Information Conference / 2019.01a / pp.225-226 / 2019
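  • Opinion mining is a technique for extracting and analyzing the sentiment of online documents. Because it can analyze sentiment without a separate opinion poll, it has recently become an active research area. However, social media contains many newly coined words, so existing sentiment analysis systems not only struggle to analyze it accurately but are also ill-suited to analyzing complex sentiment. This study therefore proposes an intuitive sentiment model, builds a sentiment word dictionary that incorporates various neologisms drawing attention on SNS, and applies it to design a sentiment analysis system that analyzes the complex sentiment expressed in social media.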
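
As a rough illustration of dictionary-based analysis with neologisms, the sketch below merges a base lexicon with a neologism lexicon and reports the share of each emotion found in a post. The abstract does not specify the sentiment model, so the emotion labels, the sample words, and the names `BASE_LEXICON`, `NEOLOGISMS`, and `emotion_profile` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical lexicons mapping a word to one emotion label.
BASE_LEXICON = {"happy": "joy", "angry": "anger", "sad": "sadness"}
NEOLOGISMS = {"lol": "joy", "smh": "anger"}  # assumed newly coined words


def emotion_profile(text, lexicon):
    """Return {emotion: share} over the lexicon words found in `text`.

    A post can carry several emotions at once, so the result is a
    distribution rather than a single positive/negative label.
    """
    tokens = text.lower().split()
    counts = Counter(lexicon[t] for t in tokens if t in lexicon)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {emotion: n / total for emotion, n in counts.items()}


# The neologism entries extend the base dictionary.
LEXICON = {**BASE_LEXICON, **NEOLOGISMS}
```

Without the neologism entries, a post such as "lol so happy but smh" would match only one word; with them, it yields a mixed profile of joy and anger.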

Word Sense Distinction of Middle Verbs for Korean Verb Wordnet (한국어 동사의 어휘의미망 구축을 위한 중립동사의 의미분할)

  • Lee, Eunr-Young;Yoon, Ae-Sun
    • Language and Information / v.9 no.2 / pp.23-48 / 2005
  • This study discusses word sense distinction for Korean middle verbs, with the aim of restructuring KorLexVerb 1.0. Despite the duality of their meaning and syntactic structure, the word senses of middle verbs are not clearly distinguished in current dictionaries. This underspecification often causes mismatches in which the same Korean word sense is used for two different English verb senses. A close examination of the syntactic and semantic properties of middle verbs shows that word sense distinction and reconstruction of the hierarchical structure are indispensable. Finally, through this fine-grained word sense distinction, we propose an alternative way of classifying and describing verb polysemy for KorLexVerb 1.0 as well as for dictionary-like language resources.

Pronunciation Dictionary for English Pronunciation Tutoring System (영어 발음교정시스템을 위한 발음사전 구축)

  • Kim Hyosook;Kim Sunju
    • Proceedings of the KSPS conference / 2003.05a / pp.168-171 / 2003
  • This study models the pronunciation dictionary necessary for PLU (phoneme-like unit) level word recognition. Recognizing nonnative speakers' pronunciation enables automatic diagnosis and error detection, which are the core of an English pronunciation tutoring system. Such a system needs two pronunciation dictionaries: one representing standard English pronunciation and the other representing Korean speakers' English pronunciation. Both dictionaries are integrated to generate pronunciation networks for variants.

Weighted Bayesian Automatic Document Categorization Based on Association Word Knowledge Base by Apriori Algorithm (Apriori알고리즘에 의한 연관 단어 지식 베이스에 기반한 가중치가 부여된 베이지안 자동 문서 분류)

  • 고수정;이정현
    • Journal of Korea Multimedia Society / v.4 no.2 / pp.171-181 / 2001
  • Previous Bayesian document categorization methods require considerable time and effort for word clustering and hardly reflect the semantic information between words. In this paper, we propose a weighted Bayesian document categorization method based on an association word knowledge base acquired by a mining technique. The proposed method constructs a weighted association word knowledge base from the documents in the training set. A classifier using Bayesian probability then categorizes documents based on the constructed knowledge base. To evaluate the proposed method, we compare our experimental results with those of three other methods: weighted Bayesian categorization using a vocabulary dictionary built by mutual information, weighted Bayesian categorization, and simple Bayesian categorization. The experimental results show that weighted Bayesian categorization using the association word knowledge base improves performance by 0.87%, 2.77%, and 5.09% over those three methods, respectively.

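A Study of Headwords in Chinese-Korean Dictionaries Using Chinese Corpora and the Internet: Focusing on gu~guang (중국 코퍼스 및 인터넷을 이용한 중한사전의 표제어 연구 - gu~guang을 중심으로)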

  • Park, Yeong-Jong
    • 중국학논총 / no.67 / pp.25-41 / 2020
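  • Opening a Chinese-Korean dictionary, one readily finds that it contains quite a few baffling entries, and that the definitions of some entries are also problematic. This paper examines whether such words deserve inclusion in the dictionary and whether their definitions are correct. To this end, we first select from a Chinese-Korean dictionary those words that occur with low frequency in the modern Chinese corpora provided by the Institute of Applied Linguistics of the Chinese Ministry of Education and by the Center for Chinese Linguistics at Peking University. Given the great effort these two corpora made to collect modern Chinese comprehensively, and accepting them as a scholarly achievement, we can infer that the selected words are likely nonstandard or now rarely used. To verify this inference more precisely, the author also searched Baidu for usage examples of these words and compared the dictionary definitions against actual usage in detail. The comparison shows that many of the words do have various problems: they are either unsuitable for inclusion in the dictionary or require revised definitions.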
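
The first step described in this abstract, selecting headwords that are rare in a corpus, might be sketched as follows. The threshold `max_count`, the function name `low_frequency_headwords`, and the assumption that the corpus is already tokenized are illustrative choices, not details from the paper; the later check against web usage is manual and is not modeled here.

```python
from collections import Counter


def low_frequency_headwords(headwords, corpus_tokens, max_count=1):
    """Return headwords occurring at most `max_count` times in the corpus.

    `corpus_tokens` is assumed to be an iterable of already-segmented words.
    Headwords absent from the corpus count as zero occurrences.
    """
    freq = Counter(corpus_tokens)
    return [w for w in headwords if freq[w] <= max_count]
```

The returned list is only a set of candidates; each word would still need its definition compared against real usage before any decision about removal or revision.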

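A Study of Headwords in Chinese-Korean Dictionaries Using Chinese Corpora and the Internet: Focusing on Part of huan~hui (중국 코퍼스 및 인터넷을 이용한 중한사전의 표제어 연구 - huan~hui일부를 중심으로)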

  • Park, Yeong-Jong
    • 중국학논총 / no.70 / pp.39-60 / 2021
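  • Opening a Chinese-Korean dictionary, one readily finds that it contains quite a few baffling entries, and that the definitions of some entries are also problematic. This paper examines whether such words deserve inclusion in the dictionary and whether their definitions are correct. To this end, we first select from a Chinese-Korean dictionary those words that occur with low frequency in the modern Chinese corpora provided by the Institute of Applied Linguistics of the Chinese Ministry of Education and by the Center for Chinese Linguistics at Peking University. Given the great effort these two corpora made to collect modern Chinese comprehensively, and accepting them as a scholarly achievement, we can infer that the selected words are likely nonstandard or now rarely used. To verify this inference more precisely, the author also searched Baidu for usage examples of these words and compared the dictionary definitions against actual usage in detail. The comparison shows that many of the words do have various problems: they are either unsuitable for inclusion in the dictionary or require revised definitions.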

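A Study of the Suitability of Headwords in Chinese-Korean Dictionaries Using Chinese Corpora and the Internet: Focusing on 'ge~gou' (중국 코퍼스 및 인터넷을 이용한 중한사전 표제어의 적합성 연구 - 'ge~gou'를 중심으로)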

  • Park, Yeong-Jong
    • 중국학논총 / no.61 / pp.1-18 / 2019
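  • Opening a Chinese-Korean dictionary, one readily finds that it contains quite a few baffling entries, and that the definitions of some entries are also problematic. This paper examines whether such words deserve inclusion in the dictionary and whether their definitions are correct. To this end, we first select from a Chinese-Korean dictionary those words that occur with low frequency in the modern Chinese corpora provided by the Institute of Applied Linguistics of the Chinese Ministry of Education and by the Center for Chinese Linguistics at Peking University. Given the great effort these two corpora made to collect modern Chinese comprehensively, and accepting them as a scholarly achievement, we can infer that the selected words are likely nonstandard or now rarely used. To verify this inference more precisely, the author also searched Baidu for usage examples of these words and compared the dictionary definitions against actual usage in detail. The comparison shows that many of the words do have various problems: they are either unsuitable for inclusion in the dictionary or require revised definitions.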

An Automatic Classification System of Official Documents in Middle Schools Using Term Weighting of Titles (제목의 단어 가중치를 이용한 중등학교 공문서 자동분류시스템)

  • Kang, Hyun-Hee;Jin, Min
    • Journal of The Korean Association of Information Education / v.7 no.2 / pp.219-226 / 2003
  • Classifying official documents in schools and educational institutions takes a lot of time. To reduce this overhead, we propose an automatic document classification method that uses word information from document titles. First, meaningful words are extracted from the titles of existing documents, and Inverse Document Frequency (IDF) weights of the words are calculated for each category. A word weight dictionary is then built from these weights. Using the dictionary, each document is automatically classified into the category for which the sum of the weights of its title words is highest. We also evaluate the performance of the proposed method on a real dataset from a middle school.
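
The classification steps in this abstract can be sketched as below. The abstract does not give the exact weighting formula, so this sketch assumes a category-level IDF, `log(C / cf) + 1`, where `C` is the number of categories and `cf` is the number of categories whose titles contain the word. The function names, the whitespace tokenization, and the sample titles are likewise hypothetical.

```python
import math
from collections import defaultdict


def build_weight_dictionary(titled_docs):
    """Build {category: {word: weight}} from (title, category) pairs."""
    words_by_category = defaultdict(set)
    for title, category in titled_docs:
        words_by_category[category].update(title.lower().split())

    # Count in how many categories each word appears.
    category_freq = defaultdict(int)
    for words in words_by_category.values():
        for word in words:
            category_freq[word] += 1

    n_categories = len(words_by_category)
    # Assumed weighting: words seen in fewer categories weigh more.
    return {
        category: {
            word: math.log(n_categories / category_freq[word]) + 1.0
            for word in words
        }
        for category, words in words_by_category.items()
    }


def classify(title, weights):
    """Return the category with the highest summed title-word weight."""
    tokens = title.lower().split()
    scores = {
        category: sum(word_weights.get(t, 0.0) for t in tokens)
        for category, word_weights in weights.items()
    }
    return max(scores, key=scores.get)


# Hypothetical training titles.
TRAIN = [
    ("school meal budget report", "finance"),
    ("annual budget plan", "finance"),
    ("student field trip notice", "activities"),
    ("sports day notice", "activities"),
]
WEIGHTS = build_weight_dictionary(TRAIN)
```

With these sample titles, `classify("field trip notice", WEIGHTS)` selects "activities" because none of its words occur in a "finance" title. Ties are resolved by dictionary order, which a real system would likely handle more deliberately.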
