• Title/Summary/Keyword: Korean Dictionary

A Parser of Definitions in Korean Dictionary based on Probabilistic Grammar Rules (확률적 문법규칙에 기반한 국어사전의 뜻풀이말 구문분석기)

  • Lee, Su Gwang;Ok, Cheol Yeong
    • Journal of KIISE:Software and Applications
    • /
    • v.28 no.5
    • /
    • pp.448-448
    • /
    • 2001
  • Definitions in a Korean dictionary not only describe the meanings of headwords but also carry various kinds of semantic information, such as hypernymy/hyponymy, meronymy/holonymy, polysemy, homonymy, synonymy, antonymy, and semantic features. This paper aims to implement a parser as a basic tool for automatically acquiring this semantic information from dictionary definitions. To this end, we first constructed a part-of-speech tagged corpus and a tree-tagged corpus from the definitions. We then automatically extracted from these corpora, using a statistical method, the frequencies of words that are ambiguous in their part-of-speech tag, together with grammar rules and their probabilities. The parser is a probabilistic chart parser that uses the extracted data. The frequencies of part-of-speech-ambiguous words and the rule probabilities resolve the structural ambiguity of noun phrases during parsing. To reduce the number of nodes generated during parsing and to improve performance, the parser uses grammar factoring, best-first search, and Viterbi search. We experimented with grammar rule probabilities, left-to-right parsing, and left-first search. In these experiments, parsing was most accurate when the parser used grammar rule probabilities and left-first search together, with a recall of 51.74% and a precision of 87.47% on a raw corpus.
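
The idea of choosing the most probable structure for an ambiguous noun phrase can be illustrated with a minimal sketch. It is not the authors' parser: it is a plain Viterbi-style CKY chart over a toy grammar in Chomsky normal form, and the rules, probabilities, and English placeholder words are all assumptions made up for illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy grammar (Chomsky normal form); probabilities are invented.
BINARY_RULES = {
    ("NP", ("N", "N")): 0.4,
    ("NP", ("NP", "N")): 0.4,
    ("NP", ("N", "NP")): 0.2,
}
LEXICAL_RULES = {
    ("N", "korean"): 0.4,
    ("N", "dictionary"): 0.3,
    ("N", "parser"): 0.3,
}

def viterbi_cky(words):
    """Return {label: (log_prob, tree)} with the best parse of the whole input per label."""
    n = len(words)
    best = defaultdict(dict)  # best[(i, j)][label] = (log_prob, tree)
    # Fill in single-word spans from the lexical rules.
    for i, w in enumerate(words):
        for (label, word), p in LEXICAL_RULES.items():
            if word == w:
                best[(i, i + 1)][label] = (math.log(p), (label, w))
    # Combine adjacent spans, keeping only the highest-scoring tree per label.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (parent, (left, right)), p in BINARY_RULES.items():
                    if left in best[(i, k)] and right in best[(k, j)]:
                        lp_left, t_left = best[(i, k)][left]
                        lp_right, t_right = best[(k, j)][right]
                        score = math.log(p) + lp_left + lp_right
                        current = best[(i, j)].get(parent)
                        if current is None or score > current[0]:
                            best[(i, j)][parent] = (score, (parent, t_left, t_right))
    return best[(0, n)]

log_prob, tree = viterbi_cky(["korean", "dictionary", "parser"])["NP"]
print(tree)
```

With these invented probabilities the left-branching analysis wins because the rule `NP -> NP N` is more probable than `NP -> N NP`; changing the rule probabilities changes which bracketing is selected.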

Semi-Automatic Construction of Morphological Pattern Dictionary using the Method of Morphological Synthesis (형태소 합성 기법을 이용한 형태소 패턴 사전의 반자동 구축)

  • Park, In-Cheol
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.12 no.11
    • /
    • pp.5278-5283
    • /
    • 2011
  • One approach to very high-speed Korean morphological analysis is to store pre-built analysis results in a dictionary. Building such a morphological pattern dictionary manually is costly, and the resulting dictionary may contain errors. This paper proposes a method that generates morphological patterns automatically using Korean morphological synthesis. The experiment shows that the method automatically generates 86% of the morphological patterns needed to analyze Korean sentences. A morphological analysis system using these patterns takes 52.68 seconds to analyze a 403MB Korean corpus on a 2.8GHz Windows system.

A New Terminology Classification System for the Open Korean Knowledge Dictionary and Reclassification (개방형 한국어 지식 대사전 전문용어 신분류 체계 설정 및 재분류)

  • Hwang, Humor;Kim, Jung-Hoon
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.64 no.2
    • /
    • pp.214-221
    • /
    • 2015
  • This paper proposes a new classification system with 9 main categories and 56 subcategories for the Open Korean Knowledge Dictionary. It is intended to serve as a standard classification system for effectively managing the vast number of terms published in the Open Korean Knowledge Dictionary, and to update the fifteen-year-old classification system of the Standard Korean Great Dictionary to match trends in modern terminology. The new system, which covers all academic areas, such as the humanities, sociology, politics, science, medicine, agriculture, and engineering, was designed after investigating several existing classification systems. The setup procedure was as follows: (1) the classification system was designed and planned jointly by classification-system experts and academic experts; (2) the design covers all academic areas, following the National Science and Technology Standard Classification System, after investigating several classification systems such as those of the National Research Foundation, the National Science and Technology Standard Act, and the Ministry of Knowledge Economy; (3) a poll and survey collected comments from a total of 93 members of several academic areas; (4) the poll results were reviewed by the working group members and used to update the new terminology classification system. About 200,000 terms in electricity, computer, medicine, pharmacy, biology, and economics were reclassified according to the new terminology classification system.

A Study on Academic Vocabulary Education for Content-Based Korean Language Education: A Basic Study for Online Dictionary Development

  • Hwang, Shung-eun
    • Journal of the Korea Society of Computer and Information
    • /
    • v.25 no.2
    • /
    • pp.67-74
    • /
    • 2020
  • In this paper, we propose developing an online academic vocabulary dictionary as a means of teaching academic vocabulary in content-based Korean language education. Learners encounter a wide range of academic language in the content-based Korean teaching materials they use when studying at university, and they cannot understand or produce academic text without knowing academic vocabulary. One task of Korean language education has therefore become to improve educational efficiency by preparing a method of academic vocabulary instruction that is best suited to the learners themselves. Prior to developing the online academic vocabulary dictionary, this study examined what content the dictionary should contain. An online academic vocabulary dictionary lets students naturally connect their limited instruction inside the classroom with learning outside it, thereby overcoming the limitations of vocabulary education in the classroom and maximizing its educational effectiveness.

Cloning of Korean Morphological Analyzers using Pre-analyzed Eojeol Dictionary and Syllable-based Probabilistic Model (기분석 어절 사전과 음절 단위의 확률 모델을 이용한 한국어 형태소 분석기 복제)

  • Shim, Kwangseob
    • KIISE Transactions on Computing Practices
    • /
    • v.22 no.3
    • /
    • pp.119-126
    • /
    • 2016
  • In this study, we verified the feasibility of a Korean morphological analyzer that uses a pre-analyzed Eojeol dictionary and a syllable-based probabilistic model. For the verification, two Korean morphological analyzers, MACH and KLT2000, were cloned with a pre-analyzed Eojeol dictionary and a syllable-based probabilistic model. The analysis results of the cloned analyzers were compared with those of MACH and KLT2000. The 10-million-Eojeol Sejong corpus was segmented into 10 sets for cross-validation. The 10-fold cross-validated precision and recall were 97.16% and 98.31% for the cloned MACH, and 96.80% and 99.03% for the cloned KLT2000. The analysis speed of the cloned MACH was 308,000 Eojeols per second, and that of the cloned KLT2000 was 436,000 Eojeols per second. The experimental results indicate that a Korean morphological analyzer that uses a pre-analyzed Eojeol dictionary and a syllable-based probabilistic model could be used in practical applications.
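
A rough sketch of the dictionary half of this design follows. It assumes the pre-analyzed dictionary simply stores the most frequent analysis seen for each Eojeol in the output of the analyzer being cloned; the abstract does not spell this out. The training pairs, tag strings, and the `unknown_fallback` placeholder (standing in for the syllable-based probabilistic model, which is not reproduced here) are hypothetical.

```python
from collections import Counter, defaultdict

def build_preanalyzed_dict(tagged_corpus):
    """Map each Eojeol to the analysis it received most often in the training data."""
    counts = defaultdict(Counter)
    for eojeol, analysis in tagged_corpus:
        counts[eojeol][analysis] += 1
    return {e: c.most_common(1)[0][0] for e, c in counts.items()}

def analyze(eojeol, dictionary, fallback):
    """Use the dictionary when the Eojeol is known; otherwise defer to the fallback model."""
    if eojeol in dictionary:
        return dictionary[eojeol]
    return fallback(eojeol)

# Hypothetical (Eojeol, analysis) pairs; the tag strings are illustrative only.
corpus = [
    ("나는", "나/NP+는/JX"),
    ("나는", "나/NP+는/JX"),
    ("나는", "날/VV+는/ETM"),
    ("학교에", "학교/NNG+에/JKB"),
]
dictionary = build_preanalyzed_dict(corpus)

def unknown_fallback(eojeol):
    # Placeholder for the syllable-based probabilistic model (an assumption).
    return eojeol + "/UNK"

print(analyze("나는", dictionary, unknown_fallback))
print(analyze("집에", dictionary, unknown_fallback))
```

A dictionary hit costs a single hash lookup, which is one plausible reason an analyzer of this kind can be fast; only unseen Eojeols reach the slower fallback.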

Korean Semantic Role Labeling Using Case Frame Dictionary and Subcategorization (격틀 사전과 하위 범주 정보를 이용한 한국어 의미역 결정)

  • Kim, Wan-Su;Ock, Cheol-Young
    • Journal of KIISE
    • /
    • v.43 no.12
    • /
    • pp.1376-1384
    • /
    • 2016
  • To process sentences as human beings do, computers need the capability to analyze and process all possibilities of human expression. Linguistic information processing thus forms the initial basis. When analyzing a sentence syntactically, it is necessary to divide the sentence into components, find the obligatory arguments of each predicate, identify the sentence core, and understand the semantic relations between the arguments and predicates. In this study, we applied a case frame dictionary based on The Korean Standard Dictionary of the National Institute of the Korean Language; in addition, for semantic role labeling we used a CRF model with the subcategorization of predicates in the Korean Lexical Semantic Network (UWordMap) as features. Semantic roles were tagged automatically with the CRF model, which used word information, predicates, the case frame dictionary, and hypernyms of words as features. This method performed better than the existing method, with an accuracy of 83.13% compared with 81.2%.

Dynamic Expansion of Semantic Dictionary for Topic Extraction in Automatic Summarization (자동요약의 주제어 추출을 위한 의미사전의 동적 확장)

  • Choo, Kyo-Nam;Woo, Yo-Seob
    • Journal of IKEEE
    • /
    • v.13 no.2
    • /
    • pp.241-247
    • /
    • 2009
  • This paper proposes methods for expanding a semantic dictionary that take Korean semantic features into account. These methods are to be used to extract practical topic words in automatic summarization. The first method constructs a synonym dictionary to improve the performance of semantic-marker analysis. The second extracts probabilistic information from the subcategorization dictionary to resolve syntactic and semantic ambiguity. The third predicts the subcategorization patterns of unregistered predicates in order to resolve affix-derived predicates.

Development of Japanese to Korean Machine Translation System ATOM Using Personal Computer I - Dictionary Construction and Morphological Analysis - (PC를 이용한 일·한 번역 시스템 ATOM의 개발에 관한 연구 ( I ) - 구문해석과 생성과 사전 구성과 형태소 해석을 중심으로 -)

  • Kim, Young-Sum;Kim, Han-Woo;Choi, Byung-Uk
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.25 no.10
    • /
    • pp.1183-1192
    • /
    • 1988
  • In this paper, we describe a morphological dictionary and connection table augmented with heuristic information, and an automatic MUNJEUL separation process based on the least-cost method, for efficient morphological analysis. The composition of connection information and inflected-word information is simplified by interconnecting the conjugation table with the connection tables. As a result, the applicability of the system is increased. The translation dictionary consists of an analysis part and a generation part; its applicability is increased by describing frequently used termination phrases, extracted statistically, as idioms, and by describing procedures directly in the dictionary, for a more efficient analysis process and more natural generation of translated sentences.
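
The abstract does not define its cost function, so the following is only a generic sketch of least-cost segmentation by dynamic programming, not the ATOM system's procedure. The lexicon, its costs, and the placeholder strings are hypothetical, and connection costs between adjacent units (which a connection table would supply) are deliberately left out.

```python
def least_cost_segment(text, lexicon):
    """Split text into lexicon entries so that the summed cost is minimal.

    Returns (units, total_cost), or None if no full segmentation exists.
    """
    inf = float("inf")
    n = len(text)
    best = [inf] * (n + 1)   # best[i]: minimal cost of segmenting text[:i]
    back = [None] * (n + 1)  # back[i]: start index of the last unit in that segmentation
    best[0] = 0
    for end in range(1, n + 1):
        for start in range(end):
            piece = text[start:end]
            if piece in lexicon and best[start] + lexicon[piece] < best[end]:
                best[end] = best[start] + lexicon[piece]
                back[end] = start
    if best[n] == inf:
        return None
    # Walk the back-pointers from the end to recover the chosen units.
    units, end = [], n
    while end > 0:
        units.append(text[back[end]:end])
        end = back[end]
    return units[::-1], best[n]

# Hypothetical lexicon: unit -> cost (lower is preferred).
lexicon = {"a": 3, "b": 3, "c": 2, "ab": 4, "bc": 2, "abc": 9}
print(least_cost_segment("abc", lexicon))
```

Here the candidates cost 8 (`a|b|c`), 6 (`ab|c`), 5 (`a|bc`), and 9 (`abc`), so the sketch selects `a|bc`.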

On the Regulation for Pronunciation of Loanwords in Korean (외래어의 표준 발음과 어문 규범)

  • Yi, Eun-gyeong
    • Cross-Cultural Studies
    • /
    • v.38
    • /
    • pp.405-431
    • /
    • 2015
  • The purpose of this paper is to investigate how the pronunciation of loanwords in Korean should be determined. There has been no regulation governing the pronunciation of loanwords in Korean. Even the dictionary published by the government provides no information about their pronunciation. This paper suggests some practical solutions. Korean has the Regulations of Standard Korean, Korean Orthography, the Regulations on Hangeul Transcriptions on Loanwords, and the Pronunciation Methods of Standard Korean. These language standards can help determine the pronunciation of loanwords. Pronunciations that cannot be regulated by them must be presented in the standard pronunciation dictionary. For example, the glottalization of 's' in many loanwords could be given in the dictionary entry for each loanword. However, the pronunciation of a loanword must stay close to its spelling. If various pronunciations are allowed for a single spelling, people will be confused by the discrepancy between the pronunciation and the spelling of loanwords.

A Study on Exceptional Pronunciations For Automatic Korean Pronunciation Generator (한국어 자동 발음열 생성 시스템을 위한 예외 발음 연구)

  • Kim, Sunhee
    • MALSORI
    • /
    • no.48
    • /
    • pp.57-67
    • /
    • 2003
  • This paper presents a systematic description of exceptional pronunciations for automatic Korean pronunciation generation. An automatic pronunciation generator for Korean is an essential part of a Korean speech recognition system and of a TTS (Text-To-Speech) system. It is composed of a set of regular rules and an exceptional pronunciation dictionary. The exceptional pronunciation dictionary is created by extracting words that have exceptional pronunciations, based on the characteristics of such words identified through phonological research and a systematic analysis of the entries of Korean dictionaries. The method thus contributes to improving the performance of automatic pronunciation generators for Korean, as well as that of Korean speech recognition and TTS systems.
