• Title/Summary/Keyword: Korean Dictionary


Constructing A Korean-English Bilingual Dictionary For Well-formed English Sentence Generations In A Glossary-based System (Glossary에 기초한 시스템에서의 적형태 영어문장 생성을 위한 한영 대역에 전자사전구축)

  • 신효필
    • Korean Journal of Cognitive Science
    • /
    • v.14 no.2
    • /
    • pp.1-13
    • /
    • 2003
  • We introduce a way to generate morphologically and syntactically well-formed English sentences when building a Korean-to-English bilingual dictionary for machine translation systems. Because of traditional dictionary structures, basic inflectional or structural descriptions of English sentences have been shown to be far from sufficient for generating proper English sentences. Furthermore, much research has focused only on how to resolve the semantic ambiguities of words in a bilingual dictionary. To take advantage of an existing paperback Korean-to-English bilingual dictionary, we propose its automatic conversion to an electronic version, together with methodologies that use the dictionary-specific structures to assign proper features to the descriptions for well-formed English sentences with minimal human effort. This approach was originally motivated by a glossary-based machine translation system, but it can also be applied to large-scale dictionary work.


Phoneme distribution and syllable structure of entry words in the CMU English Pronouncing Dictionary

  • Yang, Byunggon
    • Phonetics and Speech Sciences
    • /
    • v.8 no.2
    • /
    • pp.11-16
    • /
    • 2016
  • This study explores the phoneme distribution and syllable structure of entry words in the CMU English Pronouncing Dictionary to provide phoneticians and linguists with fundamental phonetic data on English word components. Entry words in the dictionary file were syllabified using an R script and examined to obtain the following results: First, English words preferred consonants to vowels in their word components. In addition, monophthongs occurred much more frequently than diphthongs. When all consonants were categorized by manner and place, the distribution indicated the frequency order of stops, fricatives, and nasals according to manner and that of alveolars, bilabials and velars according to place. These results were comparable to the results obtained from the Buckeye Corpus (Yang, 2012). Second, from the analysis of syllable structure, two-syllable words were most favored, followed by three- and one-syllable words. Of the words in the dictionary, 92.7% consisted of one, two or three syllables. This result may be related to human memory or decoding time. Third, the English words tended to exhibit discord between onset and coda consonants and between adjacent vowels. Dissimilarity between the last onset and the first coda was found in 93.3% of the syllables, while 91.6% of the adjacent vowels were different. From the results above, the author concludes that an analysis of the phonetic symbols in a dictionary may lead to a deeper understanding of English word structures and components.
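The kind of counting described in this abstract can be sketched in a few lines. The study itself used an R script; the Python below is only an illustrative stand-in. It assumes the usual CMUdict line layout (a word followed by ARPAbet phonemes, with vowels carrying a stress digit), and the four sample entries are typed in by hand for illustration rather than read from the dictionary file. It counts syllables per word and does not attempt full syllabification into onsets and codas.

```python
from collections import Counter

# Hand-typed sample in CMUdict line layout (illustrative, not the real file).
SAMPLE = """\
CAT  K AE1 T
DICTIONARY  D IH1 K SH AH0 N EH2 R IY0
KOREAN  K AO0 R IY1 AH0 N
STRENGTHS  S T R EH1 NG K TH S
"""


def is_vowel(phone):
    # In CMUdict, vowel phonemes end in a stress digit (0, 1, or 2).
    return phone[-1].isdigit()


def syllable_count(phones):
    # One syllable per vowel phoneme.
    return sum(1 for p in phones if is_vowel(p))


def summarize(text):
    """Return (words per syllable count, phoneme counts by vowel/consonant)."""
    syllables = Counter()
    kinds = Counter()
    for line in text.splitlines():
        # Skip blank lines and ';;;' comment lines.
        if not line.strip() or line.startswith(";;;"):
            continue
        _word, *phones = line.split()
        syllables[syllable_count(phones)] += 1
        for p in phones:
            kinds["vowel" if is_vowel(p) else "consonant"] += 1
    return syllables, kinds


syllables, kinds = summarize(SAMPLE)
print(dict(syllables))
print(dict(kinds))
```

Run over the full dictionary file, the same two counters would give the syllable-length distribution and the consonant-to-vowel ratio that the abstract reports.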

Japanese Dictionary Input System in Korean Traditional Reading Rule of Chinese Character (한자음으로 일본어 사전을 검색하는 방법(독음입력법))

  • Jeong, Cheol
    • Annual Conference on Human and Language Technology
    • /
    • 2005.10a
    • /
    • pp.139-144
    • /
    • 2005
  • When a learner of Japanese in Korea looks up a word in a Japanese dictionary, he or she must know the pronunciation of the target word. However, the pronunciation is not easy to determine from a Japanese sentence, because most Japanese sentences show words in Hanja (Chinese characters) rather than Kana (the Japanese syllabary). If the learner knows the traditional Korean pronunciation of the target word's Hanja, he or she can enter the word into an electronic Japanese dictionary using that Korean pronunciation. For this to work, the dictionary service provider must convert each Japanese word to its Korean pronunciation in advance. Once these conversions are set up as an additional search step, the target word can be found through the Korean pronunciation of its Japanese Hanja. This process is possible for three reasons: 1. Korean, Japanese, and Chinese use nearly the same Hanja, with only small differences. 2. Most learners of Japanese in Korea know the Korean pronunciation of the Hanja. 3. The Korean pronunciation of a Hanja is nearly unique; a Hanja generally has one Korean pronunciation.
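A minimal sketch of the lookup idea, assuming a precomputed index from Korean readings to Japanese headwords. The character table `KOREAN_READING`, the entry list `ENTRIES`, and the function names are hypothetical and chosen only for illustration; the paper's actual data and system are not shown in the abstract.

```python
from collections import defaultdict

# Hypothetical table: one Korean reading per character (illustration only).
KOREAN_READING = {
    "学": "학", "校": "교", "先": "선",
    "生": "생", "電": "전", "話": "화",
}

# Hypothetical dictionary entries: (headword, kana reading).
ENTRIES = [
    ("学校", "がっこう"),
    ("先生", "せんせい"),
    ("学生", "がくせい"),
    ("電話", "でんわ"),
]


def korean_key(headword):
    """Korean reading of a headword, or None if a character is missing."""
    readings = [KOREAN_READING.get(ch) for ch in headword]
    if None in readings:
        return None
    return "".join(readings)


def build_index(entries):
    # Done once, in advance, by the dictionary service provider.
    index = defaultdict(list)
    for headword, kana in entries:
        key = korean_key(headword)
        if key is not None:
            index[key].append((headword, kana))
    return index


INDEX = build_index(ENTRIES)


def lookup(korean_input):
    """Find Japanese entries whose Hanja read as korean_input in Korean."""
    return INDEX.get(korean_input, [])


print(lookup("학교"))
print(lookup("학생"))
```

The index maps each reading to a list because the abstract describes the Korean pronunciation as only "nearly" unique, so more than one headword may share a key.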


A Comparative Study of Mathematical Terms in Korean Standard Unabridged Dictionary and the Editing Material (표준국어대사전과 편수자료의 수학 용어 비교 조사)

  • Her, Min
    • Journal for History of Mathematics
    • /
    • v.33 no.4
    • /
    • pp.237-257
    • /
    • 2020
  • In this paper, we classify the mathematical terms in the Korean Standard Unabridged Dictionary into four groups: ① group 1 consists of the terms that coincide with the mathematical terms in the 2015 Editing Material; ② group 2 consists of the terms that are synonyms, old terms, or inflected forms of the mathematical terms in the Editing Material; ③ group 3 consists of the terms that do not belong to group 1 or group 2 but relate to elementary or secondary school mathematics; ④ group 4 consists of the terms that do not relate to elementary or secondary school mathematics. We then compare these with the mathematical terms in the Editing Material. In this study, we identify the mathematical terms that appear in the Editing Material but not in the Korean Standard Unabridged Dictionary. Using the synonyms and old terms of the mathematical terms in the Editing Material, we infer a rough tendency as to which terms belong to the Editing Material. By investigating the terms in groups 3 and 4, we identify mathematical terms that could belong to the Editing Material. We also identify wrong or inconsistent explanations in the Korean Standard Unabridged Dictionary.

Automatic Construction of Korean Unknown Word Dictionary using Occurrence Frequency in Web Documents (웹문서에서의 출현빈도를 이용한 한국어 미등록어 사전 자동 구축)

  • Park, So-Young
    • Journal of the Korea Society of Computer and Information
    • /
    • v.13 no.3
    • /
    • pp.27-33
    • /
    • 2008
  • In this paper, we propose a method of automatically constructing a dictionary by extracting unknown words from given eojeols in order to improve the performance of a Korean morphological analyzer. The proposed method is composed of a dictionary construction phase based on full text analysis and a dictionary construction phase based on web document frequency. The first phase recognizes unknown words from strings that occur repeatedly in a given full text, while the second phase recognizes unknown words from strings that occur only once in the text, based on how frequently each string is retrieved from web documents. Experimental results show that, by utilizing web document frequency, the proposed method improves recall by 32.39% compared with a previous method.
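The two phases can be illustrated with a rough sketch. Everything here is an assumption made for the example, not the paper's algorithm: candidates are taken to be eojeol prefixes, `KNOWN_WORDS` stands in for the analyzer's existing dictionary, `WEB_HITS` stands in for web document frequencies, and the thresholds are arbitrary.

```python
from collections import Counter

KNOWN_WORDS = {"학교", "사전"}            # hypothetical existing dictionary
WEB_HITS = {"챗봇": 1200, "챗봇을": 40}     # stand-in for web frequencies


def prefixes(eojeol, min_len=2):
    # Candidate strings: every prefix of the eojeol of at least min_len.
    return [eojeol[:i] for i in range(min_len, len(eojeol) + 1)]


def extract_unknown_words(text, min_repeat=2, min_web_hits=100):
    eojeols = text.split()
    # Count in how many distinct eojeols each prefix appears.
    counts = Counter(p for e in set(eojeols) for p in prefixes(e))
    found = set()
    for eojeol in eojeols:
        cands = prefixes(eojeol)
        if not cands or any(c in KNOWN_WORDS for c in cands):
            continue  # too short, or already covered by the dictionary
        # Phase 1: longest prefix shared by several eojeols in the text.
        repeated = [c for c in cands if counts[c] >= min_repeat]
        if repeated:
            found.add(max(repeated, key=len))
            continue
        # Phase 2: string seen once in the text; fall back to web counts.
        best = max(cands, key=lambda c: WEB_HITS.get(c, 0))
        if WEB_HITS.get(best, 0) >= min_web_hits:
            found.add(best)
    return found


result = extract_unknown_words("블로그를 블로그에 챗봇을 사전을")
print(sorted(result))
```

In this toy input, one string is picked up in the first phase because it is shared by two eojeols, and another appears only once and is accepted in the second phase through its web count.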


MADE: Morphological Analyzer Development Environment (MADE : 형태소 분석기 개발환경)

  • Shim, Kwang-Seob
    • Journal of Internet Computing and Services
    • /
    • v.8 no.4
    • /
    • pp.159-171
    • /
    • 2007
  • This paper proposes MADE, a software tool for developing a practical Korean morphological analyzer. Morphological analysis is performed using adjacency conditions provided by a morphological dictionary. This means that developing a morphological analyzer reduces to constructing a morphological dictionary. No programming skill is required in this process, and MADE provides useful functions that facilitate the construction of a dictionary. Once a dictionary is constructed, the morphological analysis engine embedded in MADE may be used as a stand-alone morphological analyzer or integrated into application software that requires a Korean morphological analysis module.
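To make the idea of dictionary-driven analysis concrete, here is a toy sketch in which the analyzer's behavior is determined entirely by two tables. The tag set, the `DICTIONARY` entries, and the `ADJACENT` pairs are invented for this example and are not taken from MADE.

```python
# Toy morphological dictionary: surface form -> possible tags (hypothetical).
DICTIONARY = {
    "학교": ["N"],       # noun
    "에": ["J"],         # particle
    "가": ["V", "J"],    # verb stem or particle
    "다": ["E"],         # ending
}

# Adjacency conditions: which tag may follow which (hypothetical).
ADJACENT = {
    ("START", "N"), ("START", "V"),
    ("N", "J"), ("V", "E"),
    ("N", "END"), ("J", "END"), ("E", "END"),
}


def analyze(word, prev="START"):
    """Return every segmentation of word allowed by the two tables."""
    if not word:
        # A complete analysis must be allowed to end after the last tag.
        return [[]] if (prev, "END") in ADJACENT else []
    results = []
    for i in range(1, len(word) + 1):
        surface = word[:i]
        for tag in DICTIONARY.get(surface, []):
            if (prev, tag) in ADJACENT:
                for rest in analyze(word[i:], tag):
                    results.append([(surface, tag)] + rest)
    return results


print(analyze("학교에"))
print(analyze("가다"))
print(analyze("학교가"))
```

Changing which analyses are accepted only requires editing the two tables, which is the sense in which building the analyzer reduces to building the dictionary.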


Construction of an Efficient Pre-analyzed Dictionary for Korean Morphological Analysis (한국어 형태소 분석을 위한 효율적 기분석 사전의 구성 방법)

  • Kwak, Sujeong;Kim, Bogyum;Lee, Jae Sung
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.12
    • /
    • pp.881-888
    • /
    • 2013
  • A pre-analyzed dictionary is used to increase the speed and accuracy of morphological analyzers and to reduce over-generation. However, if the dictionary includes 'insufficiently analyzed word-phrases', whose entries do not include all possible analyses of the word-phrase, analysis accuracy may decrease. In this paper, we use the Sejong corpus to measure how accuracy changes with word-phrase frequency and corpus size. The performance of the integrated system (SMA with the pre-analyzed dictionary) is highest when the sufficient-analysis rate of the pre-analyzed dictionary exceeds 99.82%. In addition, the pre-analyzed dictionary is constructed from word-phrases with a frequency of more than 32 (64) when the corpus size is 1,600,000 (6,300,000) word-phrases.
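A small sketch of frequency-thresholded dictionary construction, under stated assumptions: the corpus is a list of (word-phrase, analysis) pairs, the tag strings follow Sejong-style notation, and the sample data, the threshold, and the `fallback` analyzer are all hypothetical. It is meant only to show the shape of the approach, not the authors' implementation.

```python
from collections import Counter, defaultdict

# Hypothetical analyzed corpus: (word-phrase, analysis) pairs.
CORPUS = [
    ("나는", "나/NP+는/JX"),
    ("나는", "나/NP+는/JX"),
    ("나는", "날/VV+는/ETM"),
    ("학교에", "학교/NNG+에/JKB"),
]


def build_pre_analyzed(corpus, min_freq):
    """Keep word-phrases seen more than min_freq times, with all analyses."""
    freq = Counter(wp for wp, _ in corpus)
    analyses = defaultdict(set)
    for wp, analysis in corpus:
        analyses[wp].add(analysis)
    return {wp: sorted(analyses[wp]) for wp, n in freq.items() if n > min_freq}


def analyze(word_phrase, pre_dict, fallback):
    # Use the stored analyses when available; otherwise run the analyzer.
    if word_phrase in pre_dict:
        return pre_dict[word_phrase]
    return fallback(word_phrase)


def fallback(word_phrase):
    # Placeholder for a real morphological analyzer (assumption).
    return [word_phrase + "/UNK"]


pre_dict = build_pre_analyzed(CORPUS, min_freq=2)
print(pre_dict)
print(analyze("학교에", pre_dict, fallback))
```

Raising the threshold keeps only word-phrases that were observed often enough for their stored analyses to be more likely complete, which is the trade-off the abstract measures.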

Dictionary Learning based Superresolution on 4D Light Field Images (4차원 Light Field 영상에서 Dictionary Learning 기반 초해상도 알고리즘)

  • Lee, Seung-Jae;Park, In Kyu
    • Journal of Broadcast Engineering
    • /
    • v.20 no.5
    • /
    • pp.676-686
    • /
    • 2015
  • A 4D light field image is represented in the traditional 2D spatial domain and an additional 2D angular domain. The 4D light field has limited resolution in both the spatial and angular domains, since 4D signals are captured by a 2D CMOS sensor with limited resolution. In this paper, we propose a dictionary learning-based superresolution algorithm in the 4D light field domain to overcome this resolution limitation. The proposed algorithm performs dictionary learning using a large number of extracted 4D light field patches. Then, a high-resolution light field image is reconstructed from a low-resolution input using the learned dictionary. Specifically, we reconstruct a 4D light field image at double the resolution in both the spatial and angular domains. Experimental results show that the proposed method outperforms the traditional method on test images captured by a commercial light field camera, i.e., Lytro.

Radioisotope identification using sparse representation with dictionary learning approach for an environmental radiation monitoring system

  • Kim, Junhyeok;Lee, Daehee;Kim, Jinhwan;Kim, Giyoon;Hwang, Jisung;Kim, Wonku;Cho, Gyuseong
    • Nuclear Engineering and Technology
    • /
    • v.54 no.3
    • /
    • pp.1037-1048
    • /
    • 2022
  • A radioactive isotope identification algorithm is a prerequisite for a low-resolution scintillation detector applied to an unmanned radiation monitoring system. In this paper, a sparse representation with dictionary learning approach is proposed and applied to plastic gamma-ray spectra. Label-consistent K-SVD was used to learn a discriminative dictionary for the spectra corresponding to a mixture of four isotopes (133Ba, 22Na, 137Cs, and 60Co). A Monte Carlo simulation was employed to produce the simulated data as learning samples. Experimental measurement was conducted to obtain practical spectra. After determining the hyperparameters, two dictionaries tailored to the learning samples were tested while varying the source position and the measurement time. They achieved average accuracies of 97.6% and 98.0% for all testing spectra. The average accuracy of each dictionary was above 96% for spectra measured over 2 s. They also showed acceptable performance when the spectra were artificially shifted. Thus, the proposed method could be useful for identifying radioisotopes in gamma-ray spectra from a plastic scintillation detector even when a dictionary is adapted to only simulated data. Furthermore, owing to the outstanding properties of sparse representation, the proposed approach can easily be built into an in situ monitoring system.

A Study on the Reconstruction of a Frame Based Speech Signal through Dictionary Learning and Adaptive Compressed Sensing (Adaptive Compressed Sensing과 Dictionary Learning을 이용한 프레임 기반 음성신호의 복원에 대한 연구)

  • Jeong, Seongmoon;Lim, Dongmin
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.37A no.12
    • /
    • pp.1122-1132
    • /
    • 2012
  • Compressed sensing has been applied to many fields, such as images, speech signals, and radar. It has been applied mainly to stationary signals, and the reconstruction error can grow as the compression ratio is increased by reducing the number of measurements. To resolve this problem, speech signals are divided into frames and processed in parallel. The frames are made sparse by dictionary learning, and adaptive compressed sensing is applied, which designs the compressed sensing reconstruction matrix adaptively by using the difference between the sparse coefficient vector and its reconstruction. The proposed method shows that fast and accurate reconstruction of non-stationary signals is possible with compressed sensing.