• Title/Summary/Keyword: dictionaries

Study on Efficient Generation of Dictionary for Korean Vocabulary Recognition (한국어 음성인식을 위한 효율적인 사전 구성에 관한 연구)

  • Lee Sang-Bok;Choi Dae-Lim;Kim Chong-Kyo
    • Proceedings of the KSPS conference / 2002.11a / pp.41-44 / 2002
  • This paper concerns improving the speech recognition rate by using an enhanced pronunciation dictionary. Modern large-vocabulary continuous speech recognition systems rely on pronunciation dictionaries. A pronunciation dictionary provides the pronunciation of each word in the vocabulary in phonemic units, which are modeled in detail by the acoustic models. However, most speech recognition systems based on Hidden Markov Models disregard actual pronunciation variations. Without these variations, the phonetic transcriptions in the dictionary do not match the actual occurrences in the database. In this paper, we propose adding a devoicing rule for semivowels, one of the allophone rules, to the pronunciation dictionary. In experiments, a speech recognition system using the resulting dictionary performed better than with existing pronunciation dictionaries.

A study on the Chinese characters originated in Japanese industrial standard (JIS X 0212) (일본공업규격 '정보교환용한자부호-보조한자'에 포함된 일본한자에 대한 연구)

  • 이춘택
    • Journal of Korean Library and Information Science Society / v.19 / pp.59-81 / 1992
  • This study investigates Japanese-made Chinese characters in JIS X 0212-1990 (Code of the Supplementary Japanese Graphic Character Set for Information Interchange). A detailed investigation found that the supplementary set contains 69 Japanese-made Chinese characters. Among them, 29 characters are not listed even in the best-known Chinese character dictionary [대한화사전]. Thirty characters are found in Chinese character dictionaries published in Korea, while 39 characters are not found in any of those dictionaries. The distinctive characteristic of Japanese-made Chinese characters is that they were created to name things, such as fish, birds, and trees, that have no Chinese-made Chinese characters.

Using Core Components to Design Semantic Libraries (코어 컴포넌트 기반 시맨틱 라이브러리의 설계)

  • Jung, Yong-Gyu
    • Journal of the Korean Society for Information Management / v.24 no.3 / pp.83-92 / 2007
  • Semantic libraries can be used to exchange EDI messages by implementing semantic dictionaries. This paper describes the design information that field engineers need to implement a semantic dictionary using metadata. The components of a semantic library are semantic elements, semantic units, and mapping tables. Basic characteristics and design methods related to the implementation are proposed. Metadata semantic dictionaries, including their components and rules, are also introduced.

KNE: An Automatic Dictionary Expansion Method Using Use-cases for Morphological Analysis

  • Nam, Chung-Hyeon;Jang, Kyung-Sik
    • Journal of information and communication convergence engineering / v.17 no.3 / pp.191-197 / 2019
  • Morphological analysis is used for searching sentences and understanding context. Because most morpheme analysis methods are based on predefined dictionaries, a target word that is not registered in the given morpheme dictionary, the so-called unregistered word problem, can be a major cause of reduced performance. The current practical solution to the unregistered word problem is to add such words to the dictionary by hand. This manual approach limits the scalability and expandability of dictionaries. To overcome this limitation, we propose a novel method that automatically expands a dictionary by means of use-case analysis, which checks the validity of an unregistered word by exploring its use-cases through web crawling. The results show that the proposed method is feasible in terms of the accuracy of the validation process, the expandability of the dictionary, and, after registration, the fast extraction time of morphemes.
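
The expansion loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the validation criteria, so the occurrence threshold `min_count`, the function names, and the use of a fixed list of sentences in place of web crawling are all assumptions made for illustration.

```python
def find_unregistered(tokens, dictionary):
    """Return the tokens that are missing from the dictionary, in order."""
    return [t for t in tokens if t not in dictionary]


def validate_by_use_cases(word, use_cases, min_count=2):
    """Accept a word if it occurs as a whole token in at least `min_count`
    use-case sentences. The threshold is a hypothetical criterion; the
    sentences stand in for text that would come from web crawling."""
    hits = sum(1 for sentence in use_cases if word in sentence.split())
    return hits >= min_count


def expand_dictionary(tokens, dictionary, use_cases, min_count=2):
    """Return a new dictionary that also contains the validated words."""
    expanded = set(dictionary)
    for word in find_unregistered(tokens, dictionary):
        if validate_by_use_cases(word, use_cases, min_count):
            expanded.add(word)
    return expanded


# Toy data: "blog" appears in two use-case sentences, "xqz" in only one.
dictionary = {"the", "cat", "sat"}
tokens = ["the", "blog", "sat", "xqz"]
use_cases = ["my blog is new", "read the blog", "xqz"]
expanded = expand_dictionary(tokens, dictionary, use_cases)
```

With this toy data, "blog" is registered and "xqz" is rejected; a real system would need a far more careful validity check than a raw occurrence count.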

A Study on Usage Frequency of Translated English Phrase Using Google Crawling

  • Kim, Kyuseok;Lee, Hyunno;Lim, Jisoo;Lee, Sungmin
    • Proceedings of the Korea Information Processing Society Conference / 2020.11a / pp.689-692 / 2020
  • People have long studied English with online English dictionaries, looking up the meanings of words or example sentences. These days, with the development of AI technologies such as machine learning, documents can be translated in real time with translators such as Kakao, Papago, and Google. However, there are still problems with translation accuracy. AI assistants can be used for real-time interpreting, and such systems are being used to translate web pages and papers into Korean. In this paper, we studied the usage frequency of English phrases combined from dictionaries by analyzing the number of Google search results. We expect the results to help people use English more fluently.

Morphological Analysis with Adjacency Attributes and Phrase Dictionary (접속 특성과 말마디 사전을 이용한 형태소 분석)

  • Im, Gwon-Muk;Song, Man-Seok
    • The Transactions of the Korea Information Processing Society / v.1 no.1 / pp.129-139 / 1994
  • This paper presents a morphological analysis method for the Korean language. The characteristics and adjacency information of words can be obtained from sentences in a large corpus. In general, a word can be analyzed by applying adjacency attributes and rules. For ambiguous words, however, one result must be chosen from several candidates. The adjacency attributes of the collected morphemes and their relations with neighboring words are recorded in well-designed dictionaries. With this information, most abbreviated words and ambiguous words can be analyzed successfully. The efficiency of a morphological analyzer depends on the information in its dictionaries. A morpheme dictionary and a phrase dictionary have been designed using a lexical database, and the necessary information extracted from the corpus is stored in them.

Design and Implementation of a Korean Text to Sign Language Translation System (한국어-수화 번역 시스템 설계)

  • Gwon, Gyeong-Hyeok;U, Yo-Seop;Min, Hong-Gi
    • The Transactions of the Korea Information Processing Society / v.7 no.3 / pp.756-765 / 2000
  • In this paper, a Korean text to sign language translation system is designed and implemented so that hearing-impaired people can learn letters and converse with hearing people. We adopt the direct method of machine translation, which uses morphological analysis and dictionary search, and we define the necessary sign language dictionaries. Based on these processes, the system translates Korean sentences into sign language moving pictures. The proposed dictionaries consist of the basic sign language dictionary, the compound sign language dictionary, and the resemble sign language dictionary. The basic sign language dictionary includes the basic symbols and moving pictures of Korean sign language. The compound sign language dictionary is composed of keywords of basic sign language. In addition, the resemble sign language dictionary offers similar letters. The moving pictures of the retrieved sign symbols are displayed on screen in GIF format as a continuous motion of sign symbols, or are represented by finger spelling based on Korean code analysis. The proposed system provides quick sign language search and compensates for the lack of sign vocabulary in the translation process by using the various sign language dictionaries, which are tailored to Korean sign language. In addition, representing the sign language in GIF format saves storage space for the sign language dictionary.

Analyzing the Effect of Characteristics of Dictionary on the Accuracy of Document Classifiers (용어 사전의 특성이 문서 분류 정확도에 미치는 영향 연구)

  • Jung, Haegang;Kim, Namgyu
    • Management & Information Systems Review / v.37 no.4 / pp.41-62 / 2018
  • As the volume of unstructured data grows through social media, Internet news articles, and blogs, text analysis and research on it are becoming more important. Since text analysis is mostly performed on a specific domain or topic, constructing and applying a domain-specific dictionary has also become more important. The quality of the dictionary has a direct impact on the results of unstructured data analysis, and it matters all the more because the dictionary represents the perspective of the analysis. In the literature, most studies on text analysis have emphasized the importance of dictionaries for obtaining clean, high-quality results. Unfortunately, the effects of dictionaries have not been rigorously verified, even though the dictionary is known to be the most essential factor in text analysis. In this paper, we generate three dictionaries in different ways from 39,800 news articles, and we analyze and verify the effect of each dictionary on the accuracy of document classification by defining the concept of Intrinsic Rate. The three construction methods are: 1) a batch construction method that builds a dictionary based on term frequency over the entire document set; 2) a method that extracts terms by category and integrates them; and 3) a method that extracts features for each category and integrates them. We compared the accuracy of three artificial neural network-based document classifiers to evaluate the quality of the dictionaries. In the experiment, accuracy tended to increase when the Intrinsic Rate was high, which suggests that the accuracy of document classification can be improved by increasing the Intrinsic Rate of the dictionary.
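
The first two construction methods named in this abstract can be sketched as follows. This is only an illustration under stated assumptions: whitespace tokenization, raw term counts as the ranking criterion, and the function names are hypothetical, and the paper's feature-based third method and its Intrinsic Rate measure are not reproduced because the abstract does not define them.

```python
from collections import Counter


def tokenize(text):
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def batch_dictionary(docs, size):
    """Method 1 (sketch): top `size` terms by frequency over all documents."""
    counts = Counter(tok for text, _ in docs for tok in tokenize(text))
    return {term for term, _ in counts.most_common(size)}


def per_category_dictionary(docs, size_per_category):
    """Method 2 (sketch): top terms within each category, then merged."""
    by_category = {}
    for text, category in docs:
        by_category.setdefault(category, Counter()).update(tokenize(text))
    merged = set()
    for counts in by_category.values():
        merged.update(term for term, _ in counts.most_common(size_per_category))
    return merged


# Toy corpus of (text, category) pairs; "economy" has more text than "sports".
docs = [
    ("stock market stock price", "economy"),
    ("market price stock", "economy"),
    ("goal match goal", "sports"),
]
batch = batch_dictionary(docs, 1)
merged = per_category_dictionary(docs, 1)
```

On this toy corpus the batch dictionary contains only "stock", while the per-category dictionary also keeps "goal", which shows how a global frequency cutoff can drop terms from smaller categories.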

A Study on the Structure of Definition in Terminological Dictionaries (전문용어사전의 정의 구조에 관한 연구)

  • 김성진
    • Proceedings of the Korean Society for Information Management Conference / 2000.08a / pp.11-14 / 2000
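  • In a dictionary, the definition is the core part that conveys meaning and promotes understanding, and it varies with the linguistic nature of the definiendum and with the nature of the dictionary. A systematic and consistent definition structure in terminological dictionaries not only helps users understand the entries but also facilitates the construction of thesauri and electronic dictionaries. This study analyzes the definition structure of terminological dictionaries and proposes a way to systematize it.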

A comparative study on the South and North Korean dictionaries (남북한 통합 국어 사전 구축을 위한 비교 연구)

  • 백지원
    • Proceedings of the Korean Society for Information Management Conference / 2000.08a / pp.15-18 / 2000
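  • Taking the South Korean 『표준국어대사전』 and the North Korean 『조선말대사전』 as its subjects, this study analyzes the types of integration for the current South and North Korean language dictionaries and the problems that arise in integrating them, in order to provide basic data for building an integrated South-North Korean language dictionary.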
