• Title/Summary/Keyword: Korean Dictionary

Construction of the Terminology Dictionary for National R&D Information Utilization (국가R&D정보활용을 위한 전문용어사전 구축)

  • Kim, Tae-Hyun;Yang, Myung-Seok;Choi, Kwang-Nam
    • The Journal of the Korea Contents Association / v.19 no.10 / pp.217-225 / 2019
  • National research and development(R&D) information is information generated in the process of performing R&D based on programs and projects issued by national government departments, and includes information from various research fields as ordered by various departments. Therefore, for efficient R&D information retrieval, it is necessary to build a national R&D terminology dictionary that can reflect the characteristics of such national R&D information. In this study, we propose a method for constructing a national R&D terminology dictionary by applying the classification of science and technology standards used to specify the research field in national R&D information. We will discuss the structural characteristics of national R&D project information and the usefulness of the project keyword, and explain the status of national R&D information by the National Standard Science and Technology Classification(NSSTC) Codes and the characteristics of the national R&D terminologies. Based on this, a method for building a national R&D terminology dictionary is defined in terms of the type and structure of the terminology dictionary, preliminary construction procedures, and refining rules. The national R&D terminology dictionary built on the basis of this study can be used in various ways such as expansion of search terms using Korean-English equivalent words and synonyms when searching national R&D information, clarifying the scope of search using NSSTC, and providing user convenience functions using term explanation information.
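
The search-term expansion described in this abstract can be sketched as follows. This is only an illustration: the `TermEntry` fields, the sample entry, and the code `"X01"` are assumptions made for the example, not the dictionary structure or real NSSTC codes from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class TermEntry:
    """One hypothetical dictionary entry (field names are assumptions)."""
    term: str
    english: str
    synonyms: list = field(default_factory=list)
    nsstc_codes: list = field(default_factory=list)  # placeholder codes, not real NSSTC values
    explanation: str = ""


DICTIONARY = {
    "인공지능": TermEntry(
        term="인공지능",
        english="artificial intelligence",
        synonyms=["AI"],
        nsstc_codes=["X01"],
        explanation="Technology that lets machines perform tasks that need human-like intelligence.",
    ),
}


def expand_query(term, dictionary, nsstc_code=None):
    """Return the search term plus its English equivalent and synonyms.

    If nsstc_code is given, expansion only happens when the entry belongs
    to that classification, which narrows the scope of the search.
    """
    entry = dictionary.get(term)
    if entry is None:
        return [term]
    if nsstc_code is not None and nsstc_code not in entry.nsstc_codes:
        return [term]
    expanded = [entry.term, entry.english, *entry.synonyms]
    # Remove duplicates while preserving order.
    return list(dict.fromkeys(expanded))
```

A search front end could pass the expanded list to the retrieval engine as an OR query and show `explanation` as a tooltip next to the matched term.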

A Study on the Standardization of Cartoon and Animation Terminology and Publication of a Terminology Dictionary (만화애니메이션 학술용어 표준화 및 용어사전편찬 연구)

  • Kim, Il-Tae;Sul, Jong-Hoon
    • Cartoon and Animation Studies / s.10 / pp.17-31 / 2006
  • The purpose of this study is to identify the problems caused by the lack of research on rapidly changing digital information media and industry trends, and to suggest solutions. What is needed in Korea is basic academic research in the cartoon and animation field, and the development and systematization of teaching materials. As a first step toward this goal, standardization of cartoon and animation terminology and publication of a terminology dictionary are underway. To reach agreement on terms, discover new terms, standardize terms and their interpretations, and publish a standardized dictionary, experts in cartoon and animation education, industry, and academia from North America, Europe, and Japan have been invited to work together. Through these efforts, a terminology dictionary will be published that not only provides useful information but also solves problems caused by the use of non-standardized terms.

A Semi-automatic Construction method of a Named Entity Dictionary Based on Wikipedia (위키피디아 기반 개체명 사전 반자동 구축 방법)

  • Song, Yeongkil;Jeong, Seokwon;Kim, Harksoo
    • Journal of KIISE / v.42 no.11 / pp.1397-1403 / 2015
  • A named entity (NE) dictionary is an important resource for the performance of NE recognition. However, it is not easy to construct an NE dictionary manually, since human annotation is time-consuming and labor-intensive. To save construction time and reduce human labor, we propose a semi-automatic system for the construction of an NE dictionary. The proposed system constructs a pseudo-document with Wiki-categories per NE class by using an active learning technique. Then, it calculates similarities between Wiki entries and pseudo-documents using the BM25 model, a well-known information retrieval model. Finally, it classifies each Wiki entry into NE classes based on these similarities. In experiments with three different types of NE class sets, the proposed system showed high performance (macro-average F1-score of 0.9028 and micro-average F1-score of 0.9554).
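
The similarity step in this abstract can be illustrated with a minimal Okapi BM25 sketch. Assumptions for the example: each pseudo-document is a plain list of category terms, a Wiki entry is represented by its own category terms, the parameters `k1=1.2` and `b=0.75` are common defaults, and the toy data is invented. The active learning step that builds the pseudo-documents is not shown.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Score every document (a list of terms) against the query with Okapi BM25."""
    n_docs = len(documents)
    avg_len = sum(len(d) for d in documents) / n_docs
    # Document frequency: in how many pseudo-documents each term occurs.
    doc_freq = Counter(term for d in documents for term in set(d))
    scores = []
    for doc in documents:
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # The "+ 1.0" inside the log keeps the IDF non-negative (a common variant).
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def classify_entry(entry_categories, pseudo_docs):
    """Return the NE class whose pseudo-document scores highest for the entry."""
    classes = list(pseudo_docs)
    scores = bm25_scores(entry_categories, [pseudo_docs[c] for c in classes])
    best = max(range(len(classes)), key=scores.__getitem__)
    return classes[best]


# Toy pseudo-documents: category terms collected per NE class (invented data).
pseudo_docs = {
    "PERSON": ["births", "politicians", "singers", "actors", "living", "people"],
    "LOCATION": ["cities", "provinces", "rivers", "mountains", "capitals"],
    "ORGANIZATION": ["companies", "universities", "teams", "agencies", "companies"],
}
```

Calling `classify_entry(["singers", "births"], pseudo_docs)` assigns the entry to the class whose pseudo-document shares the most informative category terms with it. An entry that matches no pseudo-document scores zero everywhere, so a real system would need a rejection threshold.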

Approaches to Creating a Digital Encyclopedia of Korean Archaeology (한국고고학 디지털 사전 구축 방안 연구)

  • LEE Chorong
    • Korean Journal of Heritage: History & Science / v.56 no.2 / pp.28-45 / 2023
  • Although we have entered the era of digital transformation, there is currently no system that efficiently collects, manages, integrates, and services a large number of archaeological digital source materials produced as a result of cultural relics research, i.e., an intelligent integrated management and service platform for archaeological academic information. In this regard, the need to build a digital dictionary of Korean archaeology was confirmed by examining the problem of the Digital Encyclopedia of Korean Archaeology, which is currently available in PDF format on the web, the current status of the publication and use of the Dictionary of Korean Archaeology, and the cases of building digital platforms at home and abroad. Therefore, this paper aims to suggest a general direction for creating a digital encyclopedia of Korean archaeology based on the Dictionary of Korean Archaeology, which includes quality knowledge information, to reconsider the accessibility of archaeological data in conformity with data access limitations. The application of the series Dictionary of Korean Archaeology, published since 2001, and the necessity for digital transformation were examined, as well as the application of data from the archaeological data archiving platforms of Europe, the USA, Japan, and cases of establishing platforms corresponding to specialized encyclopedias from Korea. Based on these, a three-step implementation plan and detailed projects were suggested to create the Digital Encyclopedia of Korean Archaeology. Through this, we proposed the design of metadata for computerized records and the expansion to semantic (meaning-based) data that gives and shows the relationship information between the produced metadata as the implementation tasks to build the Digital Dictionary of Korean Archaeology. It is hoped that such research will help create an integrated intelligent management and service platform for archaeology, raise awareness, and provide a better understanding of Korean archaeology to the general public.

On the statistics of Korean Phonetic Dictionary - Basic Survey to make corpus of Korean Speech DB - (발음사전 표제어중의 음소의 통계적 성질-음성 DB용 단어선정을 위하여-)

  • Lee, Y.J.;Kim, K.T.;Jo, C.W.;Rhee, T.W.
    • Proceedings of the KIEE Conference / 1987.07b / pp.1606-1609 / 1987
  • Statistical information about spoken Korean was obtained by analyzing the Korean phonetic dictionary. This is one of the basic surveys for building a phoneme-balanced corpus for the Korean Speech Data Base (KSDB).
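
As a rough illustration of this kind of survey, the sketch below counts jamo by syllable position over a list of headwords. It assumes the headwords are already written in their pronounced form, and it treats jamo as a stand-in for phonemes, which is a simplification (for example, initial ㅇ is silent). The decomposition uses the standard Unicode arithmetic for precomposed Hangul syllables; the function names are hypothetical.

```python
from collections import Counter

CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"  # 19 initial consonants
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"  # 21 vowels
JONGSEONG = [None] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 27 finals + none

HANGUL_BASE, HANGUL_LAST = 0xAC00, 0xD7A3


def decompose(syllable):
    """Split one precomposed Hangul syllable into (initial, vowel, final)."""
    index = ord(syllable) - HANGUL_BASE
    initial, rest = divmod(index, 21 * 28)
    vowel, final = divmod(rest, 28)
    return CHOSEONG[initial], JUNGSEONG[vowel], JONGSEONG[final]


def jamo_statistics(headwords):
    """Count jamo by position over all headwords; non-Hangul characters are skipped."""
    counts = Counter()
    for word in headwords:
        for ch in word:
            if not HANGUL_BASE <= ord(ch) <= HANGUL_LAST:
                continue
            initial, vowel, final = decompose(ch)
            counts[("initial", initial)] += 1
            counts[("vowel", vowel)] += 1
            if final is not None:
                counts[("final", final)] += 1
    return counts
```

Frequencies obtained this way could then guide word selection, for instance by preferring words that raise the counts of under-represented units.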

Fault-Diagnosis "Dictionary" for Reactor System (리액터 시스템을 위한 고장 진단 사전)

  • 서병설;이수윤
    • Journal of the Korean Institute of Telematics and Electronics / v.17 no.2 / pp.37-52 / 1980
  • Recent industrial processes have become complicated and automated. To improve system reliability and overcome the limits of human ability, the need for alarm analysis and fault diagnosis has grown rapidly. A "dictionary" made by sequence computer programming has been developed as one of the methods for fault diagnosis in chemical industrial processes, and its usefulness has been demonstrated experimentally. It also suggests a way to simplify recent alarm systems, which have become complex.

Design of Big Data Preference Analysis System (빅데이터 선호도 분석 시스템 설계)

  • Son, Sung Il;Park, Chan Khon
    • Journal of Korea Multimedia Society / v.17 no.11 / pp.1286-1295 / 2014
  • This paper proposes a way to improve the reliability of preference estimates derived from user feedback by adding weighting factors to sentiment analysis, and to analyze efficiently the sentiment that users express in the large volume of data generated on Twitter. To address errors in earlier studies, it improves the recall and precision of sentiment determination by using a sentiment dictionary whose polarity is subdivided by sentiment intensity, and by adding slang, new words, emoticons, and idiomatic expressions that are not in the system dictionary. It takes context into account through conjunctive adverbs, reflecting the characteristic of Korean that word order is relatively free. It also applies TF (Term Frequency), RT (Retweet), and Follower counts as weighting factors of preference, and increases the reliability of preference analysis by giving weight to 'a very emotional tweet', 'a tweet recognised by users', and 'an influential Twitter user'.
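
The idea of weighting sentiment by term frequency, retweets, and followers can be sketched as below. Everything specific here is an assumption for illustration: the tiny graded dictionary, the logarithmic weight formula, and the function names are not taken from the paper.

```python
import math

# Hypothetical graded sentiment dictionary: polarity from -2 to 2 by intensity.
# Slang and emoticons sit alongside ordinary words, as a user dictionary would allow.
SENTIMENT = {"love": 2, "good": 1, "lol": 1, ":)": 1, "meh": -1, "hate": -2}


def tweet_polarity(tokens):
    """Sum graded polarities; a repeated sentiment term counts each time (term frequency)."""
    return sum(SENTIMENT.get(token.lower(), 0) for token in tokens)


def tweet_weight(retweets, followers):
    """Illustrative weight: log scaling so a few viral tweets do not dominate."""
    return 1.0 + math.log1p(retweets) + math.log10(1 + followers)


def preference(tweets):
    """Weighted mean polarity over tweets given as (tokens, retweets, followers)."""
    total_weight = sum(tweet_weight(rt, fo) for _, rt, fo in tweets)
    if total_weight == 0:
        return 0.0
    weighted = sum(tweet_polarity(tok) * tweet_weight(rt, fo) for tok, rt, fo in tweets)
    return weighted / total_weight
```

With this weighting, a positive tweet from an account with many followers and retweets outweighs a negative tweet with no reach, so the overall preference stays positive.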

Korean Word Sense Disambiguation using Dictionary and Corpus (사전과 말뭉치를 이용한 한국어 단어 중의성 해소)

  • Jeong, Hanjo;Park, Byeonghwa
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.1-13 / 2015
  • As opinion mining in big data applications has gained attention, a great deal of research on unstructured data has been carried out. Social media on the Internet generate unstructured or semi-structured data every second, usually written in the natural languages we use in daily life. Many words in human languages have multiple meanings or senses. As a result, it is very difficult for computers to extract useful information from these datasets. Traditional web search engines are usually based on keyword search, which can produce search results that are far from users' intentions. Even though a lot of progress has been made in recent years in enhancing the performance of search engines so that they provide users with appropriate results, there is still much room for improvement. Word sense disambiguation can play a very important role in natural language processing and is considered one of the most difficult problems in this area. Major approaches to word sense disambiguation can be classified as knowledge-based, supervised corpus-based, and unsupervised corpus-based approaches. This paper presents a method that automatically generates a corpus for word sense disambiguation by taking advantage of examples in existing dictionaries, avoiding expensive sense-tagging processes. It evaluates the effectiveness of the method with the Naïve Bayes model, a supervised learning algorithm, using the Korean standard unabridged dictionary and the Sejong Corpus. The Korean standard unabridged dictionary has approximately 57,000 sentences. The Sejong Corpus has about 790,000 sentences tagged with both part-of-speech and senses. In the experiments of this study, the Korean standard unabridged dictionary and the Sejong Corpus were evaluated both in combination and as separate entities using cross-validation. Only nouns, the target subjects in word sense disambiguation, were selected. 93,522 word senses among 265,655 nouns and 56,914 sentences from related proverbs and examples were additionally combined in the corpus. The Sejong Corpus was easily merged with the Korean standard unabridged dictionary because the Sejong Corpus was tagged based on sense indices defined by the Korean standard unabridged dictionary. Sense vectors were formed after the merged corpus was created. Terms used in creating sense vectors were added to the named entity dictionary of a Korean morphological analyzer. By using the extended named entity dictionary, terms were extracted from the input sentences and then term vectors for the sentences were created. Given the extracted term vector and the sense vector model made during the pre-processing stage, the sense-tagged terms were determined by vector space model based word sense disambiguation. In addition, this study shows the effectiveness of the merged corpus built from examples in the Korean standard unabridged dictionary and the Sejong Corpus. The experiment shows that better precision and recall are obtained with the merged corpus. This study suggests that the method can practically enhance the performance of Internet search engines and help to capture the meaning of a sentence more accurately in natural language processing tasks pertinent to search engines, opinion mining, and text mining. The Naïve Bayes classifier used in this study is a supervised learning algorithm based on Bayes' theorem. The Naïve Bayes classifier assumes that all senses are independent. Even though this assumption is not realistic and ignores the correlation between attributes, the Naïve Bayes classifier is widely used because of its simplicity, and in practice it is known to be very effective in many applications such as text classification and medical diagnosis. However, further research needs to be carried out to consider all possible combinations and/or partial combinations of all senses in a sentence. Also, the effectiveness of word sense disambiguation may be improved if rhetorical structures or morphological dependencies between words are analyzed through syntactic analysis.
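
A minimal sketch of Naïve Bayes word sense disambiguation trained on dictionary-style example sentences is shown below. It is an illustration under assumptions: the sense labels, the toy examples for the ambiguous noun 배, and the add-one smoothing are chosen for the example and are not the paper's data or exact model.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesWSD:
    """Multinomial Naive Bayes over context words, with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.sense_counts = Counter()            # number of examples per sense
        self.word_counts = defaultdict(Counter)  # context word counts per sense
        self.vocab = set()

    def train(self, examples):
        """examples: iterable of (sense_label, list_of_context_words)."""
        for sense, context in examples:
            self.sense_counts[sense] += 1
            self.word_counts[sense].update(context)
            self.vocab.update(context)

    def predict(self, context):
        """Return the sense with the highest posterior log-probability."""
        total = sum(self.sense_counts.values())
        best_sense, best_logp = None, -math.inf
        for sense, n in self.sense_counts.items():
            logp = math.log(n / total)
            denom = sum(self.word_counts[sense].values()) + self.alpha * len(self.vocab)
            for word in context:
                logp += math.log((self.word_counts[sense][word] + self.alpha) / denom)
            if logp > best_logp:
                best_sense, best_logp = sense, logp
        return best_sense


# Toy sense-tagged examples for the noun 배 (invented; labels are hypothetical).
examples = [
    ("배/fruit", ["먹다", "달다", "과일"]),
    ("배/ship", ["타다", "바다", "항구"]),
    ("배/belly", ["아프다", "고프다"]),
]
model = NaiveBayesWSD()
model.train(examples)
```

Here `model.predict(["바다", "타다"])` selects the ship sense because both context words occur only in that sense's examples. In the setting described above, the training pairs would come from dictionary examples and a sense-tagged corpus rather than hand-written lists.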

Constructing a Korean Subcategorization Dictionary with Semantic Roles using Thesaurus and Predicate Patterns (시소러스와 술어 패턴을 이용한 의미역 부착 한국어 하위범주화 사전의 구축)

  • Yang, Seung-Hyun;Kim, Young-Sum;Woo, Yo-Sub;Yoon, Deok-Ho
    • Journal of KIISE:Computing Practices and Letters / v.6 no.3 / pp.364-372 / 2000
  • Subcategorization, which defines the dependency relations between predicates and their complements, is an important source of knowledge for resolving syntactic and semantic ambiguities that arise in analyzing sentences. This paper describes a Korean subcategorization dictionary annotated with semantic roles of complements, coupled with a thesaural semantic hierarchy as well as syntactic dependencies. For annotating roles, we defined 25 semantic roles associated with surface case markers that can be used to derive semantic structures directly from syntactic ones. In addition, we used more than 120,000 thesaurus entries to specify concept markers of noun complements, and also used 47 and 17 predicate patterns for verbs and adjectives, respectively, to express dependency relations between predicates and their complements. Using a full-fledged thesaurus for specifying concept markers makes it possible to build an effective selectional restriction mechanism coupled with the subcategorization dictionary, and using the standard predicate patterns for specifying dependency relations makes it possible to avoid inconsistency in the results and to reduce the cost of constructing the dictionary. On the basis of these, we built a Korean subcategorization dictionary for 13,000 frequently used predicates found in corpora, with the aid of a tool specially designed to support this task. An experimental result shows that this dictionary can provide 72.7% of predicates in corpora with appropriate subcategorization information.
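
To make the selectional restriction mechanism concrete, here is a small sketch of a subcategorization entry checked against a thesaurus hierarchy. All names and data are assumptions for illustration: the role labels `AGENT` and `THEME`, the toy thesaurus, and the entry for 먹다 are not taken from the paper's 25 roles or its 120,000-entry thesaurus.

```python
from dataclasses import dataclass

# Toy thesaurus: child concept -> parent concept (hypothetical hierarchy).
THESAURUS = {
    "apple": "food",
    "bread": "food",
    "food": "concrete",
    "student": "human",
    "human": "animate",
    "animate": "concrete",
}


def is_a(concept, ancestor):
    """True if concept equals ancestor or lies below it in the thesaurus."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = THESAURUS.get(concept)
    return False


@dataclass(frozen=True)
class Complement:
    case_marker: str  # surface case marker, e.g. "을/를"
    role: str         # semantic role attached to that marker
    concept: str      # concept marker the noun must fall under


@dataclass(frozen=True)
class SubcatEntry:
    predicate: str
    complements: tuple


EAT = SubcatEntry(
    predicate="먹다",
    complements=(
        Complement("이/가", "AGENT", "animate"),
        Complement("을/를", "THEME", "food"),
    ),
)


def satisfies(entry, arguments):
    """arguments maps a case marker to the concept of the noun carrying it."""
    return all(
        c.case_marker in arguments and is_a(arguments[c.case_marker], c.concept)
        for c in entry.complements
    )
```

A parser could use `satisfies` to prefer the analysis in which a student eats an apple over one in which bread eats a student, which is the kind of ambiguity resolution the abstract attributes to selectional restrictions.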

A Document Ordering Support System Employing Concept Structure based on Fuzzy Fish View Extraction

  • Ohashi, Tadashi;Nobuhara, Hajime;Hirota, Kaoru
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.09a / pp.98-101 / 2003
  • To classify desired and undesired documents on the web according to each user's view, FOCUS (Fuzzy dOCUment ordering System) is developed based on fuzzy concept extraction, fuzzy fish eye matching, and fuzzy selection. Experiments are conducted using the concept-system-dictionary by EDR (Electronic Dictionary Research Institute), which includes 140,000 words, and web-based documents related to movies.
