• Title/Summary/Keyword: Electronic Dictionary

An Electronic Dictionary Structure supporting Truncation Search (절단검색을 지원하는 전자사전 구조)

  • Kim, Cheol-Su (김철수)
    • Journal of KIISE: Computing Practices and Letters / v.9 no.1 / pp.60-69 / 2003
  • In an Information Retrieval System (IRS) that uses an inverted file as its file structure, related documents can be retrieved when the searcher knows the complete words of the search fields. However, there are many cases in which the searcher knows only a partial string of the words to search for. In such cases, if the searcher can search for indexes that include the known partial string, related documents can still be retrieved. Furthermore, when few documents are retrieved, we need a method to find all documents whose indexes include the known partial string. To satisfy these requests, the searcher should be able to formulate a query that uses the term truncation method, and the IRS should have an electronic dictionary that can support truncated search terms. This paper designs and implements an electronic dictionary (ED) structure that supports truncation search efficiently. The ED guarantees very fast, constant search time for a term entry and for its inversely alphabetized entry, regardless of the number of inserted words. To support truncation search efficiently, we use the trie structure, and to achieve fast search time we use an array-based method. When searching for a truncated term, the search time can be reduced by minimizing the length of the string to be expanded.
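
As an illustration of the idea in the abstract above, the following is a minimal sketch of truncation search with two tries: one over the terms and one over the reversed ("inversely alphabetized") terms. It uses nested Python dicts rather than the array-based layout the paper describes, and the class and method names (`Trie`, `TruncationDictionary`, `right_truncation`, `left_truncation`) are hypothetical, not taken from the paper.

```python
class Trie:
    """Minimal dict-based trie (an assumption; the paper uses an array-based layout)."""

    END = "$"  # end-of-word marker; assumes "$" never occurs inside a term

    def __init__(self):
        self.root = {}

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[self.END] = True

    def with_prefix(self, prefix):
        """Return all stored words that start with `prefix`, sorted."""
        node = self.root
        for ch in prefix:
            if ch not in node:
                return []
            node = node[ch]
        results = []
        stack = [(node, prefix)]
        while stack:
            cur, text = stack.pop()
            for key, child in cur.items():
                if key == self.END:
                    results.append(text)
                else:
                    stack.append((child, text + key))
        return sorted(results)


class TruncationDictionary:
    """Forward trie for right truncation, reversed-term trie for left truncation."""

    def __init__(self, words):
        self.forward = Trie()
        self.backward = Trie()
        for w in words:
            self.forward.insert(w)
            self.backward.insert(w[::-1])

    def right_truncation(self, stem):
        # Query of the form "comput*"
        return self.forward.with_prefix(stem)

    def left_truncation(self, ending):
        # Query of the form "*ation": look up the reversed ending, then undo the reversal
        return sorted(w[::-1] for w in self.backward.with_prefix(ending[::-1]))


ed = TruncationDictionary(["compute", "computer", "computation", "truncation", "index"])
print(ed.right_truncation("comput"))
print(ed.left_truncation("ation"))
```

Storing each term twice trades memory for the ability to answer both query forms by a prefix walk whose cost depends on the length of the known string rather than on the number of stored terms.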

A Study on Preservation Metadata for Long Term Preservation of Electronic Records (전자기록의 장기적 보존을 위한 보존메타데이터 요소 분석)

  • Lee, Kyung-Nam
    • The Korean Journal of Archival Studies / no.14 / pp.191-240 / 2006
  • For long-term preservation of electronic records, information on the entire management process, starting from the creation of the electronic information, should be captured and managed together with the records. Such information is supported by preservation metadata; implementing preservation metadata is therefore important for preserving electronic records while maintaining their record-ness. Preservation metadata is the information that supports the process of digital preservation and functions to maintain the long-term viability, renderability, understandability, authenticity and identity of digital resources. Preservation metadata should be developed by applying the international standard Reference Model for an Open Archival Information System (OAIS) so that it is internationally interoperable for exchange and reuse. Initial international preservation metadata schemas were developed by standardizing on the OAIS Reference Model. However, the preservation metadata schema of the Victorian Electronic Records Strategy (VERS) and the recently published Data Dictionary of the PREMIS Working Group were developed in more advanced forms that differ from the existing framework: they advanced from conceptual schemas to practical ones. By comparing these two cases, this study proposes integrated preservation metadata elements for long-term preservation of electronic records. Its significance lies in suggesting a direction for the future development of preservation metadata elements by organizing past discussions of preservation metadata and proposing integrated preservation metadata elements for long-term preservation of electronic records.

A Preliminary Study on Clinical Decision Support System based on Classification Learning of Electronic Medical Records

  • Shin, Yang-Kyu
    • Journal of the Korean Data and Information Science Society / v.14 no.4 / pp.817-824 / 2003
  • We employed a hierarchical document classification method to classify a massive collection of electronic medical records (EMR) written in both Korean and English. Our experimental system was trained on 5,000 records of EMR text data and classified a newly given set of EMR text data with over 68% accuracy. We expect that the accuracy can be improved greatly if a dictionary of medical terms or a suitable medical thesaurus is provided. The classification system might play a key role in some clinical decision support systems and in various interpretation systems for clinical data.

EFL College Students' Perceptions toward the Use of Electronic Dictionaries

  • Park, Mae-Ran
    • English Language & Literature Teaching / v.12 no.1 / pp.29-54 / 2006
  • The purpose of this study is two-fold: first, to examine the current status of e-dictionary use in Korea and the attitude toward its use by Korean college students; second, to investigate to what degree e-dictionaries may be useful and effective tools in helping to improve learners' overall English skills. The subjects were 84 college students and they were divided into two groups: the experiment group and the control group. The instrument employed was the Preliminary Student Usage Questionnaire, which was developed by the researcher, together with the questionnaire survey developed by Koyama and Takeuchi (2004), which was modified for the study. The findings from this research are as follows: First, a special instruction session on how to use e-dictionaries made a statistically significant difference among users of the dictionaries. Those subjects who had received the instruction displayed a more positive attitude toward the use of e-dictionaries. Second, the experiment group showed a more favorable attitude toward the use of e-dictionaries. On the basis of the above results, the researcher suggests that proper guidance on the use of e-dictionaries and their benefits should have a positive influence on users. The findings from the current research will shed light on the current status of electronic dictionary use among Korean college students.

Hanja Information in the Entries of Korean Unabridged Dictionary (국어대사전의 표제어에 나타나는 한자 정보)

  • Kim, Cheol-Su
    • The Journal of the Korea Contents Association / v.10 no.4 / pp.438-446 / 2010
  • For language information processing that includes both Hangul and Hanja, an electronic dictionary supporting Hangul and Hanja simultaneously is necessary. This paper examines statistical information on the Hanja entries of the Korean Unabridged Dictionary, such as the number of entries that include Hanja based on the KSC-5601 character set, the frequency of the pronunciation and meaning of each Hanja character included in the entries, the frequency of Hanja per part of speech in the entries, and the average number of Hanja characters per entry. At least one Hanja character appears in 303,951 of the 440,594 entries, accounting for 68.99% of the total. The 440,594 entries include 858,595 Hanja characters, which is 1.95 Hanja characters per entry. As the average syllable length of the entries is 3.56 and the average number of Hanja characters per entry is 1.95, it can be said that 54.7% of all the characters in the entries are Hanja. Among the 4,888 Hanja character codes, 4,660 are used at least once, whereas 228 Hanja codes never appear in any entry. Five characters appear more than 4,000 times. The 858,595 Hanja characters used in all the entries correspond to 471 Hangul codes.
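
The ratios quoted in the abstract above can be cross-checked with simple arithmetic. The sketch below uses only the counts stated in the abstract; the variable names are illustrative.

```python
# Counts taken from the abstract
total_entries = 440_594        # all dictionary entries
entries_with_hanja = 303_951   # entries containing at least one Hanja character
hanja_chars = 858_595          # Hanja characters across all entries
avg_syllables = 3.56           # average syllable length of an entry
hanja_codes_total = 4_888
hanja_codes_used = 4_660

share_of_entries = entries_with_hanja / total_entries * 100   # percent of entries with Hanja
hanja_per_entry = hanja_chars / total_entries                  # average Hanja characters per entry
share_of_chars = hanja_per_entry / avg_syllables * 100         # percent of characters that are Hanja
unused_codes = hanja_codes_total - hanja_codes_used            # codes that never appear

print(round(share_of_entries, 2), round(hanja_per_entry, 2),
      round(share_of_chars, 1), unused_codes)
```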

Learning-based Super-resolution for Text Images (글자 영상을 위한 학습기반 초고해상도 기법)

  • Heo, Bo-Young;Song, Byung Cheol
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.4 / pp.175-183 / 2015
  • The proposed algorithm consists of two stages: learning and synthesis. At the learning stage, we first collect various pairs of high-resolution (HR) and low-resolution (LR) text images, quantize the LR images, and extract LR-HR block pairs. Based on the quantized LR blocks, the LR-HR block pairs are clustered into a pre-determined number of classes. For each class, an optimal 2D FIR filter is computed and stored in a dictionary together with the corresponding LR block for indexing. At the synthesis stage, each quantized LR block in an input LR image is compared with every LR block in the dictionary, and the FIR filter of the best-matched LR block is selected. Finally, an HR block is synthesized with the chosen filter, and a final HR image is produced. In addition, to cope with noisy environments, we generate multiple dictionaries according to noise level at the learning stage. The dictionary corresponding to the noise level of the input image is then chosen, and the final HR image is produced using the selected dictionary. Experimental results show that the proposed algorithm outperforms previous works on noisy images as well as noise-free images.
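
The synthesis stage described above (quantize an LR block, find the best-matching dictionary entry, apply that entry's filter) can be sketched roughly as follows. This is a toy illustration under strong assumptions: blocks are flattened to short lists, matching uses the sum of absolute differences, each entry holds one coefficient row per HR output pixel, and the quantization step and dictionary contents are made up. None of these details come from the paper.

```python
def quantize(block, step=32):
    """Uniformly quantize pixel values (the step size is an assumed parameter)."""
    return tuple(p // step for p in block)


def best_match(qblock, dictionary):
    """Pick the entry whose quantized LR key is closest (sum of absolute differences)."""
    return min(dictionary,
               key=lambda entry: sum(abs(a - b) for a, b in zip(qblock, entry["key"])))


def synthesize(block, dictionary, step=32):
    """Apply the matched entry's filter: one coefficient row per HR output pixel."""
    entry = best_match(quantize(block, step), dictionary)
    return [sum(c * p for c, p in zip(coeffs, block)) for coeffs in entry["filters"]]


# Toy dictionary: 2-pixel LR blocks mapped to 4-pixel HR blocks (hypothetical values)
toy_dictionary = [
    {"key": (0, 0), "filters": [[1.0, 0.0], [0.5, 0.5], [0.5, 0.5], [0.0, 1.0]]},  # smooth
    {"key": (7, 7), "filters": [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]},  # sharp
]

print(synthesize([10, 20], toy_dictionary))
print(synthesize([250, 230], toy_dictionary))
```

Handling noise as the abstract describes would amount to keeping one such dictionary per noise level and choosing among them before calling `synthesize`.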

Studying the Frequencies of Sentence Patterns for a Sentence Patterns Dictionary (문형 사전을 위한 문형 빈도 조사)

  • Kim Yu-Mi
    • Korean Journal of Cognitive Science / v.16 no.2 / pp.123-140 / 2005
  • The purpose of this paper is to examine the frequency and usage of sentence patterns appearing in electronic dictionaries used in Korean language education, in order to design automatic sentence pattern checking. First, the concept of a sentence pattern is defined, and sentence patterns are classified into sentence structure patterns and sentential expression patterns. We then analyze how sentence structure patterns and sentential expression patterns are expressed in the Korean Learner's Corpus. The Learner's Corpus consists of the Standard Corpus, which all Korean learners must learn, and the Errors Corpus, produced by learners. From this research, we find out how frequently the sentential patterns are used in the Standard Corpus, which is made up of Korean texts, and how the sentential patterns are used in the Errors Corpus, which was constructed from Korean learners' writings. Finally, having described the sentential patterns in the Sentential Electronic Dictionary, we determine the optimum speed in searching for a sentential pattern.

An Analysis of Korean inflected Word for Machine Translation (한국어의 기계번역을 위한 용언 구조의 해석)

  • Han, H.R.;Lee, J.K.
    • Proceedings of the KIEE Conference / 1988.07a / pp.612-615 / 1988
  • This paper proposes a method for analyzing Korean inflected words in a machine translation system. We define processing rules that are useful for analyzing irregular conjugations, present a parsing algorithm for nouns and specified verbs, and reduce the dictionary space by means of the algorithm.

Enhanced Prediction for Low Complexity Near-lossless Compression (낮은 복잡도의 준무손실 압축을 위한 향상된 예측 기법)

  • Son, Ji Deok;Song, Byung Cheol
    • Journal of Broadcast Engineering / v.19 no.2 / pp.227-239 / 2014
  • This paper proposes an enhanced prediction method for a conventional near-lossless coder to effectively lower external memory bandwidth in an image processing SoC. First, we use an already reconstructed green component as the basis for predicting the other color components, because the RGB color components are usually highly correlated. Next, we improve prediction performance by applying variable block size prediction. Lastly, we use minimal internal memory and improve temporal prediction performance by using a template dictionary that is sampled from the previous frame. Experimental results show that the proposed algorithm outperforms previous works: coding efficiency improves by approximately 30% for natural images and by 60% on average for CG images.

Automatic Mapping Between Large-Scale Heterogeneous Language Resources for NLP Applications: A Case of Sejong Semantic Classes and KorLexNoun for Korean

  • Park, Heum;Yoon, Ae-Sun
    • Language and Information / v.15 no.2 / pp.23-45 / 2011
  • This paper proposes a statistics-based linguistic methodology for automatic mapping between large-scale heterogeneous language resources for NLP applications in general. As a particular case, it treats automatic mapping between two large-scale heterogeneous Korean language resources: Sejong Semantic Classes (SJSC) in the Sejong Electronic Dictionary (SJD) and nouns in KorLex. KorLex is a large-scale Korean WordNet, but it lacks syntactic information. SJD contains refined semantic-syntactic information, with semantic labels depending on SJSC, but its list of entry words is much smaller than that of KorLex. The goal of our study is to build a rich language resource by integrating useful information from SJD into KorLex. In this paper, we use both linguistic and statistical methods to construct an automatic mapping methodology. The linguistic aspect of the methodology focuses on three linguistic clues: monosemy/polysemy of word forms, instances (example words), and semantically related words. The statistical aspect of the methodology uses three statistical formulae, ${\chi}^2$, Mutual Information and Information Gain, to obtain candidate synsets. Compared with manual mapping, the automatic mapping based on our proposed statistical linguistic methods performs well in terms of correctness, giving recall 0.838, precision 0.718, and F1 0.774.
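
For reference, F1 is the harmonic mean of precision and recall. The short sketch below recomputes it from the rounded precision and recall quoted in the abstract; the function name is illustrative. The result is about 0.773, which agrees with the reported 0.774 to within what rounding of the published inputs would presumably explain.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(precision=0.718, recall=0.838)
print(round(f1, 3))
```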
