• Title/Summary/Keyword: Dictionary

Ternary Decomposition and Dictionary Extension for Khmer Word Segmentation

  • Sung, Thaileang;Hwang, Insoo
    • Journal of Information Technology Applications and Management
    • /
    • v.23 no.2
    • /
    • pp.11-28
    • /
    • 2016
  • In this paper, we propose a dictionary extension and a ternary decomposition technique to improve the effectiveness of Khmer word segmentation. Most word segmentation approaches depend on a dictionary. However, the dictionary in use is not fully reliable and cannot cover all the words of the Khmer language, which leads to the problem of unknown (out-of-vocabulary) words. Our approach is to make the original dictionary more reliable by extending it with new words. In addition, we use ternary decomposition for the segmentation process. We also use the invisible space of Khmer Unicode (the zero-width space, U+200B) to segment our training corpus. With our segmentation algorithm, based on ternary decomposition and the invisible space, we can extract new words from the training text and add them to the dictionary. We then tested the extended wordlist and the segmentation algorithm on unannotated text, without relying on the invisible space. Our results remarkably outperformed other approaches: we achieved precision, recall, and F-measure of 88.8%, 91.8%, and 90.6%, respectively.
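
The dictionary-extension step described in this abstract can be illustrated with a minimal Python sketch. It assumes the training text is annotated with zero-width spaces (U+200B) between words, as the abstract states. The function names `extract_words` and `segment` are hypothetical, and the segmenter is a plain greedy longest-match stand-in, not the authors' ternary decomposition, which the abstract does not specify. The usage example uses Latin placeholder strings rather than Khmer text.

```python
ZWSP = "\u200b"  # zero-width space, the "invisible space" in Khmer Unicode text


def extract_words(annotated_text):
    """Collect the tokens delimited by zero-width spaces in annotated text."""
    return {tok for tok in annotated_text.split(ZWSP) if tok.strip()}


def segment(text, dictionary):
    """Greedy longest-match segmentation (a stand-in, NOT ternary decomposition)."""
    max_len = max((len(w) for w in dictionary), default=1)
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character
        # when no dictionary word starts at position i.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


# Placeholder data: "xyz" is missing from the original dictionary.
original = {"ab", "cd"}
extended = original | extract_words("ab" + ZWSP + "xyz" + ZWSP + "cd")
before = segment("abxyzcd", original)
after = segment("abxyzcd", extended)
```

With the original dictionary the unknown span falls apart into single characters. After the extension it is recovered as one word, which is the out-of-vocabulary effect the abstract describes.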

A Practical Algorithm for Two-Dimensional Dictionary Matching (2차원 사전 정합을 위한 실용적인 알고리즘)

  • Lee, Gwang-Su
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.3
    • /
    • pp.812-820
    • /
    • 1999
  • In the two-dimensional dictionary matching problem, we are given a two-dimensional text T and a dictionary D = {P_1, ..., P_k}, a set of two-dimensional patterns. We seek the locations of all the dictionary patterns that appear in T. We present a new two-dimensional pattern matching algorithm that handles a single pattern, and then show how to extend it into a two-dimensional dictionary matching algorithm. The suggested algorithm is practical in the sense that it uses only a small amount of extra space, proportional to the size of the dictionary, and that it is simple to implement without depending on complicated data structures.

KNE: An Automatic Dictionary Expansion Method Using Use-cases for Morphological Analysis

  • Nam, Chung-Hyeon;Jang, Kyung-Sik
    • Journal of information and communication convergence engineering
    • /
    • v.17 no.3
    • /
    • pp.191-197
    • /
    • 2019
  • Morphological analysis is used for searching sentences and understanding context. As most morpheme analysis methods are based on predefined dictionaries, the problem of a target word not being registered in the given morpheme dictionary, the so-called unregistered word problem, can be a major cause of reduced performance. The current practical solution to the unregistered word problem is to add such words to the dictionary by hand. This manual approach restricts the scalability and expandability of dictionaries. To overcome this limitation, we propose a novel method that automatically expands a dictionary by means of use-case analysis, which checks the validity of an unregistered word by exploring its use-cases through web crawling. The results show that the proposed method is feasible in terms of the accuracy of the validation process, the expandability of the dictionary, and, after registration, the fast extraction time of morphemes.

A Data Dictionary for Procurement of Die and Mold Parts Based on PLIB Standard (PLIB에 기반한 전자상거래용 금형부품 데이터 사전의 구축)

  • 조준면;문두환;김흥기;한순흥;류병우
    • The Journal of Society for e-Business Studies
    • /
    • v.8 no.3
    • /
    • pp.37-52
    • /
    • 2003
  • The ISO 13584 Parts Library (PLIB) standard is making its way into e-business as a norm for classifying products and their characteristics. PLIB is a multi-part standard, and Part 42, "Methodology for structuring parts families", provides the information model and design principles for the data dictionary of a parts library or e-catalog. If e-catalog systems are built using a data dictionary constructed on the PLIB dictionary data model, many different e-catalog systems can be easily integrated and made interoperable. This paper studies the roles and requirements of the data dictionary in an e-catalog, and applies the data model and design principles of PLIB Part 42 to construct a data dictionary from the viewpoint of ontology. Based on the analysis results, we propose a data dictionary of die and mold parts, and implement a B2B e-catalog system.

Design and Implementation of Web-Based Dictionary of Computing for Efficient Search Interface (효율적인 검색 인터페이스를 위한 웹 기반 컴퓨터 용어사전의 설계 및 구현)

  • Hwang, Byeong-Yeon;Park, Seong-Cheol
    • The KIPS Transactions:PartD
    • /
    • v.9D no.3
    • /
    • pp.457-466
    • /
    • 2002
  • In this paper, we designed and implemented a web-based dictionary of computing that keeps its data up to date. At the beginning of a search, the dictionary shows the English information from the FOLDOC (Free On-Line Dictionary Of Computing) dictionary file, and one or more users can then translate the information into Korean. This function is unique to this dictionary. We can also easily find any word we want to look up, even if we do not know the complete spelling, because the dictionary offers various search interfaces (searching for words that start with the entered characters, searching for words whose description includes the entered characters, etc.) using a SQL Server DBMS and SQL. The performance test for the CPU load factor shows that the server can support at least 1780 users at the same time.
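
The two search interfaces mentioned in this abstract (prefix search on the term and substring search in the description) can be sketched with SQL `LIKE` patterns. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module instead of the SQL Server DBMS named in the abstract, and the table `entry`, its columns, the sample rows, and the function names are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with a few made-up dictionary entries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entry (term TEXT PRIMARY KEY, description TEXT)")
    conn.executemany(
        "INSERT INTO entry VALUES (?, ?)",
        [
            ("compiler", "A program that translates source code."),
            ("computer", "A machine that executes programs."),
            ("kernel", "The core of an operating system."),
        ],
    )
    return conn


def _escape_like(text):
    """Escape LIKE wildcards so user input is matched literally."""
    return text.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def search_prefix(conn, prefix):
    """Terms that start with the entered characters."""
    rows = conn.execute(
        "SELECT term FROM entry WHERE term LIKE ? ESCAPE '\\' ORDER BY term",
        (_escape_like(prefix) + "%",),
    )
    return [r[0] for r in rows]


def search_description(conn, fragment):
    """Terms whose description contains the entered characters."""
    rows = conn.execute(
        "SELECT term FROM entry WHERE description LIKE ? ESCAPE '\\' ORDER BY term",
        ("%" + _escape_like(fragment) + "%",),
    )
    return [r[0] for r in rows]


conn = build_demo_db()
prefix_hits = search_prefix(conn, "comp")
description_hits = search_description(conn, "operating")
```

Passing the pattern as a bound parameter keeps user input out of the SQL string itself, which matters for a web-facing search form.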

Construction of Thesaurus Using "The Korean Standard Dictionary" ("표준국어대사전"을 이용한 시소러스 구축)

  • Han, Sangkil
    • Journal of Korean Library and Information Science Society
    • /
    • v.44 no.4
    • /
    • pp.233-254
    • /
    • 2013
  • Collecting terms is the most difficult part of thesaurus construction, and a dictionary can serve as an excellent source for term acquisition. The "Standard Korean Dictionary" is a standard dictionary of Korea that faithfully reflects the Korean language regulations. Because it systematically contains not only the definitions of entries but also a wide range of information about each term, it can be used to build a thesaurus. In this study, the various kinds of term-related information in the "Standard Korean Dictionary" are used to define thesaurus term relationships. In addition, we present issues that arise in thesaurus construction, including term separation, the setting of equivalence and hierarchical relationships, the use of qualifiers, and North Korean terms, and suggest ways to resolve them.

Dictionary Attacks against Password-Based Authenticated Three-Party Key Exchange Protocols

  • Nam, Junghyun;Choo, Kim-Kwang Raymond;Kim, Moonseong;Paik, Juryon;Won, Dongho
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.7 no.12
    • /
    • pp.3244-3260
    • /
    • 2013
  • A three-party password-based authenticated key exchange (PAKE) protocol allows two clients registered with a trusted server to generate a common cryptographic key from their individual passwords shared only with the server. A key requirement for three-party PAKE protocols is to prevent an adversary from mounting a dictionary attack. This requirement must be met even when the adversary is a malicious (registered) client who can set up normal protocol sessions with other clients. This work revisits three existing three-party PAKE protocols, namely, Guo et al.'s (2008) protocol, Huang's (2009) protocol, and Lee and Hwang's (2010) protocol, and demonstrates that these protocols are not secure against offline and/or (undetectable) online dictionary attacks in the presence of a malicious client. The offline dictionary attack we present against Guo et al.'s protocol also applies to other similar protocols including Lee and Hwang's protocol. We conclude with some suggestions on how to design a three-party PAKE protocol that is resistant against dictionary attacks.

Development and Evaluation of Video English Dictionary for Silver Generation (실버세대를 위한 동영상 영어사전의 개발 및 평가)

  • Kim, Jeiyoung;Park, Ji Su;Shon, Jin Gon
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.9 no.11
    • /
    • pp.345-350
    • /
    • 2020
  • Based on an analysis of the physical and learning characteristics and the requirements of the silver generation, a video English dictionary was developed and evaluated as English learning content. The video English dictionary was developed using OCR as the input method and video as the output method, and 17 members of the silver generation were evaluated for academic achievement, learning satisfaction, and ease of use. The analysis showed that both the text English dictionary and the video English dictionary yielded high learning satisfaction, but the video English dictionary produced better results than the text English dictionary in academic achievement and ease of use.

Anomaly Detection via Pattern Dictionary Method and Atypicality in Application (패턴사전과 비정형성을 통한 이상치 탐지방법 적용)

  • Sehong Oh;Jongsung Park;Youngsam Yoon
    • Journal of Sensor Science and Technology
    • /
    • v.32 no.6
    • /
    • pp.481-486
    • /
    • 2023
  • Anomaly detection holds paramount significance across diverse fields, encompassing fraud detection, risk mitigation, and sensor evaluation tests. Its pertinence extends notably to the military, particularly within the Warrior Platform, a comprehensive combat equipment system with wearable sensors. Hence, we propose a data-compression-based anomaly detection approach tailored to unlabeled time series and sequence data. This method entailed the construction of two distinctive features, typicality and atypicality, to discern anomalies effectively. The typicality of a test sequence was determined by evaluating the compression efficacy achieved through the pattern dictionary. This dictionary was established based on the frequency of all patterns identified in a training sequence generated for each sensor within the Warrior Platform. The resulting typicality served as an anomaly score, facilitating the identification of anomalous data using a predetermined threshold. To improve the performance of the pattern dictionary method, we leveraged atypicality to discern sequences that could undergo compression independently without relying on the pattern dictionary. Consequently, our refined approach integrated both typicality and atypicality, augmenting the effectiveness of the pattern dictionary method. Our proposed method exhibited heightened capability in detecting a spectrum of unpredictable anomalies, fortifying the stability of wearable sensors prevalent in military equipment, including the Army TIGER 4.0 system.
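
The typicality idea in this abstract, scoring a test sequence by how well a frequency-based pattern dictionary compresses it, can be sketched roughly as follows. Everything concrete here is an assumption rather than the authors' method: patterns are fixed-length windows of length `n`, the compressed size is approximated by the Shannon code length of each window under Laplace-smoothed frequencies, and the function names are hypothetical. The atypicality feature is not implemented.

```python
import math
from collections import Counter


def build_pattern_dictionary(train, n=3):
    """Count every length-n pattern in the training sequence."""
    return Counter(tuple(train[i:i + n]) for i in range(len(train) - n + 1))


def typicality_score(test, patterns, n=3):
    """Average code length in bits per pattern; higher means more anomalous."""
    if len(test) < n:
        raise ValueError("test sequence is shorter than the pattern length")
    total = sum(patterns.values())
    vocab = len(patterns) + 1  # one extra slot reserved for unseen patterns
    windows = [tuple(test[i:i + n]) for i in range(len(test) - n + 1)]
    bits = 0.0
    for window in windows:
        # Laplace-smoothed probability: frequent patterns get short codes,
        # unseen patterns get the longest code.
        p = (patterns.get(window, 0) + 1) / (total + vocab)
        bits += -math.log2(p)
    return bits / len(windows)


# Synthetic data: a periodic "sensor" signal and two test sequences.
patterns = build_pattern_dictionary([0, 1, 2] * 20)
normal_score = typicality_score([0, 1, 2] * 3, patterns)
odd_score = typicality_score([0, 1, 2, 9, 9, 9, 0, 1, 2], patterns)
```

A sequence would be flagged as anomalous when its score exceeds a threshold chosen in advance, for example from the scores of held-out normal data.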

An effect of dictionary information in the handwritten Hangul word recognition (필기한글 단어 인식에서 사전정보의 효과)

  • 김호연;임길택;남윤석
    • Proceedings of the IEEK Conference
    • /
    • 1999.11a
    • /
    • pp.1019-1022
    • /
    • 1999
  • In this paper, we analyze the effect of a dictionary on handwritten Hangul word recognition in terms of its size and the length of the words it contains. Our experimental results show that the word recognition rate depends not only on character recognition performance but also, to a large extent, on the amount of information the dictionary contains and on the reduction rate of the dictionary.
