• Title/Summary/Keyword: dictionary-based

Optimizing the Additional Term Weight Ratio in Query Expansion Search based on Dictionary Definition (사전 의미 기반의 질의확장 검색에서 추가 용어 가중치 최적화)

  • 최영란;전유정;박순철
    • Journal of Korea Society of Industrial Information Systems / v.8 no.2 / pp.45-53 / 2003
  • This paper makes two contributions. First, it develops query expansion search by adding related terms drawn from a dictionary to the original query terms, which shortens the conventional query expansion process that relies on search feedback data. Second, it seeks the optimal balance of precision and recall by varying the weight ratio between the original query terms and the additional terms. The results show that the efficiency and precision of query expansion search increase.
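
The weighting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy dictionary, the documents, the term-count scoring, and the names `expand_query`, `score`, and `ratio` are all assumptions made for illustration.

```python
from collections import Counter

def expand_query(query_terms, dictionary, ratio):
    """Map each term to a weight: 1.0 for original terms, `ratio` for added ones."""
    weights = {term: 1.0 for term in query_terms}
    for term in query_terms:
        for related in dictionary.get(term, []):
            # setdefault keeps the weight of an original term if it reappears.
            weights.setdefault(related, ratio)
    return weights

def score(doc_tokens, weights):
    """Weighted term-count score (a stand-in for a real retrieval model)."""
    counts = Counter(doc_tokens)
    return sum(weight * counts[term] for term, weight in weights.items())

# Hypothetical dictionary of related terms and a toy document collection.
DICTIONARY = {"car": ["automobile", "vehicle"]}
docs = {
    "d1": "car engine repair".split(),
    "d2": "automobile sales".split(),
    "d3": "cooking recipes".split(),
}

weights = expand_query(["car"], DICTIONARY, ratio=0.5)
ranking = sorted(docs, key=lambda d: score(docs[d], weights), reverse=True)
print(ranking)
```

With `ratio=0.5`, a document that matches only a dictionary-derived term is retrieved but ranks below one that matches the original term. Sweeping `ratio` over a labeled test collection is one way to look for the precision and recall trade-off the abstract describes.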

Cloud Storage Security Deduplication Scheme Based on Dynamic Bloom Filter

  • Yan, Xi-ai;Shi, Wei-qi;Tian, Hua
    • Journal of Information Processing Systems / v.15 no.6 / pp.1265-1276 / 2019
  • Data deduplication is a common method for improving cloud storage efficiency and saving network bandwidth, but it also introduces problems such as privacy disclosure and dictionary attacks. This paper proposes a secure deduplication scheme for cloud storage that is based on a Bloom filter and dynamically extends the standard Bloom filter. A public dynamic Bloom filter array (PDBFA) is constructed, which improves the efficiency of ownership proof, enables fast detection of duplicate data blocks, and reduces the false positive rate of the system. In addition, during file encryption and upload, the convergent key is encrypted twice, which can effectively prevent brute-force dictionary attacks. The experimental results show that the PDBFA scheme has low computational overhead and a low false positive rate.
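
For readers unfamiliar with the underlying data structure, the sketch below shows a standard fixed-size Bloom filter used to flag possibly duplicate data blocks. It is not the PDBFA scheme from the paper; the class name, the sizes, and the use of salted SHA-256 as the hash family are assumptions chosen for illustration.

```python
import hashlib

class BloomFilter:
    """Standard Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item: bytes):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: bytes) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))

# Deduplication check on a block fingerprint (the block content is made up).
seen = BloomFilter()
fingerprint = hashlib.sha256(b"example data block").digest()

first = seen.might_contain(fingerprint)   # filter is empty, so not seen
seen.add(fingerprint)
second = seen.might_contain(fingerprint)  # now reported as possibly seen
print(first, second)
```

A positive answer only means the block may already be stored, so a real system would confirm it against an exact index. A fixed-size filter also fills up as blocks are added, which is the limitation that dynamic variants such as the one in the abstract aim to address.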

Cryptanalysis of an Efficient RSA-Based Password-Authenticate Key Exchange Protocol against Dictionary Attack (RSA-EPAKE의 사전공격에 대한 안전성 분석)

  • Youn, Taek-Young;Park, Young-Ho;Ryu, Heui-Su
    • Journal of the Korea Institute of Information Security & Cryptology / v.18 no.6A / pp.179-183 / 2008
  • Park et al. recently proposed an efficient RSA-based password-authenticated key exchange protocol with a formal security proof. In this letter, we analyze their protocol and show that it is not secure against an active adversary who performs a dictionary attack. We also analyze the performance of the attack and show that it poses a practical threat to the protocol.

Design and Implementation of Dictionary-based Column Name Standardization System (사전기반 항목명 표준화 시스템 설계 및 구현)

  • Shin, Su-Mi;Moon, Young-Su
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.621-624 / 2021
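  • As interest in big data has grown, the importance of standardizing the datasets needed for analysis has been increasingly emphasized. Data standardization requires defining naming rules for all data needed for business processing and assigning standard names according to those rules. This study proposes a dictionary-based column name standardization system. The proposed system uses a publicly available standard word dictionary to build a reference dictionary that includes synonyms, and then builds a standard dictionary on top of it to provide standard column names. When a user enters a column name from an existing dataset or a new column name of their choosing, the system outputs the standardized Korean column name, the English column name, and the English abbreviation used in table design. Applying the proposed system to table design or to the standardization of existing datasets is expected to enable consistent data interpretation and management.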

YDK : A Thesaurus Developing System for Korean Language (한국어 통합정보사전 시스템)

  • Hwang, Do-Sam;Choi, Key-Sun
    • The Transactions of the Korea Information Processing Society / v.7 no.9 / pp.2885-2893 / 2000
  • Dictionaries are indispensable for natural language processing (NLP) systems. Sophisticated algorithms in NLP systems can be fully exploited only with matching dictionaries that are built systematically on the basis of computational linguistics. Only a few dictionaries have been developed for natural language processing, and the available ones are far from complete for practical use. It is therefore necessary to develop an integrated information dictionary that includes lexical information useful for processing and understanding natural languages, such as morphological, syntactic, and semantic information. In this paper, we propose a method for building an integrated dictionary and introduce a dictionary development system.

Support on Ideograph Characters Search of Unicode Based Information System (정보 시스템의 유니코드 기반 한자 검색 지원)

  • Yoon, So-Young
    • Journal of the Korean Society for Information Management / v.24 no.4 / pp.375-391 / 2007
  • The Unicode Han ideograph character set follows the KangXi radical-stroke ordering of characters, which differs from our principle of ordering by phonetic value. Information systems should therefore support ideograph search based on a precise analysis of materials that consist of Korean characters (Hangul) and ideograph characters (Hanja). The History Information System maintains a Hanja (Chinese character) to Hangul dictionary, a terminology dictionary for composition, borrowing, and non-ideographic principles, a variant forms dictionary, and a list of recently discovered Chinese characters.

Post-Processing for JPEG-Coded Image Deblocking via Sparse Representation and Adaptive Residual Threshold

  • Wang, Liping;Zhou, Xiao;Wang, Chengyou;Jiang, Baochen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.3 / pp.1700-1721 / 2017
  • Blocking artifacts are very common in block-based image and video compression, especially at very low bit rates. In this paper, we propose a post-processing method for JPEG-coded image deblocking via sparse representation and an adaptive residual threshold. The method has three steps. First, we obtain the dictionary by online dictionary learning on the compressed images. The dictionary is then modified using the histogram of oriented gradients (HOG) feature descriptor and K-means clustering. Second, an adaptive residual threshold for orthogonal matching pursuit (OMP) is proposed and used for sparse coding in combination with blind image blocking assessment. Finally, to take advantage of the human visual system (HVS), the edge regions of the deblocked image can be further modified using the edge regions of the compressed image. The experimental results show that the proposed method preserves more texture and edge information while reducing blocking artifacts.

A Study on Utilization of Wikipedia Contents for Automatic Construction of Linguistic Resources (언어자원 자동 구축을 위한 위키피디아 콘텐츠 활용 방안 연구)

  • Yoo, Cheol-Jung;Kim, Yong;Yun, Bo-Hyun
    • Journal of Digital Convergence / v.13 no.5 / pp.187-194 / 2015
  • Machines need various linguistic knowledge resources in order to understand the diverse variation in natural languages. This paper devises a method for automatically constructing linguistic resources that reflects the characteristics of online content and supports continuous expansion. We focus in particular on building a named-entity (NE) dictionary, because NEs are highly applicable in linguistic analysis. Based on an investigation of Korean Wikipedia, we propose an efficient method for constructing an NE dictionary using syntactic patterns and structural features such as metadata.

A Study on the Language Independent Dictionary Creation Using International Phoneticizing Engine Technology (국제 음소 기술에 의한 언어에 독립적인 발음사전 생성에 관한 연구)

  • Shin, Chwa-Cheul;Woo, In-Sung;Kang, Heung-Soon;Hwang, In-Soo;Kim, Suk-Dong
    • The Journal of the Acoustical Society of Korea / v.26 no.1E / pp.1-7 / 2007
  • One result of the trend towards globalization is an increased number of projects that focus on natural language processing. Automatic speech recognition (ASR) technologies, for example, hold great promise in facilitating global communications and collaborations. Unfortunately, to date, most research projects focus on single widely spoken languages. Therefore, the cost to adapt a particular ASR tool for use with other languages is often prohibitive. This work takes a more general approach. We propose an International Phoneticizing Engine (IPE) that interprets input files supplied in our Phonetic Language Identity (PLI) format to build a dictionary. IPE is language independent and rule based. It operates by decomposing the dictionary creation process into a set of well-defined steps. These steps reduce rule conflicts, allow for rule creation by people without linguistics training, and optimize run-time efficiency. Dictionaries created by the IPE can be used with the Sphinx speech recognition system. IPE defines an easy-to-use systematic approach that can lead to internationalization of automatic speech recognition systems.

Movie Retrieval System by Analyzing Sentimental Keyword from User's Movie Reviews (사용자 영화평의 감정어휘 분석을 통한 영화검색시스템)

  • Oh, Sung-Ho;Kang, Shin-Jae
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.3 / pp.1422-1427 / 2013
  • This paper proposes a movie retrieval system based on sentimental keywords extracted from users' movie reviews. First, a sentimental keyword dictionary is manually constructed by applying morphological analysis to the reviews, and then the keyword weights in the dictionary are calculated for each movie with TF-IDF. Using these results, the proposed system classifies movies into sentimental categories and ranks the classified movies. Users can then retrieve movies through queries composed of sentimental keywords, without reading any reviews.
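
The per-movie keyword weighting in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' system: the keyword set, the review tokens, and the function names are hypothetical, morphological analysis is replaced by whitespace splitting, and the classification step is omitted.

```python
import math
from collections import Counter

# Hypothetical, manually built sentimental keyword dictionary.
SENTIMENT_KEYWORDS = {"touching", "scary", "funny"}

def keyword_tfidf(reviews_by_movie, keywords):
    """TF-IDF weight of each dictionary keyword, one 'document' per movie."""
    counts = {
        movie: Counter(tok for tok in tokens if tok in keywords)
        for movie, tokens in reviews_by_movie.items()
    }
    n_movies = len(reviews_by_movie)
    # Document frequency: number of movies whose reviews use the keyword.
    df = Counter(kw for c in counts.values() for kw in c)
    weights = {}
    for movie, c in counts.items():
        total = sum(c.values()) or 1
        weights[movie] = {
            kw: (n / total) * math.log(n_movies / df[kw]) for kw, n in c.items()
        }
    return weights

def search(weights, query_keywords):
    """Rank movies by summed keyword weight, dropping zero-score movies."""
    scores = {
        movie: sum(w.get(kw, 0.0) for kw in query_keywords)
        for movie, w in weights.items()
    }
    hits = [movie for movie in scores if scores[movie] > 0]
    return sorted(hits, key=lambda movie: scores[movie], reverse=True)

reviews = {
    "A": "a touching and touching but funny story".split(),
    "B": "scary scary scary yet funny".split(),
    "C": "funny funny and a bit touching".split(),
}
weights = keyword_tfidf(reviews, SENTIMENT_KEYWORDS)
print(search(weights, ["touching"]), search(weights, ["scary"]))
```

A keyword that appears in every movie's reviews, such as "funny" here, gets an inverse document frequency of zero and so does not help distinguish movies, which is the usual motivation for TF-IDF over raw counts.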