• Title/Abstract/Keyword: Fast retrieval

A Word Dictionary Structure for the Postprocessing of Hangul Recognition (한글인식 후처리용 단어사전의 기억구조)

  • ;Yoshinao Aoki
    • The Journal of Korean Institute of Communications and Information Sciences / v.19 no.9 / pp.1702-1709 / 1994
  • In the postprocessing stage of a Hangul recognition system, the storage structure for contextual information strongly affects the recognition rate and speed of the whole system. A trie is generally used to represent the context as a word dictionary, but its memory efficiency is low. We therefore propose a new word dictionary structure that has better space efficiency while keeping the merits of a trie. Because Hangul characters are compounds of phonemes, words can be represented either by phonemes or by characters. In the phoneme representation (P-mode) retrieval is fast but space efficiency is low; in the character representation (C-mode) space efficiency is high but retrieval is slow. In this paper the two representations are combined into a hybrid representation (H-mode). First, an optimal level for the combination is selected from two characteristic curves, node utilization and dispersion. The input words are then represented as a trie in P-mode from the first level to the optimal level, and the remainder is represented as a sequentially linked list in C-mode. Experimental results on six word sets show that the proposed structure is more efficient: retrieval in H-mode is as fast as in P-mode, and its space efficiency is as good as that of C-mode.
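
The hybrid layout described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the class `HybridDict`, its `level` parameter (counted here in syllables), and the use of Python dicts and lists in place of trie nodes and linked lists are all assumptions. The phoneme split uses the standard Unicode arithmetic for precomposed Hangul syllables.

```python
HANGUL_BASE = 0xAC00   # first precomposed Hangul syllable (U+AC00)
HANGUL_COUNT = 11172   # 19 leads * 21 vowels * 28 tails


def phonemes(syllable):
    """Split one precomposed syllable into (lead, vowel, tail) indices."""
    code = ord(syllable) - HANGUL_BASE
    if not 0 <= code < HANGUL_COUNT:
        return (syllable,)  # non-Hangul symbols stay as one unit
    return (code // 588, (code % 588) // 28, code % 28)


class HybridDict:
    """Hypothetical H-mode dictionary: phoneme trie up to `level`
    syllables (P-mode), then a list of character tails (C-mode)."""

    def __init__(self, level):
        self.level = level
        self.root = {}

    def _descend(self, word, create):
        node = self.root
        for syllable in word[:self.level]:
            for p in phonemes(syllable):
                if create:
                    node = node.setdefault(p, {})
                elif p in node:
                    node = node[p]
                else:
                    return None
        return node

    def insert(self, word):
        # The "tails" key holds the C-mode remainder of each word.
        tails = self._descend(word, create=True).setdefault("tails", [])
        tail = word[self.level:]
        if tail not in tails:
            tails.append(tail)

    def contains(self, word):
        node = self._descend(word, create=False)
        return node is not None and word[self.level:] in node.get("tails", [])
```

With `level=1`, the words 한글 and 한국 share one trie path for 한 and differ only in the tail list, which is the space saving the abstract attributes to C-mode.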


The Noise Robust Algorithm to Detect the Starting Point of Music for Content Based Music Retrieval System (노이즈에 강인한 음악 시작점 검출 알고리즘)

  • Kim, Jung-Soo;Sung, Bo-Kyung;Koo, Kwang-Hyo;Ko, Il-Ju
    • Journal of the Korea Society of Computer and Information / v.14 no.9 / pp.95-104 / 2009
  • This paper proposes a noise-robust algorithm for detecting the starting point of music. In a content-based music retrieval system, detecting the starting point is necessary to avoid wasted computation and to avoid comparison problems caused by inconsistent input data. It is especially necessary for the time-sequential retrieval method, which compares data in time order. This method has the advantage of fast retrieval, since it performs only simple comparisons in time order, but the disadvantage that the data being compared must start at the same time. Digitized music, however, cannot guarantee identical starting times after bit rate conversion. This paper therefore detects the music starting point in the preprocessing stage of retrieval, so that the recognition rate does not decrease even when high-speed time-sequential retrieval is applied. Starting point detection uses a minimum wave model that can detect effective sound, and, for robustness against noise, the noise present in silent segments was swapped. The proposed algorithm performed about 38% better than retrieval without starting point detection, and its robustness against noise was verified.
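
The general idea of locating a starting point while ignoring isolated noise can be illustrated with a simple sustained-amplitude test. This sketch is a stand-in and is not the paper's minimum wave model; the function `find_start` and its `threshold` and `window` parameters are hypothetical.

```python
def find_start(samples, threshold=0.05, window=4):
    """Return the index of the first run of `window` consecutive samples
    whose absolute amplitude all exceed `threshold`, or None.

    Requiring a sustained run means a single noise spike inside an
    otherwise silent lead-in is not mistaken for the start of the music.
    """
    for i in range(len(samples) - window + 1):
        if all(abs(s) > threshold for s in samples[i:i + window]):
            return i
    return None
```

Two recordings aligned at their detected starting points could then be compared sample by sample in time order.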

Design of Moving Picture Retrieval System using Scene Change Technique (장면 전환 기법을 이용한 동영상 검색 시스템 설계)

  • Kim, Jang-Hui;Kang, Dae-Seong
    • Journal of the Institute of Electronics Engineers of Korea SP / v.44 no.3 / pp.8-15 / 2007
  • Processing multimedia data efficiently has recently become important. Retrieval of multimedia information in particular requires both user interface techniques and retrieval techniques. This paper proposes a new technique that effectively detects cuts in MPEG-compressed video. A cut is a transition point between scenes, and cut detection is the basic first step for video indexing and retrieval. Existing methods tend to detect false cuts when the screen changes because of fast object motion, camera movement, or a flash, since they compare the previous frame with the current frame. The proposed technique first detects shots using the DC (Direct Current) coefficients of the DCT (Discrete Cosine Transform), and the database is built from these detected shots. Features are extracted with the HMMD color model and the edge histogram descriptor (EHD) from the MPEG-7 visual descriptors, and detection is then performed in sequence by the proposed matching technique. In these experiments, we implemented an improved video segmentation system that runs faster and more precisely than existing techniques.
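
The role of DC coefficients can be illustrated with a minimal sketch. The DC coefficient of a block's DCT is proportional to the block's mean intensity, so a grid of block means gives a small "DC image" per frame. The code below covers only that first shot-detection step with a plain threshold; it does not include the paper's HMMD/EHD matching, and the names `dc_image`, `detect_cuts`, and `threshold` are assumptions.

```python
def dc_image(frame, block=8):
    """Block means of a grayscale frame (list of rows); each mean is
    proportional to the DC coefficient of that block's DCT."""
    h, w = len(frame), len(frame[0])
    return [
        sum(frame[y][x]
            for y in range(by, by + block)
            for x in range(bx, bx + block)) / (block * block)
        for by in range(0, h - block + 1, block)
        for bx in range(0, w - block + 1, block)
    ]


def detect_cuts(frames, threshold=30.0):
    """Return indices of frames whose DC image differs from the previous
    frame's DC image by more than `threshold` on average."""
    cuts = []
    prev = dc_image(frames[0])
    for i in range(1, len(frames)):
        cur = dc_image(frames[i])
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if diff > threshold:
            cuts.append(i)
        prev = cur
    return cuts
```

Working on block means rather than full pixels is what makes this step cheap on MPEG data, where DC coefficients are available without full decoding.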

Content-based Image Retrieval using LBP and HSV Color Histogram (LBP와 HSV 컬러 히스토그램을 이용한 내용 기반 영상 검색)

  • Lee, Kwon;Lee, Chulhee
    • Journal of Broadcast Engineering / v.18 no.3 / pp.372-379 / 2013
  • In this paper, we propose a content-based image retrieval algorithm using local binary patterns (LBP) and an HSV color histogram. In an image retrieval system, images are retrieved using a query image as input. Many studies are based on the global distribution of features such as color, texture, and shape. These techniques lose retrieval performance on images in which the background occupies a large part of the image. To overcome this drawback, the proposed method quickly extracts the background and emphasizes the features of the object by shrinking the background. The proposed method uses an HSV color histogram and local binary patterns, and also extracts local binary patterns in the quantized hue domain. Experimental results show that the proposed method achieves 82% precision on the Corel 1000 database.
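
The two feature types named in this abstract can be sketched with the standard library alone. This shows the basic 3x3 LBP code and a quantized hue histogram only; the background extraction and shrinking steps are not shown, and the function names, neighbour order, and `bins` value are illustrative assumptions.

```python
import colorsys

# Clockwise 3x3 neighbour offsets (dy, dx), starting at the top-left.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, y, x):
    """8-bit local binary pattern of the pixel at (y, x)."""
    center = img[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(NEIGHBOURS):
        if img[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over the interior pixels."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    return hist


def hue_histogram(pixels, bins=8):
    """Histogram of quantized hue for a list of (r, g, b) pixels in 0..255."""
    hist = [0] * bins
    for r, g, b in pixels:
        h, _, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        hist[min(int(h * bins), bins - 1)] += 1
    return hist
```

A retrieval system would typically concatenate or weight such histograms and rank database images by a histogram distance to the query.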

Content-based image retrieval using region-based image querying (영역 기반의 영상 질의를 이용한 내용 기반 영상 검색)

  • Kim, Nac-Woo;Song, Ho-Young;Kim, Bong-Tae
    • The Journal of Korean Institute of Communications and Information Sciences / v.32 no.10C / pp.990-999 / 2007
  • In this paper, we propose a region-based image retrieval method using JSEG, a method for unsupervised segmentation of color-texture regions. JSEG discretizes an image by color classification, builds the J-image by applying a window mask to each region, and then segments the image by region growing and merging. The image segmented by JSEG is presented to the user as the query image, and the user can select a few segmented regions as the query region. After finding the MBR (minimum bounding rectangle) of the regions selected by the user and generating multiple window masks based on the center point of the MBR, we extract feature vectors from the selected regions. We use the accumulated histogram as the global descriptor to compare the performance of the feature vectors extracted by each method. Because feature vectors extracted from specific regions and from global regions are applied to retrieval simultaneously, our approach returns relevant images for a given query quickly and accurately. Experimental evidence suggests that our algorithm outperforms recent image-based methods for image indexing and retrieval.
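
Two of the building blocks mentioned here, the MBR of a selected region and an accumulated histogram, can be sketched as below. JSEG segmentation and the window-mask generation are not shown; the function names and the `bins` and `vmax` parameters are assumptions for illustration.

```python
from itertools import accumulate


def mbr(points):
    """Minimum bounding rectangle of (x, y) pixels: (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def mbr_center(box):
    """Center point of a bounding rectangle."""
    xmin, ymin, xmax, ymax = box
    return ((xmin + xmax) / 2, (ymin + ymax) / 2)


def accumulated_histogram(values, bins, vmax=256):
    """Normalized cumulative histogram of integer values in 0..vmax-1."""
    hist = [0] * bins
    for v in values:
        hist[min(v * bins // vmax, bins - 1)] += 1
    total = len(values)
    return [c / total for c in accumulate(hist)]
```

The center returned by `mbr_center` is the point around which window masks would be placed in a scheme like the one the abstract describes.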

Enhancement of HCB Tree for Improving Retrieval Performance and Dynamic Environments (검색 성능 향상과 동적 환경을 위한 HCB 트리의 개선)

  • Kim, Sung Wan
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.2 / pp.365-371 / 2015
  • A CB tree represents a binary trie as a compact binary sequence. However, retrieval time grows quickly, because the more keys are stored in the trie, the longer the binary sequences become. It is also inefficient under frequent key insertion and deletion. The HCB tree is a hierarchical CB tree consisting of small binary tries. However, it cannot avoid shift operations and has to scan an additional table to refer to a child or parent trie. To improve retrieval performance and avoid shift operations when keys are inserted or deleted, this paper represents each separated trie as a full binary trie and assigns a unique identifier to it. Theoretical evaluations show that both the proposed approach and the HCB tree perform better than the CB tree for key retrieval. The proposed approach shows the highest performance for key insertion and deletion and, moreover, requires only 71% to 89% of the storage of the CB tree.

Comparison of User-generated Tags with Subject Descriptors, Author Keywords, and Title Terms of Scholarly Journal Articles: A Case Study of Marine Science

  • Vaidya, Praveenkumar;Harinarayana, N.S.
    • Journal of Information Science Theory and Practice / v.7 no.1 / pp.29-38 / 2019
  • Information retrieval is the challenge of the Web 2.0 world. Organising knowledge amid the abundant information available from various sources is a major hurdle to achieving information retrieval with greater precision and recall. The fast-changing landscape of information organisation through social networking sites at a personal level creates a world of opportunities for data scientists and library professionals to assimilate social data with expert-created data. Thus, folksonomies or social tags play a vital role in information organisation and retrieval. Comparing these user-created tags with expert-created index terms, author keywords, and title words throws light on the differences between these sets of data. Such comparative studies reveal new sets of terms that enhance subject access and reflect the extent of similarity between user-generated tags and the other sets of terms. The CiteULike tags extracted from 5,150 scholarly journal articles in marine science were compared with the corresponding Aquatic Science and Fisheries Abstracts descriptors, author keywords, and title terms. The Jaccard similarity coefficient was used to compare the social tags with the above-mentioned word sets, and the results showed that user-generated keywords are present in the Aquatic Science and Fisheries Abstracts descriptors, author keywords, and title words. When information retrieval techniques such as stemming and lemmatization were applied, the results were found to enhance the contribution of keywords to subject access.
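
The Jaccard similarity coefficient used in this study is the size of the intersection of two sets divided by the size of their union. A minimal sketch follows; the function name and the sample term sets are made up for illustration and are not data from the study.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two term collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


# Hypothetical example: user tags versus author keywords for one article.
tags = {"coral", "reef", "ocean"}
keywords = {"coral", "ocean", "fisheries", "salinity"}
similarity = jaccard(tags, keywords)  # 2 shared terms out of 5 distinct
```

Applying stemming or lemmatization to both sets before the comparison would let variants such as "reef" and "reefs" count as the same term.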

Protein Sequence Search based on N-gram Indexing

  • Hwang, Mi-Nyeong;Kim, Jin-Suk
    • Bioinformatics and Biosystems / v.1 no.1 / pp.46-50 / 2006
  • With advances in experimental techniques in molecular biology, genomic and protein sequence databases are growing exponentially in size, and mean sequence lengths are also increasing. As these databases grow, it becomes difficult to search them for sequences with significant homology to a query sequence. In this paper, we present an N-gram indexing method to retrieve similar sequences quickly, precisely, and comparably. The method regards a protein sequence as text written in a language of 20 amino acid codes and adopts fixed-length N-gram tokens as its indexing scheme for sequence strings. After such tokens are indexed for all sequences in the database, sequences can be searched with information retrieval algorithms. Using this method, we have developed a protein sequence search system named ProSeS (PROtein Sequence Search). ProSeS is a protein sequence analysis system that provides overall analysis results such as similar sequences with significant homology, predicted subcellular locations of the query sequence, and major keywords extracted from the annotations of similar sequences. We show experimentally that the N-gram indexing approach significantly reduces retrieval time and that it is as accurate as BLAST, a currently popular search tool.
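
The indexing idea can be sketched as an inverted index from fixed-length N-grams to sequence identifiers. This is a toy illustration, not ProSeS itself: the function names, the choice of `n=3`, the shared-N-gram count used for ranking, and the sample sequences are all assumptions.

```python
from collections import Counter, defaultdict


def ngrams(seq, n=3):
    """All overlapping fixed-length N-grams of a sequence string."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def build_index(sequences, n=3):
    """Inverted index: N-gram -> set of ids of sequences containing it."""
    index = defaultdict(set)
    for seq_id, seq in sequences.items():
        for gram in ngrams(seq, n):
            index[gram].add(seq_id)
    return index


def search(index, query, n=3):
    """Rank sequence ids by the number of distinct query N-grams they share."""
    hits = Counter()
    for gram in set(ngrams(query, n)):
        for seq_id in index.get(gram, ()):
            hits[seq_id] += 1
    return hits.most_common()


# Made-up sequences for illustration only.
db = {"P1": "MKTAYIAKQR", "P2": "GSHMKTAYIA", "P3": "LLLLVVVV"}
results = search(build_index(db), "KTAYI")
```

Because a query touches only the index entries for its own N-grams, the lookup cost depends on the query length and the posting lists, not on a scan of every sequence.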


A Study on Generation of Query for Korean Information Retrieval (한국어 정보검색을 위한 질의어 생성에 관한 연구)

  • Lee Deok-Nam;Park In-Chol
    • Journal of the Korea Academia-Industrial cooperation Society / v.7 no.3 / pp.358-364 / 2006
  • With the rapid development of the Internet, a great deal of information is now available, and it is no exaggeration to say that supplying better-quality information to users depends on correctly grasping the intention behind the user's query. This paper therefore proposes generating the semantic relations between keywords from the results of morphological analysis and syntactic analysis of a natural language query. This approach carries more semantic relations than a query made of simple keywords or simple combinations of keywords. It is therefore expected to permit much more efficient information retrieval, because it addresses the problems of existing query forms and generates queries that reflect the user's intention more accurately.


Dynamic index storage and integrated searching service development (동적 색인 스토리지 및 통합 검색 서비스 개발)

  • Lee, Wang-Woo;Lee, Seok-Hyoung;Choe, Ho-Seop;Yoon, Hwa-Mook;Kim, Jong-Hwan;Hur, Yoon-Young
    • Proceedings of the Korea Contents Association Conference / 2007.11a / pp.346-349 / 2007
  • In this paper, we introduce an integrated search system built for a web news and review retrieval service. We built XSLTRobot, which extracts the title, date, author, and content from HTML documents such as news articles or reviews for the search service. XSLTRobot uses XSLT to extract the desired parts of an HTML page. The Integrated Information Retrieval System (IIRS) is suitable for various search data formats. We also introduce Dynamic Index Storage, a module of IIRS. Dynamic Index Storage is used in environments that need fast index updates, such as news. Its design focuses on retrieval performance, because there are not many documents that it has to update in real time.