• Title/Summary/Keyword: Automatic Indexing Algorithm


Eojeol-Block Bidirectional Algorithm for Automatic Word Spacing of Hangul Sentences (한글 문장의 자동 띄어쓰기를 위한 어절 블록 양방향 알고리즘)

  • Kang, Seung-Shik
    • Journal of KIISE:Software and Applications / v.27 no.4 / pp.441-447 / 2000
  • Automatic word spacing is needed for two problems: automatic indexing of documents written without spaces, and space insertion at line ends in character recognition systems. We propose a word spacing algorithm that automatically finds word-spacing positions. It recognizes Eojeol components by combining sentence partitioning with a bidirectional longest-match algorithm. Sentence partitioning extracts Eojeol blocks at points where the Eojeol boundary is relatively clear, and a Korean morphological analyzer is then applied bidirectionally to recognize the Eojeol components. We tested the algorithm on two sentence groups of about 4,500 Eojeols. The space-level recall ratio was 97.3% and the Eojeol-level recall ratio was 93.2%.
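
The bidirectional longest-match idea in the abstract above can be illustrated with a minimal sketch. This is not the author's algorithm: it assumes a hypothetical toy word list in place of a Korean morphological analyzer, treats any unknown character as a one-character unit, and only reports the spacing positions on which the forward and backward scans agree. All function names are illustrative.

```python
def forward_longest_match(text, lexicon):
    """Greedy left-to-right scan; unknown characters become single units."""
    max_len = max(map(len, lexicon))  # assumes a non-empty lexicon
    units, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            if size == 1 or text[i:i + size] in lexicon:
                units.append(text[i:i + size])
                i += size
                break
    return units


def backward_longest_match(text, lexicon):
    """Greedy right-to-left scan, returned in reading order."""
    max_len = max(map(len, lexicon))
    units, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            if size == 1 or text[j - size:j] in lexicon:
                units.append(text[j - size:j])
                j -= size
                break
    return units[::-1]


def boundaries(units):
    """Character offsets at which a space would be inserted."""
    offsets, pos = set(), 0
    for unit in units[:-1]:
        pos += len(unit)
        offsets.add(pos)
    return offsets


def agreed_boundaries(text, lexicon):
    """Spacing positions found by both scan directions."""
    fwd = boundaries(forward_longest_match(text, lexicon))
    bwd = boundaries(backward_longest_match(text, lexicon))
    return sorted(fwd & bwd)


toy_lexicon = {"나는", "학교에", "간다"}  # hypothetical word list
print(forward_longest_match("나는학교에간다", toy_lexicon))
print(agreed_boundaries("나는학교에간다", toy_lexicon))
```

Positions on which the two directions disagree could be treated as ambiguous and passed to a stronger analyzer; that policy is a design assumption here, not something the abstract states.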


Video Indexing using Motion vector and brightness features (움직임 벡터와 빛의 특징을 이용한 비디오 인덱스)

  • 이재현;조진선
    • Journal of the Korea Society of Computer and Information / v.3 no.4 / pp.27-34 / 1998
  • In this paper we present a method for automatic video indexing and retrieval based on motion vectors and brightness. We extract a representative frame (R-frame) from each shot and compute motion-vector and brightness features for it. For each R-frame we compute the optical flow field, from which the motion-vector features are derived; a block matching algorithm (BMA) is used to find the motion vectors. The brightness features are tied to cut detection by the brightness-histogram method. A video database provides content-based access to video, which is achieved by organizing or indexing the video data on some set of features. In this paper the feature index is based on a B+ search tree, whose internal and leaf nodes are stored on a direct-access storage device. This paper defines the problem of video indexing based on video data models.
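
As a rough illustration of cut detection from brightness histograms, the sketch below compares the histograms of consecutive frames and flags a cut when they differ strongly. The bin count, the normalized absolute-difference measure, and the threshold are assumptions chosen for the example; the abstract does not specify them. Frames are modelled as equally sized 2-D lists of gray levels in 0..255.

```python
def brightness_histogram(frame, bins=16):
    """Count pixels per brightness bin (gray levels 0..255)."""
    hist = [0] * bins
    for row in frame:
        for value in row:
            hist[value * bins // 256] += 1
    return hist


def histogram_difference(frame_a, frame_b, bins=16):
    """Normalized histogram distance in [0, 1] for equally sized frames."""
    hist_a = brightness_histogram(frame_a, bins)
    hist_b = brightness_histogram(frame_b, bins)
    total = sum(hist_a)
    return sum(abs(x - y) for x, y in zip(hist_a, hist_b)) / (2 * total)


def detect_cuts(frames, threshold=0.5):
    """Indices i where frame i starts a new shot (assumed threshold)."""
    return [
        i
        for i in range(1, len(frames))
        if histogram_difference(frames[i - 1], frames[i]) > threshold
    ]


dark = [[10] * 4 for _ in range(4)]
bright = [[200] * 4 for _ in range(4)]
print(detect_cuts([dark, dark, bright, bright]))  # → [2]
```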


A Theoretical Study of Designing Thesaurus Browser by Clustering Algorithm (클러스터링을 이용한 시소러스 브라우저의 설계에 대한 이론적 연구)

  • Seo, Hwi
    • Journal of Korean Library and Information Science Society / v.30 no.3 / pp.427-456 / 1999
  • This paper deals with the problems of information retrieval from full-text databases, which arise both from searchers' inadequate search strategies or methods and from the difficulties of query representation, generation, and extension. To solve these problems, automatic retrieval should replace the manual retrieval of the past. One way to narrow the gap between the terms used by writers and the queries posed by searchers is to search with the terms that the writers use. The precondition for solving these problems is therefore that all areas of information retrieval, such as content analysis, information structure, query formation, and query evaluation, be addressed in a coherent way. All areas of automatic information retrieval need to be dealt with for retrieval to be efficient, though this paper concentrates on the design of a thesaurus browser. Accordingly, this paper presents theoretical analyses of the forms of information retrieval, automatic indexing, clustering techniques, thesaurus construction and representation, and information retrieval techniques. From these analyses it derives a theoretical model, namely a thesaurus browser based on a clustering algorithm. The results will serve as a theoretical basis for new retrieval algorithms.


Automatic Parsing of MPEG-Compressed Video (MPEG 압축된 비디오의 자동 분할 기법)

  • Kim, Ga-Hyeon;Mun, Yeong-Sik
    • The Transactions of the Korea Information Processing Society / v.6 no.4 / pp.868-876 / 1999
  • This paper describes an efficient automatic parsing technique for MPEG-compressed video, a fundamental step for content-based indexing. The proposed method detects scene changes regardless of the IPB picture composition. To detect abrupt changes, it uses a difference measure based on the DC coefficients in I pictures together with the macroblock reference feature in P and B pictures. For gradual scene changes, it uses the macroblock reference information in P and B pictures. Scene change detection can be handled efficiently by extracting only the necessary data, without fully decoding the MPEG sequence. The performance of the proposed algorithm is analyzed in terms of precision and recall. The experimental results verify the effectiveness of the method in detecting scene changes in various MPEG sequences.
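
Precision and recall, the two measures named in the abstract above, can be computed for scene-change detection as in the following sketch. It assumes exact frame-number matching between detected and annotated changes; the paper may well use a tolerance window, which the abstract does not say. The frame numbers are made up for the example.

```python
def precision_recall(detected, ground_truth):
    """Return (precision, recall) for detected scene-change frame numbers."""
    detected, ground_truth = set(detected), set(ground_truth)
    hits = len(detected & ground_truth)  # correctly detected changes
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical frame numbers: two of four detections are correct,
# and two of three true scene changes are found.
precision, recall = precision_recall([10, 50, 90, 120], [10, 50, 100])
print(precision, recall)
```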


Automatic Indexing Algorithm of Golf Video Using Audio Information (오디오 정보를 이용한 골프 동영상 자동 색인 알고리즘)

  • Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.28 no.5 / pp.441-446 / 2009
  • This paper proposes an automatic indexing algorithm for golf video using audio information. In the proposed algorithm, the input stream is demultiplexed into video and audio streams. By means of an AdaBoost-cascade classifier, the continuous audio stream is classified into five segment types: announcer's speech recorded in the studio, music accompanying players' names on the TV screen, audience reaction to the play, reporter's speech with field background, and field noise such as wind or waves. Golf swing sounds, including drive, iron, and putting shots, are detected by impulse onset detection and modulation spectrum verification. The detected swings and applause are used to index action or highlight units. Compared with video-based semantic analysis, the main advantage of the proposed system is its low computational cost, which makes it easier to apply to embedded consumer electronic devices for fast browsing.
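
A very simple form of impulse onset detection is sketched below: it computes short-time energy per frame and reports a frame as an onset when its energy jumps well above the previous frame's. This is only an assumed stand-in for the detector in the paper, which additionally verifies candidates with the modulation spectrum; the frame size, energy ratio, and noise floor are hypothetical parameters.

```python
def frame_energies(samples, frame_size):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    return [
        sum(s * s for s in samples[i:i + frame_size]) / frame_size
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]


def impulse_onsets(samples, frame_size=4, ratio=8.0, floor=1e-6):
    """Sample offsets of frames whose energy exceeds `ratio` times the
    previous frame's energy (with a small floor to avoid dividing by silence)."""
    energies = frame_energies(samples, frame_size)
    onsets = []
    for k in range(1, len(energies)):
        if energies[k] > ratio * max(energies[k - 1], floor):
            onsets.append(k * frame_size)
    return onsets


# Quiet background, one short burst starting at sample 8, then quiet again.
signal = [0.01] * 8 + [1.0, -0.8, 0.5, -0.3] + [0.01] * 8
print(impulse_onsets(signal))  # → [8]
```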

Localization of captions in MPEG compression images based on I frame (I 프레임에 기반한 MPEG 압축영상에서의 자막 탐지)

  • 유태웅
    • Journal of the Korea Computer Industry Society / v.2 no.11 / pp.1465-1476 / 2001
  • For applications such as video indexing, text understanding, and automatic caption localization systems, real-time localization of captions is an essential task. This paper presents an algorithm for localizing captions in MPEG-compressed images based on I frames. In this algorithm, caption text regions are segmented from background images using their distinguishing texture characteristics and chrominance information. Unlike previously published algorithms, which fully decompress the video sequence before extracting the text regions, this algorithm locates candidate caption text regions directly in the DCT compressed domain.


A Study of Designing the Knowledge Base System for the Query Extension by Index File (색인파일 기반의 질의어 확장용 지식베이스 구축에 관한 연구)

  • Seo, Whee
    • Journal of Korean Library and Information Science Society / v.40 no.2 / pp.139-159 / 2009
  • This study develops a knowledge base system for query extension in user-oriented information retrieval. It surveys the theories of concept-based and statistics-based information retrieval methods. In constructing a knowledge base, the common hypothesis is that the relatedness of terms is indicated by how frequently they occur together in a set of documents. Using subject index file algorithms and the Boolean 'and' operator on the basis of this hypothesis, this study builds the knowledge base. The subject of the experimental knowledge base is education. Using the book Introduction to Education, two experimental knowledge base systems were constructed with different indexing methods: one with controlled-language indexing and the other with natural-language indexing. The performance of the two knowledge base systems is evaluated.
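
The co-occurrence hypothesis and the Boolean 'and' operator can be illustrated with a small inverted index: intersecting two terms' posting sets gives the documents in which both occur, and the size of that intersection serves as a relatedness score. The sketch is an assumption about how such a knowledge base might be built, not the study's actual procedure; the document IDs, index terms, and the `min_cooccurrence` cutoff are invented for the example.

```python
from collections import defaultdict


def build_index(documents):
    """Map each index term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, terms in documents.items():
        for term in terms:
            index[term].add(doc_id)
    return index


def related_terms(index, term, min_cooccurrence=2):
    """Terms that share at least `min_cooccurrence` documents with `term`,
    ordered by co-occurrence count, then alphabetically."""
    base = index.get(term, set())
    scores = {}
    for other, postings in index.items():
        if other == term:
            continue
        shared = len(base & postings)  # Boolean AND of the two posting sets
        if shared >= min_cooccurrence:
            scores[other] = shared
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


docs = {  # hypothetical index entries for four documents
    1: ["curriculum", "teacher", "school"],
    2: ["curriculum", "teacher"],
    3: ["curriculum", "school", "policy"],
    4: ["teacher", "policy"],
}
print(related_terms(build_index(docs), "curriculum"))
# → [('school', 2), ('teacher', 2)]
```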


Automatic Extraction and Measurement of Visual Features of Mushroom (Lentinus edodes L.) (표고 외관 특징점의 자동 추출 및 측정)

  • Hwang, Heon;Lee, Yong-Guk
    • Journal of Bio-Environment Control / v.1 no.1 / pp.37-51 / 1992
  • Quantizing and extracting the visual features of the mushroom (Lentinus edodes L.) is crucial for automated sorting and grading, growth state measurement, and drying-performance indexing. A computer image processing system was used to extract and measure the visual features of the front and back sides of the mushroom. The system is composed of an IBM PC compatible 386DK, an ITEX PCVISION Plus frame grabber, a B/W CCD camera, a VGA color graphic monitor, and an RGB monitor for image output. In this paper, an automatic thresholding algorithm was developed to yield a segmented binary image representing the skin states of the front and back sides. An eight-directional Freeman chain coding was modified to solve edge disconnectivity by gradually expanding the mask size from $3\times3$ to $9\times9$. A real-scale geometric quantity of the object was extracted directly from the 8-directional chain elements. The external shape of the mushroom was analyzed and converted to quantitative feature patterns. Efficient algorithms were developed for extracting the selected feature patterns and for recognizing the front and back sides. The developed algorithms were coded in a menu-driven way using MS_C language Ver.6.0, PC VISION PLUS library functions, and VGA graphic functions.
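
One standard way to read a geometric quantity directly off an 8-directional Freeman chain code is the perimeter estimate sketched below: axis-aligned moves (even codes) count one pixel and diagonal moves (odd codes) count √2 pixels, scaled by a calibration factor. The direction numbering (0 = east, counter-clockwise) and the `scale` parameter are assumptions; the paper's modified chain coding with growing masks is not reproduced here.

```python
import math

# Freeman 8-direction offsets (dx, dy); code 0 points east, codes increase
# counter-clockwise. This numbering is a common convention, assumed here.
OFFSETS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]


def chain_perimeter(codes, scale=1.0):
    """Boundary length: even codes are unit steps, odd codes are diagonals.
    `scale` converts pixels to real units (for example, mm per pixel)."""
    return scale * sum(1.0 if c % 2 == 0 else math.sqrt(2) for c in codes)


def chain_is_closed(codes):
    """True when the chain returns to its starting pixel."""
    dx = sum(OFFSETS[c][0] for c in codes)
    dy = sum(OFFSETS[c][1] for c in codes)
    return (dx, dy) == (0, 0)


square = [0, 0, 2, 2, 4, 4, 6, 6]  # 2-by-2 pixel square outline
print(chain_is_closed(square), chain_perimeter(square))  # → True 8.0
```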


Generation of Video Clips Utilizing Shot Boundary Detection (샷 경계 검출을 이용한 영상 클립 생성)

  • Kim, Hyeok-Man;Cho, Seong-Kil
    • Journal of KIISE:Computing Practices and Letters / v.7 no.6 / pp.582-592 / 2001
  • Video indexing plays an important role in applications such as digital video libraries and web VOD, which archive large volumes of digital video. Video indexing is usually based on video segmentation. In this paper, we propose a software tool called V2Web Studio, which generates video clips using a shot boundary detection algorithm. With V2Web Studio, clip generation consists of the following four steps: 1) automatic detection of shot boundaries by parsing the video, 2) elimination of errors by manually verifying the detection results, 3) building a logical hierarchy model from the verified shots, and 4) generating multiple video clips corresponding to each logically modeled segment. These steps are performed by the shot detector, shot verifier, video modeler, and clip generator in V2Web Studio, respectively.


Automatic Construction of Reduced Dimensional Cluster-based Keyword Association Networks using LSI (LSI를 이용한 차원 축소 클러스터 기반 키워드 연관망 자동 구축 기법)

  • Yoo, Han-mook;Kim, Han-joon;Chang, Jae-young
    • Journal of KIISE / v.44 no.11 / pp.1236-1243 / 2017
  • In this paper, we propose a novel way of producing keyword networks, named LSI-based ClusterTextRank, which extracts significant key words from a set of clusters with a mutual information metric, and constructs an association network using latent semantic indexing (LSI). The proposed method reduces the dimension of documents through LSI, decomposes documents into multiple clusters through k-means clustering, and expresses the words within each cluster as a maximal spanning tree graph. The significant key words are identified by evaluating their mutual information within clusters. Then, the method calculates the similarities between the extracted key words using the term-concept matrix, and the results are represented as a keyword association network. To evaluate the performance of the proposed method, we used travel-related blog data and showed that the proposed method outperforms the existing TextRank algorithm by about 14% in terms of accuracy.
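
The last step described in the abstract above, turning keyword similarities from a term-concept matrix into an association network, might look like the following sketch. It assumes that each keyword already has a low-dimensional concept vector (as LSI would produce), that similarity is cosine similarity, and that an edge is kept when the similarity reaches a threshold; none of these choices is confirmed by the abstract, and the vectors are invented.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def association_network(term_vectors, threshold=0.8):
    """Edges (term_a, term_b) -> similarity for pairs at or above threshold."""
    edges = {}
    for a, b in combinations(sorted(term_vectors), 2):
        similarity = cosine(term_vectors[a], term_vectors[b])
        if similarity >= threshold:
            edges[(a, b)] = similarity
    return edges


# Hypothetical rows of a term-concept matrix with two latent concepts.
vectors = {
    "beach": [0.9, 0.1],
    "island": [0.8, 0.2],
    "museum": [0.1, 0.9],
}
print(list(association_network(vectors)))  # → [('beach', 'island')]
```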