• Title/Summary/Keyword: indexing method

Automatic Construction of Reduced Dimensional Cluster-based Keyword Association Networks using LSI (LSI를 이용한 차원 축소 클러스터 기반 키워드 연관망 자동 구축 기법)

  • Yoo, Han-mook;Kim, Han-joon;Chang, Jae-young
    • Journal of KIISE / v.44 no.11 / pp.1236-1243 / 2017
  • In this paper, we propose a novel way of producing keyword networks, named LSI-based ClusterTextRank, which extracts significant key words from a set of clusters with a mutual information metric, and constructs an association network using latent semantic indexing (LSI). The proposed method reduces the dimension of documents through LSI, decomposes documents into multiple clusters through k-means clustering, and expresses the words within each cluster as a maximal spanning tree graph. The significant key words are identified by evaluating their mutual information within clusters. Then, the method calculates the similarities between the extracted key words using the term-concept matrix, and the results are represented as a keyword association network. To evaluate the performance of the proposed method, we used travel-related blog data and showed that the proposed method outperforms the existing TextRank algorithm by about 14% in terms of accuracy.
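The abstract above says the words within each cluster are expressed as a maximal spanning tree. As a rough illustration of that single step, the sketch below runs Kruskal's algorithm on edges sorted by descending weight. The word list and the association weights are hypothetical, and the abstract does not say which spanning-tree algorithm the authors use, so this is an assumption rather than their implementation.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # (word_a, word_b, association weight)


def maximum_spanning_tree(edges: List[Edge]) -> List[Edge]:
    """Kruskal's algorithm applied to edges sorted by descending weight."""
    parent: Dict[str, str] = {}

    def find(node: str) -> str:
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree: List[Edge] = []
    for a, b, weight in sorted(edges, key=lambda e: e[2], reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two components, so keep it
            parent[root_a] = root_b
            tree.append((a, b, weight))
    return tree


# Hypothetical association weights between words of one travel-blog cluster.
cluster_edges = [
    ("beach", "hotel", 0.9),
    ("beach", "flight", 0.2),
    ("hotel", "flight", 0.6),
    ("hotel", "breakfast", 0.7),
    ("flight", "breakfast", 0.1),
]
tree = maximum_spanning_tree(cluster_edges)
for a, b, weight in tree:
    print(f"{a} - {b}: {weight}")
```

The tree keeps the strongest associations that still connect every word, which is why the weak "beach - flight" and "flight - breakfast" edges are dropped here.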

An XML Tag Indexing Method Using on Lexical Similarity (XML 태그를 분류에 따른 가중치 결정)

  • Jeong, Hye-Jin;Kim, Yong-Sung
    • The KIPS Transactions:PartB / v.16B no.1 / pp.71-78 / 2009
  • For more effective index extraction and index weight determination, studies on index extraction use document structure as well as document content. However, most studies concentrate on calculating the importance of context rather than that of XML tags, and they determine tag importance by common sense rather than verifying it through objective experiments. For automatic indexing using the tag information of XML documents, which have become the standard for web document management, this paper classifies the major tags that make up a paper according to their importance and calculates the weights of terms extracted from low-weight tags. Using the weights obtained, this paper proposes a method of calculating the final weight while updating the weights of terms extracted from high-weight tags. To determine more objective weights, this paper tests which tags users consider important and reflects the results in the weight calculation by classifying tag importance accordingly. It then verifies the effectiveness of the proposed index weights by comparing search performance against index weights calculated with an existing method of determining tag importance.

A New Spatial Indexing Method for Level-Of-Detailed Data (레벨별로 상세화된 공간 데이터를 위한 새로운 공간 인덱싱 기법)

  • 권준희;윤용익
    • Journal of Korea Multimedia Society / v.5 no.4 / pp.361-371 / 2002
  • An efficient access technique is one of the most important requirements in GIS. Using level-of-detail data, spatial data can be accessed efficiently because the fully detailed spatial data need not be read. Previous spatial access methods do not access level-of-detail data efficiently. A few spatial access methods for level-of-detail data have been proposed to solve this problem. However, these methods support only a few kinds of level-of-detail data, i.e., data produced through selection and simplification operations. We therefore propose a new spatial indexing method that supports fast searching in all kinds of level-of-detail data. In the proposed method, the indexes for the individual levels are integrated into a single index structure. Experimental results show that our method offers both no data redundancy and high search performance.

An Efficient Frequent Melody Indexing Method to Improve Performance of Query-By-Humming System (허밍 질의 처리 시스템의 성능 향상을 위한 효율적인 빈번 멜로디 인덱싱 방법)

  • You, Jin-Hee;Park, Sang-Hyun
    • Journal of KIISE:Databases / v.34 no.4 / pp.283-303 / 2007
  • Recently, efficient ways to store and retrieve enormous amounts of music data have become an important issue in multimedia databases. The most common approach in MIR (Music Information Retrieval) is text-based, using text information to search for the desired music. However, if users do not remember keywords about the music, it cannot give them correct answers. Moreover, since these systems are implemented only for exact matching between the query and the music data, they cannot find any information on similar music data. Thus, these systems are unsuitable for similarity matching of music data. To solve this problem, we propose an Efficient Query-By-Humming System (EQBHS) with a content-based indexing method that efficiently stores and retrieves music when a user queries with inaccurate humming. To accelerate query processing in EQBHS, we design indices for significant melodies, which are 1) frequent melodies that occur many times in a single piece of music, on the assumption that users hum what they can easily remember, and 2) melodies partitioned by rests. In addition, we propose an error-tolerant method for mapping a note to a character, which makes searching efficient, and a frequent melody extraction algorithm. We verified the assumption about frequent melodies by means of a questionnaire and compared the performance of the proposed EQBHS with that of N-gram through various experiments on a collection of music data.
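The abstract above does not spell out its note-to-character mapping or its frequent melody extraction algorithm. The sketch below substitutes two simple stand-ins to show the general idea: a direction-only pitch contour (similar to Parsons code) as the error-tolerant mapping, and substring counting as the frequent-melody step. The function names, the `tolerance` parameter and the sample pitches are all hypothetical.

```python
from collections import Counter
from typing import Dict, List


def contour_string(pitches: List[int], tolerance: int = 1) -> str:
    """Map consecutive MIDI pitches to 'U' (up), 'D' (down) or 'S' (same).

    Only the direction of each interval is kept, so an error in interval
    size does not change the string. Intervals of at most `tolerance`
    semitones count as 'S'.
    """
    symbols = []
    for previous, current in zip(pitches, pitches[1:]):
        step = current - previous
        if abs(step) <= tolerance:
            symbols.append("S")
        elif step > 0:
            symbols.append("U")
        else:
            symbols.append("D")
    return "".join(symbols)


def frequent_ngrams(text: str, n: int, min_count: int) -> Dict[str, int]:
    """Return every length-n substring that occurs at least min_count times."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return {gram: c for gram, c in counts.items() if c >= min_count}


# Hypothetical song: a three-note rising motif repeated three times.
song = [60, 64, 67, 60, 64, 67, 60, 64, 67]
# Hypothetical imprecise hum of the same motif (wrong interval sizes).
hummed = [60, 64, 68, 59, 64, 67]

song_contour = contour_string(song)
hum_contour = contour_string(hummed)
motifs = frequent_ngrams(song_contour, n=2, min_count=3)

print(song_contour, hum_contour, hum_contour in song_contour, motifs)
```

Because the hum maps to a substring of the song's contour despite its pitch errors, an index built over frequent contour substrings can still locate it.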

A Study on the Retrieval Effectiveness of KWIC Index versus Descriptor Index (KWIC색인(索引)과 Descriptor색인(索引)의 검색(檢索) 효율성(效率性))

  • Choi, Sang-Ki
    • Journal of the Korean Society for Information Management / v.2 no.1 / pp.96-107 / 1985
  • The purpose of this study is to compare the retrieval effectiveness of a KWIC index produced by automatic indexing with that of a Descriptor index produced by manual indexing. The experiment uses 281 journal articles and 10 user requests in the area of nuclear engineering. The results show an average recall ratio of 54.89% for the KWIC index and 64.42% for the Descriptor index.
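For reference, the recall ratio reported above is the fraction of relevant documents that a search actually returns. The sketch below computes it per request and then averages over requests. Whether the study averaged per request in this way is an assumption, and the document identifiers are made up.

```python
from typing import Iterable, List, Set, Tuple


def recall(retrieved: Iterable[str], relevant: Set[str]) -> float:
    """Fraction of the relevant documents that the search returned."""
    if not relevant:
        raise ValueError("recall is undefined without relevant documents")
    hits = len(set(retrieved) & relevant)
    return hits / len(relevant)


def average_recall(runs: List[Tuple[List[str], Set[str]]]) -> float:
    """Mean of the per-request recall values (assumed averaging scheme)."""
    return sum(recall(retrieved, relevant) for retrieved, relevant in runs) / len(runs)


# Hypothetical results for two user requests: (retrieved, relevant).
runs = [
    (["d1", "d2", "d3"], {"d1", "d3", "d7", "d9"}),  # 2 of 4 relevant found
    (["d4"], {"d4"}),                                # 1 of 1 relevant found
]
print(f"{average_recall(runs):.2%}")
```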

A Study on Fire Risk Analysis & Indexing of Buildings (건축물의 화재위험의 분석과 지수화에 관한 연구)

  • Chung, Eui-Soo;Yang, Kwang-Mo;Ha, Jeong-Ho;Kang, Kyung-Sik
    • Journal of the Korea Safety Management & Science / v.10 no.4 / pp.93-104 / 2008
  • A successful fire risk assessment depends on identifying risks, analyzing potential risks, and estimating their likelihood and the breadth and depth of their consequences. Taking the influence on the enterprise into consideration, a fire risk assessment can be carried out by evaluating risk importance, risk level, and risk acceptance. The choice of risk assessment technique is largely limited by cost and time. If a high-level risk assessment or Probabilistic Risk Assessment of a building is unnecessary, fire risk can be indexed and assessed with the Relative Ranking Method. As a working-level technique, the AHP method is useful in practice.

An Effective Method for Dimensionality Reduction in High-Dimensional Space (고차원 공간에서 효과적인 차원 축소 기법)

  • Jeong Seung-Do;Kim Sang-Wook;Choi Byung-Uk
    • Journal of the Institute of Electronics Engineers of Korea CI / v.43 no.4 s.310 / pp.88-102 / 2006
  • In multimedia information retrieval, multimedia data are represented as vectors in high-dimensional space. To search these vectors effectively, a variety of indexing methods have been proposed. However, the performance of these indexing methods degrades dramatically with increasing dimensionality, which is known as the dimensionality curse. To resolve the dimensionality curse, dimensionality reduction methods have been proposed. They map feature vectors in high-dimensional space into vectors in low-dimensional space before indexing the data. This paper proposes a method for dimensionality reduction based on a function approximating the Euclidean distance, which makes use of the norm and angle components of a vector. First, we identify the causes of the errors in angle estimation for approximating the Euclidean distance, and discuss basic directions to reduce those errors. Then, we propose a novel method for dimensionality reduction that composes a set of subvectors from a feature vector and maintains only the norm and the estimated angle for every subvector. The selection of a good reference vector is important for accurate estimation of the angle component. We present criteria for a good reference vector, and propose a method that chooses a good reference vector by using the Levenberg-Marquardt algorithm. Also, we define a novel distance function, and formally prove that the distance function lower-bounds the Euclidean distance. This implies that our approach does not incur any false dismissals while reducing the dimensionality effectively. Finally, we verify the superiority of the proposed method via performance evaluation with extensive experiments.
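The abstract above does not give its distance function, so the sketch below uses one well-known norm-and-angle bound as a stand-in. If two subvectors make angles θx and θy with a common reference vector, the angle between them is at least |θx − θy|, so ‖x − y‖² ≥ ‖x‖² + ‖y‖² − 2‖x‖‖y‖cos(θx − θy). Summing this over subvectors gives a lower bound on the full Euclidean distance. The reference vector here is random rather than chosen by Levenberg-Marquardt, and all names are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def norm(v: Sequence[float]) -> float:
    return math.sqrt(sum(x * x for x in v))


def angle_to(v: Sequence[float], ref: Sequence[float]) -> float:
    """Angle in [0, pi] between v and ref (0.0 if either has zero length)."""
    nv, nr = norm(v), norm(ref)
    if nv == 0.0 or nr == 0.0:
        return 0.0
    cosine = sum(a * b for a, b in zip(v, ref)) / (nv * nr)
    return math.acos(max(-1.0, min(1.0, cosine)))  # clamp rounding error


def reduce_vector(v: Sequence[float], ref: Sequence[float],
                  parts: int) -> List[Tuple[float, float]]:
    """Keep only (norm, angle to reference) for each of `parts` subvectors."""
    size = -(-len(v) // parts)  # ceiling division
    return [
        (norm(v[i:i + size]), angle_to(v[i:i + size], ref[i:i + size]))
        for i in range(0, size * parts, size)
    ]


def lower_bound(a: List[Tuple[float, float]],
                b: List[Tuple[float, float]]) -> float:
    """Distance on reduced vectors that never exceeds the Euclidean distance."""
    total = 0.0
    for (na, ta), (nb, tb) in zip(a, b):
        total += na * na + nb * nb - 2.0 * na * nb * math.cos(ta - tb)
    return math.sqrt(max(0.0, total))


# Hypothetical 16-dimensional feature vectors reduced to 4 (norm, angle) pairs.
rng = random.Random(42)
x = [rng.uniform(-1, 1) for _ in range(16)]
y = [rng.uniform(-1, 1) for _ in range(16)]
reference = [rng.uniform(-1, 1) for _ in range(16)]

bound = lower_bound(reduce_vector(x, reference, 4), reduce_vector(y, reference, 4))
print(bound, math.dist(x, y))
```

Since the bound never exceeds the true distance, a range query can safely discard any candidate whose bound is already larger than the search radius, which is the no-false-dismissal property the abstract refers to.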

A Fast and Powerful Question-answering System using 2-pass Indexing and Rule-based Query Processing Method (2-패스 색인 기법과 규칙 기반 질의 처리기법을 이용한 고속, 고성능 질의 응답 시스템)

  • 김학수;서정연
    • Journal of KIISE:Software and Applications / v.29 no.11 / pp.795-802 / 2002
  • We propose a fast and powerful question-answering (QA) system for Korean, which uses a predictive answer indexer based on a 2-pass scoring method. The indexing process is as follows. The predictive answer indexer first extracts all answer candidates in a document. Then, using the 2-pass scoring method, it scores the adjacent content words that are closely related to each answer candidate. Next, it stores the weighted content words with each candidate in a database. Using this technique, along with a complementary question analysis based on lexico-syntactic pattern matching, the proposed QA system reduces response time and improves precision.

A Novel Blind Watermarking Scheme Using Block Indexing (블록 인덱싱을 이용한 블라인드 워터마킹 기법)

  • Kang Hyun-Ho;Shin Sang-Uk;Han Seung-Wu
    • The Journal of the Korea Contents Association / v.5 no.6 / pp.331-342 / 2005
  • In this paper, we propose an efficient watermarking algorithm using block indexing. The proposed algorithm is a novel blind watermarking scheme that uses indexed watermark values based on the spread spectrum method. After the original image is divided into sub-blocks, the watermark is assigned to the index value of each block. The watermark to be embedded is mapped to the index values of blocks; the mapped blocks are then transformed with the DCT, and a PN sequence is embedded in the middle frequency band. Consequently, the watermark is expressed by the index values of sub-blocks. The watermark is extracted from the correlation between the PN sequence and the watermarked image. Experimental results demonstrate that the watermarked image has good quality in terms of imperceptibility and is robust against various attacks.

Indexing Considering Video Rating of Scenes in Video (동영상의 장면별 비디오 등급을 고려한 색인)

  • Kim Young-Bong
    • Journal of Game and Entertainment / v.2 no.2 / pp.51-60 / 2006
  • Recently, many streaming videos, including dramas, music videos, and movies, have been widely provided on the web. Such video services have been passive about restricting service by user age, and where restrictions exist, the whole video is restricted according to the user's age. Therefore, in this paper, we present a new method that sets a video rating for each scene in a video and grants access depending on the age of users. To achieve this restricted access, we first divide a streaming video into scenes using histogram techniques. Each scene is assigned an access control level depending on its nudity level. Finally, we build a video index that includes the access level of each scene and hide restricted scenes with several masks when the streaming video is played.
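The abstract above mentions dividing a video into scenes "using histogram techniques" without further detail. One common form of that idea, sketched below, marks a scene boundary wherever the grey-level histograms of consecutive frames differ by more than a threshold. The bin count, the threshold and the tiny 4-pixel frames are assumptions chosen for illustration, not values from the paper.

```python
from typing import List, Sequence


def histogram(frame: Sequence[int], bins: int = 8) -> List[float]:
    """Normalised histogram of 8-bit grey levels (0..255)."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // 256] += 1
    return [count / len(frame) for count in counts]


def scene_boundaries(frames: Sequence[Sequence[int]],
                     threshold: float = 0.5, bins: int = 8) -> List[int]:
    """Return each index i at which frame i starts a new scene."""
    if not frames:
        return []
    cuts = []
    previous = histogram(frames[0], bins)
    for index in range(1, len(frames)):
        current = histogram(frames[index], bins)
        # Half the L1 distance between normalised histograms lies in [0, 1].
        difference = sum(abs(p - c) for p, c in zip(previous, current)) / 2
        if difference > threshold:
            cuts.append(index)
        previous = current
    return cuts


# Hypothetical 4-pixel frames: two dark frames followed by two bright ones.
frames = [
    [10, 20, 30, 40],
    [12, 22, 28, 41],
    [200, 210, 220, 230],
    [205, 208, 222, 231],
]
cuts = scene_boundaries(frames)
print(cuts)
```

Each detected scene could then carry its own access level in the index, in line with the per-scene rating the abstract describes.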
