• Title/Summary/Keyword: indexing techniques (색인기법)

COVA: A Distance Learning System supporting Content-based Lecture Retrieval (COVA: 내용 기반 강의 검색을 지원하는 원격 학습 시스템)

  • 차광호
    • Journal of KIISE:Databases
    • /
    • v.31 no.2
    • /
    • pp.99-107
    • /
    • 2004
  • Education and training are expected to change dramatically due to the combined impact of Internet, database, and multimedia technologies. However, distance learning is often impeded by the lack of effective tools and systems for managing and retrieving lecture contents. This paper introduces a prototype system called COVA that enables remote users to access specific parts of interest by content from a large lecture database. COVA includes several novel techniques to achieve content-based lecture retrieval in distance learning: (1) an XML-based semistructured model to represent lecture contents; (2) a technique to build structural summaries, i.e., schemas, of XML lecture databases; (3) index structures to speed up the search for appropriate lecture contents.

Efficient Similarity Search in Multi-attribute Time Series Databases (다중속성 시계열 데이타베이스의 효율적인 유사 검색)

  • Lee, Sang-Jun
    • The KIPS Transactions:PartD
    • /
    • v.14D no.7
    • /
    • pp.727-732
    • /
    • 2007
  • Most previous work on indexing and searching time series has focused on similarity matching and retrieval of one-attribute time series. However, multimedia databases such as music and video need to handle similarity search in multi-attribute time series. The limitation of current similarity models for multi-attribute sequences is that they do not consider the individual attributes' sequences. A multi-attribute sequence is composed of several attributes' sequences. Since users may want to find similar patterns with respect to attributes' sequences, it is more appropriate to consider the similarity between two multi-attribute sequences from the viewpoint of attributes' sequences. In this paper, we propose a similarity search method based on attributes' sequences in multi-attribute time series databases. The proposed method efficiently reduces the search space and guarantees no false dismissals. In addition, we give preliminary experimental results to show the effectiveness of the proposed method.

An Efficient Indexing Method For XML Documents Using Order-Array (XML 문서의 효과적인 색인방법을 위한 Order-Array의 사용)

  • Kim Young;Ahn Chan-Min;Park Sang-Ho;Park Sun;Lee Ju-Hong;Chun Suk-Ju
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2004.11a
    • /
    • pp.77-80
    • /
    • 2004
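  • XML is now used in many fields, from e-commerce to specialized domains such as medicine, defense, and law, and the volume of XML data is growing rapidly. Accordingly, many indexing techniques are being studied that store large numbers of XML documents effectively and search them quickly. Among recent indexing techniques, those based on a numbering scheme perform well for most searches, but their search overhead can grow as the number of descendant nodes increases, and inserting large numbers of additional XML documents, or XML documents with a different structure, incurs a high cost for readjusting the index and data values. We therefore propose a method that is still based on a numbering scheme but adds a node range (Node-Range) and an Order-Array to each node, in order to improve search performance and to solve the problems that arise when inserting large numbers of XML documents or XML documents with a different structure.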
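
The numbering-scheme idea mentioned in this abstract can be illustrated with a generic interval (region) numbering, in which each element receives a `(start, end)` pair and ancestor tests reduce to interval containment. This is only a minimal sketch of the general technique, not the paper's Node-Range or Order-Array design; the function names `number_nodes` and `is_ancestor` and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def number_nodes(root):
    """Assign each element a (start, end) interval via depth-first traversal."""
    labels = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        start = counter
        for child in node:
            visit(child)
        counter += 1
        labels[node] = (start, counter)

    visit(root)
    return labels

def is_ancestor(labels, a, d):
    """a is a proper ancestor of d iff a's interval strictly contains d's."""
    sa, ea = labels[a]
    sd, ed = labels[d]
    return sa < sd and ed < ea

# Hypothetical sample document.
doc = ET.fromstring("<lib><book><title/><author/></book><cd/></lib>")
labels = number_nodes(doc)
book = doc.find("book")
title = book.find("title")
cd = doc.find("cd")
```

Because the labels here are consecutive integers, inserting a new element would force later nodes to be renumbered, which is the kind of readjustment cost the abstract refers to.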

Small Active Command Design for High Density DRAMs

  • Lee, Kwangho;Lee, Jongmin
    • Journal of the Korea Society of Computer and Information
    • /
    • v.24 no.11
    • /
    • pp.1-9
    • /
    • 2019
  • In this paper, we propose a Small Active Command scheme which reduces the power consumption of the command bus to DRAM. To do this, we target the ACTIVE command, which consists of multiple packets, containing the row address that occupies the largest size among the addresses delivered to the DRAM. The proposed scheme identifies frequently referenced row addresses as Hot pages first, and delivers index numbers of small caches (tables) located in the memory controller and DRAM. I-ACTIVE and I-PRECHARGE commands using unused bits of existing DRAM commands are added for index number transfer and cache synchronization management. Experimental results show that the proposed method reduces the command bus power consumption by 20% and 8.1% on average in the close-page and open-page policies, respectively.

Fast Hilbert R-tree Bulk-loading Scheme using GPGPU (GPGPU를 이용한 Hilbert R-tree 벌크로딩 고속화 기법)

  • Yang, Sidong;Choi, Wonik
    • Journal of KIISE
    • /
    • v.41 no.10
    • /
    • pp.792-798
    • /
    • 2014
  • In spatial databases, the R-tree is one of the most widely used indexing structures, and many variants have been proposed to improve its performance. Among these variants, the Hilbert R-tree is a representative method that uses the Hilbert curve to process large amounts of data without high-cost split techniques when constructing the R-tree. The Hilbert R-tree, however, is hardly applicable to large-scale applications in practice, mainly due to high pre-processing costs and slow bulk-load time. To overcome these limitations, we propose a novel approach for parallelizing Hilbert mapping and thus accelerating bulk-loading of the Hilbert R-tree on GPU memory. The GPU-based Hilbert R-tree improves bulk-loading performance by applying the inversed-cell method and exploiting parallelism for packing the R-tree structure. Our experimental results show that the proposed scheme is up to 45 times faster than traditional CPU-based bulk-loading schemes.
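
For orientation, the sequential baseline that such a scheme accelerates can be sketched as follows: map each point to its position on the Hilbert curve, sort by that value, and pack consecutive points into leaves with bounding rectangles. This is a plain CPU sketch of Hilbert-order packing under simplifying assumptions (integer grid coordinates, leaf level only); it does not reproduce the paper's GPU parallelization or its inversed-cell method, and the names `hilbert_index` and `pack_leaves` are illustrative.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) on an n-by-n grid (n a power of two) to its Hilbert curve position."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

def pack_leaves(points, n, capacity):
    """Sort points by Hilbert value and pack them into leaves of at most `capacity` points."""
    ordered = sorted(points, key=lambda p: hilbert_index(n, p[0], p[1]))
    leaves = []
    for i in range(0, len(ordered), capacity):
        group = ordered[i:i + capacity]
        xs = [p[0] for p in group]
        ys = [p[1] for p in group]
        mbr = (min(xs), min(ys), max(xs), max(ys))  # minimum bounding rectangle
        leaves.append((mbr, group))
    return leaves

points = [(x, y) for x in range(4) for y in range(4)]
leaves = pack_leaves(points, n=4, capacity=4)
```

Each point's Hilbert value is computed independently of the others, which is what makes the mapping step a natural candidate for parallel execution.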

Reverse Skyline Query Processing in Metric Spaces (거리공간에서의 리버스 스카이라인 질의 처리)

  • Lim, Jong-Tae;Park, Yong-Hun;Seo, Dong-Min;Lee, Jin-Ju;Jang, Soo-Min;Yoo, Jae-Soo
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.7
    • /
    • pp.809-813
    • /
    • 2010
  • Many studies on reverse skyline query processing have been done for company-oriented services. Existing reverse skyline methods are based on the dynamic skyline. There is no reverse skyline query processing algorithm based on metric spaces for location-based services. In this paper, we propose a reverse skyline query processing scheme that applies to a general skyline and considers metric spaces. The proposed method processes reverse skyline queries in metric spaces using an existing spatial indexing scheme and considers both monochromatic and bichromatic environments. To show the superiority of the proposed scheme, we compare it with the basic skyline query processing scheme through performance evaluation. As a result, the proposed method performs about 5000 times better than the conventional method.
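
As background for the dynamic-skyline-based definition this abstract contrasts itself with, a brute-force monochromatic reverse skyline can be sketched as below: a point `p` is in the reverse skyline of a query `q` when no other point is at least as close to `p` as `q` in every coordinate and strictly closer in one. This is a sketch of that coordinate-wise definition only, not the paper's metric-space scheme or its use of a spatial index; the function names and the sample points are assumptions.

```python
def dominates(r, q, p):
    """True if r dynamically dominates q with respect to p (per-coordinate distances to p)."""
    dr = [abs(a - b) for a, b in zip(r, p)]
    dq = [abs(a - b) for a, b in zip(q, p)]
    return all(x <= y for x, y in zip(dr, dq)) and any(x < y for x, y in zip(dr, dq))

def reverse_skyline(points, q):
    """Points p for which q belongs to the dynamic skyline of p (brute force)."""
    result = []
    for p in points:
        others = [r for r in points if r != p]
        if not any(dominates(r, q, p) for r in others):
            result.append(p)
    return result

# Hypothetical data set and query point.
points = [(1, 1), (4, 4), (6, 6)]
result = reverse_skyline(points, (5, 5))  # [(4, 4), (6, 6)]
```

The point `(1, 1)` is excluded because `(4, 4)` lies closer to it than the query does in both coordinates.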

Physical Database Design for DFT-Based Multidimensional Indexes in Time-Series Databases (시계열 데이터베이스에서 DFT-기반 다차원 인덱스를 위한 물리적 데이터베이스 설계)

  • Kim, Sang-Wook;Kim, Jin-Ho;Han, Byung-Il
    • Journal of Korea Multimedia Society
    • /
    • v.7 no.11
    • /
    • pp.1505-1514
    • /
    • 2004
  • Sequence matching in time-series databases is an operation that finds the data sequences whose changing patterns are similar to that of a query sequence. Typically, sequence matching employs a multi-dimensional index for efficient processing. To alleviate the dimensionality curse of the multi-dimensional index in high-dimensional cases, previous methods for sequence matching apply the Discrete Fourier Transform (DFT) to data sequences and take only the first two or three DFT coefficients as the organizing attributes of the multi-dimensional index. This paper first points out the problems with such simple methods that take the first two or three coefficients, and then proposes a novel solution for constructing the optimal multi-dimensional index. The proposed method analyzes the characteristics of a target database and, based on the analysis, identifies the organizing attributes with the best discrimination power. It also determines the optimal number of organizing attributes for efficient sequence matching by using a cost model. To show the effectiveness of the proposed method, we perform a series of experiments. The results show that the proposed method significantly outperforms the previous ones.
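
The baseline described in this abstract, keeping only the first few DFT coefficients as index attributes, can be sketched as follows. With an orthonormal DFT, the distance between truncated coefficient vectors never exceeds the distance between the original sequences, so filtering on the features does not miss true matches. This is a minimal sketch of that baseline, not the paper's attribute-selection method or cost model; `dft_features`, `euclidean`, and the sample sequences are assumptions for illustration.

```python
import cmath
import math

def dft_features(seq, k):
    """First k coefficients of the orthonormal DFT of seq."""
    n = len(seq)
    return [
        sum(v * cmath.exp(-2j * math.pi * f * t / n) for t, v in enumerate(seq)) / math.sqrt(n)
        for f in range(k)
    ]

def euclidean(a, b):
    """Euclidean distance between two equal-length real or complex vectors."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical data and query sequences of equal length.
data = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0, 7.0]
query = [2.0, 2.0, 3.0, 4.0, 5.0, 5.0, 6.0, 8.0]

true_dist = euclidean(data, query)
feature_dist = euclidean(dft_features(data, 3), dft_features(query, 3))
full_dist = euclidean(dft_features(data, len(data)), dft_features(query, len(query)))
```

How tightly `feature_dist` approximates `true_dist` depends on which coefficients are kept, which is the question of discrimination power that the abstract raises.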

Automatic Construction of Reduced Dimensional Cluster-based Keyword Association Networks using LSI (LSI를 이용한 차원 축소 클러스터 기반 키워드 연관망 자동 구축 기법)

  • Yoo, Han-mook;Kim, Han-joon;Chang, Jae-young
    • Journal of KIISE
    • /
    • v.44 no.11
    • /
    • pp.1236-1243
    • /
    • 2017
  • In this paper, we propose a novel way of producing keyword networks, named LSI-based ClusterTextRank, which extracts significant key words from a set of clusters with a mutual information metric, and constructs an association network using latent semantic indexing (LSI). The proposed method reduces the dimension of documents through LSI, decomposes documents into multiple clusters through k-means clustering, and expresses the words within each cluster as a maximal spanning tree graph. The significant key words are identified by evaluating their mutual information within clusters. Then, the method calculates the similarities between the extracted key words using the term-concept matrix, and the results are represented as a keyword association network. To evaluate the performance of the proposed method, we used travel-related blog data and showed that the proposed method outperforms the existing TextRank algorithm by about 14% in terms of accuracy.

A Study on the Retrieval Effectiveness of KoreaMed using MeSH Search Filter and Word-Proximity Search (검색용 MeSH 필터와 단어인접탐색 기법을 활용한 KoreaMed 검색 효율성 향상 연구)

  • Jeong, So-Na;Jeong, Ji-Na
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.18 no.5
    • /
    • pp.596-607
    • /
    • 2017
  • This study examined a method for adding terms related to "stomach neoplasms" as search filters to the Medical Subject Headings (MeSH), as well as a method for improving search efficiency through a word-proximity search that measures the distance between co-occurring terms. The titles of 8,625 articles published between 2007 and 2016 with the major topic term "stomach neoplasms" were downloaded from PubMed. The vocabulary to be added to the MeSH for search was analyzed. The search efficiency was verified with 277 articles that had "Stomach Neoplasms" indexed as MEDLINE MeSH in KoreaMed. As a result, 973 terms were selected as the candidate vocabulary. "Gastric Cancer" (2,780 appearances) was the most frequent term, and 7,376 compound words (88.51%) combined the histological terms of "stomach" and "neoplasm", such as "gastric adenocarcinoma" and "gastric MALT lymphoma". A total of 5,234 compound words (70.95%), in which the co-occurring distance was two words, were found. The matching rate through the MEDLINE MeSH and KoreaMed MeSH Indexer was 209 articles (75.5%). The search efficiency improved to 263 articles (94.9%) when the search filters were added, and to 268 articles (96.7%) when the 13 word-proximity search technique of the co-occurring terms was applied. This study showed that using a thesaurus to improve search efficiency in a natural language search can maintain the advantages of a controlled vocabulary. Search accuracy can be improved by using the word-proximity search instead of a Boolean search.
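
The difference between a Boolean match and a word-proximity match can be made concrete with a small sketch. It assumes a simple tokenizer and counts distance in word positions; the function name `within_proximity` and the sample title are hypothetical and are not taken from the study's implementation.

```python
import re

def within_proximity(text, term_a, term_b, max_distance):
    """True if term_a and term_b occur within max_distance word positions of each other."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == term_a]
    pos_b = [i for i, w in enumerate(words) if w == term_b]
    return any(abs(i - j) <= max_distance for i in pos_a for j in pos_b)

# Hypothetical article title.
title = "Risk factors for gastric MALT lymphoma in Korean patients"

near = within_proximity(title, "gastric", "lymphoma", 2)      # True: two positions apart
adjacent = within_proximity(title, "gastric", "lymphoma", 1)  # False: not adjacent
```

A Boolean AND over the same two terms would match this title at any distance, so the proximity threshold is what narrows the result.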

Cloaking Method supporting K-anonymity and L-diversity for Privacy Protection in Location-Based Services (위치기반 서비스에서 개인 정보 보호를 위한 K-anonymity 및 L-diversity를 지원하는 Cloaking 기법)

  • Kim, Ji-Hee;Lee, Ah-Reum;Kim, Yong-Ki;Um, Jung-Ho;Chang, Jae-Woo
    • Journal of Korea Spatial Information System Society
    • /
    • v.10 no.4
    • /
    • pp.1-10
    • /
    • 2008
  • In the wireless Internet, the location information of the user is one of the important resources for many applications. One of these applications is Location-Based Services (LBSs), which are becoming popular. Because users in an LBS system issue location-based queries to LBS servers by sending their exact location, the location information of the users can be misused by adversaries. In this regard, there must be a mechanism that protects the privacy of the users. In this paper, we propose a cloaking method that considers both K-anonymity and L-diversity. Our cloaking method creates a minimum cloaking region by finding L buildings (L-diversity) and then finding K users (K-anonymity). To support this, we use an R*-tree based index structure and filtering methods designed for the minimum cloaking region. Finally, we show through a performance analysis that our method outperforms the existing grid-based cloaking method.
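
The two conditions a cloaking region must satisfy can be sketched with a naive region-growing loop: enlarge a square around the user until it covers at least L buildings and at least K users. This is only an illustration of the K-anonymity and L-diversity constraints under stated assumptions; it uses a linear scan rather than an R*-tree, does not guarantee a minimum region, and the function `cloak`, its parameters, and the sample coordinates are hypothetical.

```python
def cloak(center, users, buildings, k, l, step=1.0, max_half=1000.0):
    """Grow a square around center until it holds at least l buildings and k users."""
    cx, cy = center
    half = step
    while half <= max_half:
        def inside(p):
            return abs(p[0] - cx) <= half and abs(p[1] - cy) <= half

        n_buildings = sum(1 for b in buildings if inside(b))
        n_users = sum(1 for u in users if inside(u))
        if n_buildings >= l and n_users >= k:
            return (cx - half, cy - half, cx + half, cy + half)
        half += step
    return None  # no region within max_half satisfies both constraints

# Hypothetical user and building positions.
users = [(0, 0), (1, 1), (3, 0), (5, 5)]
buildings = [(0, 1), (2, 2), (4, 0)]
region = cloak((0, 0), users, buildings, k=3, l=2)
```

The server then receives only `region` instead of the exact location, so the requester cannot be distinguished from the other users inside it.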
