• Title/Summary/Keyword: 고차원 (high-dimensional)


PdR-Tree : An Efficient Indexing Technique for the improvement of search performance in High-Dimensional Data (PdR-트리 : 고차원 데이터의 검색 성능 향상을 위한 효율적인 인덱스 기법)

  • Joh, Beom-Seok;Park, Young-Bae
    • The KIPS Transactions:PartD / v.8D no.2 / pp.145-153 / 2001
  • The Pyramid-Technique maps n-dimensional data to one-dimensional values and indexes them in a B-tree; by reducing search time complexity, it also mitigates the "curse of dimensionality" that arises when hypercube range queries are processed in an n-dimensional data space. The Spherical Pyramid-Technique adopts the Pyramid-Technique's space-partitioning strategy but uses spherical range queries, which improves search performance and makes it suitable for similarity search. However, the search performance of both techniques degrades significantly for data sizes greater than one million and dimensions greater than sixteen. In this paper, we propose a new index structure, the PdR-Tree, to improve search performance for high-dimensional data such as multimedia data. Test results on simulated data as well as real data show that the PdR-Tree outperforms both the Pyramid-Technique and the Spherical Pyramid-Technique in search performance.
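
The mapping from n-dimensional points to one-dimensional keys described in this abstract can be sketched in Python. The formula follows the commonly described Pyramid-Technique mapping and assumes points normalized to the unit hypercube [0, 1]^d. The names `pyramid_value`, `build_index`, and `entries_in_pyramid` are illustrative, the sorted list only stands in for the B-tree, and the sketch does not cover the PdR-Tree itself.

```python
from bisect import bisect_left, bisect_right


def pyramid_value(point):
    """Map a point in the unit hypercube [0, 1]^d to a one-dimensional key."""
    d = len(point)
    # Dimension in which the point lies farthest from the centre 0.5.
    j_max = max(range(d), key=lambda j: abs(0.5 - point[j]))
    # Pyramids 0..d-1 sit on the low side of each axis, d..2d-1 on the high side.
    pyramid = j_max if point[j_max] < 0.5 else j_max + d
    # Height inside the pyramid: distance from the centre, in [0, 0.5].
    height = abs(0.5 - point[j_max])
    return pyramid + height


def build_index(points):
    """Sorted (key, point) pairs; a stand-in for the B-tree over pyramid values."""
    return sorted((pyramid_value(p), tuple(p)) for p in points)


def entries_in_pyramid(index, pyramid):
    """All entries whose key lies in [pyramid, pyramid + 0.5], via binary search."""
    keys = [k for k, _ in index]
    lo = bisect_left(keys, pyramid)
    hi = bisect_right(keys, pyramid + 0.5)
    return index[lo:hi]


index = build_index([(0.1, 0.6), (0.5, 0.9), (0.8, 0.5)])
```

Because each key is the pyramid number plus a height of at most 0.5, keys from different pyramids never overlap, so a query on one pyramid becomes a one-dimensional interval lookup.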


Design of an Efficient Parallel High-Dimensional Index Structure (효율적인 병렬 고차원 색인구조 설계)

  • Park, Chun-Seo;Song, Seok-Il;Sin, Jae-Ryong;Yu, Jae-Su
    • Journal of KIISE:Databases / v.29 no.1 / pp.58-71 / 2002
  • Generally, multi-dimensional data such as image and spatial data require a large amount of storage space, and there is a limit to how much of such data a single workstation can store and manage. Managing the data in a parallel computing environment, an area of active research, can greatly improve performance. In this paper, we propose a parallel high-dimensional index structure that exploits the parallelism of the parallel computing environment. The proposed index structure uses an nP(processor)-n$\times$mD(disk) architecture, which is a hybrid of the nP-nD and 1P-nD architectures. Its node structure increases fan-out and reduces the height of the index tree. In addition, a range search algorithm that maximizes I/O parallelism is devised and applied to k-nearest neighbor queries. Various experiments show that the proposed method outperforms other parallel index structures.

A Scalable Index for Content-based Retrieval of Large Scale Multimedia Data (대용량 멀티미디어 데이터의 내용 기반 검색을 위한 고확장 지원 색인 기법)

  • Choi, Hyun-Hwa;Lee, Mi-Young;Lee, Kyu-Chul
    • Proceedings of the Korea Contents Association Conference / 2009.05a / pp.726-730 / 2009
  • The proliferation of the web and digital photography has drastically increased the volume of multimedia data and created a need for high-quality internet services based on video, such as user-generated content (UGC). Keyword-based search over large-scale image and video collections is too expensive and requires much manual intervention. Web search engines may therefore need to provide content-based retrieval of multimedia data for search accuracy and customer satisfaction. In this paper, we propose a novel distributed index structure based on multiple-length signature files that are chosen according to the data distribution. In addition, we describe how our scalable index technique can be used to find nearest neighbors in cluster environments.


Performance Enhancement of a DVA-tree by the Independent Vector Approximation (독립적인 벡터 근사에 의한 분산 벡터 근사 트리의 성능 강화)

  • Choi, Hyun-Hwa;Lee, Kyu-Chul
    • The KIPS Transactions:PartD / v.19D no.2 / pp.151-160 / 2012
  • Most distributed high-dimensional indexing structures provide reasonable search performance, especially when the dataset is uniformly distributed. However, when the dataset is clustered or skewed, search performance gradually degrades compared with the uniformly distributed case. We propose a method for improving the k-nearest neighbor search performance of the distributed vector approximation-tree (DVA-tree) on strongly clustered or skewed datasets. The basic idea is to compute the volumes of the leaf nodes in the top-tree of a distributed vector approximation-tree and to assign a different number of bits to each of them, so that the vector approximation retains its discriminating power. In other words, more bits are assigned to high-density clusters. We conducted experiments comparing search performance with the distributed hybrid spill-tree and the distributed vector approximation-tree on synthetic and real data sets. The experimental results show that the proposed scheme performs consistently and significantly improves on the distributed vector approximation-tree for strongly clustered or skewed datasets.
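
The idea of giving denser leaf cells more approximation bits can be illustrated with a small Python sketch. The abstract does not state the allocation rule, so the interpolation on log-density in `allocate_bits`, the `min_bits`/`max_bits` parameters, and the uniform quantizer in `approximate` are all hypothetical choices made for illustration, not the paper's method.

```python
import math


def volume(lower, upper):
    """Volume of an axis-aligned cell given its lower and upper corners."""
    return math.prod(u - l for l, u in zip(lower, upper))


def allocate_bits(cells, min_bits=2, max_bits=8):
    """Bits per dimension for each cell, growing with the cell's point density.

    `cells` is a list of (lower, upper, point_count) tuples with positive
    volume and count. Linear interpolation on log-density is an assumed,
    illustrative rule.
    """
    log_density = [math.log(n / volume(lo, hi)) for lo, hi, n in cells]
    d_min, d_max = min(log_density), max(log_density)
    if d_max == d_min:
        # Equal densities: every cell gets the same (minimum) budget.
        return [min_bits] * len(cells)
    span = d_max - d_min
    return [
        min_bits + round((ld - d_min) / span * (max_bits - min_bits))
        for ld in log_density
    ]


def approximate(vector, lower, upper, bits):
    """Quantize each coordinate into one of 2**bits equal-width slices of the cell."""
    slices = 1 << bits
    codes = []
    for x, lo, hi in zip(vector, lower, upper):
        idx = int((x - lo) / (hi - lo) * slices)
        codes.append(min(max(idx, 0), slices - 1))  # clamp points on the border
    return codes


# Two cells holding 100 points each; the second is 100 times smaller in volume.
cells = [((0.0, 0.0), (1.0, 1.0), 100), ((0.0, 0.0), (0.1, 0.1), 100)]
bits = allocate_bits(cells)
```

With more bits per dimension, the slices inside a dense cell are narrower, so nearby vectors are less likely to share the same approximation.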

An Efficient Bulk Loading for High Dimensional Index Structures (고차원 색인 구조를 위한 효율적인 벌크 로딩)

  • Bok, Kyoung-Soo;Lee, Seok-Hee;Cho, Ki-Hyung;Yoo, Jae-Soo
    • The Transactions of the Korea Information Processing Society / v.7 no.8 / pp.2327-2340 / 2000
  • Existing bulk loading algorithms for multi-dimensional index structures struggle to achieve both short index construction time and good retrieval performance. In this paper, we propose an efficient bulk loading algorithm that overcomes this problem when constructing high-dimensional index structures for large data sets. Although several bulk loading algorithms have been proposed for this purpose, none of them improves both construction time and search performance. To improve construction time, we do not sort the whole data set; instead, we use a bisection algorithm that divides the whole data set or a subset into two partitions according to a specific pivot value. We also improve search performance by selecting split positions according to the distribution properties of the data set. Through various experiments, we show that the proposed algorithm is superior to existing algorithms in terms of construction time and search performance.
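
The bisection step described in this abstract, splitting a data set around a pivot value instead of sorting it, can be sketched as follows. The abstract does not say how the split dimension and pivot are chosen, so the rule used here (widest dimension, cut at its mean) and the names `bisect_partition`, `bulk_load`, and `leaf_capacity` are assumptions made for illustration.

```python
def bisect_partition(points, dim, pivot):
    """Split points into those below the pivot on `dim` and the rest, without sorting."""
    left = [p for p in points if p[dim] < pivot]
    right = [p for p in points if p[dim] >= pivot]
    return left, right


def bulk_load(points, leaf_capacity=4):
    """Return leaf pages built by recursive bisection of the data set."""
    if len(points) <= leaf_capacity:
        return [points]
    dims = range(len(points[0]))
    # Assumed split rule: pick the dimension with the widest extent, cut at its mean.
    dim = max(dims, key=lambda j: max(p[j] for p in points) - min(p[j] for p in points))
    pivot = sum(p[dim] for p in points) / len(points)
    left, right = bisect_partition(points, dim, pivot)
    if not left or not right:
        # Degenerate split (for example, duplicate points): cut by position instead.
        mid = len(points) // 2
        left, right = points[:mid], points[mid:]
    return bulk_load(left, leaf_capacity) + bulk_load(right, leaf_capacity)


points = [(i / 10, (i * 7 % 10) / 10) for i in range(10)]
leaves = bulk_load(points, leaf_capacity=3)
```

Each level of recursion makes a linear pass over its subset, which is the sense in which a full sort of the data set is avoided; a real bulk loader would also build the internal nodes above these leaves.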


Deriving the Fourier Transforms of Pulse Signals Through the Look-up Tables (찾아보기 목록에 의한 고차 펄스의 푸리에 변환법)

  • 오용선
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.3 / pp.327-338 / 1993
  • This paper proposes a novel method for deriving the Fourier transform pairs of high-order pulses given in a generalized form. First, by modifying the PRS system model, we establish a new model that simplifies the Fourier analysis of the n-th order pulse signal and yields a representative relationship. Next, we present the Frame Formula, into which the parameters obtained from the table look-up procedure are substituted. Each look-up table contains all the parameters needed to obtain the Fourier transform of the corresponding pulse of any order. In terms of the amount of calculation and the complexity of the procedures required to derive the transforms of pulse signals, analytically or numerically, this method is more compact and time-saving than conventional methods. The method shows its greatest merit when the pulse is very narrow or, equivalently, of high order.


A Low-Complexity Turbo coded BICM-ID System (Turbo coded BICM-ID의 복잡도 개선 기법)

  • Kang, Donghoon;Lee, Yongwook;Oh, Wangrok
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.8 / pp.21-27 / 2013
  • In this paper, we propose a low-complexity Turbo coded BICM-ID (bit-interleaved coded modulation with iterative decoding) system. A Turbo code is a powerful error-correcting code with BER (bit error rate) performance very close to the Shannon limit. To increase the spectral efficiency of the Turbo code, a coded modulation scheme that combines the Turbo code with high-order modulation is used. The BER performance of Turbo-BICM can be improved by Turbo-BICM-ID, which uses an iterative demodulation and decoding algorithm. However, compared with Turbo-BICM, Turbo-BICM-ID has higher decoding complexity because information is exchanged between the decoder and the demodulator. To reduce the decoding complexity of Turbo-BICM-ID, we propose a low-complexity Turbo-BICM-ID system. Compared with conventional Turbo-BICM-ID, the proposed scheme not only shows similar BER performance but also reduces the decoding complexity.

Feature Extraction and Classification of High Dimensional Biomedical Spectral Data (고차원을 갖는 생체 스펙트럼 데이터의 특징추출 및 분류기법)

  • Cho, Jae-Hoon;Park, Jin-Il;Lee, Dae-Jong;Chun, Myung-Geun
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.3 / pp.297-303 / 2009
  • In this paper, we propose a biomedical spectral pattern classification technique that uses a fusion scheme based on SpPCA and MLP in an extended feature space. Conventional PCA for dimension reduction cannot find an optimal transformation matrix when the input data are nonlinear. To overcome this drawback, we extract features with the SpPCA technique in the extended space, which uses local patterns rather than whole patterns. In the classification step, an individual MLP-based classifier calculates the similarity to each class for the local features. Finally, the biomedical spectral patterns are classified by a fusion scheme that effectively combines the individual results. In simulations conducted to verify its effectiveness, the proposed method achieved better classification results than conventional methods.

A new cluster validity index based on connectivity in self-organizing map (자기조직화지도에서 연결강도에 기반한 새로운 군집타당성지수)

  • Kim, Sangmin;Kim, Jaejik
    • The Korean Journal of Applied Statistics / v.33 no.5 / pp.591-601 / 2020
  • The self-organizing map (SOM) is an unsupervised learning method that projects high-dimensional data onto low-dimensional nodes. It can visualize data in two- or three-dimensional space using the nodes, which makes it possible to explore the characteristics of the data. To understand the structure of the data, cluster analysis is often applied to the nodes obtained from the SOM. In cluster analysis, determining the optimal number of clusters is an important issue. To help determine it, various cluster validity indexes have been developed, and they can be applied to clustering outcomes for SOM nodes. However, although the SOM has the advantage of reflecting the topological properties of the original data in the low-dimensional space, these indexes do not take this into account. Thus, we propose a new cluster validity index for SOM, based on connectivity between nodes, that considers the topological properties of the data. The performance of the proposed index is evaluated through simulations and compared with various existing cluster validity indexes.

Design and Implementation of a CIR-Tree Manager for Efficient Image Retrieval on MiDAS-III (MiDAS-III에서 효율적인 이미지 검색을 위한 CIR-트리 관리기의 설계 및 구현)

  • Song, Seok-Il;Lee, Hee-Jong;Lee, Seok-Hee;Yoo, Jae-Soo;Cho, Ki-Hyung;Yoo, Kwan-Hee
    • Journal of KIISE:Computing Practices and Letters / v.7 no.5 / pp.392-401 / 2001
  • Nowadays, the amount of image data is increasing explosively. To manage large amounts of image data efficiently, high-dimensional index structures are necessary. However, to our knowledge, no existing DBMS supports high-dimensional index structures as access methods. In this paper, we design and implement the CIR-Tree as an access method for retrieving image data effectively on MiDAS-III, the storage subsystem of BADA-III. In the performance evaluation, the implemented CIR-Tree manager shows much better retrieval performance than sequential search.
