Search | Korea Science

Declustering Spatial Objects by Clustering for Parallel Disks (클러스터를 이용한 공간데이타 디클러스터링)

곽지숙;김학철;이기준
- Proceedings of the Korean Information Science Society Conference / 1999.10a / pp.168-170 / 1999
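Data handled in spatial databases such as geographic information systems are large in volume and must be quickly accessible in response to diverse user queries. Because performance is largely determined by disk access time, techniques for reducing access time are needed. Declustering, which distributes data across multiple disks, can yield an effective performance improvement. An effective declustering method increases parallelism by storing spatial objects that are likely to be accessed together by a given query on different disks. However, assigning spatial objects that could be allocated to a single disk to different disks can instead degrade performance. To satisfy both conditions at once, it is effective to cluster the spatial objects first and then assign them to disks in units of clusters. Previously proposed declustering methods did not take this factor into account. In this paper, we propose a new method that builds fixed-size clusters from the given spatial objects and declusters them cluster by cluster to improve performance efficiently. We also aim to show, through comparative experiments with several previously proposed declustering methods, that the proposed method is the most effective.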
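The cluster-then-decluster idea in this abstract can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm or the disk assignment rule, so both are assumptions here: sorting points by coordinate stands in for a real spatial clustering step, and round-robin assignment stands in for the paper's allocation policy. The function name `cluster_then_decluster` is hypothetical.

```python
def cluster_then_decluster(points, cluster_size, num_disks):
    """Group points into fixed-size clusters, then spread clusters over disks.

    Assumptions (not from the paper): clusters are formed by sorting on
    (x, y) and chunking; clusters are assigned to disks round-robin.
    """
    # Stand-in for spatial clustering: nearby points in sort order share a cluster.
    ordered = sorted(points)
    clusters = [
        ordered[i:i + cluster_size]
        for i in range(0, len(ordered), cluster_size)
    ]

    # Every object in a cluster goes to the same disk; neighbouring
    # clusters go to different disks so they can be read in parallel.
    placement = {}
    for index, cluster in enumerate(clusters):
        disk = index % num_disks
        for point in cluster:
            placement[point] = disk
    return clusters, placement


points = [(0, 0), (1, 0), (5, 5), (6, 5), (9, 9), (10, 9)]
clusters, placement = cluster_then_decluster(points, cluster_size=2, num_disks=2)
```

Objects in the same cluster stay on one disk, while adjacent clusters land on different disks, which is the trade-off the abstract describes.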

An Outlier Cluster Detection Technique for Real-time Network Intrusion Detection Systems (실시간 네트워크 침입탐지 시스템을 위한 아웃라이어 클러스터 검출 기법)

Chang, Jae-Young;Park, Jong-Myoung;Kim, Han-Joon
- Journal of Internet Computing and Services / v.8 no.6 / pp.43-53 / 2007
Intrusion detection systems (IDS) have recently evolved by combining the signature-based detection approach with the anomaly detection approach. Although signature-based IDS tools that utilize machine learning algorithms have been commonly used, they detect only network intrusions with already known patterns. Ideal IDS tools should always keep their signature database up to date. The system needs to generate signatures to detect new possible attacks while monitoring and analyzing incoming network data. In this paper, we propose a new outlier cluster detection algorithm with a density (or influence) function. Our method assumes that, in the context of network intrusion, an outlier is a kind of cluster of similar instances rather than a single object. Through extensive experiments using the KDD Cup 1999 intrusion detection dataset, we show that the proposed method outperforms the conventional outlier detection method using the Euclidean distance function, especially when attacks occur frequently.

Performance Evaluation of Real-Time Transaction Processing in a Shared Disk Cluster (공유 디스크 클러스터에서 실시간 트랜잭션 처리의 성능 평가)

Lee Sangho;Ohn Kyungoh;Cho Haengrae
- Journal of KIISE:Databases / v.32 no.2 / pp.142-150 / 2005
A shared disks (SD) cluster couples multiple computing nodes, and every node shares a common database at the disk level. A great deal of research indicates that the SD cluster is suitable for high-performance transaction processing, but the combination of the SD cluster with real-time processing has not been investigated at all. A real-time transaction has not only the ACID properties of traditional transactions but also time constraints. By adopting cluster technology, real-time services become highly available and can exploit inter-node parallelism. In this paper, we first develop an experiment model of an SD-based real-time database system (SD-RTDBS). We then investigate the feasibility of real-time transaction processing in the SD cluster using the experiment model. We also evaluate the cross effect of real-time transaction processing algorithms and SD cluster algorithms under a wide variety of database workloads.

A Modified Fragmentation Technique for Reducing Network Cost in A Scalable and Highly Available Clustered Database (확장 가능한 고가용 데이터 베이스에서 네트워크 비용을 줄이기 위한 변형된 분할기법)

유병섭;이충호;이재동;배해영
- Proceedings of the Korean Information Science Society Conference / 2002.04b / pp.193-195 / 2002
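Recently, web-based applications such as e-commerce have increasingly required databases that offer high availability, scalability, and fast response times. One solution is to build a shared-nothing cluster system and apply fragmentation and replication policies: data are partitioned by a hash function or by value ranges and distributed across multiple nodes, with master and backup copies placed on different nodes to increase availability. However, existing methods incur a high network cost, because each update query must be sent to both the master and the backup, and all master and backup data must be reorganized during online scaling. This paper therefore proposes a modified fragmentation technique that reduces this network cost. In the proposed technique, the master is stored in the same way as in existing techniques, but the backup is stored on the server that received the query rather than being forwarded over the network to a designated node, which reduces communication cost among the nodes of the cluster. In addition, during online scaling, unlike existing techniques, only backup data that duplicates master data on the same server is moved, which reduces data movement cost and increases overall transaction throughput.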
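The difference in forwarding cost between conventional placement and the modified placement described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the hash function (`zlib.crc32`), the rule that a conventional backup lives on the node after the master, and the fallback used when the receiving server is itself the master are all assumptions, and the function names are hypothetical. The sketch also assumes at least two nodes.

```python
import zlib


def master_node(key, num_nodes):
    # Hash partitioning: a deterministic hash of the key picks the master node.
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def place_conventional(key, receiver, num_nodes):
    """Master by hash; backup on a designated node (assumed: master + 1)."""
    master = master_node(key, num_nodes)
    backup = (master + 1) % num_nodes
    # One network forward for each copy that is not stored on the receiver.
    forwards = sum(1 for node in (master, backup) if node != receiver)
    return master, backup, forwards


def place_modified(key, receiver, num_nodes):
    """Master by hash; backup kept on the server that received the query."""
    master = master_node(key, num_nodes)
    # Assumption: if the receiver is the master, fall back to the next node
    # so that master and backup never share a node.
    backup = receiver if receiver != master else (master + 1) % num_nodes
    forwards = sum(1 for node in (master, backup) if node != receiver)
    return master, backup, forwards


num_nodes = 4
results = [
    (place_conventional(key, receiver, num_nodes),
     place_modified(key, receiver, num_nodes))
    for key in ("order-1", "order-2", "customer-7")
    for receiver in range(num_nodes)
]
```

Under these assumptions the modified scheme never needs more forwards than the conventional one, because the backup copy stays local whenever the receiver is not the master.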

Shared Data Synchronization and Change Notification between A-SMGCS System Node (A-SMGCS 시스템 노드 간 공유 데이터 동기화 및 변경 통지 방법에 관한 연구)

Gang, Ho-Yeong;Lee, Seok-Chan;Sin, Yong-Hak
- 한국항공운항학회:학술대회논문집 / 2015.11a / pp.154-157 / 2015
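An A-SMGCS system consists of multiple nodes, including HMI terminal nodes, external system interface nodes, operational database nodes, and server nodes. For these nodes to cooperate and provide smooth service to operators, a minimal set of shared data must be kept synchronized, and the relevant nodes must be notified automatically when changes occur. In this study, we use ZooKeeper, a coordination middleware, to provide the minimal shared data synchronization and change notification functions needed to operate an A-SMGCS system. We confirmed that shared data can be stored hierarchically in a cluster of multiple ZooKeeper servers and that A-SMGCS system nodes are notified automatically when specific data changes. This capability can be readily applied not only to A-SMGCS systems but also to various other systems that require reliable real-time synchronization of shared data among system nodes.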

Health Diagnosis System of Pet Dog Using ART2 Algorithm (ART2 알고리즘을 이용한 애견 진단 시스템)

Oh, Sei-Woong;Kim, Ji-Hong
- Journal of Digital Contents Society / v.10 no.2 / pp.327-332 / 2009
In this paper, we propose a diagnosis system that predicts a pet's state of health, intended for pet owners who lack technical knowledge of dog diseases. The proposed system infers dog diseases from input symptoms using a database built from 105 kinds of diseases and symptoms. First, diseases are clustered by ART2, a self-learning neural network method. Second, the result values, outputs, and weight values produced by the clustering are stored in the database. Finally, the system diagnoses the state of health by comparing the learned disease information with the input vector of each symptom and the answers to related questions about the diseases. Accurate information on diseases and symptoms is important for predicting a dog's state of health, so the proposed system manages symptoms and diseases efficiently using the database and ART2. We asked veterinary specialists about the efficiency of our system. As a result, we confirmed its potential as an auxiliary diagnosis system for dog diseases.

Discovering Association Rules using Item Clustering on Frequent Pattern Network (빈발 패턴 네트워크에서 아이템 클러스터링을 통한 연관규칙 발견)

Oh, Kyeong-Jin;Jung, Jin-Guk;Ha, In-Ay;Jo, Geun-Sik
- Journal of Intelligence and Information Systems / v.14 no.1 / pp.1-17 / 2008
Data mining is defined as the process of discovering meaningful and useful patterns in large volumes of data. In particular, finding association rules between items in a database of customer transactions has become an important task. Since the Apriori algorithm, several data structures and algorithms have been proposed for storing meaningful information compressed from an original database in order to find frequent itemsets. Although existing methods find all association rules, analyzing them requires considerable effort because there are too many rules. In this paper, we propose a new data structure, called a Frequent Pattern Network (FPN), which represents items as vertices and 2-itemsets as edges of the network. To utilize the FPN, we construct it using item frequencies. We then use a clustering method to group the vertices of the network into clusters so that intracluster similarity is maximized and intercluster similarity is minimized, and we generate association rules based on the clusters. Our experiments show the accuracy of clustering items on the network using confidence, correlation, and edge-weight similarity methods. We also generated association rules using the clusters and compared our method with the traditional one. The results show that confidence similarity had a stronger influence than the others on the frequent pattern network, and that the FPN was flexible with respect to the minimum support value.
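The network structure described in this abstract, with items as vertices and 2-itemsets as edges, can be sketched in a few lines. The details below are assumptions for illustration only: vertex weights are item frequencies, edge weights are co-occurrence counts, and confidence is computed in the usual association-rule sense as count(a, b) / count(a). The function names `build_fpn` and `confidence` are hypothetical, and the clustering step of the paper is not shown.

```python
from collections import Counter
from itertools import combinations


def build_fpn(transactions, min_support=1):
    """Build a frequent pattern network from a list of transactions.

    Returns (vertices, edges): vertices maps item -> frequency, and edges
    maps a sorted item pair -> number of transactions containing both.
    """
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicates within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    # Keep only items and 2-itemsets that meet the minimum support count.
    vertices = {i: c for i, c in item_counts.items() if c >= min_support}
    edges = {p: c for p, c in pair_counts.items() if c >= min_support}
    return vertices, edges


def confidence(vertices, edges, a, b):
    # Confidence of the rule a -> b: how often b appears when a appears.
    pair = tuple(sorted((a, b)))
    return edges.get(pair, 0) / vertices[a]


transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
vertices, edges = build_fpn(transactions, min_support=2)
```

Here the pair (butter, milk) occurs only once and is dropped at `min_support=2`, while the remaining edges can feed a similarity measure such as confidence when grouping vertices into clusters.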

Dense Clustering Index Based Efficient Join Method to Handle Skewed Data in Distributed Environment (분산 환경에서의 클러스터화된 밀집 인덱스 기반 효율적인 불균등 분포 데이터의 조인 기법)

Kim, Jae Hyung;Park, Sanghyun
- Proceedings of the Korea Information Processing Society Conference / 2014.04a / pp.656-659 / 2014
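With the spread of distributed systems triggered by open source, a variety of services that existing commercial systems could not provide are attracting attention. In particular, the emergence of services that handle data at the petabyte scale, beyond terabytes, has exposed problems in open-source distributed systems, and academia and industry are making various attempts to address them. These attempts range from proposing new methodologies to applying methodologies used in existing distributed database management systems (distributed DBMS). In this paper, we propose a methodology that uses a dense index to reduce the search space of join operations on data skewed toward particular key values, thereby mitigating their relatively high time complexity.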
https://doi.org/10.3745/PKIPS.y2014m04a.656

Design and Implementation of a Benchmarking System Based on ArangoDB (ArangoDB기반 벤치마킹 시스템 설계 및 구현)

Choi, Do-Jin;Baek, Yeon-Hee;Lee, So-Min;Kim, Yun-A;Kim, Nam-Young;Choi, Jae-Young;Lee, Hyeon-Byeong;Lim, Jong-Tae;Bok, Kyoung-Soo;Song, Seok-Il;Yoo, Jae-Soo
- The Journal of the Korea Contents Association / v.21 no.9 / pp.198-208 / 2021
ArangoDB is a NoSQL database system that is widely used in many applications for storing large amounts of data. To apply a new NoSQL database system such as ArangoDB to real work environments, we need a benchmarking system that can evaluate its performance. In this paper, we design and implement an ArangoDB-based benchmarking system that measures kernel-level as well as application-level performance. We partially modify YCSB to measure the performance of a NoSQL database system in a cluster environment. We also define three real-world workload types by analyzing existing materials. We demonstrate the feasibility of the proposed system by benchmarking the three workload types. We derive the workloads available in ArangoDB and show that performance at the kernel layer as well as the application layer can be visualized through this benchmarking. Benchmarking with this system is expected to enable applicability and risk reviews in environments that need to transfer data from an existing database engine to ArangoDB.
https://doi.org/10.5392/JKCA.2021.21.09.198

A Content-based Audio Retrieval System Supporting Efficient Expansion of Audio Database (음원 데이터베이스의 효율적 확장을 지원하는 내용 기반 음원 검색 시스템)

Park, Ji Hun;Kang, Hyunchul
- Journal of Digital Contents Society / v.18 no.5 / pp.811-820 / 2017
For content-based audio retrieval, which is one of the main functions of an audio service, techniques for extracting fingerprints from an audio source and storing and indexing them in a database are widely used. However, if the fingerprints of new audio sources are continually inserted into the database, both space efficiency and audio retrieval performance gradually deteriorate. There is therefore a need for techniques that support efficient expansion of the audio database without periodic reorganization of the database, which would increase the system operation cost. In this paper, we design a content-based audio retrieval system that solves this problem by using MapReduce and a NoSQL database in a cluster computing environment, based on Shazam's fingerprinting algorithm, and we evaluate its performance through a detailed set of experiments using real-world audio data.
https://doi.org/10.9728/dcs.2017.18.5.811
