Search | Korea Science

The Performance Bottleneck of Subsequence Matching in Time-Series Databases: Observation, Solution, and Performance Evaluation (시계열 데이타베이스에서 서브시퀀스 매칭의 성능 병목 : 관찰, 해결 방안, 성능 평가)

김상욱
- Journal of KIISE:Databases, v.30 no.4, pp.381-396, 2003
Subsequence matching is an operation that finds, in a time-series database, the subsequences whose changing patterns are similar to a given query sequence. This paper points out the performance bottleneck in subsequence matching and proposes an effective method that significantly improves overall subsequence-matching performance by resolving that bottleneck. First, we analyze the disk access and CPU processing times required by the index searching and post-processing steps through preliminary experiments. Based on the results, we show that the post-processing step is the main performance bottleneck in subsequence matching, and then claim that its optimization is a crucial issue overlooked in previous approaches. To resolve the bottleneck, we propose a simple but quite effective method that performs the post-processing step in the optimal way. By rearranging the order in which candidate subsequences are compared with a query sequence, our method completely eliminates the redundant disk accesses and CPU processing that occur in the post-processing step. We formally prove that our method is optimal and does not incur any false dismissal. We show the effectiveness of our method through extensive experiments. The results show that our method speeds up the post-processing step by 3.91 to 9.42 times on a data set of real-world stock sequences and by 4.97 to 5.61 times on large synthetic data sets. The results also show that our method reduces the share of the post-processing step in overall subsequence matching from about 90% to less than 70%. This implies that our method successfully resolves the performance bottleneck in subsequence matching. As a result, our method provides excellent performance in overall subsequence matching: the experimental results reveal that it is 3.05 to 5.60 times faster than the previous method on a data set of real-world stock sequences and 3.68 to 4.21 times faster on large synthetic data sets.
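The idea of reordering candidates so that no sequence is read twice and no subsequence is compared twice can be illustrated with a small sketch. This is not the paper's algorithm, only a minimal illustration under assumptions: candidates are hypothetical `(seq_id, offset)` pairs returned by an index search, `load_sequence` is a hypothetical stand-in for a disk read, and similarity is plain Euclidean distance within a tolerance `epsilon`.

```python
import math
from itertools import groupby

def post_process(candidates, load_sequence, query, epsilon):
    """Verify candidate subsequences, loading each sequence at most once.

    candidates    -- iterable of (seq_id, offset) pairs (hypothetical format)
    load_sequence -- callable standing in for a disk read of one sequence
    """
    matches = []
    reads = 0
    # Sorting the de-duplicated candidates puts all candidates of the same
    # sequence next to each other and drops repeated (seq_id, offset) pairs.
    ordered = sorted(set(candidates))
    for seq_id, group in groupby(ordered, key=lambda c: c[0]):
        data = load_sequence(seq_id)  # one simulated disk access per sequence
        reads += 1
        for _, offset in group:
            window = data[offset:offset + len(query)]
            # Each distinct subsequence is compared with the query once.
            if len(window) == len(query) and math.dist(window, query) <= epsilon:
                matches.append((seq_id, offset))
    return matches, reads
```

With an unordered candidate list, a naive loop would call `load_sequence` once per candidate; here the number of reads equals the number of distinct sequences among the candidates.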

Finding Pseudo Periods over Data Streams based on Multiple Hash Functions (다중 해시함수 기반 데이터 스트림에서의 아이템 의사 주기 탐사 기법)

Lee, Hak-Joo;Kim, Jae-Wan;Lee, Won-Suk
- Journal of Information Technology Services, v.16 no.1, pp.73-82, 2017
In-memory data stream processing has recently been applied to a variety of tasks such as query processing, OLAP, and data mining (e.g., frequent itemsets, association rules, and clustering). However, finding regular periodic patterns of events in an infinite data stream has received less attention. Most research on finding periods uses autocorrelation functions to detect certain changes in periodic patterns rather than the period itself, and it usually targets time-series databases rather than data streams. Literally, a period is the length of time after which some phenomenon recurs. In real applications, however, a data set evolves with small differences as time elapses; a period of this kind is called a pseudo-period. This paper proposes a new scheme, the FPMH (Finding Periods using Multiple Hash functions) algorithm, which finds a set of such pseudo-periods over a data stream based on multiple hash functions. According to the type of pseudo-period, the paper divides FPMH into three variants: FPMH-E, FPMH-PC, and FPMH-PP. To maximize performance in the data stream environment and to keep the most recent periodic patterns in memory, a decay mechanism is applied to the FPMH algorithms. The FPMH algorithm minimizes memory usage and processing time while maintaining acceptable accuracy.
https://doi.org/10.9716/KITS.2017.16.1.073

A Research on the Development of a GIS-based Real-time Urban Water Management System (GIS기반 실시간 도시용수 관리시스템 구현에 관한 연구)

Kim, Seong-Hoon;Kim, Eui-Myoung;Lim, Yong-Min
- Journal of the Korea Academia-Industrial cooperation Society, v.12 no.11, pp.5290-5299, 2011
The ultimate purpose of this research is to propose a method for improving the efficiency of water supply management. To address this comprehensive problem, the paper pursues two main objectives. One is to develop a series of demand forecasting models, one for each category of urban water use, such as residential, commercial, and industrial water. The other is to suggest a plan for developing and utilizing a GIS-based information system that incorporates the developed models. To this end, candidate field areas were evaluated and chosen, and a suitable sensor and installation point were selected for each category. Sensors, a wireless communication infrastructure, and a field data acquisition and management server were installed, and a wireless communication protocol and a real-time data monitoring system were developed. Next, the urban water facility data and other necessary data were processed into a series of GIS-ready databases. Finally, a GIS-based management system was designed, and a blueprint for its implementation is suggested.
https://doi.org/10.5762/KAIS.2011.12.11.5290

Design of Framework for Implementation of the New Paradigm Map (신 패러다임 맵 구현을 위한 프레임워크 설계)

Kim, Sun-Woo;Yang, Kwang-Ho;Park, Ki-Shik;Park, Ju-Young;Ra, In-Ho
- The Journal of the Korea Contents Association, v.15 no.3, pp.32-39, 2015
In this paper, we propose a futuristic map that uses a variety of advanced ICT-based technologies. Futuristic maps are expected to develop into a new, user-participatory format that expresses results in various forms through the understanding and interpretation of tangible and intangible facts and phenomena in the real world. In the future, the map is expected to develop into a new-paradigm map, produced in real time, that supports the collection, processing, usage, analysis, distribution, and sharing of the information needed for the economy, industry, and everyday life. We describe how real-time personalized content can be provided by digitizing information about real space based on the concepts of maps, databases, and spatial analysis, and we describe the key technologies, which are characterized by the time-series representation of data through the analysis and prediction of macro phenomena in every field of society, the economy, and culture. We also establish the concept of the 'New Paradigm Map' for the future creative economy.
https://doi.org/10.5392/JKCA.2015.15.03.032

Planning of Part Feeder and Design of a Data Base for Part Feeder Planning System (자동 부품 정렬기 응용계획과 전용 DB 설계)

Guk, Geum-Hwan;Park, Yong-Taek
- Journal of the Korean Society for Precision Engineering, v.19 no.7, pp.116-124, 2002
The planning of part feeders and other manufacturing automation equipment is almost always underestimated. Planning ahead for crucial pitfalls allows steps to be taken to minimize their impact, especially if the problems can be discovered in the planning phase rather than on the shop floor. The planning process is an engineering process, namely a series of trade-offs. Effective trade-offs can be made in the shortest amount of time with the help of a computer-aided engineering (CAE) technique. The main components of CAE for part feeders are database systems of fabricated workpiece parts, part feeders, and part feeder components. This study presents a planning process for part feeders. In particular, a systematic analysis of workpiece parts and part feeders is performed for the design of the databases of the CAE system.

Design and Implementation of Rule Discovery Algorithm strongly coupled with Time-series databases (시계열 데이터베이스와 강결합된 규칙발견 알고리즘 설계와 구현)

박인창;김성규
- Proceedings of the Korean Information Science Society Conference, 2001.04b, pp.43-45, 2001
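Mining systems are implemented in very different ways depending on their characteristics, so compatibility and reusability across mining systems are very low. This paper attempts to solve this standardization problem by strongly coupling the mining system with an RDB through a time-series database. Strong coupling with the RDB not only solves the standardization problem but also maximizes performance by applying DBMS technologies to the mining system. In particular, we attempt to improve the performance of the mining system by using the index facility of the DBMS. We point out the shortcomings of existing sequential pattern mining, namely the absence of a notion of time, its transaction-database-based structure, and problems caused by memory limits during algorithm execution. To correct and compensate for these, we extend the concepts of time distance and pattern length and propose correspondingly revised formulas for association rules. We also propose a procedure and an algorithm that, being strongly coupled with the RDB, move away from the existing transaction database structure and can be applied more easily to time-series data.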

An Index-Based Subsequence Matching Algorithm Supporting Normalization Transform in Time-Series Databases (시계열 데이타베이스의 인덱스 보간법을 기반으로 정규화 변환을 지원하는 서브시퀀스 매칭 알고리즘)

노웅기;감상욱;황규영
- Proceedings of the Korean Information Science Society Conference, 2000.04b, pp.152-154, 2000
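This paper proposes a subsequence matching algorithm that supports the normalization transform in time-series databases. The normalization transform is useful for retrieving time-series data whose values follow a similar pattern of relative change, regardless of the absolute Euclidean distance between the series. The proposed algorithm builds an index for each of only a few query sequence lengths and then uses these indexes to search for query sequences of every possible length. We prove that no false dismissal occurs in this process. We call this search technique, which uses indexes built for only a suitably spaced subset of all the cases that would require an index, index interpolation. In experiments with indexes built for five lengths among query sequence lengths of 256 to 512, the search performance of the proposed algorithm was on average 14.6 times better than that of sequential scan when the selectivity was 10^-5.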
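A short sketch may help show why a normalization transform lets sequences with similar relative trends match even when their raw Euclidean distance is large. The abstract does not say which normalization the paper uses; the version below assumes z-normalization (subtract the mean, divide by the population standard deviation), and the function names are hypothetical.

```python
import math
from statistics import mean, pstdev

def normalize(seq):
    """Z-normalize a sequence (an assumed form of the normalization transform)."""
    mu = mean(seq)
    sigma = pstdev(seq)
    if sigma == 0:
        # A constant sequence has no trend; map it to all zeros.
        return [0.0 for _ in seq]
    return [(x - mu) / sigma for x in seq]

def normalized_distance(a, b):
    """Euclidean distance between the normalized forms of two equal-length sequences."""
    return math.dist(normalize(a), normalize(b))
```

For example, `[1, 2, 3, 4]` and `[10, 20, 30, 40]` are far apart in raw Euclidean distance, but they rise in the same relative pattern, so their normalized distance is zero up to floating-point rounding.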

Association Based Similarity Search in Time Series Databases (시퀀스 데이타들 간의 관계성에 기반한 유사 검색 기법)

Kang, Seong-Goo;Lee, Suk-Ho
- Proceedings of the Korean Information Science Society Conference, 2005.11b, pp.52-54, 2005
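Sequence data consist of a series of values with magnitudes, so, unlike ordinary product data, the associations among them are known to be difficult to identify. To address this problem, this paper proposes a technique that retrieves sequences exhibiting an association as similar sequences. To this end, we modify conventional similarity search, in which similarity is determined by Euclidean distance alone, and use as the measure the rate of change of a sequence, which takes into account its relative position and shape; to deal with the problem of high dimensionality, we express the association numerically. We also build the index using a geometric wavelet, a variant of the existing Haar wavelet, and show that, through a correction step, the problem can be transformed so that existing similarity search techniques also apply.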

Virtual Manufacturing for an Automotive Company (II) - Construction and Operation of a Virtual Body Shop (자동차 가상생산 기술 적용 (II) - 차체공장 가상플랜트 구축 및 운영)

Noh, Sang-Do;Hong, Sung-Won;Kim, Duk-Young;Sohn, Chang-Young;Hahn, Hyung-Sang
- IE interfaces, v.14 no.2, pp.127-133, 2001
Virtual manufacturing is a technology that facilitates the effective development and agile production of products by means of computer models representing the physical and logical schema and the behavior of real manufacturing systems. For this technology to be applied successfully, a virtual plant that serves as a well-designed and integrated environment is essential. In this paper, we propose a series of systematic approaches and effective methods for constructing and operating a virtual plant, including 3-D CAD modeling, cell and line simulations, and databases. We developed key technologies for measuring and 3-D CAD modeling of a wide range of equipment, facilities, and building structures. To study the benefits of virtual manufacturing, we constructed a sophisticated virtual plant model of a Korean automotive company's body shop and conducted precise simulations of unit cells, lines, and the whole plant. We obtained savings in time and cost in many manufacturing preparation activities within the new car development process.

Optimizing the Post-Processing Step of Subsequence Matching in Time-Series Databases (시계열 데이터베이스를 위한 서브시퀀스 매칭 후처리 과정의 최적화)

Kim, Sang-Wook;Park, Dae-Hyun;Lee, Heon-Gil;Jung, Byong-Dae;Son, Sung-Yong
- Proceedings of the Korea Information Processing Society Conference, 2001.10a, pp.39-42, 2001
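This paper discusses how to process subsequence matching effectively in time-series databases. We first point out a problem of existing methods that arises in the post-processing step of subsequence matching and then propose an optimal method that solves it. The proposed method inserts information about candidate sequences into a binary tree so that candidate windows belonging to the same sequence, and candidate windows belonging to the same subsequence, are processed consecutively. As a result, redundant work in terms of disk accesses and subsequence comparisons is completely eliminated. To verify the performance improvement of the proposed method, we carried out a performance evaluation on real stock data. According to the experimental results, the proposed method improves overall performance by 55 to 156 times compared with the existing method.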

Search Result 86, Processing Time 0.024 seconds
