• Title/Summary/Keyword: query filtering (질의필터링)


Esper-based Real-time Filtering System (Esper 기반 실시간 필터링 시스템)

  • Park, Sebin;Lee, Sanghun;Moon, Yang-Sae
    • Proceedings of the Korea Information Processing Society Conference / 2016.10a / pp.552-555 / 2016
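  • This paper addresses the problem of filtering data streams. Data streams are generated continuously and are very large, so processing them in real time requires filtering out data that is unnecessary for analysis. However, existing filtering algorithms each work with only one data format, which makes them hard to use in diverse and complex stream environments. To solve this problem, we propose a filtering system that lets the user choose among filtering algorithms according to the stream format. For real-time filtering, we implement the system on Esper, a representative open-source DSMS (data stream management system). We also extend it to a web-based client-server model so that users can use the filtering system anytime, anywhere. The proposed Esper-based real-time filtering system supports both real-time and bulk data streams, and provides query filtering, Bloom filtering, and Bayesian filtering as filtering algorithms. By selectively offering the filtering algorithm suited to the characteristics of each data stream, the implemented system enables users to extract meaningful data more accurately and efficiently.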
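
Of the three algorithms named in this abstract, Bloom filtering is the easiest to illustrate in isolation. The following is a minimal Python sketch, not the paper's Esper-based implementation: the class name `BloomFilter`, the helper `filter_stream`, the event layout with a `"key"` field, and the SHA-256-based hashing are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch readable (not memory-optimal).
        self.bits = bytearray(num_bits)

    def _positions(self, item: str):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))


def filter_stream(events, allowed: BloomFilter):
    """Keep only events whose key may belong to the allowed set (hypothetical event layout)."""
    return [event for event in events if allowed.might_contain(event["key"])]
```

Because a Bloom filter can report false positives, a stream filtered this way may still contain a small fraction of unwanted events, but it never drops an event whose key was added to the filter.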

Efficient Skyline Query Processing Scheme in Mobile P2P Networks (모바일 P2P 네트워크에서 효율적인 스카이라인 질의 처리 기법)

  • Bok, Kyoung-Soo;Park, Sun-Yong;Kim, Dae-Yeon;Lim, Jong-Tae;Shin, Jae-Ryong;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.15 no.7 / pp.30-42 / 2015
  • In this paper, we propose a new skyline query processing scheme that improves query accuracy and reduces communication cost in mobile P2P environments. The proposed scheme consists of three stages: pre-skyline processing, a query transmission range extension policy, and continuous skyline query processing. In pre-skyline processing, a peer selects candidate filtering objects that have the potential to be selected, which reduces the filtering cost when the query is processed. The query transmission range extension policy improves accuracy by extending the query transmission range. In addition, the scheme handles continuous skyline queries by monitoring after the first skyline query has been processed. To show the superiority of the proposed method, we compare it with existing schemes through performance evaluation. The results show that the proposed scheme outperforms the existing schemes.
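
For readers unfamiliar with skyline queries, the dominance test they rest on can be sketched in a few lines of Python. This is a generic, centralized skyline computation, not the mobile P2P scheme proposed in the paper; the function names and the convention that smaller values are better in every dimension are assumptions.

```python
def dominates(a, b):
    """Return True if a is no worse than b in every dimension and strictly better in one.

    Assumes smaller values are preferred in all dimensions.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    """Return the points that no other point dominates, in input order."""
    result = []
    for p in points:
        # Skip p if a point already in the skyline dominates it.
        if any(dominates(q, p) for q in result):
            continue
        # Otherwise p joins the skyline and evicts anything it dominates.
        result = [q for q in result if not dominates(p, q)]
        result.append(p)
    return result


if __name__ == "__main__":
    hotels = [(1, 9), (3, 3), (9, 1), (4, 4), (5, 8)]  # (price, distance)
    print(skyline(hotels))  # prints [(1, 9), (3, 3), (9, 1)]
```

Points such as `(4, 4)` are exactly what a filtering stage tries to discard early, since `(3, 3)` already dominates them.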

Semantic schema data processing using cache mechanism (캐쉬메카니즘을 이용한 시맨틱 스키마 데이터 처리)

  • Kim, Byung-Gon;Oh, Sung-Kyun
    • Journal of the Korea Society of Computer and Information / v.16 no.3 / pp.89-97 / 2011
  • In semantic web information systems, such as ontology-based systems that access distributed information over a network, efficient query processing requires an advanced caching mechanism to reduce query response time. P2P network systems have become an important infrastructure in the web environment. In a P2P network system, reducing the demand for data transformation to the source peer when a query is initiated is an important aspect of efficient query processing. Caching queries and their results is particularly advantageous when a query is refined by adding or removing a query term: many of the answers may already be cached and can be delivered to the user right away. In the web environment, semantic caching, which manages the cache as a collection of semantic regions, has been proposed. In this paper, we propose a semantic caching technique for a clustered environment of peers. In particular, we improve query processing efficiency by using a schema data filtering technique and a schema-similarity cache replacement method.

A Filtering Technique of Streaming XML Data based Postfix Sharing for Partial matching Path Queries (부분매칭 경로질의를 위한 포스트픽스 공유에 기반한 스트리밍 XML 데이타 필터링 기법)

  • Park Seog;Kim Young-Soo
    • Journal of KIISE:Databases / v.33 no.1 / pp.138-149 / 2006
  • With the emergence of sensor networks and ubiquitous computing, there is growing demand for handling continuous, fast data such as streaming data. As work on streaming data began, so did work on managing streaming data in publish-subscribe systems. The recent emergence of XML as a standard for information exchange on the Internet has increased interest in publish-subscribe systems. Existing publish-subscribe systems filter streaming XML data with automata-based schemes, of which YFilter is a very popular one. YFilter exploits commonality among path queries by sharing the common prefixes of the paths so that they are processed at most once, using a top-down approach. However, because partial matching path queries interrupt common prefix sharing and are not evaluated from the root, the throughput of YFilter decreases. We therefore share the common postfixes of the path queries and use a bottom-up approach instead of the top-down approach. We call this filtering technique PoSFilter, and we verify it by comparing its throughput with that of YFilter.
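
The idea of sharing common postfixes and matching bottom-up can be illustrated with a trie built over reversed query paths. The Python sketch below is a simplification made for illustration and is not the PoSFilter algorithm itself: it assumes each partial matching query is a plain list of tag names that must match the tail of an element's root-to-node path, and the names `SuffixTrieNode`, `build_suffix_trie`, and `match_bottom_up` are hypothetical.

```python
class SuffixTrieNode:
    """Trie node; queries that share a postfix share the nodes along that postfix."""

    def __init__(self):
        self.children = {}
        self.query_ids = []


def build_suffix_trie(queries):
    """Insert each query path in reverse, so common postfixes are stored once."""
    root = SuffixTrieNode()
    for query_id, steps in queries.items():
        node = root
        for step in reversed(steps):
            node = node.children.setdefault(step, SuffixTrieNode())
        node.query_ids.append(query_id)
    return root


def match_bottom_up(root, element_path):
    """Walk from the current element up toward the document root, collecting matches."""
    matched = []
    node = root
    for tag in reversed(element_path):
        node = node.children.get(tag)
        if node is None:
            break
        matched.extend(node.query_ids)
    return sorted(matched)


if __name__ == "__main__":
    queries = {"q1": ["b", "c"], "q2": ["a", "b", "c"], "q3": ["x", "c"]}
    trie = build_suffix_trie(queries)
    print(match_bottom_up(trie, ["root", "a", "b", "c"]))  # prints ['q1', 'q2']
```

Here `q1` and `q2` share the trie nodes for the postfix `b/c`, so a single bottom-up walk evaluates both without starting from the document root.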

An Efficient Skyline Computation using Data Filtering in a MapReduce Environment (맵리듀스 환경에서 데이터 필터링을 이용한 효율적인 스카이라인 계산)

  • Kim, Jihyun;Kim, Myung
    • Proceedings of the Korea Information Processing Society Conference / 2016.04a / pp.582-584 / 2016
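  • Skyline computation, which takes the multidimensional nature of data into account, is a query type widely used in decision-support and recommendation systems. As skyline queries have recently proven useful for big data analysis as well, much attention has focused on computing them efficiently in a MapReduce environment. In this work, we introduce an algorithm that applies data filtering to compute the skyline quickly in a single job, unlike existing techniques. The proposed technique is more efficient than existing ones.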

A Keyword-based Filtering Technique of Document-centric XML using NFA Representation (NFA 표현을 사용한 문서-중심적 XML의 키워드 기반 필터링 기법)

  • Lee, Kyoung-Han;Park, Seog
    • Journal of KIISE:Databases / v.33 no.5 / pp.437-452 / 2006
  • In this paper, we propose an extended XPath specification that includes the special matching character '%', as used in the LIKE operation of SQL, to address the difficulty of writing queries that filter element contents well with the previous XPath specification. We also present a novel technique, called Pfilter, for filtering a collection of document-centric XML documents, which can exploit the extended XPath specification. By sharing the common prefix characters of the operands in value-based predicates, Pfilter improves the performance of processing those predicates. We report several performance studies that compare Pfilter with YFilter in terms of efficiency and scalability, using multi-query processing time (MQPT), and report results for inserting, deleting, and processing value-based predicates. In conclusion, our approach provides a core algorithm for evaluating the contains() function of XPath queries in previous XML filtering research, and a foundation for building XML-based distributed information systems.
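
The semantics of the '%' matching character can be shown with a short Python sketch. It only illustrates SQL LIKE-style matching of element content, where '%' stands for zero or more characters; it is not Pfilter's NFA-based evaluation or its prefix sharing, and the function names are assumptions.

```python
import re


def like_to_regex(pattern: str):
    """Compile a LIKE-style pattern in which '%' matches zero or more characters."""
    # Escape each literal piece so regex metacharacters in the pattern stay literal.
    literal_parts = [re.escape(part) for part in pattern.split("%")]
    return re.compile(".*".join(literal_parts), re.DOTALL)


def like_match(pattern: str, text: str) -> bool:
    """Return True if the whole text matches the LIKE-style pattern."""
    return like_to_regex(pattern).fullmatch(text) is not None


if __name__ == "__main__":
    print(like_match("data%", "database"))   # prints True
    print(like_match("%tab%", "database"))   # prints True
    print(like_match("data", "database"))    # prints False
```

A pattern of the form `%keyword%` behaves like the XPath `contains()` function mentioned in the abstract, while `keyword%` and `%keyword` restrict the match to the start or end of the element content.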

An Energy Efficient Query Processing Mechanism using Cache Filtering in Cluster-based Wireless Sensor Networks (클러스터 기반 WSN에서 캐시 필터링을 이용한 에너지 효율적인 질의처리 기법)

  • Lee, Kwang-Won;Hwang, Yoon-Cheol;Oh, Ryum-Duck
    • Journal of the Korea Society of Computer and Information / v.15 no.8 / pp.149-156 / 2010
  • With the development of USN technology, the sensor nodes used in sensor networks have gained the fast data processing and storage capabilities needed to support efficient network configuration. In addition, sensor network construction has moved from tree-based structures to clusters. However, query processing designed for the existing tree structure can be inefficient in a cluster-based network. In this paper, we propose an energy-efficient query processing mechanism that uses filtering through data attribute classification in cluster-based sensor networks. The proposed mechanism exploits the advantages of the cluster-based network to reduce the energy consumed by query processing and to provide more intelligent query dissemination. We demonstrate its energy efficiency using MATLAB.

Pre-Filtering based Post-Load Shedding Method for Improving Spatial Queries Accuracy in GeoSensor Environment (GeoSensor 환경에서 공간 질의 정확도 향상을 위한 선-필터링을 이용한 후-부하제한 기법)

  • Kim, Ho;Baek, Sung-Ha;Lee, Dong-Wook;Kim, Gyoung-Bae;Bae, Hae-Young
    • Journal of Korea Spatial Information System Society / v.12 no.1 / pp.18-27 / 2010
  • In a u-GIS environment, the GeoSensor environment requires that dynamic data captured from various sensors be fused with static information about 2D or 3D features. GeoSensors, the core of this environment, are scattered sporadically over a wide area, and their data is collected constantly and in arbitrary volumes. As a result, storage space can be exceeded because memory in a DSMS is limited. Many studies are actively addressing this kind of problem. There are typically three methods: random load shedding, semantic load shedding, and sampling. Random load shedding selects and deletes data at random. Semantic load shedding prioritizes data and deletes lower-priority data first. Sampling uses statistical operations to compute a sampling rate and sheds load accordingly. However, these traditional methods are not highly accurate because they do not consider spatial characteristics. In this paper, we propose 'Pre-Filtering based Post-Load Shedding' to improve the accuracy of spatial queries and to limit load shedding in a DSMS. The method first limits unnecessary load growth in the stream queue with 'Pre-Filtering'. It then performs 'Post-Load Shedding', taking data and spatial status into account to guarantee the accuracy of the result. The proposed method effectively reduces the number of times load shedding is performed and improves the accuracy of spatial queries.
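
The difference between the random and semantic baselines described in this abstract can be sketched as follows. This Python example covers only those two baseline strategies, not the proposed pre-filtering method; the queue layout, the `capacity` parameter, and the `priority` callback are assumptions made for illustration.

```python
import random


def random_shed(queue, capacity, rng):
    """Random load shedding: keep a uniformly random subset of at most `capacity` tuples."""
    if len(queue) <= capacity:
        return list(queue)
    keep = sorted(rng.sample(range(len(queue)), capacity))
    return [queue[i] for i in keep]  # survivors stay in arrival order


def semantic_shed(queue, capacity, priority):
    """Semantic load shedding: drop the lowest-priority tuples first."""
    if len(queue) <= capacity:
        return list(queue)
    # Rank indices by descending priority; sorted() is stable, so ties keep arrival order.
    ranked = sorted(range(len(queue)), key=lambda i: priority(queue[i]), reverse=True)
    keep = sorted(ranked[:capacity])
    return [queue[i] for i in keep]


if __name__ == "__main__":
    stream_queue = [
        {"id": 1, "priority": 2},
        {"id": 2, "priority": 9},
        {"id": 3, "priority": 5},
        {"id": 4, "priority": 1},
    ]
    kept = semantic_shed(stream_queue, 2, lambda t: t["priority"])
    print([t["id"] for t in kept])  # prints [2, 3]
```

Neither baseline looks at where a tuple lies relative to the query region, which is the gap the abstract's spatially aware approach targets.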

A Filtering Method of Trajectory Query for Efficient Process of Combined Query (복합질의의 효율적 수행을 위한 궤적질의 필터링 기법)

  • Ban, Chae-Hoon;Kim, Jong-Min
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.9 / pp.1584-1590 / 2008
  • The combined query, which consists of a region query and a trajectory query, finds the trajectories of moving objects located in a certain region. The trajectory query is a very important factor in query performance because it repeatedly issues point queries to find predecessors, which leads to poor performance because nodes in the index are revisited. This paper proposes an efficient method for the combined query based on the 3-dimensional R-tree, which performs well for region queries. The basic idea is to define the least common search line, which enables a single-path search, and a prediction-based filtering method that avoids revisiting nodes.

Design of Structural Retrieval Scheme Using Element Type in XML Documents (XML 문서에서 엘리먼트 타입을 이용한 구조적 검색 기법의 설계)

  • 김성완;정헌석;이재호;임해철
    • Proceedings of the Korean Information Science Society Conference / 2003.04a / pp.584-586 / 2003
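  • Although much research has been done on XML document retrieval, retrieval that targets purely structural relationships, that is, structural retrieval, has received little attention, or the proposed methods rely on repeated traversal of the XML document tree. In addition, an extra filtering step is needed to exclude elements the user does not want. Moreover, most XML retrieval studies do not consider environments in which XML documents are partially updated or changed, for example by inserting or deleting elements. In this paper, we design a structural retrieval scheme that uses the element type information contained in the user's query to eliminate or minimize traversal of the XML document tree and that needs no filtering step. We also design an index structure that handles dynamic changes such as element deletion and insertion quickly and flexibly, and, based on it, explain by example how the main types of structural retrieval queries are processed.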
