• Title/Summary/Keyword: query analysis

Sensitivity Analysis of Decision Tree's Learning Effectiveness in Boolean Query Reformulation (불리언 질의 재구성에서 의사결정나무의 학습 성능 감도 분석)

  • 윤정미;김남호;권영식
    • Journal of the Korean Operations Research and Management Science Society / v.23 no.4 / pp.141-149 / 1998
  • One of the difficulties in using current Boolean-based information retrieval systems is that it is hard for a user, especially a novice, to formulate an effective Boolean query. One solution to this problem is to let the system formulate a query for the user from his or her relevance feedback documents. In this research, an intelligent query reformulation mechanism based on ID3 is proposed, and the sensitivity of its retrieval effectiveness (i.e., recall, precision, and E-measure) to various input settings is analyzed. The parameter varied in the input settings is the number of relevant documents. Experiments conducted on the Medlars test set revealed that the effectiveness of the proposed system is in fact sensitive to the number of initial relevant documents. The cases with two or more initial relevant documents outperformed the case with one initial relevant document with statistical significance. We conclude that formulating an effective query in the proposed system requires at least two relevant documents in its initial input set.
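
The three effectiveness measures named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `retrieval_effectiveness`, the use of van Rijsbergen's E-measure formula, and the default weight `beta=1.0` are assumptions, since the abstract does not state how E was parameterized.

```python
def retrieval_effectiveness(retrieved, relevant, beta=1.0):
    """Return (recall, precision, E-measure) for one query.

    retrieved: iterable of document ids returned by the query
    relevant:  iterable of document ids judged relevant
    beta:      assumed weighting of recall against precision in E
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)

    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0

    if precision == 0.0 and recall == 0.0:
        # No relevant document retrieved: worst possible E value.
        return recall, precision, 1.0

    b2 = beta * beta
    # E = 1 - F_beta (van Rijsbergen); lower values are better.
    e_measure = 1.0 - (1.0 + b2) * precision * recall / (b2 * precision + recall)
    return recall, precision, e_measure
```

For example, a query that retrieves documents {1, 2, 3, 4} when {2, 4, 6} are relevant has two hits, so precision is 2/4 and recall is 2/3.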

A Query Model for Consecutive Analyses of Dynamic Multivariate Graphs (동적 다변량 그래프의 연속적 분석을 위한 질의 모델 설계 및 구현)

  • Bae, Yechan;Ham, Doyoung;Kim, Taeyang;Jeong, Hayjin;Kim, Dongyoon
    • The Journal of Korean Association of Computer Education / v.17 no.6 / pp.103-113 / 2014
  • This study designed and implemented a query model for consecutive analyses of dynamic multivariate graph data. First, the query model consists of two procedures: setting the discriminant function and determining an alteration method. Second, the query model was implemented as a query system that consists of a query panel, a graph visualization panel, and a property panel. A node-link diagram and the force-directed graph drawing algorithm were used to visualize the graph. The results of the queries are presented visually through the graph visualization panel. Finally, this study used worldwide import and export data on small arms to verify the model. The significance of this research is that, by supporting consecutive analyses of dynamic graph data, the model helps overcome the limitations of previous models, which can only perform discrete analysis on dynamic data. This research is expected to contribute to future studies that use dynamic graph models, such as online decision making and complex network analysis.

Grid-based Index Generation and k-nearest-neighbor Join Query-processing Algorithm using MapReduce (맵리듀스를 이용한 그리드 기반 인덱스 생성 및 k-NN 조인 질의 처리 알고리즘)

  • Jang, Miyoung;Chang, Jae Woo
    • Journal of KIISE / v.42 no.11 / pp.1303-1313 / 2015
  • MapReduce provides high levels of system scalability and fault tolerance for large-scale data processing. A MapReduce-based k-nearest-neighbor (k-NN) join algorithm seeks to produce the k nearest neighbors of each point of one dataset from another dataset. The algorithm is considered important in big data analysis. However, the existing k-NN join query-processing algorithm suffers from a high index-construction cost that makes it unsuitable for processing big data. To solve this problem, we propose a new grid-based k-NN join query-processing algorithm. Our algorithm retrieves only the neighboring data of a query cell and sends them to each MapReduce task, making it possible to reduce the data transmission and computation overhead. Our performance analysis shows that our algorithm outperforms the existing scheme by up to seven-fold in terms of query-processing time, while also achieving a high degree of query-result accuracy.
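
The idea of restricting a k-NN join to the data around a query cell can be sketched on a single machine as follows. This is an illustrative approximation, not the authors' MapReduce algorithm: the names `cell_of`, `build_grid` and `knn_join`, the 2-D points, the uniform `cell_size`, and the fixed 3x3 neighborhood are all assumptions. Because only adjacent cells are inspected, the result can miss true neighbors that lie farther away.

```python
import heapq
import math
from collections import defaultdict


def cell_of(point, cell_size):
    """Map a 2-D point to the integer coordinates of its grid cell."""
    return (int(point[0] // cell_size), int(point[1] // cell_size))


def build_grid(points, cell_size):
    """Grid index: cell coordinates -> list of points in that cell."""
    grid = defaultdict(list)
    for p in points:
        grid[cell_of(p, cell_size)].append(p)
    return grid


def knn_join(r_points, s_points, k, cell_size):
    """For each point in r_points, find up to k nearest points of s_points,
    looking only at the query cell and its eight adjacent cells."""
    grid = build_grid(s_points, cell_size)
    result = {}
    for r in r_points:
        cx, cy = cell_of(r, cell_size)
        candidates = [
            s
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            for s in grid.get((cx + dx, cy + dy), [])
        ]
        result[r] = heapq.nsmallest(k, candidates, key=lambda s: math.dist(r, s))
    return result
```

In a MapReduce setting, each query cell together with its neighboring cells would presumably form the input of one task; the loop over `r_points` stands in for that distribution step here.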

User Information Needs Analysis based on Query Log Big Data of the National Archives of Korea (국가기록원 질의로그 빅데이터 기반 이용자 정보요구 유형 분석)

  • Baek, Ji-yeon;Oh, Hyo-Jung
    • Journal of the Korean Society for Information Management / v.36 no.4 / pp.183-205 / 2019
  • Among the various methods for identifying users' information needs, log analysis can realistically reflect users' actual search behavior and capture the overall usage of most users. Based on the large volume of query log data obtained through the portal service of the National Archives of Korea, this study analyzed queries by information type and search result type in order to identify users' information needs. The query log used in the analysis consisted of 1,571,547 queries collected over a total of 141 months from 2007 to December 2018, during which the National Archives of Korea provided search services via the web. Furthermore, based on the analysis results, methods were proposed to improve user search satisfaction. The results of this study could be used in practice to improve and upgrade the National Archives of Korea search service.

Performance Enhancement of Proxy Mobile IPv6 using Binding Query (Binding Query를 활용한 Proxy Mobile IPv6의 성능 향상 기법)

  • Park, Jae-Wan;Kim, Ji-In;Koh, Seok-Joo
    • The Journal of Korean Institute of Communications and Information Sciences / v.36 no.11B / pp.1269-1276 / 2011
  • In Proxy Mobile IPv6 (PMIPv6), data transmission performance may be degraded when the two communicating hosts are located within the same mobile domain, since all data packets must be delivered by way of the Local Mobility Anchor (LMA). In this paper, we propose an extension of PMIPv6 that uses a binding query. In the proposed Query-based PMIPv6 (Q-PMIPv6) scheme, the Mobile Access Gateway (MAG) of the Correspondent Node (CN) sends a binding query to the LMA to obtain the Proxy Care-of-Address of the Mobile Node (MN). After that, the CN and MN can communicate with each other over an optimized data path. For comparison, we performed numerical analysis and ns-2 simulations of the proposed Q-PMIPv6 scheme, the existing PMIPv6, and PMIPv6 Localized Routing (PMIPv6-LR). The results show that the proposed scheme outperforms the existing PMIPv6 and PMIPv6-LR schemes in terms of signaling control and data delivery costs.

Examining Categorical Transition and Query Reformulation Patterns in Image Search Process (이미지 검색 과정에 나타난 질의 전환 및 재구성 패턴에 관한 연구)

  • Chung, Eun-Kyung;Yoon, Jung-Won
    • Journal of the Korean Society for Information Management / v.27 no.2 / pp.37-60 / 2010
  • The purpose of this study is to investigate image search query reformulation patterns in relation to image attribute categories. A total of 592 sessions and 2,445 queries from the Excite Web search engine log data were analyzed using Batley's visual information types and a scheme of query reformulation patterns with two facets and seven sub-facets. The results of this study are organized in two parts: query reformulation and categorical transition. The most dominant query categories are specific and general/nameable, and this tendency persists across search stages. In terms of reformulation patterns, Parallel movement is the most dominant, although there are slight differences depending on the initial or preceding query category. In examining categorical transitions, it was found that 60-80% of search queries were reformulated within the same category of image attributes. These findings may be applied to the practice and implementation of image retrieval systems, in particular to assisting users' query term selection and developing effective thesauri.

Implementation and Evaluation of a Web Ontology Storage based on Relation Analysis of OWL Elements and Query Patterns (OWL 요소와 질의 패턴에 대한 관계 분석에 웹 온톨로지 저장소의 구현 및 평가)

  • Jeong, Dong-Won;Choi, Myoung-Hoi;Jeong, Young-Sik;Han, Sung-Kook
    • Journal of KIISE:Databases / v.35 no.3 / pp.231-242 / 2008
  • W3C has selected OWL as a standard for Web ontology description, which has raised the need for research on storage models that can store OWL ontologies effectively. So far, relational model-based storage systems such as Jena, Sesame, and DLDB have been developed, but several issues remain. In particular, they show inefficient query processing performance. The structural causes of this low query processing performance are as follows. Jena has a simple, non-normalized structure and stores most information in a single table. This decreases performance exponentially, because unnecessary information must be compared when processing simple searches as well as queries that require join operations. The structures of other storage systems (e.g., Sesame) are completely normalized, so they execute many join operations for query processing; many joins are required even to find a single specific class. This paper proposes a storage model that resolves the degradation of query processing performance caused by the non-normalization or complete normalization of existing storage systems. To achieve this goal, we analyze the problems of existing storage models as well as the relations between OWL elements and query patterns. The proposed model, defined from the analysis results, provides an optimally normalized structure that minimizes join operations and unnecessary information comparison. For the query processing performance experiments, LUBM data sets are used, and query patterns are defined considering search targets and their hierarchical relations. In addition, this paper conducts experiments on the correctness and completeness of query results to check for data loss in the proposed model, and describes the results. In the comparative evaluation, our proposal showed better performance than the existing storage models.

Spatio-Temporal Query Processing Over Sensor Networks: Challenges, State Of The Art And Future Directions

  • Jabeen, Farhana;Nawaz, Sarfraz;Tanveer, Sadaf;Iqbal, Majid
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.7 / pp.1756-1776 / 2012
  • Wireless sensor networks (WSNs) are likely to become more prevalent as their cost-effectiveness improves. The spectrum of applications for WSNs spans multiple domains. In environmental sciences, in particular, they are on the way to becoming an essential technology for monitoring the natural environment and the dynamic behavior of transient physical phenomena over space. Existing sensor network query processors (SNQPs) have also demonstrated that in-network processing is an effective and efficient means of interacting with WSNs to perform queries over live data. Inspired by these findings, this paper investigates whether spatio-temporal and historical analysis can be carried out over WSNs using distributed query-processing techniques. The emphasis of this work is on the spatial, temporal and historical aspects of sensed data, which are not adequately addressed in existing SNQPs. This paper surveys novel approaches to storing data and executing spatio-temporal and historical queries. We introduce the challenges and opportunities of research in the field of in-network storage and in-network spatio-temporal query processing, and illustrate the current status of research in this field. We also present new areas where spatio-temporal and historical query processing can be of significant importance.

EMRQ: An Efficient Multi-keyword Range Query Scheme in Smart Grid Auction Market

  • Li, Hongwei;Yang, Yi;Wen, Mi;Luo, Hongwei;Lu, Rongxing
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.11 / pp.3937-3954 / 2014
  • With increasing electricity consumption and the wide application of renewable energy sources, energy auctions have attracted considerable attention because of their economic benefits. Many schemes have been proposed to support energy auctions in the smart grid. However, few of them can achieve range query, ranked search and personalized search. In this paper, we propose an efficient multi-keyword range query (EMRQ) scheme, which supports range query, ranked search and personalized search simultaneously. Based on the homomorphic Paillier cryptosystem, we use two super-increasing sequences to aggregate multidimensional keywords. The first is used to aggregate one buyer's or seller's multidimensional keywords into an aggregated number. The second is used to create a summary number by aggregating the aggregated numbers of all sellers. As a result, the comparison between the keywords of all sellers and those of one buyer can be achieved with only one calculation. Security analysis demonstrates that EMRQ achieves confidentiality of keywords, authentication, data integrity and query privacy. Extensive experiments show that EMRQ is more efficient than the scheme in [3] in terms of computation and communication overhead.
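
The way a super-increasing sequence packs several keyword values into one number, as described in the abstract above, can be sketched in plaintext as follows. This is a generic illustration of the technique only: the abstract does not give the EMRQ construction, the Paillier encryption layer is omitted entirely, and the names `super_increasing`, `aggregate`, `recover` and the parameter `bound` (an exclusive upper limit on each keyword value) are assumptions.

```python
def super_increasing(n, bound):
    """Build n weights such that each weight exceeds the largest possible
    weighted sum of all earlier positions (values lie in 0..bound-1)."""
    seq, max_sum = [], 0
    for _ in range(n):
        weight = max_sum + 1
        seq.append(weight)
        max_sum += weight * (bound - 1)
    return seq


def aggregate(values, seq):
    """Pack one value per dimension into a single integer."""
    return sum(w * v for w, v in zip(seq, values))


def recover(total, seq):
    """Unpack the per-dimension values, starting from the largest weight."""
    values = []
    for w in reversed(seq):
        v, total = divmod(total, w)
        values.append(v)
    return values[::-1]
```

With `bound = 10` and three dimensions the weights are 1, 10 and 100, so the values 3, 7 and 2 aggregate to 273 and can be recovered unambiguously. Under an additively homomorphic scheme such as Paillier, a weighted sum of this kind can in principle be carried inside a single ciphertext, which is consistent with the abstract's single-calculation comparison.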

The Performance Analysis of Nearest Neighbor Query Process using Circular Search Distance (순환검색거리를 이용하는 최대근접 질의처리의 성능분석)

  • Seon, Hwi-Joon;Kim, Won-Ho
    • Journal of the Korea Society of Computer and Information / v.15 no.1 / pp.83-90 / 2010
  • To optimize the processing cost of the nearest neighbor query, the number of searched nodes and the computation time in an index should be minimized. A search distance measure that takes the circular location property of objects into account is required to accurately select the nodes to be searched in the nearest neighbor query. In this paper, we propose a nearest neighbor query processing method that considers the circular location property of objects when the search space consists of a circular domain, and show its performance through experiments. The proposed method uses the circular minimum distance and the circular optimal distance, which are search measures for optimizing the processing cost of the nearest neighbor query.
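
What a distance in a circular domain means can be sketched in one dimension as follows. This is only an illustration of the wrap-around property: the abstract does not define the circular minimum distance or the circular optimal distance, so the names `circular_distance` and `nearest_neighbor`, the `period` parameter, and the brute-force search (no index) are assumptions.

```python
def circular_distance(a, b, period):
    """Shortest distance between two coordinates on a circle of length
    `period`, allowing the path to wrap around the domain boundary."""
    d = abs(a - b) % period
    return min(d, period - d)


def nearest_neighbor(query, points, period):
    """Brute-force nearest neighbor under the circular distance."""
    return min(points, key=lambda p: circular_distance(query, p, period))
```

On a domain of length 360, the coordinates 350 and 10 are 20 apart rather than 340, so a query at 355 is closer to an object at 10 than to one at 300. A measure that ignores this wrap-around would rank those candidates the other way around.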