• Title/Summary/Keyword: Analysis of Query

Trends and Changes of Web Searching Behavior (웹 검색 행태의 추이 및 변화 분석)

  • Park, So-Yeon
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.45 no.1
    • /
    • pp.377-393
    • /
    • 2011
  • This study investigates trends in the internet searching behavior of users of NAVER, a major Korean search portal. In particular, it analyzes trends in query submission behavior, typo-related behavior, multimedia searching behavior, and click behavior. To this end, query logs and click logs of the unified search service were analyzed. The results show that there was little change in the topic and length of queries, the pattern of typos, and multimedia seeking behavior over a one-year period. However, click counts of documents gradually increased over time. The results of this study can be applied to improve the portal's development of internet content and search algorithms.

Structural Analysis and Performance Test of Graph Databases using Relational Data (관계형데이터를 이용한 그래프 데이터베이스의 모델별 구조 분석과 쿼리 성능 비교 연구)

  • Bae, Suk Min;Kim, Jin Hyung;Yoo, Jae Min;Yang, Seong Ryul;Jung, Jai Jin
    • Journal of Korea Multimedia Society
    • /
    • v.22 no.9
    • /
    • pp.1036-1045
    • /
    • 2019
  • Relational databases have a notion of normalization, in which the model for storing data is standardized according to the organization's business processes or data operations. The graph database, however, is at a relatively early stage of such standardization and allows a high degree of freedom in modeling. Therefore, various models can be created from the same data, depending on the database designer. The essence of the graph database lies in two aspects. First, it allows relationships between objects to be accessed semantically. Second, it makes relationships between entities as important as individual data. These aspects increase the degree of freedom in modeling and provide model developers with a more creative system. This paper introduces different graph models with test data. It compares query performance by measuring the response time of query executions for each graph model, in order to find out how the efficiency of each model can be maximized.

A Study of Command & Control Server through Analysis - DNS query log (명령제어서버 탐색 방법 - DNS 분석 중심으로)

  • Cheon, Yang-Ha
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.8 no.12
    • /
    • pp.1849-1856
    • /
    • 2013
  • A DoS attack, short for Denial of Service attack, is an internet intrusion technique that disrupts service availability for legitimate users. To respond to DDoS attacks, many methods focusing on the attack source, the target, and the intermediate network have been proposed, but no clear solution has emerged. In this paper, we propose preventing malicious activity and detecting DDoS attacks early by detecting and removing the activity of botnets and other malicious code. For this purpose, the proposed method monitors network traffic, especially DNS traffic, that originates from botnets or malicious code.

Retrieval of Identical Clothing Images Based on Non-Static Color Histogram Analysis

  • Choi, Yoo-Joo;Moon, Nam-Mee;Kim, Ku-Jin
    • Journal of Broadcast Engineering
    • /
    • v.14 no.4
    • /
    • pp.397-408
    • /
    • 2009
  • In this paper, we present a non-static color histogram method to retrieve clothing images that are similar to a query clothing image. Given a clothing area, our method automatically extracts major colors by using the octree-based quantization approach[16]. Then, a color palette composed of the major colors is generated. The feature of each clothing image, which can be either a query or a database image, is represented as a color histogram based on its color palette. We define the matching color bins between two possibly different color palettes, and unify the color palettes by merging or deleting some color bins if necessary. The similarity between two histograms is measured by the weighted Euclidean distance between the matching color bins, where the weight is derived from the frequency of each bin. We compare our method with previous histogram matching methods through experiments. Compared to the HSV cumulative histogram-based approach, our method improves the retrieval precision by 13.7% with fewer color bins.
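
The distance measure described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact weight formula, so the code below assumes that the weight of a bin is the mean of its two frequencies; the function name and the dictionary representation are likewise hypothetical. Bins that appear in only one histogram are simply skipped here, whereas the paper merges or deletes bins to unify the palettes.

```python
import math


def weighted_histogram_distance(hist_a, hist_b):
    """Weighted Euclidean distance over the bins shared by two histograms.

    hist_a, hist_b: dicts mapping a color-bin key to a normalized frequency.
    Assumption (not from the paper): the weight of a bin is the mean of
    its frequencies in the two histograms.
    """
    matched = hist_a.keys() & hist_b.keys()  # bins present in both palettes
    total = 0.0
    for key in matched:
        fa, fb = hist_a[key], hist_b[key]
        weight = (fa + fb) / 2.0  # frequency-derived weight (assumed form)
        total += weight * (fa - fb) ** 2
    return math.sqrt(total)


query = {"navy": 0.6, "white": 0.4}
candidate = {"navy": 0.5, "white": 0.3, "red": 0.2}
print(weighted_histogram_distance(query, candidate))
```

A smaller value indicates a closer match; identical histograms yield a distance of zero.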

Standard-based Integration of Heterogeneous Large-scale DNA Microarray Data for Improving Reusability

  • Jung, Yong;Seo, Hwa-Jeong;Park, Yu-Rang;Kim, Ji-Hun;Bien, Sang Jay;Kim, Ju-Han
    • Genomics & Informatics
    • /
    • v.9 no.1
    • /
    • pp.19-27
    • /
    • 2011
  • Gene Expression Omnibus (GEO) holds the largest collection of gene-expression microarray data, which has grown exponentially. Microarray data in GEO have been generated in many different formats and often lack standardized annotation and documentation. It is hard to know whether preprocessing has been applied to a dataset and, if so, in what way. Standard-based integration of heterogeneous data formats and metadata is necessary for comprehensive data query, analysis, and mining. We attempted to integrate the heterogeneous microarray data in GEO based on the Minimum Information About a Microarray Experiment (MIAME) standard. We unified the data fields of the GEO Data table and mapped the attributes of GEO metadata into MIAME elements. We also distinguished non-preprocessed raw datasets from processed ones by using a two-step classification method. Most of the procedures were developed as semi-automated algorithms with some degree of text mining. We localized 2,967 Platforms, 4,867 Series, and 103,590 Samples covering 279 organisms, integrated them into a standard-based relational schema, and developed a comprehensive query interface to extract them. Our tool, GEOQuest, is available at http://www.snubi.org/software/GEOQuest/.

An Efficient Tag Identification Algorithm using Bit Pattern Prediction Method (비트 패턴 예측 기법을 이용한 효율적인 태그 인식 알고리즘)

  • Kim, Young-Back;Kim, Sung-Soo;Chung, Kyung-Ho;Kwon, Kee-Koo;Ahn, Kwang-Seon
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.8 no.5
    • /
    • pp.285-293
    • /
    • 2013
  • Arbitrating tag collisions is essential because multiple tags respond simultaneously on the same frequency to a request from the reader. This procedure is known as anti-collision, and it is a key technology in RFID systems. In this paper, we propose the Bit Pattern Prediction Algorithm (BPPA) for the efficient identification of multiple tags. The BPPA is based on the tree algorithm using time slots and identifies tags quickly and efficiently using an accurate bit pattern prediction method. Through mathematical performance analysis, we proved that the BPPA is an O(n) algorithm by analyzing its worst-case time complexity, and that its performance is improved compared to existing algorithms. Through MATLAB simulation experiments, we verified that the BPPA requires an average of 1.2 queries per tag identification and ensures stable performance regardless of the number of tags.
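
For context, the kind of tree-based anti-collision that this abstract builds on can be sketched with the basic query tree protocol. This is not the BPPA itself, whose bit pattern prediction step is not described in the abstract; it is only a baseline sketch, and the function name and the bit-string tag IDs are assumptions. Tag IDs are assumed to be unique and of equal length.

```python
from collections import deque


def query_tree_identify(tag_ids):
    """Identify tags with a basic query tree; return (identified, queries).

    The reader broadcasts a bit prefix. Every tag whose ID starts with
    that prefix replies. One reply identifies a tag, no reply is an idle
    slot, and several replies collide, so the prefix is extended by 0 and 1.
    """
    pending = deque([""])  # prefixes still to be queried
    identified = []
    queries = 0
    while pending:
        prefix = pending.popleft()
        queries += 1
        responders = [t for t in tag_ids if t.startswith(prefix)]
        if len(responders) == 1:
            identified.append(responders[0])
        elif len(responders) > 1:
            pending.append(prefix + "0")
            pending.append(prefix + "1")
    return identified, queries


tags = ["00", "01", "11"]
found, n_queries = query_tree_identify(tags)
print(found, n_queries, n_queries / len(tags))
```

The ratio of reader queries to identified tags is the quantity the abstract reports for the BPPA; the baseline above needs noticeably more queries per tag because every collision costs an extra round.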

Performance Analysis of Gen-2 Q-Algorithm According to Initial Slot-Count Size (초기 슬롯-카운트 크기에 따른 Gen-2 Q-알고리즘의 성능 분석)

  • Lim, In-Taek
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2010.10a
    • /
    • pp.445-446
    • /
    • 2010
  • In the Gen-2 Q-algorithm, the initial value of $Q_{fp}$, the slot-count parameter, is not defined in the standard. If the initial $Q_{fp}$ is large, the number of empty slots will be large during the initial query round. On the other hand, if the initial $Q_{fp}$ is small, almost all the slots will experience collisions. As a result, performance is expected to decline because the frame size does not converge quickly to the optimal point during the query round. In this paper, we analyze how the performance of the Gen-2 Q-algorithm is affected by the initial slot-count size.
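
The role of $Q_{fp}$ can be illustrated with a minimal sketch of the slot-count update rule as it is commonly described for the Gen-2 Q-algorithm: $Q_{fp}$ is decreased after an empty slot, increased after a collided slot, left unchanged after a successful slot, and kept within the range 0 to 15. The step size `c = 0.3` and the function names are assumptions chosen for illustration, not values taken from the paper.

```python
def update_qfp(qfp, replies, c=0.3):
    """Return the new floating-point slot-count parameter after one slot.

    replies: number of tags that answered in the slot.
    c: adjustment step (assumed value; typically a small constant below 0.5).
    """
    if replies == 0:              # empty slot: frame is too large
        return max(0.0, qfp - c)
    if replies == 1:              # successful slot: keep the frame size
        return qfp
    return min(15.0, qfp + c)     # collision: frame is too small


def slot_count(qfp):
    """Frame size 2**Q, where Q is Qfp rounded to the nearest integer."""
    return 2 ** int(qfp + 0.5)


qfp = 4.0  # an assumed initial value
for replies in [2, 2, 0, 1, 3]:
    qfp = update_qfp(qfp, replies)
print(qfp, slot_count(qfp))
```

Starting far from the value that suits the tag population means many such adjustment steps are spent on empty or collided slots before the frame size settles, which is the effect the paper analyzes.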

An Analysis of Search Log from a Story Database Service and a New Story Search Method based on Story Map (스토리 검색 서비스의 사용자 기록 분석 및 스토리맵에 의한 새로운 스토리 검색 방법)

  • Kim, Myoung-Jun
    • Journal of Digital Contents Society
    • /
    • v.16 no.5
    • /
    • pp.795-803
    • /
    • 2015
  • is a service providing story synopses that match a user's query. This paper analyzes the user log of and shows the distribution of tendencies in user creation in comparison to the stories in the database. We also investigate the log to identify possible improvements to the search method. This paper proposes the concept of a Story Map, in which the query-answer information is projected into spatial coordinates, and a new story search UI based on it. Using the Story Map, users are able to see the entire spatial distribution of the story database so that they can quickly and intuitively find a story on the map.

Implementation of an Efficient Music Retrieval System based on the Analysis of User Query Pattern (사용자 질의 패턴 분석을 통한 효율적인 음악 검색 시스템의 구현)

  • Rho, Seung-min;Hwang, Een-jun
    • The KIPS Transactions:PartA
    • /
    • v.10A no.6
    • /
    • pp.737-748
    • /
    • 2003
  • With the popularity of digital music content, querying and retrieving music efficiently from a database has become essential. In this paper, we propose a Fast Melody Finder (FMF) that can retrieve melodies quickly and efficiently from a music database using frequently queried tunes. This scheme is based on the observation that users tend to memorize and query a small number of melody segments, and that indexing such segments enables fast retrieval. To handle those tunes, FMF transcribes all acoustic and common music notational inputs into a specific string such as UDR and LSR. We have implemented a prototype system and demonstrated its performance through various experiments.
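
The string transcription mentioned in this abstract can be sketched as follows. The abstract does not define UDR and LSR, so the code assumes that UDR encodes pitch contour (Up, Down, Repeat) and LSR encodes duration contour (Longer, Shorter, Repeat); the function name and the numeric inputs are hypothetical.

```python
def contour_string(values, up="U", down="D", same="R"):
    """Encode a sequence as one symbol per step between adjacent values."""
    symbols = []
    for prev, curr in zip(values, values[1:]):
        if curr > prev:
            symbols.append(up)
        elif curr < prev:
            symbols.append(down)
        else:
            symbols.append(same)
    return "".join(symbols)


pitches = [60, 62, 62, 59]        # MIDI note numbers (assumed input form)
durations = [1.0, 2.0, 2.0, 1.0]  # note lengths in beats (assumed input form)

print(contour_string(pitches))                                  # UDR string
print(contour_string(durations, up="L", down="S", same="R"))    # LSR string
```

Because the result is a plain string, frequently queried melody segments can be indexed and matched with ordinary substring search.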

Performance Analysis on Declustering High-Dimensional Data by GRID Partitioning (그리드 분할에 의한 다차원 데이터 디클러스터링 성능 분석)

  • Kim, Hak-Cheol;Kim, Tae-Wan;Li, Ki-Joune
    • The KIPS Transactions:PartD
    • /
    • v.11D no.5
    • /
    • pp.1011-1020
    • /
    • 2004
  • A lot of work has been done to improve the I/O performance of systems that store and manage massive amounts of data by distributing the data across multiple disks and accessing them in parallel. Most of the previous work has focused on an efficient mapping from a grid cell, which is determined by the interval number of each dimension, to a disk number, on the assumption that each dimension is split into disjoint intervals such that the entire data space is GRID-like partitioned. However, it has ignored the effects of the GRID partitioning scheme on declustering performance. In this paper, we enhance the performance of mapping function based declustering algorithms by applying a good GRID partitioning method. For this, we propose an estimation model to count the number of grid cells intersected by a range query and apply the GRID partitioning scheme that minimizes query result size among the possible schemes. While it is common to do binary partitioning for high-dimensional data, we choose fewer dimensions than needed for binary partitioning and split several times along those dimensions so that we can reduce the number of grid cells touched by a query. Several experimental results show that the proposed estimation model gives accuracy within a 0.5% error ratio regardless of query size and dimension. We can also improve the performance of the declustering algorithm based on the mapping function called Kronecker Sequence, which has been known to be the best among the mapping functions for high-dimensional data, by up to 23 times by applying an efficient GRID partitioning scheme.
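
The quantity that the estimation model targets, the number of grid cells intersected by a range query, can be computed exactly for a given partitioning with a short sketch. This is a direct count rather than the paper's estimation model, and the function name and data layout are assumptions. Each dimension is described by its sorted interior split points, intervals are treated as half-open on the right, and the query is a closed range per dimension.

```python
from bisect import bisect_right


def cells_intersected(split_points, query):
    """Count grid cells overlapped by an axis-aligned range query.

    split_points: one sorted list of interior split positions per dimension
                  (an empty list means the dimension is not split).
    query: one (low, high) pair per dimension.
    """
    count = 1
    for splits, (low, high) in zip(split_points, query):
        first = bisect_right(splits, low)   # interval index containing low
        last = bisect_right(splits, high)   # interval index containing high
        count *= last - first + 1           # intervals touched in this dimension
    return count


# Binary partitioning of both dimensions: 4 cells in total.
print(cells_intersected([[0.5], [0.5]], [(0.3, 0.8), (0.3, 0.8)]))
# Same number of cells, but only the first dimension is split (three times).
print(cells_intersected([[0.25, 0.5, 0.75], []], [(0.3, 0.8), (0.3, 0.8)]))
```

Comparing the two calls shows how the choice of split dimensions changes the number of cells a query touches, which is the effect the proposed partitioning scheme exploits.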