• Title/Summary/Keyword: Top-k search

Design Blockchain as a Service and Smart Contract with Secure Top-k Search that Improved Accuracy (정확도가 향상된 안전한 Top-k 검색 기반 서비스형 블록체인과 스마트 컨트랙트 설계)

  • Hobin Jang;Ji Young Chun;Ik Rae Jeong;Geontae Noh
    • Journal of Internet Computing and Services / v.24 no.5 / pp.85-96 / 2023
  • With the advance of cloud computing technology, Blockchain as a Service (BaaS) offered by cloud service providers has been used in areas such as e-commerce and finance to manage customer and distribution histories. However, if users' search and purchase histories are used in a BaaS for purposes such as recommendation algorithms and search engine development, the users' search queries are exposed to the company operating the BaaS, which raises privacy issues. Z. Guan et al. ensure unlinkability between a user's search query and the search result using searchable encryption, and select the Top-k results most relevant to the query based on inner product similarity. However, Top-k selection may not be possible when inner product similarities tie, and BaaS over the cloud is not considered. This paper therefore solves the problem in the scheme of Z. Guan et al. by using cosine similarity, which improves the accuracy of search results. Based on this, we design a BaaS with secure Top-k search that has improved accuracy. Furthermore, we design smart contracts that preserve the privacy of users' searches and obtain Top-k search results that are highly relevant to those searches.
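
The tie problem mentioned in this abstract can be illustrated on plaintext vectors. The following is a minimal sketch only: the vectors, the document names, and the helper `top_k` are hypothetical, and the paper's actual scheme works over encrypted data, which is not modeled here.

```python
import math

def inner_product(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return inner_product(a, b) / (norm_a * norm_b)

def top_k(query, docs, k, score):
    """Rank documents by score(query, vector) and keep the best k names."""
    ranked = sorted(docs.items(), key=lambda kv: score(query, kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical keyword vectors (illustrative data, not from the paper).
query = [1, 1, 0]
docs = {
    "d1": [1, 1, 0],  # matches the query keywords exactly
    "d2": [1, 1, 1],  # same inner product as d1, but with an extra keyword
    "d3": [0, 1, 0],
}

# Inner product cannot separate d1 from d2; cosine similarity can.
tie = inner_product(query, docs["d1"]) == inner_product(query, docs["d2"])
best_by_cosine = top_k(query, docs, 1, cosine)
```

Here `tie` is true because both `d1` and `d2` score 2 under the inner product, while cosine similarity penalizes the extra keyword in `d2` and ranks `d1` first.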

A Comparison and Study among Reverse Top-k Query Methods (Reverse Top-k 질의 처리 방법 비교 및 문제점 분석)

  • Ihm, Sun-Young;Park, Young-Ho
    • Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.1162-1164 / 2013
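  • While Top-k query processing retrieves the data a user wants, Reverse Top-k query processing works from the data's point of view: it retrieves the users who are most likely to prefer a given data item, which makes it very important research from a producer's standpoint. This paper introduces Reverse Top-k query processing methods, compares them, and analyzes their problems.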
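
To make the definition concrete, here is a brute-force sketch of a reverse top-k query. It assumes linear preference functions (a weight vector per user) and higher-is-better attributes; the item and user data and the function names are hypothetical, and none of the surveyed methods is reproduced here.

```python
def score(weights, item):
    """Linear preference score of an item for one user's weight vector."""
    return sum(w * v for w, v in zip(weights, item))

def top_k_ids(weights, items, k):
    """Ordinary top-k: the k item ids with the highest score for this user."""
    ranked = sorted(items, key=lambda i: score(weights, items[i]), reverse=True)
    return ranked[:k]

def reverse_top_k(target, users, items, k):
    """Reverse top-k: the users whose own top-k result contains `target`."""
    return sorted(u for u, w in users.items() if target in top_k_ids(w, items, k))

# Hypothetical products with two attributes, and users with different weights.
items = {"p1": [9, 2], "p2": [5, 5], "p3": [2, 9]}
users = {"alice": [0.9, 0.1], "bob": [0.4, 0.6], "carol": [0.1, 0.9]}

fans_of_p1 = reverse_top_k("p1", users, items, 1)
fans_of_p2 = reverse_top_k("p2", users, items, 2)
```

This naive version runs one top-k query per user, which is exactly the cost that dedicated reverse top-k methods try to avoid.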

Effective Keyword Search on Semantic RDF Data (시맨틱 RDF 데이터에 대한 효과적인 키워드 검색)

  • Park, Chang-Sup
    • The Journal of the Korea Contents Association / v.17 no.11 / pp.209-220 / 2017
  • As semantic data is widely used in various applications such as knowledge bases and the Semantic Web, the need for effective search over large amounts of RDF data has been increasing. Previous keyword search methods based on distinct root semantics retrieve only a set of answer trees with different root nodes. Thus, they often return answer trees with similar meanings or low query relevance together, while answer trees with the same root node cannot be retrieved together even if they have different meanings and high query relevance. We propose a new method that finds diverse and relevant answers to a query by permitting duplication of root nodes among them. We present an efficient query processing algorithm that uses path indexes to find top-k answers given the maximum amount of root duplication a set of answer trees may have. We show by experiments on a real dataset that the proposed approach produces effective answer trees that are less redundant in their content nodes and more relevant to the query than those of the previous method.

Improving Diversity of Keyword Search on Graph-structured Data by Controlling Similarity of Content Nodes (콘텐트 노드의 유사성 제어를 통한 그래프 구조 데이터 검색의 다양성 향상)

  • Park, Chang-Sup
    • The Journal of the Korea Contents Association / v.20 no.3 / pp.18-30 / 2020
  • Recently, as graph-structured data is widely used in various fields such as social networks and the Semantic Web, the need for effective and efficient search over large amounts of graph data has been increasing. Previous keyword-based search methods often find results by considering only their relevance to a given query. However, they are likely to produce semantically similar results by selecting answers that have high query relevance but share the same content nodes. To improve the diversity of search results, we propose a top-k search method that finds a set of subtrees that are not only relevant but also diverse in terms of their content nodes by controlling their similarity. We define a criterion for a set of diverse answer trees and design two kinds of diversified top-k search algorithms, based on incremental enumeration and A* heuristic search, respectively. We also suggest an improvement to the A* search algorithm to enhance its performance. We show by experiments on real datasets that the proposed heuristic search method can efficiently find relevant answers with diverse content nodes.

A Study on Top-k Query Processing using List-based Approach (List 기반의 접근법을 사용하는 Top-k 질의 처리 연구)

  • Ihm, Sun-Young;Park, Young-Ho
    • Proceedings of the Korea Information Processing Society Conference / 2011.04a / pp.1249-1252 / 2011
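  • Recently, the amount of data has been increasing rapidly with the development and growing use of the Internet. Users want to obtain the search results they need quickly. In addition, because every user has different preferences, search must be based on the user's query. This paper therefore introduces and analyzes existing studies that use a list-based approach to process top-k queries quickly and efficiently according to the user's query, and identifies their problems.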
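
The abstract does not name the list-based methods it covers. As an illustration of the general idea, the sketch below uses Fagin's Threshold Algorithm, a well-known list-based technique chosen here as an assumption, with a sum as the aggregation function and hypothetical data.

```python
import heapq

def threshold_top_k(scores, k):
    """Threshold Algorithm sketch over per-attribute sorted lists.

    scores: non-empty dict mapping object id -> tuple of attribute scores.
    Returns (top-k list of (id, total), number of rows read per list).
    """
    m = len(next(iter(scores.values())))
    # One list per attribute, sorted by that attribute in descending order.
    sorted_lists = [
        sorted(scores, key=lambda o, j=j: scores[o][j], reverse=True)
        for j in range(m)
    ]
    seen = {}
    for depth in range(len(scores)):
        last = []
        for j in range(m):
            obj = sorted_lists[j][depth]      # sorted access
            last.append(scores[obj][j])
            if obj not in seen:
                seen[obj] = sum(scores[obj])  # random access for the full score
        # No unseen object can score higher than the sum of the last-read values.
        threshold = sum(last)
        best = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        if len(best) >= k and best[-1][1] >= threshold:
            return best, depth + 1
    return best, len(scores)

# Hypothetical objects with two attribute scores each.
scores = {"a": (10, 9), "b": (9, 10), "c": (3, 2), "d": (2, 3), "e": (1, 1)}
result, rows_read = threshold_top_k(scores, 2)
```

In this example the algorithm stops after reading two rows from each list instead of all five, because the two best objects seen so far already meet the threshold.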

Survey on Top-k Related Pair Search Method Using Cosine Similarity (코사인 유사도 기법을 이용한 top-k 관련쌍 검색 방법 조사)

  • Kim, Sungchul;Kim, Jeong-Hwan;Kim, Na-Yeong;Kim, Taehoon;Yu, Hwanjo
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.808-809 / 2017
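  • Similarity search has traditionally been at the core of the database and web search fields, and with the emergence of large-scale data the demand for efficiency as well as search accuracy has grown, so it is still being actively studied in various fields. Among the methodologies for measuring similarity between items, cosine similarity is the most widely used because of its advantages in high-dimensional spaces, and it is applied in various fields such as information retrieval, market basket analysis, and bioinformatics. This paper introduces cosine similarity and presents existing studies that use it from the perspective of association analysis.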
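
For reference, cosine similarity between vectors $u$ and $v$ is $\cos(u, v) = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}$. A brute-force top-k related pair search can be sketched as follows; the sparse item vectors (items described by the transactions they appear in, as in market basket analysis) are hypothetical, and the surveyed methods aim to avoid this exhaustive comparison of all pairs.

```python
import heapq
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def top_k_pairs(vectors, k):
    """Score every unordered pair and keep the k most similar ones."""
    scored = (
        (cosine(vectors[a], vectors[b]), a, b)
        for a, b in combinations(sorted(vectors), 2)
    )
    return heapq.nlargest(k, scored)

# Hypothetical items, each described by the transactions it appears in.
vectors = {
    "bread": {"t1": 1, "t2": 1, "t3": 1},
    "butter": {"t1": 1, "t2": 1},
    "beer": {"t4": 1},
    "chips": {"t3": 1, "t4": 1},
}
pairs = [(a, b) for _, a, b in top_k_pairs(vectors, 2)]
```

The exhaustive version compares all n(n-1)/2 pairs, which is why pruning and indexing matter once the number of items grows.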

A Method for Non-redundant Keyword Search over Graph Data (그래프 데이터에 대한 비-중복적 키워드 검색 방법)

  • Park, Chang-Sup
    • The Journal of the Korea Contents Association / v.16 no.6 / pp.205-214 / 2016
  • As large amounts of graph-structured data are widely used in various applications such as social networks, the Semantic Web, and bioinformatics, keyword-based search over graph data has been receiving a lot of attention. In this paper, we propose an efficient method for keyword search over graph data that finds a set of top-k answers that are relevant as well as non-redundant in structure. We define a non-redundant answer structure for a keyword query and a relevance measure for an answer. We suggest a new indexing scheme on the relevant paths between nodes and keyword terms in the graph, and also propose a query processing algorithm that finds top-k non-redundant answers efficiently by exploiting the pre-calculated indexes. We demonstrate the effectiveness and efficiency of the proposed approach compared to the previous method through an experiment on a real dataset.

Development of a top-K search engine for drug discovery (신약 발견을 위한 top-K 검색 엔진의 개발)

  • Seo, In;Lee, Seungmin;Ahmed, Muhammad Ejaz;Chae, Songyi
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.810-811 / 2017
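  • New drug development is attracting attention as a next-generation strategic industry that creates high added value, but it is a high-risk, super-high-return industry that requires enormous costs for animal experiments and clinical trials. Selecting new drug candidates is therefore very important, and candidates can be selected effectively through top-k query processing that uses drug similarity as the ranking function. In this paper, we developed a search engine that recommends, as candidates, the k compounds with the characteristics the user wants from among the compounds in the ChEMBL database[4].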

An Survey on Top-k Query Processing using Convex Hulls (Convex hull을 사용하는 Top-k 질의처리 방법에 관한 분석)

  • Lee, Ji-Hyeon;Park, Young-Ho
    • Proceedings of the Korea Information Processing Society Conference / 2012.04a / pp.1073-1074 / 2012
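  • As the amount of data increases rapidly with the development and growing use of the Internet, top-k query processing, which searches large volumes of data efficiently, is becoming important. The layer-based method is the best-known top-k query processing method; it uses the values of all attributes of the objects to organize the objects into a list of layers. This paper surveys existing studies that build the layer list using convex hulls and identifies their problems.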
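
The layering idea can be sketched in two dimensions: peel convex hulls off the point set one layer at a time, then answer a linear top-k query from the first k layers only. This is a simplified illustration under assumptions (2D points, a linear scoring function, hypothetical data and function names), not a reproduction of any surveyed method.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices only."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def hull_layers(points):
    """Repeatedly peel the convex hull to build the list of layers."""
    remaining = set(points)
    layers = []
    while remaining:
        hull = convex_hull(list(remaining))
        layers.append(hull)
        remaining -= set(hull)
    return layers

def top_k(layers, weights, k):
    """Linear top-k that inspects only the first k layers."""
    candidates = [p for layer in layers[:k] for p in layer]
    key = lambda p: weights[0] * p[0] + weights[1] * p[1]
    return sorted(candidates, key=key, reverse=True)[:k]

# Hypothetical objects with two attributes each.
points = [(0, 0), (10, 0), (10, 10), (0, 10),
          (2, 2), (8, 2), (8, 8), (2, 8), (5, 5)]
layers = hull_layers(points)
best_two = top_k(layers, (1, 1), 2)
```

Because a linear function is maximized at a hull vertex, the i-th best object always lies within the first i layers, so inner layers never need to be read for small k.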

Odysseus/Parallel-OOSQL: A Parallel Search Engine using the Odysseus DBMS Tightly-Coupled with IR Capability (오디세우스/Parallel-OOSQL: 오디세우스 정보검색용 밀결합 DBMS를 사용한 병렬 정보 검색 엔진)

  • Ryu, Jae-Joon;Whang, Kyu-Young;Lee, Jae-Gil;Kwon, Hyuk-Yoon;Kim, Yi-Reun;Heo, Jun-Suk;Lee, Ki-Hoon
    • Journal of KIISE:Computing Practices and Letters / v.14 no.4 / pp.412-429 / 2008
  • As the number of electronic documents increases rapidly with the growth of the Internet, a parallel search engine capable of handling a large number of documents is becoming ever more important. To implement a parallel search engine, we need to partition the inverted index and search through the partitioned index in parallel. There are two methods of partitioning the inverted index: 1) document-identifier based partitioning and 2) keyword-identifier based partitioning. However, each method alone has drawbacks. The former makes inserting documents convenient and has high throughput, but performs poorly for top-k query processing. The latter performs well for top-k query processing, but makes inserting documents inconvenient and has low throughput. In this paper, we propose a hybrid partitioning method to compensate for the drawbacks of each method. We design and implement a parallel search engine that supports the hybrid partitioning method using the Odysseus DBMS tightly coupled with information retrieval capability. We first introduce the architecture of the parallel search engine, Odysseus/Parallel-OOSQL. We then show the effectiveness of the proposed system through systematic experiments. The experimental results show that the query processing time of the document-identifier based partitioning method is approximately inversely proportional to the number of blocks in the partition of the inverted index. The results also show that the keyword-identifier based partitioning method performs well in top-k query processing. The proposed parallel search engine can be optimized for performance by customizing the method of partitioning the inverted index according to the application environment. The Odysseus/Parallel-OOSQL parallel search engine is capable of indexing, storing, and querying 100 million web documents per node, or tens of billions of web documents for the entire system.
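
The two partitioning methods contrasted in this abstract can be sketched on a toy inverted index. The documents, the node count, the modulo and CRC32 placement rules, and all function names below are assumptions made for illustration; they do not describe how Odysseus/Parallel-OOSQL assigns documents or keywords to nodes, and the hybrid method is not modeled.

```python
import zlib
from collections import defaultdict

def build_index(docs):
    """Inverted index: term -> sorted list of ids of documents containing it."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for term in sorted(set(docs[doc_id].split())):
            index[term].append(doc_id)
    return dict(index)

def partition_by_document(docs, n):
    """Each node indexes its own subset of documents (here: doc_id mod n)."""
    subsets = [dict() for _ in range(n)]
    for doc_id, text in docs.items():
        subsets[doc_id % n][doc_id] = text
    return [build_index(subset) for subset in subsets]

def partition_by_keyword(docs, n):
    """Each node holds complete posting lists for its subset of keywords."""
    parts = [dict() for _ in range(n)]
    for term, postings in build_index(docs).items():
        node = zlib.crc32(term.encode("utf-8")) % n  # deterministic placement
        parts[node][term] = postings
    return parts

def lookup(parts, term):
    """Merge the postings for one term from every node that has them."""
    return sorted(d for part in parts for d in part.get(term, []))

# Hypothetical documents keyed by integer id.
docs = {0: "top k search", 1: "parallel search engine",
        2: "top k query", 3: "inverted index"}
doc_parts = partition_by_document(docs, 2)
kw_parts = partition_by_keyword(docs, 2)

nodes_doc = [i for i, part in enumerate(doc_parts) if "search" in part]
nodes_kw = [i for i, part in enumerate(kw_parts) if "search" in part]
```

With document-based partitioning a new document touches a single node, but a query for `search` must visit every node that holds matching documents. With keyword-based partitioning the complete posting list sits on one node, which suits top-k processing, but inserting a document updates one node per distinct keyword.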