• Title/Summary/Keyword: Similarity Query

An Implementation of XML document searching system based on Structure and Semantics Similarity (구조와 내용 유사도에 기반한 XML 웹 문서 검색시스템 구축)

  • Park Uchang;Seo Yeojin
    • Journal of Internet Computing and Services / v.6 no.2 / pp.99-115 / 2005
  • Extensible Markup Language (XML) is an Internet standard used to express and convert data. To find the necessary information in XML documents, a search system for XML documents is needed. In this research, we developed a search system that finds documents matching the structure and content of a given XML document, making the best use of XML structure. The search metrics take into account similarity in tag names, tag values, and tag structure. After a search, the system displays the results ranked by aggregate similarity. Three query methods are provided: conventional keyword search; search with tag names and their values; and search with XML documents. These three methods let users choose the one that best suits their preference, which increases the usefulness of the system.

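The ranking by aggregate similarity described in the abstract above can be sketched as follows. This is only an illustration: the abstract does not give the actual metrics, so the Jaccard measure, the equal weights, and the helper names (`features`, `aggregate_similarity`, `rank`) are all assumptions.

```python
import xml.etree.ElementTree as ET


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def features(xml_text):
    """Collect tag names, (tag, text) pairs, and root-to-node tag paths."""
    root = ET.fromstring(xml_text)
    names, values, paths = set(), set(), set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        names.add(node.tag)
        paths.add(path)
        text = (node.text or "").strip()
        if text:
            values.add((node.tag, text))
        for child in node:
            walk(child, path)

    walk(root, "")
    return names, values, paths


def aggregate_similarity(query_xml, doc_xml, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of tag-name, tag-value, and structure similarity.

    The equal weights are an assumption made for this sketch.
    """
    q = features(query_xml)
    d = features(doc_xml)
    return sum(w * jaccard(qs, ds) for w, qs, ds in zip(weights, q, d))


def rank(query_xml, docs):
    """Return document names ordered by decreasing aggregate similarity."""
    return sorted(
        docs,
        key=lambda name: aggregate_similarity(query_xml, docs[name]),
        reverse=True,
    )
```

Calling `rank(query, {"a": xml_a, "b": xml_b})` returns the document names with the most similar document first, mirroring the ranked result list the abstract mentions.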

A Natural Language Question Answering System-an Application for e-learning

  • Gupta, Akash;Rajaraman, Prof. V.
    • Proceedings of the Korea Inteligent Information System Society Conference / 2001.01a / pp.285-291 / 2001
  • This paper describes a natural language question answering system that students can use to get solutions to their queries. Unlike AI question answering systems that focus on generating new answers, the present system retrieves existing ones from question-answer files. Unlike information retrieval approaches that rely on a purely lexical metric of similarity between query and document, it uses a semantic knowledge base (WordNet) to improve its ability to match questions. The paper describes the design and current implementation of the system as an intelligent tutoring system. The main drawback of existing tutoring systems is that the computer poses a question to the students and guides them in reaching the solution to the problem. In the present approach, a student asks any question related to the topic and gets a suitable reply. Based on the query, the student gets either a direct answer to the question or a set of questions (at most 3 or 4) that bear the greatest resemblance to the user input. We further analyze application fields for this kind of system and discuss the scope for future research in this area.


Retrieving of Compositionally Similar Images Using Straight Line Elements (직선 성분을 이용하는 구도가 유사한 사진 검색 방법)

  • Hwang, Joo-Yeon;Lim, Dong-Sup;Paik, Doo-Won
    • Journal of Korea Multimedia Society / v.12 no.11 / pp.1539-1546 / 2009
  • In photography, lines are important elements that shape the composition and mood of a photo. In this paper, we propose a measure of compositional dissimilarity between photos using lines, which are basic elements of photography. To identify the line patterns that characterize the composition of photos, we investigated the features of both compositionally identical and compositionally different photos. We then developed an effective measure of compositional dissimilarity by applying the investigated features, and, to evaluate the proposed method, implemented an image search system that retrieves photos compositionally similar to a given query. The search system achieved a maximum precision of about 85% for the top 10 results and reliably retrieved photos compositionally similar to the given query even when some objects were included in the photos.


Shape-Based Leaf Image Retrieval System (모양 기반의 식물 잎 이미지 검색 시스템)

  • Nam Yun-Young;Hwang Een-Jun
    • The KIPS Transactions:PartD / v.13D no.1 s.104 / pp.29-36 / 2006
  • In this paper, we present a leaf image retrieval system that represents and retrieves leaf images based on their shape. To represent leaf images more effectively, we improved an existing MPP algorithm. To reduce the response time, we also propose a new dynamic matching algorithm that essentially revises the Nearest Neighbor search. The system provides users with an interface for uploading query images, or tools for generating queries based on shape features, and retrieves images based on their similarity. For convenience, users can query images by sketching a leaf shape or leaf arrangement on the web. In the experiment, we constructed an image database of Korean native plants and measured system performance by counting the number of similar images retrieved for the queries.

Retrieval of Identical Clothing Images Based on Non-Static Color Histogram Analysis

  • Choi, Yoo-Joo;Moon, Nam-Mee;Kim, Ku-Jin
    • Journal of Broadcast Engineering / v.14 no.4 / pp.397-408 / 2009
  • In this paper, we present a non-static color histogram method to retrieve clothing images that are similar to a query clothing image. Given a clothing area, our method automatically extracts the major colors using the octree-based quantization approach[16]. Then, a color palette composed of the major colors is generated. The feature of each clothing image, whether a query or a database image, is represented as a color histogram based on its color palette. We define the matched color bins between two possibly different color palettes, and unify the color palettes by merging or deleting some color bins if necessary. The similarity between two histograms is measured using the weighted Euclidean distance between the matched color bins, where the weight is derived from the frequency of each bin. We compare our method with previous histogram matching methods through experiments. Compared to the HSV cumulative histogram-based approach, our method improves the retrieval precision by 13.7% with fewer color bins.
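As a rough illustration of the weighted Euclidean distance mentioned in the abstract above, the sketch below compares two histograms whose bins have already been matched and unified. The abstract says only that the weight is derived from bin frequency, so the specific weight used here (the mean frequency of the two matched bins) and the function name are assumptions.

```python
import math


def weighted_euclidean(hist_a, hist_b):
    """Weighted Euclidean distance between two aligned, normalized histograms.

    Each position i is assumed to hold the frequencies of one matched color
    bin. The weight (mean of the two frequencies) is an assumption made for
    this sketch, not the formula from the paper.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of matched bins")
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        weight = (a + b) / 2
        total += weight * (a - b) ** 2
    return math.sqrt(total)
```

With this weighting, differences in frequently occurring colors contribute more to the distance than differences in rare colors.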

Enhanced Cloud Service Discovery for Naïve users with Ontology based Representation

  • Viji Rajendran, V;Swamynathan, S
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.1 / pp.38-57 / 2016
  • Service discovery is one of the major challenges in a cloud computing environment with a large number of service providers and heterogeneous services. Non-uniform naming conventions and the varied types and features of services make cloud service discovery a grueling problem. With the proliferation of cloud services, it has become laborious to find services, especially in Internet-based service repositories. To address this issue, services are crawled and clustered according to their similarity. The clustered services are maintained as a catalogue in which the data published on the cloud provider's website are stored in a standard format. As there is no standard specification or description language for cloud services, new efficient and intelligent mechanisms to discover cloud services are strongly required. This paper also proposes a key-value representation to describe cloud services in a formal way and to facilitate matching between offered services and demand. Since naïve users prefer to query in natural language, semantic approaches are used to close the gap between ambiguous user requirements and the service specifications. Experimental evaluation, measured in terms of precision and recall of the retrieved services, shows that the proposed approach outperforms existing methods.

Semantic Extention Search for Documents Using the Word2vec (Word2vec을 활용한 문서의 의미 확장 검색방법)

  • Kim, Woo-ju;Kim, Dong-he;Jang, Hee-won
    • The Journal of the Korea Contents Association / v.16 no.10 / pp.687-692 / 2016
  • The conventional way to search documents is with keyword-based queries using a vector space model such as tf-idf. A keyword-based search process has a drawback: it cannot recognize that lexically different words may be semantically the same. This paper studies a document search scheme based on document queries. In particular, it uses centrality vectors, instead of tf-idf vectors, to represent query documents, combined with the Word2vec method to capture the semantic similarity of the words they contain. This scheme improves the performance of document search and provides a way to find documents that are not only lexically but also semantically close to a query document.
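The idea of matching documents through word embeddings rather than exact keywords can be sketched as follows. The abstract does not define its centrality vectors, so this sketch substitutes the plain mean of word vectors; the tiny two-dimensional `EMBEDDINGS` table is hypothetical and stands in for a trained Word2vec model.

```python
import math

# Hypothetical toy embeddings standing in for a trained Word2vec model.
EMBEDDINGS = {
    "car": (0.9, 0.1),
    "automobile": (0.85, 0.15),
    "engine": (0.8, 0.3),
    "banana": (0.1, 0.9),
    "fruit": (0.2, 0.8),
}


def document_vector(words):
    """Mean of the known word vectors (an assumed stand-in for the
    centrality vector, which the abstract does not define)."""
    vectors = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    if not vectors:
        raise ValueError("no known words in document")
    dim = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dim))


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

In this toy setup, a query document containing "car" scores close to a document containing only "automobile" even though the two share no keyword, which is the kind of lexical mismatch the abstract says tf-idf cannot handle.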

Development of melody similarity based on chroma representation, dynamic time warping, and hinge distance (크로마 레벨 표현, 동적 시간 왜곡, 꺾인 거리함수에 기반한 멜로디 사이의 유사도 개발)

  • Jang, Dalwon;Park, Sung-Ju;Jang, Sei-Jin;Lee, Seok-Pil
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2011.07a / pp.258-260 / 2011
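  • This paper proposes a melody similarity measure that can be used in query-by-singing/humming (QbSH) systems or cover song identification systems. As digital music has come into widespread use, QbSH and cover song identification have been widely studied as methods of music retrieval. Melody similarity is an essential component for implementing such systems: assuming that melodies have been extracted from two pieces of music, it expresses the degree of similarity between the extracted melodies as a number. QbSH and cover song identification systems use melody similarity to search a database for songs similar to the input song. The melody similarity method proposed in this paper uses dynamic time warping (DTW) and chroma representation, both of which have been widely studied. DTW is applied asymmetrically, and melody features expressed in the MIDI note domain are represented as chroma levels of at least 0 and less than 12. Whereas existing methods mostly use integer values, this paper uses real values. Performance was improved by using a hinge-shaped function as the DTW distance function in place of the conventional absolute difference. Performance was verified through experiments on a QbSH system. In experiments with 1,000 queries of 10 to 12 seconds each against a database of about 28 hours, the mean reciprocal rank (MRR) was 0.713.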

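The combination of chroma representation, DTW, and a hinge-shaped distance described in the abstract above can be sketched as below. Several details are assumptions, since the abstract does not give them: the hinge is modeled as a circular chroma difference capped at a hypothetical `cap` value, and the DTW recurrence is the standard symmetric one rather than the asymmetric variant the paper uses. The MRR helper follows the usual definition.

```python
def chroma(midi_note):
    """Map a (possibly fractional) MIDI note to a chroma level in [0, 12)."""
    return midi_note % 12.0


def hinge_distance(a, b, cap=2.0):
    """Circular chroma difference, flattened above `cap`.

    The capped shape and the value of `cap` are assumptions of this sketch.
    """
    diff = abs(a - b)
    circular = min(diff, 12.0 - diff)
    return min(circular, cap)


def dtw(query, reference, dist=hinge_distance):
    """Standard DTW cost between two sequences (symmetric step pattern)."""
    n, m = len(query), len(reference)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(query[i - 1], reference[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # query frame repeated
                cost[i][j - 1],      # reference frame repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def mean_reciprocal_rank(ranks):
    """MRR given the 1-based rank of the correct song for each query."""
    return sum(1.0 / r for r in ranks) / len(ranks)
```

Because notes are reduced to chroma, a melody sung one octave higher has zero distance to the original, and DTW absorbs differences in tempo by letting frames repeat.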

A Study on Document Retrieval of Web Using Relevance Feedback (적합성 피드백을 이용한 웹 문서검색에 관한 연구)

  • 김영천;이성주
    • Journal of the Korea Institute of Information and Communication Engineering / v.5 no.3 / pp.597-604 / 2001
  • In conventional Boolean retrieval systems, document ranking is not supported and similarity coefficients cannot be computed between queries and documents. The MMM, Paice, and P-norm models have been proposed to support ranking in Boolean retrieval systems. They share the property of interpreting Boolean operators softly. In this paper, we propose a new soft evaluation method for information retrieval using a query splitting relevance feedback model. We also show through performance comparison that query splitting relevance feedback (QSRF) is more efficient and effective than MMM, Paice, and P-norm.

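To illustrate what interpreting Boolean operators softly means for two of the baseline models named in the abstract above, the sketch below evaluates OR and AND over a document's term weights in [0, 1] using the commonly cited MMM and unweighted P-norm formulas. The coefficient defaults (`c_max`, `c_min`, `p`) are illustrative assumptions, and the paper's own QSRF method is not reproduced here.

```python
def mmm_or(weights, c_max=0.7):
    """MMM OR: linear mix of max and min, leaning toward the max."""
    return c_max * max(weights) + (1 - c_max) * min(weights)


def mmm_and(weights, c_min=0.7):
    """MMM AND: linear mix of min and max, leaning toward the min."""
    return c_min * min(weights) + (1 - c_min) * max(weights)


def pnorm_or(weights, p=2.0):
    """P-norm OR with equal query-term weights: normalized p-norm of weights."""
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1 / p)


def pnorm_and(weights, p=2.0):
    """P-norm AND with equal query-term weights: 1 minus the normalized
    p-norm of the complements."""
    n = len(weights)
    return 1 - (sum((1 - w) ** p for w in weights) / n) ** (1 / p)
```

For a document that contains only one of two query terms (weights 1.0 and 0.0), strict Boolean AND would score 0 and OR would score 1; the soft versions return intermediate values, which is what allows ranking. With `p=1`, P-norm AND and OR both reduce to the plain average of the weights.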

Fast, Flexible Text Search Using Genomic Short-Read Mapping Model

  • Kim, Sung-Hwan;Cho, Hwan-Gue
    • ETRI Journal / v.38 no.3 / pp.518-528 / 2016
  • The searching of an extensive document database for documents that are locally similar to a given query document, and the subsequent detection of similar regions between such documents, is considered as an essential task in the fields of information retrieval and data management. In this paper, we present a framework for such a task. The proposed framework employs the method of short-read mapping, which is used in bioinformatics to reveal similarities between genomic sequences. In this paper, documents are considered biological objects; consequently, edit operations between locally similar documents are viewed as an evolutionary process. Accordingly, we are able to apply the method of evolution tracing in the detection of similar regions between documents. In addition, we propose heuristic methods to address issues associated with the different stages of the proposed framework, for example, a frequency-based fragment ordering method and a locality-aware interval aggregation method. Extensive experiments covering various scenarios related to the search of an extensive document database for documents that are locally similar to a given query document are considered, and the results indicate that the proposed framework outperforms existing methods.