• Title/Summary/Keyword: Concept-based Information Retrieval

Knowledge-poor Term Translation using Common Base Axis with application to Korean-English Cross-Language Information Retrieval (과도한 지식을 요구하지 않는 공통기반축에 의한 용어 번역과 한영 교차정보검색에의 응용)

  • 최용석;최기선
    • Korean Journal of Cognitive Science / v.14 no.1 / pp.29-40 / 2003
  • Cross-Language Information Retrieval (CLIR) retrieves documents written in various languages using a query in a single language, so a user working in one language can retrieve documents in another through a CLIR system. In CLIR, the query translation method is known to be more efficient. Better query translation performance requires more resources, such as dictionaries, ontologies, and parallel or comparable corpora, but these are usually not available. This paper proposes a new concept called the Common Base Axis, adapted to Korean-English query translation, and a new weighting method for dictionary-based query translation. The essential idea is that Korean and English words can be expressed in one vector space by means of the Common Base Axis, which is then used to calculate sense distance for query weighting. The experiments show that the Common Base Axis gives good performance without an ontology and is especially good for one-word query translation.

Construction of the Concept-Based Faceted Framework for Thesaurus Integration (시소러스 통합을 위한 개념기반 패싯 프레임워크 구축)

  • Lee, Seung-Min
    • Journal of Korean Library and Information Science Society / v.41 no.3 / pp.269-290 / 2010
  • Applying one specific thesaurus can cause several problems because each thesaurus has its own characteristics inherited from its construction process. Integrating thesauri can therefore be an appropriate way to overcome these difficulties. This research selected physics as the domain and two thesauri in that domain: PACS and PIRA. By integrating these two heterogeneous thesauri, the research constructed a conceptual structure that covers all concepts related to physics. Because the conceptual structure is built by applying facet analysis to the integrated thesaurus, it provides a knowledge base with a hierarchical structure and clear relationships between concepts. It can be an alternative approach to effective and efficient information retrieval and knowledge discovery.

Region Based Image Similarity Search using Multi-point Relevance Feedback (다중점 적합성 피드백방법을 이용한 영역기반 이미지 유사성 검색)

  • Kim, Deok-Hwan;Lee, Ju-Hong;Song, Jae-Won
    • The KIPS Transactions:PartD / v.13D no.7 s.110 / pp.857-866 / 2006
  • The performance of an image retrieval system is usually very low because of the semantic gap between the low-level features and the high-level concept in a query image. Semantically relevant images may exhibit very different visual characteristics and may be scattered across several clusters. In this paper, we propose a content-based image retrieval approach that combines region-based image retrieval with a new relevance feedback method using adaptive clustering. Our main goal is to find semantically related clusters in order to narrow the semantic gap. Our method consists of region-based clustering processes and a cluster-merging process. All segmented regions of relevant images are organized into semantically related hierarchical clusters, and clusters are merged by finding the number of latent clusters. In the cluster-merging process, the method applies $T_v^2$, which uses $v$ principal components, instead of the classical Hotelling's $T^2$ [1] to find the unknown number of clusters and to resolve the singularity problem in high dimensions; we demonstrate that there is little difference between the performance of $T^2$ and that of $T_v^2$. Experiments have demonstrated that the proposed approach is effective in improving the performance of an image retrieval system.
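
For context, the classical two-sample Hotelling's $T^2$ statistic can be written as follows. The notation (sample sizes $n_1, n_2$, cluster means $\bar{x}_1, \bar{x}_2$, sample covariances $S_1, S_2$, pooled covariance $S_p$) is the standard textbook one and is an assumption here; the abstract does not define its symbols or the exact form of $T_v^2$.

$$T^2 = \frac{n_1 n_2}{n_1 + n_2}\,(\bar{x}_1 - \bar{x}_2)^{\top} S_p^{-1} (\bar{x}_1 - \bar{x}_2), \qquad S_p = \frac{(n_1 - 1) S_1 + (n_2 - 1) S_2}{n_1 + n_2 - 2}$$

The statistic requires $S_p^{-1}$, which does not exist when the feature dimension exceeds $n_1 + n_2 - 2$. This is presumably the singularity problem in high dimensions that the abstract mentions, and restricting the data to a few principal components is one common way to avoid it.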

Design and Implementation of On-line Standards Development System on the World Wide Web (WWW상에서의 온라인 정보통신표준 개발 시스템 설계 및 구현)

  • 구경철;김형준;박기식;송기평;조인준;정회경
    • Journal of the Korea Institute of Information and Communication Engineering / v.2 no.4 / pp.559-573 / 1998
  • Recently, Standards Development Organizations (SDOs) in the field of information and communication have recognized that "More new and more complex standards should be developed in shorter time". To cope with this challenge, they are trying to construct a Standards Information Cooperation Network (SICN) or Electronic Document Handling (EDH) systems for an efficient standards development process. This paper presents the design and implementation of an Extranet-based Web system dedicated to effective on-line standards-making environments. The system, called SICN (Standards Information Cooperation Network), is a workflow-based network application created with a view to fostering faster standards development, with functionalities such as an electronic signature mechanism, electronic voting, comment gathering, and dynamic links for ready retrieval of standards information stored in a database. This paper also describes the concept of a VSDO (Virtual Standards Development Organization) that supports all the features needed by the relevant standards-making bodies to carry out their activities in dynamic on-line environments.

Semantic Search System using Ontology-based Inference (온톨로지기반 추론을 이용한 시맨틱 검색 시스템)

  • Ha Sang-Bum;Park Yong-Tack
    • Journal of KIISE:Software and Applications / v.32 no.3 / pp.202-214 / 2005
  • The Semantic Web is a web paradigm that represents not merely links between documents but the semantics of documents and the relations among them. In addition, it enables software agents to understand the semantics of documents. We propose a semantic search based on inference with ontologies, which has the following characteristics. First, our search engine uses explicit ontologies for reasoning, so retrieval is possible even when a search keyword differs from the terms used in the documents. Second, even when the concepts of two ontologies do not match exactly, similar results can be found through a rule-based translator and ontological reasoning. Third, our approach increases the accuracy and precision of the search engine by using explicit ontologies to reason about the meanings of documents rather than guessing those meanings from keywords alone. Fourth, the domain ontology lets users issue more detailed queries through an ontology-based automated query generator whose search area and accuracy are similar to those of NLP. Fifth, it enables agents to search automatically not only for documents containing a keyword but also for user-preferred information and knowledge from ontologies. It can search more accurately than current retrieval systems that rely on database queries or keyword matching. We demonstrate that our system, which uses ontologies and inference based on explicit ontologies, can perform better than a keyword-matching approach.

Implementation and Analysis of the Agent based Object-Oriented Software Test Tool, TAS (에이전트 기반의 객체지향 소프트웨어 테스트 도구인 TAS의 구현 및 분석)

  • Choi, Jeon-Geun;Choi, Byoungju
    • Journal of KIISE:Software and Applications / v.28 no.10 / pp.732-742 / 2001
  • The concept of an agent has become important in computer science and has been applied to a number of application domains, such as electronic commerce and information retrieval, but no one has yet proposed applying it to software testing. The test agent system (TAS), which applies the concept of an agent to software testing, is a new test tool. It consists of the User Interface Agent, the Test Case Selection & Testing Agent, and the Regression Test Agent. Each of these agents, with its intelligent rules, carries out the tests autonomously by employing object-oriented test processes. The system has two advantages. First, since the tests are carried out autonomously, it minimizes tester interference; second, since redundancy-free, consistent, and effective test cases are intelligently selected, the testing time is reduced while the fault detection effectiveness improves. In this paper, by showing the testing process actually being carried out autonomously by the three agents that form the TAS, we show that the TAS minimizes tester interference. By carrying out four types of experiments (on the RE-Rule, on the CTS-Rule, on the overall TAS, and on the fault-detection effectiveness of the RE-Rule), we also show the reduction in testing time and the improvement in fault detection effectiveness.

Design of Algorithm for Efficient Retrieve Pure Structure-Based Query Processing and Retrieve in Structured Document (구조적 문서의 효율적인 구조 질의 처리 및 검색을 위한 알고리즘의 설계)

  • 김현주
    • Journal of the Korea Computer Industry Society / v.2 no.8 / pp.1089-1098 / 2001
  • Structure information contained in a structured document supports various access paths to the document. To use this structure information, an index structure must be constructed over the document structures. Content indexing and structure indexing per document incur high memory overhead. Therefore, processing pure structure queries, which are based on document structure such as relationships between elements or element order, requires an index with low memory overhead. This paper proposes the GDIT (Global Document Instance Tree) data structure and an indexing scheme for document structure that supports low memory overhead for indexing and powerful types of user queries. The structure indexing scheme indexes only the lowest-level elements of a document and does not affect the number of documents containing the retrieved element. Based on the index structure, we propose a query processing algorithm for pure structure queries and prove that the indexing scheme remains efficient in terms of space. The proposed index structure is based on the GDR concept and uses an indexing technique based on GDIT.

Semantic Similarity Measures Between Words within a Document using WordNet (워드넷을 이용한 문서내에서 단어 사이의 의미적 유사도 측정)

  • Kang, SeokHoon;Park, JongMin
    • Journal of the Korea Academia-Industrial cooperation Society / v.16 no.11 / pp.7718-7728 / 2015
  • Semantic similarity between words can be applied in many fields, including computational linguistics, artificial intelligence, and information retrieval. In this paper, we present a weighted method for measuring the semantic similarity between words in a document. The method uses the edge distance and depth in WordNet and calculates the semantic similarity between words on the basis of document information, namely word term frequencies (TF) and word concept frequencies (CF). Each word's weight is calculated from its TF and CF in the document. The method combines the edge distance between words, the depth of their subsumer, and the word weight in the document. We compared our scheme with other methods experimentally, and the proposed method outperforms the other similarity measures. Other methods, which are based simply on shortest distance or depth, have difficulty representing or merging this information. This paper considers shortest distance, depth, and information about the words in the document, and thereby improves performance.
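
The ingredients named in this abstract (edge distance, subsumer depth, and a document-based word weight) can be illustrated with a minimal sketch. It is not the authors' formula, which the abstract does not give: the taxonomy `PARENT` is a hypothetical toy hierarchy rather than WordNet, the depth-based score is the well-known Wu-Palmer measure used as a stand-in, and `weighted_similarity` with its TF-only weight is an assumption made purely for illustration.

```python
from collections import Counter

# Hypothetical toy is-a taxonomy (child -> parent); a stand-in for WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth of a node, counting the root as depth 1."""
    return len(path_to_root(node))


def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_a:
            return node
    raise ValueError("nodes share no common ancestor")


def edge_distance(a, b):
    """Number of edges on the shortest is-a path between a and b."""
    lcs = lowest_common_subsumer(a, b)
    return (depth(a) - depth(lcs)) + (depth(b) - depth(lcs))


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def weighted_similarity(a, b, tokens):
    """Hypothetical weighting: scale Wu-Palmer by relative term frequency."""
    tf = Counter(tokens)
    w = (tf[a] + tf[b]) / (2 * max(tf.values()))  # lies in [0, 1]
    return wu_palmer(a, b) * (0.5 + 0.5 * w)


tokens = ["dog", "cat", "dog", "car"]
print(edge_distance("dog", "cat"), wu_palmer("dog", "cat"))
print(weighted_similarity("dog", "cat", tokens))
```

In this sketch the weight can only shrink the structural score, so frequent word pairs keep most of their taxonomy-based similarity while rare pairs are discounted. A concept-frequency term, as in the abstract, would need sense information that the toy data does not model.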

The SemanticWeb Technology and its Applications (시맨틱웹 기술과 활용방안)

  • 오삼균
    • Journal of the Korean Society for information Management / v.19 no.4 / pp.298-319 / 2002
  • The Semantic Web is a new technology that attempts to achieve effective retrieval, automation, integration, and reuse of web resources by constructing knowledge bases composed of machine-readable definitions of resources and associations that express the relationships among them. To have this kind of Semantic Web in place, the following infrastructure is necessary: the capability to assign an unchangeable and unique identifier (URI) to each resource; adoption of the XML namespace concept to prevent collisions between element and attribute names defined by various institutions; widespread use of RDF to describe resources so that diverse metadata can be interoperable; use of RDF Schema to define the meaning of metadata elements and the relationships among them; adoption of DAML+OIL, which is built upon RDF(S), to increase reasoning capability and expressive power; and finally adoption of OWL, which is built upon DAML+OIL by removing unnecessary constructors and adding new ones based on experience with DAML+OIL. The purpose of this study is to describe the central concepts and technologies related to the Semantic Web and to discuss the benefits of metadata interoperability based on XML/RDF schemas and the potential applications of diverse ontologies.
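
The role of XML namespaces in preventing name collisions can be shown with a small sketch using Python's standard `xml.etree.ElementTree`. The record and the `http://example.org/terms/` vocabulary are hypothetical; only the Dublin Core element namespace URI is a real, published one.

```python
import xml.etree.ElementTree as ET

# Two institutions both define an element called "title".
# The namespace URI, not the local name, tells them apart.
DOC = """<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:ex="http://example.org/terms/">
  <dc:title>Semantic Web Primer</dc:title>
  <ex:title>Dr.</ex:title>
</record>"""

# Prefix-to-URI map used for lookups; the prefixes are local shorthand.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "ex": "http://example.org/terms/",  # hypothetical vocabulary
}

root = ET.fromstring(DOC)
dc_title = root.find("dc:title", NS).text  # title of the work
ex_title = root.find("ex:title", NS).text  # honorific of a person

# ElementTree stores each tag as "{namespace-uri}localname".
print(root[0].tag)
print(dc_title, "|", ex_title)
```

Because each element is identified by its namespace URI plus local name, the two `title` elements never clash even though they appear in the same record.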

A New Semantic Distance Measurement Method using TF-IDF in Linked Open Data (링크드 오픈 데이터에서 TF-IDF를 이용한 새로운 시맨틱 거리 측정 기법)

  • Cho, Jung-Gil
    • Journal of the Korea Convergence Society / v.11 no.10 / pp.89-96 / 2020
  • Linked Data allows structured data to be published in a standard way so that datasets from various domains can be interlinked. With the rapid evolution of Linked Open Data (LOD), researchers are exploiting it to solve particular problems such as semantic similarity assessment. In this paper, we propose a method, built on the basic concept of Linked Data Semantic Distance (LDSD), for calculating the Linked Data semantic distance between resources for use in an LOD-based recommender system. The proposed semantic distance measurement model is based on a similarity measurement that combines the LOD-based semantic distance with a new link weight using TF-IDF, which is well known in the field of information retrieval. To verify the effectiveness of the approach, performance was evaluated in the context of an LOD-based recommender system using mixed data from DBpedia and MovieLens. Experimental results show that the proposed method achieves higher accuracy than other similar methods. In addition, it contributes to improving the accuracy of the recommender system by expanding the range of the semantic distance calculation.
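
The idea of weighting links in a Linked Data distance can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's model: `TRIPLES` is hypothetical toy data, `ldsd_direct` follows the basic direct-link form of LDSD (one over one plus the number of direct links in both directions), and the IDF-style link weight in `ldsd_direct_idf` is an assumption, since the abstract does not give the actual TF-IDF weighting.

```python
import math

# Hypothetical toy triples: (subject, link type, object).
TRIPLES = [
    ("film:A", "director", "person:X"),
    ("film:B", "director", "person:X"),
    ("film:A", "sequel", "film:B"),
    ("film:A", "type", "class:Film"),
    ("film:B", "type", "class:Film"),
    ("film:C", "type", "class:Film"),
]


def direct_links(a, b):
    """Link types of all triples that point directly from a to b."""
    return [link for s, link, o in TRIPLES if s == a and o == b]


def ldsd_direct(a, b):
    """Unweighted direct distance: 1 / (1 + links a->b + links b->a)."""
    return 1.0 / (1.0 + len(direct_links(a, b)) + len(direct_links(b, a)))


def idf(link):
    """IDF-style weight: link types used by few subjects count for more."""
    subjects = {s for s, _, _ in TRIPLES}
    using = {s for s, l, _ in TRIPLES if l == link}
    return math.log(len(subjects) / len(using))


def ldsd_direct_idf(a, b):
    """Hypothetical weighted variant: sum IDF weights instead of counts."""
    links = direct_links(a, b) + direct_links(b, a)
    return 1.0 / (1.0 + sum(idf(link) for link in links))


print(ldsd_direct("film:A", "film:B"))      # one direct link
print(ldsd_direct("film:A", "film:C"))      # no direct links
print(ldsd_direct_idf("film:A", "film:B"))  # rare link type pulls films closer
```

In this toy data the `sequel` link is used by only one of three subjects, so it receives a weight above 1 and shrinks the distance more than a plain count would, whereas a link type used by every subject, such as `type`, would contribute nothing.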