• Title/Summary/Keyword: semantic document-retrieval

Search Result 59, Processing Time 0.026 seconds

Clustering System Model of Information Retrieval using NFC Tag Information (NFC 태그 정보를 이용한 검색 정보의 군집 시스템 모델)

  • Park, Sun;Kim, HyeongGyun;Sim, Su-Jeong
    • Smart Media Journal / v.2 no.3 / pp.17-22 / 2013
  • The spread of NFC supports a widening range of internet applications, from simple internet services to private services. This paper proposes a system model that clusters retrieved information using the access information stored in an NFC tag. The proposed model uses the tag's access information to search for information similar to the tag's, and then groups the similar retrieval results into topic clusters for users.


Designing Requisite Techniques of Storage Structure Supporting Efficient Retrieval in Semantic Web (시멘틱 웹의 효율적 검색을 지원하는 저장 구조의 요소 기술 설계)

  • Shin Pan-Seop
    • Journal of the Korea Computer Industry Society / v.7 no.3 / pp.227-236 / 2006
  • The Semantic Web is gaining popularity as the next web environment, and research on ontology languages for representing semantic relations between resources is correspondingly active. Ontology languages such as RDF and DAML+OIL were the starting point of this research, but they are limited in describing the characteristics of resources and in clearly defining relations between resources. The W3C therefore proposed OWL as the next standard language for describing resources; OWL makes up for the limited expressiveness of RDF and RDF Schema. In this paper, we build an ontology in OWL to implement an online retrieval system and propose a structure for storing ontology documents in a relational database (RDB). The structure supports the OWL features of equivalent, heterogeneous, inverse, union, and one-of relationships between classes or properties. We classify the elements by which OWL extends RDF Schema, and we propose a method of storing OWL in an RDB for interoperability with the many applications based on RDBs. Finally, we implement an OWL-based storage and retrieval system that provides an advanced search function.
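
As a rough illustration of keeping ontology statements in a relational database, the sketch below stores triples in a single SQLite table and answers a lookup through an inverse-property declaration. The table layout, the `ex:` names, and the `build_store` and `find_objects` helpers are hypothetical and are not taken from the paper, whose storage structure may differ.

```python
import sqlite3

def build_store():
    """Create an in-memory triple table and load a few illustrative statements."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triple (subject TEXT, predicate TEXT, object TEXT)")
    rows = [
        # Inverse declaration between two hypothetical properties
        ("ex:hasAuthor", "owl:inverseOf", "ex:authorOf"),
        ("ex:Paper1", "ex:hasAuthor", "ex:Shin"),
        ("ex:Kim", "ex:authorOf", "ex:Paper2"),
    ]
    conn.executemany("INSERT INTO triple VALUES (?, ?, ?)", rows)
    return conn

def find_objects(conn, subject, predicate):
    """Return objects of (subject, predicate), also using declared inverses."""
    direct = conn.execute(
        "SELECT object FROM triple WHERE subject = ? AND predicate = ?",
        (subject, predicate),
    ).fetchall()
    # If p and q are inverses, (s, p, o) holds whenever (o, q, s) is stored.
    inverse = conn.execute(
        """
        SELECT t.subject FROM triple AS t
        JOIN triple AS inv
          ON (inv.predicate = 'owl:inverseOf'
              AND ((inv.subject = ? AND inv.object = t.predicate)
                OR (inv.object = ? AND inv.subject = t.predicate)))
        WHERE t.object = ?
        """,
        (predicate, predicate, subject),
    ).fetchall()
    return sorted({r[0] for r in direct} | {r[0] for r in inverse})
```

Here `find_objects(conn, "ex:Paper2", "ex:hasAuthor")` finds `ex:Kim` even though only the inverse statement was stored, which is the kind of retrieval a plain keyword match over the stored rows would miss.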


Latent Semantic Indexing Analysis of K-Means Document Clustering for Changing Index Terms Weighting (색인어 가중치 부여 방법에 따른 K-Means 문서 클러스터링의 LSI 분석)

  • Oh, Hyung-Jin;Go, Ji-Hyun;An, Dong-Un;Park, Soon-Chul
    • The KIPS Transactions:PartB / v.10B no.7 / pp.735-742 / 2003
  • In an information retrieval system, document clustering improves user convenience and visual presentation by rearranging the retrieved documents according to specific topics. In this paper, we cluster documents with the K-Means algorithm and present the effect of the index term weighting scheme on document clustering. To verify the experiment, we applied Latent Semantic Indexing to visualize and analyze the clustering results in 2-dimensional space. The experimental results show that applying local weighting, global weighting, and a normalization factor together yields denser clusters in 2-dimensional space than similar or identical weighting schemes. In particular, logarithmic local and global weighting gives notable results.
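
The combination of a local weight, a global weight, and a normalization factor can be sketched as follows. The specific formulas (logarithmic term frequency, inverse document frequency, cosine normalization) are common textbook choices assumed here for illustration; the abstract does not state which variants the authors compared, and `weight_documents` is a hypothetical name.

```python
import math

def weight_documents(docs):
    """Weight terms as local * global, then length-normalize each document.

    docs: list of token lists. Returns one {term: weight} dict per document.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term
    df = {}
    for tokens in docs:
        for term in set(tokens):
            df[term] = df.get(term, 0) + 1

    weighted = []
    for tokens in docs:
        counts = {}
        for term in tokens:
            counts[term] = counts.get(term, 0) + 1
        vec = {}
        for term, tf in counts.items():
            local = 1.0 + math.log(tf)             # logarithmic local weight
            global_ = math.log(n_docs / df[term])  # idf-style global weight
            vec[term] = local * global_
        # Cosine normalization so document length does not dominate distances
        norm = math.sqrt(sum(w * w for w in vec.values()))
        if norm > 0:
            vec = {t: w / norm for t, w in vec.items()}
        weighted.append(vec)
    return weighted
```

The resulting vectors would then be the input to K-Means; a term that occurs in every document receives a global weight of zero and so does not influence the clustering.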

Semantic Information Retrieval Based on User-Word Intelligent Network (U-WIN 기반의 의미적 정보검색 기술)

  • Im, Ji-Hui;Choi, Ho-Seop;Ock, Cheol-Young
    • Proceedings of the Korea Contents Association Conference / 2006.11a / pp.547-550 / 2006
  • An information retrieval system's performance is judged by how accurately it retrieves the information the user wants. A search that uses only a homograph as the query either returns documents scattered across every meaning of the word or returns documents concentrated on one specific meaning. In this paper, we therefore propose a semantic information retrieval technique that uses the relations within the User-Word Intelligent Network (U-WIN) to disambiguate queries. In our experiment, queries are divided into two classes, homographs used as terminology and general homographs, and the expanded query takes the form "query + hypernym". We found an average precision of 73.5% for web-document-only search and 70% for integrated search on two portal sites. This suggests that the U-WIN-based semantic information retrieval technique can be used effectively in an IR system.
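
The "query + hypernym" expansion form can be sketched in a few lines. The small English sense table below is a made-up stand-in for a lexical network such as U-WIN, and `HYPERNYMS` and `expand_query` are hypothetical names; the actual U-WIN lookup is not described in the abstract.

```python
# Hypothetical sense inventory: homograph -> {sense label: hypernym}
HYPERNYMS = {
    "bank": {"finance": "financial institution", "river": "slope"},
    "java": {"language": "programming language", "island": "island"},
}

def expand_query(query, sense):
    """Return 'query + hypernym' for the chosen sense; unchanged if unknown."""
    hypernym = HYPERNYMS.get(query.lower(), {}).get(sense)
    return f"{query} {hypernym}" if hypernym else query
```

For example, `expand_query("bank", "river")` yields `"bank slope"`, which steers the search away from documents about financial institutions.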


A Dynamic Ontology-based Multi-Agent Context-Awareness User Profile Construction Method for Personalized Information Retrieval

  • Gao, Qian;Cho, Young Im
    • International Journal of Fuzzy Logic and Intelligent Systems / v.12 no.4 / pp.270-276 / 2012
  • With the growing amount of data and information available on the web, demand is high for personalized information retrieval services that provide context-aware services to web users. This paper proposes a novel dynamic, multi-agent, context-aware user profile construction method that uses an ontology's concepts and properties to model the user profile. The method considers both the frequency and the specificity of a concept in a document and in its corresponding domain ontology to construct the user profile. On that basis, a fuzzy c-means clustering method clusters the user's interest domains, and a dynamic update policy continuously tracks changes in the user's interests. The simulation results show that, as the user profile is gradually refined, the proposed system outperforms a traditional semantic-based retrieval system in recall and precision.
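
For readers unfamiliar with fuzzy c-means, the sketch below shows only its membership step: each point belongs to every cluster to a degree that depends on its relative distance to the cluster centers. The one-dimensional data, the fixed centers, and the `fcm_memberships` name are assumptions for illustration; how the paper represents interest domains is not given in the abstract.

```python
def fcm_memberships(points, centers, m=2.0):
    """Fuzzy c-means membership of each point in each cluster (1-D data).

    m > 1 is the fuzzifier; larger values give softer memberships.
    """
    power = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # Point coincides with one or more centers: split membership among them
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [r / total for r in row]
        else:
            # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
            row = [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]
        memberships.append(row)
    return memberships
```

In the full algorithm this step alternates with recomputing each center as a membership-weighted mean until the memberships stop changing.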

Web Document Clustering based on Graph using Hyperlinks (하이퍼링크를 이용한 그래프 기반의 웹 문서 클러스터링)

  • Lee, Joon;Kang, Jin-Beom;Choi, Joong-Min
    • Proceedings of the HCI Society of Korea Conference (한국HCI학회:학술대회논문집) / 2009.02a / pp.590-595 / 2009
  • As the number of web documents on the internet grows exponentially, improving the performance of web document clustering methods becomes important. Web document clustering techniques can provide accurate information and fast retrieval by clustering web documents through their semantic relationships. The mesh-graph-based clustering method provides high recall by calculating similarity between documents, but at a high computation cost. This paper proposes a clustering method that uses hyperlinks, a structural feature of web documents, to maintain effectiveness while reducing computation cost.
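
One simple way to cluster by link structure, without computing any pairwise content similarity, is to treat hyperlinks as undirected edges and take the connected components of the resulting graph. This is an assumed reading for illustration only; the abstract does not specify the paper's algorithm, and `cluster_by_links` is a hypothetical name.

```python
def cluster_by_links(links):
    """Group pages into clusters = connected components of the link graph.

    links: dict mapping a page to the list of pages it links to.
    """
    parent = {}

    def find(x):
        """Return the representative of x's cluster (union-find with path halving)."""
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for src, targets in links.items():
        find(src)  # register pages that have no outgoing links
        for dst in targets:
            parent[find(src)] = find(dst)  # merge the two clusters

    clusters = {}
    for page in parent:
        clusters.setdefault(find(page), []).append(page)
    return sorted(sorted(c) for c in clusters.values())
```

Because each link is processed once, the cost grows roughly with the number of links rather than with the square of the number of documents, which is the kind of saving over pairwise similarity that the abstract describes.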


A Study on Paper Retrieval System based on OWL Ontology (OWL 온톨로지를 기반으로 하는 논문 검색 시스템에 관한 연구)

  • Sun, Bok-Keun;We, Da-Hyun;Han, Kwang-Rok
    • Journal of the Korea Society of Computer and Information / v.14 no.2 / pp.169-180 / 2009
  • Conventional paper retrieval is keyword-based, and as a huge amount of data is published, such search makes it increasingly difficult to retrieve the information a user wants. To search according to the user's intent, we need the Semantic Web, which represents the semantics of web document resources on the Internet as an ontology and enables computers to understand that ontology. In this paper, we therefore describe a paper retrieval system based on reasoning over an OWL (Web Ontology Language) ontology. We build the paper ontology in OWL, a newly popular ontology language for the Semantic Web, and represent the correlations among diverse paper properties as DL (description logic) queries. The system then infers the correct results from the paper ontology using the DL queries, which makes intelligent information retrieval possible. Finally, we compare our experimental results with conventional retrieval.

Automatic term-network construction for Oral Documents (구술문서에 기초한 자동 용어 네트워크 구축)

  • Park, Soon-Cheol
    • Journal of Korea Society of Industrial Information Systems / v.12 no.4 / pp.25-31 / 2007
  • This paper proposes an automatic term-network construction system. The system uses statistical values of the terms that appear in a document corpus. The research uses 186 oral history documents collected from the Saemangeum area of Chollapuk-do, Korea. About 1,700 terms are extracted from the documents, and the term relationships in the term-network are determined by the cosine similarities of the term vectors. The system can show term relationships from the term-network almost as quickly as a real-time system. This approach to term-network construction is expected to serve as one method for constructing ontology systems and supporting semantic retrieval systems in the near future.
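
Building a term network from cosine similarities of term vectors can be sketched as below. Representing each term by its per-document occurrence counts and linking terms above a fixed threshold are assumptions made for illustration; the abstract does not say how the vectors or the cut-off were defined, and the function names are hypothetical.

```python
import math

def term_vectors(docs):
    """Map each term to its vector of per-document occurrence counts."""
    vectors = {}
    for i, tokens in enumerate(docs):
        for term in tokens:
            vectors.setdefault(term, [0] * len(docs))[i] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_term_network(docs, threshold=0.5):
    """Return edges (term1, term2, similarity) with similarity >= threshold."""
    vectors = term_vectors(docs)
    terms = sorted(vectors)
    edges = []
    for i, t1 in enumerate(terms):
        for t2 in terms[i + 1:]:
            sim = cosine(vectors[t1], vectors[t2])
            if sim >= threshold:
                edges.append((t1, t2, sim))
    return edges
```

With about 1,700 terms the pairwise loop covers roughly 1.4 million term pairs, so the edges can be computed once and stored, leaving only lookups at query time.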


Automatic indexing as a subject analysis technique (주제분석기법으로서의 자동색인)

  • 이영자
    • Journal of Korean Library and Information Science Society / v.12 / pp.61-96 / 1985
  • Human subject analysis of a document has some critical problems. It leads to inconsistency in the analysis process and to a conflict between the two objectives of subject analysis (one is identifying content for the retrieval of specific items, and the other is identifying content for the grouping of related materials). Since mechanized subject analysis has been recognized as a possible way to alleviate the problems of manual analysis, various approaches to automatic indexing have been studied and tested. This study examines automatic indexing as one of the promising subject analysis techniques, covering statistical, syntactic, and semantic approaches. In conclusion, the appropriate time to apply automatic indexing should be decided on the basis of a thorough investigation of cost versus effectiveness, and automatic indexing systems should be developed in close relationship with on-line search, which is a good retrieval system for a society facing an information explosion. As machine-readable document text becomes more and more available with the rapid development of computer technology, more substantial research on automatic indexing will also become possible, which can lead to more practical automatic indexing systems.


Multiple Cause Model-based Topic Extraction and Semantic Kernel Construction from Text Documents (다중요인모델에 기반한 텍스트 문서에서의 토픽 추출 및 의미 커널 구축)

  • 장정호;장병탁
    • Journal of KIISE:Software and Applications / v.31 no.5 / pp.595-604 / 2004
  • Automatic analysis of concepts or semantic relations from text documents enables not only efficient acquisition of relevant information but also comparison of documents at the concept level. We present a multiple cause model-based approach to text analysis, in which latent topics are automatically extracted from document sets and the similarity between documents is measured by semantic kernels constructed from the extracted topics. In our approach, a document is assumed to be generated by various combinations of underlying topics. A topic is defined by a set of words that are related to the same topic or co-occur frequently within a document. In a network representing a multiple-cause model, each topic is identified by a group of words having high connection weights from a latent node. To facilitate learning and inference in multiple-cause models, some approximation methods are required, and we use an approximation by Helmholtz machines. In an experiment on the TDT-2 data set, we extract sets of meaningful words in which each set contains some theme-specific terms. Using semantic kernels constructed from latent topics extracted by multiple cause models, we also achieve significant improvements over the basic vector space model in terms of retrieval effectiveness.
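
The idea of measuring similarity through extracted topics rather than raw word overlap can be sketched as an inner product in topic space. The topic-word weights below are made up, and the linear form k(a, b) = <Ta, Tb> is a generic semantic-kernel construction assumed for illustration; the paper's kernel, built from a learned multiple cause model, may be defined differently.

```python
def topic_projection(doc, topics):
    """Project a bag-of-words vector onto each topic's word weights.

    doc: {word: count}; topics: list of {word: weight} dicts, one per topic.
    """
    return [sum(doc.get(w, 0.0) * wt for w, wt in topic.items()) for topic in topics]

def semantic_kernel(doc_a, doc_b, topics):
    """k(a, b) = <T a, T b>, where row i of T holds topic i's word weights."""
    pa = topic_projection(doc_a, topics)
    pb = topic_projection(doc_b, topics)
    return sum(x * y for x, y in zip(pa, pb))
```

With hypothetical topics `[{"car": 1.0, "automobile": 1.0}, {"bank": 1.0}]`, a document containing only "car" and one containing only "automobile" share no words, so their vector space similarity is zero, yet the kernel is positive because both load on the same topic.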