• Title/Summary/Keyword: Extended Boolean Model

Relevance Feedback Method of an Extended Boolean Model using Hierarchical Clustering Techniques (계층적 클러스터링 기법을 이용한 확장 불리언 모델의 적합성 피드백 방법)

  • 최종필;김민구
    • Journal of KIISE:Software and Applications / v.31 no.10 / pp.1374-1385 / 2004
  • The relevance feedback process uses information obtained from a user about an initially retrieved set of documents to improve subsequent search formulations and retrieval performance. In the extended Boolean model, relevance feedback implies not only that new query terms must be identified, but also that the terms must be connected properly with the Boolean AND/OR operators. Salton et al. proposed a relevance feedback method for the extended Boolean model, called the DNF (disjunctive normal form) method. However, this method has a critical problem in generating reformulated queries. In this study, we investigate the problem of the DNF method and propose a relevance feedback method that uses hierarchical clustering techniques to solve it. We show the results of experiments performed on two data sets: the DOE collection in TREC 1 and the Web TREC 10 collection.

Cost-based Optimization of Extended Boolean Queries (확장 불리언 질의에 대한 비용 기반 최적화)

  • 박병권
    • Journal of the Korean Society for Information Management / v.18 no.3 / pp.29-40 / 2001
  • In this paper, we suggest a query optimization algorithm that selects the optimal method for processing an extended Boolean query on inverted files. There can be many methods for processing an extended Boolean query, depending on the processing sequence of the keywords contained in the query. In this sense, the problem of optimizing an extended Boolean query is essentially that of optimizing the keyword sequence in the query. We show that the problem is basically analogous to the problem of finding the optimal join order in database query optimization, and apply ideas from that area to solve it. We establish a cost model for processing an extended Boolean query and develop an algorithm to find the optimal keyword-processing sequence based on the concept of keyword rank, using the keyword selectivity and the access costs of the inverted file. We prove that the method selected by the optimization algorithm is indeed optimal, and show, through experiments, that the optimal method is superior to the others in performance. We believe that the suggested optimization algorithm will contribute to a significant enhancement of information retrieval performance.

Query Expansion Using Augmented Terms in an Extended Boolean Model

  • Nguyen, Tuan-Quang;Heo, Jun-Seok;Lee, Jung-Hoon;Kim, Yi-Reun;Whang, Kyu-Young
    • Journal of Computing Science and Engineering / v.2 no.1 / pp.26-43 / 2008
  • We propose a new query expansion method in the extended Boolean model that improves precision without degrading recall. To improve precision, our method promotes the ranks of documents having more query terms, since users typically prefer such documents. The proposed method consists of the following three steps: (1) expanding the query by adding new terms related to each term of the query, (2) further expanding the query by adding augmented terms, which are conjunctions of the terms, and (3) assigning a weight to each term so that augmented terms have higher weights than the other terms. We conduct extensive experiments to show the effectiveness of the proposed method. The experimental results show that the proposed method improves precision by up to 102% for the TREC-6 data compared with the existing query expansion method using a thesaurus proposed by Kwon et al.
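
The three steps in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `expand_query` and `score`, the thesaurus dictionary `related`, the numeric weights, and the additive scoring are all assumptions chosen only to show how conjunction ("augmented") terms can promote documents that contain more of the query terms.

```python
from itertools import combinations


def expand_query(query_terms, related, base_weight=1.0,
                 related_weight=0.5, augmented_weight=2.0):
    """Return a dict mapping term tuples to weights (all weights are hypothetical)."""
    weights = {}
    # Step 1: add terms related to each query term (from an assumed thesaurus).
    for term in query_terms:
        for rel in related.get(term, []):
            weights.setdefault((rel,), related_weight)
    for term in query_terms:
        weights[(term,)] = base_weight
    # Step 2: add augmented terms, i.e. conjunctions of the original query terms.
    # Step 3: give augmented terms a higher weight than the other terms.
    for size in range(2, len(query_terms) + 1):
        for combo in combinations(sorted(query_terms), size):
            weights[combo] = augmented_weight
    return weights


def score(document_terms, weights):
    """Sum the weights of every (possibly conjunctive) term fully present in the document."""
    return sum(w for combo, w in weights.items()
               if all(t in document_terms for t in combo))


weights = expand_query(["boolean", "model"], {"boolean": ["logic"]})
print(score({"boolean", "model"}, weights))   # document with both query terms
print(score({"boolean", "logic"}, weights))   # document with one query term plus a related term
```

Under these assumed weights, the document containing both query terms receives the extra weight of the conjunction and ranks above the one that matches a single query term and a related term.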

A study on improving the effectiveness of a boolean retrieval system with feedback information (피드백 정보를 이용한 불논리 검색 시스템의 성능 증진에 관한 실험적 연구)

  • 신은자;정영미
    • Journal of the Korean Society for Information Management / v.15 no.1 / pp.129-148 / 1998
  • The objective of this study is to develop a useful relevance feedback retrieval technique that can be applied to current Boolean retrieval systems. To achieve this objective, a feedback retrieval technique based on a user model is recommended. To demonstrate the usefulness of this technique, two enhanced Boolean retrieval models, the DNF model and the P-norm model, were first evaluated through retrieval effectiveness experiments. After the DNF model was selected as the retrieval model, two feedback retrieval experiments were performed using initial and extended user models. The results show that feedback retrieval based on a user model can greatly enhance the effectiveness of a Boolean retrieval system with only a small modification.
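
For readers unfamiliar with the P-norm model named above, the following is a minimal sketch of the standard p-norm similarity functions of the extended Boolean model, restricted to unweighted query terms. The formulas are the textbook formulation, not something stated in this abstract, and the function names `pnorm_or` and `pnorm_and` are illustrative.

```python
def pnorm_or(term_weights, p=2.0):
    """Similarity of a document to (t1 OR t2 OR ...), given its term weights in [0, 1]."""
    n = len(term_weights)
    return (sum(w ** p for w in term_weights) / n) ** (1.0 / p)


def pnorm_and(term_weights, p=2.0):
    """Similarity of a document to (t1 AND t2 AND ...), given its term weights in [0, 1]."""
    n = len(term_weights)
    return 1.0 - (sum((1.0 - w) ** p for w in term_weights) / n) ** (1.0 / p)


# A document that contains only one of two query terms:
print(pnorm_or([1.0, 0.0]))    # partial credit for OR
print(pnorm_and([1.0, 0.0]))   # partial, but lower, credit for AND
```

With p = 1 both functions reduce to the mean of the term weights, and as p grows they approach the strict max (OR) and min (AND) behavior of fuzzy Boolean evaluation.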

A Study on an Extended Cyber Attack Tree for an Analysis of Network Vulnerability (네트워크 취약성 분석을 위한 확장된 사이버 공격 트리에 관한 연구)

  • Eom, Jung Ho;Park, Seon Ho;Chung, Tai M.
    • Journal of Korea Society of Digital Industry and Information Management / v.6 no.3 / pp.49-57 / 2010
  • We extend the general attack tree so that it can be applied to a cyber attack model for network vulnerability analysis. We define an extended cyber attack tree (E-CAT), which extends the general attack tree by associating each node of the tree with a transition of attack that could have contributed to the cyber attack. The E-CAT resolves the limitation that a general attack tree cannot express complex and sophisticated attacks. First, a Boolean expression can express an attack scenario simply, with symbols and codes. Second, an Attack Generation Probability is used to select an attack method in the attack tree. A CONDITION-composition can express new and modified attack transitions that a general attack tree cannot express. When applied to a cyber attack model, the E-CAT provides flexibility of attack and improves the attack success rate.

Document ranking methods using term dependencies from a thesaurus (시소러스의 연관성 정보를 이용한 문서의 순위 결정 방법)

  • 이준호
    • Journal of the Korean Society for Information Management / v.10 no.2 / pp.3-22 / 1993
  • In recent years, various document ranking methods such as Relevance, R-Distance, and K-Distance have been developed for use in thesaurus-based Boolean retrieval systems. They give high-quality document rankings in many cases by using term dependence information from a thesaurus. However, they suffer from several problems resulting from inefficient and ineffective evaluation of the Boolean operators AND, OR, and NOT. In this paper, we propose new thesaurus-based document ranking methods called KB-FSM and KB-EBM, which exploit the enhanced fuzzy set model and the extended Boolean model. The proposed methods overcome the problems of the previous methods and use term dependencies from a thesaurus effectively. We also show through performance comparison that KB-FSM and KB-EBM provide higher retrieval effectiveness than Relevance, R-Distance, and K-Distance.

Boolean Query Formulation From Korean Natural Language Queries using Syntactic Analysis (구문분석에 기반한 한글 자연어 질의로부터의 불리언 질의 생성)

  • Park, Mi-Hwa;Won, Hyeong-Seok;Lee, Geun-Bae
    • Journal of KIISE:Software and Applications / v.26 no.10 / pp.1219-1229 / 1999
  • There is considerable evidence that trained searchers achieve good search effectiveness with Boolean queries, because a structured Boolean query containing operators such as AND, OR, and NOT can represent a user's information need more accurately. However, it is not easy for ordinary users to express the information they want in Boolean form. In this paper, we propose a Boolean query formulation method that automatically transforms a user's natural language query into an extended Boolean query, for both retrieval effectiveness and user convenience. First, the natural language query is syntactically analyzed using a KCCG (Korean Combinatory Categorial Grammar) parser, and the resulting syntactic trees are structurally simplified using a tree-simplifying mechanism in order to capture the logical relationships between keywords. Next, plausible noun phrases are identified in the simplified tree and added to it as additional keywords, and the keywords are assigned weights. Finally, the simplified syntactic tree is automatically converted into a Boolean query using mapping rules and linguistic heuristics. We also propose an N-BEST average method that generates a Boolean query from each of the top N syntactic trees, to compensate for the adverse effect of a single incorrect top-ranked tree. In experiments using KTSET2.0, the proposed method outperformed a traditional vector space model by 23% and, surprisingly, manually constructed Boolean queries by 8%.
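
The N-BEST average idea can be sketched as follows. The abstract does not specify how the results of the N queries are combined, so this sketch simply assumes that each of the top N syntactic trees yields one Boolean query, that each query assigns a score to each document, and that the per-document scores are averaged (with a missing score counted as 0). The function name `n_best_average` and the input layout are hypothetical.

```python
def n_best_average(scores_per_query):
    """Average document scores over the queries built from the top N parse trees.

    scores_per_query: list of dicts, one per query, mapping document id -> score.
    Returns (document id, averaged score) pairs, best first.
    """
    n = len(scores_per_query)
    if n == 0:
        return []
    totals = {}
    for scores in scores_per_query:
        for doc_id, value in scores.items():
            totals[doc_id] = totals.get(doc_id, 0.0) + value
    # Documents missing from a query's result contribute 0 for that query (an assumption).
    averaged = [(doc_id, total / n) for doc_id, total in totals.items()]
    return sorted(averaged, key=lambda pair: pair[1], reverse=True)


# Scores from queries generated from the top two parse trees (made-up numbers).
ranking = n_best_average([{"d1": 0.8, "d2": 0.2}, {"d1": 0.4, "d2": 0.6}])
print(ranking)
```

Averaging over several parses means that one badly parsed query can lower a relevant document's score but cannot remove it from the ranking on its own.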

A Study on the Efficient Configuration Thread Control Modeling in Version Control using Object Oriented System (객체지향 시스템을 이용한 버전제어에서 효율적인 형상 형성 제어 모델링에 관한 연구)

  • Oh, Sang-Yeob
    • Journal of the Korea Society of Computer and Information / v.10 no.4 s.36 / pp.123-132 / 2005
  • A version control system is used for programs developed in rapidly changing or complicated environments. Support for configuration thread information, and the way that information is processed, play an important part in version control. Configuration thread tools such as the system model of DSEE, the view of ClearCase, the label of SourceSafe, and the package of CCC/Harvest apply configuration rules formalized by the user to obtain the desired configuration information for a version. With this approach, however, configuration thread information cannot be supported unless the user knows the exact, well-defined configuration rules and the predefined information. This information also has the drawback that it cannot support close connections with undefined versions and meta-information. In this paper, we model a system that addresses these problems and supports configuration threads efficiently. We also propose a mixed retrieval model, combining a Boolean retrieval model and a vector retrieval model, to support configuration thread information efficiently. We designed and applied the libraries using an extended facet method.

Material feature representation and identification with composite surfacelets

  • Huang, Wei;Wang, Yan;Rosen, David W.
    • Journal of Computational Design and Engineering / v.3 no.4 / pp.370-384 / 2016
  • Computer-aided materials design requires new modeling approaches to characterize and represent fine-grained geometric structures and material compositions at multiple scales. Recently, a dual-Rep approach was developed to model materials microstructures based on a new basis function, called surfacelet. As a combination of implicit surface and wavelets, surfacelets can efficiently identify and represent planar, cylindrical, and ellipsoidal geometries in material microstructures and describe the distribution of compositions and properties. In this paper, these primitive surfacelets are extended and composite surfacelets are proposed to model more complex geometries. Composite surfacelets are constructed by Boolean operations on the primitives. The surfacelet transform is applied to match geometric features in three-dimensional images. The composition of the material near the identified features can then be modeled. A cubic surfacelet and a v-joint surfacelet are developed to demonstrate the reverse engineering process of retrieving material compositions from material images.

Three-Dimensional Digital-Mold Modeling and Sand-Printing for Replication of Bronze Mirror

  • Jo, Young Hoon;Lee, Jungmin
    • Journal of Conservation Science / v.37 no.1 / pp.25-33 / 2021
  • To extend the application of digital technology to the replication of artifacts, meticulous details of the process and the diversity of three-dimensional (3D) printing output materials need to be supplemented. Thus, in this study, a bronze mirror with Hwangbichangcheon inscription was digitalized by 3D scanning, converted into a voxel model, and virtual conservation treatment was performed using a haptic device. Furthermore, the digital mold of the bronze mirror completed by Boolean modeling was printed using a 3D sand-printer. Such contactless replication based on digital technology reflects the stability, precision, expressivity, collectivity, durability, and economic feasibility of artifacts. Its application can be further extended to cultural products as well as such areas as education, exhibition, and research. It is expected to be in high demand for metal artifacts that require casting. If empirical studies through experimental research on casting are supplemented in the future, it could extend the application of digital technology-based contactless replication methods.