• Title/Abstract/Keywords: search keyword

Search Result 1,017, Processing Time 0.027 seconds

Conjunctive Boolean Query Optimization based on Join Sequence Separability in Information Retrieval Systems (정보검색시스템에서 조인 시퀀스 분리성 기반 논리곱 불리언 질의 최적화)

  • 박병권;한욱신;황규영
    • Journal of KIISE:Databases
    • /
    • v.31 no.4
    • /
    • pp.395-408
    • /
    • 2004
  • A conjunctive Boolean text query is a query that searches for text documents containing all of the specified keywords, and it is the most frequently used query form in information retrieval systems. Typically, the query specifies a long list of keywords for better precision, and in this case the order of keyword processing has a significant impact on query speed. Currently known approaches to this ordering are based on heuristics and therefore cannot guarantee an optimal ordering. A systematic approach is possible by leveraging a database query processing algorithm such as dynamic programming, but it is not suitable for a text query with a typically long list of keywords because of the algorithm's exponential run time ($O(n2^{n-1})$) for $n$ keywords. Considering these problems, we propose a new approach based on a property called join sequence separability. This property states that the optimal join sequence is separable into two subsequences of different join methods under a certain condition on the joined relations, and this property enables us to find a globally optimal join sequence in $O(n2^{n-1})$. In this paper we describe the property formally, present an optimization algorithm based on the property, prove that the algorithm finds an optimal join sequence, and validate our approach through simulation using an analytic cost model. Comparison with the heuristic text query optimization approaches shows up to 100 times faster query processing, and comparison with the dynamic programming approach shows exponentially faster query optimization (e.g., 600 times for a 10-keyword query).
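To make the role of keyword ordering concrete, the following is a minimal sketch of conjunctive query evaluation over an inverted index. It uses the common shortest-posting-list-first heuristic, which is the kind of heuristic ordering the abstract contrasts itself with, not the separability-based algorithm of the paper. The index contents and the names `intersect_sorted`, `conjunctive_query`, and `toy_index` are hypothetical.

```python
from typing import Dict, List


def intersect_sorted(a: List[int], b: List[int]) -> List[int]:
    """Merge-intersect two ascending lists of document IDs."""
    i = j = 0
    out: List[int] = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def conjunctive_query(index: Dict[str, List[int]], keywords: List[str]) -> List[int]:
    """Return IDs of documents containing every keyword.

    Heuristic ordering (an assumption of this sketch): process the
    shortest posting lists first so intermediate results stay small.
    """
    postings = [index.get(k, []) for k in keywords]
    if not postings:
        return []
    postings.sort(key=len)
    result = postings[0]
    for p in postings[1:]:
        if not result:  # nothing can match any more, stop early
            break
        result = intersect_sorted(result, p)
    return result


# Hypothetical toy index: keyword -> ascending document IDs.
toy_index = {
    "query": [1, 2, 3, 5, 8, 9],
    "join": [2, 5, 9],
    "optimization": [1, 2, 4, 5, 7, 9, 10],
}
print(conjunctive_query(toy_index, ["query", "optimization", "join"]))
```

Sorting by list length is cheap but only looks at one statistic per keyword, which is why such an ordering cannot be guaranteed optimal.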

Patent data analysis using clique analysis in a keyword network (키워드 네트워크의 클릭 분석을 이용한 특허 데이터 분석)

  • Kim, Hyon Hee;Kim, Donggeon;Jo, Jinnam
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.5
    • /
    • pp.1273-1284
    • /
    • 2016
  • In this paper, we analyzed patents on machine learning using keyword network analysis and clique analysis. To construct a keyword network, important keywords were extracted based on their TF-IDF weights and associations, and network structure analysis and clique analysis were performed. The density and clustering coefficient of the patent keyword network are low, which shows that patent keywords on machine learning are weakly connected with each other. This is because important machine learning patents are mainly registered for application systems of machine learning rather than for machine learning techniques. Our clique analysis also showed that the keywords found by cliques in 2005 patents cover subjects such as newsmaker verification, product forecasting, virus detection, biomarkers, and workflow management, while those in 2015 patents cover subjects such as digital imaging, payment cards, calling systems, mammogram systems, and price prediction. Clique analysis can be used not only for identifying specialized subjects, but also for suggesting search keywords in patent search systems.
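As an illustration of what clique analysis on a keyword network involves, here is a small sketch that enumerates maximal cliques with the basic Bron-Kerbosch procedure. The edge list is invented for the example and is not taken from the patent data; how the authors built and analyzed their network is not specified beyond the abstract, so this should be read as one possible approach.

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_graph(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from keyword co-occurrence edges."""
    adj: Dict[str, Set[str]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj: Dict[str, Set[str]]) -> List[Set[str]]:
    """Enumerate all maximal cliques (basic Bron-Kerbosch, no pivoting)."""
    cliques: List[Set[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        if not p and not x:
            cliques.append(set(r))  # r cannot be extended: it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}  # v is fully explored
            x = x | {v}  # exclude v from later cliques at this level

    expand(set(), set(adj), set())
    return cliques


# Hypothetical keyword co-occurrence edges.
edges = [
    ("virus", "detection"),
    ("virus", "malware"),
    ("detection", "malware"),
    ("price", "prediction"),
    ("prediction", "detection"),
]
for clique in maximal_cliques(build_graph(edges)):
    print(sorted(clique))
```

Each maximal clique is a group of keywords that all co-occur with one another, which is what makes cliques candidates for specialized subjects.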

A Study on the Intellectual Structure Networks of International Collaboration in Psychiatry (정신의학 분야 국제공동연구의 지적구조 네트워크 분석)

  • Kim, Eunju;Roh, Sungwon;Nam, Taewoo
    • Journal of the Korean Society for Information Management
    • /
    • v.33 no.1
    • /
    • pp.53-84
    • /
    • 2016
  • This study clarified the intellectual structure of international collaboration in psychiatry through network analysis, in order to vitalize international collaboration in psychiatry in South Korea. The data set was collected from the Web of Science citation database for the period from 2009 to 2013. The search query SU="psychiatry" (denoting the field of psychiatric medical research) was used through the advanced search function, and a total of 18,590 articles were selected among international collaborations. A total of 85 different keywords were selected from the 18,590 articles, and the results of the analysis were as follows. First, this study examined sub-subject areas focusing on disorders, and found that the major subject areas could be divided into a total of 8 sub-subject areas. Second, this study identified 6 keywords that have a strong impact and extend subject areas by promoting intermediation between other keywords. Third, this study examined sub-subject areas by using the Knowledge Classification Scheme of the National Research Foundation of Korea through community analysis, and found a total of 15 clusters and a total of 12 sub-subject areas.

Implementation of a Content-Based Image Retrieval System with Color Assignments (칼라 지정을 이용한 내용기반 화상검색 시스템 구현)

  • Kim, Cheol-Won;Choi, Ki-Ho
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.4
    • /
    • pp.933-943
    • /
    • 1997
  • In this paper, a content-based image retrieval system with color assignments has been studied and implemented. The colors of images are extracted after converting the RGB color space to HSV (hue, saturation, value), the color space most compatible with human perception. In color extraction, an image is divided into 9 different areas, and 3 major colors for each area are selected by using color histograms. The class of images can be chosen by keywords. We evaluate four different types of queries: an image input, keywords with color assignments, a combination of an image input and keywords with color assignments, and selection of a specific part of an image. Experimental results show that the four query types provide precision/recall of 0.55/0.37, 0.57/0.43, 0.59/0.45, and 0.63/0.61, respectively. With color assignments, the retrieval system has been able to obtain high performance and validity.
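The color extraction step described above can be sketched roughly as follows. The abstract does not say how the 9 areas are laid out or how the histogram is binned, so the 3x3 grid, the 12 hue bins, and the function names here are assumptions made for illustration only.

```python
import colorsys
from collections import Counter
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # 8-bit RGB


def hue_bin(pixel: Pixel, bins: int = 12) -> int:
    """Convert an RGB pixel to HSV and return its hue histogram bin."""
    r, g, b = (c / 255.0 for c in pixel)
    h, _s, _v = colorsys.rgb_to_hsv(r, g, b)  # h is in [0, 1]
    return min(int(h * bins), bins - 1)


def major_colors(region: List[Pixel], top: int = 3, bins: int = 12) -> List[int]:
    """Return the `top` most frequent hue bins of a region."""
    hist = Counter(hue_bin(p, bins) for p in region)
    return [b for b, _count in hist.most_common(top)]


def split_3x3(image: List[List[Pixel]]) -> List[List[Pixel]]:
    """Split an image (rows of pixels) into 9 areas on an assumed 3x3 grid."""
    h, w = len(image), len(image[0])
    regions: List[List[Pixel]] = []
    for gy in range(3):
        rows = image[gy * h // 3:(gy + 1) * h // 3]
        for gx in range(3):
            lo, hi = gx * w // 3, (gx + 1) * w // 3
            regions.append([px for row in rows for px in row[lo:hi]])
    return regions


# Hypothetical 6x6 image: mostly red, with a green column and one blue pixel.
red, green, blue = (255, 0, 0), (0, 255, 0), (0, 0, 255)
image = [[red] * 5 + [green] for _ in range(6)]
image[0][0] = blue
features = [major_colors(area) for area in split_3x3(image)]
print(features)
```

Comparing such per-area color lists between a query and stored images is one simple way a color-assignment query could be matched; the similarity measure itself is not given in the abstract.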

An Efficient Retrieval Technique for Spatial Web Objects (공간 웹 객체의 효율적인 검색 기법)

  • Yang, PyoungWoo;Nam, Kwang Woo
    • Journal of KIISE
    • /
    • v.42 no.3
    • /
    • pp.390-398
    • /
    • 2015
  • Spatial web objects are web documents that contain geographic information. Recently, services that create spatial web objects have increased greatly because of advancements in devices such as smartphones. In services such as Twitter or Facebook, simple texts posted by users are stored along with information about the post's location. To search for such spatial web objects, a method that uses spatial information and text information simultaneously is required. Conventional spatial web object search methods mostly use R-tree and inverted file methods. However, these methods have the disadvantage of requiring a large volume of space when building indices. Furthermore, such methods are efficient for searching with many keywords but inefficient for searching with a few keywords. In this paper, we propose a spatial web object search method that uses a quad-tree and a patricia-trie. We show that the proposed technique is more effective than existing ones in searching with a small number of keywords. Furthermore, we show through an experiment that the space required by the proposed technique is much smaller than that required by existing ones.

Adaptive Web Search based on User Web Log (사용자 웹 로그를 이용한 적응형 웹 검색)

  • Yoon, Taebok;Lee, Jee-Hyong
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.15 no.11
    • /
    • pp.6856-6862
    • /
    • 2014
  • Web usage mining is a method for extracting meaningful patterns from web users' log data. Most existing web usage mining approaches, however, create general models without considering users' diverse inclinations. Web users' keywords can have a variety of meanings depending on their tendencies and background knowledge. This study evaluated the extraction of web-user patterns after collecting and analyzing web usage information on the users' keywords of interest. A web-user pattern can supply a web page network with various inclination information based on the users' keywords of interest. In addition, the web-user pattern can be used to recommend the most appropriate web pages, and the experiment confirmed that the suggested method is useful.

Retrieval Model using Subject Classification Table, User Profile, and LSI (전공분류표, 사용자 프로파일, LSI를 이용한 검색 모델)

  • Woo Seon-Mi
    • The KIPS Transactions:PartD
    • /
    • v.12D no.5 s.101
    • /
    • pp.789-796
    • /
    • 2005
  • Because existing information retrieval systems, in particular library retrieval systems, use 'exact keyword matching' against the user's query, they present the user with massive results that include irrelevant information. As a result, the user spends extra effort and time to find the relevant information in the results. This paper therefore proposes SULRM, a retrieval model using a Subject Classification Table, User profile, and LSI (Latent Semantic Indexing), to provide more relevant results. SULRM applies a document filtering technique to classified data and a document ranking technique to non-classified data in the results of keyword-based retrieval. The filtering technique uses the Subject Classification Table, and the ranking technique uses the user profile and LSI. We performed experiments on the performance of the filtering technique, the user profile updating method, and the document ranking technique, using the results of the information retrieval system of our university's digital library system. When many documents are retrieved, the proposed techniques are able to provide the user with filtered and ranked data according to the user's subject and preference.

Effective Searchable Symmetric Encryption System using Conjunctive Keyword on Remote Storage Environment (원격 저장소 환경에서 다중 키워드를 이용한 효율적인 검색 가능한 대칭키 암호 시스템)

  • Lee, Sun-Ho;Lee, Im-Yeong
    • The KIPS Transactions:PartC
    • /
    • v.18C no.4
    • /
    • pp.199-206
    • /
    • 2011
  • Removable storage offers excellent portability, being lightweight and small enough to fit in one's hand, and many users have recently turned their attention to high-capacity products. However, because it is so easily carried, removable storage is frequently lost or stolen, which has caused many problems such as the leaking of private information to the public. The advent of remote storage services, where data is stored over the network, has allowed an increasing number of users to access data. The main data of many users is stored together on remote storage, but this carries the risk of disclosure by an unethical administrator or attacker. To solve this problem, encryption of the data stored on the server has become necessary, and a searchable encryption system is needed for efficient retrieval of encrypted data. However, existing searchable encryption systems suffer from low efficiency in document insert/delete operations and in multi-keyword search. In this paper, an efficient searchable encryption system is proposed.

Semantic Ontology Speech Information Extraction using Non-parametric Correlation Coefficient (비모수적 상관계수를 이용한 시맨틱 온톨로지 음성 정보 추출)

  • Lee, Byungwook
    • Journal of Digital Convergence
    • /
    • v.11 no.9
    • /
    • pp.147-151
    • /
    • 2013
  • When high-frequency keywords are retrieved in an information retrieval system, results often fail to match the user's request because keywords carry various meanings in existing ontology configurations. This paper constructs a personnel selection ontology and rules for personnel management, composed of various concepts and knowledge based on semantic web technology, and suggests selection procedures to support these rules together with a knowledge retrieval system to verify the suitability of selection results. The system extracts speech features using a non-parametric correlation coefficient. The proposed method was validated by an experimental evaluation in which the average SNR decreased by 0.752 dB.

Design of Keyword Extraction System Using TFIDF (TFIDF를 이용한 키워드 추출 시스템 설계)

  • 이말례;배환국
    • Korean Journal of Cognitive Science
    • /
    • v.13 no.1
    • /
    • pp.1-11
    • /
    • 2002
  • In this paper, a test was performed to determine whether words in anchor text were appropriate as keywords. The test showed that some words were appropriate and had high weighting factors, while others did not even appear in the text and were therefore not appropriate as keywords. To resolve this problem, a new method was proposed to extract keywords. Using the proposed method, inappropriate keywords can be removed and new keywords set, and ranking then becomes possible with the TFIDF value as the weighting factor of each keyword. It was verified that the new method has higher accuracy than previous methods.
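The idea of discarding anchor-text words that never occur in the page and ranking the rest by TFIDF can be sketched as below. This is a minimal illustration under assumptions: the smoothed IDF formula, the toy corpus, and the name `tfidf_rank` are not from the paper, whose exact weighting scheme is not given in the abstract.

```python
import math
from collections import Counter
from typing import List, Tuple


def tfidf_rank(candidates: List[str], doc: List[str],
               corpus: List[List[str]]) -> List[Tuple[str, float]]:
    """Rank candidate keywords for `doc` by TF-IDF.

    Candidates that do not occur in the document are dropped.
    The smoothed IDF used here is an assumption of this sketch.
    """
    tf = Counter(doc)
    n_docs = len(corpus)
    scored: List[Tuple[str, float]] = []
    for word in set(candidates):
        if tf[word] == 0:
            continue  # candidate never appears in the text: not a keyword
        df = sum(1 for d in corpus if word in d)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scored.append((word, tf[word] / len(doc) * idf))
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda t: (-t[1], t[0]))


# Hypothetical tokenized pages and anchor-text candidates.
corpus = [
    ["search", "engine", "index", "search"],
    ["search", "query"],
    ["index", "ranking"],
]
anchor_words = ["search", "index", "homepage"]
for word, score in tfidf_rank(anchor_words, corpus[0], corpus):
    print(f"{word}: {score:.3f}")
```

In this toy run the candidate "homepage" is removed because it is absent from the page text, and the remaining candidates are ordered by their TF-IDF weight.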