Search | Korea Science

Partitioning and Merging an Index for Efficient XML Keyword Search (효율적 XML키워드 검색을 인덱스 분할 및 합병)

Kim, Sung-Jin;Lee, Hyung-Dong;Kim, Hyoung-Joo
- Journal of KIISE:Databases / v.33 no.7 / pp.754-765 / 2006
In XML keyword search, a search result is defined as the set of smallest elements (i.e., least common ancestors) containing all query keywords, and the granularity of indexing is an XML element rather than a document. Under the conventional index structure, all least common ancestors produced by combining the elements, each of which contains a query keyword, are considered as a search result. In this paper, to avoid unnecessary operations for producing least common ancestors and to reduce query processing time, we describe a way to construct a partitioned index composed of several partitions and to produce a search result by merging those partitions when necessary. When a search result is restricted to least common ancestors whose depths are higher than a given minimum depth, search systems using the proposed partitioned index structure can reduce query processing time by considering only combinations of elements belonging to the same partition. Even when the minimum depth is not given or is unknown, search systems can obtain a search result with the partitioned index in the same query processing time as with a non-partitioned index. Our experiment was conducted with XML documents provided by the DBLP site and INEX2003, and the partitioned index substantially reduced query processing time when the minimum depth was given.

Resolving the Ambiguities in Word Sense by using Automatic Keyword Network in Information Retrieval (정보검색에서의 어의 중의성 해소를 위한 자동 키워드망의 이용)

Kim, Jung-Sae;Jang, Duk-Sung
- The Transactions of the Korea Information Processing Society / v.7 no.12 / pp.3855-3865 / 2000
Automatic indexing is an essential part of a text retrieval system. However, automatic indexing alone cannot reliably rank the appropriate texts at the top, and it is even harder to keep inappropriate texts containing homonyms from being ranked at the top. In this paper, we propose a two-level retrieval system to enhance retrieval efficiency, in which an Automatic Keyword Network (AKN) is used in the second-level process. The first-level search is carried out with an inverted index file generated by automatic indexing. The second-level search exploits the AKN, which is based on the degree of association between terms. We have developed several formulas for rearranging the rank of texts in the second-level search and evaluated their effect on resolving word sense ambiguities.

A Study on Natural Language Keyword Indexing for Web-based Information Retrieval (웹기반 정보검색을 위한 자연어 키워드 색인에 관한 연구)

윤성희
- Journal of the Korea Computer Industry Society / v.4 no.12 / pp.1103-1111 / 2003
Information retrieval systems that index and match on single keywords are simple and popular. However, single-keyword matching makes it very hard to represent the exact meaning of documents, and the set of retrieved documents is very large, so it cannot satisfy users of information retrieval systems. This paper proposes a phrase-based indexing system built on the phrase, a larger syntactic unit than a single keyword. Because Web documents contain many syntactic errors, a high-quality natural language parser cannot be expected on the Web. Partial trees produced by fully bottom-up parsing, even when they do not form a full tree, are still useful for extracting phrases, and phrases are much more discriminative than single keywords as index terms. This helps the information retrieval system improve its efficiency and reduce processing overhead.

Usability evaluation of navigation aid for searching menu items on mobile phone (휴대전화를 위한 메뉴검색 지원도구의 사용성 평가)

Park, Won-Kyu;Han, Sung-H.;Chae, Byung-Kee;Cha, Joo-Hyoung;Kim, Se-Na
- Proceedings of the HCI Society of Korea Conference (한국HCI학회:학술대회논문집) / 2006.02b / pp.169-174 / 2006
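Recent mobile phones can perform a wide variety of tasks beyond voice calls, such as sending and receiving messages and e-mail and taking photos and videos, and their functionality continues to expand. However, many users have difficulty searching menus because menu item names are abbreviated due to limited screen space, the number of menu items has grown, and menu structures have become more complex. To address these problems, this study developed a keyword search method and a similar-keyword search method, in addition to the existing search by menu navigation and the method of presenting sub-menu items, and conducted a usability evaluation experiment on the four menu search methods. The experiment showed no significant difference among the menu search methods in terms of performance, but a statistically significant difference in terms of user satisfaction. Of the four methods, the similar-keyword method was the most preferred in terms of user satisfaction, and it is expected to improve user satisfaction if applied to actual mobile phones in the future.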

Design and Implementation of Ontology Based Search System for Problem Based Learning (문제해결학습을 위한 온톨로지 기반 검색 시스템의 설계 및 구현)

Choi, Suk-Young;Kim, Min-Jung;Ahn, Seong-Hun
- The Journal of the Korea Contents Association / v.6 no.12 / pp.177-185 / 2006
Learners need a great deal of time and effort to search for information for problem solving. This is because the web-based search systems used so far rely on simple keyword matching, which retrieves information only by checking whether it matches the keyword. As a result, learners spend much time and effort searching for information and may lose their bearings. To solve this problem, we design and implement an ontology-based search system. The system was applied to PBL in social studies for middle school students. As a result, the system proved more effective than the web-based search systems used so far.

Keyword Extraction from News Corpus using Modified TF-IDF (TF-IDF의 변형을 이용한 전자뉴스에서의 키워드 추출 기법)

Lee, Sung-Jick;Kim, Han-Joon
- The Journal of Society for e-Business Studies / v.14 no.4 / pp.59-73 / 2009
Keyword extraction is an essential technique for text mining applications such as information retrieval, text categorization, summarization, and topic detection. A set of keywords extracted from large-scale electronic document data is used as significant features for text mining algorithms and helps improve the performance of document browsing, topic detection, and automated text classification. This paper presents a keyword extraction technique that can be used to detect topics for each news domain from a large document collection of internet news portal sites. We use six variants of the traditional TF-IDF weighting model. On top of the TF-IDF model, we propose a word filtering technique called 'cross-domain comparison filtering'. To demonstrate the effectiveness of our method, we analyze the usefulness of keywords extracted from Korean news articles and present the changes in keywords over time for each news domain.

Personal Information Searching System using Dynamic Indexing and Korean Contents Based Search (동적 색인과 한국어 내용 기반 검색을 이용한 개인용 검색 시스템)

Kim, Yun-Tae;Kim, Ji-Won;Son, Su-Jeong;Lee, Hyun-Ah
- Annual Conference on Human and Language Technology / 2018.10a / pp.639-641 / 2018
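As the classical directory-based classification has made it increasingly difficult to find desired information quickly, keyword-based search systems are becoming central to information processing. This paper proposes a keyword-based information retrieval system for fast data search on personal computers. The system provides search results faster than existing systems through dynamic indexing. It not only offers users a convenient environment by including content-based search and document search across various formats, but also aims to provide smooth search over documents containing Korean sentences. A performance comparison confirmed that the system can detect more documents in less time than existing systems.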

A Study on Secure Searchable Encryption Considering Index Structure (색인 구조를 고려한 안전한 검색가능암호기술에 관한 연구)

Lee, Sun-Ho;Lee, Im-Yeong
- Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.738-739 / 2013
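With advances in network and computing technology, cloud storage services, which let users outsource the storage of data and process it anytime and anywhere from a variety of devices, are becoming widespread. However, if sensitive outsourced information is stored without encryption, attackers and unethical server administrators can view the data stored on the server without the data owner's consent. This led to the emergence of searchable encryption systems, which encrypt stored data and allow it to be searched. In existing searchable encryption systems, the trapdoors generated to search for the same keyword take the same form, so an attacker can learn from search queries what data a user stores and searches for. This paper proposes a searchable encryption system using one-time trapdoors, in which a different trapdoor is generated each time even when the user searches for the same keyword, so that an unethical server administrator cannot infer the search content or the data from search queries.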
https://doi.org/10.3745/PKIPS.y2013m11a.738

Content-based Extended CAN to Support Keyword Search (키워드 검색 지원을 위한 컨텐츠 기반의 확장 CAN)

Park, Jung-Soo;Lee, Hyuk-ro;U, Uk-dong;Jo, In-june
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.103-109 / 2005
Research on P2P systems has recently attracted considerable attention as the architecture has moved from early centralized P2P to decentralized P2P. In particular, DHT-based structured P2P systems have attracted attention for their scalability, systematic search, and high search efficiency through routing. However, DHT-based structured P2P systems have a problem: files can be located only by their unique File IDs, even though users may wish to search for files using a set of descriptive keywords or may not have the exact File ID of the files. This paper proposes an extended-CAN mechanism that creates content-based File IDs and uses KID and CKD for common keyword processing to support keyword search in DHT-based P2P systems.

Relevance Feedback Agent for Improving Precision in Korean Web Information Retrieval System (한국어 웹 정보검색 시스템의 정확도 향상을 위한 연관 피드백 에이전트)

Baek, Jun-Ho;Choe, Jun-Hyeok;Lee, Jeong-Hyeon
- The Transactions of the Korea Information Processing Society / v.6 no.7 / pp.1832-1840 / 1999
Since existing Korean Web IR systems generally use a Boolean model, it is difficult to retrieve the desired information in a single query. Also, because web documents contain frequent abbreviations and many links, keyword extraction using inverse document frequency extracts improper keywords, adding to the problem of ambiguous meaning. Users must therefore modify their queries repeatedly until they get the proper information. In this paper, we design and implement a relevance feedback agent system to resolve these problems. The relevance feedback agent system extracts the proper information in response to the user's preferred keywords and stores these keywords in a preference DB table. When users retrieve this information later, the relevance feedback agent system searches for it by adding relevant keywords to the user's queries. With this method, the system can reduce the number of query modifications and improve the efficiency of the IR system.

Search Result 1,014, Processing Time 0.032 seconds
