Search | Korea Science

Boolean Query Formulation From Korean Natural Language Queries using Syntactic Analysis (구문분석에 기반한 한글 자연어 질의로부터의 불리언 질의 생성)

Park, Mi-Hwa;Won, Hyeong-Seok;Lee, Geun-Bae
- Journal of KIISE:Software and Applications
- v.26 no.10
- pp.1219-1229
- 1999
There is considerable evidence that trained searchers achieve good retrieval effectiveness with boolean queries, because a structured query built from operators such as AND, OR, and NOT can represent a user's information need more accurately. Ordinary users, however, are not accustomed to expressing what they want in boolean form. In this paper, we propose a query formulation method that automatically transforms a user's natural language query into an extended boolean query, aiming at both effectiveness and user convenience. First, the natural language query is parsed with a KCCG (Korean Combinatory Categorial Grammar) parser, and the resulting syntactic trees are simplified by extracting operator and keyword information so that the logical relationships between keywords are captured. Next, plausible noun phrases are identified in the simplified tree and added to it as additional keywords, and the keywords are assigned weights. Finally, the simplified tree is converted into a boolean query using mapping rules and linguistic heuristics, and retrieval is performed with that query. We also propose an N-BEST average method, which generates a boolean query from each of the top N syntactic trees to limit the damage caused by a single incorrect top-ranked parse. In experiments on KTSET2.0, a test collection for information retrieval experiments, the proposed method outperformed a traditional vector space model based natural language query system by 23% and, surprisingly, manually constructed boolean queries by 8%.
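
The N-BEST average idea in this abstract can be illustrated with a minimal sketch. It assumes, and the abstract does not confirm, that each of the top N parse trees yields one boolean query, that each query returns a score per document, and that the final score is the plain mean over the N result lists. The function name `n_best_average` and the dictionary format are hypothetical.

```python
from collections import defaultdict

def n_best_average(score_lists):
    """Average per-document scores over the result lists of N queries.

    score_lists: one dict {doc_id: score} per parse tree (hypothetical format).
    A document missing from a list is treated as scoring 0 for that query.
    """
    if not score_lists:
        return []
    totals = defaultdict(float)
    for scores in score_lists:
        for doc_id, score in scores.items():
            totals[doc_id] += score
    n = len(score_lists)
    averaged = {doc_id: total / n for doc_id, total in totals.items()}
    # Highest average first; ties broken by doc_id so the order is deterministic.
    return sorted(averaged.items(), key=lambda kv: (-kv[1], kv[0]))

# Two parse trees, hence two result lists for the same natural language query.
ranking = n_best_average([
    {"d1": 0.9, "d2": 0.4},
    {"d2": 0.8, "d3": 0.6},
])
print([doc_id for doc_id, _ in ranking])
```

Under this assumption, a document that scores well for only one parse is pulled down by the parses that miss it, which is one way a single incorrect top tree could be kept from dominating the ranking.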

XML Fragmentation for Resource-Efficient Query Processing over XML Fragment Stream (자원 효율적인 XML 조각 스트림 질의 처리를 위한 XML 분할)

Kim, Jin;Kang, Hyun-Chul
- The KIPS Transactions:PartD
- v.16D no.1
- pp.27-42
- 2009
Realizing ubiquitous computing requires techniques that make efficient use of the limited resources of clients such as mobile devices. On a mobile device with a limited amount of memory, XML stream query processing techniques must be employed to process queries over a large volume of XML data. Several recently proposed techniques fragment XML documents and stream the fragments to the client for query processing. Resource usage during query processing (processing time and memory usage) can differ greatly depending on how the source XML documents are fragmented, so an efficient fragmentation technique is needed. In this paper, we propose an XML fragmentation technique that improves resource efficiency in query processing at the client. We first present a cost model of query processing over an XML fragment stream and then propose an algorithm for resource-efficient XML fragmentation. Through implementation and experiments, we show that our fragmentation technique outperforms previous techniques in both processing time and memory usage. The contribution of this paper is to make query processing over XML fragment streams more feasible for practical use.
https://doi.org/10.3745/KIPSTD.2009.16-D.1.27

Design of Reliable Query Processing System in Mobile Database Environments (모바일 데이터베이스 환경의 신뢰성 보장 질의처리 시스템 설계)

Joo, Hae-Jong;Park, Young-Bae
- The KIPS Transactions:PartD
- v.12D no.4 s.100
- pp.521-530
- 2005
Much research is under way on the issues and problems of mobile database systems, which are caused by the weak connectivity of wireless networks and by the mobility and portability of mobile clients. Mobile computing satisfies users' demands for convenience and performance in using information at any time and in any place, but many data management problems remain to be solved. The purpose of our study is to design a Mobile Query Processing System (MQPS) that addresses database hoarding, the maintenance of shared data consistency, and the optimization of logging, problems caused by the weak connectivity and disconnection of wireless networks inherent in mobile database systems under mobile client-server environments. In addition, we demonstrate the superiority of the proposed MQPS by comparing its performance with that of the C-I-S (Client-Intercept-Server) model.
https://doi.org/10.3745/KIPSTD.2005.12D.4.521

A Design of Model For Interoperability in Multi-Database based XMDR on Distributed Environments (분산환경에서 XMDR 기반의 멀티데이터 베이스 상호운영 모델 설계)

Jung, Kye-Dong;Hwang, Chi-Gon;Choi, Young-Keun
- Journal of the Korea Institute of Information and Communication Engineering
- v.11 no.9
- pp.1771-1780
- 2007
The need for information integration has grown with the advancement of the Internet and changes in the enterprise environment. Enterprises typically have to integrate multiple databases that come together through M&A. Such integration must guarantee stable interpretation and integration by resolving the problem of heterogeneity. In this paper, we propose a method that converts a global XML query into local XML queries for interpretation. The method is based on XMDR (eXtended Meta-Data Registry), which expresses the connection between the standard and the local in order to solve the interoperability problem in a heterogeneous environment. We thus propose a legacy model in which a single query can search and modify data, by creating a global XML query through XMDR. For this, we use 2PC, an existing distributed transaction control technique.
https://doi.org/10.6109/jkiice.2007.11.9.1771
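
The abstract above names 2PC (two-phase commit) as its transaction control technique without describing how it is applied. The following is a generic in-memory sketch of the protocol only, not the paper's design. The `Participant` class and `two_phase_commit` function are hypothetical names, and real deployments also need logging, timeouts, and recovery, all omitted here.

```python
class Participant:
    """A local database taking part in a distributed transaction (toy model)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates whether the local update succeeds
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit everywhere only if every participant votes yes; otherwise roll back."""
    # Phase 1 (voting): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): apply the same outcome to all participants.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


dbs = [Participant("sales_db"), Participant("hr_db", can_commit=False)]
print(two_phase_commit(dbs), [p.state for p in dbs])
```

The point of the two phases is that a modification issued through one global query is either applied to every local database or to none of them.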

Search Re-ranking Through Weighted Deep Learning Model (검색 재순위화를 위한 가중치 반영 딥러닝 학습 모델)

Gi-Taek An;Woo-Seok Choi;Jun-Yong Park;Jung-Min Park;Kyung-Soon Lee
- The Transactions of the Korea Information Processing Society
- v.13 no.5
- pp.221-226
- 2024
In information retrieval, queries range from abstract queries to those containing specific keywords, which makes it challenging to return results that accurately match user demands. Search systems must also handle queries containing elements such as typos, multilingual text, and codes. Reranking is performed by training DeBERTa, a deep learning model that has shown high performance in recent research, on documents suitable for the queries. To evaluate the proposed method, experiments were conducted on the test collection of the Product Search Track at TREC 2023, an international information retrieval evaluation competition. Measured by NDCG, the proposed method, which combines query error handling, query expansion from product titles based on provisional relevance feedback, and reranking according to query type, achieved a score of 0.7810, a 10.48% improvement over BM25, a basic information retrieval model.
https://doi.org/10.3745/TKIPS.2024.13.5.221
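
Since the abstract above reports its result as an NDCG score, a short sketch of that metric may help. It uses the common linear-gain, log2-discount form and builds the ideal ranking from the same list of judged relevance grades. The exact variant and cutoff used in the TREC 2023 Product Search Track are not stated in the abstract, so both are assumptions here.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: graded relevance discounted by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances, k=None):
    """NDCG of a ranked list of relevance grades, optionally cut off at rank k.

    The ideal ranking is the same grades sorted in descending order
    (an assumption: only documents in this list are considered judged).
    """
    ranked = relevances if k is None else relevances[:k]
    ideal = sorted(relevances, reverse=True)
    ideal = ideal if k is None else ideal[:k]
    best = dcg(ideal)
    return dcg(ranked) / best if best > 0 else 0.0

# Relevance grades of the documents in the order a system returned them.
print(round(ndcg([1, 3]), 4))
```

A reranker improves NDCG when it moves documents with higher relevance grades toward the top of the list, which is what the comparison against BM25 measures.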

Spatial Database Modeling based on Constraint (제약 기반의 공간 데이터베이스 모델링)

Woo, Sung-Koo;Ryu, Keun-Ho
- Journal of the Korean Association of Geographic Information Studies
- v.12 no.1
- pp.81-95
- 2009
The CDB (Constraint Database) model is a new paradigm for processing massive spatial data such as that handled by a GIS (Geographic Information System). This paper identifies the limitations of schema structure and query processing in prior spatial database research and suggests a more efficient processing mechanism based on the constraint data model. We present the concept of the constraint model, its representation method, and examples of query processing. In particular, we represent a TIN (Triangulated Irregular Network), which expresses height over planar data, as a constraint data model and compare it with prior spatial data models. Finally, we find that constraint data modeling allows spatial data to be formalized in a simple and refined way.

Membership Inference Attack against Text-to-Image Model Based on Generating Adversarial Prompt Using Textual Inversion (Textual Inversion을 활용한 Adversarial Prompt 생성 기반 Text-to-Image 모델에 대한 멤버십 추론 공격)

Yoonju Oh;Sohee Park;Daeseon Choi
- Journal of the Korea Institute of Information Security & Cryptology
- v.33 no.6
- pp.1111-1123
- 2023
In recent years, as generative models have developed, research on threats to them has also been actively conducted. We propose a new membership inference attack against text-to-image models. Existing membership inference attacks on text-to-image models generate a single image from the caption of a query image. In contrast, this paper uses personalized embeddings of the query images obtained through Textual Inversion, and proposes a membership inference attack that effectively generates multiple images by means of an adversarial prompt. In addition, the membership inference attack is tested for the first time on the Stable Diffusion model, which is attracting attention among text-to-image models, and achieves an accuracy of up to 1.00.
https://doi.org/10.13089/JKIISC.2023.33.6.1111

Cache Sensitive T-tree Main Memory Index for Range Query Search (범위질의 검색을 위한 캐시적응 T-트리 주기억장치 색인구조)

Choi, Sang-Jun;Lee, Jong-Hak
- Journal of Korea Multimedia Society
- v.12 no.10
- pp.1374-1385
- 2009
Recently, advances in CPU speed have far outpaced advances in memory speed, and main-memory access is increasingly a performance bottleneck for main-memory database systems. To reduce memory access time, cache memory has been incorporated into the memory subsystem. However, cache memory reduces access time only when the requested data is found in the cache. We propose a new cache-sensitive T-tree index structure, called the $CST^*$-tree, for range query search. The $CST^*$-tree reduces the number of cache misses by loading reduced internal nodes that do not have index entries, and it supports sequential access to index entries for range queries by connecting adjacent terminal nodes and internal index nodes. For performance evaluation, we developed a cost model and used it to compare our $CST^*$-tree with the existing CST-tree, the conventional cache-sensitive T-tree, and with the $T^*$-tree, the conventional T-tree for range query search. The results indicate that the $CST^*$-tree incurs 20~30% fewer cache misses than the CST-tree in a single value search and 10~20% fewer than the $T^*$-tree in a range query search.

Performance Evaluation of Re-ranking and Query Expansion for Citation Metrics: Based on Citation Index Databases (인용 지표를 이용한 재순위화 및 질의 확장의 성능 평가 - 인용색인 데이터베이스를 기반으로 -)

HyeKyung Lee;Yong-Gu Lee
- Journal of the Korean Society for Library and Information Science
- v.57 no.3
- pp.249-277
- 2023
The purpose of this study is to explore the potential contribution of citation metrics to improving the search performance of citation index databases. To this end, the study generated ten queries in the field of library and information science and conducted experiments based on relevance assessments, using 3,467 documents retrieved from the Web of Science and 60,734 documents published in 85 SSCI journals in the field of library and information science from 2000 to 2021. The experiments included re-ranking of the top 100 search results using citation metrics and search methods, query expansion experiments using vector space model retrieval systems, and the construction of a citation-based re-ranking system. The results are as follows: 1) Re-ranking using citation metrics performed differently from Web of Science's own ranking, with the citation metrics acting as independent metrics. 2) Combining query term frequencies and citation counts positively affected performance. 3) Query expansion generally improved performance compared to the vector space model baseline. 4) User-based query expansion outperformed system-based query expansion. 5) Combining citation counts with suitability documents affected ranking within top suitability documents.
https://doi.org/10.4275/KSLIS.2023.57.3.249

Conceptual Retrieval of Chinese Frequently Asked Healthcare Questions

Liu, Rey-Long;Lin, Shu-Ling
- International Journal of Knowledge Content Development & Technology
- v.5 no.1
- pp.49-68
- 2015
Given a query (a health question), retrieval of relevant frequently asked questions (FAQs) is essential, as the FAQs provide both reliable and readable information to healthcare consumers. The retrieval requires estimating the semantic similarity between the query and each FAQ. This estimation is challenging because the semantic structures of Chinese healthcare FAQs are quite different from those of FAQs in other domains. In this paper, we propose a conceptual model for Chinese healthcare FAQs and, based on this model, present a technique, ECA, that estimates conceptual similarities between FAQs. Empirical evaluation shows that ECA can help various kinds of retrievers rank relevant FAQs significantly higher. We also make ECA available online to provide services for FAQ retrievers.
https://doi.org/10.5865/IJKCT.2015.5.1.049

Search Result 563, Processing Time 0.023 seconds
