• Title/Summary/Keyword: query quality

Mathematical Properties of the Formulas Evaluating Boolean Operators in Information Retrieval (정보검색에서 부울연산자를 연산하는 식의 수학적 특성)

  • 이준호;이기호;조영화
    • Journal of the Korean Society for Information Management / v.12 no.1 / pp.87-97 / 1995
  • Boolean retrieval systems have been the most widely used in information retrieval because they are easy to implement and retrieve efficiently. Conventional Boolean retrieval systems, however, cannot rank retrieved documents in decreasing order of query-document similarity because they cannot compute similarity coefficients between queries and documents. Extended Boolean models such as fuzzy set, Waller-Kraft, Paice, P-Norm and Infinite-One have been developed to provide document ranking. In extended Boolean models, the formulas that evaluate the Boolean operators AND and OR are an important component affecting the quality of document ranking. In this paper we present mathematical properties of these formulas and analyse their effect on retrieval effectiveness. Our analyses show that P-Norm is the most suitable for achieving high retrieval effectiveness.
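
The abstract above names the models but does not give their formulas. As a minimal sketch, the following assumes the commonly cited P-Norm form with equally weighted query terms, plus the fuzzy-set min/max rule for comparison; the function names are illustrative and not taken from the paper.

```python
def fuzzy_and(weights):
    """Fuzzy-set AND: only the weakest term weight counts."""
    return min(weights)


def fuzzy_or(weights):
    """Fuzzy-set OR: only the strongest term weight counts."""
    return max(weights)


def pnorm_or(weights, p):
    """P-Norm OR with equal query-term weights (assumed form)."""
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)


def pnorm_and(weights, p):
    """P-Norm AND with equal query-term weights (assumed form)."""
    n = len(weights)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / n) ** (1.0 / p)


# Document term weights for a two-term query, each in [0, 1].
doc = [0.9, 0.2]
for p in (1, 2, 10):
    print(p, round(pnorm_or(doc, p), 3), round(pnorm_and(doc, p), 3))
```

With p = 1 both operators reduce to the plain average of the term weights, and as p grows they approach the fuzzy-set max and min, so p controls how strictly the Boolean operators are interpreted.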

Multi-match Packet Classification Scheme Combining TCAM with an Algorithmic Approach

  • Lim, Hysook;Lee, Nara;Lee, Jungwon
    • IEIE Transactions on Smart Processing and Computing / v.6 no.1 / pp.27-38 / 2017
  • Packet classification is one of the essential functionalities of Internet routers in providing quality of service. Since the arrival rate of input packets can be tens of millions per second, wire-speed packet classification has become one of the most challenging tasks. While traditional packet classification only reports a single matching result, new network applications require multiple matching results. Ternary content-addressable memory (TCAM) has been adopted to solve the multi-match classification problem due to its ability to perform fast parallel matching. However, TCAM has a fundamental issue: high power dissipation. Since TCAM is designed for a single match, the applicability of TCAM to multi-match classification is limited. In this paper, we propose a cost- and energy-efficient multi-match classification architecture that combines TCAM with a tuple space search algorithm. The proposed solution uses two small TCAM modules and requires a single-cycle TCAM lookup, two SRAM accesses, and several Bloom filter query cycles for multi-match classifications.
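
To illustrate the algorithmic half of the idea, here is a software-only sketch of tuple space search with a Bloom filter in front of each tuple's hash table. It does not model the TCAM stage or the paper's actual architecture; the `Bloom` and `TupleSpace` classes, the two-field rules, and the bit-string prefixes are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


class Bloom:
    """Tiny Bloom filter; sizes are arbitrary illustrative values."""

    def __init__(self, bits=256, hashes=3):
        self.bits, self.hashes, self.field = bits, hashes, 0

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.blake2b(key, salt=bytes([i])).digest()
            yield int.from_bytes(digest[:4], "big") % self.bits

    def add(self, key):
        for pos in self._positions(key):
            self.field |= 1 << pos

    def might_contain(self, key):
        return all((self.field >> pos) & 1 for pos in self._positions(key))


class TupleSpace:
    """Rules are grouped by their (src, dst) prefix lengths, one table each."""

    def __init__(self):
        self.tables = defaultdict(dict)    # tuple -> {key: [rule names]}
        self.filters = defaultdict(Bloom)  # tuple -> Bloom filter of keys

    def add_rule(self, src_prefix, dst_prefix, name):
        tup = (len(src_prefix), len(dst_prefix))
        key = f"{src_prefix}|{dst_prefix}".encode()
        self.tables[tup].setdefault(key, []).append(name)
        self.filters[tup].add(key)

    def classify(self, src, dst):
        """Return every matching rule (multi-match), not just the best one."""
        matches = []
        for (s_len, d_len), table in self.tables.items():
            key = f"{src[:s_len]}|{dst[:d_len]}".encode()
            # The Bloom filter lets us skip the table access for most misses.
            if self.filters[(s_len, d_len)].might_contain(key):
                matches.extend(table.get(key, []))
        return matches


ts = TupleSpace()
ts.add_rule("10", "", "r1")
ts.add_rule("1011", "01", "r2")
ts.add_rule("0", "", "r3")
print(ts.classify("1011", "0110"))
```

A Bloom filter can report false positives but never false negatives, so the exact hash-table lookup that follows keeps the result correct while most non-matching tuples are skipped cheaply.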

Document Summarization using Pseudo Relevance Feedback and Term Weighting (의사연관피드백과 용어 가중치에 의한 문서요약)

  • Kim, Chul-Won;Park, Sun
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.3 / pp.533-540 / 2012
  • In this paper, we propose a document summarization method that uses pseudo relevance feedback and term weighting based on semantic features. By using pseudo relevance feedback, the proposed method minimizes user intervention. It can also improve the quality of document summaries because the inherent semantics of the sentence set are well reflected by the term weighting derived from semantic features. In addition, it uses the semantic features of term weighting and the expanded query to reduce the semantic gap between the user's requirement and the result of the proposed method. The experimental results demonstrate that the proposed method achieves better performance than other methods without term weighting.
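
Pseudo relevance feedback can be sketched in a few lines: treat the top-ranked sentences for the initial query as if they were relevant, then add their most frequent new terms to the query. The sketch below assumes plain term-count scoring and whitespace tokenization; it does not implement the paper's semantic-feature term weighting, and `score` and `expand_query` are hypothetical names.

```python
from collections import Counter


def score(query_terms, sentence):
    """Count how often the query terms occur in the sentence."""
    words = sentence.lower().split()
    return sum(words.count(t) for t in query_terms)


def expand_query(query_terms, sentences, top_k=2, extra_terms=1):
    """Assume the top_k sentences are relevant and borrow their frequent terms."""
    ranked = sorted(sentences, key=lambda s: score(query_terms, s), reverse=True)
    counts = Counter(
        w
        for s in ranked[:top_k]
        for w in s.lower().split()
        if w not in query_terms
    )
    return list(query_terms) + [w for w, _ in counts.most_common(extra_terms)]


sentences = [
    "solar panels convert sunlight",
    "solar panels power satellites",
    "cats sleep all day",
]
print(expand_query(["solar"], sentences))
```

No user judgment is needed at any step, which is the sense in which the feedback is "pseudo"; a practical version would also remove stop words before counting.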

High-Performance Multi-GPU Rendering Based on Implicit Synchronization (묵시적 동기화 기반의 고성능 다중 GPU 렌더링)

  • Kim, Younguk;Lee, Sungkil
    • Journal of KIISE / v.42 no.11 / pp.1332-1338 / 2015
  • Recently, growing attention has been paid to multi-GPU rendering to support real-time high-quality rendering at high resolution. To attain high performance in real-time multi-GPU rendering, great care needs to be taken to reduce the overhead of data transfer among GPUs and of frame composition. This paper presents a novel multi-GPU algorithm that greatly enhances split frame rendering with implicit query-based synchronization. To support implicit synchronization in frame composition, we further present a message queue-based scheduling algorithm. We carried out an experiment to evaluate our algorithm and found that it improved rendering performance by up to 200% compared with existing algorithms.

A Study of Word Sense Ambiguation which Affects Efficiency of the Internet-based Information Retrieval (어휘의미 중의성이 인터넷 정보검색 효율에 미치는 영향에 관한 연구)

  • 황상규;오경묵;변영태
    • Journal of the Korean Society for Information Management / v.16 no.3 / pp.65-82 / 1999
  • Internet users are often frustrated when they try to find the "right" piece of information quickly. The reason is that discovering available, quality resources becomes more difficult for end users as the Internet continues to expand rapidly. Not only incorrect keywords and query expressions but also word sense ambiguity cause search efficiency on the Internet to drop. In this paper, we analyze the drop in efficiency of Internet search and discuss how to reduce users' frustration with the Internet and improve their search strategies.

Spanning Tree Aggregation Using Attribute of Service Boundary Line (서비스경계라인 속성을 이용한 스패닝 트리 집단화)

  • Kwon, So-Ra;Jeon, Chang-Ho
    • The KIPS Transactions:PartC / v.18C no.6 / pp.441-444 / 2011
  • In this study, we present a method for efficiently aggregating network state information. It is especially useful for aggregating links that have both delay and bandwidth in an asymmetric network. The proposed method reduces the information distortion of logical links through an integration process that follows similarity measurement and grouping of logical links in the multi-level topology transformation, which is performed to reduce space complexity. It is applied to transform a full mesh topology, whose Service Boundary Lines (SBLs) serve as its logical links, into a spanning tree topology. Simulation results show that aggregated information accuracy and query response accuracy are higher than those of other known methods.

Latent Semantic Analysis Approach for Document Summarization Based on Word Embeddings

  • Al-Sabahi, Kamal;Zuping, Zhang;Kang, Yang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.1 / pp.254-276 / 2019
  • Since the amount of information on the internet is growing rapidly, it is not easy for a user to find relevant information for his/her query. To tackle this issue, researchers are paying much attention to document summarization. The key to any successful document summarizer is a good document representation. Traditional approaches based on word overlap mostly fail to produce that kind of representation. Word embedding has shown good performance by allowing words to match on a semantic level. Naively concatenating word embeddings makes common words dominant, which in turn diminishes the representation quality. In this paper, we employ word embeddings to improve the weighting schemes for calculating the Latent Semantic Analysis input matrix. Two embedding-based weighting schemes are proposed and then combined to calculate the values of this matrix. They are modified versions of the augment weight and the entropy frequency that combine the strengths of traditional weighting schemes and word embedding. The proposed approach is evaluated on three English datasets: DUC 2002, DUC 2004 and Multilingual 2015 Single-document Summarization. Experimental results on the three datasets show that the proposed model achieves competitive performance compared to the state of the art, leading to the conclusion that it provides a better document representation and, as a result, a better document summary.
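
For context, the two traditional weights that the abstract says are modified can be sketched as follows. This assumes the textbook forms (augmented local weight 0.5 + 0.5 · tf / max tf, and global entropy weight 1 + Σ p log p / log n); the embedding-based modifications proposed in the paper are not reproduced here, and `weighted_matrix` is an illustrative name.

```python
import math


def weighted_matrix(counts):
    """Build an LSA input matrix from counts[i][j] = freq of term i in sentence j.

    Each cell is local_weight(i, j) * global_weight(i). Assumes at least two
    sentences and that every term occurs at least once.
    """
    n_sent = len(counts[0])
    # Highest term frequency in each sentence, used by the augmented weight.
    max_tf = [max(row[j] for row in counts) for j in range(n_sent)]
    matrix = []
    for row in counts:
        total = sum(row)
        # Entropy weight: 1 for a term confined to one sentence,
        # 0 for a term spread evenly over all sentences.
        entropy = sum((c / total) * math.log(c / total) for c in row if c > 0)
        g = 1.0 + entropy / math.log(n_sent)
        matrix.append(
            [(0.5 + 0.5 * c / max_tf[j]) * g if c > 0 else 0.0
             for j, c in enumerate(row)]
        )
    return matrix


# Two terms, two sentences: term 0 appears everywhere, term 1 only in sentence 0.
print(weighted_matrix([[1, 1], [2, 0]]))
```

The evenly spread term is weighted down to zero while the concentrated term keeps its full local weight, which is how the entropy weight keeps common words from dominating the matrix.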

An Efficient Feature Point Extraction and Comparison Method through Distorted Region Correction in 360-degree Realistic Contents

  • Park, Byeong-Chan;Kim, Jin-Sung;Won, Yu-Hyeon;Kim, Young-Mo;Kim, Seok-Yoon
    • Journal of the Korea Society of Computer and Information / v.24 no.1 / pp.93-100 / 2019
  • One of the critical issues in dealing with 360-degree realistic contents is the performance degradation in the search and recognition process, since they support up to 4K UHD quality and cover all image angles including the front, back, left, right, top, and bottom parts of a screen. To solve this problem, in this paper, we propose an efficient search and comparison method for 360-degree realistic contents. The proposed method first corrects the distortion in the less distorted regions such as the front, left and right parts of the image, excluding severely distorted regions such as the upper and lower parts. It then extracts feature points from the corrected regions and selects representative images through sequence classification. When a query image is input, the search results are provided through feature point comparison. The experimental results show that the proposed method can solve the problem of performance deterioration when 360-degree realistic contents are recognized, compared with traditional 2D contents.

Web Hypermedia Resources Reuse and Integration for On-Demand M-Learning

  • Berri, Jawad;Benlamri, Rachid;Atif, Yacine;Khallouki, Hajar
    • International Journal of Computer Science & Network Security / v.21 no.1 / pp.125-136 / 2021
  • The development of systems that can automatically generate instructional material is a challenging goal for the e-learning community. These systems pave the way towards large-scale e-learning deployment as they produce instruction on demand for users requesting to learn about any topic, anywhere and anytime. However, realizing such systems is possible with the availability of vast repositories of web information in different formats that can be searched, reused and integrated into information-rich environments for interactive learning. This paradigm of learning relieves instructors of the tedious authoring task, letting them focus more on the design and quality of instruction. This paper presents a mobile learning system (Mole) that supports the generation of instructional material in M-Learning (Mobile Learning) contexts by reusing and integrating heterogeneous hypermedia web resources. Mole uses open hypermedia repositories to build a Learning Web and to generate learning objects, including various hypermedia resources, that are adapted to the user context. Learning is delivered through a graphical user interface that allows users to navigate conveniently while building their own learning path. A test case scenario illustrating Mole is presented along with a system evaluation, which shows that in 90% of the cases Mole was able to generate learning objects that are related to the user query.

PC-SAN: Pretraining-Based Contextual Self-Attention Model for Topic Essay Generation

  • Lin, Fuqiang;Ma, Xingkong;Chen, Yaofeng;Zhou, Jiajun;Liu, Bo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.8 / pp.3168-3186 / 2020
  • Automatic topic essay generation (TEG) is a controllable text generation task that aims to generate informative, diverse, and topic-consistent essays based on multiple topics. To produce high-quality essays, a reasonable method should consider both diversity and topic consistency. Another essential issue is the intrinsic link between the topics, which helps the essays stay close to the semantics of the provided topics. However, it remains challenging for TEG to fill the semantic gap between source topic words and target output, and a more powerful model is needed to capture the semantics of the given topics. To this end, we propose a pretraining-based contextual self-attention (PC-SAN) model that is built upon the seq2seq framework. For the encoder of our model, we employ a dynamic weighted sum of layers from BERT to fully utilize the semantics of topics, which is of great help in filling the gap and improving the quality of the generated essays. In the decoding phase, we also transform the target-side contextual history information into the query layers to alleviate the lack of context in typical self-attention networks (SANs). Experimental results on large-scale paragraph-level Chinese corpora verify that our model is capable of generating diverse, topic-consistent text and essentially makes improvements compared to strong baselines. Furthermore, extensive analysis validates the effectiveness of contextual embeddings from BERT and contextual history information in SANs.