• Title/Summary/Keyword: Analysis of Query

A Relation Analysis between NDSL User Queries and Technical Terms (NDSL 검색 질의어와 기술용어간의 관계에 대한 분석적 연구)

  • Kang, Nam-Gyu;Cho, Min-Hee;Kwon, Oh-Seok
    • Journal of Information Management / v.39 no.3 / pp.163-177 / 2008
  • In this paper, we analyzed the relationship between the user query keywords that are used to search NDSL and the technical terms extracted from NDSL journals. For the analysis, we extracted about 833,000 query keywords from NDSL search logs covering nearly 17 months and approximately 41,000,000 technical terms from NDSL, INSPEC, and FSTA journals. We used only the English noun phrases among the extracted items and then carried out equality analysis, relationship analysis, and frequency analysis.

On Development of an Automatic Tool for Extracting Association Rules of a user query using Formal Concept Analysis (형식개념분석기법을 이용한 사용자 질의 기반의 연관관계 추출 자동화지원도구의 개발)

  • Kim, Eung-Hee;Hwang, Suk-Hyung;Kim, Hong-Gee
    • The KIPS Transactions:PartD / v.15D no.3 / pp.429-440 / 2008
  • Formal Concept Analysis (FCA) is a widely used methodology for data analysis, which extracts concepts and builds a concept hierarchy from given data. A concept consists of objects and attributes shared by those objects, and a concept hierarchy includes information on super-sub relations among the concepts. In this paper, we propose a method for extracting Implication and Association rules from a concept hierarchy given a query by a user. The method also describes a way for displaying the extracted rules. Based on this method, we implemented an automatic tool, QAG-Wizard. Because the QAG-Wizard not only elicits relation information for the given query, but also displays it in structured form intuitively, we expect that it can be used in the fields of data analysis, data mining and information retrieval for various purposes.
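
The idea of deriving implication and association rules from a formal context can be sketched as follows. This is a minimal illustration, not the QAG-Wizard implementation: the toy context, the function names (`extent`, `intent`, `closure`, `confidence`), and the use of confidence as the rule measure are all assumptions made for the example.

```python
from typing import Dict, Set

# A formal context: each object maps to the set of attributes it has.
# The data below is a hypothetical toy example.
Context = Dict[str, Set[str]]

ctx: Context = {
    "duck": {"flies", "swims", "lays_eggs"},
    "eagle": {"flies", "lays_eggs"},
    "frog": {"swims", "lays_eggs"},
    "dog": {"swims"},
}


def extent(context: Context, attrs: Set[str]) -> Set[str]:
    """Objects that have every attribute in attrs."""
    return {obj for obj, has in context.items() if attrs <= has}


def intent(context: Context, objs: Set[str]) -> Set[str]:
    """Attributes shared by every object in objs."""
    if not objs:
        # By convention, the empty object set shares all attributes.
        return set().union(*context.values())
    return set.intersection(*(context[o] for o in objs))


def closure(context: Context, attrs: Set[str]) -> Set[str]:
    """Closure of an attribute set: attrs -> closure(attrs) is an implication."""
    return intent(context, extent(context, attrs))


def confidence(context: Context, premise: Set[str], conclusion: Set[str]) -> float:
    """Fraction of objects with the premise that also have the conclusion."""
    base = extent(context, premise)
    if not base:
        return 1.0  # vacuously true when no object matches the premise
    return len(extent(context, premise | conclusion)) / len(base)


# Implication (confidence 1): every object that flies also lays eggs.
implied = closure(ctx, {"flies"})
# Association rule (confidence below 1): swims -> lays_eggs.
conf = confidence(ctx, {"swims"}, {"lays_eggs"})
```

In this toy context, `closure(ctx, {"flies"})` contains `lays_eggs`, so `flies → lays_eggs` holds for every object, while `swims → lays_eggs` holds only for two of the three swimming objects.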

Syntactic Analysis and Keyword Expansion for Performance Enhancement of Information Retrieval System (정보 검색 시스템의 성능 향상을 위한 구문 분석과 검색어 확장)

  • 윤성희
    • Journal of the Korea Academia-Industrial cooperation Society / v.5 no.4 / pp.303-308 / 2004
  • A natural language query is the best user interface for users of information retrieval systems. This paper proposes a retrieval system that expands keywords from the syntactically analyzed structure of a user's natural language query, based on natural language processing techniques. Through steps that combine or split compound nouns based on syntactic tree traversal and expand variant or shortened forms of a keyword into multiple keywords, system performance was enhanced by up to 11.3% in precision and 4.7% in correctness.

A Theoretical Study of Designing Thesaurus Browser by Clustering Algorithm (클러스터링을 이용한 시소러스 브라우저의 설계에 대한 이론적 연구)

  • Seo, Hwi
    • Journal of Korean Library and Information Science Society / v.30 no.3 / pp.427-456 / 1999
  • This paper deals with the problems of information retrieval from full-text databases, which arise from both the deficiency of searchers' strategies or methods and the difficulties of query representation, generation, extension, etc. In order to solve these problems, we should use automatic retrieval instead of the manual retrieval of the past. One way to narrow the gap between the terms used by writers and the queries of searchers is to search with the terms that the writers use. Thus, the precondition for solving the problems is that all areas of information retrieval, such as content analysis, information structure, query formation, and query evaluation, should be addressed in a coherent way. We need to deal with all the areas of automatic information retrieval for efficient retrieval, though this paper focuses on the design of a thesaurus browser. Thus, this paper presents theoretical analyses of the forms of information retrieval, automatic indexing, clustering techniques, establishing and expressing a thesaurus, and information retrieval techniques. As the result of analyzing them, this paper presents a theoretical model, namely a thesaurus browser based on a clustering algorithm. The result of the paper will serve as a theoretical basis for a new retrieval algorithm.

Enabling Dynamic Multi-Client and Boolean Query in Searchable Symmetric Encryption Scheme for Cloud Storage System

  • Xu, Wanshan;Zhang, Jianbiao;Yuan, Yilin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.4 / pp.1286-1306 / 2022
  • Searchable symmetric encryption (SSE) provides a safe and effective solution for retrieving encrypted data on cloud servers. However, existing SSE schemes mainly focus on single-keyword search for a single client, which is inefficient for multiple keywords and cannot meet the needs of multiple clients. Considering these drawbacks, we propose a scheme enabling dynamic multi-client and Boolean query in searchable symmetric encryption for cloud storage systems (DMC-SSE). DMC-SSE realizes fine-grained access control for multiple clients in SSE by means of attribute-based encryption (ABE) and a novel access control list (ACL), and supports Boolean queries over multiple keywords. In addition, DMC-SSE realizes fully dynamic updates of clients and files. Compared with existing multi-client schemes, our scheme has the following advantages: 1) Dynamic. DMC-SSE not only supports the dynamic addition or deletion of multiple clients, but also realizes the dynamic update of files. 2) Non-interactivity. After being authorized, the client can query keywords without the help of the data owner, and the data owner can dynamically update a client's permissions without requiring the client to stay online. Finally, the security analysis and experimental results demonstrate that our scheme is safe and efficient.

How Query by humming, a Music Information Retrieval System, is Being Used in the Music Education Classroom

  • Bradshaw, Brian
    • Journal of Multimedia Information System / v.4 no.3 / pp.99-106 / 2017
  • This study presents a qualitative and quantitative analysis of how query by humming is being used by music educators in the classroom. Query by humming is a subdivision of music information retrieval. In order to define what a music information retrieval system is, I first need to define what information retrieval is. Berger and Lafferty (1999) define information retrieval as "someone doing a query to a retrieval system, a user begins with an information need. This need is an ideal document- perfect fit for the user, but almost certainly not present in the retrieval system's collection of documents. From this ideal document, the user selects a group of identifying terms. In the context of traditional IR, one could view this group of terms as akin to expanded query." Music information retrieval has its background in information systems, data mining, intelligent systems, library science, music history, and music theory. Three rounds of surveys using QuestionPro were completed. The study found that there were variances in knowledge, training, and level of awareness of query by humming and music information retrieval systems. Those variances were related to music specialty, the level that the respondents teach, and the age of the respondents.

A Comparative Analysis of Recursive Query Algorithm Implementations based on High Performance Distributed In-Memory Big Data Processing Platforms (대용량 데이터 처리를 위한 고속 분산 인메모리 플랫폼 기반 재귀적 질의 알고리즘들의 구현 및 비교분석)

  • Kang, Minseo;Kim, Jaesung;Lee, Jaegil
    • Journal of KIISE / v.43 no.6 / pp.621-626 / 2016
  • Recursive query algorithms are used in many social network services, e.g., for reachability queries in social networks. Recently, the size of social network data has increased as social network services evolve. As a result, it is almost impossible to run a recursive query algorithm on a single machine. In this paper, we implement recursive queries on two popular in-memory distributed platforms, Spark and Twister, to solve this problem. We evaluate the performance of the two implementations using 50 machines on Amazon EC2 and real-world data sets: LiveJournal and ClueWeb. The results show that the recursive query algorithm performs better on Spark for the LiveJournal input data set, which has a relatively high average degree but fewer vertices. However, the recursive query on Twister is superior to Spark for the ClueWeb input data set, which has a relatively low average degree but many vertices.
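
A reachability query of the kind mentioned above can be sketched on a single machine as an iterative frontier expansion. This is only an illustration of the recursive-query idea under assumed names (`reachable`, the toy edge list); it is not the Spark or Twister implementation evaluated in the paper, where each round would instead be a distributed join.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple


def reachable(edges: Iterable[Tuple[Hashable, Hashable]], source: Hashable) -> Set[Hashable]:
    """Return all vertices reachable from source, including source itself.

    Each loop iteration corresponds to one round of the recursive query:
    only the newly discovered vertices (the frontier) are expanded,
    which is the usual semi-naive evaluation strategy.
    """
    adjacency: Dict[Hashable, Set[Hashable]] = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)

    seen = {source}
    frontier = {source}
    while frontier:
        next_vertices: Set[Hashable] = set()
        for u in frontier:
            next_vertices |= adjacency.get(u, set())
        frontier = next_vertices - seen  # keep only vertices not visited before
        seen |= frontier
    return seen


# Hypothetical toy graph: a 3-cycle (1 -> 2 -> 3 -> 1) and a separate edge 4 -> 5.
toy_edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
from_one = reachable(toy_edges, 1)
from_four = reachable(toy_edges, 4)
```

The number of loop iterations is bounded by the longest shortest path from the source, which is one reason the per-round cost of a platform matters for graphs with different degree distributions.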

The MeSH-Term Query Expansion Models using LDA Topic Models in Health Information Retrieval (MeSH 기반의 LDA 토픽 모델을 이용한 검색어 확장)

  • You, Sukjin
    • Journal of Korean Library and Information Science Society / v.52 no.1 / pp.79-108 / 2021
  • Information retrieval in the health field has several challenges. Health information terminology is difficult for consumers (laypeople) to understand. Formulating a query with professional terms is not easy for consumers because health-related terms are more familiar to health professionals. If health terms related to a query were added automatically, it would help consumers find relevant information. The proposed query expansion (QE) models show how to expand a query using MeSH terms. The documents were represented by the MeSH terms (i.e., a Bag-of-MeSH) found in the full-text articles. The MeSH terms were then used to generate LDA (Latent Dirichlet Allocation) topic models. A query and the top k retrieved documents were used to find MeSH terms as topic words related to the query. LDA topic words were filtered by threshold values of topic probability (TP) and word probability (WP). Threshold values were effective in an LDA model with a specific number of topics in increasing IR performance in terms of infAP (inferred Average Precision) and infNDCG (inferred Normalized Discounted Cumulative Gain), which are common IR metrics for large data collections with incomplete judgments. The top k words were chosen by a word score based on (TP × WP) and the retrieved document ranking in an LDA model with specific thresholds. The QE model with specific thresholds for TP and WP showed improved mean infAP and infNDCG scores in an LDA model, compared with the baseline result.
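
The threshold filtering and TP × WP scoring described above can be sketched as follows. This is a simplified illustration under stated assumptions: the probabilities are made-up toy values, the function name `expansion_terms` and its parameters are hypothetical, and the retrieved-document ranking that the paper also uses in the word score is left out.

```python
from typing import Dict, List


def expansion_terms(
    topic_probs: Dict[int, float],
    word_probs: Dict[int, Dict[str, float]],
    tp_min: float,
    wp_min: float,
    k: int,
) -> List[str]:
    """Pick the top-k expansion terms by TP * WP after threshold filtering.

    topic_probs: probability of each topic for the query (TP).
    word_probs:  probability of each MeSH term within each topic (WP).
    """
    scores: Dict[str, float] = {}
    for topic, tp in topic_probs.items():
        if tp < tp_min:
            continue  # drop topics that are weakly related to the query
        for word, wp in word_probs[topic].items():
            if wp < wp_min:
                continue  # drop words that are weak within the topic
            # A word may occur in several topics; keep its best score.
            scores[word] = max(scores.get(word, 0.0), tp * wp)
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


# Hypothetical toy values, not taken from the paper.
toy_topic_probs = {0: 0.6, 1: 0.3, 2: 0.1}
toy_word_probs = {
    0: {"Hypertension": 0.5, "Blood Pressure": 0.3, "Diet": 0.05},
    1: {"Diabetes Mellitus": 0.4, "Hypertension": 0.2},
    2: {"Neoplasms": 0.9},
}
terms = expansion_terms(toy_topic_probs, toy_word_probs, tp_min=0.2, wp_min=0.1, k=3)
```

With these toy values, topic 2 is removed by the TP threshold and "Diet" by the WP threshold, so a high word probability alone ("Neoplasms" at 0.9) does not qualify a term when its topic is only weakly related to the query.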

Object Modeling for Mapping from XML Document and Query to UML Class Diagram based on XML-GDM (XML-GDM을 기반으로 한 UML 클래스 다이어그램으로 사상을 위한 XML문서와 질의의 객체 모델링)

  • Park, Dae-Hyun;Kim, Yong-Sung
    • The KIPS Transactions:PartD / v.17D no.2 / pp.129-146 / 2010
  • Nowadays, XML is favored by many companies, internally and externally, as a means of sharing and distributing data. There are many studies and systems for modeling and storing XML documents by object-oriented methods, as a way of saving and managing web-based multimedia documents more easily. The representative tool for object-oriented modeling of XML documents is UML (Unified Modeling Language). UML was initially used as an integrated methodology for software development, but it is now used more often as a modeling language for various objects. Currently, UML supports various diagrams for object-oriented analysis and design, such as the class diagram, and is widely used as a tool for creating database schemas and object-oriented code from them. This paper proposes an efficient query modeling of XML-GL using the UML class diagram and OCL for searching XML documents, whose application scope has widened with the increased use of the WWW and their flexible and open nature. To accomplish this, we propose modeling rules and an algorithm that map XML-GL, which provides a modeling function for XML documents and DTDs and a graphical query function over them. To describe the constraints on model components precisely, they are defined in OCL (Object Constraint Language). The proposed technique creates a query for XML documents that holds the various properties of the object-oriented model, by modeling the XML-GL query from the XML document, XML DTD, and XML query using the UML class diagram. By converting, saving, and managing XML documents visually in an object-oriented graphic data model, the user gains a basis for expressing searches and queries on XML documents intuitively and visually. Compared with existing XML-based query languages, it has various object-oriented characteristics and uses the UML notation that is widely used as an object modeling tool. Hence, the user can construct graphical and intuitive queries on XML-based web documents without learning a new query language. Because the same modeling tool, the UML class diagram, is used for XML document content, query syntax, and semantics, all processes such as searching XML documents in and saving them to an object-oriented database can be performed consistently.

A Study on Traceback by WAS Bypass Access Query Information of DataBase (DBMS WAS 우회접속의 쿼리정보 역추적 연구)

  • Baek, Jong-Il;Park, Dea-Woo
    • Journal of the Korea Society of Computer and Information / v.14 no.12 / pp.181-190 / 2009
  • DBMS access via high-speed internet web services through a WAS (web application server) is increasing. DB security technology is needed for 3-tier access to a DBMS by unspecified users, covering bypass (indirect) connections and privilege control. When a user connects to the DBMS indirectly through a WAS, the DBMS server records the WAS's information as the user, does not store the IP information of the actual bypass-connection user, and connects to the core system. This study investigates tracing back the query information of bypass connections made to the DBMS through a WAS, using security audit records and forensic data. A MetaDB built on the communication path stores session and query information for users who log in through the web, and this is compared and mapped against the query information and timestamps stored in the DBMS server log to identify the actual user. To raise the reliability of security, rules are created after pattern analysis of the collected logs, and a module is developed that collects and compresses the information and keeps it in a data store. The stored information can minimize false positives in traceback through an analysis and policy-based management module that uses an intelligent DBMS security client.