• Title/Summary/Keyword: structured retrieval system

Performance Evaluation of an XQuery-based XML Retrieval System for the Structured Queries (XQuery 기반 XML 검색시스템의 구조적인 질의 검색 성능 평가)

  • Jung, Young-Mi;Kim, Hee-Sop;Shin, Dong-Hyun;Yang, Jung-Shik
    • Proceedings of the Korean Society for Information Management Conference / 2005.08a / pp.295-304 / 2005
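  • XQuery is the draft XML query language standard most recently published by the W3C, and it is designed to be broadly applicable to many kinds of XML data sources. XQuery can also handle retrieval over document structure, not only data content, simply and easily by using path queries. In this study, we design and implement an XML retrieval system that supports XQuery, evaluate the developed system (Litch Search Server) on structured queries through INEX 2004, and describe an outline of the results.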

NVST DATA ARCHIVING SYSTEM BASED ON FASTBIT NOSQL DATABASE

  • Liu, Ying-Bo;Wang, Feng;Ji, Kai-Fan;Deng, Hui;Dai, Wei;Liang, Bo
    • Journal of The Korean Astronomical Society / v.47 no.3 / pp.115-122 / 2014
  • The New Vacuum Solar Telescope (NVST) is a 1-meter vacuum solar telescope that aims to observe the fine structures of active regions on the Sun. The main tasks of the NVST are high-resolution imaging and spectral observations, including measurements of the solar magnetic field. The NVST has collected more than 20 million FITS files since it began routine observations in 2012 and produces up to 120 thousand files in a day. Given the large number of files, effective archiving and retrieval has become a critical and urgent problem. In this study, we implement a new data archiving system for the NVST based on the Fastbit Not Only Structured Query Language (NoSQL) database. Compared to a relational database (i.e., MySQL; My Structured Query Language), the Fastbit database shows distinct advantages in indexing and querying performance. In a large-scale database of 40 million records, the multi-field combined query response time of the Fastbit database is about 15 times faster and fully meets the requirements of the NVST. Our study offers a new approach to massive astronomical data archiving and could contribute to the design of data management systems for other astronomical telescopes.
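
FastBit is built around bitmap indexes, which is one reason multi-field combined queries can be answered quickly: each condition maps to a bitmap, and the conditions are combined with a bitwise AND. The following is a minimal Python sketch of that idea only. It does not use the FastBit API, and the class name `BitmapIndex` and the field names `channel` and `date` are hypothetical, not taken from the NVST archive.

```python
from collections import defaultdict


class BitmapIndex:
    """Toy equality-encoded bitmap index (illustrative, not FastBit itself)."""

    def __init__(self, records, fields):
        self.n = len(records)
        # One bitmap (stored as a Python int) per distinct value of each field.
        self.bitmaps = {f: defaultdict(int) for f in fields}
        for row, rec in enumerate(records):
            for f in fields:
                self.bitmaps[f][rec[f]] |= 1 << row

    def query(self, **conditions):
        """Return row numbers matching all field=value conditions."""
        mask = (1 << self.n) - 1  # start with every row selected
        for f, value in conditions.items():
            mask &= self.bitmaps[f].get(value, 0)  # AND the per-field bitmaps
        return [row for row in range(self.n) if (mask >> row) & 1]


# Hypothetical file records; the field names are assumptions for illustration.
records = [
    {"channel": "TiO", "date": "d1"},
    {"channel": "Ha", "date": "d1"},
    {"channel": "TiO", "date": "d2"},
]
index = BitmapIndex(records, fields=["channel", "date"])
matches = index.query(channel="TiO", date="d1")
```

Here `matches` holds the row numbers that satisfy both conditions. A production bitmap index would also compress the bitmaps, which this sketch omits.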

XML Repository System Using DBMS and IRS

  • Kang, Hyung-Il;Yoo, Jae-Soo;Lee, Byoung-Yup
    • International Journal of Contents / v.3 no.3 / pp.6-14 / 2007
  • In this paper, we design and implement an XML Repository System (XRS) that exploits the advantages of DBMSs and IRSs. Our scheme uses BRS to support full-text indexing and content-based queries efficiently, and ORACLE to store XML documents, multimedia data, DTDs and structure information. We design databases to manage XML documents that include audio, video and images as well as text. We employ the non-composition model when storing XML documents in ORACLE. We represent structure information as ETID (Element Type ID), SORD (Sibling ORDer) and SSORD (Same Sibling ORDer). ETID is a unique value assigned to each element of the DTD. SORD represents the order among sibling nodes, and SSORD represents the order among sibling nodes with the same element name. To show the superiority of our XRS, we perform various experiments measuring document loading time, document extraction time and content retrieval time. The experiments show that our XRS outperforms existing XML document management systems. We also show through performance experiments that it supports various types of queries.
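
The three structure labels described above can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the paper's implementation: the function name `label_nodes` is hypothetical, and ETIDs are assigned here in order of first appearance in the document, whereas the abstract says they are assigned per element of the DTD.

```python
import xml.etree.ElementTree as ET


def label_nodes(xml_text):
    """Return (tag, ETID, SORD, SSORD) for every element in document order."""
    root = ET.fromstring(xml_text)
    etids = {}  # element name -> ETID (assumed: numbered by first appearance)
    rows = []

    def visit(node, sord, ssord):
        etid = etids.setdefault(node.tag, len(etids) + 1)
        rows.append((node.tag, etid, sord, ssord))
        same_name_count = {}
        for position, child in enumerate(node, start=1):
            same_name_count[child.tag] = same_name_count.get(child.tag, 0) + 1
            # SORD: position among all siblings.
            # SSORD: position among siblings with the same element name.
            visit(child, position, same_name_count[child.tag])

    visit(root, 1, 1)
    return rows


labels = label_nodes("<doc><title/><sec/><fig/><sec/></doc>")
```

In this example the second `sec` element is the fourth sibling overall (SORD 4) but the second `sec` among its siblings (SSORD 2), and it shares its ETID with the first `sec`.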

A Study on Ontology-based Keywords Structuring for Efficient Information Retrieval (연구.학술정보 효율적 검색을 위한 온톨로지 기반의 주제 색인어 구조화 방안 연구)

  • Song, In-Seok
    • Journal of Information Management / v.39 no.4 / pp.121-154 / 2008
  • In this paper, an ontology-based keyword structuring method is proposed to represent the knowledge structure of scholarly documents and to make inferences from the semantic relationships holding among them. The characteristics of the thesaurus as a knowledge organization system (KOS) for subject headings are critically reviewed from the information retrieval point of view. The domain concepts are identified and classified by analyzing the information activities that occur in a general research process, based on the scholarly sensemaking model. The ontological structure of a keyword set is defined in terms of the semantic relationships among the canonical concepts that constitute scholarly documents such as journal articles. As a result, each ontologically structured keyword set represents the knowledge structure of the corresponding document as a semantic index. By means of the axioms and inference rules defined for information needs, users can analytically explore the scholarly communication network built on the semantic relationships among documents, based on the scholarly sensemaking model, in order to efficiently retrieve the relevant information for problem solving.

Lifecycle and Requirements for Digital Collection Management of Thai Theses and Dissertations

  • Jareonruen, Yuttana;Tuamsuk, Kulthida
    • Journal of Information Science Theory and Practice / v.7 no.3 / pp.52-64 / 2019
  • This research aimed to study the situation, problems, and requirements for digital collection lifecycle management of Thai theses and dissertations. The mixed research method used was composed of: (1) A study of the problems and situation, in which the qualitative method was applied. The research site covered 10 higher education institutions where the Thailand Digital Collection (TDC) project is operated. The informants were key administrative officers of the TDC project at each institution. In-depth and structured interviews were conducted on an individual basis to obtain the most accurate answers. (2) A study of requirements based on the quantitative research method, surveying the requirements for the digital collection management system for Thai theses and dissertations from 84 purposively selected TDC project officers and 527 end users selected by accidental sampling, totaling 611 samples. The research findings are as follows: (1) The study of the situation and problems of digital collection lifecycle management shows that Thai higher education institutions manage their digital collections systematically. The management lifecycle is consistent with the Guidance documents for lifecycle management of ETDs, which include seven steps: program planning; creation, submission, and ingestion; access and retrieval of digital objects; archiving and preservation; evaluation and assessment; interoperation (creation of institutional collaboration); and development of link data. (2) The study of requirements for digital collection management of Thai theses and dissertations shows five system requirements: acquisition and gathering, digitization, metadata standards, management of rights, and storage and retrieval, all of which are at the M (mandatory) and D (desirable) levels.

An Agroclimatic Data Retrieval and Analysis System for Microcomputer Users(CLIDAS) (퍼스컴을 이용한 농업기후자료 검색 및 분석시스템)

  • 윤진일;김영찬
    • KOREAN JOURNAL OF CROP SCIENCE / v.38 no.3 / pp.253-263 / 1993
  • Climatological information has not been fully utilized by agricultural research and extension workers in Korea, mainly because the archived climate data are hard to access. This study was initiated to improve access, for agricultural applications, to historical climate data gathered from 72 weather stations of the Korea Meteorological Administration by using a microcomputer-based methodology. The climatological elements include daily values of average, maximum and minimum temperature, relative humidity, average and maximum wind speed, wind direction, evaporation, precipitation, sunshine duration and cloud amount. The menu-driven, user-friendly data retrieval system (CLIDAS) provides quick summaries of the data values on a daily, weekly and monthly basis, and selective retrieval of weather records that meet user-specified critical conditions. Growing degree days and potential evapotranspiration data are also derived from the daily climatic data. Data reports can be output to the computer screen, a printer or ASCII data files. CLIDAS can be run on any IBM-compatible machine with a Video Graphics Array card. To run the system with the whole database, more than 50 Mb of hard disk space should be available. Because of its module-structured design, the system can be easily upgraded to add further functions.
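
As an illustration of deriving growing degree days from daily temperature records, the sketch below uses the common averaging method, GDD = max(0, (Tmax + Tmin) / 2 - Tbase). The abstract does not say which formula CLIDAS uses, so the method, the 10 °C base temperature, and the function names are all assumptions made for this example.

```python
def growing_degree_days(t_max, t_min, t_base=10.0):
    """Daily GDD by the averaging method (assumed; not confirmed for CLIDAS)."""
    return max(0.0, (t_max + t_min) / 2.0 - t_base)


def accumulated_gdd(daily_records, t_base=10.0):
    """Sum daily GDD over a sequence of (t_max, t_min) pairs in degrees Celsius."""
    return sum(growing_degree_days(hi, lo, t_base) for hi, lo in daily_records)


# Hypothetical three-day record of (maximum, minimum) temperature.
sample_days = [(25.0, 15.0), (12.0, 4.0), (30.0, 20.0)]
total = accumulated_gdd(sample_days)
```

The second day contributes nothing because its mean temperature (8 °C) is below the assumed base, which is the usual clamping behavior of this method.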

Similarity checking between XML tags through expanding synonym vector (유사어 벡터 확장을 통한 XML태그의 유사성 검사)

  • Lee, Jung-Won;Lee, Hye-Soo;Lee, Ki-Ho
    • Journal of KIISE:Software and Applications / v.29 no.9 / pp.676-683 / 2002
  • The success of XML (eXtensible Markup Language) is primarily based on its flexibility: everybody can define the structure of XML documents that represent information in the form he or she desires. XML is so flexible that XML documents cannot be automatically provided with underlying semantics. Different tag sets, different names for elements or attributes, or different document structures in general hinder the task of classifying and clustering XML documents precisely. In this paper, we design and implement a system that checks the semantic similarity between XML tags. First, the system extracts the underlying semantics of tags and then expands the synonym set of the tags using the WordNet thesaurus and a user-defined word library that supports the abbreviated forms and compound words found in XML tags. Second, considering the relative importance of XML tags in XML documents, we extend the conventional vector space model, which is the most widely used document model in the Information Retrieval field. Using this method, we have been able to check the similarity between XML tags that are represented by different tag names.
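
The general idea of expanding a tag's term vector with synonyms before comparing vectors can be sketched as follows. This is a simplified illustration, not the authors' system: the tiny `SYNONYMS` table stands in for WordNet and the user-defined word library, and the 0.5 weight given to synonyms is an arbitrary assumption.

```python
import math
from collections import Counter

# Hypothetical stand-in for a thesaurus lookup (the paper uses WordNet).
SYNONYMS = {
    "author": ["writer", "creator"],
    "writer": ["author"],
}


def expand(terms, synonym_weight=0.5):
    """Build a term vector, adding each term's synonyms at a reduced weight."""
    vector = Counter()
    for term in terms:
        vector[term] += 1.0
        for synonym in SYNONYMS.get(term, []):
            vector[synonym] += synonym_weight
    return vector


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


plain = cosine(Counter(["author"]), Counter(["writer"]))
expanded = cosine(expand(["author"]), expand(["writer"]))
unrelated = cosine(expand(["author"]), expand(["price"]))
```

Without expansion the tags `author` and `writer` share no terms, so their similarity is zero. After expansion they overlap and the similarity becomes positive, while an unrelated tag such as `price` stays at zero.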

Data Modeling for Cell-Signaling Pathway Database (세포 신호전달 경로 데이타베이스를 위한 데이타 모델링)

  • 박지숙;백은옥;이공주;이상혁;이승록;양갑석
    • Journal of KIISE:Databases / v.30 no.6 / pp.573-584 / 2003
  • The massive amounts of data recently generated by genomics and proteomics require bioinformatic tools to extract biological meaning from the results. Here we introduce ROSPath, a database system that deals with information on reactive oxygen species (ROS)-mediated cell signaling pathways. It provides a structured repository for handling pathway-related data, and tools for querying, displaying, and analyzing pathways. The ROSPath data model provides extensibility for representing incomplete knowledge and accessibility for linking to existing biochemical databases via the Internet. For flexibility and efficient retrieval, a hierarchically structured data model is defined using the object-oriented model. There are two major data types in the ROSPath data model: ‘bio entity’ and ‘interaction’. A bio entity represents a single biochemical entity: a protein or protein state involved in ROS cell-signaling pathways. An interaction, characterized by a list of inputs and outputs, describes various types of relationships among bio entities. Typical interactions are protein state transitions, chemical reactions, and protein-protein interactions. A complex network can be constructed from the ROSPath data model, which thus provides a foundation for describing and analyzing various biochemical processes.
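
The two data types named above, and the way a network follows from them, can be sketched in Python. This is only a rough illustration under stated assumptions: the class and field names are hypothetical rather than the ROSPath schema, and the entities `ProteinA` and `ProteinB` are placeholders, not real pathway data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BioEntity:
    """A single biochemical entity: a protein or a protein state."""
    name: str
    state: str = ""


@dataclass
class Interaction:
    """A relationship among bio entities, given as inputs and outputs."""
    kind: str      # e.g. "state transition", "protein-protein interaction"
    inputs: list
    outputs: list


def build_network(interactions):
    """Map each input entity to the set of entities it leads to."""
    network = {}
    for interaction in interactions:
        for source in interaction.inputs:
            network.setdefault(source, set()).update(interaction.outputs)
    return network


# Placeholder entities for illustration only.
a_inactive = BioEntity("ProteinA", "inactive")
a_active = BioEntity("ProteinA", "active")
b = BioEntity("ProteinB")

pathway = build_network([
    Interaction("state transition", [a_inactive], [a_active]),
    Interaction("protein-protein interaction", [a_active], [b]),
])
```

Chaining interactions through shared entities is what lets a pathway be traversed as a graph: here `ProteinA` in its inactive state leads to its active state, which in turn leads to `ProteinB`.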

An Experimental Comparison on Visualization Techniques of Long Menu-Lists (긴 메뉴항목 리스트의 시각화 기법 비교에 관한 실험적 연구)

  • Seo, Eun-Gyoung;Sung, Hye-Eun
    • Journal of the Korean Society for Information Management / v.24 no.2 / pp.71-87 / 2007
  • With the rapid change of the Web and e-transaction applications, search interfaces are providing more powerful search and visualization methods while integrating technology more smoothly with the task. In particular, visualization techniques for long menu-lists are applied in retrieval systems with the goal of improving users' ability to select one item from a long list. To review which visualization techniques suit different types of users and data sets, this study compared five visualization browsers: the Tree-structured menu, the Table-of-contents menu, the Roll-over menu, the Click menu, and the Fisheye menu. The results of the general analyses show that, among the hierarchical methods, the experienced group prefers the Table-of-contents menu, whereas the novice group prefers the Tree-structured menu. Among the linear methods, both groups prefer the Roll-over menu. The Roll-over menu is also the most preferred of the five browsers in both groups.

Design and Implementation of Multimedia Monitoring System Using WebCam Structure (WebCam을 이용한 멀티미디어 보안시스템의 설계와 구현)

  • 송은성;오용선
    • Proceedings of the Korea Contents Association Conference / 2003.11a / pp.161-166 / 2003
  • In this paper, we propose a novel method for the design and implementation of a multimedia monitoring system using a Web camera. WebCams have recently been applied in many different areas, with performance improved by the convenient functions of the Web. Multimedia moving pictures are now widely used in monitoring systems in various areas, where data compression and today's fast communication networks enhance performance and service. The WebCam system design method presented in this paper may offer not only convenient monitoring functions but also great application capabilities. It can be used as a real-time application for transmitting multimedia pictures and audio, so that the monitoring system can manage security information with a sense of realism. In addition, the monitoring system may be used as a non-real-time application using the data storage and retrieval features of the Web. The implemented system offers both monitoring functions in this structured form.