• Title/Summary/Keyword: fact retrieval

Implementation of an XML-Based Editor/Transformer for Large Volume of Similar Documents (XML 기반의 대용량 유사 문서 편집기/변환기 구현)

  • 황인준
    • The Journal of Society for e-Business Studies
    • /
    • v.9 no.1
    • /
    • pp.21-38
    • /
    • 2004
  • With its recent popularity, the Web is now considered a huge repository of information. Most documents on the Web have been created using HTML (HyperText Markup Language). Even though HTML is simple and easy to learn, it has several features that are obstacles to efficient information retrieval. XML (eXtensible Markup Language) can provide a solution to such problems and, in fact, has already been used in many applications. XML is a standard markup language for exchanging data on the Web. It can describe a document structure freely by defining its DTD, which enables efficient integration and retrieval of data on the Web. In this paper, we propose a versatile and efficient XML document manager. Its features include (i) a form-based XML editor that enables easy creation of new XML documents, (ii) an automatic document converter that can transform HTML documents with similar structure into XML documents, and (iii) a GUI-based DTD editor.

A Study on the Feasibility of Full-Text Information Retrieval System Based on Document Content Structure (문헌의 내용단위구조에 의한 전문검색시스템의 타당성 고찰)

  • Lee Byeong-Ki
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.32 no.1
    • /
    • pp.129-154
    • /
    • 1998
  • Online full-text databases are increasing, but conventional full-text information retrieval (IR) systems have been shown to have a high recall ratio and a low precision ratio. One disadvantage of full-text IR systems is that they are not designed to reflect the user's information need, because they have been designed around the physical and logical structure of documents without considering document content. Therefore, this study examined the feasibility of using document content structure in a full-text IR system to resolve this disadvantage of conventional systems. A total of 180 journal articles were analyzed to find a common structure of document content, and a general model of the structure of journal articles was developed. The results show a relationship among the user's cognitive schema structure, the user's information need, and the content structure of documents. It is thus concluded that full-text IR systems need to be designed using document content structure in order to meet users' information needs more effectively.

A Development of Ontology-Based Law Retrieval System: Focused on Railroad R&D Projects (온톨로지 기반 법령 검색시스템의 개발: 철도·교통 분야 연구개발사업을 중심으로)

  • Won, Min-Jae;Kim, Dong-He;Jung, Hae-Min;Lee, Sang Keun;Hong, June Seok;Kim, Wooju
    • The Journal of Society for e-Business Studies
    • /
    • v.20 no.4
    • /
    • pp.209-225
    • /
    • 2015
  • Research and development projects in the railroad domain differ from those in other domains in their close relationship with laws. Cases have been reported in which new technologies from R&D projects could not be industrialized because relevant laws restricted them. This problem arises because researchers do not know exactly which laws can affect the results of R&D projects. To deal with this problem, we suggest a model for a law retrieval system that researchers on railroad R&D projects can use to find related legislation. The input to this system is a research plan describing the main contents of a project. The system then returns the laws related to the R&D project together with their rankings, which are assigned by scores we developed. The ranking of a law indicates its order of priority to be checked. By using this system, researchers can search for the laws that may affect R&D projects throughout all stages of the project cycle. Using our system model, researchers can therefore obtain a list of laws to consider before the project they participate in ends. As a result, they can adjust their project direction by checking the law list and keep their carefully developed projects from becoming useless.

AgeCAPTCHA: an Image-based CAPTCHA that Annotates Images of Human Faces with their Age Groups

  • Kim, Jonghak;Yang, Joonhyuk;Wohn, Kwangyun
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.8 no.3
    • /
    • pp.1071-1092
    • /
    • 2014
  • Annotating images with tags that describe the content of the images facilitates image retrieval. However, this task is challenging for both humans and computers. In response, a new approach has been proposed that converts the manual image annotation task into CAPTCHA challenges. However, this approach has not been widely used because of its weak security and the fact that it can be applied only to annotate for a specific type of attribute clearly separated into mutually exclusive categories (e.g., gender). In this paper, we propose a novel image annotation CAPTCHA scheme, which can successfully differentiate between humans and computers, annotate image content difficult to separate into mutually exclusive categories, and generate verified test images difficult for computers to identify but easy for humans. To test its feasibility, we applied our scheme to annotate images of human faces with their age groups and conducted user studies. The results showed that our proposed system, called AgeCAPTCHA, annotated images of human faces with high reliability, yet the process was completed by the subjects quickly and accurately enough for practical use. As a result, we have not only verified the effectiveness of our scheme but also increased the applicability of image annotation CAPTCHAs.

A Study on Query Expansion Using Concept (개념을 이용한 질의 확장에 관한 연구)

  • Han Jung-Soo;Kim Gui-Jung
    • The Journal of the Korea Contents Association
    • /
    • v.5 no.1
    • /
    • pp.135-145
    • /
    • 2005
  • Without detailed knowledge of a retrieval collection, most users find it difficult to formulate effective queries. In fact, most users may spend a large amount of time formulating queries in order to obtain their desired result. One method to overcome this difficulty is query expansion, which reformulates the initial query into a better one. In this paper, we propose a concept-based query evaluation method that uses the concept of the class retrieved by the initial query. This concept is expanded through a thesaurus. To evaluate the efficiency of query expansion, we determined the most critical value through a simulation and compared precision and recall.

Simple Signal Reconstruction by Faster Adaptive MRP Algorithm (고속 적응 MRP 알고리즘에 의한 저주파 신호 복원)

  • Jeong, Won-Yong;Kim, Jong-Su;Choe, Tae-Won;Bae, Jin-Ho
    • The Journal of the Acoustical Society of Korea
    • /
    • v.11 no.2
    • /
    • pp.5-14
    • /
    • 1992
  • In the fields of astronomy, communication, X-ray crystallography, and engineering, it is a very important and useful fact that the original signal can be reconstructed from partial information about the signal, namely only its spectral magnitude or phase. In this paper, we propose a modified iterative algorithm to solve the Magnitude Retrieval Problem (MRP) for 1-D and 2-D signals. In order to accelerate the convergence rate, the unit constant initial function used in the references is replaced by an exponential initial function in the modified adaptive iterative method. As a result, MRP for 1-D signals and low-pass detail images is significantly improved in terms of both iterative convergence rate and computer memory usage.

Optimization of Case-based Reasoning Systems using Genetic Algorithms: Application to Korean Stock Market (유전자 알고리즘을 이용한 사례기반추론 시스템의 최적화: 주식시장에의 응용)

  • Kim, Kyoung-Jae;Ahn, Hyun-Chul;Han, In-Goo
    • Asia Pacific Journal of Information Systems
    • /
    • v.16 no.1
    • /
    • pp.71-84
    • /
    • 2006
  • Case-based reasoning (CBR) is a reasoning technique that reuses past cases to find a solution to a new problem. It often shows significant promise for improving the effectiveness of complex and unstructured decision making. For this reason, it has been applied to various problem-solving areas including manufacturing, finance, and marketing. However, the design of appropriate case indexing and retrieval mechanisms to improve the performance of CBR is still a challenging issue. Most previous studies on CBR have focused on the similarity function or on optimization of case features and their weights. According to some prior research, however, finding the optimal k parameter for the k-nearest neighbor (k-NN) method is also crucial for improving the performance of a CBR system. Despite this, there have been few attempts to optimize the number of neighbors, especially using artificial intelligence (AI) techniques. In this study, we introduce a genetic algorithm (GA) to optimize the number of neighbors to combine. This study applies the novel approach to the Korean stock market. Experimental results show that the GA-optimized k-NN approach outperforms other AI techniques for stock market prediction.
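
The idea of searching for the number of neighbors k with a genetic algorithm can be illustrated with a minimal sketch. This is not the authors' implementation: the leave-one-out fitness, the averaging crossover, the mutation rate, the population size, and the toy `cases` data are all assumptions chosen only to keep the example short.

```python
import random
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training cases nearest to `query`."""
    nearest = sorted(
        train,
        key=lambda case: sum((a - b) ** 2 for a, b in zip(case[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loo_accuracy(cases, k):
    """Leave-one-out accuracy of k-NN on `cases`; used as the GA fitness."""
    hits = 0
    for i, (features, label) in enumerate(cases):
        rest = cases[:i] + cases[i + 1:]
        hits += knn_predict(rest, features, k) == label
    return hits / len(cases)


def evolve_k(cases, k_max, pop_size=6, generations=10, seed=0):
    """Toy GA over integer k in [1, k_max] (hypothetical operators)."""
    rng = random.Random(seed)
    population = [rng.randint(1, k_max) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda k: loo_accuracy(cases, k), reverse=True)
        parents = ranked[: pop_size // 2]  # selection: keep the fitter half
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = rng.sample(parents, 2)
            child = (a + b) // 2  # crossover: integer average of two parents
            if rng.random() < 0.3:  # mutation: nudge k up or down by one
                child += rng.choice([-1, 1])
            children.append(min(max(child, 1), k_max))
        population = parents + children
    return max(population, key=lambda k: loo_accuracy(cases, k))


# Toy case base (assumed data): two well-separated classes.
cases = [
    ((1.0, 1.0), "up"), ((1.0, 2.0), "up"), ((2.0, 1.0), "up"),
    ((2.0, 2.0), "up"), ((1.5, 1.5), "up"),
    ((5.0, 5.0), "down"), ((5.0, 6.0), "down"), ((6.0, 5.0), "down"),
    ((6.0, 6.0), "down"), ((5.5, 5.5), "down"),
]
best_k = evolve_k(cases, k_max=9)
print(best_k, loo_accuracy(cases, best_k))
```

Because the fitter half of each generation is carried over unchanged, the best k found so far is never lost; a real system would replace the toy fitness with validation performance on held-out cases.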

INCREASING TREND OF ANGSTROM EXPONENT OVER EAST ASIAN WATERS OBSERVED IN 1998-2005 SEAWIFS DATA SET

  • Fukushima, Hajime;Liping, Li;Takeno, Keisuke
    • Proceedings of the KSRS Conference
    • /
    • 2007.10a
    • /
    • pp.57-60
    • /
    • 2007
  • Monthly mean data of the Ångström exponent and aerosol optical thickness (AOT) from Sea-viewing Wide Field-of-view Sensor (SeaWiFS) measurements over the East Asian waters were analyzed. An increasing trend in the satellite-derived Ångström exponent from 1998 to 2004 was found, while the AOT mean remained stable during the same period. The trend in the Ångström exponent is then interpreted as an increase in the fraction of small aerosol particles, to give quantitative estimates of the variability of aerosols. The mean increase is evaluated to be 4-5% over the 7-year period in terms of the contribution of small particles to the total AOT, or sub-micron fraction (SMF). The possibility that the observed trend arises from sensor calibration or algorithm performance was carefully checked, which confirms our belief that the observed trend is a real fact rather than an artifact of data processing. Another time series of SMF data (2000-2005), estimated from the fine-mode fraction (FMF) of the Moderate Resolution Imaging Spectroradiometer (MODIS), supports this observation despite its different calibration system and retrieval algorithms.

Automatic Extraction of Opinion Words from Korean Product Reviews Using the k-Structure (k-Structure를 이용한 한국어 상품평 단어 자동 추출 방법)

  • Kang, Han-Hoon;Yoo, Seong-Joon;Han, Dong-Il
    • Journal of KIISE:Software and Applications
    • /
    • v.37 no.6
    • /
    • pp.470-479
    • /
    • 2010
  • With regard to the extraction of opinion words, it may be difficult to directly apply most of the methods suggested in existing English studies to the Korean language. Additionally, the manual method suggested by studies in Korea is problematic in that extracting opinion words takes a long time. English thesaurus-based extraction of Korean opinion words, in turn, suffers from a loss of precision attributed to the one-to-one mismatch between Korean and English words. Studies based on Korean phrase analyzers may fail because they select opinion words with a low frequency. Therefore, this study suggests the k-Structure (k=5 or 8) method, which may improve precision while complementing existing studies in Korea, for automatically extracting opinion words from a simple sentence in a given Korean product review. A simple sentence is defined as one composed of at least 3 words, i.e., a sentence including an opinion word within a distance of ±2 from the attribute name (e.g., the 'battery' of a camera) of an evaluated product (e.g., a 'camera'). In the performance experiment, opinion words for 8 previously given attribute names were automatically extracted with k-Structure from 1,868 product reviews collected from major domestic shopping malls, and their precision was estimated. The results showed that k=5 led to a recall of 79.0% and a precision of 87.0%, while k=8 led to a recall of 92.35% and a precision of 89.3%. A test was also conducted using PMI-IR (Pointwise Mutual Information - Information Retrieval), one of the methods suggested in English studies, which resulted in a recall of 55% and a precision of 57%.
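
The ±2 distance criterion and the precision/recall figures quoted above can be made concrete with a small sketch. It does not reproduce the k-Structure method itself, which the abstract does not specify; the whitespace tokenization, the English example sentence, the `lexicon`, and the `gold` set are hypothetical stand-ins.

```python
def window_candidates(tokens, attribute, distance=2):
    """Return tokens within `distance` positions of each occurrence of `attribute`."""
    found = set()
    for i, tok in enumerate(tokens):
        if tok == attribute:
            lo = max(0, i - distance)
            hi = min(len(tokens), i + distance + 1)
            # Collect neighbours in the window, skipping the attribute itself.
            found.update(t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i)
    return found


def precision_recall(extracted, gold):
    """Precision and recall of an extracted word set against a gold set."""
    true_pos = len(extracted & gold)
    precision = true_pos / len(extracted) if extracted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical review sentence, opinion lexicon, and gold annotation.
tokens = "the battery is excellent but heavy".split()
lexicon = {"excellent", "heavy", "poor"}
gold = {"excellent", "heavy"}

extracted = window_candidates(tokens, "battery") & lexicon
precision, recall = precision_recall(extracted, gold)
print(extracted, precision, recall)
```

Here "heavy" lies four positions from "battery", so the ±2 window misses it: the extraction is fully precise but recovers only half of the gold opinion words, which illustrates the precision/recall trade-off that the window size controls.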

A Study on the Efficiency & Limitation of 3D Animation Production Management Using Production Management Tool - Focusing on Shotgun Software & Ftrack (3D 애니메이션 제작 관리를 위한 제작관리도구(Tool)의 효율성 및 한계 - 샷건(Shotgun)과 Ftrack(에프트랙)을 중심으로)

  • Lee, Esther Kkotsongyi
    • Cartoon and Animation Studies
    • /
    • s.49
    • /
    • pp.1-23
    • /
    • 2017
  • 3D animation production holds a pivotal position in the current animation industry, and the need for professional management tools for 3D animation production has been raised owing to its sophisticated pipeline, which results from advances in technology and the trend toward global production partnerships. Shotgun and Ftrack provide the most appropriate toolsets for 3D animation management among the existing management tools, and their efficiency is identified in comparison with the traditional document-oriented management style. The biggest strength of production management using Shotgun is that all production staff can participate directly in communication on the tools, so they can share information on Shotgun & Ftrack in real time without constraints of time and location. Moreover, the entire production process and the history of discussions on particular production issues accrue systematically on the tool, so the production history can be easily tracked. Finally, tool-based production management helps the production management team in a studio collect and analyze production information. However, Shotgun & Ftrack use a metadata-based retrieval method, which demands a large amount of manual annotation effort and has limited accuracy. In addition, the fact that studios must first have technical professionals in order to introduce the tools is a practical difficulty for Korean studios that want to use management tools for their projects. Thus, this paper suggests adopting a content-based retrieval system in the tools and expanding the tools' technical service for studios as solutions to the identified issues.