• Title/Summary/Keyword: Full-text Management System

The Current Status of Utilization and Demand on Cancer Information in the Faculties of Medical School in Korea (국내 의과대학 교수의 암정보 활용 현황과 요구도)

  • Lim, Min-Kyung;Park, Sook-Kyung;Yang, Jeong-Hee;Lee, Young-Sung
    • Journal of Preventive Medicine and Public Health
    • /
    • v.36 no.1
    • /
    • pp.39-46
    • /
    • 2003
  • Objectives: To investigate the availability of and demand for cancer-related information, and to establish a basic plan for constructing a cancer database and information system based on research results from Korea. Methods: Postal and telephone surveys of 323 faculty members at medical universities and colleges in Korea were carried out between August 2001 and November 2001. The data on the present status of, and demand for, health and cancer-related information were analyzed with descriptive statistical methods. Results: Most subjects (over 80%) used health-related information provided by foreign Internet services such as Medline, while no comparably comprehensive information system existed in Korea. A cancer-related database of domestic research results was found to be in great demand. Information on registration and statistics (52.8%), study results (48.5%), and study resources (37.4%) was the main content required in the database. In constructing such a database, a full-text service, continuous updating of data, and the development of a standardized, user-friendly search tool were regarded as necessary components. An information sharing system for cancer-related clinical trials was judged to be quite feasible. Conclusion: This study demonstrated the great importance of cancer information systems and the strong demand for an accessible cancer-related database based on Korean research results.

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model (키워드 자동 생성에 대한 새로운 접근법: 역 벡터공간모델을 이용한 키워드 할당 방법)

  • Cho, Won-Chin;Rho, Sang-Kyu;Yun, Ji-Young Agnes;Park, Jin-Soo
    • Asia Pacific Journal of Information Systems
    • /
    • v.21 no.1
    • /
    • pp.103-122
    • /
    • 2011
  • Recently, numerous documents have been made available electronically. Internet search engines and digital libraries commonly return query results containing hundreds or even thousands of documents. In this situation, it is virtually impossible for users to examine complete documents to determine whether they might be useful. For this reason, some on-line documents are accompanied by a list of keywords specified by the authors to guide users and facilitate the filtering process. A set of keywords is thus often considered a condensed version of the whole document and plays an important role in document retrieval, Web page retrieval, document clustering, summarization, text mining, and so on. Since many academic journals ask authors to provide a list of five or six keywords on the first page of an article, keywords are most familiar in the context of journal articles. However, many other types of documents could also benefit from the use of keywords, including Web pages, email messages, news reports, magazine articles, and business papers. Although the potential benefit is large, the implementation itself is the obstacle: manually assigning keywords to all documents is a daunting and often impractical task, because it is extremely tedious and time-consuming and requires a certain level of domain knowledge. Therefore, it is highly desirable to automate the keyword generation process. There are two main approaches to achieving this aim: the keyword assignment approach and the keyword extraction approach. Both approaches use machine learning methods and require, for training purposes, a set of documents with keywords already attached. In the former approach, there is a given vocabulary, and the aim is to match its terms to the texts. In other words, the keyword assignment approach seeks to select the words from a controlled vocabulary that best describe a document. 
Although this approach is domain dependent and is not easy to transfer and expand, it can generate implicit keywords that do not appear in a document. In the latter approach, on the other hand, the aim is to extract keywords according to their relevance in the text, without a prior vocabulary. In this approach, automatic keyword generation is treated as a classification task, and keywords are commonly extracted using supervised learning techniques. Thus, keyword extraction algorithms classify candidate keywords in a document as positive or negative examples. Several systems, such as Extractor and Kea, were developed using the keyword extraction approach. The most indicative words in a document are selected as keywords for that document; as a result, keyword extraction is limited to terms that appear in the document and cannot generate implicit keywords that are not included in it. According to Turney's experimental results, about 64% to 90% of the keywords assigned by authors can be found in the full text of an article. Conversely, 10% to 36% of the keywords assigned by authors do not appear in the article and cannot be generated by keyword extraction algorithms. Our preliminary experiment also shows that 37% of the keywords assigned by authors are not included in the full text. This is why we decided to adopt the keyword assignment approach. In this paper, we propose a new approach to automatic keyword assignment, namely IVSM (Inverse Vector Space Model). The model is based on the vector space model, a conventional information retrieval model that represents documents and queries as vectors in a multidimensional space. IVSM generates an appropriate keyword set for a specific document by measuring the distance between the document and the keyword sets. 
The keyword assignment process of IVSM is as follows: (1) calculating the vector length of each keyword set based on each keyword weight; (2) preprocessing and parsing a target document that does not have keywords; (3) calculating the vector length of the target document based on term frequency; (4) measuring the cosine similarity between each keyword set and the target document; and (5) generating the keywords that have high similarity scores. Two keyword generation systems were implemented using IVSM: an IVSM system for a Web-based community service and a stand-alone IVSM system. The first is implemented in a community service for sharing knowledge and opinions on current topics such as fashion, movies, social problems, and health information. The stand-alone IVSM system is dedicated to generating keywords for academic papers, and it has been tested on a number of academic papers, including those published by the Korean Association of Shipping and Logistics, the Korea Research Academy of Distribution Information, the Korea Logistics Society, the Korea Logistics Research Association, and the Korea Port Economic Association. We measured the performance of IVSM by the number of matches between the IVSM-generated keywords and the author-assigned keywords. According to our experiment, the precision of IVSM applied to the Web-based community service and to academic journals was 0.75 and 0.71, respectively. Both systems perform much better than baseline systems that generate keywords based on simple probability. IVSM also shows performance comparable to that of Extractor, a representative keyword extraction system developed by Turney. As electronic documents increase, we expect that the IVSM proposed in this paper can be applied to many electronic documents in Web-based communities and digital libraries.
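
The five steps listed in this abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how keyword weights are obtained, so the `profiles` dictionary (a weighted term vector per candidate keyword), the simple regex tokenizer, and the `top_k` cut-off are all assumptions introduced here.

```python
import math
import re
from collections import Counter


def norm(vec):
    """Euclidean length of a sparse vector stored as {term: weight}."""
    return math.sqrt(sum(w * w for w in vec.values()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    na, nb = norm(a), norm(b)
    if na == 0 or nb == 0:
        return 0.0
    dot = sum(w * b.get(term, 0) for term, w in a.items())
    return dot / (na * nb)


def tokenize(text):
    """Hypothetical preprocessing: lowercase and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def assign_keywords(text, profiles, top_k=2):
    """Rank candidate keywords by cosine similarity to the document.

    profiles: {keyword: {term: weight}} -- an assumed representation of the
    keyword sets; the document itself is a raw term-frequency vector.
    """
    doc = Counter(tokenize(text))
    scored = [(cosine(profile, doc), kw) for kw, profile in profiles.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [kw for score, kw in scored[:top_k] if score > 0]
```

Calling `assign_keywords(document_text, profiles)` returns at most `top_k` keywords whose profiles share terms with the document, which is how a keyword that never appears literally in the text can still be assigned.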

A Study of Digitalization Performance of Sinological Resource in Korea (고문헌의 디지털화 성과 연구)

  • Cho Hyung-Jin
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.40 no.3
    • /
    • pp.391-413
    • /
    • 2006
  • This study analyzed the procedures and contents of the digitalization of sinological resources owned by major sinological resource institutes in Korea. It investigated the united organizations that use such sinological resources. It also assessed governmental policies and future plans for the digitalization of sinological resources. Finally, it proposed the steps and conditions necessary for successful digitalization of sinological resources. (1) The digitalization of the library management, searching, and usage systems of national, university, and research libraries, applied since the 1980s, is already highly advanced. The amount of sinological resources collected is significant and its substantive value is very high. Some of the digitalized resources are already distributed on the Internet. However, the digitalization of sinological resources still falls short in some respects and requires further effort. (2) The databases of digitalized sinological resources already available can be grouped into bibliographic information, contents and annotation, and full text, and they include both domestic and foreign resources. The quantities of resources are as described in the body. (3) The types of digital sinological resources include ancient books, archives, microforms, and book blocks. (4) The DB encoding methods for digital sinological resources include text, image, PDF, etc. (5) The united organizations of sinological resources make it possible to avoid duplicated investigation and to enhance service efficiency. The following factors should be considered in order to accomplish ideal digitalization of sinological resources. (1) First of all, it is necessary to organize a control center for the digitalization of old materials and to give it a certain degree of authority to develop and carry out a comprehensive plan. (2) Both short- and long-term plans need to be developed in order to analyze various aspects of the digitalization process, and their steps need to be taken gradually. (3) It is necessary to train experts in old materials and have them construct and manage the DB.

A Study on the CD-ROM Network (LAN) (CD-ROM 네트워크(LAN)에 관한 소고(小考))

  • Kil, Hyung-Do
    • Journal of Information Management
    • /
    • v.21 no.2
    • /
    • pp.9-23
    • /
    • 1990
  • CD-ROM technology, less than 10 years after its development, has grown rapidly and is already used in several practical applications. Bibliographic data, numeric data, audio, images, and pictures are recorded on CD-ROM as abstracts or full text and offered to industry and to information services, including libraries, where they support library staff and information retrieval. Freed from the need for one disc drive and one computer to access each disc, an ideal system can now be organized in which several CD-ROMs are searched through a single drive and several users access different information, because networking is possible through a LAN. This article examines the functions and types, characteristics, system, structure, data blocks, production procedure, and standardization of the CD-ROM LAN.

A Literature Review and Classification of Recommender Systems on Academic Journals (추천시스템관련 학술논문 분석 및 분류)

  • Park, Deuk-Hee;Kim, Hyea-Kyeong;Choi, Il-Young;Kim, Jae-Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.1
    • /
    • pp.139-152
    • /
    • 2011
  • Recommender systems have become an important research field since the first paper on collaborative filtering appeared in the mid-1990s. In general, recommender systems are defined as supporting systems that help users find information, products, or services (such as books, movies, music, digital products, web sites, and TV programs) by aggregating and analyzing suggestions from other users, meaning reviews from various authorities, and user attributes. However, although academic research on recommender systems has increased significantly over the last ten years, more research is needed before it can be applied in real-world situations, because the field is still broad and less mature than other research fields. Accordingly, the existing articles on recommender systems need to be reviewed with a view to the next generation of recommender systems. However, given the nature of recommender system research, it is not easy to confine it to specific disciplines. We therefore reviewed all articles on recommender systems published from 2001 to 2010 in 37 journals, selected from the top 125 journals of the MIS Journal Rankings. The literature search was based on the descriptors "Recommender system", "Recommendation system", "Personalization system", "Collaborative filtering" and "Contents filtering". The full text of each article was reviewed to eliminate articles that were not actually related to recommender systems. Conference papers, master's and doctoral dissertations, textbooks, unpublished working papers, non-English publications, and news items were excluded as unfit for our research. We classified articles by year of publication, journal, recommendation field, and data mining technique. 
The recommendation fields and data mining techniques of 187 articles were reviewed and classified into eight recommendation fields (book, document, image, movie, music, shopping, TV program, and others) and eight data mining techniques (association rule, clustering, decision tree, k-nearest neighbor, link analysis, neural network, regression, and other heuristic methods). The results presented in this paper have several significant implications. First, based on previous publication rates, interest in recommender system research will grow significantly in the future. Second, 49 articles are related to movie recommendation, whereas image and TV program recommendation are addressed in only 6 articles. This result reflects the easy availability of the MovieLens data set, so data sets for other fields need to be prepared. Third, social network analysis has recently been used in various applications, but studies on recommender systems using social network analysis are lacking. We expect that new recommendation approaches using social network analysis will be developed, and evaluating recommender system research that uses social network analysis will be an interesting area for further research. This study shows the trend of recommender system research by examining the published literature, and provides practitioners and researchers with insight and future direction on recommender systems. We hope that this research helps anyone who is interested in recommender systems research to gain insight for future research.

A Case Study on the Application of AI-OCR for Data Transformation of Paper Records (종이기록 데이터화를 위한 AI-OCR 적용 사례연구)

  • Ahn, Sejin;Hwang, Hyunho;Yim, Jin Hee
    • Journal of the Korean Society for Information Management
    • /
    • v.39 no.3
    • /
    • pp.165-193
    • /
    • 2022
  • Digital technology is at the center of change in the modern work environment. In particular, in public institutions that document their work with records produced by business management systems and document production systems, the record management system is itself the work environment. Gimpo City applied for the 2021 public cloud leading project of the National Information Society Agency (NIA) to respond proactively to the technologies of the 4th industrial revolution, and implemented a public cloud-based AI-OCR technology enhancement project with 330 million won in support. Through this project, non-electronic records, previously limited to search and image viewing that depend on standardized index values, were converted into data. In addition, a 98% recognition rate was achieved by applying a new technology called AI-OCR. Since digital technology has been used to improve work efficiency, productivity, development cost, and the level of record management service for internal and external users, we would like to share a direction for enhancing expertise in record management and implementing work environment innovation.

Effective Picture Search in Lifelog Management Systems using Bluetooth Devices (라이프로그 관리 시스템에서 블루투스 장치를 이용한 효과적인 사진 검색 방법)

  • Chung, Eun-Ho;Lee, Ki-Yong;Kim, Myoung-Ho
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.4
    • /
    • pp.383-391
    • /
    • 2010
  • A lifelog management system provides users with services to store, manage, and search their lifelogs. This paper proposes a fully automatic method for collecting real-world social contacts and a lifelog search engine that uses the collected social contact information as keywords. Short-range wireless network devices in mobile phones are used to detect the social contacts of their users. A human-Bluetooth relationship matrix is built from the frequency with which a person and a Bluetooth device are observed at the same time. Results show that when only 20% of the full social contact information over the observation times is used for the calculation, 90% of the human-Bluetooth relationships can be correctly acquired. A lifelog search engine that takes human names as keywords is proposed; using the vector information retrieval model, it compares two vectors: a row of the human-Bluetooth matrix and a vector of the Bluetooth devices scanned when a lifelog was created. This search engine returns more lifelogs than existing text-matching search engines and, unlike them, ranks the results.
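
The matrix construction and vector comparison described in this abstract could look roughly like the sketch below. It is an illustration under stated assumptions rather than the paper's method: the input format of `observations` (pairs of people present and devices scanned at the same moment), the binary scan vector per lifelog, and the function names `build_matrix` and `search` are all hypothetical, and cosine similarity stands in for the unspecified vector-model score.

```python
import math
from collections import defaultdict


def build_matrix(observations):
    """Count how often each person and each Bluetooth device co-occur.

    observations: iterable of (people_present, devices_scanned) pairs,
    an assumed input format. Returns {person: {device: count}}.
    """
    matrix = defaultdict(lambda: defaultdict(int))
    for people, devices in observations:
        for person in people:
            for device in devices:
                matrix[person][device] += 1
    return matrix


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    if na == 0 or nb == 0:
        return 0.0
    return sum(v * b.get(k, 0) for k, v in a.items()) / (na * nb)


def search(name, matrix, lifelogs):
    """Rank lifelogs for a person name.

    lifelogs: {lifelog_id: list of devices scanned when it was created}.
    Only lifelogs with a non-zero score are returned, best match first.
    """
    row = matrix.get(name, {})
    ranked = []
    for log_id, devices in lifelogs.items():
        scan = {device: 1 for device in devices}
        score = cosine(row, scan)
        if score > 0:
            ranked.append((score, log_id))
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [log_id for _, log_id in ranked]
```

Because the query is a row of co-occurrence counts rather than a text string, a photo can be returned for a name even when no caption mentions that person, and the similarity score gives a natural ranking.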

A study on the improvements of Foreign Research Information Center from the perspective of librarians in charge (외국학술지지원센터 개선방안에 관한 연구 - 운영 담당자의 관점을 중심으로 -)

  • Lee, Jongwook
    • Journal of Korean Library and Information Science Society
    • /
    • v.49 no.3
    • /
    • pp.283-305
    • /
    • 2018
  • Although academic library budgets have been decreasing, print and electronic journal subscription prices have consistently increased. In response, as part of efforts to ensure access to foreign academic materials, the Ministry of Education and the Korea Education & Research Information Service (KERIS) have operated the Foreign Research Information Center (FRIC) since 2006, pursuing shared acquisition and sharing of foreign print journals. In addition to examining the current state of FRIC, this study investigates its roles and values, the issues raised by stakeholders, improvements in services, and new service elements through in-depth interviews with the librarians in charge of FRIC. The findings show that FRIC has contributed to the sharing of academic materials and to promoting research. However, it was also found that the five types of stakeholders (i.e., the Ministry of Education/KERIS, universities/libraries, users, FRICs, and publishers/agencies) have diverse issues and problems with FRIC. Therefore, this study makes some suggestions to address the issues in terms of policy, system, management, and service.

XSTAR: XQuery to SQL Translation Algorithms on RDBMS (XSTAR: XML 질의의 SQL 변환 알고리즘)

  • Hong, Dong-Kweon;Jung, Min-Kyoung
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.17 no.3
    • /
    • pp.430-433
    • /
    • 2007
  • Since XML has been accepted in many areas, there have been several studies on processing XML queries efficiently. The majority of them adopt relational databases as the underlying system, because the relational model is the most widely used model for managing large data efficiently. In this paper we develop XQuery to SQL translation algorithms, called XSTAR, that can efficiently handle XPath, XQuery FLWORs with nested iteration expressions, element constructors, and keyword retrieval on a relational database, as well as construct XML fragments from the transformed SQL results. All of the algorithms in XSTAR have been implemented as the XQuery processor engine of an XML management system, XPERT, and its prototype can be tested and confirmed at "http://dblab.kmu.ac.kr/project.jsp".
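
To make the idea of translating a path query into SQL concrete, here is a deliberately small sketch. It is not the XSTAR algorithm, whose storage schema and translation rules are not given in this abstract: the `node` edge table, its columns, and the functions `xpath_to_sql`, `run_xpath`, and `make_demo_db` are hypothetical, and only absolute paths made of child steps (for example `/library/book/title`) are handled.

```python
import sqlite3


def xpath_to_sql(path):
    """Translate a child-axis-only absolute path into a parameterized SQL query.

    Each path step becomes one self-join on the assumed edge table
    node(id, parent_id, tag, content).
    """
    steps = [s for s in path.split("/") if s]
    if not steps:
        raise ValueError("empty path")
    joins = ["node AS n0"]
    conds = ["n0.parent_id IS NULL", "n0.tag = ?"]
    for i in range(1, len(steps)):
        joins.append(f"JOIN node AS n{i} ON n{i}.parent_id = n{i - 1}.id")
        conds.append(f"n{i}.tag = ?")
    last = len(steps) - 1
    sql = (
        f"SELECT n{last}.content FROM {' '.join(joins)} "
        f"WHERE {' AND '.join(conds)} ORDER BY n{last}.id"
    )
    return sql, steps


def run_xpath(conn, path):
    """Execute the translated query and return the matched text values."""
    sql, params = xpath_to_sql(path)
    return [row[0] for row in conn.execute(sql, params)]


def make_demo_db():
    """Build a tiny in-memory document: a library with two books."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, "
        "tag TEXT, content TEXT)"
    )
    conn.executemany(
        "INSERT INTO node VALUES (?, ?, ?, ?)",
        [
            (1, None, "library", None),
            (2, 1, "book", None),
            (3, 2, "title", "XML Basics"),
            (4, 1, "book", None),
            (5, 4, "title", "SQL Basics"),
            (6, 1, "title", "Not a book title"),
        ],
    )
    return conn
```

With the demo data, `run_xpath(make_demo_db(), "/library/book/title")` selects only the titles nested under `book` elements. Predicates, descendant steps, FLWOR expressions, and element constructors, which the abstract says XSTAR covers, are outside the scope of this sketch.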

AJFCode: An Approach for Full Aspect-Oriented Code Generation from Reusable Aspect Models

  • Mehmood, Abid;Jawawi, Dayang N.A.
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.6
    • /
    • pp.1973-1993
    • /
    • 2022
  • Model-driven engineering (MDE) and aspect-oriented software development (AOSD) share the goal of developing high-quality code in reduced time. To complement each approach with the benefits of the other, various methods of integrating the two have been proposed. Aspect-oriented code generation, which aims to obtain aspect-oriented code directly from aspect models, offers some unique advantages over the other integration approaches. However, the existing aspect-oriented code generation approaches do not comprehensively address all aspects of a model-driven code generation system, such as a textual representation of graphical models, conceptual mapping, and incorporation of behavioral diagrams. These problems limit the value of the generated code, especially in practical use. Here, we propose AJFCode, an approach for aspect-oriented model-driven code generation that comprehensively addresses these aspects, including the graphical models and their text-based representation, the mapping between visual model elements and code, and behavioral code generation. Experiments are conducted to compare the maintainability and reusability characteristics of the aspect-oriented code generated using AJFCode with those of the most comprehensive object-oriented code generation approach. AJFCode performs well on all metrics related to maintainability and reusability of code. The most significant improvements are in separation of concerns, coupling, and cohesion. For instance, AJFCode yields significant improvement in concern diffusion over operations (19 vs 51), coupling between components (0 vs 6), and lack of cohesion in operations (5 vs 9) for one of the concerns studied.