• Title/Summary/Keyword: Document searching

The Construction of Multiform User Profiles Based on Transaction for Effective Recommendation and Segmentation (효과적인 추천과 세분화를 위한 트랜잭션 기반 여러 형태 사용자 프로파일의 구축)

  • Koh, Jae-Jin;An, Hyoung-Keun
    • The KIPS Transactions: Part D / v.13D no.5 s.108 / pp.661-670 / 2006
  • With the development of e-commerce and the proliferation of easily accessible information, information filtering systems such as recommender and SDI systems have become popular for pruning large information spaces so that users are directed toward the items that best meet their needs and preferences. Many information filtering methods have been proposed to support such systems. Meanwhile, XML is emerging as a new standard for information, and filtering systems need new approaches for dealing with XML documents. This paper therefore proposes a method that creates multiform user profiles by exploiting XML's ability to represent structure. The system consists of two parts: an administrator profile definition part, in which an administrator defines profiles for analyzing users' purchase patterns before a transaction such as a purchase takes place, and a user profile creation module, which applies the defined profiles. Administrator profiles are built from DTD information and point to specific parts of a document conforming to the DTD. The proposed system builds user profiles more accurately so that it adapts to users' buying behavior and provides useful product information, based on those profiles, without inefficient searching.
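
The two-part design described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that an administrator profile is a mapping from facet names to element paths and that a user profile is a set of value counts; the element names, the sample transaction, and the function `build_user_profile` are hypothetical and do not come from the paper.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical administrator profile: each entry names a profile facet and
# the element path (relative to one <item>) whose text should be counted.
ADMIN_PROFILE = {
    "category": "category",
    "brand": "maker/brand",
}

# Hypothetical purchase transaction; a real document would conform to the
# shop's own DTD.
TRANSACTION_XML = """
<order user="u1">
  <item><category>book</category><maker><brand>Acme</brand></maker></item>
  <item><category>book</category><maker><brand>Zenith</brand></maker></item>
  <item><category>music</category><maker><brand>Acme</brand></maker></item>
</order>
"""

def build_user_profile(xml_text, admin_profile):
    """Count the values found at each administrator-defined path."""
    root = ET.fromstring(xml_text)
    profile = {facet: Counter() for facet in admin_profile}
    for item in root.findall("item"):
        for facet, path in admin_profile.items():
            node = item.find(path)
            if node is not None and node.text:
                profile[facet][node.text.strip()] += 1
    return profile

user_profile = build_user_profile(TRANSACTION_XML, ADMIN_PROFILE)
print(user_profile["category"].most_common(1))
```

Keeping the paths in data rather than in code is what would let an administrator change which parts of a transaction feed the profile without touching the profile builder.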

A Study on Implementation of SVG for ENC Applications (전자해도 활용을 위한 SVG 변환 연구)

  • Oh, Se-Woong;Park, Jong-Min;Seo, Ki-Yeol;Suh, Sang-Hyun
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.10 / pp.1930-1936 / 2007
  • Electronic Navigational Charts (ENCs) are official nautical charts that are equivalent to paper charts and carry supplementary information. Although their main purpose is the safe navigation of ships, they also contain much information on coasts and seas that may interest the general public. However, they are not easy to access because of their specialized data format, access method, and visualization. This paper proposes an implementation of SVG for accessing and serving ENCs. SVG (Scalable Vector Graphics) makes it possible to use vector graphics for map services in an ordinary web browsing environment. The implementation requires neither a special server-side GIS mapping system nor extra client-side technology. It can be summarized as follows. First, SVG provides spatial information and can incorporate a search engine when building the SVG map. Second, SVG can provide high-quality vector map graphics and interactive facilities without a special Internet GIS system, which makes services available at very low cost. Third, an SVG information service targeting maritime transportation can be used as a template, so it can be applied dynamically to other purposes such as traffic management and vessel monitoring. The strengths of SVG for on-screen mapping and the reusability of SVG documents open a new era in the visualization of marine geographic information.
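
As a rough illustration of serving chart geometry as SVG without a GIS server, the sketch below turns polylines into `<path>` elements in a standalone SVG document. It is not the paper's converter: the feature names and coordinates are hypothetical, and the points are assumed to be already projected to screen coordinates.

```python
from xml.sax.saxutils import escape

def to_svg_path(points):
    """Turn [(x, y), ...] into an SVG path string (move-to, then line-to)."""
    head, *tail = points
    parts = [f"M {head[0]} {head[1]}"]
    parts += [f"L {x} {y}" for x, y in tail]
    return " ".join(parts)

def render_svg(features, width=200, height=100):
    """Render named polylines as one standalone SVG document."""
    body = []
    for name, points in features:
        body.append(
            f'  <path d="{to_svg_path(points)}" fill="none" stroke="navy">'
            f"<title>{escape(name)}</title></path>"
        )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'viewBox="0 0 {width} {height}">\n' + "\n".join(body) + "\n</svg>"
    )

# Hypothetical features already projected to screen coordinates.
features = [
    ("Coastline", [(0, 80), (60, 55), (120, 70), (200, 40)]),
    ("Fairway & buoy line", [(10, 10), (190, 20)]),
]
svg_text = render_svg(features)
print(svg_text)
```

The output is plain text that a browser can display directly, and the `<title>` child gives each feature a name that scripts or tooltips can pick up.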

The Classification System and its Code on Archives of the Government-general Museum of Joseon in the National Museum of Korea (조선총독부박물관 문서의 분류 체계에 대한 시론)

  • Oh, Youngchan
    • MISULJARYO - National Museum of Korea Art Journal / v.96 / pp.181-208 / 2019
  • This paper presents a new classification and code system for the Archives of the Government-general Museum of Joseon held in the National Museum of Korea. Three points were considered. First, since the Museum belonged to the Government-general of Joseon, the classification system of the Archives should comply with the archives system of the Government-general of Joseon. Second, based on the concept of functional provenance, the classification system should follow the organizational structure of the Government-general Museum of Joseon. Third, systematic and simple classification codes should be assigned on the basis of the classification system to make the official documents easier to search and use. The classification system and its codes are proposed in the order of major function, medium function, small function, and detailed function. The major function of the Archives is 'A-Educational affairs' and the medium function is 'Museum.' The small function is divided into General affairs (01), Temple (02), Scenic Spot and Natural Monument (03), Historical Site (04), and Museum (05). The detailed functions and detailed sub-functions are categorized by the work assignments of each work unit. I hope that this new classification system will contribute to organizing and utilizing the Archives of the Government-general Museum of Joseon in the National Museum of Korea.

Abbreviation Disambiguation using Topic Modeling (토픽모델링을 이용한 약어 중의성 해소)

  • Woon-Kyo Lee;Ja-Hee Kim;Junki Yang
    • Journal of the Korea Society for Simulation / v.32 no.1 / pp.35-44 / 2023
  • Recently, many studies have analyzed trends or research trends through text analysis. When documents are collected for data analysis by searching for an abbreviation as a keyword, the abbreviation must be disambiguated. In many studies, documents are classified manually by reading them one by one to find the data needed for the study. Most studies on abbreviation disambiguation aim to clarify word meaning and rely on supervised learning. Such methods are not suitable for classifying documents retrieved by an abbreviation search in order to find research data, and related studies are also scarce. This paper proposes a method for semi-automatically classifying documents collected by abbreviation by applying topic modeling with Non-Negative Matrix Factorization, an unsupervised learning method, in the data pre-processing step. To verify the proposed method, papers were collected from an academic DB with the abbreviation 'MSA'. The proposed method found 316 papers related to Micro Services Architecture among 1,401 papers. Its document classification accuracy was measured at 92.36%. The proposed method is expected to reduce the time and cost that researchers spend on manual work.
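
To make the idea of grouping abbreviation-search results by topic more concrete, here is a small pure-Python sketch of Non-Negative Matrix Factorization. The abstract does not say which solver or features the authors used; the multiplicative update rule, the raw term counts, and the four toy documents are assumptions chosen only to keep the example self-contained.

```python
import random

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor v (docs x terms) into w (docs x topics) and h (topics x terms)."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_docs)]
    return w, h

# Hypothetical documents that all matched the search term "msa".
docs = [
    "msa service container deployment",
    "msa service api gateway",
    "msa sequence alignment protein",
    "msa sequence alignment genome",
]
vocab = sorted({t for d in docs for t in d.split()})
v = [[d.split().count(t) for t in vocab] for d in docs]

w, h = nmf(v, rank=2)
# The largest entry in each row of w is that document's dominant topic.
dominant = [max(range(2), key=lambda k: row[k]) for row in w]
print(dominant)
```

In a semi-automatic workflow of the kind the abstract describes, a researcher would inspect the top terms of each row of `h`, decide which topics correspond to the intended meaning of the abbreviation, and keep only the documents whose dominant topic is among them.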

A Study on the Current Status and Utilization of Old Map in Library and Museums in Korea (국내 도서관·박물관 소장 고지도의 현황 및 활용에 관한 연구)

  • Gi Young Kim
    • Journal of the Korean BIBLIA Society for library and Information Science / v.35 no.1 / pp.97-125 / 2024
  • The purpose of this study is to increase access to information on old maps and to discuss efficient ways to utilize them, such as providing services and information based on old maps. To this end, the information search systems of domestic institutions that provide old map information were investigated, and the methods of searching for old map data and accessing information on their websites were examined. In addition, the current status of old map collections in domestic libraries and museums was analyzed by referring to the homepage, books, research reports, and publications of each institution. The analysis showed that about 2,200 old maps are held by 76 institutions, including national, public, and university libraries and museums nationwide. The institutions holding old maps were carrying out publication work, such as publishing English manuscripts, exhibitions and books, and publishing research materials such as lists and summaries. However, reading and use of the original documents are limited because old maps are rare and often unique. To utilize old maps effectively, first, it is necessary to improve access to old map information services and expand academic information services. Second, it is proposed that old maps be used as data for building archives that reflect the identity of a region. Third, it is necessary to train professionals who can select and provide information based on knowledge of old map data and humanities literacy.

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.109-122 / 2014
  • People nowadays create a tremendous amount of data on Social Network Services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive data generation, thereby greatly influencing society. This phenomenon is unmatched in history, and we now live in the Age of Big Data. SNS data meets the defining conditions of Big Data: the amount of data (volume), data input and output speeds (velocity), and the variety of data types (variety). Trends of issues discovered in SNS Big Data can serve as an important new source for creating value because this information covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the need for analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) provide the topic keyword set that corresponds to the daily ranking; (2) visualize the daily time-series graph of a topic over a month; (3) show the importance of a topic through a treemap based on the score system and frequency; (4) visualize the daily time-series graph of a keyword found by keyword search. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process unrefined and unstructured data. In addition, such analysis requires recent big data technology, such as the Hadoop distributed system or NoSQL, an alternative to relational databases, to rapidly process large amounts of real-time data. We built TITS on Hadoop to optimize the processing of big data because Hadoop is designed to scale up from a single node to thousands of machines. 
Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no schemas or tables, and its most important goals are data accessibility and data processing performance. In the Age of Big Data, visualization is attractive to the Big Data community because it helps analysts examine such data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed for creating Data-Driven Documents that bind the document object model (DOM) to arbitrary data; interaction with the data is easy, and the library is useful for managing real-time data streams with smooth animation. In addition, TITS uses Bootstrap, which consists of pre-configured style sheets and JavaScript plug-ins, to build the web system. The TITS Graphical User Interface (GUI) is designed using these libraries and is capable of detecting issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.
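
The daily keyword ranking and keyword time series listed among the system's functions can be sketched with plain frequency counting. This is a deliberately simplified stand-in: it uses neither topic modeling nor Hadoop or MongoDB, whitespace tokenization replaces noun extraction, and the stop-word list and sample records are hypothetical.

```python
from collections import Counter, defaultdict

STOP_WORDS = {"the", "a", "is", "in", "on", "of", "and", "to"}  # toy list

# Hypothetical (date, text) records standing in for collected tweets.
tweets = [
    ("2013-03-01", "The election debate is on tonight"),
    ("2013-03-01", "Election results and the debate"),
    ("2013-03-02", "Spring festival in Seoul"),
    ("2013-03-02", "Election news of the day"),
]

def daily_keyword_counts(records):
    """Map each date to a Counter of lower-cased, stop-word-filtered tokens."""
    counts = defaultdict(Counter)
    for date, text in records:
        tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
        counts[date].update(tokens)
    return counts

def keyword_series(counts, keyword):
    """Return [(date, frequency)] for one keyword, ordered by date."""
    return [(date, counts[date][keyword]) for date in sorted(counts)]

counts = daily_keyword_counts(tweets)
print(counts["2013-03-01"].most_common(2))   # daily ranking
print(keyword_series(counts, "election"))    # daily time series
```

A list of `(date, frequency)` pairs like the one returned by `keyword_series` is the kind of structure a charting library could bind to a line graph.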

A Study on the Design of Case-based Reasoning Office Knowledge Recommender System for Office Professionals (사례기반추론을 이용한 사무지식 추천시스템)

  • Kim, Myong-Ok;Na, Jung-Ah
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.131-146 / 2011
  • It is becoming more essential than ever for office professionals to be competent in information gathering and problem solving in today's global business society. In particular, office professionals not only assist with simple chores but must also make decisions as quickly and efficiently as possible in problematic situations that can end in either profit or loss for their company. Since office professionals rely heavily on tacit knowledge to solve problems that arise in everyday business situations, it is helpful and efficient to refer to similar business cases from the past and to share or reuse such knowledge for better performance. Case-based reasoning (CBR) is a problem-solving method that uses previous similar cases to solve problems. Through CBR, the case closest to the current business situation can be searched for and retrieved from the case base or knowledge base and used as a reference for a new solution. This reduces the time and resources needed and increases the probability of success. The main purpose of this study is to design a system called COKRS (Case-based reasoning Office Knowledge Recommender System) and to develop a prototype of it. COKRS manages cases and their metadata, accepts keywords from the user, searches the case base for the past case most similar to the input keyword, and communicates with users to collect information about the quality of the case provided, continuously applying that information to update the values in the similarity table. Core concepts such as the system architecture, the definition of a case, the meta database, and the similarity table are introduced, and an algorithm for retrieving all similar cases from past work history is proposed. In this research, a case is best defined as a work experience in office administration. However, defining a case in office administration was not an easy task in practice. 
We surveyed 10 office professionals to get an idea of how to define a case in office administration and found that in most cases any type of office work is recorded digitally and/or non-digitally. Therefore, we defined a case for COKRS as a record or document. The similarity table was composed of items from the results of a job analysis of office professionals conducted in previous research. Values between items of the similarity table were initially set from the researchers' experience and a literature review. The results of this study could also be utilized in other areas of business for knowledge sharing wherever it is necessary and beneficial to share and learn from past experiences. We expect this research to serve as a reference for researchers and developers who work in this area or are interested in office knowledge recommendation systems based on CBR. A focus group interview (FGI) was conducted with ten administrative assistants carefully selected from various areas of business. They were given a chance to try out COKRS in an actual work setting and to make suggestions for future improvement. The FGI identified the user interface for saving cases and searching them by keyword as the most positive aspect of COKRS, and identified the more efficient transformation of tacit knowledge and know-how into recorded documents as the most urgently needed improvement. The focus group also mentioned that it is essential to secure sufficient support, encouragement, and reward from the company and to promote a positive attitude and atmosphere for knowledge sharing for the benefit of everyone in the company.
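
The retrieve-and-update loop described for COKRS might look roughly like the following sketch. The table entries, the case records, the feedback step size, and the function names are all hypothetical; the abstract does not specify how similarity values are stored or adjusted.

```python
# Hypothetical similarity table between job-analysis items (values in 0..1).
similarity = {
    ("meeting", "meeting"): 1.0,
    ("schedule", "schedule"): 1.0,
    ("travel", "travel"): 1.0,
    ("meeting", "schedule"): 0.7,
    ("meeting", "travel"): 0.3,
    ("schedule", "travel"): 0.4,
}

def sim(a, b):
    """Look up a pair in either order; unknown pairs count as 0."""
    return similarity.get((a, b), similarity.get((b, a), 0.0))

# Hypothetical case base: each case is a recorded document tagged with one item.
cases = [
    {"id": 1, "item": "schedule", "title": "Board meeting agenda template"},
    {"id": 2, "item": "travel", "title": "Overseas trip expense report"},
]

def retrieve(keyword, case_base):
    """Return cases ordered from most to least similar to the keyword."""
    return sorted(case_base, key=lambda c: sim(keyword, c["item"]), reverse=True)

def apply_feedback(keyword, case, helpful, step=0.05):
    """Nudge the table entry up or down according to user feedback."""
    pair = (keyword, case["item"])
    key = pair if pair in similarity else (case["item"], keyword)
    new_value = sim(keyword, case["item"]) + (step if helpful else -step)
    similarity[key] = min(1.0, max(0.0, new_value))

best = retrieve("meeting", cases)[0]
apply_feedback("meeting", best, helpful=True)
print(best["title"], sim("meeting", "schedule"))
```

Clamping the updated value to the range 0..1 keeps repeated feedback from pushing an entry outside the scale used by the rest of the table.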

Evaluation of Toxicological Data on Food Additives and Guideline for ADI establishment - Polydimethylsiloxane as emulsifier - (식품첨가물의 독성자료 고찰과 ADI 평가지침 - 소포제 Polydimethylsiloxane를 사례로 -)

  • Choi, Chan-Woong;Jeong, Ji-Yoon;Park, Hyoung-Su;Moon, Jin-Hyun;Lee, Kwang-Ho;Lee, Hyo-Min
    • Journal of Food Hygiene and Safety / v.24 no.4 / pp.352-356 / 2009
  • The purpose of this study was to introduce the review of toxicological studies used to evaluate the safety of polydimethylsiloxane (PDMS) at the 69th JECFA meeting. PDMS is a polymer whose ADI was established at the 23rd JECFA meeting in 1979. The ADI was maintained although the specification was expanded at the 26th, 29th, and 37th meetings. Recently, it was reported that PDMS with low molecular weight and viscosity has a high absorption rate and different toxicity, so it was submitted to the 69th meeting. Toxicological studies of PDMS were submitted by the sponsor, and additional information was collected through a document search. The toxicological studies were reviewed in accordance with the 'Guidelines for the preparation of toxicological working papers for the Joint FAO/WHO Expert Committee on Food Additives'. In the available acute, sub-chronic, and chronic toxicity studies on PDMS, dose-related increases in the incidence and severity of ocular lesions (corneal crystals, inflammation of the corneal epithelium, etc.) were consistently observed after oral dosing. This appears to be a local irritant effect, but the mechanism by which the ocular lesions arose is unclear, although the lack of absorption of PDMS indicates that a direct systemic effect is unlikely. Consequently, the relevance of the ocular lesions to the food use of PDMS could not be determined. The ADI of PDMS was revised from 0-1.5 mg/kg bw/day to 0-0.8 mg/kg bw/day by applying an additional safety factor of 2 based on its ocular toxicity. The value of 0-0.8 mg/kg bw/day is a temporary ADI until further data are provided by 2010.
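
The arithmetic behind the revised upper bound can be written out as follows. The rounding of 0.75 to 0.8 is an assumption made here to reconcile the two figures quoted in the abstract; the abstract itself does not state the rounding rule.

```latex
\[
\mathrm{ADI}_{\text{temporary}}
  = \frac{\mathrm{ADI}_{\text{previous}}}{\text{additional safety factor}}
  = \frac{1.5\ \text{mg/kg bw/day}}{2}
  = 0.75\ \text{mg/kg bw/day}
  \approx 0.8\ \text{mg/kg bw/day}
\]
```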

Social Tagging-based Recommendation Platform for Patented Technology Transfer (특허의 기술이전 활성화를 위한 소셜 태깅기반 지적재산권 추천플랫폼)

  • Park, Yoon-Joo
    • Journal of Intelligence and Information Systems / v.21 no.3 / pp.53-77 / 2015
  • Korea has witnessed an increasing number of domestic patent applications, but a majority of them are not utilized to their full potential and end up becoming obsolete. According to the 2012 National Congress' Inspection of Administration, about 73% of the patents held by universities and publicly funded research institutions failed to create social value and remain latent. One of the main causes of this problem is that patent creators such as individual researchers, universities, and research institutions lack the ability to commercialize their patents into viable businesses with the enterprises that need them. On the enterprise side, it is hard to find appropriate patents by keyword search on every such occasion. This study proposes a patent recommendation system that can identify and recommend, more easily and efficiently, intellectual property rights suited to users' fields of interest among a rapidly accumulating number of patent assets. The proposed system extracts core contents and technology sectors from the existing pool of patents and combines them with secondary social knowledge derived from tag information created by users in order to find the best patents to recommend. That is, in the early stage when no tag information has accumulated, recommendation relies on content characteristics, which are identified by analyzing the keywords contained in attributes such as 'Title of Invention' and 'Claim' among the various patent attributes. To do this, the system extracts only nouns from patents and assigns each noun a weight according to its importance across all patents by performing TF-IDF analysis. It then finds patents whose weights are similar to those of the patents a user prefers. In this paper, this similarity is called 'Domain Similarity'. 
Next, the suggested system extracts the characteristics of the technology sector from patent documents by analyzing the international technology classification code (International Patent Classification, IPC). Every patent has one or more IPC codes, and each user can attach one or more tags to the patents they like. Thus, each user has a set of IPC codes included in their tagged patents. The system manages this IPC set to analyze each user's technology preference and find well-fitting patents. To do this, the system calculates a 'Technology_Similarity' between the user's set of IPC codes and the IPC codes contained in every other patent. After that, when the tag information of multiple users has accumulated, the system expands the recommendations by considering other users' social tag information on the patents tagged by the user concerned. The similarity between the tag information of a user's preferred patents and that of other patents is called 'Social Similarity' in this paper. Lastly, a 'Total Similarity' is calculated by adding these three similarities, and the patents with the highest 'Total Similarity' are recommended to each user. The suggested system is applied to a total of 1,638 Korean patents obtained from the Korea Industrial Property Rights Information Service (KIPRIS) run by the Korea Intellectual Property Office. However, since this original dataset does not include tag information, we created virtual tag information and used it to construct a semi-virtual dataset. The proposed recommendation algorithm was implemented in Java, and a prototype graphical user interface was also designed for this study. Because the proposed system has no dependent variables and uses virtual data, it is impossible to verify the recommendation system with a statistical method. 
Therefore, the study uses a scenario test method to verify the operational feasibility and recommendation effectiveness of the system. The results of this study are expected to improve the chances of matching promising patents with the most suitable businesses. It is assumed that users' experiential knowledge can be accumulated, managed, and utilized in the As-Is patent system, which currently only manages standardized patent information.
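
The way three similarities are added into one score can be sketched as below. The abstract states only that TF-IDF weights are used and that the three similarities are summed; the cosine measure for the noun vectors, the Jaccard overlap for IPC codes and tags, and the three sample patents are assumptions made for illustration, and the sketch is written in Python rather than the Java used by the authors.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: {id: [noun, ...]} -> {id: {noun: tf * idf}}."""
    n = len(docs)
    df = Counter(term for terms in docs.values() for term in set(terms))
    return {
        doc_id: {t: c * math.log(n / df[t]) for t, c in Counter(terms).items()}
        for doc_id, terms in docs.items()
    }

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(x * x for x in a.values()))
            * math.sqrt(sum(x * x for x in b.values())))
    return dot / norm if norm else 0.0

def jaccard(a, b):
    """Share of elements two sets have in common."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical patents: extracted nouns, IPC codes, and user tags.
patents = {
    "P1": {"nouns": ["battery", "electrode", "cell"],
           "ipc": {"H01M"}, "tags": {"energy"}},
    "P2": {"nouns": ["battery", "charger", "cell"],
           "ipc": {"H01M", "H02J"}, "tags": {"energy", "mobile"}},
    "P3": {"nouns": ["antenna", "signal", "receiver"],
           "ipc": {"H01Q"}, "tags": {"radio"}},
}

def recommend(preferred_id):
    """Rank the other patents by the sum of the three similarities."""
    vectors = tfidf_vectors({pid: p["nouns"] for pid, p in patents.items()})
    liked = patents[preferred_id]
    scores = {}
    for pid, p in patents.items():
        if pid == preferred_id:
            continue
        domain = cosine(vectors[preferred_id], vectors[pid])
        technology = jaccard(liked["ipc"], p["ipc"])
        social = jaccard(liked["tags"], p["tags"])
        scores[pid] = domain + technology + social
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranking = recommend("P1")
print(ranking)
```

Because the three terms are simply added, a patent with no tags yet can still be recommended on the strength of its nouns and IPC codes, which matches the early-stage behavior the abstract describes.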

Information types and characteristics within the Wireless Emergency Alert in COVID-19: Focusing on Wireless Emergency Alerts in Seoul (코로나 19 하에서 재난문자 내의 정보유형 및 특성: 서울특별시 재난문자를 중심으로)

  • Yoon, Sungwook;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.45-68 / 2022
  • The central and local governments of the Republic of Korea provided information necessary for disaster response through wireless emergency alerts (WEAs) to overcome the pandemic in which COVID-19 spread rapidly. Among all channels for delivering disaster information, the wireless emergency alert is the most efficient; because it adopts the CBS (Cell Broadcast Service) method, which broadcasts directly to mobile phones, recipients can easily access disaster information on their phones without the effort of searching. In this study, the characteristics of the wireless emergency alerts sent in Seoul over a period of one year and one month (January 2020 to January 2021) were derived through various text mining methodologies, and the types of information contained in the alerts were analyzed. In addition, the influence of the alerts on people's movement behavior was examined through population mobility by age in the districts of Seoul. After the keywords and information included in each message were classified, text analysis was performed with each sent message as the unit of analysis by applying a document cluster analysis technique based on the included words. The number of WEAs sent in Seoul has grown dramatically since the spread of COVID-19. In January 2020, only 10 WEAs were sent in Seoul, but the number increased 5 times in March and 7.7 times over the previous months. Since basic and regional local governments were authorized to send wireless emergency alerts independently, sending behavior differs among local governments. 
Although most basic local governments increased their transmission of WEAs as the number of confirmed COVID-19 cases increased, the trend of this increase differed by region. Using a structured econometric model, the effect of the disaster information included in wireless emergency alerts on population mobility was measured by dividing it into a baseline effect and an accumulating effect. Six types of disaster information (date, order, online URL, symptom, location, and normative guidance) were identified in WEAs and analyzed through econometric modelling. It was confirmed that the types of information that significantly change population mobility differ by age. The population mobility of people in their 60s and 70s decreased when wireless emergency alerts included information related to date and order. Because date and order information appears in WEAs that give information about confirmed COVID-19 cases, these results show that the mobility of older people decreased as they reacted to messages reporting confirmed cases. Online information (URL) decreased the population mobility of people in their 20s, and information related to symptoms reduced that of people in their 30s. On the other hand, normative words that encourage compliance with quarantine policies did not cause significant changes in population mobility at any age. This means that only meaningful information that is useful for disaster response should be included in wireless emergency alerts. Repeated sending of wireless emergency alerts reduces the magnitude of the impact of disaster information on population mobility. This indirectly shows that, under the prolonged pandemic, people grew tired of receiving repetitive WEAs with similar content and began to react less. 
To use WEAs effectively for quarantine and for overcoming disaster situations, it is necessary to reduce the fatigue of recipients by sending WEAs only when necessary, and to raise awareness of WEAs.