• Title/Summary/Keyword: Knowledge discovery

Linear Programming Model Discovery from Databases (데이터베이스로부터의 선형계획모형 추출방법에 대한 연구)

  • 권오병;김윤호
    • Proceedings of the Korean Operations and Management Science Society Conference
    • /
    • 2000.04a
    • /
    • pp.290-293
    • /
    • 2000
  • Knowledge discovery refers to the overall process of discovering useful knowledge from data. The linear programming model is a special form of useful knowledge that is embedded in a database. Since formulating models from scratch requires knowledge-intensive efforts, knowledge-based formulation support systems have been proposed in the DSS area. However, they rely on the strict assumption that sufficient domain knowledge has already been captured in a specific knowledge representation form. Hence, the purpose of this paper is to propose a methodology that finds useful knowledge for building linear programming models from a database. The methodology consists of two parts. The first part is to find a first-cut model based on a data dictionary. To do so, we applied the GPS algorithm. The second part is to discover a second-cut model by applying a neural network technique. An illustrative example is described to show the feasibility of the proposed methodology.

Computer-Aided Drug Discovery in Plant Pathology

  • Shanmugam, Gnanendra;Jeon, Junhyun
    • The Plant Pathology Journal
    • /
    • v.33 no.6
    • /
    • pp.529-542
    • /
    • 2017
  • Control of plant diseases is largely dependent on the use of agrochemicals. However, there are widening gaps between our knowledge of plant diseases gained from genetic/mechanistic studies and the rapid translation of that knowledge into target-oriented development of effective agrochemicals. Here we propose that the time is ripe for computer-aided drug discovery/design (CADD) in molecular plant pathology. CADD has played a pivotal role in the development of medically important molecules over the last three decades. Now, an explosive increase in information on genome sequences and three-dimensional structures of biological molecules, in combination with advances in computational and informational technologies, opens up exciting possibilities for applying CADD to the discovery and development of agrochemicals. In this review, we outline two categories of drug discovery strategies, structure-based and ligand-based CADD, and the relevant computational approaches employed in modern drug discovery. To help readers dive into CADD, we explain the concepts of homology modelling, molecular docking, virtual screening, and de novo ligand design in structure-based CADD, and pharmacophore modelling, ligand-based virtual screening, quantitative structure-activity relationship modelling, and de novo ligand design in ligand-based CADD. We also list important resources available for carrying out CADD. Finally, we present a case study showing how a CADD approach can be implemented in practice to identify potent chemical compounds against the important plant pathogens Pseudomonas syringae and Colletotrichum gloeosporioides.

Development of a Grid-based Framework for High-Performance Scientific Knowledge Discovery (그리드 기반의 고성능 과학기술지식처리 프레임워크 개발)

  • Jeong, Chang-Hoo;Choi, Sung-Pil;Yoon, Hwa-Mook;Choi, Yun-Soo
    • The Journal of the Korea Contents Association
    • /
    • v.9 no.12
    • /
    • pp.877-885
    • /
    • 2009
  • In this paper, we propose SINDI-Grid, a high-performance framework for scientific and technological knowledge discovery based on grid computing. By exploiting the large-volume data repositories and high-speed computing power that grid computing offers, the SINDI-Grid framework provides a variety of grid services for distributed data analysis and scientific knowledge processing. The SINDI-Workflow tool uses these services to design and execute scientific and technological knowledge discovery applications that integrate various information processing algorithms.

Development of A Web Mining System Based On Document Similarity (문서 유사도 기반의 웹 마이닝 시스템 개발)

  • 이강찬;민재홍;박기식;임동순;우훈식
    • The Journal of Society for e-Business Studies
    • /
    • v.7 no.1
    • /
    • pp.75-86
    • /
    • 2002
  • In this study, we propose design issues and a structure for a web mining system, and we develop a system for knowledge integration in World Wide Web environments, drawing on our development experience. The developed system consists of three main functions: 1) gathering documents using a search agent; 2) determining similarity coefficients between any two documents from term frequencies; 3) clustering documents based on the similarity coefficients. We believe the developed system can be used for knowledge discovery in relatively narrow domains, such as news classification and index term generation in knowledge management.
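
The second and third functions in the abstract above can be illustrated with a minimal sketch. The abstract does not say which similarity coefficient or clustering method the authors used, so cosine similarity over raw term counts and a greedy single-pass clustering are assumptions made here for illustration; the function names, the tokenizer, the threshold, and the sample documents are all hypothetical.

```python
import math
from collections import Counter

def term_frequencies(text):
    """Count lowercase whitespace-separated terms (a deliberately naive tokenizer)."""
    return Counter(text.lower().split())

def cosine_similarity(tf_a, tf_b):
    """Cosine of the angle between two term-frequency vectors; 0.0 if either is empty."""
    dot = sum(count * tf_b[term] for term, count in tf_a.items())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def cluster(docs, threshold=0.5):
    """Greedy single-pass clustering: a document joins the first cluster
    whose seed (first member) is at least `threshold` similar to it."""
    vectors = [term_frequencies(d) for d in docs]
    clusters = []  # each cluster is a list of document indices; index 0 is the seed
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine_similarity(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

# Hypothetical sample documents, not taken from the paper.
docs = [
    "stock market prices rise",
    "stock market prices fall",
    "nursing care patient outcome",
]
print(cluster(docs))
```

The first two sample documents share three of four terms and end up in one cluster, while the third shares none and forms its own. A real system would likely weight terms (for example with TF-IDF) rather than use raw counts.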

Modeling a Business Performance Information System with Knowledge Discovery in Databases (데이터베이스 지식발견체계에 기반한 경영성과 정보시스템의 구축)

  • Cho, Seong-Hoon;Chung, Min-Yong;Kim, Jong-Hwa
    • IE interfaces
    • /
    • v.14 no.2
    • /
    • pp.164-171
    • /
    • 2001
  • We propose a Business Performance Information System built on Knowledge Discovery in Databases (KDD) as a key component of an integrated information and knowledge management system. The proposed system measures business performance by considering both VA (Value-Added), which represents the stakeholders' point of view, and EVA (Economic Value-Added), which represents the shareholders' point of view. In modeling the Business Performance Information System, we apply the following KDD processes: a data warehouse for consistent management of performance data, On-Line Analytic Processing (OLAP) for multidimensional analysis, genetic algorithms for exploring and finding dominant managing factors, and the Analytic Hierarchy Process (AHP) for applying experts' knowledge and experience. To demonstrate the performance of the system, we conducted a case study using financial data on the Korean automobile industry over the 16 years from 1981 to 1996, taken from the KISFAS (Korea Investors Services Financial Analysis System) database.

Knowledge Discovery Process In Internet For Effective Knowledge Creation: Application To Stock Market (효과적인 지식창출을 위한 인터넷 상의 지식채굴과정: 주식시장에의 응용)

  • 김경재;홍태호;한인구
    • Proceedings of the Korea Database Society Conference
    • /
    • 1999.06a
    • /
    • pp.105-113
    • /
    • 1999
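  • With the recent explosive growth of data and databases, interest has been rising in the knowledge discovery process, which seeks information and knowledge within vast amounts of data. In particular, as information sources become more diverse and sophisticated, including not only internal and external corporate databases but also data in OLAP environments built on data warehouses and information on the web via the Internet, knowledge discovery processes suited to a variety of environments are required. This study proposes a knowledge discovery process for effectively mining knowledge on the Internet. To reflect tacit knowledge as well as explicit knowledge, the proposed process uses a prior knowledge base and a prior knowledge management system. The prior knowledge management system builds the prior knowledge base using a fuzzy cognitive map, uses it to define the useful information to be sought on the web, and converts the extracted information, through a knowledge transformation system, into a form that can be used in an integrated reasoning process. To verify the usefulness of the proposed research model, we ran experiments with case-based reasoning on financial data excluding prior knowledge and on data including prior knowledge; the results showed the proposed knowledge discovery process to be useful.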

Knowledge Discovery Process In Internet For Effective Knowledge Creation : Application To Stock Market (효과적인 지식창출을 위한 인터넷 상의 지식채굴과정 : 주식시장에의 응용)

  • 김경재;홍태호;한인구
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 1999.03a
    • /
    • pp.105-113
    • /
    • 1999
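  • With the recent explosive growth of data and databases, interest has been rising in the knowledge discovery process, which seeks information and knowledge within vast amounts of data. In particular, as information sources become more diverse and sophisticated, including not only internal and external corporate databases but also data in OLAP environments built on data warehouses and information on the web via the Internet, knowledge discovery processes suited to a variety of environments are required. This study proposes a knowledge discovery process for effectively mining knowledge on the Internet. To reflect tacit knowledge as well as explicit knowledge, the proposed process uses a prior knowledge base and a prior knowledge management system. The prior knowledge management system builds the prior knowledge base using a fuzzy cognitive map, uses it to define the useful information to be sought on the web, and converts the extracted information, through a knowledge transformation system, into a form that can be used in an integrated reasoning process. To verify the usefulness of the proposed research model, we ran experiments with case-based reasoning on financial data excluding prior knowledge and on data including prior knowledge; the results showed the proposed knowledge discovery process to be useful.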

Probabilistic filtering for a biological knowledge discovery system with text mining and automatic inference (텍스트 마이닝 및 자동 추론 기반 생물학 지식 발견 시스템을 위한 확률 기반 필터링)

  • Lee, Hee-Jin;Park, Jong-C.
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.2
    • /
    • pp.139-147
    • /
    • 2012
  • In this paper, we discuss the structure of a biological knowledge discovery system based on text mining and automatic inference. Given a set of biology documents, the system produces a new hypothesis in an integrated manner. The text mining module of the system first extracts 'event' information of predefined types from the documents. The inference module then produces a new hypothesis based on the extracted results. Such an integrated system can use more up-to-date and diverse information than other automatic knowledge discovery systems. However, for such an integrated system to succeed, the precision of the text mining module becomes crucial, as any hypothesis based on a single piece of false positive information is very likely to be erroneous. In this paper, we propose a probabilistic filtering method that filters out false positives from the extraction results. Our proposed method outperforms an occurrence-based baseline method.

The HCARD Model using an Agent for Knowledge Discovery

  • Gerardo Bobby D.;Lee Jae-Wan;Joo Su-Chong
    • The Journal of Information Systems
    • /
    • v.14 no.3
    • /
    • pp.53-58
    • /
    • 2005
  • In this study, we employ a multi-agent approach for the search and extraction of data in a distributed environment. The proposed model, Hierarchical Clustering and Association Rule Discovery (HCARD), uses an Integrator Agent. HCARD addresses the inadequate processing performance and efficiency of other data mining tools when used for knowledge discovery. The Integrator Agent was developed on the CORBA architecture to search and extract data from heterogeneous servers in the distributed environment. Our experiment shows that HCARD generated essential association rules that can be practically explained for decision-making purposes. Computing clusters with HCARD also took less processing time than computing the rules without HCARD.
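
As a rough illustration of what "association rule discovery" involves, the sketch below computes support and confidence for two-item rules over a small set of transactions. It is not the HCARD algorithm, whose details the abstract does not give; it omits the clustering and agent components entirely, and the function names, thresholds, and sample transactions are hypothetical.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) for every rule lhs -> rhs
    between two single items that meets both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules

# Hypothetical transactions, not taken from the paper.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
for lhs, rhs, sup, conf in pair_rules(transactions):
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

Enumerating every pair is only practical for small item sets; algorithms such as Apriori prune candidates by support before extending them to larger itemsets.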

Knowledge Discovery in Nursing Minimum Data Set Using Data Mining

  • Park Myong-Hwa;Park Jeong-Sook;Kim Chong-Nam;Park Kyung-Min;Kwon Young-Sook
    • Journal of Korean Academy of Nursing
    • /
    • v.36 no.4
    • /
    • pp.652-661
    • /
    • 2006
  • Purpose. The purposes of this study were to apply a data mining tool to the nursing-specific knowledge discovery process and to identify how data mining can be used for clinical decision making. Methods. Data mining based on a rough set model was conducted on a large clinical data set containing NMDS elements. Data for 1,000 randomly selected patients who had at least one of the five most frequently used nursing diagnoses were drawn from the 1998 database. Patient characteristics and care service characteristics, including nursing diagnoses, interventions, and outcomes, were analyzed to derive meaningful decision rules. Results. Number of comorbidities, marital status, nursing diagnosis related to risk for infection, nursing intervention related to infection protection, and discharge status were the predictors that could determine the length of stay. Four variables (age, impaired skin integrity, pain, and discharge status) were identified as valuable predictors for the nursing outcome of relieved pain. Five variables (age, pain, potential for infection, marital status, and primary disease) were identified as important predictors for mortality. Conclusions. This study demonstrated the use of a data mining method on a large data set with a standardized language format to identify the contribution of nursing care to patients' health.