• Title/Summary/Keyword: Data mining architecture

Data Mining Approach to Clinical Decision Support System for Hypertension Management (고혈압관리를 위한 의사지원결정시스템의 데이터마이닝 접근)

  • 김태수;채영문;조승연;윤진희;김도마
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 2002.11a
    • /
    • pp.203-212
    • /
    • 2002
  • This study examined the predictive power of data mining algorithms by comparing the performance of logistic regression with that of a decision tree algorithm called CHAID (Chi-squared Automatic Interaction Detection). Contrary to previous studies, the decision tree performed better than logistic regression. We also developed a CDSS (Clinical Decision Support System) with three modules (doctor, nurse, and patient) based on a data warehouse architecture. The data warehouse collects and integrates relevant information from various databases in the hospital information system (HIS). This system can help improve the decision-making capability of doctors and the accessibility of educational material for patients.

Development of Intelligent Credit Rating System using Support Vector Machines (Support Vector Machine을 이용한 지능형 신용평가시스템 개발)

  • Kim Kyoung-jae
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.9 no.7
    • /
    • pp.1569-1574
    • /
    • 2005
  • In this paper, I propose an intelligent credit rating system using a bankruptcy prediction model based on support vector machines (SVMs). SVMs are promising methods because they use a risk function consisting of the empirical error and a regularized term, which is derived from the structural risk minimization principle. This study examines the feasibility of applying SVMs to predicting corporate bankruptcies by comparing them with other data mining techniques. In addition, this study presents the architecture and a prototype of intelligent credit rating systems based on SVM models.
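
For reference, the risk function mentioned in the abstract can be written in the standard textbook soft-margin form shown below. This is the usual formulation and may differ in notation from the paper itself; the symbols $w$, $b$, $\xi_i$, $C$, and $N$ are assumptions introduced here, with $x_i$ a firm's feature vector and $y_i \in \{-1, +1\}$ its bankrupt / non-bankrupt label.

```latex
\min_{w,\,b,\,\xi}\quad
\underbrace{\frac{1}{2}\lVert w\rVert^{2}}_{\text{regularization term}}
\;+\;
C\,\underbrace{\sum_{i=1}^{N}\xi_i}_{\text{empirical error}}
\quad\text{subject to}\quad
y_i\,(w^{\top}x_i + b)\;\ge\;1-\xi_i,\qquad \xi_i\ge 0,\qquad i=1,\dots,N.
```

The constant $C$ trades off the two terms: a larger $C$ penalizes training errors more heavily, while a smaller $C$ favors a wider margin.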

Design and Implementation of a Distributed Data Mining Framework (분산된 데이터마이닝을 위한 프레임워크의 설계 및 구현)

  • Kadel, Prakash;Choi, Ho-Jin
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2007.06c
    • /
    • pp.336-340
    • /
    • 2007
  • We envisage that grid computing environments allow us to implement distributed data mining services, that is, those applications which analyze large sets of geographically distributed databases and information using the computational power and resources of a grid environment. This paper describes an experimental framework towards such a distributed data mining approach, including design considerations and a prototype implementation. Based on the "Knowledge Grid" architecture suggested by Cannataro et al., we identify four major components - user node, broker node, data node, and computation node - and define their individual roles. For implementing the prototype, we have investigated methods for utilizing distributed resources within a grid computing environment, e.g., communication and coordination among the various resources available.

Implementing Linear Models in Genetic Programming to Utilize Accumulated Data in Shipbuilding (조선분야의 축적된 데이터 활용을 위한 유전적프로그래밍에서의 선형(Linear) 모델 개발)

  • Lee, Kyung-Ho;Yeun, Yun-Seog;Yang, Young-Soon
    • Journal of the Society of Naval Architects of Korea
    • /
    • v.42 no.5 s.143
    • /
    • pp.534-541
    • /
    • 2005
  • Korean shipyards have accumulated a great amount of data, but they do not have appropriate tools to utilize the data in practical work. Engineering data inherently contains experts' experience and know-how. It is very useful to extract knowledge or information from the accumulated data by using data mining techniques. This paper treats an evolutionary computation method based on genetic programming (GP), which can be one of the components for realizing data mining. The paper deals with linear models of GP for regression or approximation problems when the given learning samples are not sufficient. The linear model, which is a function of unknown parameters, is built by extracting all possible base functions from the standard GP tree using a symbolic processing algorithm. In addition to a standard linear model consisting of mathematical functions, one variant form of a linear model, which can be built using a low-order Taylor series and can be converted into the standard form of a polynomial, is considered in this paper. The suggested model can be utilized as a design tool to predict design parameters from a small amount of accumulated data.
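
The idea of a model that is linear in its unknown parameters can be illustrated with a minimal sketch. It assumes the base functions have already been extracted from a GP tree (here they are simply hard-coded and hypothetical) and fits the coefficients by ordinary least squares; the abstract does not say which fitting procedure the authors use, so least squares is an assumption, as are all function names.

```python
from typing import Callable, List, Sequence

Basis = Callable[[float], float]


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("base functions are linearly dependent on this data")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear_model(bases: Sequence[Basis], xs: Sequence[float],
                     ys: Sequence[float]) -> List[float]:
    """Least-squares coefficients for y ~ sum_k coeff_k * bases[k](x)."""
    phi = [[f(x) for f in bases] for x in xs]  # design matrix
    k = len(bases)
    # Normal equations: (phi^T phi) coeffs = phi^T y
    gram = [[sum(row[i] * row[j] for row in phi) for j in range(k)] for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(phi, ys)) for i in range(k)]
    return solve_linear_system(gram, rhs)


def predict(bases: Sequence[Basis], coeffs: Sequence[float], x: float) -> float:
    return sum(c * f(x) for c, f in zip(coeffs, bases))


# Hypothetical base functions, standing in for those extracted from a GP tree.
bases = [lambda x: 1.0, lambda x: x, lambda x: x * x]
xs = [0.0, 1.0, 2.0, 3.0, 4.0]                   # made-up learning samples
ys = [1.0 + 2.0 * x + 0.5 * x * x for x in xs]   # generated from known coefficients
coeffs = fit_linear_model(bases, xs, ys)
```

Because the model is linear in the coefficients, the fit reduces to solving one small linear system even when the base functions themselves are nonlinear in `x`.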

Distributed and Scalable Intrusion Detection System Based on Agents and Intelligent Techniques

  • El-Semary, Aly M.;Mostafa, Mostafa Gadal-Haqq M.
    • Journal of Information Processing Systems
    • /
    • v.6 no.4
    • /
    • pp.481-500
    • /
    • 2010
  • The explosive growth of the Internet and the increase in crucial web applications such as e-banking and e-commerce make network security tools essential. One such tool is an intrusion detection system, which can be classified by detection approach as signature-based or anomaly-based. Even though intrusion detection systems are well defined, their cooperation with each other to detect attacks needs to be addressed. Consequently, a new architecture that allows them to cooperate in detecting attacks is proposed. The architecture uses software agents to provide scalability and distributability. It works in two modes: learning and detection. During learning mode, it generates a profile for each individual system using a fuzzy data mining algorithm. During detection mode, each system uses FuzzyJess to match network traffic against its profile. The architecture was tested against a standard data set produced by MIT's Lincoln Laboratory, and the primary results show its efficiency and capability to detect attacks. Finally, two new methods, the memory-window and memoryless-window, were developed for extracting useful parameters from raw packets. The parameters are used as detection metrics.

Establishment of the roof model and optimization of the working face length in top coal caving mining

  • Chang-Xiang Wang;Qing-Heng Gu;Meng Zhang;Cheng-Yang Jia;Bao-Liang Zhang;Jian-Hang Wang
    • Geomechanics and Engineering
    • /
    • v.36 no.5
    • /
    • pp.427-440
    • /
    • 2024
  • This study concentrates on the 301 comprehensive caving working face, notable for its considerable mining height. The roof model is established by integrating prior geological data and the latest borehole rock stratum's physical and mechanical parameters. This comprehensive approach enables the determination of lithology, thickness, and mechanical properties of the roof within 50 m of the primary mining coal seam. Utilizing the transfer rock beam theory and incorporating mining pressure monitoring data, the study delves into the geometric parameters of the direct roof, basic roof movement, and roof pressure during the initial mining process of the 301 comprehensive caving working face. The direct roof of the mining working face is stratified into upper and lower sections. The lower direct roof consists of 6.0 m thick coarse sandstone, while the upper direct roof comprises 9.2 m coarse sandstone, 2.6 m sandy mudstone, and 2.8 m medium sandstone. The basic roof stratum, totaling 22.1 m in thickness, includes layers such as silty sand, medium sandstone, sandy mudstone, and coal. The first pressure step of the basic roof is 61.6 m, with theoretical research indicating a maximum roof pressure of 1.62 MPa during periodic pressure. Extensive simulations and analyses of roof subsidence and advanced abutment pressure were carried out under varying working face lengths. The optimal roof control effect is observed when the mining face length falls within the range of 140 m to 155 m. This study holds significance as it optimizes the working face length in thick coal seams, enhancing safety and efficiency in coal mining operations.

Developing an User Location Prediction Model for Ubiquitous Computing based on a Spatial Information Management Technique

  • Choi, Jin-Won;Lee, Yung-Il
    • Architectural research
    • /
    • v.12 no.2
    • /
    • pp.15-22
    • /
    • 2010
  • Our prediction model is based on the development of a "Semantic Location Model." It embodies geometrical and topological information, which can increase prediction efficiency and make the prediction model easy to manipulate. Data mining is used to extract the inhabitant's location patterns generated day by day. As a result, the self-learning system will be able to semantically predict the inhabitant's location in advance. This context-aware system provides a key component of the ubiquitous computing environment. First, we explain the semantic location model and data mining methods. Then the location prediction model for the ubiquitous computing system is described in detail. Finally, the prototype system is introduced to demonstrate and evaluate our prediction model.
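
As a rough illustration of learning location patterns from daily movement data, the sketch below counts how often one location follows another and predicts the most frequent follower. The abstract does not name the mining technique, so this first-order transition count is an assumption, and the room names and function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional


def learn_transitions(visits: List[str]) -> Dict[str, Counter]:
    """Count how often each location is followed directly by another."""
    transitions: Dict[str, Counter] = defaultdict(Counter)
    for current, following in zip(visits, visits[1:]):
        transitions[current][following] += 1
    return transitions


def predict_next(transitions: Dict[str, Counter], current: str) -> Optional[str]:
    """Return the most frequently observed next location, or None if unseen."""
    followers = transitions.get(current)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Made-up sequence of semantic locations visited by one inhabitant.
visits = [
    "bedroom", "bathroom", "kitchen", "living_room",
    "bedroom", "bathroom", "kitchen", "study",
    "bedroom", "bathroom",
]
model = learn_transitions(visits)
next_room = predict_next(model, "bedroom")
```

A semantic location model would add geometry and topology on top of such counts, for example by restricting candidate next locations to rooms adjacent to the current one.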

STATISTICALLY PREPROCESSED DATA BASED PARAMETRIC COST MODEL FOR BUILDING PROJECTS

  • Sae-Hyun Ji;Moonseo Park;Hyun-Soo Lee
    • International conference on construction engineering and project management
    • /
    • 2009.05a
    • /
    • pp.417-424
    • /
    • 2009
  • For a construction project to progress smoothly, effective cost estimation is vital, particularly in the conceptual and schematic design stages. In these early phases, although initial estimates are highly sensitive to changes in project scope, owners require accurate forecasts that reflect the information they supply. Thus, cost estimators need effective estimation strategies. In practice, parametric cost estimating, which utilizes historical cost data, is the most commonly used method in these initial phases (Karshenas 1984, Kirkham 2007). Hence, compiling historical data on the parameters that govern cost variance is a prime requirement. However, data mining (data preprocessing) to remove internal errors and abnormal values is needed before compilation. To deal with this issue, this research proposed a statistical methodology for data preprocessing and verified that data preprocessing has a positive impact on estimate accuracy and stability. Moreover, Statistically Preprocessed data Based Parametric (SPBP) cost models were developed based on multiple regression equations, and their effectiveness was verified against conventional cost models.
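
The preprocessing step can be pictured with a small sketch that drops abnormal values before any regression model is fitted. The abstract does not specify the statistical method, so the z-score rule, the threshold of 2.0, and the sample figures below are all assumptions made for illustration.

```python
from statistics import mean, stdev
from typing import List, Sequence


def remove_outliers(values: Sequence[float], z_max: float = 2.0) -> List[float]:
    """Keep only values whose z-score magnitude is at most z_max."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return list(values)  # all values identical, nothing to remove
    return [v for v in values if abs(v - mu) / sigma <= z_max]


# Made-up historical unit costs; the last entry mimics a data-entry error.
unit_costs = [1020.0, 980.0, 1005.0, 995.0, 1010.0, 990.0, 2500.0]
cleaned = remove_outliers(unit_costs)
```

A single extreme value inflates both the mean and the standard deviation, so on small samples a median-based rule may be a more robust choice than the z-score used here.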

Data Server Mining applied Neural Networks in Distributed Environment (분산 환경에서 신경망을 응용한 데이터 서버 마이닝)

  • 박민기;김귀태;이재완
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2003.05a
    • /
    • pp.473-476
    • /
    • 2003
  • Nowadays, the Internet plays the role of a large distributed information service center, and the various information and database servers that manage it reside in a distributed network environment. However, it is difficult to decide which server should handle input data depending on the data's properties. In this paper, we designed a server mining mechanism and an intelligent data mining system architecture that uses a neural network to handle input data patterns as efficiently as possible among the various data in a distributed environment. As a result, a new input data pattern can be processed after its destination server is decided by a dynamic binding method implemented with a neural network. This mechanism can be applied to data warehouses, telecommunication and load pattern analysis, population census analysis, and medical data analysis.

A Study on Process Management Method of Offshore Plant Piping Material using Process Mining Technique (프로세스 마이닝 기법을 이용한 해양플랜트 배관재 제작 공정 관리 방법에 관한 연구)

  • Park, JungGoo;Kim, MinGyu;Woo, JongHun
    • Journal of the Society of Naval Architects of Korea
    • /
    • v.56 no.2
    • /
    • pp.143-151
    • /
    • 2019
  • This study describes a method for analyzing log data generated in a process using process mining techniques. A system was constructed for collecting and analyzing the large amount of log data generated in manufacturing offshore plant piping material. The analyzed data was visualized through various methods. Through analysis of the process model, we evaluated whether the process performance data had been entered correctly. Through pattern analysis of the log data, it is possible to check beforehand whether a problem process has occurred. In addition, we analyzed the process performance data of partner companies and identified the load of their processes. These data can be used as reference data for pipe production allocation. Real-time decision-making is required to cope with the various variances that arise in offshore plant production. To this end, we have built a system that can analyze the log data of a real-time system and make decisions.
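
A common first step in process mining is to derive a directly-follows relation from an event log, and the sketch below shows that step under stated assumptions. The abstract does not describe the authors' log schema or tooling, so the `(case_id, activity, timestamp)` layout, the spool identifiers, and the activity names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Event = Tuple[str, str, int]  # (case_id, activity, timestamp)


def directly_follows(event_log: List[Event]) -> Counter:
    """Count how often activity a is immediately followed by b within a case."""
    traces: Dict[str, List[str]] = defaultdict(list)
    # Order events by case and then by time to rebuild each case's trace.
    for case_id, activity, _ts in sorted(event_log, key=lambda e: (e[0], e[2])):
        traces[case_id].append(activity)
    counts: Counter = Counter()
    for activities in traces.values():
        for a, b in zip(activities, activities[1:]):
            counts[(a, b)] += 1
    return counts


# Made-up log for two pipe spools; spool-2 is welded before fitting and reworked.
log: List[Event] = [
    ("spool-1", "cutting", 1), ("spool-1", "fitting", 2),
    ("spool-1", "welding", 3), ("spool-1", "inspection", 4),
    ("spool-2", "cutting", 1), ("spool-2", "welding", 2),
    ("spool-2", "fitting", 3), ("spool-2", "welding", 4),
    ("spool-2", "inspection", 5),
]
dfg = directly_follows(log)
```

Pairs that are rare or that contradict the expected process order, such as `("cutting", "welding")` in this made-up log, are natural candidates for the kind of pattern check the abstract describes.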