• Title/Summary/Keyword: tree-based models

ANALYZING DYNAMIC FAULT TREES DERIVED FROM MODEL-BASED SYSTEM ARCHITECTURES

  • Dehlinger, Josh;Dugan, Joanne Bechta
    • Nuclear Engineering and Technology / v.40 no.5 / pp.365-374 / 2008
  • Dependability-critical systems, such as digital instrumentation and control systems in nuclear power plants, require engineering techniques and tools that provide assurance of their safety and reliability. Determining system reliability at the architectural design phase is important because it can guide design decisions and provide crucial information for trade-off analysis and system cost estimation. Despite this, reliability engineering and systems engineering remain separate disciplines with separate processes, so dependability analysis results may not represent the system as designed. In this article we provide an overview and application of our approach to building architecture-based, dynamic system models for dependability-critical systems and then automatically generating dynamic fault trees (DFTs) for comprehensive, tool-supported reliability analysis. Specifically, we use the Architecture Analysis and Design Language (AADL) to model the structural, behavioral and failure aspects of the system in a composite architecture model. From the AADL model, we seek to derive the DFT(s) and use Galileo's automated reliability analyses to estimate system reliability. This approach narrows the expertise gap between dependability engineering and systems engineering, integrates the dependability and systems engineering design and development processes, and enables more formal, automated and consistent DFT construction. We illustrate this work using an example based on a dynamic digital feed-water control system for a nuclear reactor.
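
As a rough illustration of the kind of quantity a fault-tree analysis produces, the sketch below evaluates a small static fault tree with AND and OR gates. It covers only the static part: the order-dependent gates that make a fault tree dynamic generally need state-based analysis and are not modeled here. The component names and failure probabilities are hypothetical, independence between basic events is assumed, and nothing here reflects the Galileo tool or the AADL-derived trees described in the abstract.

```python
from math import prod


def and_gate(probs):
    """Output event occurs only if every input event occurs (independence assumed)."""
    return prod(probs)


def or_gate(probs):
    """Output event occurs if at least one input event occurs (independence assumed)."""
    return 1.0 - prod(1.0 - p for p in probs)


# Hypothetical basic-event probabilities for a redundant controller pair
# and a single shared sensor (illustrative numbers only).
p_controller_a = 0.01
p_controller_b = 0.01
p_sensor = 0.001

# Control is lost if both controllers fail, or if the shared sensor fails.
p_both_controllers = and_gate([p_controller_a, p_controller_b])
p_top = or_gate([p_both_controllers, p_sensor])

print(f"top-event probability: {p_top:.7f}")
```

The redundant pair contributes far less to the top event than the single shared sensor, which is the sort of design insight an architecture-level reliability estimate is meant to surface.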

Machine Learning-Based Rapid Prediction Method of Failure Mode for Reinforced Concrete Column (기계학습 기반 철근콘크리트 기둥에 대한 신속 파괴유형 예측 모델 개발 연구)

  • Kim, Subin;Oh, Keunyeong;Shin, Jiuk
    • Journal of the Earthquake Engineering Society of Korea / v.28 no.2 / pp.113-119 / 2024
  • In existing reinforced concrete buildings with seismically deficient column details, the overall behavior depends on the failure mode of the columns. This study aims to develop and validate a machine learning-based prediction model for column failure modes (shear, flexure-shear, and flexure). For this purpose, artificial neural network (ANN), K-nearest neighbor (KNN), decision tree (DT), and random forest (RF) models were trained on previously collected experimental data. Using these four machine learning methods, we developed classification models that predict the column failure mode from six input variables: concrete compressive strength, steel yield strength, axial load ratio, height-to-depth aspect ratio, longitudinal reinforcement ratio, and transverse reinforcement ratio. The performance of each model was compared and verified by calculating accuracy, precision, recall, F1-score, and ROC. The RF model achieved the highest average value across these performance measures among the methods considered, and it predicts the shear failure mode conservatively. Thus, the RF model can rapidly predict column failure modes from simple column details.
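
The performance measures named above can be sketched as follows for a three-class problem. The labels and predictions are made up for illustration and are not the study's data; the function names are hypothetical, and precision, recall and F1 are computed one class at a time (one-vs-rest).

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, F1) for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up failure-mode labels (illustrative only).
y_true = ["shear", "shear", "flexure", "flexure-shear", "flexure", "shear"]
y_pred = ["shear", "flexure-shear", "flexure", "flexure-shear", "flexure", "shear"]

print("accuracy:", accuracy(y_true, y_pred))
for label in ("shear", "flexure-shear", "flexure"):
    print(label, per_class_metrics(y_true, y_pred, label))
```

Recall for the shear class is the measure most closely tied to safety in this setting, since a missed shear failure is the costly error.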

Data-centric XAI-driven Data Imputation of Molecular Structure and QSAR Model for Toxicity Prediction of 3D Printing Chemicals (3D 프린팅 소재 화학물질의 독성 예측을 위한 Data-centric XAI 기반 분자 구조 Data Imputation과 QSAR 모델 개발)

  • ChanHyeok Jeong;SangYoun Kim;SungKu Heo;Shahzeb Tariq;MinHyeok Shin;ChangKyoo Yoo
    • Korean Chemical Engineering Research / v.61 no.4 / pp.523-541 / 2023
  • As 3D printers become more accessible, exposure to chemicals associated with 3D printing is becoming more frequent. However, research on the toxicity and harmfulness of chemicals generated by 3D printing is insufficient, and the performance of toxicity prediction using in silico techniques is limited by missing molecular structure data. In this study, a quantitative structure-activity relationship (QSAR) model based on a data-centric AI approach was developed to predict the toxicity of new 3D printing materials by imputing missing values in molecular descriptors. First, the MissForest algorithm was used to impute missing values in the molecular descriptors of hazardous 3D printing materials. Then, based on four machine learning models (decision tree, random forest, XGBoost, SVM), a machine learning (ML)-based QSAR model was developed to predict the bioconcentration factor (Log BCF), octanol-air partition coefficient (Log Koa), and partition coefficient (Log P). Furthermore, the reliability of the data-centric QSAR model was validated with the Tree-SHAP (SHapley Additive exPlanations) method, one of the explainable artificial intelligence (XAI) techniques. The proposed MissForest-based imputation method yielded approximately 2.5 times more molecular structure data than the existing data set. Based on the imputed data set of molecular descriptors, the developed data-centric QSAR model achieved approximately 73%, 76% and 92% prediction performance for Log BCF, Log Koa, and Log P, respectively. Lastly, Tree-SHAP analysis demonstrated that the data-centric QSAR model achieved high prediction performance for toxicity information by identifying key molecular descriptors highly correlated with the toxicity indices. Therefore, the proposed QSAR model based on the data-centric XAI approach can be extended to predict the toxicity of potential pollutants from emerging printing chemicals, chemical processes, and semiconductor or display processes.

Study on Predicting the Designation of Administrative Issue in the KOSDAQ Market Based on Machine Learning Based on Financial Data (머신러닝 기반 KOSDAQ 시장의 관리종목 지정 예측 연구: 재무적 데이터를 중심으로)

  • Yoon, Yanghyun;Kim, Taekyung;Kim, Suyeong
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.17 no.1 / pp.229-249 / 2022
  • This paper investigates machine learning models for predicting the designation of administrative issues in the KOSDAQ market using various techniques. When a company in the Korean stock market is designated as an administrative issue, the market treats the event itself as negative information, causing losses to the company and its investors. The purpose of this study is to evaluate alternative methods for developing an artificial intelligence service that detects the possibility of such a designation early from a company's financial ratios and helps investors manage portfolio risk. The independent variables are 21 financial ratios representing profitability, stability, activity, and growth. Financial data of companies with and without administrative-issue designation were sampled from 2011 to 2020, the period in which K-IFRS was applied. Logistic regression, decision tree, support vector machine, random forest, and LightGBM were used to predict the designation of administrative issues. According to the results, LightGBM is the best prediction model with 82.73% classification accuracy, and the decision tree is the weakest with 71.94% accuracy. An examination of the three most important variables in each decision tree-based model shows that the financial variables common to all models are ROE (net profit) and capital stock turnover ratio, which are relatively important variables in the designation of administrative issues. In general, the ensemble models showed higher predictive performance than the single models.

Machine Learning for Predicting Entrepreneurial Innovativeness (기계학습을 이용한 기업가적 혁신성 예측 모델에 관한 연구)

  • Chung, Doo Hee;Yun, Jin Seop;Yang, Sung Min
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.16 no.3 / pp.73-86 / 2021
  • The primary purpose of this paper is to explore advanced models that predict entrepreneurial innovativeness most accurately. For the first time in the field of entrepreneurship research, it presents a model that predicts entrepreneurial innovativeness using machine learning, a data-scientific approach. It uses 22,099 observations from the Global Entrepreneurship Monitor (GEM) covering 62 countries to build predictive models. Based on a data set of 27 explanatory variables, it builds predictive models using a traditional statistical method (multiple regression analysis) and machine learning models such as regression tree, random forest, XGBoost, and artificial neural networks, and then compares the performance of each model. Performance is evaluated with root mean square error (RMSE), mean absolute error (MAE) and correlation. The results show that all five machine learning models perform better than traditional methods, and that XGBoost has the best predictive performance. In the XGBoost predictions, the variables with the highest contribution are entrepreneurial opportunities and the interaction terms with market expansion, which indicates that entrepreneurs who seek opportunities in new markets exhibit high innovativeness.
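
The three evaluation indicators mentioned above can be computed as in the minimal sketch below. The observed and predicted values are invented for illustration, the function names are hypothetical, and correlation is taken here to mean the Pearson coefficient, which is an assumption since the abstract does not say which correlation was used.

```python
from math import sqrt


def rmse(y, y_hat):
    """Root mean square error: penalizes large errors more heavily."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def mae(y, y_hat):
    """Mean absolute error: average magnitude of the errors."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def pearson(y, y_hat):
    """Pearson correlation between observed and predicted values."""
    n = len(y)
    mean_y = sum(y) / n
    mean_h = sum(y_hat) / n
    cov = sum((a - mean_y) * (b - mean_h) for a, b in zip(y, y_hat))
    norm_y = sqrt(sum((a - mean_y) ** 2 for a in y))
    norm_h = sqrt(sum((b - mean_h) ** 2 for b in y_hat))
    return cov / (norm_y * norm_h)


# Invented observed scores and model predictions (illustrative only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print("RMSE:", rmse(observed, predicted))
print("MAE:", mae(observed, predicted))
print("correlation:", pearson(observed, predicted))
```

Lower RMSE and MAE and higher correlation indicate a better model; RMSE and MAE can rank models differently when a few predictions are far off.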

Prediction of Land-cover Change in the Gongju Areas using Fuzzy Logic and Geo-spatial Information (퍼지 논리와 지리공간정보를 이용한 공주지역 토지피복 변화 예측)

  • Jang, Dong-Ho
    • Journal of Environmental Impact Assessment / v.14 no.6 / pp.387-402 / 2005
  • In this study, we used fuzzy logic operations to predict future land-cover change in the Gongju area and to examine the relationship between land-cover change and geo-spatial information. The prediction models were evaluated quantitatively using a prediction rate curve. Based on an analysis of correlations between the geo-spatial information and land-cover change, the class with the highest correlation was extracted. Fuzzy operations were used to predict land-cover change and to determine the most suitable land-cover prediction maps. In urban areas, the expansion of the old and new towns was predicted to center on the Geum River, and urbanization was also predicted to expand along the interchange and national roads. Among agricultural areas, those adjacent to national roads connected to small tributaries of the Geum River, and their neighboring areas, are likely to change. Most of the forest areas are located in the southeast, which suggests why the extensive chestnut-tree cultivation complex is located there and why the possibility of forest damage is very high. Validation with the prediction rate curve indicated that, among the fuzzy operators, the maximum operator was the most suitable for analyzing land-cover change in urban and agricultural areas, while the other fuzzy operators showed similar prediction capabilities. However, in the prediction rate curves of the integrated models for forest areas, most fuzzy operators showed poorer prediction capabilities. Thus, new thematic maps or prediction models are needed to predict changes in forest areas effectively.
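
To make the operator comparison above concrete, the sketch below combines the fuzzy membership values of several thematic layers for a single map cell using operators commonly found in GIS overlay work. The abstract names only the maximum operator explicitly, so the other operators, the layer names, and the membership values are assumptions for illustration.

```python
from math import prod


def fuzzy_and(mu):
    """Minimum operator: the most pessimistic layer decides."""
    return min(mu)


def fuzzy_or(mu):
    """Maximum operator: the most favorable layer decides."""
    return max(mu)


def fuzzy_product(mu):
    """Algebraic product: result is at most the smallest membership."""
    return prod(mu)


def fuzzy_sum(mu):
    """Algebraic sum: result is at least the largest membership."""
    return 1.0 - prod(1.0 - m for m in mu)


def fuzzy_gamma(mu, gamma=0.5):
    """Gamma operator: compromise between algebraic sum and product."""
    return fuzzy_sum(mu) ** gamma * fuzzy_product(mu) ** (1.0 - gamma)


# Hypothetical memberships of one cell in three layers
# (e.g. distance to road, slope, distance to river).
cell = [0.2, 0.5, 0.8]

for name, op in [("AND", fuzzy_and), ("OR", fuzzy_or), ("product", fuzzy_product),
                 ("sum", fuzzy_sum), ("gamma", fuzzy_gamma)]:
    print(f"{name:8s}{op(cell):.4f}")
```

Applying one operator to every cell yields one prediction map; comparing the maps against observed change, for example with a prediction rate curve, is how an operator would be selected.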

Improving the Effectiveness of Customer Classification Models: A Pre-segmentation Approach (사전 세분화를 통한 고객 분류모형의 효과성 제고에 관한 연구)

  • Chang, Nam-Sik
    • Information Systems Review / v.7 no.2 / pp.23-40 / 2005
  • Discovering customers' behavioral patterns from large data sets and providing them with corresponding services or products are critical components of managing a modern business. However, the diversity of customer needs, coupled with limited resources, suggests that companies should put more effort into understanding and managing specific groups of customers rather than the entire customer base. The key premise of this paper is that the behavioral patterns extracted from specific groups of customers differ from those extracted from all customers. This paper proposes pre-segmentation before developing customer classification models. We collected three sets of customer demographic and transactional data from a credit card company, a telecommunications company, and an insurance company in Korea, and then segmented customers by major variables. Churn prediction models were developed for each segment and for the whole data set using decision tree induction, and were compared in terms of hit ratio and the simplicity of the generated rules.

A study on Data Preprocessing for Developing Remaining Useful Life Predictions based on Stochastic Degradation Models Using Air Craft Engine Data (항공엔진 열화데이터 기반 잔여수명 예측력 향상을 위한 데이터 전처리 방법 연구)

  • Yoon, Yeon Ah;Jung, Jin Hyeong;Lim, Jun Hyoung;Chang, Tai-Woo;Kim, Yong Soo
    • Journal of Korean Society of Industrial and Systems Engineering / v.43 no.2 / pp.48-55 / 2020
  • Recently, studies of prognostics and health management (PHM) have been conducted to diagnose failures and predict the life of aircraft engine parts using sensor data. PHM is a framework that provides individualized solutions for managing system health. This study predicted the remaining useful life (RUL) of aeroengines using sensor-collected degradation data provided by the IEEE 2008 PHM Conference Challenge. The data comprise sensor records from 218 engines with initial wear and production deviations. It was difficult to determine the characteristics of the engine parts because system and domain-specific information was not provided. Each engine has a different number of cycles, making it difficult to use time series models. Therefore, the analysis was performed using machine learning algorithms rather than statistical time series models. The machine learning algorithms used were random forest, gradient boosted trees and XGBoost. A sliding window was applied to develop RUL predictions. We compared model performance before and after applying the sliding window and proposed a data preprocessing method for RUL prediction. The models were evaluated by R-squared scores and root mean square error (RMSE). The XGBoost model with the random split method and sliding window preprocessing showed the best predictive performance.
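
The sliding-window idea can be sketched as follows: each row summarizes the most recent readings of one sensor and is paired with the remaining cycles until failure. The window length, the summary statistics, the RUL labeling (zero at the last recorded cycle of a run-to-failure series), and the sensor values are all assumptions for illustration; the abstract does not specify the paper's actual preprocessing.

```python
def window_features(series, window):
    """Build one feature row per cycle once a full window of readings is available.

    series: sensor readings for one engine, ordered by cycle, run to failure.
    window: number of most recent cycles summarized in each row (hypothetical choice).
    """
    rows = []
    total_cycles = len(series)
    for end in range(window, total_cycles + 1):
        recent = series[end - window:end]
        rows.append({
            "cycle": end,                      # 1-based cycle index of the newest reading
            "mean": sum(recent) / window,      # smoothed level of the sensor
            "min": min(recent),
            "max": max(recent),
            "trend": recent[-1] - recent[0],   # change across the window
            "rul": total_cycles - end,         # cycles left until the series ends
        })
    return rows


# Invented readings from one engine's sensor (illustrative only).
readings = [10.0, 11.0, 13.0, 16.0, 20.0]

for row in window_features(readings, window=3):
    print(row)
```

Because every engine yields rows of the same shape regardless of how many cycles it ran, rows from all engines can be pooled and passed to a tabular learner such as a random forest or a boosted-tree model.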

Study on Development of Classification Model and Implementation for Diagnosis System of Sasang Constitution (사상체질 분류모형 개발 및 진단시스템의 구현에 관한 연구)

  • Beum, Soo-Gyun;Jeon, Mi-Ran;Oh, Am-Suk
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.08a / pp.155-159 / 2008
  • In this thesis, in order to develop a new classification model of Sasang constitutional medical types that helps improve diagnostic accuracy, various data-mining classification methods (discriminant analysis, decision tree analysis, neural network analysis, logistic regression analysis, and clustering analysis) were applied to questionnaires for medical type classification. In this manner, a model that scientifically classifies constitutional medical types in the field of Sasang Constitutional Medicine, a branch of traditional Korean medicine, was developed. The above-mentioned analysis models were also systematically compared and analyzed. In this study, a classification of Sasang constitutional medical types was developed based on the discriminant analysis model and the decision tree analysis model, which have relatively high accuracy, follow an analysis procedure that is easy to understand and explain, and are easy to implement. A diagnosis system for Sasang constitution was also implemented using these two models.

System dynamics simulation of the thermal dynamic processes in nuclear power plants

  • El-Sefy, Mohamed;Ezzeldin, Mohamed;El-Dakhakhni, Wael;Wiebe, Lydell;Nagasaki, Shinya
    • Nuclear Engineering and Technology / v.51 no.6 / pp.1540-1553 / 2019
  • A nuclear power plant (NPP) is a highly complex system-of-systems, as manifested through the interdependence of its internal systems. The negative impact of such interdependence was demonstrated by the 2011 Fukushima Daiichi nuclear disaster. As such, there is a critical need for new strategies to overcome the limitations of current risk assessment techniques (e.g. the use of static event and fault tree schemes), particularly through simulation of the nonlinear dynamic feedback mechanisms between the different NPP systems/components. As the first and key step towards developing an integrated NPP dynamic probabilistic risk assessment platform that can account for such feedback mechanisms, the current study adopts a system dynamics simulation approach to model the thermal dynamic processes in: the reactor core; the secondary coolant system; and the pressurized water reactor. The reactor core and secondary coolant system parameters used to develop the system dynamics models are based on those of the Palo Verde Nuclear Generating Station. These three system dynamics models are subsequently validated, using results from published work, under different system perturbations including changes in reactivity, the steam valve coefficient, the primary coolant flow, and others. Moving forward, the developed system dynamics models can be integrated with other interacting processes within an NPP to form the basis of a dynamic system-level (systemic) risk assessment tool.