• Title/Summary/Keyword: Data-driven


Multi-dimensional Contextual Conditions-driven Mutually Exclusive Learning for Explainable AI in Decision-Making

  • Hyun Jung Lee
    • Journal of Internet Computing and Services
    • /
    • v.25 no.4
    • /
    • pp.7-21
    • /
    • 2024
  • Machine learning includes techniques such as Reinforcement Learning, Deep Learning, and Neural Network Learning. Recently, Large Language Models (LLMs) have become widely used for Generative AI based on Reinforcement Learning. Through fine-tuning, such a model makes the decisions that yield the best rewards in a particular situation. Unfortunately, LLMs cannot explain how they reach a goal because their training is based on black-box AI. Reinforcement Learning as black-box AI relies on a graph-evolving structure that derives enhanced solutions through adjustment by human feedback or reinforced data. In this research, Mutually Exclusive Learning (MEL) is proposed for mutually exclusive decision-making, to explain the chosen goals that are achieved by a decision on both ends with specified conditions. In MEL, the decision-making process is based on a tree structure, and the record of pruned branches serves as the explanation of how the goals are achieved. A goal is reached through trade-offs among mutually exclusive alternatives according to the specific contextual conditions. The tree-based structure is therefore adopted to provide feasible solutions together with explanations based on the pruned branches. The sequence of pruning steps explains the inferences and the ways to reach the goals, in the sense of Explainable AI (XAI). Learning proceeds by pruning branches according to multi-dimensional contextual conditions. To deepen the search, these conditions comprise a time window that sets the temporal perspective, a depth of phases for lookahead, and decision criteria for pruning branches. The goal depends on the pruning policy, which can change dynamically with the situation configured by the specific multi-dimensional contextual conditions at a particular moment. The explanation is represented by the episode chosen among the decision alternatives according to the configured situation. In this research, MEL adopts the tree-based learning model to explain the goal derived under specific conditions. As an example of a mutually exclusive problem, an employment process is used to demonstrate how the goal is reached and how the pruned branches explain it. Finally, further study to verify the effectiveness of MEL through experiments is discussed.

Application of data-driven model reduction techniques in reactor neutron field calculations

  • Zhaocai Xiang;Qiafeng Chen;Pengcheng Zhao
    • Nuclear Engineering and Technology
    • /
    • v.56 no.8
    • /
    • pp.2948-2957
    • /
    • 2024
  • High-order harmonic techniques can be used to reconstruct neutron flux distributions in reactor cores using the neutron diffusion equation. However, traditional source iteration and source correction iteration techniques converge slowly and require long calculation times. The correctness of the implicitly restarted Arnoldi method (IRAM) in solving the eigenvalue problems of the one-dimensional and two-dimensional neutron diffusion equations was confirmed by computing the benchmark problems SLAB_1D_1G and two-dimensional steady-state TWIGL. By integrating Galerkin projection with Proper Orthogonal Decomposition (POD), a POD-Galerkin reduced-order model was developed, with the IRAM model serving as the full-order model. For 14 macroscopic cross-section values, the TWIGL benchmark problem was perturbed within a 20% range. We extracted 100 sample points using the Latin hypercube sampling method, and 70% of the samples were used as the testing set to assess the performance of the reduced-order model. The remaining 30% were used as the training set to develop the reduced-order model, which was employed to reconstruct the TWIGL benchmark problem. The reduced-order model demonstrates good flexibility and can efficiently and accurately predict the effective multiplication factor and neutron flux distribution in the core. The reduced-order model predicts keff and the neutron flux distribution with a high degree of agreement with the full-order model. Additionally, the reduced-order model's computation time is only 10.18% of that required by the full-order model. The neutron flux distribution of the steady-state TWIGL benchmark was reconstructed using the reduced-order model. The obtained results indicate that the reduced-order model can accurately predict the keff and neutron flux distribution of the steady-state TWIGL benchmark. Overall, the proposed technique not only has the potential to accurately predict neutron flux distributions in transient settings, but is also relevant for reconstructing neutron flux distributions in steady-state conditions; thus, its applicability is expected to increase in the future.
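
The sampling step described in this abstract can be illustrated with a short sketch. It is not the authors' code: the function name `latin_hypercube`, the fixed seed, and the reading of "perturbed within a 20% range" as a multiplicative factor between 0.8 and 1.2 are assumptions made for illustration only.

```python
import random


def latin_hypercube(n_samples, n_dims, low, high, seed=0):
    """Draw n_samples points in [low, high]^n_dims by Latin hypercube sampling.

    Each dimension is split into n_samples equal-width strata, one point is
    drawn uniformly inside each stratum, and the strata are shuffled
    independently per dimension.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each stratum of the unit interval.
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(points)
        # Rescale from [0, 1) to [low, high).
        columns.append([low + p * (high - low) for p in points])
    # Transpose so that each row is one sample.
    return [list(row) for row in zip(*columns)]


# Assumed setup: 100 samples of 14 perturbation factors in [0.8, 1.2].
samples = latin_hypercube(n_samples=100, n_dims=14, low=0.8, high=1.2)
print(len(samples), len(samples[0]))
```

Each row would then scale the 14 nominal cross-section values before a full-order solve; that step depends on the solver and is left out here.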

Development of an Enhanced Risk Management System for Construction Defect Control in Industrial Plants

  • Kihun Song
    • International conference on construction engineering and project management
    • /
    • 2024.07a
    • /
    • pp.1313-1313
    • /
    • 2024
  • This paper proposes the development of an advanced Risk Management System (RMS) using Risk-Based Methodologies (RBM) specifically tailored for addressing construction defects in industrial plants. Urbanization and industrialization demand robust frameworks to handle the complexities and safety concerns in construction projects. Traditional risk management often overlooks critical aspects such as persistent construction defects. The study centers on the implementation of Risk-Based Inspection (RBI) techniques, tailored to enhance traditional risk management systems. This includes developing a specialized risk assessment tool alongside an online management platform, designed to provide continuous monitoring and comprehensive management of construction risks. The proposed system, RBE-i (Risk-Based Execution for Installation), focuses on identifying, evaluating, and mitigating risks effectively, using a systematic approach that integrates into existing construction workflows. The core of the RBE-i system lies in its ability to conduct thorough risk analyses and provide data in real time. It uses digital technologies to improve communication, operational efficiency, and decision-making processes across construction projects. By applying these methodologies, the system enhances safety and ensures more efficient project execution by preemptively identifying potential risks and addressing them promptly. Field applications of RBE-i have demonstrated its effectiveness in significantly reducing construction defects, thus validating its potential as a transformative tool in construction risk management.
The system sets new industry standards by shifting from reactive to proactive risk management practices, ultimately leading to safer, more reliable, and cost-effective construction operations. In conclusion, the RMS developed through this study not only addresses the pressing needs of construction risk management but also proposes a paradigm shift towards more proactive, structured, and technology-driven practices. The successful integration of the RBE-i system across various pilot projects illustrates its significant potential to improve overall project outcomes, making it an invaluable addition to the field of construction management.

A STATISTICAL ANALYSIS OF SOLAR WIND DYNAMIC PRESSURE PULSES DURING GEOMAGNETIC STORMS (지자기폭풍 기간 동안의 태양풍 동압력 펄스에 관한 통계적 분석)

  • Baek, J.H.;Lee, D.Y.;Kim, K.C.;Choi, C.R.;Moon, Y.J.;Cho, K.S.;Park, Y.D.
    • Journal of Astronomy and Space Sciences
    • /
    • v.22 no.4
    • /
    • pp.419-430
    • /
    • 2005
  • We have carried out a statistical analysis of solar wind dynamic pressure pulses during geomagnetic storms. The Dst index was used to identify 111 geomagnetic storms that occurred from 1997 through 2001. We selected only events with a minimum Dst value below -50 nT. To identify the pressure impact precisely, we used the horizontal (northward) component H of the magnetic field at low latitudes as well as the solar wind pressure data themselves. Our analysis leads to the following results: (1) the enhancement of H due to a pressure pulse tends to be proportional to the magnitude of the minimum Dst value; (2) the occurrence frequency of pressure pulses also increases with storm intensity; (3) for about 30% of our storms, the occurrence frequency of pressure pulses is greater than 0.4 per hour, implying that for those storms the pressure pulses occur more frequently than periodic substorms with an average substorm duration of 2.5 hrs. To understand the origin of these pressure pulses, we first examined the responsible storm drivers. It turns out that 65% of the studied storms were driven by coronal mass ejections (CMEs), while others are associated with corotating interaction regions (6.3%) or Type II bursts (7.2%). Of the storms driven by CMEs, over 70% show that the main phase interval overlaps with the sheath, namely the region between the CME body and the shock, and with the leading region of a CME. This suggests that the frequent pressure pulses often originate from density fluctuations in the sheath region and the leading edge of the CME body.

The impact of exposure to peer delinquency in elementary school students and the mediating effect of aggression: Comparison between male and female elementary school students (또래집단의 비행경험이 초등학생 비행경험에 미치는 영향: 공격성의 매개효과를 중심으로 -남녀 초등학생 비교-)

  • Lee, Sang Hoon;Choi, Bo Ram;Kim, Sung Hee;Jeong, Kyu Hyoung
    • Journal of the Korean Society of Child Welfare
    • /
    • no.58
    • /
    • pp.205-229
    • /
    • 2017
  • The purpose of this study was to examine gender differences in the impact of exposure to peer delinquency among elementary school-age students and the mediating effects of aggression. The study utilized 458 cases (220 male students, 238 female students) of data from the 2015 Korea Welfare Panel Study (KoWePS) conducted by the Korea Institute for Health and Social Affairs (KIHASA). The theoretical frameworks used in this study included Bandura's social learning theory, Akers' social learning theory, and Sutherland's differential association theory. The findings were as follows. First, gender had no statistically significant effect overall on peer group delinquency experience, aggression, or delinquency experience. Second, the delinquency experience of male students' peer groups had a statistically significant effect on their own delinquency; however, this was not true for female students. Third, in the case of male students, aggression was found to mediate the relationship between peer group delinquency experience and their own delinquency, but not for female students. From these findings, we suggest a practical and policy-driven intervention plan, focusing on reducing the frequency of contact with delinquency experience and on aggression, which was found to adversely affect elementary school students' delinquency.

Tokamak plasma disruption precursor onset time study based on semi-supervised anomaly detection

  • X.K. Ai;W. Zheng;M. Zhang;D.L. Chen;C.S. Shen;B.H. Guo;B.J. Xiao;Y. Zhong;N.C. Wang;Z.J. Yang;Z.P. Chen;Z.Y. Chen;Y.H. Ding;Y. Pan
    • Nuclear Engineering and Technology
    • /
    • v.56 no.4
    • /
    • pp.1501-1512
    • /
    • 2024
  • Plasma disruption in tokamak experiments is a challenging issue that causes damage to the device. Reliable prediction methods are needed, but the lack of full understanding of plasma disruption limits the effectiveness of physics-driven methods. Data-driven methods based on supervised learning are commonly used, and they rely on labelled training data. However, manual labelling of disruption precursors is a time-consuming and challenging task, as some precursors are difficult to identify accurately. The mainstream labelling methods assume that the precursor onset occurs at a fixed time before disruption, which leads to mislabelled samples and suboptimal prediction performance. In this paper, we present disruption prediction methods based on anomaly detection to address these issues, demonstrating good prediction performance on J-TEXT and EAST. By evaluating precursor onset times using different anomaly detection algorithms, it is found that labelling methods can be improved, since the onset times of different shots are not necessarily the same. The study optimizes precursor labelling using the onset times inferred by the anomaly detection predictor and tests the optimized labels on supervised learning disruption predictors. The results on J-TEXT and EAST show that the models trained on the optimized labels outperform those trained on fixed onset time labels.

Study on data preprocessing methods for considering snow accumulation and snow melt in dam inflow prediction using machine learning & deep learning models (머신러닝&딥러닝 모델을 활용한 댐 일유입량 예측시 융적설을 고려하기 위한 데이터 전처리에 대한 방법 연구)

  • Jo, Youngsik;Jung, Kwansue
    • Journal of Korea Water Resources Association
    • /
    • v.57 no.1
    • /
    • pp.35-44
    • /
    • 2024
  • Research in dam inflow prediction has actively explored the utilization of data-driven machine learning and deep learning (ML&DL) tools across diverse domains. Enhancing not just the inherent model performance but also accounting for model characteristics and preprocessing data are crucial elements for precise dam inflow prediction. Particularly, existing rainfall data, derived from snowfall amounts through heating facilities, introduces distortions in the correlation between snow accumulation and rainfall, especially in dam basins influenced by snow accumulation, such as Soyang Dam. This study focuses on the preprocessing of rainfall data essential for the application of ML&DL models in predicting dam inflow in basins affected by snow accumulation. This is vital to address phenomena like reduced outflow during winter due to low snowfall and increased outflow during spring despite minimal or no rain, both of which are physical occurrences. Three machine learning models (SVM, RF, LGBM) and two deep learning models (LSTM, TCN) were built by combining rainfall and inflow series. With optimal hyperparameter tuning, the appropriate model was selected, resulting in a high level of predictive performance with NSE ranging from 0.842 to 0.894. Moreover, to generate rainfall correction data considering snow accumulation, a simulated snow accumulation algorithm was developed. Applying this correction to machine learning and deep learning models yielded NSE values ranging from 0.841 to 0.896, indicating a similarly high level of predictive performance compared to the pre-snow accumulation application. Notably, during the snow accumulation period, adjusting rainfall during the training phase was observed to lead to a more accurate simulation of observed inflow when predicted. This underscores the importance of thoughtful data preprocessing, taking into account physical factors such as snowfall and snowmelt, in constructing data models.
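
The NSE values quoted in this abstract refer to the Nash-Sutcliffe efficiency, a standard goodness-of-fit score for hydrological models. A minimal sketch of how it is computed follows; the function name and the inflow numbers are hypothetical and are not taken from the study.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    A value of 1 means a perfect match; 0 means the model is no better
    than always predicting the observed mean.
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed series has zero variance")
    return 1.0 - residual / variance


# Hypothetical daily inflows (m^3/s), for illustration only.
observed = [12.0, 15.0, 30.0, 22.0, 18.0]
simulated = [11.0, 16.0, 27.0, 23.0, 19.0]
nse = nash_sutcliffe_efficiency(observed, simulated)
print(round(nse, 3))
```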

A Study on Web-based Technology Valuation System (웹기반 지능형 기술가치평가 시스템에 관한 연구)

  • Sung, Tae-Eung;Jun, Seung-Pyo;Kim, Sang-Gook;Park, Hyun-Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.1
    • /
    • pp.23-46
    • /
    • 2017
  • Although the value of specific companies or projects has been evaluated, mainly in developed countries in North America and Europe, since the early 2000s, systems and methodologies for estimating the economic value of individual technologies or patents have become increasingly active. Several online systems already evaluate qualitatively the grade of a technology or the rating of a patent, such as 'KTRS' of KIBO and 'SMART 3.1' of the Korea Invention Promotion Association. However, a web-based technology valuation system called the 'STAR-Value system', which calculates quantitative values of the subject technology for purposes such as business feasibility analysis, investment attraction, and tax or litigation, has been officially opened and has recently been spreading. In this study, we introduce the types of methodology and evaluation model, the reference information supporting them, and how the associated databases are used, focusing on the modules and frameworks embedded in the STAR-Value system. In particular, there are six valuation methods, including the discounted cash flow (DCF) method, a representative income-approach method that discounts anticipated future economic income to its present value, and the relief-from-royalty method, which calculates the present value of royalties, where the contribution of the subject technology to the business value created is taken as the royalty rate. We look at how the models and related support information (technology life, corporate (business) financial information, discount rate, industrial technology factors, etc.) can be used and linked in an intelligent manner. Based on classification information for the technology to be evaluated, such as the International Patent Classification (IPC) or the Korea Standard Industry Classification (KSIC), the STAR-Value system automatically returns metadata such as the technology cycle time (TCT), the sales growth rate and profitability data of similar companies or industry sectors, the weighted average cost of capital (WACC), and indices of industrial technology factors, and applies adjustment factors to them, so that the calculated technology value has high reliability and objectivity. Furthermore, if the information on the potential market size of the target technology and the market share of the commercializing entity is data-driven, or if the estimated value range of similar technologies by industry sector is provided from completed evaluation cases accumulated in the database, the STAR-Value system is expected to present a highly accurate value range in real time by intelligently linking its support modules. Beyond the valuation models and primary variables explained in this paper, the STAR-Value system is intended to be used more systematically and in a data-driven way by supporting an optimal model selection guideline module, an intelligent technology value range reasoning module, a market share prediction module based on similar company selection, and so on. In addition, research on the development and intelligence of the web-based STAR-Value system is significant in that it widely disseminates a web-based system that can be used to validate the theoretical feasibility of technology valuation and apply it in practice, and the system is expected to be utilized in various fields of technology commercialization.
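
The two valuation methods named in this abstract can be sketched in a few lines. This is a textbook-style illustration under simplifying assumptions (year-end cash flows, a constant discount rate, no tax or technology-contribution adjustments); the function names and figures are hypothetical and do not describe how the STAR-Value system itself is implemented.

```python
def discounted_cash_flow(cash_flows, discount_rate):
    """Present value of year-end cash flows for years 1..n."""
    return sum(
        cf / (1.0 + discount_rate) ** year
        for year, cf in enumerate(cash_flows, start=1)
    )


def relief_from_royalty(revenues, royalty_rate, discount_rate):
    """Present value of the royalties that owning the technology saves."""
    royalties = [revenue * royalty_rate for revenue in revenues]
    return discounted_cash_flow(royalties, discount_rate)


# Hypothetical inputs: five years of projected figures, with a WACC-like
# discount rate of 10% and an assumed royalty rate of 3%.
projected_cash_flows = [100.0, 120.0, 140.0, 150.0, 150.0]
projected_revenues = [1000.0, 1200.0, 1400.0, 1500.0, 1500.0]

print(round(discounted_cash_flow(projected_cash_flows, 0.10), 2))
print(round(relief_from_royalty(projected_revenues, 0.03, 0.10), 2))
```

In both cases the number of projection years would correspond to the economic life of the technology, which the abstract lists among the supporting inputs.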

Analysis of shopping website visit types and shopping pattern (쇼핑 웹사이트 탐색 유형과 방문 패턴 분석)

  • Choi, Kyungbin;Nam, Kihwan
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.85-107
    • /
    • 2019
  • Online consumers either browse products belonging to a particular product line or brand with the intention of purchasing, or simply navigate widely and leave without making a purchase. Research on the behavior and purchases of online consumers has progressed steadily, and related services and applications based on consumer behavior data have been developed in practice. In recent years, customization strategies and recommendation systems have been adopted thanks to the development of big data technology, and attempts are being made to optimize users' shopping experience. Even so, it remains very unlikely that an online consumer who visits a website will actually move on to the purchase stage. This is because online consumers do not visit a website only to purchase products; they use and browse websites differently according to their shopping motives and purposes. Therefore, to understand the behavior of online consumers, it is important to analyze the various types of visits, not only visits that end in a purchase. In this study, we performed session-level clustering analysis on clickstream data from an e-commerce company to explain the diversity and complexity of online consumers' search behavior and to classify that behavior into types. For the analysis, we converted more than 8 million page-level data points into visit-level sessions, resulting in a total of over 500,000 website visit sessions. For each visit session, 12 characteristics such as page views, duration, search diversity, and page type concentration were extracted for clustering analysis. Considering the size of the data set, we performed the analysis using the Mini-Batch K-means algorithm, which has advantages in learning speed and efficiency while maintaining clustering performance similar to that of the standard K-means algorithm. The optimal number of clusters was found to be four, and differences in session characteristics and purchase rates were identified for each cluster. An online consumer visits a website several times, learns about a product, and then decides whether to purchase. To analyze the purchasing process over several visits, we constructed visit sequence data for each consumer based on the navigation patterns derived from the clustering analysis. Each visit sequence covers the series of visits made until one purchase, and the items that make up a sequence are the cluster labels derived above. We built separate sequence data sets for consumers who made purchases and for consumers who only explored products without purchasing during the same period. Sequential pattern mining was then applied to extract frequent patterns from each data set. The minimum support was set to 10%, and each frequent pattern consists of a sequence of cluster labels. While some patterns are common to both data sets, others are frequent in only one of them. Through comparative analysis of the extracted frequent patterns, we found that consumers who made purchases showed a visiting pattern of repeatedly searching for a specific product before deciding to purchase it. The implication of this study is that we analyze the search types of online consumers using large-scale clickstream data and analyze their patterns to explain the purchasing process from a data-driven point of view. Most studies that build typologies of online consumers have focused on the characteristics of each type and on the key factors that distinguish the types. In this study, we classified the behavior of online consumers into types and further analyzed the order in which the types combine into a series of search patterns. In addition, online retailers will be able to try to improve purchase conversion through marketing strategies and recommendations tailored to the various visit types, and to evaluate the effect of those strategies through changes in consumers' visit patterns.
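
The notion of minimum support used in this abstract can be made concrete with a small sketch. It only scores candidate patterns that are supplied by hand; a full sequential pattern mining algorithm would also generate the candidates, and the abstract does not say which algorithm the authors used. The cluster labels and sequences below are hypothetical.

```python
def contains_subsequence(sequence, pattern):
    """True if the items of pattern occur in sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator until item is found.
    return all(item in remaining for item in pattern)


def support(sequences, pattern):
    """Fraction of sequences that contain pattern as a subsequence."""
    if not sequences:
        return 0.0
    hits = sum(contains_subsequence(seq, pattern) for seq in sequences)
    return hits / len(sequences)


def frequent_patterns(sequences, candidates, min_support=0.10):
    """Keep the candidate patterns whose support reaches min_support."""
    result = {}
    for pattern in candidates:
        value = support(sequences, pattern)
        if value >= min_support:
            result[pattern] = value
    return result


# Hypothetical visit sequences; each item is the cluster label of one visit.
visit_sequences = [
    ["browse", "search", "search", "detail"],
    ["search", "detail"],
    ["browse", "browse"],
    ["search", "search", "detail"],
]
candidates = [("search", "detail"), ("browse", "detail"), ("detail", "search")]
frequent = frequent_patterns(visit_sequences, candidates, min_support=0.10)
print(frequent)
```

Running the same scoring on the purchaser and non-purchaser sequence sets separately would allow the kind of comparison the abstract describes.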

Estimation and assessment of natural drought index using principal component analysis (주성분 분석을 활용한 자연가뭄지수 산정 및 평가)

  • Kim, Seon-Ho;Lee, Moon-Hwan;Bae, Deg-Hyo
    • Journal of Korea Water Resources Association
    • /
    • v.49 no.6
    • /
    • pp.565-577
    • /
    • 2016
  • The objective of this study is to propose a method for computing the Natural Drought Index (NDI), which does not consider man-made drought facilities. Principal Component Analysis (PCA) was used to estimate the NDI. Three-month moving cumulative runoff, soil moisture, and precipitation from 1977 to 2012 were selected as the input data of the NDI. Observed precipitation data were collected from the KMA ASOS (Korea Meteorological Administration Automatic Synoptic Observation System), while model-driven runoff and soil moisture from the Variable Infiltration Capacity (VIC) model were used. Time series analysis, drought characteristic analysis, and spatial analysis were used to assess the usefulness of the NDI and to compare it with the existing SPI, SRI, and SSI. The NDI precisely reflected the onset and termination of past drought events, with a mean absolute error of 0.85 in the time series analysis. It explained duration and inter-arrival time well, with 1.3 and 1.0, respectively, in the drought characteristic analysis. The NDI also reflected regional drought conditions well in the spatial analysis. The accuracy rank for drought onset, termination, duration, and inter-arrival time was calculated for the NDI, SPI, SRI, and SSI. The results showed that the NDI is more precise than the others. The NDI overcomes the limitation of univariate drought indices and can be useful for drought analysis as a representative measure of different types of drought, such as meteorological, hydrological, and agricultural droughts.
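
One common way to build a composite index with PCA is to standardize the input series and project them onto the first principal component. The sketch below illustrates that idea in plain Python; treating the first component as the index is an assumption, since the abstract only states that PCA was used, and the input values are hypothetical.

```python
import math


def standardize(series):
    """Rescale a series to zero mean and unit (population) variance."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return [(x - mean) / std for x in series]


def first_principal_component(columns, iterations=200):
    """Leading eigenvector of the correlation matrix, by power iteration.

    Returns the loadings (unit length) and the score of each time step.
    """
    z = [standardize(col) for col in columns]
    n, k = len(z[0]), len(z)
    corr = [
        [sum(a * b for a, b in zip(z[i], z[j])) / n for j in range(k)]
        for i in range(k)
    ]
    loadings = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * loadings[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        loadings = [x / norm for x in w]
    scores = [sum(loadings[j] * z[j][t] for j in range(k)) for t in range(n)]
    return loadings, scores


# Hypothetical 3-month cumulative values for six time steps.
precipitation = [10.0, 50.0, 120.0, 200.0, 90.0, 30.0]
runoff = [5.0, 20.0, 60.0, 110.0, 50.0, 12.0]
soil_moisture = [0.20, 0.25, 0.33, 0.40, 0.31, 0.22]

loadings, scores = first_principal_component(
    [precipitation, runoff, soil_moisture]
)
print([round(x, 3) for x in loadings])
print([round(s, 2) for s in scores])
```

Low scores correspond to time steps that are dry in all three variables at once, which is the sense in which a multivariate index can summarize several drought types.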