• Title/Summary/Keyword: Generalization ability


Data abnormal detection using bidirectional long-short neural network combined with artificial experience

  • Yang, Kang;Jiang, Huachen;Ding, Youliang;Wang, Manya;Wan, Chunfeng
    • Smart Structures and Systems / v.29 no.1 / pp.117-127 / 2022
  • Data anomalies seriously threaten the reliability of bridge structural health monitoring systems and may trigger system misjudgment. To overcome this problem, an efficient and accurate data anomaly detection method is needed. Traditional anomaly detection methods extract various abnormal features as the key indicators of data anomalies and then set thresholds manually for each feature to identify specific anomalies; this is the artificial experience method. However, limited by its poor generalization ability among sensors, this method often leads to high labor costs. Another approach to anomaly detection is data-driven and based on machine learning. Among such methods, the bidirectional long short-term memory neural network (BiLSTM), an effective classification method, excels at finding complex relationships in multivariate time series data. However, training on unprocessed original signals often leads to low computational efficiency and poor convergence because appropriate feature selection is lacking. Therefore, this article combines the advantages of the two methods by proposing a deep learning method that takes statistical features derived from manual experience as input. Experimental comparative studies show that the BiLSTM model with appropriate feature input achieves an accuracy in the range of 87-94%. This paper also provides basic principles of data cleaning and discusses the typical features of various anomalies. Furthermore, optimization strategies for selecting the feature space based on artificial experience are highlighted.
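The idea of feeding hand-picked statistical features, rather than raw signals, to a classifier can be illustrated with a minimal sketch. The feature set used here (mean, standard deviation, range), the window length, and the function name `window_features` are assumptions chosen for illustration; the abstract does not state which features the authors selected, and the BiLSTM itself is not shown.

```python
import math

def window_features(signal, window):
    """Split a 1-D signal into non-overlapping windows and summarize each one.

    The three statistics below are illustrative assumptions, not the
    feature set used in the paper.
    """
    features = []
    for start in range(0, len(signal) - window + 1, window):
        chunk = signal[start:start + window]
        mean = sum(chunk) / window
        var = sum((x - mean) ** 2 for x in chunk) / window  # population variance
        features.append({
            "mean": mean,
            "std": math.sqrt(var),
            "range": max(chunk) - min(chunk),
        })
    return features

# Hypothetical sensor trace: a flat segment followed by a segment with a spike.
trace = [1.0, 1.0, 1.0, 1.0, 1.0, 9.0, 1.0, 1.0]
feats = window_features(trace, window=4)
```

Each window becomes one short feature vector, so a sequence model would see a much shorter and more informative input than the raw samples; the window containing the spike stands out through its standard deviation and range.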

Physics informed neural networks for surrogate modeling of accidental scenarios in nuclear power plants

  • Federico Antonello;Jacopo Buongiorno;Enrico Zio
    • Nuclear Engineering and Technology / v.55 no.9 / pp.3409-3416 / 2023
  • Licensing the next-generation of nuclear reactor designs requires extensive use of Modeling and Simulation (M&S) to investigate system response to many operational conditions, identify possible accidental scenarios and predict their evolution to undesirable consequences that are to be prevented or mitigated via the deployment of adequate safety barriers. Deep Learning (DL) and Artificial Intelligence (AI) can support M&S computationally by providing surrogates of the complex multi-physics high-fidelity models used for design. However, DL and AI are, generally, low-fidelity 'black-box' models that do not assure any structure based on physical laws and constraints, and may, thus, lack interpretability and accuracy of the results. This poses limitations on their credibility and doubts about their adoption for the safety assessment and licensing of novel reactor designs. In this regard, Physics Informed Neural Networks (PINNs) are receiving growing attention for their ability to integrate fundamental physics laws and domain knowledge in the neural networks, thus assuring credible generalization capabilities and credible predictions. This paper presents the use of PINNs as surrogate models for accidental scenarios simulation in Nuclear Power Plants (NPPs). A case study of a Loss of Heat Sink (LOHS) accidental scenario in a Nuclear Battery (NB), a unique class of transportable, plug-and-play microreactors, is considered. A PINN is developed and compared with a Deep Neural Network (DNN). The results show the advantages of PINNs in providing accurate solutions, avoiding overfitting, underfitting and intrinsically ensuring physics-consistent results.

Prediction of skewness and kurtosis of pressure coefficients on a low-rise building by deep learning

  • Youqin Huang;Guanheng Ou;Jiyang Fu;Huifan Wu
    • Wind and Structures / v.36 no.6 / pp.393-404 / 2023
  • Skewness and kurtosis are important higher-order statistics for simulating non-Gaussian wind pressure series on low-rise buildings, but their prediction is less studied than that of low-order statistics such as the mean and rms. The distribution gradients of skewness and kurtosis on roofs are evidently higher than those of the mean and rms, which makes them harder to predict. The conventional artificial neural networks (ANNs) used for predicting the mean and rms show unsatisfactory accuracy in predicting skewness and kurtosis owing to the limited capacity of their shallow learning. In this work, a deep neural network (DNN) model is introduced to predict the skewness and kurtosis on a low-rise building. To obtain the optimal generalization of the DNN model, the hyperparameters are determined automatically by Bayesian Optimization (BO). Moreover, to provide a benchmark for future studies on predicting higher-order statistics, the data sets for training and testing the DNN model are extracted from the internationally open NIST-UWO database, and the prediction errors of all taps are comprehensively quantified by various error metrics. The results show that the prediction accuracy in this study is clearly better than that in the literature: the correlation coefficient between the predicted and experimental results is 0.99 in this paper and 0.75 in the literature. In the untrained cornering wind direction, the distributions of skewness and kurtosis are well captured by the DNN on the whole building, including the roof corner with strong non-normality, and the correlation coefficients between the predicted and experimental results are 0.99 and 0.95 for skewness and kurtosis, respectively.
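For readers unfamiliar with the two target quantities, the sketch below computes skewness and kurtosis from a sample as standardized third and fourth central moments. It assumes the population (biased) definitions and non-excess kurtosis, for which a Gaussian series gives 3; the abstract does not say which convention the authors used, and the sample values are made up.

```python
def skewness_kurtosis(samples):
    """Return (skewness, kurtosis) using population central moments.

    Assumed conventions: biased moment estimates and non-excess kurtosis.
    """
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# Hypothetical samples: one symmetric series and one with a long right tail.
sym_skew, sym_kurt = skewness_kurtosis([-2.0, -1.0, 0.0, 1.0, 2.0])
tail_skew, tail_kurt = skewness_kurtosis([0.0, 0.0, 0.0, 4.0])
```

The symmetric series has zero skewness, while the series with a single large value has positive skewness, which is the kind of non-Gaussian behavior the abstract associates with roof corners.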

Edge Computing Model based on Federated Learning for COVID-19 Clinical Outcome Prediction in the 5G Era

  • Ruochen Huang;Zhiyuan Wei;Wei Feng;Yong Li;Changwei Zhang;Chen Qiu;Mingkai Chen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.4 / pp.826-842 / 2024
  • As 5G and AI continue to develop, there has been a significant surge in the healthcare industry. The COVID-19 pandemic has posed immense challenges to the global health system. This study proposes an edge computing model based on federated learning (FL) for predicting the clinical outcomes of COVID-19 patients during hospitalization. The model aims to address challenges posed by the pandemic, such as the need for sophisticated predictive models, privacy concerns, and the non-IID nature of COVID-19 data. The model uses the FATE framework, known for its privacy-preserving technologies, to enhance predictive precision while ensuring data privacy and effectively managing data heterogeneity. The model's ability to generalize across diverse datasets and its adaptability in real-world clinical settings are highlighted by the use of SHAP values, which streamline the training process by identifying influential features, thus reducing computational overhead without compromising predictive precision. The study demonstrates that the proposed model achieves precision comparable to that of specific machine learning models when dataset sizes are identical and surpasses traditional models when larger training data volumes are employed. The model's performance improves further when it is trained on datasets from diverse nodes, leading to superior generalization and overall performance, especially in scenarios with insufficient node features. The integration of FL with edge computing contributes significantly to the reliable prediction of COVID-19 patient outcomes with greater privacy. The research contributes to healthcare technology by providing a practical solution for early intervention and personalized treatment plans, leading to improved patient outcomes and efficient resource allocation during public health crises.

Deep learning-based AI constitutive modeling for sandstone and mudstone under cyclic loading conditions

  • Luyuan Wu;Meng Li;Jianwei Zhang;Zifa Wang;Xiaohui Yang;Hanliang Bian
    • Geomechanics and Engineering / v.37 no.1 / pp.49-64 / 2024
  • Repeated loading and unloading of rocks over an extended period, for example due to earthquakes, human excavation, and blasting, may result in the gradual accumulation of stress and deformation within the rock mass, which eventually reaches an unstable state. In this study, a CNN-CCM is proposed to address this mechanical behavior. The structure and hyperparameters of CNN-CCM are: Conv2D layers × 5; MaxPooling2D layers × 4; Dense layers × 4; learning rate = 0.001; epochs = 50; batch size = 64; dropout = 0.5. The training and validation data for deep learning include 71 rock samples and 122,152 data points. The AI Rock Constitutive Model learned by CNN-CCM can predict strain values (ε1) using mass (M), axial stress (σ1), density (ρ), cyclic number (N), confining pressure (σ3), and Young's modulus (E). The five evaluation indicators R2, MAPE, RMSE, MSE, and MAE yield values of 0.929, 16.44%, 0.954, 0.913, and 0.542, respectively, illustrating the good predictive performance and generalization ability of the model. Finally, interpreting the AI Rock Constitutive Model with the SHAP explanation method reveals that feature importance follows the order N > M > σ1 > E > ρ > σ3. Positive SHAP values indicate positive effects on predicting strain ε1 for N, M, σ1, and σ3, while negative SHAP values have negative effects. For E, a positive value has a negative effect on predicting strain ε1, consistent with the influence patterns of conventional physical rock constitutive equations. The present study offers a novel approach to investigating the mechanical constitutive model of rocks under cyclic loading and unloading conditions.
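The five evaluation indicators named in the abstract have standard definitions, sketched below in plain Python. The function name `regression_metrics` and the sample values are hypothetical; this is not the authors' evaluation code, and MAPE is only defined when no true value is zero.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute R2, MAPE (%), RMSE, MSE, and MAE with their standard definitions."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE assumes every true value is non-zero.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot
    return {"R2": r2, "MAPE": mape, "RMSE": math.sqrt(mse), "MSE": mse, "MAE": mae}

# Hypothetical strain values: three exact predictions and one off by 1.0.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

Since RMSE is the square root of MSE, the reported pair (0.954 and 0.913) can be cross-checked against each other: the square root of 0.913 is about 0.956.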

Research on damage detection and assessment of civil engineering structures based on DeepLabV3+ deep learning model

  • Chengyan Song
    • Structural Engineering and Mechanics / v.91 no.5 / pp.443-457 / 2024
  • At present, traditional concrete surface inspection methods based on manual visual inspection are costly and unsafe, while conventional computer vision methods rely on manually selected features, are sensitive to environmental changes, and are difficult to generalize. To solve these problems, this paper introduces deep learning technology from the field of computer vision to achieve automatic feature extraction of structural damage, with excellent detection speed and strong generalization ability. The main contents of this study are as follows: (1) A method based on the DeepLabV3+ convolutional neural network model is proposed for surface detection of post-earthquake structural damage, including surface damage such as concrete cracks, spalling, and exposed steel bars. The key semantic information is extracted by different backbone networks, and the data sets containing various types of surface damage are used for training, testing, and evaluation. The intersection ratios of 54.4%, 44.2%, and 89.9% on the test set demonstrate the network's capability to identify different types of structural surface damage in pixel-level segmentation, highlighting its effectiveness in varied testing scenarios. (2) A semantic segmentation model based on the DeepLabV3+ convolutional neural network is proposed for the detection and evaluation of post-earthquake structural components. Using a dataset that includes building structural components and their damage degrees for training, testing, and evaluation, semantic segmentation detection accuracies of 98.5% and 56.9% were recorded. To provide a comprehensive assessment that considers both false positives and false negatives, the Mean Intersection over Union (Mean IoU) was employed as the primary evaluation metric. This choice ensures that the network's performance in detecting and evaluating pixel-level damage in post-earthquake structural components is evaluated uniformly across all experiments.
By incorporating deep learning technology, this study not only offers an innovative solution for accurately identifying post-earthquake damage in civil engineering structures but also contributes significantly to empirical research in automated detection and evaluation within the field of structural health monitoring.
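The Mean IoU metric mentioned in the abstract can be sketched as follows for flattened label maps. The function name `mean_iou`, the class indices, and the toy labels are hypothetical, and skipping classes that appear in neither map is an assumption about how empty classes are handled.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class intersection over union for flattened label maps.

    Classes absent from both pred and target are skipped (an assumption).
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Hypothetical 6-pixel example with three classes, e.g. background, crack, spalling.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

Because the union counts both pixels wrongly assigned to a class and pixels of that class that were missed, IoU penalizes false positives and false negatives together, which matches the motivation given in the abstract.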

Simultaneous Optimization of KNN Ensemble Model for Bankruptcy Prediction (부도예측을 위한 KNN 앙상블 모형의 동시 최적화)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.139-157 / 2016
  • Bankruptcy involves considerable costs, so it can have significant effects on a country's economy. Thus, bankruptcy prediction is an important issue. Over the past several decades, many researchers have addressed topics associated with bankruptcy prediction. Early research on bankruptcy prediction employed conventional statistical methods such as univariate analysis, discriminant analysis, multiple regression, and logistic regression. Later on, many studies began utilizing artificial intelligence techniques such as inductive learning, neural networks, and case-based reasoning. Currently, ensemble models are being utilized to enhance the accuracy of bankruptcy prediction. Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models. Ensemble learning techniques are known to be very useful for improving the generalization ability of the classifier. Base classifiers in the ensemble must be as accurate and diverse as possible in order to enhance the generalization ability of an ensemble model. Commonly used methods for constructing ensemble classifiers include bagging, boosting, and random subspace. The random subspace method selects a random feature subset for each classifier from the original feature space to diversify the base classifiers of an ensemble. Each ensemble member is trained by a randomly chosen feature subspace from the original feature set, and predictions from each ensemble member are combined by an aggregation method. The k-nearest neighbors (KNN) classifier is robust with respect to variations in the dataset but is very sensitive to changes in the feature space. For this reason, KNN is a good classifier for the random subspace method. The KNN random subspace ensemble model has been shown to be very effective for improving an individual KNN model. 
The k parameter of the KNN base classifiers and the feature subsets selected for them play an important role in determining the performance of the KNN ensemble model. However, few studies have focused on optimizing the k parameter and feature subsets of base classifiers in the ensemble. This study proposed a new ensemble method that improves the performance of the KNN ensemble model by optimizing both the k parameters and the feature subsets of the base classifiers. A genetic algorithm was used to optimize the KNN ensemble model and improve its prediction accuracy. The proposed model was applied to a bankruptcy prediction problem using a real dataset from Korean companies. The research data included 1800 externally non-audited firms, of which 900 filed for bankruptcy and 900 did not. Initially, the dataset consisted of 134 financial ratios. Prior to the experiments, 75 financial ratios were selected based on an independent sample t-test of each financial ratio as an input variable and bankruptcy or non-bankruptcy as an output variable. Of these, 24 financial ratios were selected using a logistic regression backward feature selection method. The complete dataset was separated into two parts: training and validation. The training dataset was further divided into two portions: one for training the model and the other for avoiding overfitting. The prediction accuracy on the latter portion was used as the fitness value in order to avoid overfitting. The validation dataset was used to evaluate the effectiveness of the final model. A 10-fold cross-validation was implemented to compare the performance of the proposed model and other models. To evaluate the effectiveness of the proposed model, its classification accuracy was compared with that of other models. The Q-statistic values and average classification accuracies of the base classifiers were also investigated.
The experimental results showed that the proposed model outperformed other models, such as the single model and random subspace ensemble model.
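The random subspace KNN ensemble described in the abstract can be sketched as below. This is a baseline illustration only: it draws feature subsets at random and uses one fixed k for every member, whereas the proposed method optimizes both with a genetic algorithm, which is not shown. The function names, the toy data, and the parameter values are hypothetical.

```python
import random
from collections import Counter

def knn_predict(train_X, train_y, x, k, features):
    """Classify x by majority vote of its k nearest neighbors on the given features."""
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def random_subspace_knn(train_X, train_y, x, n_members, subspace_size, k, seed=0):
    """Combine KNN members, each trained on a random feature subset, by majority vote."""
    rng = random.Random(seed)
    n_features = len(train_X[0])
    member_votes = []
    for _ in range(n_members):
        features = rng.sample(range(n_features), subspace_size)
        member_votes.append(knn_predict(train_X, train_y, x, k, features))
    return Counter(member_votes).most_common(1)[0][0]

# Hypothetical data: three features per firm, label 1 = bankrupt, 0 = non-bankrupt.
train_X = [[0.0, 0.1, 0.2], [0.1, 0.0, 0.1], [0.2, 0.1, 0.0],
           [1.0, 0.9, 1.1], [0.9, 1.0, 1.0], [1.1, 1.0, 0.9]]
train_y = [0, 0, 0, 1, 1, 1]
high = random_subspace_knn(train_X, train_y, [0.95, 1.0, 1.0], n_members=5, subspace_size=2, k=3)
low = random_subspace_knn(train_X, train_y, [0.05, 0.05, 0.1], n_members=5, subspace_size=2, k=3)
```

In this sketch the only source of diversity among members is the feature subset, which is why KNN's sensitivity to the feature space makes it a suitable base classifier for the method.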

The Recognition Characteristics of Science Gifted Students on the Earth System based on their Thinking Style (과학 영재 학생들의 사고양식에 따른 지구시스템에 대한 인지 특성)

  • Lee, Hyonyong;Kim, Seung-Hwan
    • Journal of Science Education / v.33 no.1 / pp.12-30 / 2009
  • The purpose of this study was to analyze how science-gifted students recognize the earth system according to their thinking styles. The subjects were 24 science-gifted students at the Science Institute for Gifted Students of a university located in a metropolitan city in Korea. The students' thinking styles were first examined on the basis of Sternberg's theory of mental self-government. The students were then divided into two groups based on that theory: a Type I group (legislative, judicial, global, liberal) and a Type II group (executive, local, conservative). Data were collected from three different types of questionnaires (types A, B, and C), interviews, the word association method, drawing analyses, concept maps, a hidden dimension inventory, and in-depth interviews. The analysis indicated that the students' thinking styles were characterized by the 'Legislative', 'Executive', 'Anarchic', 'Global', 'External', and 'Liberal' styles. They preferred conducting new projects and using creative problem-solving processes. The results on the students' recognition characteristics of the earth system were as follows. First, although the two groups' quantitative values for 'System Understanding' were very similar, there were considerable differences in the details. Second, 'Understanding the Relationship in the System' was closely connected to thinking styles: the Type I group had an advantage in taking multiple, dynamic, and recursive approaches. Third, in relation to 'System Generalization', both groups had a similar ability to interpret the system in simple terms, but the Type I group was better at generalization when the 'hidden dimension inventory' factor was added. On the system prediction factor, however, the students' ability was weak regardless of type. Consequently, more specific development strategies on various objects are needed for the development and application of the system learning program.
Furthermore, it is expected that this study can be used practically and effectively in various fields related to system recognition.

An Epistemological Inquiry on the Development of Statistical Concepts (통계적 개념 발달에 관한 인식론적 고찰)

  • Lee, Young-Ha;Nam, Joo-Hyun
    • The Mathematical Education / v.44 no.3 s.110 / pp.457-475 / 2005
  • We inquired into what secondary school statistics classes have been aiming at, that is, their epistemological objects. We now recognize that the main obstacle to systematic articulation is the lack of understanding of what the statistical concepts are. This study focuses on the ingredients of statistical concepts, which are to be the ground for the systematic articulation of statistics courses, especially those for school children. We therefore required that those ingredients satisfy the following: i) they are directly related to the contents of statistics; ii) they develop psychologically; iii) they are as mutually exclusive as possible; iv) they are exhaustive enough to cover all statistical concepts. We examined what statisticians have been doing and how, along with various previous views on these matters. We suggest that the following three concepts are the core of the conceptual development of statistics: the concept of distributions, the summarizing ability, and the concept of samples. By the concept of distributions we mean the frequency view of each random category, which develops with age from counts to probability. Summarizing ability is another important resource for embedding one's inquiry in the data set. A summary is to be viewed not only as a number but also as something that reflects a random phenomenon. Inductive generalization is one of the most hazardous things. Statistical induction is a scientific way of meeting this challenge, and it starts from distinguishing chance from inevitable consequences. One's inductive logic grows along with one's deductive arguments; nevertheless, they are different. The concept of samples reflects one's view of the sample data and the way one combines one's logic with the data within one's hypothesis. With these three in mind, we examined the Korean statistics curriculum from K to 12. Distributional concepts are dealt with throughout but are not sequenced well.
The way of summarization is introduced in the 1st, 5th, 7th, and 10th grades as a numerical value only. One activity on the concept of samples is given in the 6th grade. The curriculum then jumps to statistical reasoning in the elective courses 'Mathematics I' or 'Probability and Statistics' in grades 11-12. We suggest further studies on the developmental stages of these three conceptual features so as to obtain a firm basis for successive statistical articulation.

The Effect of High School Research Project using the Science Writing Heuristic (탐구적 과학 글쓰기(SWH)를 적용한 고등학교 과제연구의 효과)

  • Moon, Saetbyeol;Choi, Wonho
    • Journal of the Korean Chemical Society / v.62 no.5 / pp.398-411 / 2018
  • The purpose of this study is to investigate the effects of research project activities using the science writing heuristic on the science inquiry abilities and attitudes toward science of high school students. For this purpose, we conducted research project activities using the science writing heuristic, consisting of questioning, experimental design, observation, argument and evidence, reading, and reflection steps, with 73 second-year students in the science core course of a high school in Jeonnam. To analyze the effects of the program, we surveyed scientific inquiry ability and attitude toward science, investigated perceptions of the research project class applying the science writing heuristic, and conducted interviews when the results were difficult to interpret. The results of this study are as follows. First, among the science inquiry abilities, the scores for Reasoning, Hypothesis setting, Finding variables, Operational definition, Experimental design, Graphing and data interpretation, and Generalization improved with statistical significance (p<.05), but the change in the score for Expectation was not statistically significant (p>.05). Second, among the attitudes toward science, the scores for 'Leisure interest in science', 'Enjoyment of science lessons', and 'Career interest in science' improved with statistical significance (p<.05). The score for 'Attitude to scientific inquiry' decreased, but the decrease was not statistically significant. The high school research project applying the science writing heuristic had a positive effect on scientific inquiry ability and scientific attitude, but it could be a burden to students because it is student-led, runs for a long time, and takes a form different from general science classes. Continued study is therefore needed on research projects that minimize these disadvantages and maximize their merits.