• Title/Summary/Keyword: hybrid machine learning


Development and evaluation of watershed hybrid model using machine learning (머신러닝을 활용한 유역단위 하이브리드모델 개발 및 평가)

  • Sang Joon Bak;Gwan Jae Lee;Seo Ro Lee;Yeon Ji Jeong;Dong Hyuk Kum;Ji Chul Ryu;Woon JI Park;Kyoung Jae Lim
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2023.05a
    • /
    • pp.212-212
    • /
    • 2023
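  • In long-term watershed management planning, such as nonpoint source pollution management, assessing the pollution sources within a watershed is essential. This assessment requires a quantitative survey of where, and how much, nonpoint source pollution is generated by rainfall runoff. Such surveys demand considerable cost and time, so watershed-scale hydraulic and hydrologic models are used to reduce both. Various watershed-scale models are in use, including HSPF (Hydrological Simulation Program in Fortran), SWAT (Soil and Water Assessment Tool), and L-THIA ACN-WQ (The Long-term Hydrologic Impact Assessment Model with Asymptotic Curve Number Regression Equation and Water Quality model). However, simulation with a watershed model runs through many computational steps and therefore takes a long time. For this reason, many studies have applied data-driven modeling techniques (machine learning/deep learning) to runoff and water quality prediction. Modeling based purely on machine learning/deep learning can predict only at a single point (the final outlet), which makes it difficult to evaluate watershed management measures such as the application of best management practices. In this study, we therefore developed a hybrid model that replaces some hydrologic processes with machine learning/deep learning and links them to subwatershed-level channel routing so that flow and water quality variables can be simulated. Because machine learning/deep learning replaces part of the watershed model's computation, simulation is fast, and the model is expected to be usable for evaluating watershed management measures and best management practices, which was difficult with existing machine learning/deep learning prediction models.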


Meta-heuristic optimization algorithms for prediction of fly-rock in the blasting operation of open-pit mines

  • Mahmoodzadeh, Arsalan;Nejati, Hamid Reza;Mohammadi, Mokhtar;Ibrahim, Hawkar Hashim;Rashidi, Shima;Mohammed, Adil Hussein
    • Geomechanics and Engineering
    • /
    • v.30 no.6
    • /
    • pp.489-502
    • /
    • 2022
  • In this study, a Gaussian process regression (GPR) model and six GPR-based metaheuristic optimization models (GPR-PSO, GPR-GWO, GPR-MVO, GPR-MFO, GPR-SCA, and GPR-SSO) were developed to predict fly-rock distance in the blasting operations of open-pit mines. A total of 300 datasets obtained from the Soungun copper mine in Iran were used, each with six input parameters and one output parameter (fly-rock). Several statistical evaluation indices were used to assess the prediction outcomes. Ranked from best to worst, the models were GPR-PSO, GPR-GWO, GPR-MVO, GPR-MFO, GPR-SCA, GPR-SSO, and GPR, with ranking scores of 66, 60, 54, 46, 43, 38, and 30 (for the 5-fold method), respectively. The GPR-PSO model thus produced the most accurate results, and it is suggested for forecasting fly-rock. In addition, the mutual information test (MIT) was used to investigate the influence of each input parameter on fly-rock, and the stemming (T) parameter was found to be the most influential.

Dam Inflow Prediction and Evaluation Using Hybrid Auto-sklearn Ensemble Model (하이브리드 Auto-sklearn 앙상블 모델을 이용한 댐 유입량 예측 및 평가)

  • Lee, Seoro;Bae, Joo Hyun;Lee, Gwanjae;Yang, Dongseok;Hong, Jiyeong;Kim, Jonggun;Lim, Kyoung Jae
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2022.05a
    • /
    • pp.307-307
    • /
    • 2022
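  • Recently, the variability of dam inflow has increased due to causes such as climate change and land use change upstream of dams, making decisions on dam management and operation more difficult. A method is therefore needed that reflects this variability and predicts dam inflow accurately and efficiently. As machine learning technology has advanced, Auto-ML (Automated Machine Learning) has come into use in many fields. Auto-ML automates the entire modeling process, including data preprocessing, optimal algorithm selection, hyperparameter tuning, and model training and evaluation. In hydrology, however, there are still few cases in which Auto-ML has been used to develop dam inflow prediction models. In particular, no study has developed and evaluated an Auto-ML-based ensemble model that uses a hybrid combination approach accounting for the variability of high inflow and low inflow in order to secure prediction accuracy. In this study, we used Auto-sklearn, one of the Auto-ML packages, to develop a hybrid ensemble dam inflow prediction model that reflects the inflow variability of the flood and non-flood seasons. When applied to Soyanggang Dam, the hybrid Auto-sklearn ensemble model achieved R2 0.868, RMSE 66.23 m3/s, and MAE 16.45 m3/s, performing better overall than an ensemble model built with a single Auto-sklearn run. In the low-flow and drought-flow segments of the FDC (Flow Duration Curve) in particular, the inflow predictions of the two models differed markedly, and the predictions of the hybrid Auto-sklearn model were closer to the observed values. This is attributed to the hyperparameters of each model being optimized as the ensemble models for the flood and non-flood seasons were built independently. The methodology of this study is expected to be useful not only for generating more accurate dam inflow predictions but also for building ensemble models from imbalanced datasets in other fields.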
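
The regime-split idea in this abstract (independent models for flood and non-flood seasons, scored with R2, RMSE, and MAE) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the abstract does not define the flood season, so `FLOOD_MONTHS` is hypothetical; a one-variable least-squares line stands in for each Auto-sklearn ensemble; and the toy records are invented, not data from the study.

```python
import math

def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def r2(obs, pred):
    """Coefficient of determination."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

# Hypothetical flood-season months; the abstract does not state the split.
FLOOD_MONTHS = {6, 7, 8, 9}

def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (a stand-in for one ensemble model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx

def fit_hybrid(records):
    """records: (month, rainfall, inflow) tuples. Fits one model per regime."""
    models = {}
    for is_flood in (True, False):
        subset = [(x, y) for m, x, y in records if (m in FLOOD_MONTHS) == is_flood]
        models[is_flood] = fit_line([x for x, _ in subset], [y for _, y in subset])
    return models

def predict_hybrid(models, month, rainfall):
    """Route the input to the model that matches its season."""
    a, b = models[month in FLOOD_MONTHS]
    return a * rainfall + b

# Invented toy data: flood-season inflow responds much more steeply to rainfall.
records = [
    (7, 10.0, 52.0), (7, 20.0, 101.0), (8, 30.0, 149.0), (8, 40.0, 203.0),
    (1, 1.0, 3.1), (2, 2.0, 3.9), (11, 3.0, 5.2), (12, 4.0, 5.8),
]
models = fit_hybrid(records)
obs = [y for _, _, y in records]
pred = [predict_hybrid(models, m, x) for m, x, _ in records]
print(f"R2={r2(obs, pred):.3f} RMSE={rmse(obs, pred):.2f} MAE={mae(obs, pred):.2f}")
```

Fitting each regime separately lets the low-flow model keep its own slope instead of being dominated by the large flood-season values, which is the intuition behind the improved low-flow agreement the abstract reports.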


A vibration-based approach for detecting arch dam damage using RBF neural networks and Jaya algorithms

  • Ali Zar;Zahoor Hussain;Muhammad Akbar;Bassam A. Tayeh;Zhibin Lin
    • Smart Structures and Systems
    • /
    • v.32 no.5
    • /
    • pp.319-338
    • /
    • 2023
  • The study presents a new hybrid data-driven method that combines radial basis function neural networks (RBF-NN) with the Jaya algorithm (JA) to provide effective structural health monitoring of arch dams. The novelty of this approach is that it requires only one user-defined parameter, which can make it more effective and efficient than other machine learning techniques that often require time-consuming processing of a large number of training and testing model parameters and hyper-parameters. The approach seeks rapid damage detection in arch dams under dynamic conditions, to prevent potential disasters, by using the RBF-NN to integrate the dynamic elastic modulus (DEM) and modal parameters (such as natural frequency and mode shape) as damage indicators. To determine the dynamic characteristics of the arch dam, the JA sequentially optimizes an objective function based on vibration data sets. Two case studies of hyperbolic concrete arch dams were designed using finite element simulation to demonstrate the effectiveness of the RBF-NN model in conjunction with the Jaya algorithm. The testing results demonstrated that the proposed method could deliver significant savings in computational time while effectively detecting damage in arch dam structures with complex nonlinearities. Furthermore, even when the training data were contaminated with a high level of noise, the combined RBF-NN and JA method remained robust and highly accurate.

Bankruptcy Type Prediction Using A Hybrid Artificial Neural Networks Model (하이브리드 인공신경망 모형을 이용한 부도 유형 예측)

  • Jo, Nam-ok;Kim, Hyun-jung;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.3
    • /
    • pp.79-99
    • /
    • 2015
  • The prediction of bankruptcy has been extensively studied in the accounting and finance fields. It can have an important impact on lending decisions and the profitability of financial institutions in terms of risk management. Many researchers have focused on constructing more robust bankruptcy prediction models. Early studies primarily used statistical techniques such as multiple discriminant analysis (MDA) and logit analysis for bankruptcy prediction. However, many studies have demonstrated that artificial intelligence (AI) approaches, such as artificial neural networks (ANN), decision trees, case-based reasoning (CBR), and support vector machines (SVM), have outperformed statistical techniques since the 1990s for business classification problems because statistical methods have some rigid assumptions in their application. In previous studies on corporate bankruptcy, many researchers have focused on developing a bankruptcy prediction model using financial ratios. However, few studies address the specific types of bankruptcy. Previous bankruptcy prediction models have generally been concerned with predicting whether or not firms will become bankrupt, and most studies on bankruptcy types have focused on reviewing the previous literature or performing a case study. This study therefore develops a model using data mining techniques for predicting the specific types of bankruptcy, as well as the occurrence of bankruptcy, in Korean small- and medium-sized construction firms in terms of profitability, stability, and activity indices, so that firms can take steps to prevent bankruptcy in advance. We propose a hybrid approach using two artificial neural networks (ANNs) for the prediction of bankruptcy types. The first is a back-propagation neural network (BPN) model using supervised learning for bankruptcy prediction, and the second is a self-organizing map (SOM) model using unsupervised learning to classify bankruptcy data into several types. Based on the constructed model, we predict the bankruptcy of companies by applying the BPN model to a validation set that was not used in the development of the model. This allows the specific types of bankruptcy to be identified from the bankruptcy data predicted by the BPN model. To interpret the characteristics of the clusters derived by the SOM model, we calculated the average of the input variables selected through statistical tests for each cluster. Each cluster represents a bankruptcy type classified from the data of bankrupt firms, and the input variables are the financial ratios used to interpret the meaning of each cluster. The experimental results show that each of the five bankruptcy types has different characteristics according to its financial ratios. Type 1 (severe bankruptcy) has inferior financial statements except for EBITDA (earnings before interest, taxes, depreciation, and amortization) to sales, based on the clustering results. Type 2 (lack of stability) has a low quick ratio, low stockholder's equity to total assets, and high total borrowings to total assets. Type 3 (lack of activity) has a slightly low total asset turnover and fixed asset turnover. Type 4 (lack of profitability) has low retained earnings to total assets and EBITDA to sales, which are the indices of profitability. Type 5 (recoverable bankruptcy) includes firms that have a relatively good financial condition compared to other bankruptcy types even though they are bankrupt. Based on these findings, researchers and practitioners engaged in the credit evaluation field can obtain more useful information about the types of corporate bankruptcy. In this paper, we used the financial ratios of firms to classify bankruptcy types. It is important to select input variables that correctly predict bankruptcy and meaningfully classify the type of bankruptcy. In a further study, we will include non-financial factors such as the size, industry, and age of the firms, so that we can obtain realistic clustering results for bankruptcy types by combining qualitative factors and reflecting the domain knowledge of experts.

White striping degree assessment using computer vision system and consumer acceptance test

  • Kato, Talita;Mastelini, Saulo Martiello;Campos, Gabriel Fillipe Centini;Barbon, Ana Paula Ayub da Costa;Prudencio, Sandra Helena;Shimokomaki, Massami;Soares, Adriana Lourenco;Barbon, Sylvio Jr.
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.32 no.7
    • /
    • pp.1015-1026
    • /
    • 2019
  • Objective: The objective of this study was to evaluate three different degrees of white striping (WS) addressing their automatic assessment and customer acceptance. The WS classification was performed based on a computer vision system (CVS), exploring different machine learning (ML) algorithms and the most important image features. Moreover, it was verified by consumer acceptance and purchase intent. Methods: The samples for image analysis were classified by trained specialists, according to severity degrees regarding visual and firmness aspects. Samples were obtained with a digital camera, and 25 features were extracted from these images. ML algorithms were applied aiming to induce a model capable of classifying the samples into three severity degrees. In addition, two sensory analyses were performed: 75 samples properly grilled were used for the first sensory test, and 9 photos for the second. All tests were performed using a 10-cm hybrid hedonic scale (acceptance test) and a 5-point scale (purchase intention). Results: The information gain metric ranked 13 attributes. However, just one type of image feature was not enough to describe the phenomenon. The classification models support vector machine, fuzzy-W, and random forest showed the best results with similar general accuracy (86.4%). The worst performance was obtained by multilayer perceptron (70.9%) with the high error rate in normal (NORM) sample predictions. The sensory analysis of acceptance verified that WS myopathy negatively affects the texture of the broiler breast fillets when grilled and the appearance attribute of the raw samples, which influenced the purchase intention scores of raw samples. Conclusion: The proposed system has proved to be adequate (fast and accurate) for the classification of WS samples. 
The sensory analysis of acceptance showed that WS myopathy negatively affects the tenderness of the broiler breast fillets when grilled, while the appearance attribute of the raw samples eventually influenced purchase intentions.

Machine Learning Based Automated Source, Sink Categorization for Hybrid Approach of Privacy Leak Detection (머신러닝 기반의 자동화된 소스 싱크 분류 및 하이브리드 분석을 통한 개인정보 유출 탐지 방법)

  • Shim, Hyunseok;Jung, Souhwan
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.30 no.4
    • /
    • pp.657-667
    • /
    • 2020
  • The Android framework allows apps to take full advantage of personal information by granting a single permission, and it does not determine whether the data being leaked is actual personal information. To solve these problems, we propose a tool that combines static and dynamic analysis. The tool analyzes the Sources and Sinks used by the target app to inform users about which personal information the app uses. To achieve this, we extract the Sources and Sinks through the Control Flow Graph and conclude that the app leaks the user's private data when there is a Source-to-Sink flow. We also use the sensitive permission information provided by Google to obtain information from the sensitive APIs corresponding to Sources and Sinks. Finally, our dynamic analysis tool runs the app and hooks information from each sensitive API. From the hooked data, we determine whether the user's personal information is leaked through the app and deliver this information to the user. In this process, an automated Source/Sink classification model was applied to collect the latest Source/Sink information, and we categorized the latest release version of Android (9.0) with 88.5% accuracy. We evaluated our tool on 2,802 APKs and found 850 APKs that leak personal information.
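
The Source-to-Sink check described in this abstract amounts to a reachability query on a flow graph. The sketch below is a simplified illustration rather than the authors' tool: the graph is a plain adjacency dictionary instead of a real Control Flow Graph, and all node names (`read_device_id`, `send_over_network`, and so on) are hypothetical.

```python
from collections import deque

def find_leak_paths(flow_graph, sources, sinks):
    """Return one shortest Source-to-Sink path per source that reaches a sink.

    flow_graph maps a node name to the list of nodes its data can flow to.
    """
    leaks = []
    for src in sources:
        queue = deque([[src]])  # breadth-first search over partial paths
        seen = {src}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                leaks.append(path)
                break  # one witness path per source is enough here
            for nxt in flow_graph.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
    return leaks

# Hypothetical flow graph: one source reaches a network sink, the other does not.
flow_graph = {
    "read_device_id": ["encode"],
    "encode": ["send_over_network"],
    "read_location": ["cache_locally"],
}
sources = ["read_device_id", "read_location"]
sinks = {"send_over_network"}

leaks = find_leak_paths(flow_graph, sources, sinks)
for path in leaks:
    print(" -> ".join(path))
```

A static pass of this kind only shows that a flow is possible; the abstract pairs it with dynamic hooking of the sensitive APIs to observe what data actually moves at run time.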

Computational estimation of the earthquake response for fibre reinforced concrete rectangular columns

  • Liu, Chanjuan;Wu, Xinling;Wakil, Karzan;Jermsittiparsert, Kittisak;Ho, Lanh Si;Alabduljabbar, Hisham;Alaskar, Abdulaziz;Alrshoudi, Fahed;Alyousef, Rayed;Mohamed, Abdeliazim Mustafa
    • Steel and Composite Structures
    • /
    • v.34 no.5
    • /
    • pp.743-767
    • /
    • 2020
  • Due to its impressive flexural performance, enhanced compressive strength, and more constrained crack propagation, fibre-reinforced concrete (FRC) has been widely employed in construction. The majority of experimental studies have focused on the seismic behavior of FRC columns. Based on valid experimental data obtained from previous studies, the current study evaluates the seismic response and compressive strength of FRC rectangular columns using hybrid metaheuristic techniques. Because of the non-linearity of seismic data, an adaptive neuro-fuzzy inference system (ANFIS) was combined with metaheuristic algorithms. A total of 317 datasets from FRC column tests were compiled into one database in order to determine the most influential factor on the ultimate strengths of FRC rectangular columns subjected to simulated seismic loading. ANFIS was used together with Particle Swarm Optimization (PSO) and the Genetic Algorithm (GA). To analyze the results, the extreme learning machine (ELM) was used concurrently as an established prediction method. The variable selection procedure chooses the most dominant parameters affecting the ultimate strengths of FRC rectangular columns subjected to simulated seismic loading. The results show that ANFIS-PSO predicted the seismic lateral load with R2 = 0.857 and 0.902 for the test and train phases, respectively, and it was therefore nominated as the lateral load prediction estimator. For compressive strength, on the other hand, ELM gave predictions with R2 = 0.657 and 0.862 for the test and train phases, respectively. The results show that the seismic lateral force trend is more predictable than the compressive strength of FRC rectangular columns, with the best results obtained for lateral force prediction. Compressive strength prediction showed a significant deviation above 40 MPa, which could be related to the considerable non-linearity and possible empirical shortcomings. Finally, employing the ANFIS-GA and ANFIS-PSO techniques to evaluate the seismic response of FRC is a promising and reliable alternative to costly and time-consuming experimental tests.

A Hybrid Forecasting Framework based on Case-based Reasoning and Artificial Neural Network (사례기반 추론기법과 인공신경망을 이용한 서비스 수요예측 프레임워크)

  • Hwang, Yousub
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.4
    • /
    • pp.43-57
    • /
    • 2012
  • To enhance competitive advantage in a constantly changing business environment, enterprise management must make the right decisions in many business activities based on both internal and external information. Providing accurate information therefore plays a prominent role in management's decision making. Intuitively, historical data can provide a feasible estimate through forecasting models. If the service department can estimate the service quantity for the next period, it can effectively control the inventory of service-related resources such as personnel, parts, and other facilities. In addition, the production department can make a load map for improving its product quality. Obtaining an accurate service forecast is therefore likely to be critical to manufacturing companies. Numerous investigations addressing this problem have generally employed statistical methods, such as regression or autoregressive and moving average simulation. However, these methods are only efficient for data that are seasonal or cyclical. If the data are influenced by the special characteristics of the product, these methods are not feasible. In our research, we propose a forecasting framework that predicts the service demand of a manufacturing organization by combining case-based reasoning (CBR) with an unsupervised artificial neural network-based clustering analysis (i.e., Self-Organizing Maps; SOM). We believe that this is one of the first attempts at applying unsupervised artificial neural network-based machine-learning techniques in the service forecasting domain. Our proposed approach has several appealing features: (1) We applied CBR and SOM in a new forecasting domain, service demand forecasting. (2) We proposed a combined approach between CBR and SOM in order to overcome the limitations of traditional statistical forecasting methods, and we developed a service forecasting tool based on the proposed approach using an unsupervised artificial neural network and case-based reasoning. In this research, we conducted an empirical study on a real digital TV manufacturer (i.e., Company A) and evaluated the proposed approach and tool using real sales and service-related data from that manufacturer. In our empirical experiments, we compare the performance of our proposed service forecasting framework with that of two other service forecasting methods: one is a traditional CBR-based forecasting model, and the other is the existing service forecasting model used by Company A. We ran each service forecasting method 144 times; each time, input data were randomly sampled for each service forecasting framework. To evaluate the accuracy of the forecasting results, we used Mean Absolute Percentage Error (MAPE) as the primary performance measure in our experiments. We conducted a one-way ANOVA test with the 144 measurements of MAPE for the three service forecasting approaches. The F-ratio of MAPE for the three approaches is 67.25 and the p-value is 0.000, which means that the difference between the MAPE of the three approaches is statistically significant. Since there is a significant difference among the approaches, we conducted Tukey's HSD post hoc test to determine exactly which MAPE means differ significantly from which others. In terms of MAPE, Tukey's HSD post hoc test grouped the three service forecasting approaches into three different subsets in the following order: our proposed approach > traditional CBR-based service forecasting approach > the existing forecasting approach used by Company A. Consequently, our empirical experiments show that our proposed approach outperformed the traditional CBR-based forecasting model and the existing service forecasting model used by Company A. The rest of this paper is organized as follows. Section 2 provides research background, including a summary of CBR and SOM. Section 3 presents a hybrid service forecasting framework based on case-based reasoning and Self-Organizing Maps, while the empirical evaluation results are summarized in Section 4. Conclusions and future research directions are discussed in Section 5.
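
The evaluation described in this abstract rests on two calculations: MAPE per run and a one-way ANOVA F-ratio across the three approaches. The sketch below shows both in plain Python under stated assumptions: the numbers are invented toy values, not the study's data, and the Tukey's HSD post hoc step is not implemented here.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent. Assumes no actual value is zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def one_way_anova_f(groups):
    """F-ratio of a one-way ANOVA: between-group over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Invented MAPE samples (percent) for three hypothetical forecasting approaches.
mape_runs = [
    [1.0, 2.0, 3.0],  # approach 1
    [2.0, 3.0, 4.0],  # approach 2
    [5.0, 6.0, 7.0],  # approach 3
]
f_ratio = one_way_anova_f(mape_runs)
print(f"MAPE example: {mape([100.0, 200.0], [110.0, 180.0]):.1f}%")
print(f"F-ratio: {f_ratio:.2f}")
```

A large F-ratio says only that at least one approach differs in mean MAPE; identifying which pairs differ is what the post hoc test in the abstract is for.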

A Study on Improvement of Collaborative Filtering Based on Implicit User Feedback Using RFM Multidimensional Analysis (RFM 다차원 분석 기법을 활용한 암시적 사용자 피드백 기반 협업 필터링 개선 연구)

  • Lee, Jae-Seong;Kim, Jaeyoung;Kang, Byeongwook
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.139-161
    • /
    • 2019
  • Use of the e-commerce market has become a common part of everyday life. It has become important for customers to know where and how to make reasonable purchases of good-quality products, but this change in purchasing psychology tends to make purchasing decisions difficult amid vast amounts of information. In this situation, a recommendation system can reduce the cost of information retrieval and improve satisfaction by analyzing the customer's purchasing behavior. Amazon and Netflix are well-known examples of sales marketing using recommendation systems. In the case of Amazon, 60% of recommendations led to purchases, and a 35% increase in sales was achieved. Netflix, on the other hand, found that 75% of movie recommendations were made using services. This personalization technique is considered one of the key strategies for one-to-one marketing and can be useful in online markets where salespeople do not exist. Recommendation techniques mainly used in recommendation systems today include collaborative filtering and content-based filtering. Hybrid techniques that combine them, as well as association rules, are also used in various fields. Of these, collaborative filtering is the most popular today. Collaborative filtering recommends products preferred by neighbors who have similar preferences or purchasing behavior, based on the assumption that users who have shown similar tendencies in purchasing or evaluating products in the past will show similar tendencies toward other products. However, most existing systems recommend only within the same product category, such as books or movies. This is because the recommendation system estimates satisfaction with a new item that has never been bought from the customer's purchase ratings of similar items in the transaction data. In addition, the reliability of the purchase ratings used in recommendation systems is a serious problem. In particular, a 'compensated review' refers to the intentional manipulation of a customer purchase rating through company intervention. Amazon has been hard-pressed by these compensated reviews since 2016 and has worked to reduce false information and increase credibility. A survey showed that the average rating for products with compensated reviews was higher than for those without. Compensated reviews were also about 12 times less likely to give the lowest rating and about 4 times less likely to leave a critical opinion. Customer purchase ratings are thus full of noise. This problem directly affects the performance of recommendation systems that aim to maximize profits by attracting highly satisfied customers in e-commerce transactions. In this study, we explore the possibility of using new indicators, derived with the RFM multi-dimensional analysis technique, that can objectively substitute for existing customer purchase ratings and so address these problems. RFM multi-dimensional analysis is the most widely used analytical method in customer relationship management (CRM) marketing and is a data analysis method for selecting customers who are likely to purchase goods. When the index was verified against actual purchase history data, the accuracy was about 55%. Because this result comes from recommending among a total of 4,386 different product types that had never been bought before, it indicates relatively high accuracy and practical value. This study also suggests the possibility of a general recommendation system that can be applied to various offline product data. If additional data is acquired in the future, the accuracy of the proposed recommendation system can be improved.
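
RFM analysis summarizes each customer by recency, frequency, and monetary value of purchases. The abstract does not say how these values are turned into a rating substitute, so the sketch below covers only the common first step of building the RFM table. The transaction layout, customer IDs, dates, and amounts are all invented for illustration.

```python
from datetime import date

def rfm_table(transactions, today):
    """Build per-customer recency (days since last purchase), frequency, and monetary totals.

    transactions: iterable of (customer_id, purchase_date, amount) tuples.
    """
    table = {}
    for customer, day, amount in transactions:
        row = table.setdefault(
            customer, {"recency": None, "frequency": 0, "monetary": 0.0}
        )
        days_ago = (today - day).days
        if row["recency"] is None or days_ago < row["recency"]:
            row["recency"] = days_ago  # keep the most recent purchase
        row["frequency"] += 1
        row["monetary"] += amount
    return table

# Invented purchase history.
transactions = [
    ("a", date(2024, 1, 1), 10.0),
    ("a", date(2024, 1, 26), 20.0),
    ("b", date(2024, 1, 11), 5.0),
]
table = rfm_table(transactions, today=date(2024, 1, 31))
for customer, row in sorted(table.items()):
    print(customer, row)
```

Because these three values come from observed transactions rather than self-reported ratings, they are not affected by manipulated reviews, which is the motivation the abstract gives for using them.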