• Title/Summary/Keyword: hybrid predictive model

Search Result 66

Optimizing Clustering and Predictive Modelling for 3-D Road Network Analysis Using Explainable AI

  • Rotsnarani Sethy;Soumya Ranjan Mahanta;Mrutyunjaya Panda
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.9
    • /
    • pp.30-40
    • /
    • 2024
  • Building an accurate 3-D spatial road network model has become an active area of research, positioned as a new paradigm for developing smart roads and intelligent transportation systems (ITS). Such models can help public and private road operators improve road mobility and eco-routing, leading to better traffic flow, lower carbon emissions, and improved road safety. Dealing with large-scale 3-D road network data makes it challenging to obtain the accurate elevation information needed to better estimate CO2 emissions and to route vehicles accurately in an Internet of Vehicles (IoV) scenario. Clustering and regression techniques are found suitable for recovering the missing elevation information for some points in a 3-D spatial road network dataset, which is expected to give the public a better eco-routing experience. Further, Explainable Artificial Intelligence (xAI) has recently drawn researchers' attention because it makes models more interpretable, transparent, and comprehensible, thus enabling the design of efficient models chosen according to users' requirements. The 3-D road network dataset, comprising spatial attributes (longitude, latitude, altitude) of North Jutland, Denmark, collected from the publicly available UCI repository, is preprocessed through feature engineering and scaling to ensure optimal accuracy for clustering and regression tasks. K-Means clustering and regression using a Support Vector Machine (SVM) with a radial basis function (RBF) kernel are employed for 3-D road network analysis. Silhouette scores and the number of clusters are used to measure cluster quality, whereas error metrics such as MAE (Mean Absolute Error) and RMSE (Root Mean Square Error) are used to evaluate the regression method. To improve the interpretability of the clustering and regression models, SHAP (Shapley Additive Explanations), a powerful xAI technique, is employed in this research. 
From extensive experiments, it is observed that SHAP analysis validated the importance of latitude and altitude in predicting longitude, particularly in the four-cluster setup, providing critical insights into model behavior and feature contributions, with an accuracy of 97.22% and strong performance metrics across all classes (MAE of 0.0346 and MSE of 0.0018). On the other hand, the ten-cluster setup, while faster in SHAP analysis, presented challenges in interpretability due to increased clustering complexity. Hence, the hybrid model combining K-Means clustering with K=4 and SVM demonstrated superior performance and interpretability, highlighting the importance of careful cluster selection to balance model complexity and predictive accuracy.
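
The silhouette score named in this abstract as the cluster-quality measure can be illustrated with a minimal pure-Python sketch. The function name `silhouette_score`, the Euclidean distance, and the toy points are assumptions made for illustration; this is not the authors' code and does not use the road network data.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    Assumes at least two distinct cluster labels are present.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    def mean_dist(p, members):
        return sum(math.dist(p, q) for q in members) / len(members)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so divide by len(own) - 1)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(mean_dist(p, m) for other, m in clusters.items() if other != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy example: two well-separated pairs of points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good_score = silhouette_score(toy_points, [0, 0, 1, 1])  # matches the geometry
bad_score = silhouette_score(toy_points, [0, 1, 0, 1])   # mixes the two groups
```

A labeling that matches the geometry scores close to 1, while one that mixes the groups scores below 0, which is why the score can be compared across candidate numbers of clusters.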

Hybrid Preference Prediction Technique Using Weighting based Data Reliability for Collaborative Filtering Recommendation System (협업 필터링 추천 시스템을 위한 데이터 신뢰도 기반 가중치를 이용한 하이브리드 선호도 예측 기법)

  • Lee, O-Joun;Baek, Yeong-Tae
    • Journal of the Korea Society of Computer and Information
    • /
    • v.19 no.5
    • /
    • pp.61-69
    • /
    • 2014
  • Collaborative filtering recommendation builds a subset of similar items or a subset of similar users from users' preferences for items, and uses it to predict a user's preference for a particular item. Thus, if the preference matrix has low density, the reliability of the recommendations decreases sharply. To solve this problem, we suggest a Hybrid Preference Prediction Technique Using Weighting based Data Reliability. Preference prediction is carried out by creating a similar-item subset and a similar-user subset, predicting the user's preference from each subset, and merging the two predicted values using weights applied according to model conditions. With this technique, we can increase the accuracy of user preference prediction and implement a recommendation system that provides highly reliable recommendations even when the density of the preference matrix is low. The effectiveness of the system is verified using Mean Absolute Error. In experiments, the proposed technique shows an average improvement of 21.7% over Hao Ji's technique when the sparsity of the preference matrix is more than 84%.
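
The merging step described in this abstract can be sketched as a weighted average of the user-based and item-based predictions, evaluated with Mean Absolute Error. The abstract does not state how the reliability weights are computed, so the weights below are hypothetical inputs and `merge_predictions` is an illustrative name, not the authors' formula.

```python
def merge_predictions(user_pred, item_pred, user_weight, item_weight):
    """Weighted average of two preference predictions.

    The weights stand in for data-reliability scores; how they are
    derived is an assumption left outside this sketch.
    """
    total = user_weight + item_weight
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    return (user_weight * user_pred + item_weight * item_pred) / total

def mean_absolute_error(actual, predicted):
    """MAE: average absolute difference between ratings and predictions."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# The user-based subset is treated as three times as reliable as the item-based one.
merged = merge_predictions(user_pred=4.0, item_pred=3.0, user_weight=3, item_weight=1)
# merged == 3.75
```

When one subset is empty or unreliable, its weight can be set to 0 so that the prediction falls back to the other subset.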

A Study on the Comparison of the Predictability among Traditional and Choice-based Conjoint Analyses in the Choice of Service Products (서비스제품 선택에서 전통적 컨조인트기법과 선택형 컨조인트기법간의 예측력 비교에 대한 연구)

  • Lim, Byung-Hoom;Ahn, Kwang-Ho;Park, Uhn-Yong
    • Journal of Global Scholars of Marketing Science
    • /
    • v.16 no.4
    • /
    • pp.39-54
    • /
    • 2006
  • Marketing managers hope to maximize the success rate of new products by satisfying the various needs of consumers. For this purpose, conjoint analysis has frequently been applied in the process of new product development. This study compares the predictability of several conjoint analyses in the choice of general hospitals. The comparison covers four models of traditional and choice-based conjoint analysis. Results show that the hybrid conjoint analysis, which combines traditional conjoint analysis with the choice-based conjoint model, has the highest predictive accuracy. Still, the two models show similar estimates of utility.

The Hybrid Multi-layer Inference Architectures and Algorithms of FPNN Based on FNN and PNN (FNN 및 PNN에 기초한 FPNN의 합성 다층 추론 구조와 알고리즘)

  • Park, Byeong-Jun;O, Seong-Gwon;Kim, Hyeon-Gi
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.49 no.7
    • /
    • pp.378-388
    • /
    • 2000
  • In this paper, we propose Fuzzy Polynomial Neural Networks (FPNN), based on Polynomial Neural Networks (PNN) and Fuzzy Neural Networks (FNN), for model identification of complex and nonlinear systems. The proposed FPNN is generated from the mutually combined structure of FNN and PNN, which serve as the premise part and the consequence part of the FPNN structure, respectively. As the consequence part of the FPNN, the PNN is based on the Group Method of Data Handling (GMDH), and its structure is similar to that of neural networks. Unlike conventional neural networks, however, the structure of the PNN is not fixed in advance; it is a self-organizing network that is generated dynamically. By combining FNN with PNN, the FPNN can effectively handle multiple input variables and high-order polynomials. Accordingly, it is possible to capture the nonlinear characteristics of a process and to obtain better output performance with superb predictive ability. As the premise part of the FPNN, the FNN uses simplified fuzzy inference as its inference method and the error back-propagation algorithm as its learning rule. Parameters such as those of the membership functions, the learning rates, and the momentum coefficients are adjusted using genetic algorithms. We use two kinds of FNN structure, according to how the fuzzy space of the input variables is divided. One is the basic FNN structure, which uses a fuzzy input space divided separately for each input variable; the other is the modified FNN structure, which uses a fuzzy input space divided by mutually combined input variables. To evaluate the performance of the proposed models, we use a nonlinear function and a traffic route choice process. The results show that the proposed FPNN can produce models with higher accuracy and greater robustness than the previously presented methods. A performance index related to the approximation and prediction capabilities of the model is also evaluated and discussed.
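
As a rough illustration of the GMDH idea behind the PNN part, the sketch below evaluates one node as a second-order polynomial of two inputs and then evaluates a layer of such nodes. The two-input quadratic form is the classic GMDH partial description and is an assumption here; the abstract does not specify the polynomial orders or input counts the authors use, and all names and coefficients are hypothetical.

```python
def quadratic_node(x1, x2, coeffs):
    """Classic GMDH partial description with two inputs.

    y = c0 + c1*x1 + c2*x2 + c3*x1*x2 + c4*x1^2 + c5*x2^2
    """
    c0, c1, c2, c3, c4, c5 = coeffs
    return c0 + c1 * x1 + c2 * x2 + c3 * x1 * x2 + c4 * x1 ** 2 + c5 * x2 ** 2

def layer_outputs(inputs, node_specs):
    """Evaluate one layer of nodes.

    node_specs is a list of ((i, j), coeffs): each node reads inputs i and j.
    In a GMDH-style network the outputs of the retained nodes become the
    inputs of the next layer, which is how the structure grows during design.
    """
    return [quadratic_node(inputs[i], inputs[j], c) for (i, j), c in node_specs]

# Hypothetical two-node layer over three inputs.
outputs = layer_outputs(
    [1.0, 2.0, 3.0],
    [((0, 1), (0, 1, 1, 0, 0, 0)),   # y = x0 + x1
     ((1, 2), (0, 0, 0, 1, 0, 0))],  # y = x1 * x2
)
# outputs == [3.0, 6.0]
```

In practice the coefficients of each node would be estimated from data (for example by least squares) and only the best-performing nodes kept per layer; that selection step is omitted here.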

The Credit Information Feature Selection Method in Default Rate Prediction Model for Individual Businesses (개인사업자 부도율 예측 모델에서 신용정보 특성 선택 방법)

  • Hong, Dongsuk;Baek, Hanjong;Shin, Hyunjoon
    • Journal of the Korea Society for Simulation
    • /
    • v.30 no.1
    • /
    • pp.75-85
    • /
    • 2021
  • In this paper, we present a deep neural network-based prediction model that processes and analyzes the corporate and personal credit information of individual business owners, as a new method for predicting the default rate of individual businesses more accurately. In modeling research across various fields, feature selection techniques have been actively studied as a way to improve performance, especially in predictive models with many features. In this paper, after statistically verifying the macroeconomic indicators (macro variables) and credit information (micro variables) used as input variables in the default rate prediction model, we additionally apply the credit information feature selection method to identify the final feature set that improves prediction performance. The proposed credit information feature selection method is an iterative, hybrid method that combines filter-based and wrapper-based methods: it builds submodels, constructs subsets by extracting the important variables of the best-performing submodels, and determines the final feature set by analyzing the prediction performance of the subsets and of the set formed by combining them.

Performance Evaluation and Forecasting Model for Retail Institutions (유통업체의 부실예측모형 개선에 관한 연구)

  • Kim, Jung-Uk
    • Journal of Distribution Science
    • /
    • v.12 no.11
    • /
    • pp.77-83
    • /
    • 2014
  • Purpose - The National Agricultural Cooperative Federation of Korea and the National Fisheries Cooperative Federation of Korea operate both financial and retail businesses. As cooperatives are public institutions and receive government support, the Financial Supervisory Service in Korea requires them to be soundly managed. This is mainly assessed with CAEL, which is adapted from CAMEL. However, because NFFC's business sections for finance and retail are evaluated as a single unit, the CAEL model is not sufficiently granular to evaluate the retail industry. First, there is the question of CAEL's discrimination power. Although retail business sector unions may receive high ratings under the CAEL model, defaults have often been reported. Therefore, a default prediction model is needed to support the CAEL model. A default prediction model built on subdivided indexes and statistical methods can serve a preventive function by estimating the retail sector's default probability. Second, the finance and retail business sectors need to be evaluated separately, because their businesses have different characteristics. Based on various management indexes that have been systematically managed by the National Fisheries Cooperative Federation of Korea, our model predicts retail default, and it is better than the CAEL model at failure prediction because it uses various discriminative financial ratios that reflect the situation of the retail industry. Research design, data, and methodology - The model to predict retail default was built using logistic analysis. To develop the predictive model, we use the retail financial statements of the NFCF. We consider 93 unions in each year from 2006 to 2012 to select reliable management indexes. We also apply statistical analyses of discrimination power, namely a t-test, logit analysis, AR (accuracy ratio), and AUROC (Area Under the Receiver Operating Characteristic curve) analysis. 
Finally, through the multivariate logistic model, we show that the model has excellent discrimination power and a high hit ratio for default prediction. We also evaluate its usefulness. Results - The statistical power analysis using the AR (AUROC) method on the short-term model shows that the logistic model has excellent discrimination power, at 84.6%. Further, the hit ratio for failure prediction of the total model is high, at 94%, indicating that the model is temporally stable and useful for evaluating the management status of retail institutions. Conclusions - This model is useful for evaluating the management status of retail union institutions. First, the CAEL evaluation needs to be subdivided. The existing CAEL evaluation is underdeveloped, and its discrimination power is low. Second, continued efforts are required to develop varied and rational management indexes. An index reflecting the characteristics of the retail industry needs to be developed. Extending this study will require the following. First, a complementary default model reflecting size differences is needed. Second, for small and medium retail, non-financial information is needed. The result would therefore be a hybrid default model reflecting both financial and non-financial information.
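
The two discrimination measures named in this abstract are closely related: for a binary default indicator, the accuracy ratio (AR) is commonly computed as 2 × AUROC − 1. The sketch below computes AUROC by pairwise comparison and derives AR from it. The function names and the toy scores are assumptions for illustration and do not reproduce the study's data or its 84.6% figure.

```python
def auroc(scores, defaults):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen defaulter (label 1)
    receives a higher risk score than a randomly chosen non-defaulter
    (label 0); ties count as one half.
    """
    pos = [s for s, d in zip(scores, defaults) if d == 1]
    neg = [s for s, d in zip(scores, defaults) if d == 0]
    if not pos or not neg:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy_ratio(scores, defaults):
    """Accuracy ratio (Gini coefficient) derived from AUROC."""
    return 2.0 * auroc(scores, defaults) - 1.0

# Hypothetical predicted default probabilities for four unions.
toy_scores = [0.9, 0.4, 0.6, 0.2]
toy_defaults = [1, 1, 0, 0]
toy_auroc = auroc(toy_scores, toy_defaults)        # 3 of 4 pairs ranked correctly: 0.75
toy_ar = accuracy_ratio(toy_scores, toy_defaults)  # 2 * 0.75 - 1 = 0.5
```

A model that ranks every defaulter above every non-defaulter reaches AUROC = 1 and AR = 1, while random ranking gives AUROC near 0.5 and AR near 0.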

A multi-layer approach to DN 50 electric valve fault diagnosis using shallow-deep intelligent models

  • Liu, Yong-kuo;Zhou, Wen;Ayodeji, Abiodun;Zhou, Xin-qiu;Peng, Min-jun;Chao, Nan
    • Nuclear Engineering and Technology
    • /
    • v.53 no.1
    • /
    • pp.148-163
    • /
    • 2021
  • Timely fault identification is important for the safe and reliable operation of electric valve systems. Many research works have used different data-driven approaches for fault diagnosis in complex systems. However, they do not consider the specific characteristics of critical control components such as electric valves. This work presents an integrated shallow-deep fault diagnostic model, developed from signals extracted from a DN50 electric valve. First, the local-optimum problem of the particle swarm optimization algorithm is addressed by optimizing the weight search capability, the particle speed, and the position update strategy. Then, to develop a shallow diagnostic model, the modified particle swarm algorithm is combined with a support vector machine to form a hybrid improved particle swarm-support vector machine (IPs-SVM). To decouple the influence of background noise, the wavelet packet transform is used to reconstruct the vibration signal. Thereafter, the IPs-SVM is used to classify phase imbalance and damaged valve faults, and its performance is evaluated against models developed using the conventional SVM and the particle swarm optimized SVM. Second, three different deep belief network (DBN) models are developed using different acoustic signal structures: raw signal, wavelet-transformed signal, and time-series (sequential) signal. These models are developed to estimate internal leakage sizes in the electric valve. The predictive performance of the DBN models and the evaluation results of the proposed IPs-SVM are also presented in this paper.

Development of Market Growth Pattern Map Based on Growth Model and Self-organizing Map Algorithm: Focusing on ICT products (자기조직화 지도를 활용한 성장모형 기반의 시장 성장패턴 지도 구축: ICT제품을 중심으로)

  • Park, Do-Hyung;Chung, Jaekwon;Chung, Yeo Jin;Lee, Dongwon
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.4
    • /
    • pp.1-23
    • /
    • 2014
  • Market forecasting aims to estimate the sales volume of a product or service sold to consumers over a specific selling period. From the perspective of the enterprise, accurate market forecasting assists in determining the timing of new product introduction and product design and in establishing production plans and marketing strategies, enabling a more efficient decision-making process. Moreover, accurate market forecasting enables governments to organize the national budget efficiently. This study aims to generate market growth curves for ICT (information and communication technology) goods using past time series data; categorize products showing similar growth patterns; understand markets in the industry; and forecast the future outlook of such products. This study suggests a useful and meaningful process (or methodology) for identifying market growth patterns with a quantitative growth model and a data mining algorithm. The study employs the following methodology. In the first stage, past time series data are collected for the target products or services of the categorized industry. The data, such as the volume of sales and domestic consumption for a specific product or service, are collected from the relevant government ministry, the National Statistical Office, and other relevant government organizations. Collected data that cannot be analyzed as-is, owing to a lack of past data or changes in code names, must be pre-processed. In the second stage, an optimal model for market forecasting is selected. This model can vary depending on the characteristics of each categorized industry. As this study focuses on the ICT industry, in which new technologies appear more frequently and change the market structure, the Logistic, Gompertz, and Bass models are selected. A hybrid model that combines different models can also be considered. 
The hybrid model considered in this study estimates the size of the market potential with the Logistic and Gompertz models, and these figures are then used in the Bass model. The third stage is to evaluate which model explains the data most accurately. To do this, the parameters are estimated from the collected past time series data to generate each model's predicted values, and the root-mean-squared error (RMSE) is calculated. The model with the lowest average RMSE across all product types is considered the best model. In the fourth stage, based on the parameter values estimated by the best model, a market growth pattern map is constructed with the self-organizing map algorithm. The self-organizing map is trained with the market pattern parameters of all products or services as input data, and the products or services are organized onto an $N \times N$ map. The number of clusters increases from 2 to M, depending on the characteristics of the nodes on the map. The clusters are divided into zones, and the clusters that provide the most meaningful explanation are selected. Based on the final selection of clusters, the boundaries between the nodes are chosen and, ultimately, the market growth pattern map is completed. The last step is to determine the final characteristics of the clusters as well as their market growth curves. The average of the market growth pattern parameters in each cluster is taken as a representative figure. Using this figure, a growth curve is drawn for each cluster, and its characteristics are analyzed. In addition, by taking into account the product types in each cluster, their characteristics can be described qualitatively. We expect that the process and system suggested in this paper can be used as a tool for forecasting demand in the ICT industry and in other industries.
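
The model-selection stage described in this abstract (generate each model's predicted values, compute RMSE, keep the model with the lowest error) can be sketched as follows. The curve equations are the commonly used cumulative forms of the Logistic, Gompertz, and Bass models; the abstract does not give the authors' exact parameterization, so these forms, the parameter values, and the helper `best_model` are assumptions for illustration. Parameter estimation itself is omitted.

```python
import math

def logistic(t, m, a, b):
    """Cumulative adoption, Logistic curve: m / (1 + a * exp(-b * t))."""
    return m / (1.0 + a * math.exp(-b * t))

def gompertz(t, m, a, b):
    """Cumulative adoption, Gompertz curve: m * exp(-a * exp(-b * t))."""
    return m * math.exp(-a * math.exp(-b * t))

def bass(t, m, p, q):
    """Cumulative adoption, Bass model with innovation p and imitation q."""
    e = math.exp(-(p + q) * t)
    return m * (1.0 - e) / (1.0 + (q / p) * e)

def rmse(actual, predicted):
    """Root-mean-squared error between observed and predicted series."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, predicted)) / len(actual))

def best_model(times, observed, candidates):
    """Pick the candidate with the lowest RMSE.

    candidates maps a name to (curve function, already-estimated parameters).
    Returns the winning name and the RMSE of every candidate.
    """
    errors = {
        name: rmse(observed, [curve(t, *params) for t in times])
        for name, (curve, params) in candidates.items()
    }
    return min(errors, key=errors.get), errors

# Hypothetical series generated from a Logistic curve with market potential m = 100.
times = list(range(10))
observed = [logistic(t, 100.0, 50.0, 0.8) for t in times]
candidates = {
    "logistic": (logistic, (100.0, 50.0, 0.8)),
    "gompertz": (gompertz, (100.0, 5.0, 0.5)),
    "bass": (bass, (100.0, 0.03, 0.4)),
}
winner, errors = best_model(times, observed, candidates)
```

In all three curves, `m` plays the role of market potential, which is what allows a hybrid setup to pass an `m` estimated with one curve into the Bass model.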

Recommender system using BERT sentiment analysis (BERT 기반 감성분석을 이용한 추천시스템)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.2
    • /
    • pp.1-15
    • /
    • 2021
  • When we find it difficult to make a decision, we ask friends or people around us for advice. When we decide to buy products online, we read anonymous reviews before buying. With the advent of the data-driven era, the development of IT is generating vast amounts of data from individuals and objects alike. Companies and individuals have accumulated, processed, and analyzed so much data that they can now make and execute decisions directly from data, where they used to depend on experts. Nowadays, recommender systems play a vital role in determining users' purchase preferences, and web services (Facebook, Amazon, Netflix, YouTube) use them to induce clicks. For example, YouTube's recommender system, which is used by 1 billion people worldwide every month, draws on the videos that users "like" and the videos they have watched. Recommender system research is deeply linked to practical business, so many researchers are interested in building better solutions. Recommender systems use information obtained from their users to generate recommendations, because building them requires information on the items that a user is likely to prefer. Through recommender systems, we have come to trust patterns and rules derived from data rather than empirical intuition. The growth in the volume of data has driven machine learning toward deep learning. However, recommender systems are not a solution to everything. For a recommender system to work, the data must not be sparse and there must be a sufficient amount of it. It also requires detailed information about the individual. Recommender systems work correctly only when these conditions hold. When the interaction log is insufficient, recommendation becomes a complex problem for both consumers and sellers. 
This is because, from the seller's perspective, recommendations must be made to consumers at a personal level, while from the consumer's perspective, appropriate recommendations must be received on the basis of reliable data. In this paper, to improve the accuracy of "appropriate recommendations" to consumers, a recommender system combined with context-based deep learning is proposed. This research combines user-based data to create a hybrid recommender system. The hybrid approach developed is not a purely collaborative recommender system but a collaborative extension that integrates user data with deep learning. Customer review data were used as the data set. Consumers buy products in online shopping malls and then write reviews evaluating them. Review ratings come from buyers who have already purchased the product, giving other users confidence before they purchase it. However, recommender systems mainly use scores or ratings rather than reviews to suggest items purchased by many users. In fact, consumer reviews contain product opinions and user sentiment that are relevant to the evaluation. By incorporating these into the study, this paper aims to improve the recommender system. The proposed algorithm is intended for situations in which individuals have difficulty selecting an item. Consumer reviews and record patterns make it possible to rely on recommendations appropriately. The algorithm implements a recommender system through collaborative filtering. The predictive accuracy of this study is measured by Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). Netflix has strategically used its recommender system through competitions aimed at reducing RMSE every year, making good use of predictive accuracy. Research on hybrid recommender systems that combine NLP approaches, deep learning, and other techniques for personalized recommendation has been increasing. Among NLP studies, sentiment analysis began to take shape in the mid-2000s as user review data increased. 
Sentiment analysis is a text classification task based on machine learning. Machine learning-based sentiment analysis has the disadvantage that it struggles to capture how information is expressed in a review, because it is difficult to take the characteristics of the text into account. In this study, we propose a deep learning recommender system that uses BERT's sentiment analysis to minimize these disadvantages of machine learning. The proposed model was compared with recommender systems based on Naive-CF (collaborative filtering), SVD (singular value decomposition)-CF, MF (matrix factorization)-CF, BPR-MF (Bayesian personalized ranking matrix factorization)-CF, LSTM, CNN-LSTM, and GRU (Gated Recurrent Units). In the experiments, the BERT-based recommender system performed best.

The Analysis and Design of Advanced Neurofuzzy Polynomial Networks (고급 뉴로퍼지 다항식 네트워크의 해석과 설계)

  • Park, Byeong-Jun;O, Seong-Gwon
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.39 no.3
    • /
    • pp.18-31
    • /
    • 2002
  • In this study, we introduce the concept of advanced neurofuzzy polynomial networks (ANFPN), a hybrid modeling architecture combining neurofuzzy networks (NFN) and polynomial neural networks (PNN). These networks are highly nonlinear rule-based models. The development of the ANFPN draws on the technologies of Computational Intelligence (CI), namely fuzzy sets, neural networks, and genetic algorithms. The NFN forms the premise part of the rule-based structure of the ANFPN, and the consequence part is designed using the PNN. In the premise part, the NFN uses both simplified fuzzy inference and the error back-propagation learning rule. The parameters of the membership functions, the learning rates, and the momentum coefficients are adjusted by genetic optimization. As the consequence structure of the ANFPN, the PNN is a flexible network architecture whose structure (topology) is developed through learning. In particular, the numbers of layers and nodes of the PNN are not fixed in advance but are generated dynamically. We introduce two kinds of ANFPN architectures, namely the basic and the modified one, which depend on the number of input variables and the order of the polynomial in each layer of the PNN structure. Owing to the specific features of the two combined architectures, it is possible to capture the nonlinear characteristics of a process system and to obtain better output performance with superb predictive ability. The availability and feasibility of the ANFPN are discussed and illustrated with the aid of two representative numerical examples. The results show that the proposed ANFPN can produce models with higher accuracy and predictive ability than the previously presented methods.