• Title/Summary/Keyword: bagging

Financial Distress Prediction Using Adaboost and Bagging in Pakistan Stock Exchange

  • TUNIO, Fayaz Hussain;DING, Yi;AGHA, Amad Nabi;AGHA, Kinza;PANHWAR, Hafeez Ur Rehman Zubair
    • The Journal of Asian Finance, Economics and Business / v.8 no.1 / pp.665-673 / 2021
  • Default has become a serious concern worldwide due to the financial crisis. Predicting corporate bankruptcy in advance supports decision-making by financial and regulatory bodies. Despite numerous advanced approaches, this area of study is not exhausted and requires additional research. The purpose of this research is to find the best classifier for detecting a company's default risk and bankruptcy. The study uses secondary time-series data from the Pakistan Stock Exchange (PSX) to examine the impact of the determinants. It evaluates several classifiers on their ability to correctly categorize default and non-default Pakistani companies listed on the PSX. Additionally, the PSX has grown consistently for some years and has provided benefits to its stockholders. This paper utilizes machine learning techniques to predict financial distress in companies listed on the PSX. Our results indicate that most multi-stage combinations of classifiers provided noteworthy improvements over the individual classifiers. This means that firms will have to work on financial variables such as liquidity and profitability to avoid falling into liquidation. Moreover, Adaptive Boosting (AdaBoost) provides a significant boost in the performance of each classifier.

Comparing the Performance of 17 Machine Learning Models in Predicting Human Population Growth of Countries

  • Otoom, Mohammad Mahmood
    • International Journal of Computer Science & Network Security / v.21 no.1 / pp.220-225 / 2021
  • Human population growth rate is an important parameter for real-world planning. Common approaches rely on fixed parameters such as population, mortality rate, and fertility rate, which are collected historically to determine a region's population growth rate. The literature does not provide a solution for areas with no historical knowledge. In such areas, machine learning can solve the problem, but the multitude of machine learning algorithms makes it difficult to determine the best approach. Further, missing features are a common real-world problem. Thus, it is essential to compare machine learning techniques and select those that perform best and are most robust in the presence of missing features. This study compares the performance of 17 machine learning techniques (base learners and ensemble learners) in predicting the human population growth rate of a country. Among the 17 techniques, random forest outperformed all the others in both predictive performance and robustness to missing features. Thus, the study demonstrates and compares machine learning techniques for predicting the human population growth rate in settings where historical data and feature information are not available. Further, the study identifies the best machine learning algorithm for population growth rate prediction.

Study of Personal Credit Risk Assessment Based on SVM

  • LI, Xin;XIA, Han
    • The Journal of Industrial Distribution & Business / v.13 no.10 / pp.1-8 / 2022
  • Purpose: Support vector machine (SVM) ensembles have recently been proposed to improve the classification performance of credit risk assessment. However, currently used fusion strategies do not evaluate the importance of each component SVM classifier's output when combining the component predictions into the final decision. To deal with this problem, this paper designs an SVM ensemble method based on the fuzzy integral, which aggregates the outputs of the separate component SVMs according to the importance of each component SVM. Research design, data, and methodology: This paper designs a personal credit risk evaluation index system comprising 16 indicators and discusses an SVM ensemble method based on the fuzzy integral for designing a credit risk assessment system that discriminates good creditors from bad ones. The paper randomly selects sample data on 1,500 personal loan customers of a commercial bank in China from 2015 to 2020 for simulation experiments. Results: Compared experimentally with a single SVM and a neural network ensemble, the proposed SVM ensemble outperforms both in terms of classification accuracy. Conclusions: The results show that the proposed method has higher classification accuracy than the other classification methods, which confirms its feasibility and effectiveness.

Credit Risk Evaluations of Online Retail Enterprises Using Support Vector Machines Ensemble: An Empirical Study from China

  • LI, Xin;XIA, Han
    • The Journal of Asian Finance, Economics and Business / v.9 no.8 / pp.89-97 / 2022
  • The e-commerce market faces significant credit risks due to the complexity of the industry and information asymmetries, and credit risk has started to stymie the growth of e-commerce. However, there is no reliable system for evaluating the creditworthiness of e-commerce companies. Therefore, this paper constructs a credit risk evaluation index system that comprehensively considers the online and offline behavior of online retail enterprises, including 15 indicators that reflect online credit risk and 15 indicators that reflect offline credit risk. The paper establishes an integration method based on a fuzzy integral support vector machine, which takes the factor analysis results of the credit risk evaluation index system as the input and the credit risk evaluation results of online retail enterprises as the output. The method takes into account both the classification results of each sub-classifier and the importance of each sub-classifier's decision to the final decision. Sample data on 1,500 online retail loan customers from a bank are used to test the model. The empirical results demonstrate that the proposed method outperforms a single SVM and a traditional SVM aggregation technique based on majority voting in terms of classification accuracy, which provides a basis for banks to establish a reliable evaluation system.

Improved ensemble machine learning framework for seismic fragility analysis of concrete shear wall system

  • Sangwoo Lee;Shinyoung Kwag;Bu-seog Ju
    • Computers and Concrete / v.32 no.3 / pp.313-326 / 2023
  • The seismic safety of the shear wall structure can be assessed through seismic fragility analysis, which requires high computational costs in estimating seismic demands. Accordingly, machine learning methods have been applied to such fragility analyses in recent years to reduce the numerical analysis cost, but it still remains a challenging task. Therefore, this study uses the ensemble machine learning method to present an improved framework for developing a more accurate seismic demand model than the existing ones. To this end, a rank-based selection method that enables determining an excellent model among several single machine learning models is presented. In addition, an index that can evaluate the degree of overfitting/underfitting of each model for the selection of an excellent single model is suggested. Furthermore, based on the selected single machine learning model, we propose a method to derive a more accurate ensemble model based on the bagging method. As a result, the seismic demand model for which the proposed framework is applied shows about 3-17% better prediction performance than the existing single machine learning models. Finally, the seismic fragility obtained from the proposed framework shows better accuracy than the existing fragility methods.
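The bagging step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' framework: the base learner here is a one-variable least-squares line chosen only for brevity, the toy data are made up, and the names `fit_line`, `bagging_fit`, and `bagging_predict` are hypothetical. The sketch only shows the general idea of fitting one model per bootstrap resample and averaging the predictions.

```python
import random
from statistics import mean

def fit_line(points):
    """Least-squares fit y = a + b*x for a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample (all x equal): fall back to a constant
        return y_bar, 0.0
    b = sum((x - x_bar) * (y - y_bar) for x, y in points) / sxx
    return y_bar - b * x_bar, b

def bagging_fit(points, n_models=25, seed=0):
    """Fit one base model per bootstrap resample of the training data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(points) samples with replacement.
        sample = [rng.choice(points) for _ in points]
        models.append(fit_line(sample))
    return models

def bagging_predict(models, x):
    """Average the predictions of all base models."""
    return mean(a + b * x for a, b in models)

# Made-up toy data: y = 2x + 1 plus Gaussian noise.
data_rng = random.Random(1)
train = [(x, 2 * x + 1 + data_rng.gauss(0, 0.5)) for x in range(20)]
models = bagging_fit(train)
print(bagging_predict(models, 10.0))
```

Averaging mainly reduces the variance of unstable base learners, so the gain is larger for models such as unpruned trees than for the linear fit used here.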

A Comprehensive Approach for Tamil Handwritten Character Recognition with Feature Selection and Ensemble Learning

  • Manoj K;Iyapparaja M
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.6 / pp.1540-1561 / 2024
  • This research proposes a novel approach for Tamil Handwritten Character Recognition (THCR) that combines feature selection and ensemble learning techniques. The Tamil script is complex and highly variable, requiring a robust and accurate recognition system. Feature selection is used to reduce dimensionality while preserving discriminative features, improving classification performance and reducing computational complexity. Several feature selection methods are compared, and individual classifiers (support vector machines, neural networks, and decision trees) are evaluated through extensive experiments. Ensemble learning techniques such as bagging and boosting are employed to leverage the strengths of multiple classifiers and enhance recognition accuracy. The proposed approach is evaluated on the HP Labs Dataset, achieving an impressive 95.56% accuracy using an ensemble learning framework based on support vector machines. The dataset consists of 82,928 samples with 247 distinct classes, contributed by 500 participants from Tamil Nadu. It includes 40,000 characters with 500 user variations. The results surpass or rival existing methods, demonstrating the effectiveness of the approach. The research also offers insights for developing advanced recognition systems for other complex scripts. Future investigations could explore the integration of deep learning techniques and the extension of the proposed approach to other Indic scripts and languages, advancing the field of handwritten character recognition.

Object Classification Method Using Dynamic Random Forests and Genetic Optimization

  • Kim, Jae Hyup;Kim, Hun Ki;Jang, Kyung Hyun;Lee, Jong Min;Moon, Young Shik
    • Journal of the Korea Society of Computer and Information / v.21 no.5 / pp.79-89 / 2016
  • In this paper, we propose an object classification method using a genetic algorithm and a dynamic random forest consisting of an optimal combination of unit trees. A random forest is an ensemble classification model that combines unit decision trees by bagging; by randomizing the training samples and the feature selection allocated to each decision tree, it can ensure good generalization performance when a large number of trees are combined. However, because a random forest is composed of randomly built unit trees, it shows excellent classification performance only when a sufficient number of trees are combined. There is no quantitative method for determining the number of trees, so there is no choice but to keep generating random tree structures repeatedly. The proposed algorithm builds a random forest from an optimal combination of trees while maintaining the generalization performance of the random forest. To achieve this, improving the classification performance is cast as an optimization problem of finding the optimal tree combination, which is solved with a genetic algorithm. Experiments show that the proposed algorithm improves classification performance by about 3-5% over the existing random forest in specific cases, such as a common database and a self-collected infrared database. In addition, we show that the optimal tree combination is reached at about 55-60% of the maximum number of trees.
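The idea of searching for a good subset of trees with a genetic algorithm can be sketched as follows. The abstract does not give the encoding or operators, so everything here is an assumption for illustration: a bitmask chromosome (one bit per tree), majority-vote accuracy as fitness, tournament selection, one-point crossover, bit-flip mutation, and simulated tree votes in place of real decision trees. The names `majority_vote_accuracy` and `select_trees` are hypothetical.

```python
import random

def majority_vote_accuracy(mask, votes, labels):
    """Accuracy of the majority vote over the trees switched on in `mask`."""
    chosen = [v for bit, v in zip(mask, votes) if bit]
    if not chosen:
        return 0.0
    correct = 0
    for i, label in enumerate(labels):
        ones = sum(tree[i] for tree in chosen)
        pred = 1 if 2 * ones > len(chosen) else 0  # ties go to class 0
        correct += pred == label
    return correct / len(labels)

def select_trees(votes, labels, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Evolve a bitmask choosing which trees take part in the vote."""
    rng = random.Random(seed)
    n = len(votes)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    pop[0] = [1] * n  # seed with the full forest so the result is never worse on this set
    for _ in range(generations):
        scored = sorted(
            ((majority_vote_accuracy(m, votes, labels), m) for m in pop),
            reverse=True,
        )
        next_pop = [scored[0][1][:]]  # elitism: carry over the best mask unchanged
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(scored, 3))[1]  # tournament selection
            p2 = max(rng.sample(scored, 3))[1]
            cut = rng.randrange(1, n)           # one-point crossover
            child = [bit ^ (rng.random() < p_mut) for bit in p1[:cut] + p2[cut:]]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=lambda m: majority_vote_accuracy(m, votes, labels))

# Simulated votes: 20 "trees" of varying quality on 200 binary-labelled samples.
data_rng = random.Random(42)
labels = [data_rng.randint(0, 1) for _ in range(200)]
votes = []
for _ in range(20):
    acc = data_rng.uniform(0.45, 0.75)
    votes.append([y if data_rng.random() < acc else 1 - y for y in labels])

best = select_trees(votes, labels)
full = majority_vote_accuracy([1] * len(votes), votes, labels)
print(sum(best), majority_vote_accuracy(best, votes, labels), full)
```

Because the mask is selected and scored on the same samples, the reported accuracy is optimistic; a held-out set would be needed to judge generalization.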

The Effect on the Intracranial Pressure of the Patients Receiving Endotracheal Suction

  • 김매자;이경옥
    • Journal of Korean Academy of Nursing / v.23 no.2 / pp.245-254 / 1993
  • The purpose of this study was to identify effective methods to minimize increases in intracranial pressure (IICP) during endotracheal suction by comparing two methods, hyperventilation and oxygen supply, applied before and after endotracheal suction. To evaluate the effects of these two methods, the ICP during suctioning and the sustained time of IICP were measured. For hyperventilation, ambu-bagging was done 10 times for 30 seconds with a tidal volume of 800-900 ml. For oxygen supply, 100 percent oxygen was supplied for 2 minutes before and after suction. The subjects were 12 neurosurgical patients who had a subarachnoid bolt inserted for ICP monitoring and were all on mechanical ventilatory support in a surgical intensive care unit of Seoul National University Hospital from July 1, 1991 to March 31, 1992. In each patient, hyperventilation was performed five times and oxygen supply was given five times, and intracranial pressure was measured immediately before suction and every 30 seconds for 15 minutes after suction. For case assignment, counterbalancing and repeated-measures designs were combined, so the total number of experiments was sixty for each group. The effects of hyperventilation and oxygen supply on the IICP and the sustained time of IICP after suction were analyzed by t-test. The results were as follows: (1) There was a significant difference between the two groups in the increase in ICP during suction (t=2.49, p=.014). (2) The sustained time of IICP after suctioning was shorter in the oxygen supply group than in the hyperventilation group (t=2.35, p=.020). In summary, the increase in ICP during suction was lower and the time for the ICP to return to the presuction level was shorter in the oxygen supply group than in the hyperventilation group. Therefore, oxygen supply can be recommended before and after endotracheal suction.

Data-mining modeling for the prediction of wear on forming-taps in the threading of steel components

  • Bustillo, Andres;Lopez de Lacalle, Luis N.;Fernandez-Valdivielso, Asier;Santos, Pedro
    • Journal of Computational Design and Engineering / v.3 no.4 / pp.337-348 / 2016
  • An experimental approach is presented for the measurement of wear that is common in the threading of cold-forged steel. In this work, the first objective is to measure wear on various types of roll taps manufactured for tapping holes in microalloyed HR45 steel. Different geometries and levels of wear are tested and measured. Taking their geometry as the critical factor, the types of forming tap with the least wear and the best performance are identified. Abrasive wear was observed on the forming lobes. A higher number of lobes in the chamber zone and around the nominal diameter meant a more uniform load distribution and a more gradual forming process. A second objective is to identify the most accurate data-mining technique for the prediction of form-tap wear. Different data-mining techniques are tested to select the most accurate one: from standard versions such as Multilayer Perceptrons, Support Vector Machines and Regression Trees to the most recent ones such as Rotation Forest ensembles and Iterated Bagging ensembles. The best results were obtained with ensembles of Rotation Forest with unpruned Regression Trees as base regressors, which reduced the RMS error of the best-tested baseline technique for the lower length output by 33%, and with Additive Regression with unpruned M5P as base regressors, which reduced the RMS errors of the linear fit for the upper and total lengths by 25% and 39%, respectively. However, the lower length was statistically more difficult to model in Additive Regression than in Rotation Forest. Rotation Forest with unpruned Regression Trees as base regressors therefore appeared to be the most suitable regressor for the modeling of this industrial problem.

Characteristics of sawdust cultivation of Lentinula edodes with different methods of spawn inoculation

  • Chang, Hyun You;Seo, Geum Hui;Lee, Yong Kuk;Jeon, Sung Woo
    • Journal of Mushroom / v.16 no.2 / pp.61-64 / 2018
  • This study was carried out to investigate the management characteristics and growth performance of L. edodes from the cooling stage to incubation. Bags of different heights and weights are available for bagging. When a medium size of $17 \times 13$ cm was used and the size of the inoculation hole was changed from 1/3 to 2/3, the browning period was shortened to 30 days. Mycelial growth was evaluated according to the cooling temperature after sterilization. It was highest at $10^{\circ}$C, at 122 mm/15 days, compared with 114 mm/15 days and 117 mm/15 days at $15^{\circ}$C and $20^{\circ}$C, respectively. The contamination rate of the sawdust media before inoculation was measured as 0, $4.5 \times 10$, $1.3 \times 10^2$, and $4.0 \times 10^3$ cfu at $5^{\circ}$C, $10^{\circ}$C, $15^{\circ}$C, and $24^{\circ}$C, respectively. An average of $1.6 \times 10^8$ colony forming units (cfu) of microorganisms was observed in sawdust that had been piled outdoors for six months. In summer, the sawdust has to be used immediately after mixing. The sterilized medium had an average of $4 \times 10^3$ cfu of microorganisms at $24^{\circ}$C and $1.3 \times 10^2$ cfu at $15^{\circ}$C. Fifteen days after inoculation in vitro, growth was best with sawdust, at 132 mm, followed by grain and liquid. When inoculating with liquid spawn, the moisture content of the substrate should be adjusted to between 50% and 55% in advance.