• Title/Summary/Keyword: financial machine learning


A Securities Company's Customer Churn Prediction Model and Causal Inference with SHAP Value (증권 금융 상품 거래 고객의 이탈 예측 및 원인 추론)

  • Na, Kwangtek;Lee, Jinyoung;Kim, Eunchan;Lee, Hyochan
    • The Journal of Bigdata / v.5 no.2 / pp.215-229 / 2020
  • Interest in machine learning is growing across industries, but its lack of explainability makes it difficult to apply to real-world tasks. This paper presents a case study of developing a customer churn prediction model for a securities company, and reports an attempt to build an explainable machine learning model using the SHAP value methodology. Six customer churn models are compared and analyzed, and the causes of churn are inferred by classifying and analyzing SHAP values together with the type of change in customer assets. The results could support comprehensive decision-making: marketing managers could use churn predictions whose causes can be inferred in their actual customer marketing, and establish a targeted marketing strategy for each customer.
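
The attribution idea behind SHAP values can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so the code below only computes exact Shapley values for a toy scoring function by averaging marginal contributions over every feature ordering. The feature names (`asset_decline`, `low_trading`, `age`) and the function `toy_churn_score` are hypothetical, and this is not the API of the SHAP library, which relies on faster approximations.

```python
from itertools import permutations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    for order in permutations(features):
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            # Marginal contribution of f when it joins the current coalition.
            totals[f] += value(with_f) - value(coalition)
            coalition = with_f
    n_orderings = factorial(len(features))
    return {f: total / n_orderings for f, total in totals.items()}


def toy_churn_score(coalition):
    """Hypothetical churn score: two additive effects plus one interaction."""
    score = 0.0
    if "asset_decline" in coalition:
        score += 0.30
    if "low_trading" in coalition:
        score += 0.10
    if "asset_decline" in coalition and "low_trading" in coalition:
        score += 0.20  # interaction term, split equally between both features
    return score


features = ["asset_decline", "low_trading", "age"]
phi = shapley_values(features, toy_churn_score)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.2f}")
```

The contributions sum to the score of the full feature set, which is the property that lets a per-customer prediction be broken down into per-feature causes. Exact enumeration grows factorially with the number of features, so it is only practical for small illustrations like this one.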

Study on Prediction of Attendance Using Machine Learning (머신러닝을 이용한 관중 수요 예측에 관한 연구)

  • Yoo, Ji-Hyun
    • Journal of IKEEE / v.23 no.4 / pp.1243-1249 / 2019
  • People who gather to enjoy a specific event or content are called audiences or spectators, and they show various propensities depending on the characteristics of the crowd. Despite such differences, attendance is in general directly tied to the business side of an event: it enables stable financial operation through income from admission fees and the use of other facilities. Attendance prediction can therefore serve as a major input to marketing and budgeting strategies. In this study, we review several existing models for predicting attendance and propose an efficient machine learning model. In addition, we study daily attendance prediction and abnormal attendance prediction using a combined DNN (Deep Neural Network) and RF (Random Forest) model.

Work life balance practices and the link to innovation and productivity: A comprehensive literature review

  • Hatcher, Ryan;Hwang, Yo-Sung
    • The Journal of Economics, Marketing and Management / v.7 no.1 / pp.26-38 / 2019
  • Purpose - This paper reviews recent literature on work-life balance, investigating its limitations and implications for future research. It focuses on the linkages between work-life balance practices, machine learning and emotional intelligence, and work-life conflict; the correlations between work-life enrichment and work-life balance practices; the relationships between employee job satisfaction and work-life balance; and the links between work-life balance and managerial support. Research design, data, and methodology - The paper further details linkages between work-life balance and the organizational performance outcomes of productivity and innovation. Previous literature has paid attention to the link between HR practices and organizational outcomes such as productivity, flexibility, and financial performance, but this understanding needs to be extended to include innovation performance. Dealing with employees' emotions using different machine learning techniques is a prominent research topic today. Here, we examine how far employees are conscious of their own selves, and identify the ideas and views individuals hold about themselves and others. Without proper knowledge of their own personality, it is very difficult for individuals to manage their emotions. This study also aims to assess individuals' abilities to manage their emotions in order to perform well. Conclusions - A conceptual framework has been built by integrating the existing literature to explain a number of factors closely associated with work-life balance. The conceptual model illustrates how work-life balance interacts with performance and relates to the aforementioned factors.

Examining the Effects of Vocabulary on Crowdfunding Success: A Comparison of Cultural and Commercial Campaigns

  • Xiang Gao;Weige Huang;Bin, Li;Sunghan Ryu
    • Asia Pacific Journal of Information Systems / v.32 no.2 / pp.275-306 / 2022
  • Crowdfunding has emerged as an important financing source for diverse cultural projects and commercial ventures in their early stages. Unlike traditional investment evaluation, where structured financial data is critical, such information is typically unavailable for crowdfunding campaigns. Instead, campaign creators prepare pitches containing essential information about themselves and the campaigns, which are crucial in attracting and persuading contributors. Prior literature has examined the effects of different aspects of campaign pitches, but a comprehensive understanding of the theme is lacking. This study aims to fill this gap by identifying the lexicon of frequently used vocabulary in campaign pitches and examining how it is associated with crowdfunding success. Moreover, we examine how the association differs between cultural and commercial crowdfunding campaigns. We randomly collected 50,000 campaigns from the cultural and commercial categories on Kickstarter and extracted the 100 most used verbs in the campaign pitches. Based on a machine learning approach combined with principal component analysis, we constructed sets of verbal factors that are statistically significant in predicting crowdfunding success. The findings also show that cultural and commercial campaigns consist of different verbal components with different effects on crowdfunding success.

A Comparative Study of Phishing Websites Classification Based on Classifier Ensemble

  • Tama, Bayu Adhi;Rhee, Kyung-Hyune
    • Journal of Korea Multimedia Society / v.21 no.5 / pp.617-625 / 2018
  • Phishing websites have become a crucial concern in cyber security. Phishing is carried out by fraudulently deceiving users in order to obtain sensitive information such as bank account details, credit card numbers, usernames, and passwords. The threat has caused huge losses to online retailers, e-business platforms, and financial institutions, to name but a few. One way to build an anti-phishing detection mechanism is to construct a classification algorithm based on machine learning techniques. The objective of this paper is to compare different classifier ensemble approaches (random forest, rotation forest, gradient boosted machine, and extreme gradient boosting) against single classifiers (decision tree, classification and regression tree, and credal decision tree) for phishing website detection. Area under the ROC curve (AUC) is employed as the performance metric, while statistical tests are used to evaluate the significance of differences among classifiers. The paper contributes to the existing literature by providing a benchmark of classifier ensembles for web phishing detection.
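
As a rough illustration of the AUC metric named in this abstract, the sketch below computes it from its pairwise definition: the fraction of (phishing, legitimate) pairs in which the phishing site receives the higher score, with ties counted as half. The function name `roc_auc` and the two score lists are invented for illustration and are not results from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Made-up scores for four websites (1 = phishing, 0 = legitimate).
labels = [0, 0, 1, 1]
single_classifier_scores = [0.1, 0.4, 0.35, 0.8]
ensemble_scores = [0.1, 0.3, 0.6, 0.9]

auc_single = roc_auc(labels, single_classifier_scores)
auc_ensemble = roc_auc(labels, ensemble_scores)
print(auc_single, auc_ensemble)
```

The pairwise form is quadratic in the number of examples, so a rank-based implementation is preferable on real benchmarks. It is used here because it makes the meaning of the metric explicit.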

A Comparative Study of Phishing Websites Classification Based on Classifier Ensembles

  • Tama, Bayu Adhi;Rhee, Kyung-Hyune
    • Journal of Multimedia Information System / v.5 no.2 / pp.99-104 / 2018
  • Phishing websites have become a crucial concern in cyber security. Phishing is carried out by fraudulently deceiving users in order to obtain sensitive information such as bank account details, credit card numbers, usernames, and passwords. The threat has caused huge losses to online retailers, e-business platforms, and financial institutions, to name but a few. One way to build an anti-phishing detection mechanism is to construct a classification algorithm based on machine learning techniques. The objective of this paper is to compare different classifier ensemble approaches (random forest, rotation forest, gradient boosted machine, and extreme gradient boosting) against single classifiers (decision tree, classification and regression tree, and credal decision tree) for phishing website detection. Area under the ROC curve (AUC) is employed as the performance metric, while statistical tests are used to evaluate the significance of differences among classifiers. The paper contributes to the existing literature by providing a benchmark of classifier ensembles for web phishing detection.

Machine Learning Based Stock Price Fluctuation Prediction Models of KOSDAQ-listed Companies Using Online News, Macroeconomic Indicators, Financial Market Indicators, Technical Indicators, and Social Interest Indicators (온라인 뉴스와 거시경제 지표, 금융 지표, 기술적 지표, 관심도 지표를 이용한 코스닥 상장 기업의 기계학습 기반 주가 변동 예측)

  • Kim, Hwa Ryun;Hong, Seung Hye;Hong, Helen
    • Journal of Korea Multimedia Society / v.24 no.3 / pp.448-459 / 2021
  • In this paper, we propose a method of predicting the next-day stock price fluctuations of 10 KOSDAQ-listed companies in the 5G, autonomous driving, and electricity sectors by training SVM, XGBoost, and LightGBM models on macroeconomic and financial market indicators, technical indicators, social interest indicators, and daily positive indices extracted from online news. In three experiments designed to assess the usefulness of the social interest indicators and daily positive indices, the average accuracy improved when each indicator and index was added to the models. In addition, when feature selection was performed to analyze the relative importance of the extracted features, the average importance rankings of the social interest indicator and the daily positive index were 5.45 and 1.08, respectively, higher than those of the macroeconomic and financial market indicators and the technical indicators. These experiments confirm the effectiveness of the social interest indicators as alternative data, and of the daily positive index, for predicting stock price fluctuations.

The Prediction of Export Credit Guarantee Accident using Machine Learning (기계학습을 이용한 수출신용보증 사고예측)

  • Cho, Jaeyoung;Joo, Jihwan;Han, Ingoo
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.83-102 / 2021
  • The government recently announced various policies for developing the big-data and artificial intelligence fields, giving the public a great opportunity through the disclosure of high-quality data held by public institutions. KSURE (Korea Trade Insurance Corporation) is a major public institution for financial policy in Korea, and is strongly committed to backing export companies through various systems. Nevertheless, there are still few cases of realized business models based on big-data analyses. Against this background, this paper aims to develop a new business model that can be applied to ex-ante prediction of the likelihood of a credit guarantee insurance accident. We use internal data from KSURE, which supports export companies in Korea, and apply machine learning models. We then compare the performance of predictive models including Logistic Regression, Random Forest, XGBoost, LightGBM, and DNN (Deep Neural Network). For decades, many researchers have tried to find better models for predicting bankruptcy, since ex-ante prediction is crucial for corporate managers, investors, creditors, and other stakeholders. Research on predicting financial distress or bankruptcy can be traced back to Smith (1930), Fitzpatrick (1932), or Merwin (1942). One of the most famous models is Altman's Z-score model (Altman, 1968), which is based on multiple discriminant analysis and is still widely used in both research and practice. It uses five key financial ratios to predict the probability of bankruptcy in the next two years. Ohlson (1980) introduces a logit model to address some limitations of previous models. Furthermore, Elmer and Borowski (1988) develop and examine a rule-based, automated system that conducts the financial analysis of savings and loans.
Since the 1980s, researchers in Korea have also studied the prediction of financial distress and bankruptcy. Kim (1987) analyzes financial ratios and develops a prediction model. Han et al. (1995, 1996, 1997, 2003, 2005, 2006) construct prediction models using various techniques, including artificial neural networks. Yang (1996) introduces multiple discriminant analysis and a logit model. Kim and Kim (2001) use artificial neural network techniques for ex-ante prediction of insolvent enterprises. Since then, many scholars have tried to predict financial distress or bankruptcy more precisely using diverse models such as Random Forest or SVM. One major distinction between our research and previous work is that we examine the predicted probability of default for each sample case, rather than only the classification accuracy of each model over the entire sample. Most predictive models in this paper achieve a classification accuracy of about 70% on the entire sample. Specifically, the LightGBM model shows the highest accuracy, 71.1%, and the Logit model the lowest, 69%. However, these results are open to multiple interpretations. In a business context, more emphasis has to be placed on minimizing type 2 error, which causes more harmful operating losses for the guaranty company. Thus, we also compare classification accuracy after splitting the predicted probability of default into ten equal intervals. Examined by interval, the Logit model has the highest accuracy, 100%, for the 0~10% interval of predicted probability of default, but a relatively low accuracy of 61.5% for the 90~100% interval.
On the other hand, Random Forest, XGBoost, LightGBM, and DNN give more desirable results: they show higher accuracy in both the 0~10% and 90~100% intervals of predicted probability of default, but lower accuracy around a predicted probability of 50%. As for the distribution of samples across the intervals, both LightGBM and XGBoost place a relatively large number of samples in the 0~10% and 90~100% intervals. Although the Random Forest model has an advantage in classification accuracy on a small number of cases, LightGBM or XGBoost could be the more desirable model because they classify a large number of cases into the two extreme intervals, even allowing for their relatively low classification accuracy. Considering both type 2 error and total prediction accuracy, XGBoost and DNN show the best performance, followed by Random Forest and LightGBM, while logistic regression performs worst. However, each predictive model has a comparative advantage under different evaluation standards. For instance, the Random Forest model shows almost 100% accuracy for samples expected to have a high probability of default. Taken together, more comprehensive ensemble models could be constructed that combine multiple machine learning classifiers and use majority voting to maximize overall performance.
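
The two evaluation ideas in this abstract, accuracy within ten equal intervals of predicted default probability and majority voting across models, might be sketched as follows. The abstract does not give the exact procedure, so the 0.5 decision threshold, the tie rule, the function names, and all sample data here are assumptions made for illustration.

```python
def accuracy_by_interval(y_true, p_default, n_bins=10, threshold=0.5):
    """Return {bin_index: (n_cases, accuracy)} for equal-width probability bins."""
    hits = [0] * n_bins
    counts = [0] * n_bins
    for y, p in zip(y_true, p_default):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        counts[b] += 1
        # A case is a hit when the thresholded prediction matches the label.
        hits[b] += int((p >= threshold) == bool(y))
    return {b: (counts[b], hits[b] / counts[b]) for b in range(n_bins) if counts[b]}


def majority_vote(predictions):
    """Combine 0/1 predictions from several models (one list per model).

    A case is labeled 1 only when strictly more than half of the models say 1,
    so ties go to 0 (an assumed convention).
    """
    return [int(2 * sum(votes) > len(votes)) for votes in zip(*predictions)]


# Made-up labels (1 = default) and predicted default probabilities.
y_true = [0, 0, 1, 1, 1, 0]
p_default = [0.05, 0.08, 0.95, 0.92, 0.45, 0.55]
per_bin = accuracy_by_interval(y_true, p_default)
for b, (n, acc) in sorted(per_bin.items()):
    print(f"{b * 10}~{(b + 1) * 10}%: n={n}, accuracy={acc:.2f}")

# Three hypothetical models voting on three cases.
votes = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
print(votes)
```

Reporting the case count next to each interval's accuracy reflects the abstract's point that a model placing many cases in the extreme intervals can be more useful than one that is accurate on only a few.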

A Study on the Prediction Method of Voice Phishing Damage Using Big Data and FDS (빅데이터와 FDS를 활용한 보이스피싱 피해 예측 방법 연구)

  • Lee, Seoungyong;Lee, Julak
    • Korean Security Journal / no.62 / pp.185-203 / 2020
  • While overall crime has been declining since 2009, voice phishing has been on the rise. The government and academia have proposed various measures and conducted research to eradicate it, but these have not kept pace with evolving voice phishing. This study focuses on catching criminals and preventing damage from voice phishing, which is difficult to recover from. In particular, based on the fact that victims engage in financial transactions (such as account transfers), it examines a voice phishing prediction method using the Fraud Detection System (FDS) that is already used to detect financial fraud. As a result, a concept was derived for combining big data related to voice phishing, such as call details, messenger details, abnormal accounts, voice phishing types, and 112 reports, in a machine learning-based FDS. The research relied mainly on government measures and a literature review on the use of big data. Because of limitations in data collection and security concerns around FDS, no specific model is provided. Nevertheless, the study is meaningful in that, in the absence of prior research, it presents for the first time a concept for voice phishing response that combines FDS with the types of data needed for machine learning. It is hoped that a 'Voice Phishing Damage Prediction System' will be developed on the basis of this research to prevent damage from voice phishing.

Predicting Corporate Bankruptcy using Simulated Annealing-based Random Forests (시뮬레이티드 어니일링 기반의 랜덤 포레스트를 이용한 기업부도예측)

  • Park, Hoyeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.155-170 / 2018
  • Predicting a company's financial bankruptcy is traditionally one of the most crucial forecasting problems in business analytics. Previous studies have proposed prediction models that apply or combine statistical and machine learning-based techniques. In this paper, we propose a novel intelligent prediction model based on simulated annealing, a well-known optimization technique. Simulated annealing is known to have optimization performance comparable to that of genetic algorithms. Nevertheless, since there has been little research on prediction and classification for business decision-making problems using simulated annealing, it is meaningful to confirm the usefulness of the proposed model in business analytics. In this study, we use a combined model of simulated annealing and machine learning to select the input features of the bankruptcy prediction model. Typical ways of combining optimization and machine learning techniques are feature selection, feature weighting, and instance selection. This study proposes a combined model for feature selection, the most studied of these. To confirm the superiority of the proposed model, we apply it to real-world financial data of Korean companies and analyze the results. The results show that the predictive accuracy of the proposed model is better than that of the naïve model. Notably, performance is significantly improved compared with the traditional decision tree, random forests, artificial neural network, SVM, and logistic regression analysis.
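
A minimal sketch of simulated annealing applied to feature selection, under stated assumptions, is given below. The abstract does not specify the authors' neighborhood move, cooling schedule, or fitness function, so the single-bit flip, the geometric cooling, and the scoring function `toy_score` (a stand-in for the cross-validated accuracy of a classifier such as a random forest) are all hypothetical choices.

```python
import math
import random


def anneal_feature_subset(n_features, score, n_iter=500, t_start=1.0,
                          cooling=0.99, seed=0):
    """Search for a high-scoring boolean feature mask with simulated annealing."""
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_features)]
    current_score = score(current)
    best, best_score = current[:], current_score
    temperature = t_start
    for _ in range(n_iter):
        candidate = current[:]
        i = rng.randrange(n_features)
        candidate[i] = not candidate[i]  # flip one feature in or out
        cand_score = score(candidate)
        delta = cand_score - current_score
        # Always accept improvements; accept worse moves with prob exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = current[:], current_score
        temperature *= cooling  # geometric cooling (assumed schedule)
    return best, best_score


# Hypothetical fitness: features 0, 2, and 5 are informative, and every
# selected feature carries a small cost. A real study would train and
# validate a classifier on the selected columns here instead.
INFORMATIVE = {0, 2, 5}


def toy_score(mask):
    gain = sum(0.1 for i, on in enumerate(mask) if on and i in INFORMATIVE)
    cost = sum(0.02 for on in mask if on)
    return 0.5 + gain - cost


best_mask, best_score = anneal_feature_subset(8, toy_score)
print([i for i, on in enumerate(best_mask) if on], round(best_score, 3))
```

Accepting some worse moves early on is what lets the search escape local optima. As the temperature falls, the procedure behaves more and more like greedy hill climbing.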