• Title/Summary/Keyword: Predicting Income Algorithm


Proposal of An Artificial Intelligence Farm Income Prediction Algorithm based on Time Series Analysis

  • Jang, Eun-Jin;Shin, Seung-Jung
    • International Journal of Advanced Smart Convergence / v.10 no.4 / pp.98-103 / 2021
  • As the need for food resources has increased both domestically and internationally, Korea is expanding its support for the agricultural sector to stabilize food supply and demand. However, according to recent media articles, the biggest problem in rural communities is an unstable profit structure. Confirming the profit structure requires clearly prepared profit forecast data, but farmers and future returnees lack auxiliary data for predicting farm income. Therefore, this paper analyzes data from the past 15 years through time series analysis and proposes an artificial intelligence farm income prediction algorithm that can predict future farm household income. The proposed algorithm is expected to serve as auxiliary data for predicting farm profits.
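The abstract does not say which time series method the authors use. As a rough illustration only, the sketch below fits a least-squares linear trend to 15 yearly observations and extrapolates one year ahead. The function names, the year index, and the income values are all hypothetical, and the data are synthetic; this is not the proposed algorithm.

```python
def fit_linear_trend(periods, values):
    """Ordinary least-squares fit of values = intercept + slope * period."""
    n = len(periods)
    mean_x = sum(periods) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in periods)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(periods, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(periods, values, target_period):
    """Extrapolate the fitted trend to a future period."""
    slope, intercept = fit_linear_trend(periods, values)
    return intercept + slope * target_period


# Synthetic example: 15 yearly observations (year index 1..15), not real data.
periods = list(range(1, 16))
income = [3000 + 50 * (t - 1) for t in periods]

if __name__ == "__main__":
    print(forecast(periods, income, 16))  # forecast for the 16th year
```

A linear trend is the simplest possible baseline; real farm income series would likely need a model that also handles seasonality and shocks.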

Income prediction of apple and pear farmers in Chungnam area by automatic machine learning with H2O.AI

  • Hyundong, Jang;Sounghun, Kim
    • Korean Journal of Agricultural Science / v.49 no.3 / pp.619-627 / 2022
  • In Korea, apples and pears are among the most important agricultural products for farmers' income. Farmers generally make decisions at various stages to maximize their income, but they do not always know which option is best. Many previous studies have tried to solve this problem by predicting farmers' income structure, and researchers are still exploring better approaches. Machine learning is currently gaining attention as one of the new approaches to predicting farmers' income. Machine learning is a methodology that uses algorithms capable of learning from data on their own, and its performance is improving as computer science advances. The purpose of this study is to predict the income structure of apple and pear farming using the automatic machine learning solution H2O.AI and to present some implications for apple and pear farmers. Because H2O.AI searches for the best solution automatically, it can save time and effort compared with conventional machine learning tools such as scikit-learn. The main findings are as follows. First, apple farmers should increase their gross income, rather than reduce the cost of growing apples, to maximize their income. In particular, apple farmers mainly have to increase production in order to obtain more gross income. As a second-best option, apple farmers should decrease labor and other costs. Second, pear farmers should also increase their gross income to maximize their income, but they have to increase the price of pears rather than the production of pears. As a second-best option, pear farmers can decrease labor and other costs.
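The recommendations in this abstract rest on how farm income breaks down into gross income and costs. A minimal sketch, assuming the usual accounting identity (income = price × production − labor cost − other costs), which the abstract implies but does not state. The function names and every figure below are hypothetical and chosen only for illustration.

```python
def farm_income(price, production, labor_cost, other_cost):
    """Net income = gross income (price * production) minus costs."""
    gross_income = price * production
    return gross_income - labor_cost - other_cost


# Hypothetical baseline figures, not taken from the study.
baseline = dict(price=2.0, production=100.0, labor_cost=50.0, other_cost=30.0)


def effect_of_increase(lever, pct):
    """Change in net income when a single lever rises by pct percent."""
    changed = dict(baseline)
    changed[lever] = baseline[lever] * (1 + pct / 100)
    return farm_income(**changed) - farm_income(**baseline)


if __name__ == "__main__":
    for lever in baseline:
        print(lever, round(effect_of_increase(lever, 10), 2))
```

In this toy identity a 10% rise in price and a 10% rise in production have the same effect; which lever matters more for apples or pears in practice is the empirical question the study addresses with H2O.AI.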

Predicting the Subsequent Childbirth Intention of Married Women with One Child to Solve the Low Birth Rate Problem in Korea: Application of a Machine Learning Method (저출생 문제해결을 위한 한자녀 기혼여성의 후속 출산의향 예측: 머신러닝 방법의 적용)

  • Hyo Jeong Jeon
    • Korean Journal of Childcare and Education / v.20 no.2 / pp.127-143 / 2024
  • Objective: The purpose of this study is to develop a machine learning model to predict the subsequent childbirth intention of married women with one child, aiming to address the low birth rate problem in Korea. The study uses data from the 2021 Family and Childbirth Survey conducted by the Korea Institute for Health and Social Affairs. Methods: A prediction model was developed using the Random Forest algorithm to predict the subsequent childbirth intention of married women with one child. This algorithm was chosen for its advantages in prediction and generalization, and its performance was evaluated. Results: The importance of the variables influencing the Random Forest prediction model was confirmed. With the exception of the presence or absence of leave before and after childbirth, most variables contributed to predicting the intention to have a subsequent child. Notably, variables such as the mother's age, number of children planned at the time of marriage, average monthly household income, spouse's share of childcare burden, mother's weekday housework hours, and presence or absence of spouse's maternity leave emerged as relatively important predictors of subsequent childbirth intention.
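The abstract does not state how variable importance was computed for the Random Forest. One common, model-agnostic option is permutation importance: shuffle one variable and measure how much accuracy drops. The sketch below illustrates that idea with a hypothetical stand-in rule (`toy_predict`) instead of a trained Random Forest, and with made-up rows; the feature names only echo those mentioned in the abstract.

```python
import random


def accuracy(predict, rows, labels):
    """Share of rows whose predicted label matches the true label."""
    hits = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return hits / len(rows)


def permutation_importance(predict, rows, labels, feature, seed=0):
    """Drop in accuracy after shuffling one feature's values across rows."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    shuffled = [row[feature] for row in rows]
    rng.shuffle(shuffled)
    permuted = [dict(row, **{feature: value}) for row, value in zip(rows, shuffled)]
    return baseline - accuracy(predict, permuted, labels)


def toy_predict(row):
    """Hypothetical stand-in model: uses mother_age only, ignores income."""
    return 1 if row["mother_age"] < 35 else 0


# Made-up rows for illustration; not survey data.
rows = [
    {"mother_age": 29, "household_income": 300},
    {"mother_age": 41, "household_income": 520},
    {"mother_age": 33, "household_income": 410},
    {"mother_age": 38, "household_income": 280},
]
labels = [toy_predict(row) for row in rows]

if __name__ == "__main__":
    for name in ("mother_age", "household_income"):
        print(name, permutation_importance(toy_predict, rows, labels, name))
```

A variable the model never consults, like `household_income` in this toy rule, gets an importance of exactly zero, which mirrors how a low-importance variable would show up in such an analysis.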

Development and application of prediction model of hyperlipidemia using SVM and meta-learning algorithm (SVM과 meta-learning algorithm을 이용한 고지혈증 유병 예측모형 개발과 활용)

  • Lee, Seulki;Shin, Taeksoo
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.111-124 / 2018
  • This study aims to develop a classification model for predicting the occurrence of hyperlipidemia, one of the chronic diseases. Prior studies applying data mining techniques to disease prediction can be classified into studies that design models for predicting cardiovascular disease and studies that compare disease prediction results. In the foreign literature, studies predicting cardiovascular disease were predominant among those using data mining techniques. Domestic studies were not much different, although they focused mainly on hypertension and diabetes. Since hyperlipidemia, a chronic disease like hypertension and diabetes, is also of high importance, this study selected hyperlipidemia as the disease to be analyzed. We also developed a model for predicting hyperlipidemia using SVM and meta-learning algorithms, which are already known to have excellent predictive power. To achieve the purpose of this study, we used the Korea Health Panel 2012 data set. The Korea Health Panel produces basic data on the level of health expenditure, health level, and health behavior, and has conducted an annual survey since 2008. In this study, 1,088 patients with hyperlipidemia were randomly selected from the hospitalized, outpatient, emergency, and chronic disease data of the Korea Health Panel in 2012, and 1,088 nonpatients were also randomly extracted. A total of 2,176 people were selected for the study. Three methods were used to select input variables for predicting hyperlipidemia. First, a stepwise method was performed using logistic regression. Among the 17 variables, the categorical variables (except for length of smoking) were expressed as dummy variables, which are treated as separate variables relative to the reference group, and these variables were analyzed. 
Six variables (age, BMI, education level, marital status, smoking status, gender), excluding income level and smoking period, were selected based on a significance level of 0.1. Second, C4.5, a decision tree algorithm, was used. The significant input variables were age, smoking status, and education level. Finally, genetic algorithms were used. In SVM, the input variables selected by genetic algorithms consisted of 6 variables: age, marital status, education level, economic activity, smoking period, and physical activity status. In the artificial neural network, the input variables selected by genetic algorithms consisted of 3 variables: age, marital status, and education level. Based on the selected input variables, we compared SVM, the meta-learning algorithm, and other prediction models for hyperlipidemia patients, and compared their classification performance using TP rate and precision. The main results of the analysis are as follows. First, the accuracy of the SVM was 88.4% and the accuracy of the artificial neural network was 86.7%. Second, the accuracy of classification models using the input variables selected through the stepwise method was slightly higher than that of classification models using all variables. Third, the precision of the artificial neural network was higher than that of SVM when only the three input variables selected by decision trees were used. For classification models based on the input variables selected through the genetic algorithm, the classification accuracy of SVM was 88.5% and that of the artificial neural network was 87.9%. Finally, this study indicated that stacking, the meta-learning algorithm proposed in this study, performs best when it uses the predicted outputs of SVM and MLP as input variables of SVM, which serves as the meta classifier. The purpose of this study was to predict hyperlipidemia, one of the representative chronic diseases. 
To do this, we used SVM and meta-learning algorithms, which are known to have high accuracy. As a result, the classification accuracy for hyperlipidemia with stacking as the meta learner was higher than that of other meta-learning algorithms. However, the predictive performance of the meta-learning algorithm proposed in this study is the same as that of SVM, which had the best performance (88.6%) among the single models. The limitations of this study are as follows. First, various variable selection methods were tried, but most variables used in the study were categorical dummy variables. With a large number of categorical variables, the results may differ from those obtained with continuous variables, because models such as decision trees can be better suited to categorical variables than general models such as neural networks. Despite these limitations, this study is significant in predicting hyperlipidemia with hybrid models such as meta-learning algorithms, which have not been studied previously. The improvement in model accuracy obtained by applying various variable selection techniques is also meaningful. In addition, it is expected that our proposed model will be effective for the prevention and management of hyperlipidemia.
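The abstract compares classifiers using TP rate and precision. As a reminder of what those two measures count, here is a minimal sketch using their standard definitions (TP rate = TP / (TP + FN), precision = TP / (TP + FP)). The function name and the label vectors are hypothetical and are not results from the study.

```python
def tp_rate_and_precision(y_true, y_pred, positive=1):
    """Return (TP rate, precision) for the given positive class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against empty denominators (no positives / no positive predictions).
    tp_rate = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return tp_rate, precision


# Hypothetical labels: 1 = hyperlipidemia patient, 0 = nonpatient.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]

if __name__ == "__main__":
    print(tp_rate_and_precision(y_true, y_pred))
```

With a balanced sample such as the 1,088 patients and 1,088 nonpatients described above, accuracy, TP rate, and precision can still diverge, which is one reason to report more than accuracy alone.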

Predicting Default Risk among Young Adults with Random Forest Algorithm (랜덤포레스트 모델을 활용한 청년층 차입자의 채무 불이행 위험 연구)

  • Lee, Jonghee
    • Journal of Family Resource Management and Policy Review / v.26 no.3 / pp.19-34 / 2022
  • There are growing concerns about debt insolvency among youth and low-income households. The deterioration in household debt quality among young people is due to a combination of sluggish employment, an increase in student loan burden, and an increase in high-interest loans from the secondary financial sector. The purpose of this study was to explore the possibility of household debt default among young borrowers in Korea and to identify the factors affecting this possibility. This study utilized the 2021 Household Finance and Welfare Survey and used a random forest algorithm to comprehensively analyze factors related to default risk among young adults. This study presented the importance index and partial dependence charts of the major determinants. It found that the ratio of debt to assets (DTA), medical costs, the household default risk index (HDRI), communication costs, and housing costs were the focal independent variables.
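The partial dependence charts mentioned in the abstract show how the average prediction changes as one variable is varied while the others keep their observed values. A minimal sketch of that computation, using a hypothetical linear scoring rule (`toy_default_score`) in place of the study's random forest. The variable names, coefficients, and rows are assumptions made for illustration.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the original row is untouched
            modified[feature] = value  # override only the feature of interest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_default_score(row):
    """Hypothetical stand-in model, not the fitted random forest."""
    return 0.5 * row["dta"] + 0.1 * row["housing_cost"]


# Made-up households for illustration; not survey data.
rows = [
    {"dta": 0.2, "housing_cost": 1.0},
    {"dta": 0.6, "housing_cost": 3.0},
]

if __name__ == "__main__":
    grid = [0.0, 0.5, 1.0]
    print(partial_dependence(toy_default_score, rows, "dta", grid))
```

Because the stand-in rule is linear, the resulting curve is a straight line in DTA; a random forest would typically produce a stepwise or nonlinear curve instead.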