• Title/Summary/Keyword: logistic model (로지스틱모델)

A Study of Freshman Dropout Prediction Model Using Logistic Regression with Shift-Sigmoid Classification Function (시프트 시그모이드 분류함수를 가진 로지스틱 회귀를 이용한 신입생 중도탈락 예측모델 연구)

  • Kim Donghyung
    • Journal of Korea Society of Digital Industry and Information Management / v.19 no.4 / pp.137-146 / 2023
  • Freshman dropout is a serious issue for university finances, and the dropout rate is also one of the important indicators in external evaluations of universities. Universities therefore need to predict which students are likely to drop out and target them with dropout prevention programs. This paper proposes a method for predicting such students in advance. The method applies logistic regression with a shift sigmoid classification function, using only quantitative data from the first semester of the first year, which most universities have. Because it uses the shift sigmoid function as a classification function, the number of prediction subjects and the prediction accuracy can be selected. In the experiment, when the proposed algorithm was applied, the number of predicted dropout subjects varied from 100% to 20% of the actual number of dropout subjects, and the prediction accuracy ranged from 75% to 98%.
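
One possible reading of the shift sigmoid idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the "shift" is a translation of the sigmoid along its input axis, and the names `shift_sigmoid`, `predict_dropouts`, `shift`, and `cutoff`, as well as the example scores, are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def shift_sigmoid(z: float, shift: float) -> float:
    """Sigmoid translated along the z-axis by `shift` (assumed form)."""
    return sigmoid(z - shift)


def predict_dropouts(scores, shift, cutoff=0.5):
    """Flag a student when the shifted sigmoid of the linear score reaches the cutoff."""
    return [shift_sigmoid(z, shift) >= cutoff for z in scores]


# Hypothetical linear scores (w . x + b) from an already fitted logistic regression.
scores = [-2.0, -0.5, 0.3, 1.2, 2.5, 4.0]

for shift in (0.0, 1.0, 3.0):
    flagged = sum(predict_dropouts(scores, shift))
    print(f"shift={shift}: {flagged} students flagged")
```

Under this assumption, a larger shift flags fewer students, keeping only those with the highest scores, which is consistent with the abstract's trade-off between the number of predicted subjects and prediction accuracy.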

A study on the prediction of Korean NPL market return (한국 NPL시장 수익률 예측에 관한 연구)

  • Lee, Hyeon Su;Jeong, Seung Hwan;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.123-139 / 2019
  • The Korean NPL market was formed by the government and foreign capital shortly after the 1997 IMF crisis. However, the market has a short history, since bad debt began to increase after the global financial crisis in 2009 owing to the recession in the real economy. NPLs have become a major investment target in recent years, as domestic capital market investors began to enter the NPL market in earnest. Although the domestic NPL market has received considerable attention because of its recent overheating, research on it remains limited because the history of capital market investment in the domestic NPL market is short. In addition, declining profitability and price fluctuations driven by the real estate business cycle call for decision-making based on more scientific and systematic analysis. In this study, we propose a prediction model that determines whether the benchmark yield will be achieved, using NPL market data in line with this market demand. To build the model, we used Korean NPL data covering about four years, from December 2013 to December 2017, for a total of 2,291 items. From 11 candidate variables describing the characteristics of the real estate, only those related to the dependent variable were selected as independent variables. Variable selection used one-to-one t-tests, stepwise logistic regression, and decision trees. Seven independent variables were selected: purchase year, SPC (Special Purpose Company), municipality, appraisal value, purchase cost, OPB (Outstanding Principal Balance), and HP (Holding Period). The dependent variable is a binary variable that indicates whether the benchmark rate of return is reached.
A binary dependent variable was chosen because models predicting a binary variable are more accurate than models predicting a continuous variable, and the accuracy of these models is directly related to their effectiveness. In addition, for a special purpose company, the main concern is whether or not to purchase the property, so knowing whether a certain level of return will be achieved is enough to make a decision. To ascertain whether 12%, the standard rate of return used in the industry, is a meaningful reference value, we constructed and compared predictive models with the dependent variable calculated at different threshold values. As a result, the predictive model built with the dependent variable calculated at the 12% standard rate of return had the best average hit ratio, 64.60%. To propose an optimal prediction model based on the determined dependent variable and the 7 independent variables, we built and compared prediction models using five methodologies: discriminant analysis, logistic regression analysis, decision tree, artificial neural network, and genetic algorithm linear model. To do this, 10 sets of training and testing data were extracted using the 10-fold validation method. After building the models on these data, the hit ratios of the sets were averaged and the performance was compared. The average hit ratios of the prediction models built with discriminant analysis, logistic regression, decision tree, artificial neural network, and genetic algorithm linear model were 64.40%, 65.12%, 63.54%, 67.40%, and 60.51%, respectively, confirming that the artificial neural network model performed best. This study shows that using the 7 independent variables with an artificial neural network prediction model is effective for the future NPL market.
The proposed model predicts in advance whether new items will achieve the 12% return, which will help special purpose companies make investment decisions. Furthermore, we anticipate that the NPL market will become more liquid as transactions proceed at appropriate prices.
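
The evaluation procedure described above (extract 10 train/test sets, compute the hit ratio on each, then average) can be sketched roughly as below. This is only an illustration under stated assumptions: the helper names are hypothetical, and the majority-class "model" is a placeholder standing in for the five methods compared in the study, not any of them.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def hit_ratio(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validated_hit_ratio(X, y, fit, predict, k=10, seed=0):
    """Average the hit ratio over k held-out folds."""
    ratios = []
    for test_idx in k_fold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict(model, [X[i] for i in test_idx])
        ratios.append(hit_ratio([y[i] for i in test_idx], preds))
    return mean(ratios)


# Placeholder model (assumption): always predict the majority training label.
def fit_majority(X, y):
    return max(set(y), key=y.count)


def predict_majority(model, X):
    return [model] * len(X)


# Synthetic labels: 1 = benchmark return reached, 0 = not reached.
X = list(range(100))
y = [1] * 70 + [0] * 30
print(cross_validated_hit_ratio(X, y, fit_majority, predict_majority))
```

Any classifier can be plugged in through the `fit` and `predict` callables, so the same loop would serve to compare several methods on identical folds.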

Development of Geospatial Simulation Framework for WebGIS-based Simulation System (WebGIS 기반의 시뮬레이션 시스템을 위한 지리공간 시뮬레이션 프레임워크 개발)

  • Lee, Seong-Kyu;Kim, Young-Seup;Choi, Chul-Uong;Suh, Yong-Chul
    • Spatial Information Research / v.18 no.5 / pp.119-131 / 2010
  • To use geospatial data, researchers must perform repetitive tasks such as data format analysis, reformatting, and map reprojection. To solve these problems, they build web-based simulation systems together with web developers. However, such systems are not developed efficiently because there is no appropriate simulation framework for web-based systems that use geospatial data. In this study, a geospatial simulation framework that can be applied effectively to web-based systems was designed and proposed. The framework is composed of 7 modules: web mapping service, GIS mapping, statistics, model, processing, graphics, and geospatial datasets. The effectiveness of the framework was verified through a case study of urban growth. With this framework, experts who do not specialize in geospatial information disciplines are expected to be able to build web-based systems using geospatial data easily.

Prediction of Elementary Students' Computer Literacy Using Neural Networks (신경망을 이용한 초등학생 컴퓨터 활용 능력 예측)

  • Oh, Ji-Young;Lee, Soo-Jung
    • Journal of The Korean Association of Information Education / v.12 no.3 / pp.267-274 / 2008
  • A neural network is a modeling technique useful for finding hidden patterns in data through a repetitive learning process and for predicting target values for new data. In this study, we built multilayer perceptron neural networks to predict students' computer literacy based on their personal characteristics, home and social environment, and academic records in other subjects. The prediction performance of the network was compared with that of a widely used prediction method, the regression model. Our experiments showed that personal characteristics best explained a student's computer proficiency level, whereas home and social environment features yielded the worst prediction accuracy of all. Moreover, the developed neural network model produced far more accurate predictions than the regression model.

Real-time Laying Hens Sound Analysis System using MFCC Feature Vectors

  • Jeon, Heung Seok;Na, Deayoung
    • Journal of the Korea Society of Computer and Information / v.26 no.3 / pp.127-135 / 2021
  • Large numbers of animals raised in very confined environments, such as laying hen houses, can be severely harmed by small environmental changes. Previously studied laying hen sound analysis systems are difficult to apply to actual laying hen houses because they consider only a limited set of situations. To solve this problem, this paper proposes a new laying hen sound analysis model that uses MFCC feature vectors. The model can detect 7 situations that occur in actual laying hen houses by analyzing 9 kinds of laying hen sounds. In the performance evaluation of the proposed model, the average AUC was 0.93, about 43% higher than that of the frequency feature analysis method.

Ensemble Machine Learning Model Based YouTube Spam Comment Detection (앙상블 머신러닝 모델 기반 유튜브 스팸 댓글 탐지)

  • Jeong, Min Chul;Lee, Jihyeon;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.5 / pp.576-583 / 2020
  • This paper proposes a technique for identifying spam comments on YouTube, which have recently seen tremendous growth. Because YouTube content can be monetized through advertising, spammers promote their own channels or videos on popular videos or leave comments unrelated to the video. YouTube operates its own spam blocking system, but it still fails to block such comments properly and efficiently. Therefore, we examined related studies on YouTube spam comment screening and conducted classification experiments with six machine learning techniques (decision tree, logistic regression, Bernoulli naive Bayes, random forest, support vector machine with linear kernel, and support vector machine with Gaussian kernel) and an ensemble model combining these techniques, using comment data from popular music videos by Psy, Katy Perry, LMFAO, Eminem, and Shakira.

Agent-Based COVID-19 Simulation Considering Dynamic Movement: Changes of Infections According to Detect Levels (동적 움직임 변화를 반영한 에이전트 기반 코로나-19 시뮬레이션: 접촉자 발견 수준에 따른 감염 변화)

  • Lee, Jongsung
    • Journal of the Korea Society for Simulation / v.30 no.1 / pp.43-54 / 2021
  • Since COVID-19 (Severe acute respiratory syndrome coronavirus type 2, SARS-CoV-2) was first discovered at the end of 2019, it has spread rapidly around the world. This study introduces an agent-based simulation model of the spread of COVID-19 in South Korea to investigate the effect of the detect level (contact tracing) on the spread of the virus. To develop the model, related data were aggregated and probability distributions were inferred from the data. The entire process of infection, quarantine, recovery, and death is described schematically, and the interaction of people is modeled based on traffic data. Composite logistic functions are used to represent people's compliance with government movement controls such as social distancing. To demonstrate the effect of the detect level on the spread of the virus, the detect level is varied from 0% to 100%. The results indicate that active contact tracing inhibits the spread of the virus and that the inhibitory effect increases geometrically as the detect level increases.

Analysis of Vehicle Demand by Fuel Types including Hydrogen Vehicles (수소차를 포함한 연료유형에 따른 자동차 수요 분석)

  • Yuhyeon Bak;Jee Young Kim;Yoon Lee
    • Environmental and Resource Economics Review / v.32 no.3 / pp.167-190 / 2023
  • This study analyzes the potential demand for automobiles by fuel type using survey data from Korea. The dependent variable of the model is the desired future fuel type: gasoline, diesel, hybrid, electricity, or hydrogen. The main explanatory variables are the respondents' demographic characteristics, their key reasons for choosing a vehicle fuel type, and environmental awareness extracted via principal component analysis (PCA). Using a multinomial logit (MNL) model, we find that respondents who consider fuel economy and infrastructure show higher demand for hybrid cars but lower demand for electric and hydrogen vehicles. The denial-types increase the demand for gasoline (petrol) and diesel (light oil) vehicles and decrease the demand for electric vehicles. The anxiety-types increase the demand for hybrid vehicles and decrease the demand for electric vehicles. In contrast, in the case of pro-types, the demand for diesel (light oil) hydrogen vehicles decreased.
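
For readers unfamiliar with the multinomial logit model, the sketch below shows how it turns per-alternative utilities into choice probabilities over the five fuel types. The utility values are made up for illustration and are not estimates from the study. In a fitted model each utility would be a linear function of the explanatory variables.

```python
import math

FUEL_TYPES = ["gasoline", "diesel", "hybrid", "electricity", "hydrogen"]


def mnl_probabilities(utilities):
    """Softmax over utilities: P(j) = exp(V_j) / sum_k exp(V_k)."""
    # Subtract the maximum utility for numerical stability; this leaves P(j) unchanged.
    v_max = max(utilities.values())
    exp_u = {fuel: math.exp(v - v_max) for fuel, v in utilities.items()}
    total = sum(exp_u.values())
    return {fuel: e / total for fuel, e in exp_u.items()}


# Hypothetical utilities for one respondent (illustrative values only).
example_utilities = {
    "gasoline": 0.4,
    "diesel": -0.3,
    "hybrid": 1.1,
    "electricity": 0.2,
    "hydrogen": -0.8,
}

for fuel, p in mnl_probabilities(example_utilities).items():
    print(f"{fuel}: {p:.3f}")
```

A positive coefficient on an explanatory variable raises the utility of that alternative and hence its probability relative to the others, which is how statements such as "increases the demand for hybrid cars" are read from an MNL model.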

The Prediction of Hypoxia Occurrence in Dangdong Bay (당동만의 빈산소 발생 예측)

  • Kang, Hoon;Kwon, Min Sun;You, Sun Jae;Kim, Jong Gu
    • Journal of the Korean Society of Marine Environment & Safety / v.26 no.1 / pp.65-74 / 2020
  • The purpose of this study was to investigate the physical characteristics of the marine environment and to predict the probability of hypoxia occurrence in Dangdong Bay. We predicted hypoxia with a logistic regression model based on observations of water temperature, salinity, and dissolved oxygen concentration. The analysis showed that the Brunt-Väisälä frequency was higher inside the bay, which is shallower than the deep bay entrance, owing to a lesser amount of fresh water inflow from the inner side of the bay, and that density stratification formed. The Richardson number and the Brunt-Väisälä frequency were occasionally very high from June to September; after September 2, however, the stratification tended to weaken. Analysis of the dissolved oxygen, water temperature, and salinity data observed in Dangdong Bay showed that the dissolved oxygen concentration in the bottom layer was affected mostly by the temperature difference (dt) between the surface layer and the bottom layer. When the depth difference (dz) was held fixed, the probability of hypoxia occurrence varied with the difference in water temperature. For depth differences (dz) of 5 m, 10 m, 15 m, and 20 m, the probability of hypoxia exceeded 70% at water temperature differences (dt) of 8℃, 7℃, 5℃, and 3℃, respectively. This indicates that the larger the depth difference in the bay, the smaller the temperature difference required for hypoxia to develop. In particular, hypoxia was found to occur where the water depth difference in the bay was approximately 20 m.
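
The relationship between depth difference and the temperature difference needed for hypoxia can be illustrated with a two-variable logistic model. This is a sketch under assumptions: the abstract does not give the fitted model, so the linear form in dt and dz and the coefficients `B0`, `B_DT`, and `B_DZ` below are placeholders chosen only to show the mechanics, and they are not values from the study.

```python
import math


def hypoxia_probability(dt, dz, b0, b_dt, b_dz):
    """Logistic model: P = 1 / (1 + exp(-(b0 + b_dt*dt + b_dz*dz)))."""
    z = b0 + b_dt * dt + b_dz * dz
    return 1.0 / (1.0 + math.exp(-z))


def dt_threshold(dz, p, b0, b_dt, b_dz):
    """Temperature difference at which the probability reaches p for a fixed dz."""
    logit_p = math.log(p / (1.0 - p))
    return (logit_p - b0 - b_dz * dz) / b_dt


# Placeholder coefficients (NOT from the paper), all chosen for illustration.
B0, B_DT, B_DZ = -6.0, 0.6, 0.2

for dz in (5, 10, 15, 20):
    dt_needed = dt_threshold(dz, 0.7, B0, B_DT, B_DZ)
    print(f"dz={dz} m: P >= 0.7 once dt >= {dt_needed:.2f} degrees C")
```

With positive coefficients on both dt and dz, the required temperature difference falls as the depth difference grows, which matches the qualitative pattern reported in the abstract.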

Regression Models for Determining the Patent Royalty Rates using Infringement Damage Awards and Inter-Partes Review Cases (손해배상액과 무효심판 판례를 이용한 특허 로열티율 산정 회귀모형)

  • Yang, Dong Hong;Kang, Gunseog;Kim, Sung-Chul
    • The Journal of Society for e-Business Studies / v.23 no.1 / pp.47-63 / 2018
  • This study suggested quantitative models to calculate a royalty rate as an important input factor of the relief from royalty method which has the characteristics of income approach method and market approach method that are generally used in the valuation of intangible assets. This study built a royalty rate regression model by referring to the patent infringement damages cases based on royalties, i.e., by using the royalty rates as a dependent variable and the patent indexes of the corresponding patent right as independent variables. Then, a logistic regression model was constructed by referring to inter-partes review cases of patent rights, i.e. by using not-unpatentable results as a dependent variable and the patent indexes of the corresponding patent right as independent variables. A final royalty rate was calculated by matching the royalty rate from the royalty rate regression model with a not-unpatentable probability from the logistic regression model. The suggested royalty rate was compared with the royalty rate obtained by the traditional methods to check its reliability.