• Title/Summary/Keyword: statistical learning model (통계적 학습 모형)

A Classification Analysis using Bayesian Neural Network (베이지안 신경망을 이용한 분류분석)

  • Hwang, Jin-Soo;Choi, Seong-Yong;Jun, Hong-Suk
    • Journal of the Korean Data and Information Science Society / v.12 no.2 / pp.11-25 / 2001
  • Several algorithms exist for classification, which models the relations, patterns, and rules present in data. We learn to classify objects on the basis of instances presented to us, not by being given a set of classification rules. Bayesian learning uses probability distributions to express our knowledge about unknown parameters and updates that knowledge by the laws of probability as evidence is gathered from data. Neural network models, in turn, are trained to predict an unknown category or quantity from known attributes. In this paper, we compare the misclassification error rates of the Bayesian neural network method with those of other classification algorithms (CHAID, CART, and QUEST) on several data sets.
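
The Bayesian updating described in this abstract can be illustrated with a minimal sketch. It assumes a Beta prior over an unknown success probability and binary observations, a conjugate pair chosen here only for simplicity; it does not reproduce the paper's Bayesian neural network. The function names and the data are hypothetical.

```python
def update_beta(alpha, beta, observations):
    """Return the Beta posterior parameters after observing 0/1 outcomes."""
    successes = sum(observations)
    failures = len(observations) - successes
    # Conjugate update: add successes to alpha and failures to beta.
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


prior = (1.0, 1.0)                 # uniform prior: no initial preference
data = [1, 0, 1, 1, 0, 1, 1, 1]    # hypothetical evidence: 6 successes, 2 failures
posterior = update_beta(*prior, data)

print("posterior parameters:", posterior)
print("posterior mean:", beta_mean(*posterior))
```

The prior mean of 0.5 moves toward the observed success rate as evidence accumulates, which is the "update our knowledge" step the abstract refers to.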

A Comparative Study on the Bankruptcy Prediction Power of Statistical Model and AI Models: MDA, Inductive,Neural Network (기업도산예측을 위한 통계적모형과 인공지능 모형간의 예측력 비교에 관한 연구 : MDA,귀납적 학습방법, 인공신경망)

  • 이건창
    • Journal of the Korean Operations Research and Management Science Society / v.18 no.2 / pp.57-81 / 1993
  • This paper analyzes the bankruptcy prediction power of three methods: Multivariate Discriminant Analysis (MDA), inductive learning, and neural networks. MDA is well known in accounting for its effectiveness in predicting bankruptcy. However, it requires rigorous statistical assumptions, so violating any one of them may produce biased outputs. We therefore propose two AI models as alternatives for bankruptcy prediction: inductive learning and neural networks. To compare the performance of these two AI models with that of MDA, we performed extensive experiments on a number of Korean bankruptcy cases. The experimental results show that the AI models proposed in this study yield more robust bankruptcy predictions that generalize better than those of conventional MDA.

Estimating home fire severity with statistical distributions (통계적 분포를 통한 주택 화재 심도 추정)

  • Yunjung Park;Inha Song;Soyoun Lee;Kwang Hyun Nam;Rosy Oh;Jaeyoun Ahn
    • The Korean Journal of Applied Statistics / v.36 no.6 / pp.591-618 / 2023
  • This paper evaluates the performance of various distribution assumptions in regression settings for estimating insurance loss. The gamma distribution is commonly used to handle the asymmetry property of loss distribution. However, recent studies highlight the significance of heavy-tailedness in loss distribution. Through an analysis of real home fire insurance data, we compare the effectiveness of different distribution assumptions in regression methods. Our findings show that the choice of parametric distributional assumption is crucial in determining premiums for various insurance products, including "excess of loss insurance" and "limit insurance". Additionally, we discuss practical considerations for applying our results in home fire insurance.

Comparing Monthly Precipitation Predictions Using Time Series Analysis with Deep Learning Models (시계열 분석 및 딥러닝 모형을 활용한 월 강수량 예측 비교)

  • Chung, Yeon-Ji;Kim, Min-Ki;Um, Myoung-Jin
    • KSCE Journal of Civil and Environmental Engineering Research / v.44 no.4 / pp.443-463 / 2024
  • This study sought to improve the accuracy of precipitation prediction using monthly precipitation data for each region over the past 30 years. Statistical models (ARIMA, SARIMA) and deep learning models (LSTM, GBM) were trained on monthly precipitation data from 1983 to 2012 for Gangneung, Gwangju, Daegu, Daejeon, Busan, Seoul, Jeju, and Chuncheon. Monthly precipitation was then predicted for the 10 years from 2013 to 2022. Most models predicted the precipitation trend accurately but tended to underpredict the actual precipitation. To address this, appropriate models were selected for each region and season. The LSTM model showed suitable results in Gangneung, Gwangju, Daegu, Daejeon, Busan, Seoul, Jeju, and Chuncheon. When forecasting power was compared by season, the SARIMA model performed particularly well in winter in Gangneung, Gwangju, Daegu, Daejeon, Seoul, and Chuncheon. The LSTM model outperformed the other models in summer, when precipitation is concentrated. In conclusion, closely analyzing regional and seasonal precipitation patterns and selecting the prediction model accordingly plays a critical role in increasing the accuracy of precipitation prediction.

Prediction of Severities of Rental Car Traffic Accidents using Naive Bayes Big Data Classifier (나이브 베이즈 빅데이터 분류기를 이용한 렌터카 교통사고 심각도 예측)

  • Jeong, Harim;Kim, Honghoi;Park, Sangmin;Han, Eum;Kim, Kyung Hyun;Yun, Ilsoo
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.16 no.4 / pp.1-12 / 2017
  • Traffic accidents are caused by a combination of human, vehicle, and environmental factors. For accidents involving rental cars, both the likelihood and the severity are expected to differ from those of other traffic accidents because the driver is in an unfamiliar environment. In this study, we developed a model that forecasts the severity of rental car accidents using a Naive Bayes classifier for Busan, Gangneung, and Jeju city. We also compared the prediction accuracy of two models: one using only the variables whose statistical significance was verified in a prior study, and another using all available variables. The comparison shows that prediction accuracy is higher when only the statistically significant variables are used.
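
As a rough illustration of how a Naive Bayes classifier scores a severity class, the sketch below implements a categorical Naive Bayes with add-one (Laplace) smoothing in plain Python. The feature values, labels, and function names are hypothetical and are not taken from the paper's data or variable set.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature index, label) -> value counts
    feature_values = defaultdict(set)     # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def predict(model, row):
    """Return the label with the highest log posterior score."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)   # log prior
        for i, value in enumerate(row):
            # Add-one smoothing keeps unseen values from zeroing the product.
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy records: (time of day, weather) -> accident severity.
rows = [("night", "rain"), ("night", "clear"), ("day", "clear"),
        ("day", "clear"), ("night", "rain"), ("day", "rain")]
labels = ["severe", "severe", "minor", "minor", "severe", "minor"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("night", "rain")))
print(predict(model, ("day", "clear")))
```

Comparing a model trained on a subset of columns with one trained on all columns, as the study does, would amount to calling `train_naive_bayes` on differently projected rows and measuring accuracy on held-out records.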

Analysis of the Mediating Effect of Academic Self-efficacy in the Effect of Motivation to Participate of Online Lifelong Education Using YouTube on Learning Flow (유튜브를 활용한 온라인 평생교육의 참여동기가 학습몰입에 미치는 영향에서 학업적 자기효능감의 매개효과 분석)

  • Kim, Tae-Rin
    • The Journal of the Korea Contents Association / v.22 no.5 / pp.527-541 / 2022
  • This study analyzed the structural relationship among motivation to participate, academic self-efficacy, and learning flow in online lifelong education through YouTube, whose learners are rapidly increasing due to the spread of COVID-19. An online survey of adult learners living in the metropolitan area was conducted from July 16 to 30, 2021. A total of 428 people participated, and 409 responses were analyzed after excluding 19 insincere ones. The main results are as follows. First, the fit of the research model was confirmed to be adequate for all analyses. Second, the coefficients and statistical significance of each path in the research model showed that motivation to participate in YouTube lifelong education had a positive effect on learning flow and academic self-efficacy, and that academic self-efficacy also had a positive effect on learning flow. Third, the effect of participation motivation on learning flow through academic self-efficacy was confirmed to be a statistically significant partial mediation. This study is meaningful in that it verified the structural relationship among participation motivation, academic self-efficacy, and learning flow in online lifelong education using YouTube, reflecting the digital transformation of lifelong education due to COVID-19. Reflecting the need to re-regulate lifelong education after COVID-19 and the trend of digital transformation, we discuss, on the basis of the research results, how lifelong education can enhance learners' motivation to participate and strengthen learning flow with academic self-efficacy as a mediator.

Construction of a Short-term Time-series Prediction Model for Analysis of Return Flow of Residential Water (생활용수 회귀수량의 분석을 위한 시계열 단기 예측모형 구축)

  • Lee, Seungyeon;Lee, Sangeun
    • KSCE Journal of Civil and Environmental Engineering Research / v.43 no.6 / pp.763-774 / 2023
  • The water availability in a river is related to the return flow of residential water. However, it is still difficult to determine the exact return flow. In this study, the residential water-cycle system is defined as a process consisting of water inflow, water transfer, and water outflow. The study area is Hampyeong-gun, Jeollanam-do, which is modeled as a single inflow leading to a single outflow through the water-cycle system after classifying measurement points as complete or incomplete. Time-series prediction models (an ARIMA model and a TFM) are built from 6 years of daily inflow and outflow data. Inflow and outflow are predicted by dividing the data into training and test periods. Both models show that short-term prediction is feasible, yielding stable residuals and statistically significant results, and they provide a preliminary form of the water-cycle system. As further work, we suggest predicting the actual return flow of the target basin and supporting efficient water operation by adding input factors and selecting the optimal model.

A Sparse Data Preprocessing Using Support Vector Regression (Support Vector Regression을 이용한 희소 데이터의 전처리)

  • Jun, Sung-Hae;Park, Jung-Eun;Oh, Kyung-Whan
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.6 / pp.789-792 / 2004
  • In fields such as web mining, bioinformatics, and statistical data analysis, missing values occur in many different forms. These missing values make training data sparse. Most commonly, missing values are replaced by predicted values such as the mean or the mode. More advanced imputation methods, such as the conditional mean, tree methods, and Markov chain Monte Carlo algorithms, can also be used. However, general imputation models lose predictive accuracy as the ratio of missing values in the training data increases. Moreover, the number of available imputations becomes limited as the missing ratio increases. To address this problem, we propose using statistical learning theory to preprocess missing values, specifically Vapnik's support vector regression. The proposed method can be applied to sparse training data. We verified the performance of our model using data sets from the UCI Machine Learning Repository.

An Outlier Data Analysis using Support Vector Regression (Support Vector Regression을 이용한 이상치 데이터분석)

  • Jun, Sung-Hae
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.6 / pp.876-880 / 2008
  • Outliers are observations that are much larger or smaller than most observations in a given data set. They arise from various sources. The results of an analysis may depend heavily on outliers, so data analysis is generally done after removing them. However, in data mining applications such as fraud detection and intrusion detection, outliers are kept in the training data because they carry crucial information. Simple and multiple regression models need outliers to be eliminated from the training data, using standardized and studentized residuals, in order to build a good model. In this paper, we use support vector regression (SVR), which is based on statistical learning theory, to analyze data with outliers in regression. We verify the improved performance of our approach through experiments on synthetic data sets.

A Study on the Prediction of Power Demand for Electric Vehicles Using Exponential Smoothing Techniques (Exponential Smoothing기법을 이용한 전기자동차 전력 수요량 예측에 관한 연구)

  • Lee, Byung-Hyun;Jung, Se-Jin;Kim, Byung-Sik
    • Journal of Korean Society of Disaster and Security / v.14 no.2 / pp.35-42 / 2021
  • To produce electric vehicle demand forecasts, an important input to plans for expanding electric vehicle charging facilities, we propose a model for predicting electric vehicle demand using exponential smoothing. As input data for the model, the monthly power demand of cities and counties, the monthly number of electric vehicle charging stations, and monthly electric vehicle registration data were applied as independent variables. To verify the accuracy of the electric vehicle power demand prediction model, we compared the statistical methods Exponential Smoothing (ETS) and ARIMA, which had error rates of 12% and 21% respectively, confirming that the ETS model presented in this paper is more accurate by 9 percentage points. The model is expected to be used for operation and management, starting from the planning of charging station installation for electric vehicles.
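
The core recursion behind exponential smoothing can be sketched as follows. This is simple exponential smoothing, the most basic member of the ETS family, shown under the assumption of a fixed smoothing weight `alpha`; it is not the paper's fitted model, and the demand series is hypothetical. The last two lines only restate the error-rate gap reported in the abstract.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed levels; the last level is the one-step-ahead forecast."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]          # initialize the level with the first observation
    levels = [level]
    for value in series[1:]:
        # New level = weighted average of the new observation and the old level.
        level = alpha * value + (1 - alpha) * level
        levels.append(level)
    return levels


demand = [100.0, 110.0, 120.0, 130.0]   # hypothetical monthly power demand
levels = simple_exponential_smoothing(demand, alpha=0.5)
forecast = levels[-1]
print("smoothed levels:", levels)
print("next-month forecast:", forecast)

# Error rates reported in the abstract, in percent.
ets_error, arima_error = 12, 21
print("gap in percentage points:", arima_error - ets_error)
```

A larger `alpha` tracks recent observations more closely, while a smaller one smooths more heavily; full ETS models add trend and seasonal components on top of this level update.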