• Title/Abstract/Keywords: In-Sample Prediction

Default Prediction of Automobile Credit Based on Support Vector Machine

  • Chen, Ying;Zhang, Ruirui
    • Journal of Information Processing Systems / v.17 no.1 / pp.75-88 / 2021
  • Automobile credit business has developed rapidly in recent years, and defaults occur frequently. Credit default brings great losses to automobile financial institutions, so successfully predicting automobile credit default is of great significance. First, records with missing values are deleted; then a random forest is used for feature selection, and the sample data are randomly grouped. Finally, six prediction models are constructed: support vector machine (SVM), random forest, k-nearest neighbor (KNN), logistic regression, decision tree, and artificial neural network (ANN). The results show that all six machine learning models can be used to predict automobile credit default. Among them, the decision tree has the highest accuracy, 0.79, but SVM has the best overall performance. Random grouping can also improve the efficiency of model operation to a certain extent, especially for SVM.

On-line Prediction Algorithm for Non-stationary VBR Traffic (Non-stationary VBR 트래픽을 위한 동적 데이타 크기 예측 알고리즘)

  • Kang, Sung-Joo;Won, You-Jip;Seong, Byeong-Chan
    • Journal of KIISE:Information Networking / v.34 no.3 / pp.156-167 / 2007
  • In this paper, we develop a model-based prediction algorithm for Variable-Bit-Rate (VBR) video traffic with a regular Group of Pictures (GOP) pattern. We use a multiplicative ARIMA process called GOP ARIMA (ARIMA for Group Of Pictures) as the base stochastic model. The Kalman filter based prediction algorithm consists of two processes: GOP ARIMA modeling and prediction. In the performance study, we produce three video traces (news, drama, sports) and compare the accuracy of three prediction schemes: Kalman filter based prediction, linear prediction, and double exponential smoothing. The proposed prediction algorithm yields higher prediction accuracy than the other two. We also show that confidence interval analysis can effectively detect scene changes in the sample video sequence. The Kalman filter based prediction algorithm proposed in this work makes significant contributions to various aspects of network traffic engineering and resource allocation.
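Double exponential smoothing, one of the baseline schemes named in this abstract, can be sketched as follows. This is a minimal illustration that assumes Holt's two-parameter (level and trend) form; the abstract does not say which variant the authors used, and the function name, the initialization, and the `alpha`/`beta` defaults are hypothetical choices.

```python
def holt_forecasts(series, alpha=0.5, beta=0.5):
    """One-step-ahead forecasts from Holt's linear-trend smoothing.

    Assumption: level and trend are initialized from the first two
    observations, so the first forecast targets series[2]. The last
    element of the result is the forecast for the next unseen value.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[1]
    trend = series[1] - series[0]
    forecasts = []
    for x in series[2:]:
        # Predict x before it is observed.
        forecasts.append(level + trend)
        # Update the smoothed level, then the smoothed trend.
        new_level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    # Forecast for the value that follows the end of the series.
    forecasts.append(level + trend)
    return forecasts
```

Calling `holt_forecasts(frame_sizes)` on a list of per-frame sizes (a hypothetical input) returns one forecast for each observation from the third onward, plus one for the next frame. Comparing those forecasts with the observed sizes gives an error measure for this baseline.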

Protein Tertiary Structure Prediction Method based on Fragment Assembly

  • Lee, Julian;Kim, Seung-Yeon;Joo, Kee-Hyoung;Kim, Il-Soo;Lee, Joo-Young
    • Proceedings of the Korean Society for Bioinformatics Conference / 2004.11a / pp.250-261 / 2004
  • A novel method for ab initio prediction of protein tertiary structures, PROFESY (PROFile Enumerating SYstem), is introduced. This method utilizes secondary structure prediction information and fragment assembly. The secondary structure prediction of proteins is performed with the PREDICT method, which uses PSI-BLAST to generate profiles and a distance measure in the pattern space. To predict the tertiary structure of a protein sequence, we assemble fragments from the fragment library constructed as a byproduct of PREDICT. The tertiary structure is obtained by minimizing the potential energy using the conformational space annealing method, which enables one to sample diverse low-lying minima of the energy function. We apply PROFESY to several proteins with known structures, where it performs well. We also participated in CASP5 and applied PROFESY to new fold targets for blind predictions. The results were quite promising, despite the fact that PROFESY was in its early stage of development. In particular, the PROFESY result is the best for the hardest target, T0161.

Design of Intra Prediction Circuit for HEVC and H.264 Multi-decoder Supporting UHD Images (UHD 영상을 지원하는 HEVC 및 H.264 멀티 디코더 용 인트라 예측 회로 설계)

  • Yu, Sanghyun;Cho, Kyeongsoon
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.12 / pp.50-56 / 2016
  • This paper proposes the architecture and design of an intra prediction circuit for a multi-decoder supporting UHD images. The proposed circuit supports not only the latest video compression standard, HEVC, but also H.264. In addition to the basic function of performing intra prediction, this circuit can perform the reference sample filter operation defined in the H.264 standard, and the smoothing and strong sample filter operations defined in the HEVC standard. We reduced the circuit size by sharing the circuit blocks for common operations and internal storage, and improved the circuit performance by parallel processing. The proposed circuit was described at RTL using Verilog HDL, and its functionality was verified using NC-Verilog of Cadence. The RTL circuit was synthesized using Design Compiler of Synopsys and a 130 nm standard cell library. The synthesized gate-level circuit consists of 69,694 gates and processes 100 to 280 frames per second for 4K-UHD HEVC images at the maximum operating frequency of 157 MHz.

Online Selective-Sample Learning of Hidden Markov Models for Sequence Classification

  • Kim, Minyoung
    • International Journal of Fuzzy Logic and Intelligent Systems / v.15 no.3 / pp.145-152 / 2015
  • We consider an online selective-sample learning problem for sequence classification, where the goal is to learn a predictive model using a stream of data samples whose class labels can be selectively queried by the algorithm. Given that there is a limit to the total number of queries permitted, the key issue is choosing the most informative and salient samples for their class labels to be queried. Recently, several aggressive selective-sample algorithms have been proposed under a linear model for static (non-sequential) binary classification. We extend the idea to hidden Markov models for multi-class sequence classification by introducing reasonable measures for the novelty and prediction confidence of the incoming sample with respect to the current model, on which the query decision is based. For several sequence classification datasets/tasks in online learning setups, we demonstrate the effectiveness of the proposed approach.

Prediction of changes in fine dust concentration using LSTM model

  • Lee, Gi-Seok;Lee, Sang-Hyun
    • International journal of advanced smart convergence / v.11 no.2 / pp.30-37 / 2022
  • Fine dust (PM10) generated in the climate and living environment has a close effect on the environment and a harmful effect on the human body. In this study, the LSTM model was applied to predict and analyze the effect of fine dust on Gwangju Metropolitan City in Korea. Input variables were selected through correlation analysis, and their values were used to confirm fine dust prediction performance. Fine dust measurement data were collected for the Gwangju Metropolitan City area. One year of data, from January to December 2021, was used for the model, and three months of data, from January to March 2022, were used as test data. When fine dust (PM10) and ultrafine dust (PM2.5) were predicted using the LSTM model, the RMSE was 4.61 and the test result value was as low as 4.37. This result is judged to be due to the contents of the one-year sample.
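The RMSE reported in this abstract is the root mean squared error between observed and predicted concentrations. A minimal sketch of that metric follows; the function name and argument names are illustrative and are not taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of the squared differences, then the square root.
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

RMSE is expressed in the same unit as the measured concentration, so a lower value means the predictions lie closer to the observations on average.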

Prediction of Oak Mushroom Prices Using Box-Jenkins Methodology (Box-Jenkins 모형을 이용한 표고버섯 가격예측)

  • Min, Kyung-Taek
    • Journal of Korean Society of Forest Science / v.95 no.6 / pp.778-783 / 2006
  • Price prediction is essential to investment and shipment decisions in oak mushroom cultivation. But predicting oak mushroom prices is very difficult because many uncertain factors affect demand and supply in the market. The Box-Jenkins methodology is a strong tool for price prediction, especially in the short term, using historical observations of a time series. In this paper, the Box-Jenkins methodology is applied to find a model to forecast future oak mushroom prices. An out-of-sample test was conducted to check the prediction accuracy. The result shows high accuracy except during market disturbance periods caused by unexpected weather changes, and reveals the usefulness of the model.
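An out-of-sample test of the kind mentioned in this abstract holds back the last observations, fits the model on the rest, and checks forecasts against the held-back values. The sketch below assumes the simplest Box-Jenkins family member, an AR(1) model fitted by least squares; the abstract does not state the model order the author selected, and all function names are hypothetical.

```python
def fit_ar1(train):
    """Least-squares fit of x_t = c + phi * x_{t-1}; returns (c, phi)."""
    xs, ys = train[:-1], train[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def out_of_sample_errors(series, n_test):
    """One-step-ahead forecast errors on the last n_test observations."""
    train, test = series[:-n_test], series[-n_test:]
    c, phi = fit_ar1(train)  # parameters come from the training part only
    previous = train[-1]
    errors = []
    for actual in test:
        forecast = c + phi * previous
        errors.append(actual - forecast)
        previous = actual  # roll forward with the observed value
    return errors
```

Because the parameters are estimated without the held-back values, the resulting errors measure how the model would have performed on data it had not seen.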

A Study on the Insolvency Prediction Model for Korean Shipping Companies

  • Myoung-Hee Kim
    • Journal of Navigation and Port Research / v.48 no.2 / pp.109-115 / 2024
  • To develop a shipping company insolvency prediction model, we sampled shipping companies that closed between 2005 and 2023. Each closed company was paired with a normal company of similar asset size. For this study, data of a total of 82 companies, including 42 closed companies and 42 general companies, were obtained. These data were randomly divided into a training set (2/3 of data) and a testing set (1/3 of data). Training data were used to develop the model, while test data were used to measure the accuracy of the model. In this study, a prediction model for Korean shipping insolvency was developed using financial ratio variables frequently used in previous studies. First, using the LASSO technique, the 24 independent variables were reduced to 9 main variables. Next, we set insolvent companies to 1 and normal companies to 0 and fitted logistic regression, LDA, and QDA models. As a result, the accuracy of the prediction model was 82.14% for the QDA model, 78.57% for the logistic regression model, and 75.00% for the LDA model. In addition, the variables 'Current ratio', 'Interest expenses to sales', 'Total assets turnover', and 'Operating income to sales' were identified as major variables affecting corporate insolvency.
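The random 2/3 and 1/3 division and the accuracy measure described in this abstract can be sketched as below. This is an illustration only: the function names, the fixed seed, and the rounding of the test-set size are assumptions, not details from the paper.

```python
import random


def split_train_test(items, test_fraction=1 / 3, seed=0):
    """Randomly split items into (train, test) lists.

    Assumption: the test size is len(items) * test_fraction, rounded
    to the nearest integer; the seed only makes the split repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of labels (1 = insolvent, 0 = normal) predicted correctly."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

The model is fitted on the `train` list only, and `accuracy` is then evaluated on predictions for the `test` list, so the reported figure reflects companies the model did not see during fitting.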

A Prediction Model Based on Relevance Vector Machine and Granularity Analysis

  • Cho, Young Im
    • International Journal of Fuzzy Logic and Intelligent Systems / v.16 no.3 / pp.157-162 / 2016
  • In this paper, a yield prediction model based on relevance vector machine (RVM) and a granular computing model (quotient space theory) is presented. With a granular computing model, massive and complex meteorological data can be analyzed at different layers of different grain sizes, and new meteorological feature data sets can be formed in this way. In order to forecast the crop yield, a grey model is introduced to label the training sample data sets, which also can be used for computing the tendency yield. An RVM algorithm is introduced as the classification model for meteorological data mining. Experiments on data sets from the real world using this model show an advantage in terms of yield prediction compared with other models.

Glucose Prediction in the Interstitial Fluid Based on Infrared Absorption Spectroscopy Using Multi-component Analysis

  • Kim, Hye-Jeong;Noh, In-Sup;Yoon, Gil-Won
    • Journal of the Optical Society of Korea / v.13 no.2 / pp.279-285 / 2009
  • Prediction of glucose concentration in the interstitial fluid (ISF) based on mid-infrared absorption spectroscopy was examined at the glucose fundamental absorption band of 1000 to 1500 cm⁻¹ (10 to 6.67 µm) using multi-component analysis. Simulated ISF samples were prepared by including four major ISF components. Sodium lactate had absorption spectra that interfere with those of glucose. The remaining components, NaCl, KCl, and CaCl₂, did not show any signatures. A preliminary experiment based on Design of Experiment, an optimization method, showed that sodium lactate influenced the prediction accuracy of glucose. For the main experiment, 54 samples were prepared in which glucose and sodium lactate concentrations varied independently. A partial least squares regression (PLSR) analysis was used to build calibration models. The prediction accuracy depended on the spectrum preprocessing method, and Mean Centering produced the best results. Depending on the calibration sample set, in which sodium lactate had different concentration levels, the standard error of prediction (SEP) of glucose ranged from 17.19 to 21.02 mg/dL.
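The abstract gives the absorption band both as wavenumbers (per centimeter) and as wavelengths (micrometers). The two are reciprocal: wavelength in µm equals 10,000 divided by wavenumber in cm⁻¹. A small sketch of the conversion follows; the function name is illustrative.

```python
def wavenumber_to_wavelength_um(wavenumber_per_cm):
    """Convert a wavenumber in cm^-1 to a wavelength in micrometers."""
    if wavenumber_per_cm <= 0:
        raise ValueError("wavenumber must be positive")
    # 1 cm = 10,000 um, so wavelength_um = 10,000 / wavenumber.
    return 1e4 / wavenumber_per_cm
```

Applying it to the band edges, 1000 cm⁻¹ maps to 10 µm and 1500 cm⁻¹ to about 6.67 µm, which matches the range quoted in the abstract.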