• Title/Summary/Keyword: Prediction of variables


Interval prediction on the sum of binary random variables indexed by a graph

  • Park, Seongoh;Hahn, Kyu S.;Lim, Johan;Son, Won
    • Communications for Statistical Applications and Methods
    • /
    • v.26 no.3
    • /
    • pp.261-272
    • /
    • 2019
  • In this paper, we propose a procedure for building a prediction interval for the weighted sum of dependent binary random variables indexed by a graph, accounting for the dependence among the variables. The problem is motivated by election prediction, including the Korean National Assembly and US presidential elections. Traditional and popular approaches to constructing a prediction interval for the number of seats won by major parties are the normal approximation based on the CLT and the Monte Carlo method, which generates many Bernoulli random variables under the assumption that the binary variables are independent and their success probabilities are known constants. In practice, however, survey results (including exit polls) are random and rarely independent of one another; more often they are spatially correlated. To take this into account, we suggest a spatial auto-regressive (AR) model for the surveyed success probabilities and propose a residual-based bootstrap procedure to construct the prediction interval for the sum of the binary outcomes. Finally, we apply the procedure to build prediction intervals for the number of legislative seats won by each party from the exit poll data in the $19^{th}$ and $20^{th}$ Korean National Assembly elections.
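
As an illustration of the two traditional baselines named in this abstract (not the authors' spatial AR and residual bootstrap procedure), the following minimal Python sketch computes a CLT-based interval and a Monte Carlo percentile interval for the sum of independent Bernoulli variables. The function names and the district probabilities are hypothetical and chosen only for the example.

```python
import math
import random


def normal_interval(probs, z=1.96):
    """CLT-based interval for the sum of independent Bernoulli(p_i) variables."""
    mean = sum(probs)
    sd = math.sqrt(sum(p * (1 - p) for p in probs))
    return mean - z * sd, mean + z * sd


def monte_carlo_interval(probs, n_sims=10000, level=0.95, seed=0):
    """Percentile interval from simulated sums of independent Bernoulli draws."""
    rng = random.Random(seed)
    sums = sorted(sum(rng.random() < p for p in probs) for _ in range(n_sims))
    lo_idx = int((1 - level) / 2 * n_sims)
    hi_idx = int((1 + level) / 2 * n_sims) - 1
    return sums[lo_idx], sums[hi_idx]


# Hypothetical win probabilities for 10 districts (not taken from the paper).
probs = [0.9, 0.8, 0.75, 0.6, 0.55, 0.5, 0.45, 0.3, 0.2, 0.1]
print("normal approximation:", normal_interval(probs))
print("Monte Carlo:", monte_carlo_interval(probs))
```

Both baselines treat the probabilities as known and the outcomes as independent, which is exactly the assumption the abstract argues is unrealistic for exit poll data.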

An Exploratory Study for Decreasing Error of Prediction Value of Recommended System on User Based

  • Lee, Hee-Choon
    • Journal of the Korean Data and Information Science Society
    • /
    • v.17 no.1
    • /
    • pp.77-86
    • /
    • 2006
  • This study investigates how the prediction error of a user-based collaborative recommender system relates to associated variables. To reduce this error, the research explores the effects of raters' demographic variables, Pearson's correlation coefficient, and sparsity on the prediction-related response pairs. The results present a comparative analysis between the existing prediction error and the conditioned one.
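
For readers unfamiliar with user-based collaborative filtering, the sketch below shows one common formulation: Pearson's correlation over co-rated items as the similarity weight, followed by a mean-centered weighted prediction. It is an assumption that the study used this particular variant; the user names and ratings are hypothetical.

```python
import math


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items both users rated (dicts: item -> rating)."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    cov = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    var_a = sum((ratings_a[i] - mean_a) ** 2 for i in common)
    var_b = sum((ratings_b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def predict(target, others, item):
    """Mean-centered, similarity-weighted prediction of target's rating for item."""
    target_mean = sum(target.values()) / len(target)
    num = den = 0.0
    for other in others:
        if item not in other:
            continue
        w = pearson(target, other)
        other_mean = sum(other.values()) / len(other)
        num += w * (other[item] - other_mean)
        den += abs(w)
    return target_mean if den == 0 else target_mean + num / den


# Hypothetical ratings on a 1-5 scale.
alice = {"m1": 5, "m2": 3, "m3": 4}
bob = {"m1": 4, "m2": 2, "m3": 3, "m4": 5}
carol = {"m1": 1, "m2": 5, "m3": 2, "m4": 1}

prediction = predict(alice, [bob, carol], "m4")
# Prediction error against a hypothetical true rating of 5.
print(prediction, abs(prediction - 5))
```

Sparsity enters through the `common` set: the fewer co-rated items two users share, the less reliable the correlation that weights the prediction.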


Cost Prediction Model using Qualitative Variables focused on Planning Phase for Public Multi-Housing Projects (정성변수를 고려한 공공아파트 기획단계 공사비 예측모델)

  • Ji, Soung-Min;Hyun, Chang-Taek;Moon, Hyun-Seok
    • Korean Journal of Construction Engineering and Management
    • /
    • v.13 no.2
    • /
    • pp.91-101
    • /
    • 2012
  • In the planning phase of public multi-housing projects, a methodology and criteria are required for fair cost prediction whose influence extends from the planning phase to the occupancy phase. Many studies have focused on cost prediction by multiple regression. However, there is no logical explanation of the influence of nonmetric variables on cost prediction in the planning phase. Accordingly, this research pursues a cost prediction model that includes nonmetric variables for use in the planning phase. The research has three steps: 1) finding the factors influencing construction cost and assigning variables for a multiple regression; 2) conducting a dummy regression analysis with nonmetric variables and validating the model by comparison with actual cost data; 3) developing the ratio of RC structure cost to wall structure cost by using the cost prediction model. The results establish a cost prediction process that includes the influence of nonmetric variables and the ratio of RC structure cost to wall structure cost.

Parameter Study of TEIS Model, Two-zone Model, and Stanitz's Equations (직렬 두요소 모델, 두 영역 모델, Stanitz 방정식에 대한 변수 연구)

  • Yoon, Sung-Ho;Baek, Je-Hyun
    • Proceedings of the KSME Conference
    • /
    • 2000.04b
    • /
    • pp.580-585
    • /
    • 2000
  • Recently, the TEIS model, the Two-zone model, and Stanitz's equations have often been used for off-design performance prediction of centrifugal compressors and pumps. The prediction results often agree well with experimental data. However, these models and equations contain some important variables that have a great influence on the overall performance prediction, and no systematic study of these variables has been performed. This paper therefore presents a systematic study of the influence of these variables on the overall performance prediction. Finally, the meaning of the variables and the research still to be undertaken are discussed.


A Study on the Predictability of Hospital's Future Cash Flow Information (병원의 미래 현금흐름 정보예측)

  • Moon, Young-Jeon;Yang, Dong-Hyun
    • Korea Journal of Hospital Management
    • /
    • v.11 no.3
    • /
    • pp.19-41
    • /
    • 2006
  • The objective of this study was to design a model that predicts the future cash flow of hospitals and, on the basis of that model, to support sound hospital management. Five cash flow measurement variables discussed in the financial accrual part were used: NI, NIDPR, CFO, CFAI, and CC. Balance sheet (B/S) related variables, income statement (P/L) related variables, and financial ratio related variables were used to measure cash flow. Five cash flow models were designed, and the martingale model and the market model were used to estimate their prediction ability. MAE and MER were used to compare the relative prediction ability of the cash flow models and the market model. To establish the superiority of the cash flow prediction model, cross-sectional data from 32 regional public hospitals were combined with 4-year time series data, and a pooled cross-sectional time series regression model was estimated by GLS. The analysis proceeded in three steps. First, each cash flow prediction model, the martingale model, and the market model were built, and MAE and MER were estimated. Second, a difference test was conducted on the MAE and MER of the cash flow prediction models. Third, after ranking the predictions of the cash flow models, the martingale model, and the market model by size, the Friedman test was used to evaluate prediction ability. The results were as follows: when a t-test was conducted to compare prediction ability among the models, the prediction error of the cash flow model was smaller than that of the martingale and market models, and the difference was significant, so the cash flow model was judged superior to the other models.
These results are conducive to hospital management in that they present a suitable prediction model of future cash flow, and they can provide valuable information for hospital policy decisions. The research is useful in the following ways: (1) for estimating a hospital's benefit, solvency, and capital supply ability for the replacement of fixed equipment; (2) for estimating a hospital's liquidity, solvency, and financial ability; (3) for assessing evaluation ability in hospital management. Further research should sample all hospitals, construct more advanced cash flow models by size and type of establishment, and study a unified model that relates the individual cash flow models.
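
The comparison described in this abstract can be sketched in Python under two assumptions: the martingale model is read in its usual sense (next period's expected value equals the current value), and MAE is the mean absolute error. MER is not defined in the abstract and is therefore left out. All cash flow figures and the model predictions are hypothetical.

```python
def martingale_forecast(series):
    """Naive (martingale) forecast: next period's value equals the current one."""
    return series[:-1]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical operating cash flows (CFO) for one hospital over four years.
cfo = [120.0, 135.0, 128.0, 150.0]
actual = cfo[1:]                      # years 2-4 are the forecast targets
naive = martingale_forecast(cfo)      # years 1-3 carried forward

# Hypothetical predictions from a fitted cash flow model for years 2-4.
model = [130.0, 131.0, 144.0]

mae_naive = mean_absolute_error(actual, naive)
mae_model = mean_absolute_error(actual, model)
print("martingale MAE:", mae_naive)
print("cash flow model MAE:", mae_model)
```

A lower MAE for the fitted model than for the martingale benchmark is the kind of evidence the abstract reports, before the difference is checked for significance.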


Assessing Distress Prediction Model toward Jeju District Hotels (제주지역 호텔기업 부실예측모형 평가)

  • Kim, Si-Joong
    • The Journal of Industrial Distribution & Business
    • /
    • v.8 no.4
    • /
    • pp.47-52
    • /
    • 2017
  • Purpose - This study investigates the average financial ratios of top and failed five-star hotels in the Jeju area, using a total of 14 financial ratio variables. It aims, first, to assess the financial ratios of first-class hotels in Jeju in order to establish variables; second, to develop a distress prediction model for first-class hotels in the Jeju district by using logit analysis; and third, to evaluate the distress prediction capacity of that model by using logit analysis. Research design, data, and methodology - The sample consisted of 14 financial ratios of 12 first-class hotels in the Jeju district, collected for the year 2015. The sample was analyzed by t-test, and the independent variables were chosen. This was an empirical study in which the distress prediction model was evaluated by logit analysis, focusing on differentiating between the top and failed hotels in the Jeju area by means of the 14 financial ratio variables. Results - The accuracy estimated by logit analysis indicated that the distress prediction capacity of the model was 83.3%. A t-test was used to extract, from the 14 chosen ratios, the factors that differentiated the top hotels in the Jeju area from the failed hotels. As a result, 5 variables were found to be statistically significant and were included in the logit analysis for discriminating between top and failed hotels in the Jeju area. Conclusions - The prediction capability of the distress prediction model was compared with earlier findings. A previous study showed the prediction capability of distress prediction models by logit analysis to range from 75% to 85%. In this study, the prediction capacity was 83.33%.
This was considered a high figure and fell within the range reported by the previous study. From a practical perspective, the assessed capacity of the distress prediction model for the top and failed hotels in the Jeju area was considered an important factor for future hotel appraisal.
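
To make the notion of "distress prediction capacity" concrete, the sketch below scores firms with a logit model and reports the share classified correctly at a 0.5 threshold. The coefficients, the two financial ratios, and the five example hotels are hypothetical; the abstract does not name the five significant variables or report fitted coefficients.

```python
import math


def logit_probability(features, coefficients, intercept):
    """Probability of distress from a logistic (logit) model."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


def accuracy(labels, probabilities, threshold=0.5):
    """Share of firms whose predicted class matches the observed class."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(labels, probabilities))
    return hits / len(labels)


# Hypothetical model: two ratios (current ratio, operating income to sales).
coefficients = [-1.5, -4.0]
intercept = 2.0

# Hypothetical hotels: (ratios, label) with 1 = failed, 0 = top.
hotels = [
    ([0.6, -0.10], 1),
    ([0.8, 0.02], 1),
    ([1.5, 0.05], 1),   # misclassified by this hypothetical model
    ([1.9, 0.12], 0),
    ([2.4, 0.20], 0),
]

labels = [label for _, label in hotels]
probabilities = [logit_probability(x, coefficients, intercept) for x, _ in hotels]
acc = accuracy(labels, probabilities)
print("classification accuracy:", acc)
```

With a sample of 12 hotels, a reported capacity of 83.33% would correspond to 10 correct classifications out of 12.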

A Study on the Insolvency Prediction Model for Korean Shipping Companies

  • Myoung-Hee Kim
    • Journal of Navigation and Port Research
    • /
    • v.48 no.2
    • /
    • pp.109-115
    • /
    • 2024
  • To develop a shipping company insolvency prediction model, we sampled shipping companies that closed between 2005 and 2023. In addition, a closed company and a normal company with similar asset size were selected as a paired sample. For this study, data of a total of 82 companies, including 42 closed companies and 42 general companies, were obtained. These data were randomly divided into a training set (2/3 of data) and a testing set (1/3 of data). Training data were used to develop the model while test data were used to measure the accuracy of the model. In this study, a prediction model for Korean shipping insolvency was developed using financial ratio variables frequently used in previous studies. First, using the LASSO technique, main variables out of 24 independent variables were reduced to 9. Next, we set insolvent companies to 1 and normal companies to 0 and fitted logistic regression, LDA and QDA model. As a result, the accuracy of the prediction model was 82.14% for the QDA model, 78.57% for the logistic regression model, and 75.00% for the LDA model. In addition, variables 'Current ratio', 'Interest expenses to sales', 'Total assets turnover', and 'Operating income to sales' were analyzed as major variables affecting corporate insolvency.
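
The random 2/3 to 1/3 split described above can be sketched as follows. The abstract gives the sample both as 82 companies and as 42 closed plus 42 normal companies; this sketch assumes 84 (42 + 42), under which the test set has 28 companies and the reported accuracies match whole numbers of correct classifications. The function name and seed are hypothetical.

```python
import random


def split_train_test(items, test_fraction=1 / 3, seed=0):
    """Randomly split items into (train, test) with the given test fraction."""
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Assumption: 42 closed (label 1) and 42 normal (label 0) companies.
companies = [("closed", i, 1) for i in range(42)] + [("normal", i, 0) for i in range(42)]
train, test = split_train_test(companies)
print(len(train), len(test))

# Accuracy implied by a given number of correct test-set classifications.
for n_correct in (23, 22, 21):
    print(n_correct, round(100 * n_correct / len(test), 2))
```

Under that assumption, 23, 22, and 21 correct classifications out of 28 give 82.14%, 78.57%, and 75.00%, the figures reported for QDA, logistic regression, and LDA.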

Evaluating Distress Prediction Models for Food Service Franchise Industry (외식프랜차이즈기업 부실예측모형 예측력 평가)

  • KIM, Si-Joong
    • Journal of Distribution Science
    • /
    • v.17 no.11
    • /
    • pp.73-79
    • /
    • 2019
  • Purpose: The purpose of this study was to compare the predictive power of distress prediction models based on discriminant analysis and logit analysis for the food service franchise industry in Korea. Research design, data and methodology: Forty-six food service franchise companies with high sales volume in 2017 were selected as the sample. Fourteen financial ratios were calculated from the 2017 statements of financial position and income statements of these forty-six companies. The fourteen financial ratios were used as sample data and analyzed by t-test, and as a result seven statistically significant independent variables were chosen. The distress prediction model was analyzed by logit analysis and multiple discriminant analysis. Results: The differences between the average values of the fourteen financial ratios of the forty-six companies were tested by t-test in order to extract the variables that distinguish top-level from failed food service franchise companies. The univariate test showed that the variables differentiating top-level from failed companies are income to stockholders' equity, operating income to sales, current ratio, net income to assets, cash flows from operating activities, growth rate of operating income, and total assets turnover. The statistical significance of these seven financial ratio variables was also confirmed by logit analysis and discriminant analysis.
Conclusions: The prediction accuracy of the distress prediction models in this study was 84.8% for the discriminant analysis method and 89.1% for the logit analysis method, indicating that logit analysis has higher distress predictability than discriminant analysis. Compared with previously reported distress prediction capability, which ranges from 75% to 85% for discriminant analysis and logit analysis, this study's prediction capacity of 84.8% in discriminant analysis and 89.1% in logit analysis is found to belong to the range of previous studies' prediction capacity and is considered high.

Method of Analyzing Important Variables using Machine Learning-based Golf Putting Direction Prediction Model (머신러닝 기반 골프 퍼팅 방향 예측 모델을 활용한 중요 변수 분석 방법론)

  • Kim, Yeon Ho;Cho, Seung Hyun;Jung, Hae Ryun;Lee, Ki Kwang
    • Korean Journal of Applied Biomechanics
    • /
    • v.32 no.1
    • /
    • pp.1-8
    • /
    • 2022
  • Objective: This study proposes a methodology to analyze important variables that have a significant impact on the putting direction prediction using a machine learning-based putting direction prediction model trained with IMU sensor data. Method: Putting data were collected using an IMU sensor measuring 12 variables from 6 adult males in their 20s at K University who had no golf experience. The data was preprocessed so that it could be applied to machine learning, and a model was built using five machine learning algorithms. Finally, by comparing the performance of the built models, the model with the highest performance was selected as the proposed model, and then 12 variables of the IMU sensor were applied one by one to analyze important variables affecting the learning performance. Results: As a result of comparing the performance of five machine learning algorithms (K-NN, Naive Bayes, Decision Tree, Random Forest, and Light GBM), the prediction accuracy of the Light GBM-based prediction model was higher than that of other algorithms. Using the Light GBM algorithm, which had excellent performance, an experiment was performed to rank the importance of variables that affect the direction prediction of the model. Conclusion: Among the five machine learning algorithms, the algorithm that best predicts the putting direction was the Light GBM algorithm. When the model predicted the putting direction, the variable that had the greatest influence was the left-right inclination (Roll).

Computation of geographic variables for air pollution prediction models in South Korea

  • Eum, Youngseob;Song, Insang;Kim, Hwan-Cheol;Leem, Jong-Han;Kim, Sun-Young
    • Environmental Analysis Health and Toxicology
    • /
    • v.30
    • /
    • pp.10.1-10.14
    • /
    • 2015
  • Recent cohort studies have relied on exposure prediction models to estimate individual-level air pollution concentrations because individual air pollution measurements are not available for cohort locations. For such prediction models, geographic variables related to pollution sources are important inputs. We demonstrated the computation process of geographic variables mostly recorded in 2010 at regulatory air pollution monitoring sites in South Korea. On the basis of previous studies, we finalized a list of 313 geographic variables related to air pollution sources in eight categories including traffic, demographic characteristics, land use, transportation facilities, physical geography, emissions, vegetation, and altitude. We then obtained data from different sources such as the Statistics Geographic Information Service and Korean Transport Database. After integrating all available data to a single database by matching coordinate systems and converting non-spatial data to spatial data, we computed geographic variables at 294 regulatory monitoring sites in South Korea. The data integration and variable computation were performed by using ArcGIS version 10.2 (ESRI Inc., Redlands, CA, USA). For traffic, we computed the distances to the nearest roads and the sums of road lengths within different sizes of circular buffers. In addition, we calculated the numbers of residents, households, housing buildings, companies, and employees within the buffers. The percentages of areas for different types of land use compared to total areas were calculated within the buffers. For transportation facilities and physical geography, we computed the distances to the closest public transportation depots and the boundary lines. The vegetation index and altitude were estimated at a given location by using satellite data. The summary statistics of geographic variables in Seoul across monitoring sites showed different patterns between urban background and urban roadside sites. 
This study provided practical knowledge on the computation process of geographic variables in South Korea, which will improve air pollution prediction models and contribute to subsequent health analyses.
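
Two of the variable types described in this abstract, the distance to the nearest road and counts of features within a circular buffer, can be illustrated with plain planar geometry. The study itself used ArcGIS 10.2; the Python sketch below is not that workflow. It assumes projected coordinates in metres and roads given as straight segments, and the site, roads, and buildings are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_nearest_road(site, roads):
    """Minimum distance from a monitoring site to any road segment."""
    return min(point_segment_distance(site, a, b) for a, b in roads)


def count_within_buffer(site, points, radius):
    """Number of point features inside a circular buffer around the site."""
    return sum(
        math.hypot(x - site[0], y - site[1]) <= radius for x, y in points
    )


# Hypothetical monitoring site, road segments, and building locations (metres).
site = (0.0, 0.0)
roads = [((-100.0, 30.0), (100.0, 30.0)), ((50.0, -200.0), (50.0, 200.0))]
buildings = [(10.0, 10.0), (60.0, 70.0), (300.0, 0.0)]

print("nearest road (m):", distance_to_nearest_road(site, roads))
for radius in (100.0, 500.0):
    print("buildings within", radius, "m:", count_within_buffer(site, buildings, radius))
```

Repeating the buffer count for several radii mirrors the abstract's use of "different sizes of circular buffers"; sums of road lengths within a buffer would additionally require clipping each segment to the circle.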