• Title/Summary/Keyword: 최적회귀모형 (optimal regression model)


Application of Time-Series Model to Forecast Track Irregularity Progress (궤도틀림 진전 예측을 위한 시계열 모델 적용)

  • Jeong, Min Chul;Kim, Gun Woo;Kim, Jung Hoon;Kang, Yun Suk;Kong, Jung Sik
    • Journal of the Computational Structural Engineering Institute of Korea
    • /
    • v.25 no.4
    • /
    • pp.331-338
    • /
    • 2012
  • Track irregularity data inspected by EM-120, a railway inspection system in Korea, inevitably include incomplete and erratic information, so many problems are encountered when analysing those data without an appropriate pre-refining process. In this research, for the efficient management and maintenance of the railway system, the characteristics and problems of the detected track irregularity data were analyzed and efficient processing techniques were developed to solve those problems. The correlation between track irregularity and seasonal changes was examined based on ARIMA model analysis. Finally, time series analysis was carried out with various forecasting models, such as regression, exponential smoothing, and ARIMA, to determine the optimal models for forecasting track irregularity progress.
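The comparison of regression, exponential smoothing, and ARIMA forecasts described above can be illustrated with a minimal sketch like the one below. The synthetic monthly series, model orders, and 12-step horizon are placeholder assumptions, not values from the paper.

```python
# Sketch: comparing simple forecasting models on a track-irregularity-like series.
# The synthetic series, model orders, and horizon are illustrative assumptions.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.holtwinters import ExponentialSmoothing

rng = np.random.default_rng(0)
t = np.arange(120)
y = pd.Series(0.01 * t + 0.5 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.2, t.size))

train, test = y[:108], y[108:]
h = len(test)

# Linear-trend regression forecast
coef = np.polyfit(np.arange(len(train)), train, deg=1)
reg_fc = np.polyval(coef, np.arange(len(train), len(train) + h))

# Exponential smoothing with additive trend and seasonality
es_fc = ExponentialSmoothing(train, trend="add", seasonal="add",
                             seasonal_periods=12).fit().forecast(h)

# ARIMA(1,1,1) forecast
arima_fc = ARIMA(train, order=(1, 1, 1)).fit().forecast(h)

def rmse(pred):
    return float(np.sqrt(np.mean((np.asarray(pred) - test.values) ** 2)))

print({"regression": rmse(reg_fc), "exp_smoothing": rmse(es_fc), "arima": rmse(arima_fc)})
```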

A Case Study on the Cost Effectiveness Analysis of Depot Maintenance Using Simulation Model and Experimental Design (시뮬레이션 모형과 실험설계법을 활용한 창정비 비용대 효과 분석 사례)

  • Kim, Sung-Kon;Lee, Sang-Jin
    • Journal of the Korea Society for Simulation
    • /
    • v.26 no.3
    • /
    • pp.23-34
    • /
    • 2017
  • This paper studies a simulation model of a depot maintenance system that analyzes logistics supportability, such as component availability and cost, for target equipment. A depot maintenance system can repair or maintain multiple components simultaneously, and its key performance indicators are component availability, repair cycle time, and maintenance cost. The simulation model is based on the engine maintenance process of an army aviation depot. This study combines the NOLH (Nearly Orthogonal Latin Hypercube) experimental design method, used to compose 33 scenarios, with multiple regression analysis to find the major factors that influence the key performance indicators. The study is significant in providing a cost-effectiveness analysis of a depot maintenance system capable of maintaining multiple components at the same time.
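The design-plus-regression screening step described above can be sketched as follows. Since an NOLH generator is not part of the standard scientific Python stack, a plain Latin hypercube from SciPy stands in for it here, and the simulated availability response is a made-up function rather than the depot model.

```python
# Sketch: space-filling experimental design plus multiple regression screening.
# A plain Latin hypercube stands in for the paper's NOLH design, and the
# "simulation" response is a made-up function, not the depot model.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy.stats import qmc

factors = ["repair_crew", "spare_stock", "arrival_rate"]        # hypothetical factors
sampler = qmc.LatinHypercube(d=len(factors), seed=1)
design = qmc.scale(sampler.random(n=33),                        # 33 scenarios, as in the paper
                   l_bounds=[2, 10, 0.5], u_bounds=[10, 100, 2.0])
X = pd.DataFrame(design, columns=factors)

# Stand-in for running the depot simulation at each design point
rng = np.random.default_rng(1)
availability = (0.6 + 0.02 * X["repair_crew"] + 0.001 * X["spare_stock"]
                - 0.05 * X["arrival_rate"] + rng.normal(0, 0.01, len(X)))

# Multiple regression to screen which factors drive the KPI
ols = sm.OLS(availability, sm.add_constant(X)).fit()
print(ols.summary())
```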

Finite Element Analysis of Gradually and Rapidly Varied Unsteady Flow in Open Channel: I. Theory and Stability Analysis (개수로내의 점변 및 급변 부정류에 대한 유한요소해석 :I.이론 및 수치안정성 해석)

  • Han, Kun-Yeun;Park, Jae-Hong;Lee, Jong-Tae
    • Water for future
    • /
    • v.29 no.6
    • /
    • pp.167-178
    • /
    • 1996
  • Simulation techniques for hydrologic data series have been developed for purposes such as the design of water resources systems, the optimization of reservoir operation, and the design of reservoir flood control. While stochastic models are usually used to generate data sequences in most water resources analyses, the indexed sequential modeling (ISM) method, based on generating a series of overlapping short-term flow sequences directly from the historical record, has been used for data generation in the western USA since the early 1980s, and reliable results with ISM have been reported in practical applications. In this study, we generate annual inflow series at the Hong Cheon Dam site using the ISM method and a first-order autoregressive model (AR(1)), and estimate drought characteristics in order to compare ISM and AR(1).

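A minimal sketch of the two generation approaches mentioned in the abstract, AR(1) and indexed sequential modeling, is given below. The synthetic 40-year record and the wrap-around ISM convention are illustrative assumptions, not the Hong Cheon Dam data or the paper's exact procedure.

```python
# Sketch: generating synthetic annual inflow with a lag-one autoregressive model
# and with indexed sequential (overlapping historical block) sampling.
import numpy as np

rng = np.random.default_rng(7)
historical = rng.gamma(shape=4.0, scale=250.0, size=40)      # 40 years of "observed" inflow

# AR(1): x_t = mu + phi * (x_{t-1} - mu) + e_t
mu, sigma = historical.mean(), historical.std(ddof=1)
phi = np.corrcoef(historical[:-1], historical[1:])[0, 1]

def ar1_sequence(n_years):
    x = np.empty(n_years)
    x[0] = mu
    noise_sd = sigma * np.sqrt(1 - phi ** 2)
    for t in range(1, n_years):
        x[t] = mu + phi * (x[t - 1] - mu) + rng.normal(0, noise_sd)
    return x

def ism_sequences(record, length):
    # Overlapping short-term sequences taken directly from the historical record,
    # wrapping around at the end (one common ISM convention).
    doubled = np.concatenate([record, record])
    return np.array([doubled[i:i + length] for i in range(len(record))])

print(ar1_sequence(20)[:5])
print(ism_sequences(historical, 20).shape)    # (40, 20): one sequence per start year
```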

Identification of major risk factors associated with respiratory diseases by data mining (데이터마이닝 모형을 활용한 호흡기질환의 주요인 선별)

  • Lee, Jea-Young;Kim, Hyun-Ji
    • Journal of the Korean Data and Information Science Society
    • /
    • v.25 no.2
    • /
    • pp.373-384
    • /
    • 2014
  • Data mining is used to clarify patterns and correlations in large volumes of data with complicated structure and to predict diverse outcomes; the technique is applied in fields such as finance, telecommunications, distribution, and medicine. In this paper, we selected risk factors of respiratory diseases in the medical field. The data, drawn from the Gyeongsangbuk-do portion of the 2012 Community Health Survey, were divided into a respiratory disease group and a healthy group. In order to select the major risk factors, we applied data mining techniques such as neural networks, logistic regression, Bayesian networks, C5.0, and CART. We divided the data into training and testing sets, and the models designed on the training data were applied to the testing data. By comparison of prediction accuracy, CART was identified as the best model, and depression, smoking, and stress proved to be the major risk factors for respiratory disease.
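The train/test model-comparison pattern described above is sketched below with scikit-learn on synthetic data. A decision tree stands in for CART, the C5.0 and Bayesian network models are omitted, and none of this reuses the Community Health Survey data.

```python
# Sketch: train several classifiers on a training split and compare test accuracy.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=15, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "cart": DecisionTreeClassifier(max_depth=5, random_state=0),
    "neural_net": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name, round(accuracy_score(y_te, model.predict(X_te)), 3))
```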

Outbound Air Travel Demand Forecasting Model with Unobserved Regional Characteristics (미관찰 지역 특성을 고려한 내국인 국제선 항공수요 추정 모형)

  • YU, Jeong Whon;CHOI, Jung Yoon
    • Journal of Korean Society of Transportation
    • /
    • v.36 no.2
    • /
    • pp.141-154
    • /
    • 2018
  • In order to meet the ever-increasing demand for international air travel, several plans are underway to open new airports and expand existing provincial airports. However, existing air demand forecasts have been based on the total air demand in Korea or the air demand among major cities, and there are few forecasts of regional air demand that consider local characteristics. In this study, the outbound air travel demand of the southeastern region of Korea was analyzed, and a fixed-effects model using panel data was proposed as an optimal model that can reflect the inherent characteristics of metropolitan areas, which are difficult to observe directly. The results of model validation show that panel data analysis effectively addresses the spurious regression and unobserved heterogeneity that are difficult to handle in a model using only a few macroeconomic indicators with time series characteristics. Various statistical validation and conformance tests suggest that the fixed-effects model proposed in this study is superior to other econometric models in predicting international air travel demand for the southeastern region.
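A minimal sketch of a fixed-effects (within) estimator on a synthetic region-year panel is shown below. The region names, covariates, and coefficients are invented for illustration and are not the paper's panel or variables.

```python
# Sketch: fixed-effects estimation via the within transformation on a synthetic panel.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
regions, years = ["Busan", "Ulsan", "Daegu", "Changwon"], range(2008, 2018)
rows = []
for r_id, r in enumerate(regions):
    alpha = rng.normal(0, 2)                        # unobserved region effect
    for yr in years:
        gdp = rng.normal(100 + r_id * 10, 5)
        fx = rng.normal(1100, 50)
        demand = 5 + alpha + 0.8 * gdp - 0.01 * fx + rng.normal(0, 1)
        rows.append((r, yr, demand, gdp, fx))
panel = pd.DataFrame(rows, columns=["region", "year", "demand", "gdp", "fx"])

# Within transformation: subtract each region's mean to absorb the fixed effect
cols = ["demand", "gdp", "fx"]
demeaned = panel[cols] - panel.groupby("region")[cols].transform("mean")
fe = sm.OLS(demeaned["demand"], demeaned[["gdp", "fx"]]).fit()
print(fe.params)    # slope estimates, free of region-level heterogeneity
```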

The analysis of future land use change impacts on runoff characteristics in urbanized watershed using SWAT model (SWAT 모형을 이용한 도시유역 토지이용 변화가 유출특성에 미치는 영향연구)

  • Kim, Sang-Ho;Ha, Rim;Jung, Chung-Gil;Kim, Seong-Joon
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2012.05a
    • /
    • pp.159-159
    • /
    • 2012
  • The urbanization of a watershed causes various problems: it changes not only flood runoff characteristics such as lag time, peak discharge, and total runoff volume, but also sediment runoff characteristics as a result of development activities. In addition, growing fossil fuel consumption driven by population growth, industrial development, and increased traffic has sharply raised the atmospheric CO2 concentration, strongly affecting climate change. To restrain excessive urbanization, and with growing interest in well-being and health, urban greening plans are being pursued as part of low-carbon green projects. This study therefore analyzed the impact of land use change due to urbanization on rainfall-runoff characteristics for the urban Jungnangcheon watershed (288 km²). Analysis of area changes by category in the watershed's past land use maps (1975, 1980, 1985, 1990, 1995, 2000) showed a 17.8% increase in urban area. The CLUE-s (Conversion of Land Use change and its Effects) model was used to predict future land use: land cover change and transition characteristics were derived from the past land use changes, and future land use (2040, 2080) was simulated with CLUE-s using land area scenarios, restricted-change areas, regression results, and the land use change characteristics. To simulate the resulting changes in runoff, the SWAT (Soil and Water Assessment Tool), a physically based semi-distributed rainfall-runoff model, was used. To assess the applicability of the model, optimal runoff-related parameters for the urban watershed were selected through parameter sensitivity analysis, and calibration and validation were carried out with daily discharge data (2000-2009) at the Jungnang bridge station. After calibration and validation of SWAT, the predicted future land use maps were applied, and the changes in runoff characteristics under past, present, and future land use were compared and analyzed.

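As one concrete piece of the calibration and validation step mentioned above, the sketch below computes the Nash-Sutcliffe efficiency (NSE), a goodness-of-fit metric commonly used when evaluating rainfall-runoff models such as SWAT. The observed and simulated daily flows are synthetic placeholders.

```python
# Sketch: Nash-Sutcliffe efficiency for comparing simulated and observed discharge.
import numpy as np

def nse(observed, simulated):
    observed, simulated = np.asarray(observed), np.asarray(simulated)
    return 1.0 - np.sum((observed - simulated) ** 2) / np.sum((observed - observed.mean()) ** 2)

rng = np.random.default_rng(5)
obs = np.exp(rng.normal(2.0, 0.8, 365))           # synthetic daily discharge (m^3/s)
sim = obs * rng.normal(1.0, 0.15, 365)            # a simulated series with random error

print(round(nse(obs, sim), 3))                    # values near 1 indicate a good fit
```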

Feature selection and prediction modeling of drug responsiveness in Pharmacogenomics (약물유전체학에서 약물반응 예측모형과 변수선택 방법)

  • Kim, Kyuhwan;Kim, Wonkuk
    • The Korean Journal of Applied Statistics
    • /
    • v.34 no.2
    • /
    • pp.153-166
    • /
    • 2021
  • A main goal of pharmacogenomics studies is to predict an individual's drug responsiveness based on high-dimensional genetic variables. Because of the large number of variables, feature selection is required to reduce their number, and the selected features are then used to construct a predictive model with machine learning algorithms. In the present study, we applied several hybrid feature selection methods, such as combinations of logistic regression, ReliefF, TuRF, random forest, and LASSO, to a next-generation sequencing data set of 400 epilepsy patients. We then applied the selected features to machine learning methods including random forest, gradient boosting, and support vector machines, as well as a stacking ensemble method. Our results showed that the stacking model with a hybrid feature selection of random forest and ReliefF performs better than the other combinations of approaches. Based on a 5-fold cross-validation partition, the mean test accuracy of the best model was 0.727 and its mean test AUC was 0.761. The stacking models also outperformed the single machine learning predictive models when using the same selected features.
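The feature-selection-plus-stacking pipeline described above can be sketched as follows on synthetic data. The ReliefF/TuRF steps are omitted here (they live outside scikit-learn), and the data are random, not the epilepsy cohort.

```python
# Sketch: random-forest-based feature selection followed by a stacking ensemble,
# evaluated with 5-fold cross-validation on synthetic high-dimensional data.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.feature_selection import SelectFromModel
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, StackingClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=400, n_features=500, n_informative=10, random_state=0)

selector = SelectFromModel(RandomForestClassifier(n_estimators=200, random_state=0),
                           max_features=30)            # keep at most 30 important variables
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
                ("gb", GradientBoostingClassifier(random_state=0)),
                ("svm", SVC(probability=True, random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000))

pipe = make_pipeline(selector, stack)
scores = cross_val_score(pipe, X, y, cv=5, scoring="accuracy")
print(scores.mean())
```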

Development of an Optimal Deal Search Strategy for User Agents in a Proposal-Based Automated Trade Negotiation Market (제안기반 자동 거래협상 시장에서의 사용자 에이전트를 위한 최적 거래안 탐색 전략의 개발)

  • 홍준석;김우주;송용욱
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 2002.05a
    • /
    • pp.140-148
    • /
    • 2002
  • As people have pursued more convenient lives through computers, automated negotiation has come to be demanded in electronic commerce as well. Automated negotiation by intelligent agents relieves much of the burden of human trade negotiation, so research on automated negotiation agents has become active. In consumer-to-consumer electronic commerce, most automated negotiation agent studies have focused on auction markets, whereas research on automated negotiation agents in proposal-based negotiation markets for goods with multiple trade attributes beyond price has recently gained momentum. This study examines the characteristics that a proposal-based negotiation market for multi-attribute goods should have in consumer-to-consumer electronic commerce, exploiting differences in individual utility as trade attributes change, and on that basis develops a method for representing an individual consumer's preference structure over trade-attribute changes, as required for automated trade negotiation. It also proposes the features and protocol a negotiation market needs in order to conduct such automated negotiation fairly, and designs the architecture of a market-operating agent system. Finally, it develops in detail the search method by which a user agent system participating in this proposal-based negotiation market with a distributed market structure can find the optimal trading partner and the optimal deal. With these results, a fair negotiation market can be operated in consumer-to-consumer electronic commerce in which sellers as well as buyers can maximize the utility they obtain from negotiated trades, and users can easily express their negotiation preference structures and carry out automated trade negotiation that reflects them.

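One small ingredient of such a deal search, scoring counter-proposals with an additive multi-attribute utility function and picking the best one, is sketched below. The attributes, weights, and proposals are invented; the paper's preference representation and market protocol are richer than this.

```python
# Sketch: rank proposals by a weighted, normalized multi-attribute utility score.
from typing import Dict, List

def utility(offer: Dict[str, float], weights: Dict[str, float],
            ideal: Dict[str, float], worst: Dict[str, float]) -> float:
    # Each attribute is normalized to [0, 1] between its worst and ideal value,
    # then combined with the buyer's importance weights.
    score = 0.0
    for attr, w in weights.items():
        span = ideal[attr] - worst[attr]
        score += w * (offer[attr] - worst[attr]) / span
    return score

weights = {"price": 0.5, "delivery_days": 0.3, "warranty_months": 0.2}
ideal   = {"price": 800.0, "delivery_days": 1.0, "warranty_months": 24.0}
worst   = {"price": 1200.0, "delivery_days": 14.0, "warranty_months": 0.0}

proposals: List[Dict[str, float]] = [
    {"price": 950.0, "delivery_days": 3.0, "warranty_months": 12.0},
    {"price": 900.0, "delivery_days": 10.0, "warranty_months": 24.0},
    {"price": 1100.0, "delivery_days": 2.0, "warranty_months": 6.0},
]
best = max(proposals, key=lambda p: utility(p, weights, ideal, worst))
print(best)
```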

Investigation of the effect of delirium by Bayesian network and radial graph (베이지안 네트워크와 방사형 그래프를 이용한 섬망의 효과 규명)

  • Lee, Jea-Young;Bae, Jae-Young
    • Journal of the Korean Data and Information Science Society
    • /
    • v.22 no.5
    • /
    • pp.911-919
    • /
    • 2011
  • In recent medical analysis, it has become more important to look for risk factors related to mental illness. If we can find these risk factors and identify their relevant characteristics, the disease can be prevented in advance, and such studies can contribute to medical progress. Studies of risk factors for mental illness have mainly been conducted with the logistic regression model. In this paper, however, data mining techniques such as CART, C5.0, logistic regression, neural networks, and Bayesian networks were used to search for risk factors. Applied to the delirium data, the Bayesian network was selected as the optimal model among these data mining methods. Bayesian network analysis was then used to find the risk factors, and the relationships between them were identified through a radial graph.
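The radial graph mentioned above can be approximated with a shell (concentric) layout in networkx, as in the sketch below; the nodes and edges are hypothetical and are not the network learned from the delirium data.

```python
# Sketch: drawing a radial-style graph of relationships around an outcome node.
import networkx as nx
import matplotlib.pyplot as plt

edges = [("stress", "delirium"), ("age", "delirium"),
         ("infection", "delirium"), ("age", "infection")]     # illustrative only
g = nx.DiGraph(edges)

# Place the outcome node in the center and its risk factors on an outer ring
pos = nx.shell_layout(g, nlist=[["delirium"], ["stress", "age", "infection"]])
nx.draw_networkx(g, pos, node_color="lightblue", node_size=1800, arrowsize=15)
plt.axis("off")
plt.savefig("radial_graph.png")
```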

Corporate Default Prediction Model Using Deep Learning Time Series Algorithm, RNN and LSTM (딥러닝 시계열 알고리즘 적용한 기업부도예측모형 유용성 검증)

  • Cha, Sungjae;Kang, Jungseok
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.1-32
    • /
    • 2018
  • In addition to stakeholders such as the managers, employees, creditors, and investors of bankrupt companies, corporate defaults have a ripple effect on the local and national economy. Before the Asian financial crisis, the Korean government analyzed only SMEs and tried to improve the forecasting power of a single default prediction model rather than developing various corporate default models; as a result, even large corporations, the so-called chaebol enterprises, went bankrupt. Even afterwards, analyses of past corporate defaults focused on specific variables, and when the government restructured companies immediately after the global financial crisis, it concentrated on a few main variables such as the debt ratio. A multifaceted study of corporate default prediction models is essential to protect diverse interests and to avoid a sudden total collapse like the 'Lehman Brothers case' of the global financial crisis. The key variables used in predicting corporate defaults vary over time: comparing the analyses of Beaver (1967, 1968) and Altman (1968) with the study of Deakin (1972) shows that the major factors affecting corporate failure have changed, and Grice (2001) likewise found shifts in the importance of the predictive variables in the models of Zmijewski (1984) and Ohlson (1980). However, past studies have used static models, and most do not consider changes that occur over the course of time. Therefore, in order to construct consistent prediction models, it is necessary to compensate for time-dependent bias with a time series analysis algorithm that reflects dynamic change. Centered on the global financial crisis, which had a significant impact on Korea, this study uses 10 years of annual corporate data from 2000 to 2009, divided into training, validation, and test data of 7, 2, and 1 years, respectively. To construct a bankruptcy model that is consistent through time, we first train the deep learning time series models on the data before the financial crisis (2000~2006). Parameter tuning of the existing models and the deep learning time series algorithms is conducted with validation data that include the financial crisis period (2007~2008). As a result, we obtain models that show patterns similar to the results on the training data and exhibit excellent predictive power. Each bankruptcy prediction model is then retrained on the combined training and validation data (2000~2008), applying the optimal parameters found during validation. Finally, the corporate default prediction models trained over these nine years are evaluated and compared on the test data (2009), demonstrating the usefulness of a corporate default prediction model based on a deep learning time series algorithm. In addition, by adding Lasso regression to the existing variable selection methods (multivariate discriminant analysis and the logit model), we show that the deep learning time series models based on the three resulting bundles of variables are useful for robust corporate default prediction. The definition of bankruptcy used is the same as that of Lee (2015), and the independent variables include financial information such as the financial ratios used in previous studies. Multivariate discriminant analysis, the logit model, and the Lasso regression model are used to select the optimal variable groups.
The multivariate discriminant analysis model proposed by Altman (1968), the logit model proposed by Ohlson (1980), non-time-series machine learning algorithms, and the deep learning time series algorithms are compared. Corporate data pose limitations such as nonlinear variables, multicollinearity among variables, and a lack of data; the logit model handles the nonlinearity, the Lasso regression model addresses the multicollinearity, and the deep learning time series algorithm, together with a variable data generation method, compensates for the lack of data. Big data technology, a leading technology of the future, is moving from simple human analysis toward automated AI analysis and, ultimately, toward intertwined AI applications. Although research on corporate default prediction models using time series algorithms is still in its early stages, the deep learning algorithm is much faster than regression analysis at corporate default prediction modeling and is more effective in predictive power. With the Fourth Industrial Revolution, the Korean government and governments overseas are working hard to integrate such systems into the everyday life of their nations and societies, yet deep learning time series research for the financial industry remains insufficient. As an initial study on deep learning time series analysis of corporate defaults, it is hoped that this work will serve as comparative material for non-specialists who begin to combine financial data with deep learning time series algorithms.
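A minimal sketch of an LSTM classifier over multi-year sequences of financial ratios, in the spirit of the deep learning time series model described above, is given below. The data shapes, layer sizes, and training settings are placeholders, not the paper's configuration.

```python
# Sketch: an LSTM over yearly sequences of financial ratios for default prediction.
# The synthetic data and hyperparameters are illustrative assumptions.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
n_firms, n_years, n_ratios = 1000, 7, 20
X = rng.normal(size=(n_firms, n_years, n_ratios)).astype("float32")   # yearly ratio sequences
y = (rng.random(n_firms) < 0.1).astype("float32")                     # ~10% default label

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_years, n_ratios)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])
model.fit(X, y, validation_split=0.2, epochs=5, batch_size=64, verbose=0)
print(model.evaluate(X, y, verbose=0))    # [loss, AUC] on the (synthetic) data
```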