• Title/Summary/Keyword: Statistical prediction model

Feature selection and prediction modeling of drug responsiveness in Pharmacogenomics (약물유전체학에서 약물반응 예측모형과 변수선택 방법)

  • Kim, Kyuhwan;Kim, Wonkuk
    • The Korean Journal of Applied Statistics
    • /
    • v.34 no.2
    • /
    • pp.153-166
    • /
    • 2021
  • A main goal of pharmacogenomics studies is to predict an individual's drug responsiveness based on high-dimensional genetic variables. Because the number of variables is large, feature selection is required to reduce it. The selected features are then used to construct a predictive model with machine learning algorithms. In the present study, we applied several hybrid feature selection methods, such as combinations of logistic regression, ReliefF, TurF, random forest, and LASSO, to a next-generation sequencing data set of 400 epilepsy patients. We then applied the selected features to machine learning methods including random forest, gradient boosting, and support vector machine, as well as a stacking ensemble method. Our results showed that the stacking model with a hybrid feature selection of random forest and ReliefF performed better than the other combinations of approaches. Based on a 5-fold cross-validation partition, the mean test accuracy of the best model was 0.727 and its mean test AUC was 0.761. The stacking models also outperformed single machine learning predictive models when using the same selected features.

Using Artificial Neural Network for Software Development Efforts Estimation on (인공신경망을 이용한 소프트웨어 개발공수 예측모델에 관한 연구)

  • Jeon, Eung-Seop
    • The Transactions of the Korea Information Processing Society
    • /
    • v.3 no.1
    • /
    • pp.211-224
    • /
    • 1996
  • In the area of software development effort estimation, many studies have been carried out to control costs and make software more competitive. However, most of them were restricted to functional algorithmic models or statistical models. Moreover, because they deal with cases from foreign countries, their results are hard to apply directly to the domestic environment for efficient project management, owing to a lack of accuracy, fitness, flexibility, and portability. It is therefore appropriate to propose a new approach based on an artificial neural network, composed of back-propagation and feed-forward algorithms, to improve the accuracy of effort estimation and to advance its practical use. In this study, the artificial neural network approach is used to model software cost estimation, and the results are compared with the revised COCOMO and a multiple regression model in order to validate the superiority of the model.

Development and Evaluation of an Ensemble Forecasting System for the Regional Ocean Wave of Korea (앙상블 지역 파랑예측시스템 구축 및 검증)

  • Park, JongSook;Kang, KiRyong;Kang, Hyun-Suk
    • Journal of Korean Society of Coastal and Ocean Engineers
    • /
    • v.30 no.2
    • /
    • pp.84-94
    • /
    • 2018
  • To overcome the limitations of deterministic forecasts, an ensemble forecasting system for regional ocean waves was developed. The system predicts ocean wind waves based on meteorological forcing from the global Ensemble Prediction System of the Korea Meteorological Administration, which consists of 24 ensemble members. The ensemble wave forecasting system was evaluated using moored buoy data around Korea. In terms of root mean squared error (RMSE), the ensemble mean performed better than the deterministic forecast system after 2 days; in particular, the RMSE of the ensemble mean improved by 15% relative to the deterministic forecast at a 3-day lead time. This suggests that the ensemble method can reduce the uncertainty of the deterministic prediction system. The Relative Operating Characteristic, used to evaluate the probabilistic prediction, was greater than 0.9, indicating high predictability and suggesting that the ensemble wave forecast can be usefully applied.
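
As a rough illustration of the comparison this abstract describes, the sketch below computes the RMSE of an ensemble mean and of a single deterministic forecast against observations. The wave heights, the three-member ensemble, and the function names are hypothetical values chosen for illustration; they are not taken from the paper, whose system uses 24 members.

```python
import math

def rmse(forecast, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(forecast) != len(observed):
        raise ValueError("forecast and observed must have the same length")
    squared = sum((f - o) ** 2 for f, o in zip(forecast, observed))
    return math.sqrt(squared / len(observed))

def ensemble_mean(members):
    """Average the ensemble members at each verification time.

    members: list of forecasts, one list per ensemble member.
    """
    n = len(members)
    return [sum(values) / n for values in zip(*members)]

# Hypothetical significant wave heights (m) at four verification times.
observed = [1.0, 2.0, 3.0, 2.0]
deterministic = [1.5, 2.5, 2.0, 2.5]
members = [
    [0.8, 2.2, 3.4, 1.6],
    [1.2, 1.8, 2.6, 2.4],
    [1.3, 2.3, 2.7, 2.3],
]

mean_forecast = ensemble_mean(members)
rmse_det = rmse(deterministic, observed)
rmse_ens = rmse(mean_forecast, observed)

# Relative RMSE improvement of the ensemble mean over the deterministic run.
improvement_pct = 100.0 * (rmse_det - rmse_ens) / rmse_det
print(f"deterministic RMSE = {rmse_det:.3f}, ensemble-mean RMSE = {rmse_ens:.3f}")
print(f"improvement = {improvement_pct:.1f}%")
```

Averaging the members cancels part of their individual errors, which is why the ensemble mean can score a lower RMSE than any single run in this toy data.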

Improvement of Rolling Force Estimation by Modification Function for Hot Steel Strip Rolling Process (보정함수를 이용한 강판의 열간 압연하중 예측 정도향상)

  • 문영훈;이경종;이필종;이준정
    • Transactions of the Korean Society of Mechanical Engineers
    • /
    • v.17 no.5
    • /
    • pp.1193-1201
    • /
    • 1993
  • A new deformation resistance model for the hot steel strip rolling process was formulated to improve the accuracy of roll force estimation. To refine the existing deformation resistance model, a modification function was introduced in this study. For the modification function, several factors reflecting material and operational conditions were investigated, and the optimal modification function was determined under the principle of minimum variability. The newly formulated modification function was applied to the deformation resistance model for ultra-low carbon steel and improved accuracy, reducing the standard deviation of predicted roll force values against measured ones by about 30%.

Comparing Highway Traffic Noise Emission Levels Using Individual UofL State - specific Data - Based on Open Space - (루이빌대 개별State-specific 데이터를 이용한 도로 교통소음 수준 비교 - 오픈공간에서 -)

  • Teak K.;Roswell A. Harris;Louis F. Cohn
    • Transactions of the Korean Society for Noise and Vibration Engineering
    • /
    • v.14 no.4
    • /
    • pp.276-286
    • /
    • 2004
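  • The U.S. Federal Highway Administration currently provides prediction models (TNM and STAMINA) for highway traffic noise analysis throughout the United States. Many related studies have been carried out, and studies comparing model predictions with measured values have shown that discrepancies exist. This study therefore takes the University of Louisville (UofL) regression models, which can serve as core data for noise prediction models, and groups them by vehicle type (small, medium, and large) and by state (Arizona, Colorado, Georgia, Kansas, and Washington) to compare and analyze their differences statistically and draw conclusions. The results show that, except for Arizona and Colorado (medium and large vehicles), the individual state-specific data sets are statistically different from one another.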

A Statistical Prediction Model of Speakers' Intentions in a Goal-Oriented Dialogue (목적지향 대화에서 화자 의도의 통계적 예측 모델)

  • Kim, Dong-Hyun;Kim, Hark-Soo;Seo, Jung-Yun
    • Journal of KIISE:Software and Applications
    • /
    • v.35 no.9
    • /
    • pp.554-561
    • /
    • 2008
  • Predicting the user's intention can serve as a post-processing method for reducing the search space of an automatic speech recognizer. Predicting the system's intention can serve as a pre-processing method for generating flexible sentences. To satisfy these practical needs, we propose a statistical model that predicts speakers' intentions, generalized into pairs of a speech act and a concept sequence. In contrast to the previous model, which uses simple n-gram statistics of speech acts, the proposed model represents the dialogue history of the current utterance as a feature set spanning various linguistic levels (i.e., n-grams of speech act and concept sequence pairs, clue words, and state information of a domain frame). The proposed model then predicts the intention of the next utterance by using the feature set as input to CRFs (Conditional Random Fields). In an experiment in a schedule management domain, the proposed model achieved a precision of 76.25% in predicting the user's speech act and 64.21% in predicting the user's concept sequence. It also achieved a precision of 88.11% in predicting the system's speech act and 87.19% in predicting the system's concept sequence. In addition, the proposed model showed 29.32% higher average precision than the previous model.

Modeling Methodology for Cold Tolerance Assessment of Pittosporum tobira (돈나무의 내한성 평가 모델링)

  • Kim, Inhea;Huh, Keun Young;Jung, Hyun Jong;Choi, Su Min;Park, Jae Hyoen
    • Horticultural Science & Technology
    • /
    • v.32 no.2
    • /
    • pp.241-251
    • /
    • 2014
  • This study was carried out to develop a simple, rapid, and reliable assessment model for predicting cold tolerance in Pittosporum tobira, a broad-leaved evergreen commonly used in the southern region of South Korea, that minimizes the experimental errors that can arise in an electrolyte leakage test for cold tolerance assessment. The modeling procedure comprised a regrowth test and an electrolyte leakage test on plants exposed to low-temperature treatments. The lethal temperatures estimated from methodological combinations of the electrolyte leakage test, covering tissue sampling, temperature treatment for potential electrical conductivity, and statistical analysis, were compared with the results of the regrowth test. In the regrowth test, the highest temperature at which the survival rate fell below 50% was $-10^{\circ}C$, and the lethal temperature was $-10^{\circ}C{\sim}-5^{\circ}C$. Based on the results of the regrowth test, several methodological combinations of electrolyte leakage tests were evaluated, and the lethal temperatures estimated from electrolyte leakage using leaf sample tissue and the freeze-killing method were closest to the regrowth lethal temperature. Among the statistical analysis models, linear interpolation tended to overestimate cold tolerance more than non-linear regression did. Consequently, the optimal model for cold tolerance assessment of P. tobira consists of evaluating electrolyte leakage from leaf sample tissue, applying the freeze-killing method for potential electrical conductivity, and predicting the lethal temperature through non-linear regression analysis.
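
To make the linear interpolation approach mentioned in this abstract concrete, here is a minimal sketch that estimates the temperature at which injury crosses a threshold by interpolating between the two bracketing measurements. The 50% threshold, the injury percentages, and the function name `lt50_linear` are assumptions for illustration, not the paper's data or procedure. The non-linear regression alternative the abstract prefers is not shown.

```python
def lt50_linear(temps_c, injury_pct, threshold=50.0):
    """Estimate the temperature at which injury reaches `threshold`.

    temps_c: treatment temperatures sorted from warmest to coldest.
    injury_pct: injury (e.g. relative electrolyte leakage, %) at each temperature.
    Returns None when the threshold is never reached.
    """
    if len(temps_c) != len(injury_pct):
        raise ValueError("temps_c and injury_pct must have the same length")
    if injury_pct and injury_pct[0] >= threshold:
        # Already at or above the threshold at the warmest treatment.
        return temps_c[0]
    points = list(zip(temps_c, injury_pct))
    for (t_warm, y_warm), (t_cold, y_cold) in zip(points, points[1:]):
        if y_warm < threshold <= y_cold:
            # Straight-line interpolation between the bracketing points.
            fraction = (threshold - y_warm) / (y_cold - y_warm)
            return t_warm + fraction * (t_cold - t_warm)
    return None

# Hypothetical measurements: injury rises as the treatment gets colder.
temps = [0.0, -5.0, -10.0, -15.0]
injury = [10.0, 30.0, 70.0, 95.0]

lt50 = lt50_linear(temps, injury)
print(f"estimated lethal temperature: {lt50:.1f} C")
```

Because the estimate depends only on the two points that bracket the threshold, it is sensitive to how coarsely the temperature treatments are spaced, which is one reason a fitted curve can behave differently.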

Prediction of Forest Fire Hazardous Area Using Predictive Spatial Data Mining (예측적 공간 데이터 마이닝을 이용한 산불위험지역 예측)

  • Han, Jong-Gyu;Yeon, Yeon-Kwang;Chi, Kwang-Hoon;Ryu, Keun-Ho
    • The KIPS Transactions:PartD
    • /
    • v.9D no.6
    • /
    • pp.1119-1126
    • /
    • 2002
  • In this paper, we propose two predictive spatial data mining methods based on spatial statistics and apply them to predicting forest fire hazardous areas. These are the conditional probability and likelihood ratio methods. In these approaches, the prediction models and estimation procedures depend on the basic quantitative relationships between spatial data sets relevant to forest fire and selected past forest fire ignition areas. To produce forest fire hazardous area prediction maps using the two proposed methods and to evaluate their prediction power, we applied a FHR (Forest Fire Hazard Rate) and a PRC (Prediction Rate Curve), respectively. In a comparison of the prediction power of the two proposed prediction models, the likelihood ratio method is more powerful than the conditional probability method. The proposed model for predicting forest fire hazardous areas would help increase the efficiency of forest fire management, such as prevention of forest fire occurrence and effective placement of forest fire monitoring equipment and manpower.
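
The two measures named in this abstract can be sketched for a single categorical map layer as follows. This uses one common formulation, conditional probability as P(fire | class) and likelihood ratio as P(class | fire) / P(class | no fire), which is an assumption here because the abstract does not give the paper's exact definitions. The grid cells, class labels, and function name are hypothetical.

```python
from collections import Counter

def class_statistics(classes, fire):
    """Per-class conditional probability and likelihood ratio.

    classes: class label of each grid cell (e.g. forest type).
    fire: True where a past ignition occurred in that cell.
    Returns {label: (conditional_probability, likelihood_ratio)}.
    """
    if len(classes) != len(fire):
        raise ValueError("classes and fire must have the same length")
    total = Counter(classes)
    burned = Counter(c for c, f in zip(classes, fire) if f)
    n_fire = sum(burned.values())
    n_nofire = len(classes) - n_fire

    stats = {}
    for label, n_cells in total.items():
        b = burned.get(label, 0)
        cond_prob = b / n_cells                      # P(fire | class)
        p_class_fire = b / n_fire if n_fire else 0.0  # P(class | fire)
        p_class_nofire = (n_cells - b) / n_nofire if n_nofire else 0.0
        # A class with no unburned cells has an unbounded ratio.
        ratio = (p_class_fire / p_class_nofire
                 if p_class_nofire > 0 else float("inf"))
        stats[label] = (cond_prob, ratio)
    return stats

# Hypothetical 10-cell study area.
classes = ["pine"] * 4 + ["oak"] * 4 + ["grass"] * 2
fire = [True, True, True, False,
        True, False, False, False,
        False, False]

stats = class_statistics(classes, fire)
for label, (cp, lr) in sorted(stats.items()):
    print(f"{label}: P(fire|class) = {cp:.2f}, likelihood ratio = {lr:.2f}")
```

A likelihood ratio above 1 marks a class that is over-represented among past ignition cells, and a ratio below 1 marks one that is under-represented.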

Probabilistic Evaluation on Prediction of the Strains by Single Surface Constitutive Model (확률론에 의한 Single Surface 구성모델의 변형률 예측능력 평가)

  • Jeong, Jin Seob;Song, Young Sun;Kim, Chan Kee
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.13 no.3
    • /
    • pp.163-172
    • /
    • 1993
  • A probabilistic approach, based on first-order approximations of the mean and variance, was employed to evaluate strain predictions made with Lade's single surface constitutive model. Several experiments, such as isotropic compression and drained triaxial compression tests, were conducted to examine the variability of the soil parameters for Lade's model. Taking into account the experimental results, such as the mean values and standard deviations of the soil parameters, a new probabilistic approach that explains the uncertainty of the computed strains is applied. The magnitude of the COV for each parameter and the correlation coefficient between two parameters can be used effectively to reduce the number of parameters in the model. It is concluded that Lade's single surface constitutive model is a superior model for predicting strain, because the COV of the strains is below 0.51.
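
A first-order approximation of the mean and variance, as named in this abstract, can be sketched generically as below. The response function `toy_response` is a made-up stand-in and is not Lade's model, and the parameter means and standard deviations are hypothetical. The gradient is taken by central differences, which is an implementation choice of this sketch.

```python
import math

def first_order_moments(g, means, stds, corr=None, rel_step=1e-6):
    """First-order (Taylor) approximation of the mean and variance of g(X).

    mean ~ g(mu)
    var  ~ sum_i sum_j (dg/dx_i)(dg/dx_j) * rho_ij * sigma_i * sigma_j
    """
    n = len(means)
    if corr is None:
        # Assume uncorrelated parameters when no matrix is given.
        corr = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    # Central-difference gradient evaluated at the mean point.
    grads = []
    for i in range(n):
        h = rel_step * max(abs(means[i]), 1.0)
        up = list(means)
        down = list(means)
        up[i] += h
        down[i] -= h
        grads.append((g(up) - g(down)) / (2.0 * h))

    mean = g(list(means))
    var = sum(grads[i] * grads[j] * corr[i][j] * stds[i] * stds[j]
              for i in range(n) for j in range(n))
    return mean, var

def toy_response(x):
    """Hypothetical response: the product of two parameters."""
    return x[0] * x[1]

means = [2.0, 5.0]   # hypothetical parameter means
stds = [0.2, 0.5]    # hypothetical standard deviations

mean, var = first_order_moments(toy_response, means, stds)
cov = math.sqrt(var) / abs(mean)  # coefficient of variation
print(f"mean = {mean:.3f}, variance = {var:.3f}, COV = {cov:.3f}")
```

Passing a non-identity `corr` matrix adds the cross terms, which is how a correlation between two parameters changes the variance of the computed response.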

A Three-dimensional Numerical Weather Model using Power Output Predict of Distributed Power Source (3차원 기상 수치 모델을 이용한 분산형 전원의 출력 예)

  • Jeong, Yoon-Su;Kim, Yong-Tae;Park, Gil-Cheol
    • Journal of Convergence Society for SMB
    • /
    • v.6 no.4
    • /
    • pp.93-98
    • /
    • 2016
  • Recently, projects related to the smart grid have been actively studied across the developed world. In particular, the problem of long-term stabilization of distributed power supply has been highlighted. In this paper, we propose a three-dimensional numerical weather prediction model that combines physical and statistical models and compares error rate information in order to predict the output of distributed power sources. The proposed model can support a stable power grid by improving the prediction information for distributed power. In the performance evaluation, the proposed model improved generation forecasting accuracy by 4.6% and temperature-compensated prediction accuracy by 3.5%. Finally, solar radiation correction accuracy improved by 1.1%.