• Title/Abstract/Keywords: In-Sample Prediction

559 search results (processing time: 0.027 s)

Predicting Administrative Issue Designation in KOSDAQ Market Using Machine Learning Techniques

  • 채승일;이동주
    • 아태비즈니스연구 / Vol. 13, No. 2 / pp.107-122 / 2022
  • Purpose - This study aims to develop machine learning models that predict administrative issue designation in the KOSDAQ market using financial data. Design/methodology/approach - Applying four classification techniques (logistic regression, support vector machine, random forest, and gradient boosting) to a matched sample of 536 firms over an eight-year period, the authors develop prediction models and explore their practicality. Findings - The resulting four binary selection models show satisfactory overall classification performance on several measures, including AUC (area under the receiver operating characteristic curve), accuracy, F1-score, and top quartile lift, while the ensemble models (random forest and gradient boosting) outperform the others on most measures. Research implications or Originality - Although an assessment of a firm's administrative issue potential is critical information for investors and financial institutions, detailed empirical investigation has lagged behind. This research fills that gap in the literature by proposing parsimonious prediction models based on a few financial variables and validating their applicability.
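Two of the measures named in this abstract, AUC and top quartile lift, can be computed from labels and scores alone. The sketch below is illustrative only: the function names, the toy data, and the lift definition (positive rate among the top 25% of scores divided by the overall positive rate) are assumptions, since the abstract does not define the measures.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_quartile_lift(labels, scores):
    """Positive rate in the top 25% of scores relative to the overall rate (assumed definition)."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    k = max(1, len(ranked) // 4)
    top_rate = sum(label for _, label in ranked[:k]) / k
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


# Hypothetical data: 1 = designated as an administrative issue, 0 = not designated.
labels = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

print(round(auc(labels, scores), 3))
print(round(top_quartile_lift(labels, scores), 3))
```

A lift above 1 means the highest-scored quarter of firms contains a larger share of positives than the sample as a whole.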

Time-Frequency Analysis of Electrohysterogram for Classification of Term and Preterm Birth

  • Ryu, Jiwoo;Park, Cheolsoo
    • IEIE Transactions on Smart Processing and Computing / Vol. 4, No. 2 / pp.103-109 / 2015
  • This paper proposes a novel method for classifying term and preterm birth based on time-frequency analysis of the electrohysterogram (EHG) using multivariate empirical mode decomposition (MEMD). EHG is promising for preterm birth prediction because it is low-cost and accurate compared with other preterm birth prediction methods, such as tocodynamometry (TOCO). Previous studies on preterm birth prediction applied prefiltering based on Fourier analysis of the EHG, followed by feature extraction and classification, even though Fourier analysis is suboptimal for biomedical signals such as EHG because of their nonlinearity and nonstationarity. The proposed method therefore applies MEMD-based prefiltering instead of Fourier-based prefilters before extracting the sample entropy feature and classifying the term and preterm birth groups. For the evaluation, the proposed method and a Fourier prefiltering-based method were compared on the Physionet term-preterm EHG database. The results showed that the area under the curve (AUC) of the receiver operating characteristic (ROC) increased by 0.0351 when MEMD was used instead of the Fourier-based prefilter.

Enhanced planar prediction using intra-inter reference sample in VVC CIIP mode

  • 남건욱;이종석;김민섭;심동규
    • 한국방송∙미디어공학회: Conference Proceedings / 한국방송∙미디어공학회 2020 Summer Conference / pp.249-250 / 2020
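  • This paper proposes an enhanced PLANAR prediction method for the intra prediction stage of Combined Inter-intra Prediction (CIIP), an inter prediction mode in VVC. CIIP generates the final prediction signal as a weighted sum of the inter prediction signal and the intra prediction signal produced by PLANAR mode. The proposed method uses the prediction samples generated by inter prediction as the right and bottom reference samples in the PLANAR prediction process. The subsequent PLANAR prediction and weighted sum that produce the prediction signal are the same as in the existing CIIP. To evaluate its performance, the proposed method was implemented in VTM 9.0, the VVC reference software. Compared with the original VTM 9.0, it shows a 0.01% coding performance loss for the luma component and coding performance gains of 0.17% and 0.13% for the two chroma components, respectively.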
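The idea in this abstract can be sketched as a planar interpolation that takes its right and bottom references from the inter prediction block, followed by a weighted blend. This is a simplified floating-point illustration, not the VTM implementation: the function names are hypothetical, and the choice of the last column and last row of the inter block as the right and bottom references, as well as the default blend weight, are assumptions the abstract does not spell out.

```python
def planar_with_inter_refs(top, left, inter):
    """Planar-style intra prediction using inter samples as right/bottom references.

    top:   reconstructed samples above the block (length = width)
    left:  reconstructed samples left of the block (length = height)
    inter: inter prediction block as a list of rows
    """
    height, width = len(inter), len(inter[0])
    pred = []
    for y in range(height):
        right = inter[y][width - 1]  # assumed right reference for this row
        row = []
        for x in range(width):
            bottom = inter[height - 1][x]  # assumed bottom reference for this column
            hor = ((width - 1 - x) * left[y] + (x + 1) * right) / width
            ver = ((height - 1 - y) * top[x] + (y + 1) * bottom) / height
            row.append((hor + ver) / 2)
        pred.append(row)
    return pred


def ciip_blend(intra, inter, intra_weight=2, total=4):
    """Weighted sum of intra and inter predictions (weights are illustrative)."""
    return [
        [(intra_weight * a + (total - intra_weight) * b) / total for a, b in zip(ra, rb)]
        for ra, rb in zip(intra, inter)
    ]


# Hypothetical 4x4 block: flat neighbours and a slightly brighter inter prediction.
top = [100, 100, 100, 100]
left = [100, 100, 100, 100]
inter = [[104] * 4 for _ in range(4)]

intra = planar_with_inter_refs(top, left, inter)
final = ciip_blend(intra, inter)
```

With these references the intra prediction moves gradually from the reconstructed neighbours at the top-left toward the inter prediction at the bottom-right corner.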


Deterioration Prediction Model of Water Pipes Using Fuzzy Techniques

  • 최태호;최민아;이현동;구자용
    • 상하수도학회지 / Vol. 30, No. 2 / pp.155-165 / 2016
  • Pipe Deterioration Prediction (PDP) and Pipe Failure Risk Prediction (PFRP) models were developed to predict deterioration and failure risk in water mains using fuzzy techniques and a Markov process. The two models were used to prioritize repair and replacement by predicting the deterioration degree, deterioration rate, failure possibility, and remaining life in a study sample of 32 water mains. In an analysis based on conservative risk with a medium policy risk, the remaining life for 30 of the 32 water mains was less than 5 years for 2 mains (7%), 5-10 years for 8 (27%), 10-15 years for 7 (23%), 15-20 years for 5 (17%), 20-25 years for 5 (17%), and 25 years or more for 2 (7%).

A study on the Prediction of Indoor Concentration due to Radon Exhalation from Domestic Building Materials

  • 이철민;곽윤경;이동현;이다정;조용석
    • 한국환경과학회지 / Vol. 24, No. 9 / pp.1131-1138 / 2015
  • Radon exhalation rates were determined for samples of concrete, gypsum board, marble, and tile, building materials used in domestic construction. Radon emanation was measured using the closed chamber method based on CR-39 nuclear track detectors. Radon concentrations in the apartments of 100 households in Seoul, Busan, and Gyeonggi Province were measured to verify the prediction model of indoor radon concentration. Among the four samples, concrete showed the largest radon exhalation rate, $0.34314\ \mathrm{Bq/(m^2 \cdot h)}$. The contribution of exhalation from concrete to the indoor radon concentration in a house was $31.006 \pm 7.529\ \mathrm{Bq/m^3}$. The difference between the predicted and measured concentrations was attributed to uncertainty arising from the model implementation.

The Prediction of Spacial Variability for Soil Information in Paddy Field

  • 정인규;성제훈;이충근;김상철;이용범
    • Journal of Biosystems Engineering / Vol. 29, No. 1 / pp.65-70 / 2004
  • This study was carried out to verify and predict soil information such as organic matter (OM) content, Mg content, and soil pH. The predictability of spatial variation in the paddy field was examined by analyzing the various soil information. Prediction models for OM, pH, and Mg were developed using inverse distance weighted (IDW), triangulated irregular network (TIN), and Kriging models. The coefficients of determination of the linear and spherical Kriging models were 0.756 and 0.578, respectively, which were very low in comparison with other soil information. For the IDW and TIN models, the coefficients of determination were 1.000, and hence the performance of the models was found to be excellent. The developed models were validated using unknown soil samples obtained in 2000 and 2001. The analysis of the relationship between the measured and predicted pH gave a coefficient of determination of 0.9353. For the prediction of Mg, the coefficient of determination was more than 0.8. Since the coefficients of determination of the developed models for OM were relatively low, it may be difficult to predict OM content using the developed models. Further work is required to enhance the performance of the prediction models for soil information.
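IDW interpolation estimates a value as a distance-weighted average of nearby samples, and at a sampled location it returns the measured value exactly. The sketch below illustrates that property, which is one possible reason an in-sample coefficient of determination of 1.000 can arise for IDW. Whether the study evaluated the fit on the calibration points is not stated in the abstract, and the function name, the power of 2, and the pH readings are hypothetical.

```python
import math


def idw_predict(samples, query, power=2.0):
    """Inverse distance weighted estimate at `query` from (x, y, value) samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        dist = math.hypot(query[0] - x, query[1] - y)
        if dist == 0.0:
            # The query coincides with a sample point: return that sample's value.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical pH readings at four field positions (metres); not data from the paper.
samples = [(0.0, 0.0, 5.8), (10.0, 0.0, 6.1), (0.0, 10.0, 6.0), (10.0, 10.0, 6.4)]

# In-sample predictions reproduce the measurements exactly.
in_sample = [idw_predict(samples, (x, y)) for x, y, _ in samples]

# An out-of-sample point gets a weighted average of the surrounding samples.
centre = idw_predict(samples, (5.0, 5.0))
```

Because in-sample predictions are exact by construction, a held-out validation set is a more informative test of an IDW model than its fit to the calibration points.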

Prediction of Length of ICU Stay Using Data-mining Techniques: an Example of Old Critically Ill Postoperative Gastric Cancer Patients

  • Zhang, Xiao-Chun;Zhang, Zhi-Dan;Huang, De-Sheng
    • Asian Pacific Journal of Cancer Prevention / Vol. 13, No. 1 / pp.97-101 / 2012
  • Objective: With an aging population in China and advances in clinical medicine, the number of operations on old patients is increasing, which poses growing challenges for critical care medicine and geriatrics. This study was designed to describe the length of ICU stay in a single institution's experience of old critically ill gastric cancer patients after surgery, and to present a framework for incorporating data-mining techniques into its prediction. Methods: A retrospective design was adopted to collect consecutive data on patients aged 60 or over with a gastric cancer diagnosis after surgery in an adult intensive care unit of a medical university hospital in Shenyang, China, from January 2010 to March 2011. Patient characteristics and the length of ICU stay were analyzed by univariate and multivariate Cox regression to examine the relationship with potential candidate factors. A regression tree was constructed to predict the length of ICU stay and explore the important indicators. Results: Multivariate Cox analysis found that shock and the need for nutrition support were statistically significant risk factors for prolonged ICU stay. Altogether, eight variables entered the regression model: age, APACHE II score, SOFA score, shock, respiratory system dysfunction, circulation system dysfunction, diabetes, and nutrition support need. The regression tree indicated comorbidity of two or more kinds of shock as the most important factor for prolonged ICU stay in the studied sample. Conclusions: Comorbidity of two or more kinds of shock is the most important factor for length of ICU stay in the studied sample. Since ICU patient characteristics differ between wards and hospitals, intensivists should consider data-mining techniques as a tool for predicting length of ICU stay.

Nondestructive determination of physico-chemical properties in compost by NIRS

  • Seo, Sang-Hyun;Lee, Chang-Hee;Park, Sung-Hun;Cho, Rae-Kwang;Park, Woo-Churl
    • 한국근적외분광분석학회: Conference Proceedings / 한국근적외분광분석학회 2001 (NIR-2001) / pp.1622-1622 / 2001
  • The purpose of this research was to develop a reflection technique using near infrared (NIR) radiation for estimating the physico-chemical properties of compost. The composts (cattle, pig, chicken, and waste composts) were air dried and then ground to pass through a 0.5 or 2 mm sieve for the physico-chemical and spectroscopic determinations. The physico-chemical properties of the composts showed high values: moisture (30-60%), T-N (0.8-2.9%), organic matter (29-89%), pH (5.89-9.60), K$_2$O (0.27-5.66%), P$_2$O$_5$ (0.07-2.62%), CaO (0.03-4.80%), MgO (0.09-1.56%), NaCl (0.01-1.13%), and EC (1.41-13.76 dS/m). In general, when accuracy and precision of prediction are similar, a simple calibration and prediction method should be selected for determining the physico-chemical properties of compost. It should be remembered that the NIRS approach will never replace the traditional methods. However, the NIRS technique may be an effective method for rapid and nondestructive measurement of a large number of compost samples. Near infrared reflectance spectra of the composts were obtained with an Infra Alyzer 500 scanning spectrophotometer at 2 nm intervals from 1100 to 2500 nm. Multiple linear regression (MLR) or partial least squares regression (PLSR) was used to evaluate an NIRS method for the rapid and nondestructive determination of physico-chemical properties and humic acid contents in composts. The standard error of prediction (SEP) did not differ much between the finely sized samples (<0.5 mm) and the coarsely sized samples (<2 mm). The filter-type NIR instrument estimated the compost properties with the same accuracy as the monochromator scanning type. The results indicate that NIR spectroscopy can be used as a routine testing method for the nondestructive, quantitative determination of OM, moisture, T-N, color, pH, and cation content in compost samples.


Purchase Prediction Model using the Support Vector Machine

  • 안현철;한인구;김경재
    • 지능정보연구 / Vol. 11, No. 3 / pp.69-81 / 2005
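  • Customer relationship management (CRM) has become an essential corporate strategy for firms to survive in a fiercely competitive environment. There are many approaches to CRM, but the most basic is to predict accurately which product or product group a given customer will purchase. Purchase prediction models based on traditional data mining techniques are already widely applied in practice, both in Korea and abroad. However, traditional techniques have often been criticized for relatively low accuracy or for the difficulty of building and maintaining the models. As an alternative that addresses these problems, this study builds a customer purchase prediction model using the Support Vector Machine (SVM), which is known to offer very high predictive power together with good generalization ability. To assess the suitability of SVM as a tool for purchase prediction, its performance was compared with that of traditional techniques, namely logistic regression and artificial neural networks. The results confirmed that SVM performed relatively better than the other techniques.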


Measurement of Lipid Content of Compost in the fermentation Process using Near-Infrared Spectroscopy

  • Suehara, Ken-Ichiro;Masui, Daisuke;Nakano, Yasuhisa;Yano, Takuo
    • 한국근적외분광분석학회: Conference Proceedings / 한국근적외분광분석학회 2001 (NIR-2001) / pp.1254-1254 / 2001
  • Near infrared spectroscopy (NIRS) was applied to determine the lipid content of compost during compost fermentation of tofu (soybean-curd) refuse. Reflected rays in the wavelength range between 800 and 2500 nm were measured at 2 nm intervals. Lipid absorption was observed at four wavelengths, 1208, 1712, 2312, and 2352 nm, in the second derivative spectra. To formulate a calibration equation, a multiple linear regression analysis was carried out between the near-infrared spectral data and the lipid content of the calibration sample set (n=60), obtained using a Soxhlet extraction method. For the calibration equation for lipid prediction, the multiple correlation coefficient (R) was 0.975 when using the wavelengths of 1208 and 1712 nm. To validate the calibration equation, the lipid content of a validation sample set (n=35) not used for formulating the equation was calculated with it and compared with the values obtained using the Soxhlet extraction method. Good agreement was observed between the results of the Soxhlet extraction method and those of the NIRS method. The simple correlation coefficient (r) and standard error of prediction (SEP) were 0.964 and 0.815%, respectively. The NIRS method was then applied to a compost fermentation in which the time course of the lipid content was measured, and good results were obtained. The study indicates that NIRS is a useful method for process management of the compost fermentation of tofu refuse.
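The two validation statistics quoted in this abstract, the simple correlation coefficient (r) and the standard error of prediction (SEP), can be computed from paired reference and predicted values. The sketch below assumes the bias-corrected definition of SEP (the standard deviation of the prediction errors); some texts use the root mean square error instead, and the abstract does not say which was used. The function names and sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Simple (Pearson) correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def sep(reference, predicted):
    """Bias-corrected standard error of prediction (assumed definition)."""
    errors = [p - r for r, p in zip(reference, predicted)]
    bias = sum(errors) / len(errors)
    return math.sqrt(sum((e - bias) ** 2 for e in errors) / (len(errors) - 1))


# Hypothetical lipid contents (%): reference method versus NIRS prediction.
reference = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print(round(pearson_r(reference, predicted), 3))
print(round(sep(reference, predicted), 3))
```

Computing both statistics on a validation set that was not used for calibration, as the abstract describes, gives an out-of-sample estimate rather than an in-sample fit.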
