• Title/Abstract/Keywords: mean squared error (MSE)


Different penalty methods for assessing interval from first to successful insemination in Japanese Black heifers

  • Setiaji, Asep;Oikawa, Takuro
    • Asian-Australasian Journal of Animal Sciences / Vol. 32, No. 9 / pp. 1349-1354 / 2019
  • Objective: The objective of this study was to determine the best approach for handling missing records of first to successful insemination (FS) in Japanese Black heifers. Methods: Of the 2,367 records of heifers born between 2003 and 2015 that were used, 206 (8.7%), from open heifers, were missing. Four penalty methods based on the number of inseminations were set as follows: C1, FS average according to the number of inseminations; C2, a constant number of days, 359; C3, maximum number of FS days for each insemination; and C4, average of FS at the last insemination and FS of C2. A fifth method, C5, was generated by adding a constant number (21 d) to the highest number of FS days in each contemporary group. The bootstrap method was used to compare the 5 methods in terms of bias, mean squared error (MSE), and coefficient of correlation between the estimated breeding value (EBV) of non-censored data and censored data. Three percentages of missing records (5%, 10%, and 15%) were investigated using the random censoring scheme. A univariate animal model was used for the genetic analysis. Results: Heritability of FS in non-censored data was $0.012 \pm 0.016$, slightly lower than the average estimate from the five penalty methods. C1, C2, and C3 showed lower standard errors of estimated heritability but gave inconsistent results across percentages of missing records. C4 showed moderate standard errors that were more stable across all percentages of missing records, whereas C5 showed the highest standard errors compared with non-censored data. The MSE of C4 heritability was $0.633 \times 10^{-4}$, $0.879 \times 10^{-4}$, $0.876 \times 10^{-4}$, and $0.866 \times 10^{-4}$ for 5%, 8.7%, 10%, and 15% of missing records, respectively. Thus, C4 showed the lowest and most stable MSE of heritability; the coefficient of correlation for EBV was 0.88, 0.93, and 0.90 for heifer, sire, and dam, respectively. Conclusion: C4 demonstrated the highest positive correlation with the non-censored data set and was consistent across percentages of missing records. We concluded that C4 was the best penalty method for missing records because of the stable values of the estimated parameters and the highest coefficient of correlation.
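
The bias and MSE comparison described in this abstract can be sketched as follows. This is a minimal illustration, assuming each bootstrap replicate yields one parameter estimate that is compared against a reference value from the non-censored data; the function names and the replicate values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def bias(estimates, reference):
    """Mean estimate across replicates minus the reference value."""
    return mean(estimates) - reference


def mse(estimates, reference):
    """Mean squared deviation of the replicate estimates from the reference."""
    return mean([(e - reference) ** 2 for e in estimates])


if __name__ == "__main__":
    # Hypothetical replicate estimates, for illustration only.
    replicates = [0.010, 0.015, 0.020, 0.011]
    reference = 0.012
    print("bias:", bias(replicates, reference))
    print("MSE:", mse(replicates, reference))
```

A lower and more stable MSE across censoring percentages is the criterion the abstract uses to prefer one penalty method over another.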

Prediction of Postoperative Lung Function in Lung Cancer Patients Using Machine Learning Models

  • Oh Beom Kwon;Solji Han;Hwa Young Lee;Hye Seon Kang;Sung Kyoung Kim;Ju Sang Kim;Chan Kwon Park;Sang Haak Lee;Seung Joon Kim;Jin Woo Kim;Chang Dong Yeo
    • Tuberculosis and Respiratory Diseases / Vol. 86, No. 3 / pp. 203-215 / 2023
  • Background: Surgical resection is the standard treatment for early-stage lung cancer. Since postoperative lung function is related to mortality, predicted postoperative lung function is used to determine the treatment modality. The aim of this study was to evaluate the predictive performance of linear regression and machine learning models. Methods: We extracted data from the Clinical Data Warehouse and developed three sets: set I, the linear regression model; set II, machine learning models omitting the missing data; and set III, machine learning models imputing the missing data. Six machine learning models were implemented: the least absolute shrinkage and selection operator (LASSO), Ridge regression, ElasticNet, Random Forest, eXtreme gradient boosting (XGBoost), and the light gradient boosting machine (LightGBM). The forced expiratory volume in 1 second measured 6 months after surgery was defined as the outcome. Five-fold cross-validation was performed for hyperparameter tuning of the machine learning models. The dataset was split into training and test datasets at a 70:30 ratio. Implementation was done after dataset splitting in set III. Predictive performance was evaluated by $R^2$ and mean squared error (MSE) in the three sets. Results: A total of 1,487 patients were included in sets I and III, and 896 patients were included in set II. In set I, the $R^2$ value was 0.27. In set II, LightGBM was the best model, with the highest $R^2$ value of 0.5 and the lowest MSE of 154.95. In set III, LightGBM was the best model, with the highest $R^2$ value of 0.56 and the lowest MSE of 174.07. Conclusion: The LightGBM model showed the best performance in predicting postoperative lung function.
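
The evaluation setup in this abstract (a 70:30 split scored by R2 and MSE) can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names are hypothetical, the shuffle seed is arbitrary, and the study's actual tooling is not described in the abstract.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Both metrics would be computed on the held-out 30% only, so that the reported scores reflect data the model did not see during fitting or tuning.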

전도성 직물을 이용한 단일 리드 심전도 측정 및 실시간 심전도 유도 호흡 추출 방법에 관한 연구 (Real Time ECG Derived Respiratory Extraction from Heart Rate for Single Lead ECG Measurement using Conductive Textile Electrode)

  • 이계형;박성빈;윤형로
    • 대한전기학회논문지:시스템및제어부문D / Vol. 55, No. 7 / pp. 335-343 / 2006
  • We designed a system that measures single-channel ECG with two electrodes and extracts, in real time, an ECG-derived respiration (EDR) signal that is more closely related to respiration, while remaining comfortable for the subject through the use of conductive textile. Assuming that the RL electrode and the potential-measurement electrodes are coupled through an RC model, we designed the RL drive output to feed back to the two electrodes in order to reduce the common-mode signal. The conductive textile used for the two ECG electrodes offered more comfort during overnight sleep in bed than methods that use attached electrodes. In traditional single-lead EDR, the R-wave point or the QRS-interval area is used for EDR estimation; this is, so to speak, an amplitude modulation (AM) method for EDR. Alternatively, the R-R interval can be used in a frequency modulation (FM) method based on respiratory sinus arrhythmia (RSA). To evaluate the performance of AM EDR and FM EDR, ECG lead III was measured from 14 subjects. Each EDR was compared with both the temperature around the nose (direct measurement of respiration) and the respiration signal from a thoracic belt (indirect measurement of respiration) in terms of mean squared error (MSE), cross-correlation (Xcorr), and coherence. The upsampling interpolation technique of multirate signal processing was applied to interpolate the data instead of cubic spline interpolation. As a result, we showed that real-time EDR extraction can be implemented on a microcontroller.

저장온도에 따른 마른김(Pyropia pseudolinearis)의 Bacillus cereus 성장예측모델 개발 (Predictive Growth Models of Bacillus cereus on Dried Laver Pyropia pseudolinearis as Function of Storage Temperature)

  • 최만석;김지윤;전은비;박신영
    • 한국수산과학회지 / Vol. 53, No. 5 / pp. 699-706 / 2020
  • Predictive models in food microbiology are used for predicting microbial growth or death rates using mathematical and statistical tools considering the intrinsic and extrinsic factors of food. This study developed predictive growth models for Bacillus cereus on dried laver Pyropia pseudolinearis stored at different temperatures (5, 10, 15, 20, and 25℃). Primary models developed for specific growth rate (SGR), lag time (LT), and maximum population density (MPD) indicated a good fit ($R^2 \ge 0.98$) with the Gompertz equation. The SGR values were 0.03, 0.08, and 0.12, and the LT values were 12.64, 4.01, and 2.17 h, at the storage temperatures of 15, 20, and 25℃, respectively. Secondary models for the same parameters were determined via nonlinear regression as follows: $\mathrm{SGR} = 0.0228 - 0.0069\,T_1 + 0.0005\,T_1^2$; $\mathrm{LT} = 113.0685 - 9.6256\,T_1 + 0.2079\,T_1^2$; $\mathrm{MPD} = 1.6630 + 0.4284\,T_1 - 0.0080\,T_1^2$ (where $T_1$ is the storage temperature). The appropriateness of the secondary models was validated using statistical indices, such as mean squared error (MSE < 0.01), bias factor ($0.99 \le B_f \le 1.07$), and accuracy factor ($1.01 \le A_f \le 1.14$). External validation was performed at three random temperatures, and the results were consistent with each other. Thus, these models may be useful for predicting the growth of B. cereus on dried laver.
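
The secondary models quoted in this abstract are quadratic in storage temperature, so they can be evaluated directly. The sketch below assumes the storage temperature is in ℃ and reads the flattened "T12" in the listing as the square of T1; the function name is hypothetical.

```python
def secondary_models(temp_c):
    """Evaluate the abstract's secondary models at one storage temperature (deg C)."""
    t = float(temp_c)
    sgr = 0.0228 - 0.0069 * t + 0.0005 * t ** 2    # specific growth rate
    lt = 113.0685 - 9.6256 * t + 0.2079 * t ** 2   # lag time (h)
    mpd = 1.6630 + 0.4284 * t - 0.0080 * t ** 2    # maximum population density
    return {"SGR": sgr, "LT": lt, "MPD": mpd}


if __name__ == "__main__":
    for temp in (15, 20, 25):
        print(temp, secondary_models(temp))
```

At 20℃ the SGR polynomial gives about 0.085, close to the 0.08 reported for the primary model at that temperature, which supports reading the last term as a squared temperature.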

영동대설 사례에 대한 MM5 강수량 모의의 통계적 검증 (Statistical Verification of Precipitation Forecasts from MM5 for Heavy Snowfall Events in Yeongdong Region)

  • 이정순;권태영;김덕래
    • 대기 / Vol. 16, No. 2 / pp. 125-139 / 2006
  • Precipitation forecasts from MM5 were verified for the period 1989-2001 over the Yeongdong region to characterize the tendencies of the model forecasts. We selected 57 events related to heavy snowfall in the Yeongdong region and classified them into three precipitation types: mountain type, cold-coastal type, and warm type. The threat score (TS), the probability of detection (POD), and the false-alarm rate (FAR) were computed for categorical verification, and the mean squared error (MSE) was computed as a scalar accuracy measure. The POD was 0.71, 0.69, and 0.55 for the warm, mountain, and cold-coastal types, respectively. In the quantitative verification, forecasts and observations matched relatively well for the mountain and cold-coastal types, whereas MM5 tended to overestimate precipitation for the warm type. Twelve events had a POD below 0.2: 2 mountain, 7 cold-coastal, and 3 warm-type events. In these events, most of the precipitation was distributed over the East Sea near the Yeongdong region. These events also occurred when easterlies in the lower troposphere were absent or very weak. Even when high-resolution sea surface temperature (about 18 km) was used as the boundary condition, the wind direction changed little compared with the run using low-resolution sea surface temperature (about 100 km).
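
The categorical scores named in this abstract are usually computed from a 2x2 contingency table of forecast versus observed events. The sketch below uses the common textbook definitions; the abstract does not state which FAR definition the authors used, so the false-alarm ratio (false alarms divided by all "yes" forecasts) is an assumption here, and the function name is hypothetical.

```python
def categorical_scores(hits, misses, false_alarms):
    """Return POD, FAR, and TS from contingency-table counts."""
    pod = hits / (hits + misses)                  # probability of detection
    far = false_alarms / (hits + false_alarms)    # false-alarm ratio (assumed definition)
    ts = hits / (hits + misses + false_alarms)    # threat score
    return {"POD": pod, "FAR": far, "TS": ts}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(categorical_scores(hits=6, misses=2, false_alarms=4))
```

Correct negatives do not enter any of the three scores, which is why they are often preferred for rare events such as heavy snowfall.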

LTE 시스템 채널 추정치의 후처리 기법 연구 (A Study on the Postprocessing of Channel Estimates in LTE System)

  • 유경렬
    • 전기학회논문지 / Vol. 60, No. 1 / pp. 205-213 / 2011
  • The Long Term Evolution (LTE) system is designed to provide a high-quality data service for fast-moving mobile users. It is based on Orthogonal Frequency Division Multiplexing (OFDM) and bases its channel estimation on training samples that are systematically embedded in the transmitted data. Either a preamble or a lattice type is used to distribute the training samples, and the latter is better suited to multipath fading channels whose channel frequency response (CFR) fluctuates rapidly with time. In the lattice-type structure, CFR estimation uses the least squares estimate (LSE) for each pilot sample, followed by interpolation in both the time and frequency domains to fill in the channel estimates for subcarriers corresponding to data samples. All interpolation schemes rely on the pilot estimates only, and thus their performance is bounded by the quality of the pilot estimates. However, additive noise gives rise to large fluctuations in the pilot estimates, especially in a communication environment with a low signal-to-noise ratio. These large fluctuations can be observed as alternating high values of the first forward differences (FFD) between pilot estimates. In this paper, we statistically analyze those FFD values and propose a postprocessing algorithm to suppress large fluctuations in the noisy pilot estimates. The proposed method is based on localized adaptive moving-average filtering. The performance of the proposed technique is verified in a multipath environment specified in the 3GPP LTE specification. It is shown that the mean-squared error (MSE) between the actual CFR and the pilot estimates can be reduced by up to 68% relative to the noisy pilot estimates.
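
The quantities this abstract refers to (least squares pilot estimates, first forward differences, moving-average smoothing, and MSE against the true CFR) can be sketched as below. This is not the paper's algorithm: the abstract does not describe the localized adaptive rule, so a plain fixed-width moving average stands in for it, and all function names and the `half_width` parameter are hypothetical.

```python
def ls_pilot_estimates(received, pilots):
    """Least squares estimate per pilot: received sample divided by known pilot."""
    return [y / x for y, x in zip(received, pilots)]


def first_forward_differences(h):
    """FFD between consecutive pilot estimates: h[k+1] - h[k]."""
    return [h[k + 1] - h[k] for k in range(len(h) - 1)]


def moving_average(h, half_width=1):
    """Fixed-width moving average; windows are truncated at the edges."""
    out = []
    for k in range(len(h)):
        lo = max(0, k - half_width)
        hi = min(len(h), k + half_width + 1)
        window = h[lo:hi]
        out.append(sum(window) / len(window))
    return out


def mse(estimates, truth):
    """Mean squared magnitude of the error between complex sequences."""
    return sum(abs(e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth)
```

Comparing `mse(raw, truth)` with `mse(moving_average(raw), truth)` on simulated noisy pilots is one way to see the kind of reduction the abstract reports, although the figure of up to 68% belongs to the paper's adaptive method, not to this stand-in.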

사전 부호화를 이용한 TEA 적응 등화기의 성능 개선에 관한 연구 (A Study on the Performance improvement of TEA adaptive equalizer using Precoding)

  • 임승각
    • 정보처리학회논문지C / Vol. 13C, No. 3 / pp. 369-374 / 2006
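  • This paper concerns improving the performance of an adaptive equalizer based on the TEA (Tricepstrum Equalization Algorithm), which uses higher-order statistics of the received signal. Adaptive equalizers are used at the receiver, mainly in communication channels with additive noise, phase distortion, and frequency-selective fading, to improve performance in terms of communication speed, synchronization maintenance, and BER; their characteristic is the inverse of the transfer function of the communication channel. In this paper, the TEA algorithm, which uses higher-order statistics (HOS), was adopted as the adaptive equalizer algorithm, and 16-QAM, a two-dimensional signaling scheme, was used as the target signal. By using Gray code when assigning signal points for the precoding of 16-QAM, computer simulation showed improved performance in residual intersymbol interference (residual ISI) and MSE, which indicate equalizer performance.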
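
Gray-coded signal point assignment for 16-QAM, as mentioned in this abstract, can be illustrated with a standard per-axis mapping. The abstract does not give the paper's actual assignment, so the mapping below is one common choice and all names are hypothetical. The property it demonstrates is that nearest-neighbour constellation points differ in exactly one bit.

```python
from itertools import product

# 2-bit Gray code per axis: adjacent amplitude levels differ in one bit.
GRAY_2BIT = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def map_16qam(bits):
    """Map 4 bits (b0, b1, b2, b3) to one complex 16-QAM point."""
    i_level = GRAY_2BIT[(bits[0], bits[1])]   # in-phase component
    q_level = GRAY_2BIT[(bits[2], bits[3])]   # quadrature component
    return complex(i_level, q_level)


def hamming(a, b):
    """Number of bit positions in which two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


CONSTELLATION = {bits: map_16qam(bits) for bits in product((0, 1), repeat=4)}


def neighbours_differ_by_one_bit():
    """Check the Gray property for every pair of nearest neighbours."""
    items = list(CONSTELLATION.items())
    for bits_a, point_a in items:
        for bits_b, point_b in items:
            if abs(point_a - point_b) == 2 and hamming(bits_a, bits_b) != 1:
                return False
    return True
```

With this property, the most likely symbol error (a decision on an adjacent point) corrupts only one of the four bits.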

회귀나무 모형을 이용한 패널데이터 분석 (Panel data analysis with regression trees)

  • 장영재
    • Journal of the Korean Data and Information Science Society
    • Vol. 25, No. 6 / pp. 1253-1262
    • /
    • 2014
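  • A regression tree is a nonparametric method that recursively partitions the space of the independent variables and seeks the best prediction of the dependent variable within each region. Since regression trees were first proposed, research has continued on flexible and diverse model fitting, such as logistic regression trees and quantile regression trees. More recently, advanced tree algorithms for panel data have also been proposed, including the RE-EM algorithm of Sela and Simonoff (2012) and GUIDE of Loh and Zheng (2013). This paper introduces each algorithm and examines its characteristics, and compares predictive performance on simulated data based on mean squared error. The analysis showed that the RE-EM algorithm had relatively better predictive performance. Applying this algorithm to industry-level panel data from the Business Survey Index showed that the factor with the greatest influence on recent business conditions was sales performance, and that within the top sales group, non-manufacturing firms judged business conditions more positively than manufacturing firms.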
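
The recursive partitioning that this abstract describes rests on one repeated step: choosing the split that minimizes squared error. The sketch below shows that single step for one predictor in the style of a basic CART regression tree; it is not RE-EM or GUIDE, and the function names are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, cost) minimizing left SSE + right SSE, or None."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best
```

A full tree applies `best_split` again to each resulting region until a stopping rule is met, and predicts the mean of the dependent variable in each leaf.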

Development of Diameter Growth Models by Thinning Intensity of Planted Quercus glauca Thunb. Stands

  • Jung, Su Young;Lee, Kwang Soo;Kim, Hyun Soo
    • 인간식물환경학회지 / Vol. 24, No. 6 / pp. 629-638 / 2021
  • Background and objective: This study was conducted to develop diameter growth models for thinned Quercus glauca Thunb. (QGT) stands, to inform production goals for treatment and to provide the information necessary for the systematic management of these stands. Methods: This study was conducted on QGT stands in which initial thinning was completed in 2013 to develop a treatment system. To analyze the tree growth and trait response for each thinning treatment, forestry surveys were conducted in 2014 and 2021, and a one-way analysis of variance (ANOVA) was performed. In addition, non-linear least squares regression with the PROC NLIN procedure was used to develop an optimal diameter growth model. Results: Based on growth and trait analyses, the height and height-to-diameter (H/D) ratio did not differ among treatment plots (p > .05). The diameter at breast height (DBH) was significantly larger in the heavy thinning (HT) treatment plot than in the control plot (p < .05). Among the diameter growth models developed for each treatment plot, the Gompertz polymorphic equation had the lowest mean squared error (MSE) in all treatment plots (control: 2.2381, light thinning: 0.8478, and heavy thinning: 0.8679), and the Shapiro-Wilk test indicated a normal distribution (p > .95), so it was selected as the equation for the diameter growth model. Conclusion: The findings of this study provide basic data for the systematic management of Quercus glauca Thunb. stands. It is necessary to establish permanent sample plots (PSP) that consider stand status, location conditions, and climatic environments.

기계학습 기반 지진 취약 철근콘크리트 골조에 대한 신속 내진성능 등급 예측모델 개발 연구 (Machine Learning-based Rapid Seismic Performance Evaluation for Seismically-deficient Reinforced Concrete Frame)

  • 강태욱;강재도;오근영;신지욱
    • 한국지진공학회논문집 / Vol. 28, No. 4 / pp. 193-203 / 2024
  • Existing reinforced concrete (RC) building frames constructed before seismic design was applied have seismically deficient structural details, and buildings with such details show brittle behavior, failing early because of low shear performance. Various reinforcement systems, such as fiber-reinforced polymer (FRP) jacketing systems, are being studied to reinforce seismically deficient RC frames. Because of the step-by-step modeling and interpretation process, existing seismic performance assessment and reinforcement design of buildings consume an enormous amount of labor and time. To overcome these shortcomings, various machine learning (ML) models were developed using input and output datasets for seismic loads and reinforcement details, built with the finite element (FE) model developed in previous studies. To assess the performance of the seismic performance prediction models developed in this study, the mean squared error (MSE), R-squared ($R^2$), and residuals of each model were compared. Overall, the applied ML models were found to predict the seismic performance of buildings rapidly and effectively under changes in load and reinforcement details, without overfitting. In addition, the best-fit model for each seismic performance class was selected by analyzing the per-class performance of the ML models.