• Title/Summary/Keyword: least squared error

Panel data analysis with regression trees (회귀나무 모형을 이용한 패널데이터 분석)

  • Chang, Youngjae
    • Journal of the Korean Data and Information Science Society / v.25 no.6 / pp.1253-1262 / 2014
  • A regression tree is a tree-structured solution in which a simple regression model is fitted to the data in each node produced by recursive partitioning of the predictor space. There have been many efforts to apply tree algorithms to various regression problems such as logistic regression and quantile regression. Recently, the algorithms have been extended to panel data analysis, for example the RE-EM algorithm by Sela and Simonoff (2012) and the extension of GUIDE by Loh and Zheng (2013). This paper briefly introduces the algorithms and compares the prediction accuracy of three methods. In general, RE-EM shows good prediction accuracy, with the lowest MSEs in the simulation study. A RE-EM tree fitted to business survey index (BSI) panel data shows that the sales BSI is the main factor affecting business entrepreneurs' economic sentiment. Within the relatively high sales group, the economic sentiment BSI of non-manufacturing industries is higher than that of manufacturing industries.
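
As a rough illustration of the recursive partitioning described in the abstract above, the sketch below finds the single split of one predictor that minimizes the summed squared error of the two child nodes. It assumes a constant (mean) model in each node, which is a simplification; it is not the RE-EM or GUIDE algorithm, and the names `sse` and `best_split` are hypothetical.

```python
def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, total_sse) for the best binary split of x.

    Assumption: each child node predicts the mean of its responses.
    Candidate thresholds are midpoints between consecutive distinct x values.
    """
    pairs = sorted(zip(x, y))
    best_threshold, best_total = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate equal predictor values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best_total:
            best_threshold, best_total = threshold, total
    return best_threshold, best_total


# Toy data with an obvious jump between x = 3 and x = 10.
x = [1, 2, 3, 10, 11, 12]
y = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8]
threshold, total = best_split(x, y)
```

A full tree would apply `best_split` recursively to each child node until a stopping rule is met.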

2.4kbps Speech Coding Algorithm Using the Sinusoidal Model (정현파 모델을 이용한 2.4kbps 음성부호화 알고리즘)

  • 백성기;배건성
    • The Journal of Korean Institute of Communications and Information Sciences / v.27 no.3A / pp.196-204 / 2002
  • Sinusoidal Transform Coding (STC) is a vocoding scheme based on a sinusoidal model of the speech signal. Low bit-rate speech coding based on the sinusoidal model represents and synthesizes speech using the fundamental frequency and its harmonics, the spectral envelope, and the phase in the frequency domain. In this paper, we propose a 2.4 kbps low-rate speech coding algorithm using the sinusoidal model of the speech signal. In the proposed coder, the pitch frequency is estimated by choosing the frequency that minimizes the mean squared error between speech synthesized from all spectral peaks and speech synthesized from the chosen frequency and its harmonics. The spectral envelope is estimated using the SEEVOC (Spectral Envelope Estimation VOCoder) algorithm and the discrete all-pole model. The phase information is obtained using the time of pitch pulse occurrence, i.e., the onset time, as well as the phase of the vocal tract system. Experimental results show that the synthetic speech preserves both the formant and phase information of the original speech very well. The coder was evaluated with a MOS test based on informal listening tests and achieved a MOS score above 3.1.

Development of Diameter Growth Models by Thinning Intensity of Planted Quercus glauca Thunb. Stands

  • Jung, Su Young;Lee, Kwang Soo;Kim, Hyun Soo
    • Journal of People, Plants, and Environment / v.24 no.6 / pp.629-638 / 2021
  • Background and objective: This study was conducted to develop diameter growth models for thinned Quercus glauca Thunb. (QGT) stands to inform production goals for treatment and provide the information necessary for the systematic management of these stands. Methods: This study was conducted on QGT stands in which initial thinning was completed in 2013 to develop a treatment system. To analyze tree growth and trait response for each thinning treatment, forestry surveys were conducted in 2014 and 2021, and a one-way analysis of variance (ANOVA) was performed. In addition, non-linear least squares regression with the PROC NLIN procedure was used to develop an optimal diameter growth model. Results: Based on growth and trait analyses, the height and height-to-diameter (H/D) ratio did not differ among treatment plots (p > .05). The diameter at breast height (DBH) in the heavy thinning (HT) treatment plot was significantly larger than in the control plot (p < .05). Among the diameter growth models developed for each treatment plot, the Gompertz polymorphic equation had the lowest mean squared error (MSE) in all treatment plots (control: 2.2381, light thinning: 0.8478, and heavy thinning: 0.8679), and the Shapiro-Wilk test indicated normality (p > .95), so it was selected as the equation for the diameter growth model. Conclusion: The findings of this study provide basic data for the systematic management of Quercus glauca Thunb. stands. It is necessary to construct permanent sample plots (PSP) that consider stand status, location conditions, and climatic environments.

A comparison of ATR-FTIR and Raman spectroscopy for the non-destructive examination of terpenoids in medicinal plants essential oils

  • Rahul Joshi;Sushma Kholiya;Himanshu Pandey;Ritu Joshi;Omia Emmanuel;Ameeta Tewari;Taehyun Kim;Byoung-Kwan Cho
    • Korean Journal of Agricultural Science / v.50 no.4 / pp.675-696 / 2023
  • Terpenoids, also referred to as terpenes, are a large family of naturally occurring chemical compounds present in the essential oils extracted from medicinal plants. In this study, a non-destructive methodology combining ATR-FT-IR (attenuated total reflectance-Fourier transform infrared) and Raman spectroscopy was developed to assess terpenoids in essential oils of medicinal plants from ten different geographical locations. Partial least squares regression (PLSR) and support vector regression (SVR) were used as machine learning methods, and a deep learning model, a one-dimensional convolutional neural network (1D CNN), was also developed for comparison. With a correlation coefficient (R²) of 0.999 and the lowest RMSEP (root mean squared error of prediction) of 0.006% for the prediction datasets, the SVR model built on FT-IR spectral data outperformed both the PLSR and 1D CNN models. For classifying essential oils derived from plants collected in various geographical regions, the SVM (support vector machine) classification model built on Raman spectroscopic data obtained an overall classification accuracy of 0.997%, which was superior to that of the FT-IR data (0.986%). Based on these results, we propose that FT-IR spectroscopy coupled with the SVR model has significant potential for the non-destructive identification of terpenoids in essential oils, compared with destructive chemical analysis methods.
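
The two metrics quoted in the abstract above can be computed as in the sketch below. It assumes the usual definitions: RMSEP as the root mean squared error on a held-out prediction set, and R² computed as one minus the ratio of residual to total sum of squares. The abstract does not state how its R² was calculated, and the sample values here are invented for illustration.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction on a held-out set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """R^2 as 1 - SS_res / SS_tot (an assumed definition, see lead-in)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Invented reference values and model predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
error = rmsep(y_true, y_pred)
fit = r_squared(y_true, y_pred)
```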

Development of Code-PPP Based on Multi-GNSS Using Compact SSR of QZSS-CLAS (QZSS-CLAS의 Compact SSR을 이용한 다중 위성항법 기반의 Code-PPP 개발)

  • Lee, Hae Chang;Park, Kwan Dong
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.38 no.6 / pp.521-531 / 2020
  • QZSS (Quasi-Zenith Satellite System) provides the CLAS (Centimeter Level Augmentation Service) through the satellite's L6 band. CLAS provides correction messages called C-SSR (Compact - State Space Representation) for GPS (Global Positioning System), Galileo and QZSS. In this study, CLAS messages were received using the Septentrio AsteRx4, a GPS receiver capable of receiving the L6 band, and the messages were decoded to acquire C-SSR. In addition, Multi-GNSS (Global Navigation Satellite System) Code-PPP (Precise Point Positioning) was developed to compensate for GNSS errors by applying C-SSR to the pseudo-range measurements of GPS, Galileo and QZSS. Non-linear least squares estimation was used to estimate the three-dimensional position of the receiver and the receiver time errors of the GNSS constellations. To evaluate the accuracy of the developed algorithms, static positioning was performed at TSK2 (Tsukuba), one of the IGS (International GNSS Service) sites, and kinematic positioning was performed while driving around the Ina River in Kawanishi. For the static positioning, the mean RMSE (Root Mean Square Error) over all data sets was 0.35 m in the horizontal direction and 0.57 m in the vertical direction. For the kinematic positioning, the accuracy was approximately 0.82 m in the horizontal direction and 3.56 m in the vertical direction compared to the RTK-FIX values of VRS.
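
The non-linear least squares step mentioned in the abstract above can be sketched as a Gauss-Newton iteration on pseudo-ranges. This is a minimal sketch, not the authors' implementation: it assumes a single receiver clock bias (the paper estimates one per constellation), pseudo-ranges already corrected and expressed in metres, and made-up satellite coordinates. `solve_linear` and `estimate_position` are hypothetical helper names.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def estimate_position(sats, pseudoranges, iterations=10):
    """Gauss-Newton estimate of [x, y, z, clock_bias], all in metres."""
    est = [0.0, 0.0, 0.0, 0.0]  # start at the Earth's centre, zero bias
    for _ in range(iterations):
        jac, resid = [], []
        for (sx, sy, sz), rho in zip(sats, pseudoranges):
            dx, dy, dz = est[0] - sx, est[1] - sy, est[2] - sz
            rng = math.sqrt(dx * dx + dy * dy + dz * dz)
            # Partial derivatives of (range + bias) w.r.t. x, y, z, bias.
            jac.append([dx / rng, dy / rng, dz / rng, 1.0])
            resid.append(rho - (rng + est[3]))
        # Normal equations: (J^T J) delta = J^T r
        jtj = [[sum(row[i] * row[j] for row in jac) for j in range(4)]
               for i in range(4)]
        jtr = [sum(row[i] * r for row, r in zip(jac, resid)) for i in range(4)]
        delta = solve_linear(jtj, jtr)
        est = [e + d for e, d in zip(est, delta)]
    return est


# Synthetic, noise-free scenario (all coordinates are invented).
truth = (3000e3, 4000e3, 4000e3)
bias = 100.0
sats = [(15000e3, 15000e3, 15000e3), (5000e3, 20000e3, 16000e3),
        (20000e3, 5000e3, 16000e3), (10000e3, 22000e3, 9000e3),
        (22000e3, 10000e3, 9000e3), (3000e3, 10000e3, 24000e3)]
pseudoranges = [math.sqrt(sum((s - t) ** 2 for s, t in zip(sat, truth))) + bias
                for sat in sats]
est = estimate_position(sats, pseudoranges)
```

With real measurements the residuals would not vanish, and a weighted solution or a convergence check on `delta` would usually replace the fixed iteration count.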

Time- and Frequency-Domain Block LMS Adaptive Digital Filters: Part Ⅱ - Performance Analysis (시간영역 및 주파수영역 블럭적응 여파기에 관한 연구 : 제 2 부- 성능분석)

  • Lee, Jae-Chon;Un, Chong-Kwan
    • The Journal of the Acoustical Society of Korea / v.7 no.4 / pp.54-76 / 1988
  • In Part Ⅰ of the paper, we developed various block least mean-square (BLMS) adaptive digital filters (ADFs) based on a unified matrix treatment. In Part Ⅱ we analyze the convergence behaviors of the self-orthogonalizing frequency-domain BLMS (FBLMS) ADF and the unconstrained FBLMS (UFBLMS) ADF for both the overlap-save and overlap-add sectioning methods. We first show that, unlike the FBLMS ADF with a constant convergence factor, the convergence behavior of the self-orthogonalizing FBLMS ADF is governed by the same autocorrelation matrix as that of the UFBLMS ADF. We then show that the optimum solution of the UFBLMS ADF is the same as that of the constrained FBLMS ADF when the filter length is sufficiently long. The mean of the weight vector of the UFBLMS ADF is also shown to converge to the optimum Wiener weight vector under a proper condition. However, the steady-state mean-squared error (MSE) of the UFBLMS ADF turns out to be slightly worse than that of the constrained algorithm if the same convergence constant is used in both cases. On the other hand, when the filter length is not sufficiently long, the constrained FBLMS ADF performs poorly, whereas the performance of the UFBLMS ADF can be improved to some extent by utilizing its extended filter-length capability. As for the self-orthogonalizing FBLMS ADF, we study how the autocorrelation matrix can be approximated by a diagonal matrix in the frequency domain. We also analyze the steady-state MSEs of the self-orthogonalizing FBLMS ADFs with and without the constant. Finally, we present various simulation results to verify our analytical results.
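
For orientation, the basic time-domain block LMS update that the abstract above builds on can be sketched as follows. This is only the plain time-domain form, not the frequency-domain FBLMS or UFBLMS filters analyzed in the paper. Averaging the gradient over the block, the step size, and the function name `block_lms` are assumptions made for this illustration.

```python
import random


def block_lms(x, d, num_taps, block_len, mu):
    """Time-domain block LMS: weights are held fixed within each block and
    updated once per block with the block-averaged error gradient."""
    w = [0.0] * num_taps
    for start in range(num_taps - 1, len(x) - block_len + 1, block_len):
        grad = [0.0] * num_taps
        for n in range(start, start + block_len):
            u = [x[n - k] for k in range(num_taps)]  # most recent sample first
            e = d[n] - sum(wk * uk for wk, uk in zip(w, u))
            for k in range(num_taps):
                grad[k] += e * u[k]
        w = [wk + mu * g / block_len for wk, g in zip(w, grad)]
    return w


# System identification demo: recover an invented FIR response h
# from white Gaussian input and the noise-free filter output.
random.seed(0)
h = [0.5, -0.3, 0.2]
x = [random.gauss(0.0, 1.0) for _ in range(4000)]
d = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
     for n in range(len(x))]
w = block_lms(x, d, num_taps=3, block_len=8, mu=0.1)
```

Because the desired signal here contains no noise, the weights approach `h` closely; with noisy data the steady-state MSE depends on the step size, which is the kind of trade-off the paper analyzes.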

Automatic Determination of the Azimuth Angle of Reflectors in Borehole Radar Reflection Data Using Direction-finding Antenna (방향탐지 안테나를 이용한 시추공 레이다 반사법 탐사에 있어서 반사층 방위각의 자동 결정)

  • Kim Jung-Ho;Cho Seong-Jun;Yi Myeong-Jong;Chung Seung-Hwan
    • Geophysics and Geophysical Exploration / v.1 no.3 / pp.176-182 / 1998
  • The borehole radar reflection survey can image underground structures with high resolution; however, a dipole antenna alone gives no information on the orientation of the reflectors. A direction-finding antenna system is commonly used to solve this problem. However, interpreting data from a direction-finding antenna can be time-consuming, and the azimuth sometimes cannot be determined unambiguously. To address this, we developed an automatic azimuth-finding scheme for reflectors in borehole radar reflection data acquired with a direction-finding antenna. The algorithm finds the azimuthal angle that gives the maximum reflection amplitude in the least-squared error sense. The algorithm was applied to field data acquired in a quarry mine. Nearly all of the reflectors could be located in three dimensions, and they coincide with the known geological structures and man-made discontinuities.

Prediction of Postoperative Lung Function in Lung Cancer Patients Using Machine Learning Models

  • Oh Beom Kwon;Solji Han;Hwa Young Lee;Hye Seon Kang;Sung Kyoung Kim;Ju Sang Kim;Chan Kwon Park;Sang Haak Lee;Seung Joon Kim;Jin Woo Kim;Chang Dong Yeo
    • Tuberculosis and Respiratory Diseases / v.86 no.3 / pp.203-215 / 2023
  • Background: Surgical resection is the standard treatment for early-stage lung cancer. Since postoperative lung function is related to mortality, predicted postoperative lung function is used to determine the treatment modality. The aim of this study was to evaluate the predictive performance of linear regression and machine learning models. Methods: We extracted data from the Clinical Data Warehouse and developed three sets: set I, the linear regression model; set II, machine learning models omitting the missing data; and set III, machine learning models imputing the missing data. Six machine learning models were implemented: the least absolute shrinkage and selection operator (LASSO), Ridge regression, ElasticNet, Random Forest, eXtreme gradient boosting (XGBoost), and the light gradient boosting machine (LightGBM). The forced expiratory volume in 1 second measured 6 months after surgery was defined as the outcome. Five-fold cross-validation was performed for hyperparameter tuning of the machine learning models. The dataset was split into training and test datasets at a 70:30 ratio. Implementation was done after dataset splitting in set III. Predictive performance was evaluated by R² and mean squared error (MSE) in the three sets. Results: A total of 1,487 patients were included in sets I and III, and 896 patients were included in set II. In set I, the R² value was 0.27. In set II, LightGBM was the best model, with the highest R² value of 0.5 and the lowest MSE of 154.95. In set III, LightGBM was the best model, with the highest R² value of 0.56 and the lowest MSE of 174.07. Conclusion: The LightGBM model showed the best performance in predicting postoperative lung function.