• Title/Summary/Keyword: non-parametric regression model

Transitional Dark Energy - A solution to the H0 tension

  • Keeley, Ryan
    • The Bulletin of The Korean Astronomical Society / v.44 no.2 / pp.59.2-59.2 / 2019
  • In this talk, I will explain the implications of a rapid appearance of dark energy between the redshifts ($z$) of one and two on the expansion rate and growth of perturbations. Using both Gaussian process regression and a parametric model, I show that this is the preferred solution to the current set of low-redshift ($z<3$) distance measurements if $H_0=73~\rm km\,s^{-1}\,Mpc^{-1}$ to within 1\% and the high-redshift expansion history is unchanged from the $\Lambda$CDM inference by the Planck satellite. Dark energy was effectively non-existent around $z=2$, but its density is close to the $\Lambda$CDM model value today, with an equation of state greater than $-1$ at $z<0.5$. If sources of clustering other than matter are negligible, we show that this expansion history leads to slower growth of perturbations at $z<1$, compared to $\Lambda$CDM, that is measurable by upcoming surveys and can alleviate the $\sigma_8$ tension between the Planck CMB temperature and low-redshift probes of the large-scale structure.

Improvement of generalization of linear model through data augmentation based on Central Limit Theorem (데이터 증가를 통한 선형 모델의 일반화 성능 개량 (중심극한정리를 기반으로))

  • Hwang, Doohwan
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.19-31 / 2022
  • In machine learning, the data are usually divided into training data and test data; the model is trained on the training data, and the test data are used to assess its accuracy and generalization performance. A model with low generalization performance predicts new data with significantly reduced accuracy and is said to be overfit. This study proposes a method that generates training data based on the central limit theorem, combines them with the existing training data to increase normality, and trains models on the combined data to improve generalization performance. To this end, data were generated using the sample mean and standard deviation of each feature, drawing on the central limit theorem, and new training data were constructed by combining the generated data with the existing training data. To determine the degree of increase in normality, the Kolmogorov-Smirnov normality test was conducted, and it was confirmed that the new training data showed increased normality compared to the existing data. Generalization performance was measured as the difference in prediction accuracy between the training data and the test data. When the method was applied to K-Nearest Neighbors (KNN), Logistic Regression, and Linear Discriminant Analysis (LDA), generalization performance improved for KNN, a non-parametric technique, and for LDA, which assumes normality when building the model.
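
The augmentation step described in this abstract can be illustrated with a minimal sketch. It is one possible reading of the abstract, not the author's implementation: it assumes synthetic rows are drawn independently per feature from a normal distribution with that feature's sample mean and standard deviation, and it covers features only, because the abstract does not say how labels are assigned to generated rows. The function name `augment_with_normal_samples` and its parameters are hypothetical.

```python
import random
import statistics


def augment_with_normal_samples(rows, n_new, seed=0):
    """Append n_new synthetic rows to a list of numeric feature rows.

    Assumption (not confirmed by the abstract): each synthetic value is drawn
    from a normal distribution whose mean and standard deviation are the
    sample statistics of the corresponding feature column.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    columns = list(zip(*rows))  # transpose: one tuple per feature
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]  # needs >= 2 rows
    synthetic = [
        [rng.gauss(m, s) for m, s in zip(means, stds)]
        for _ in range(n_new)
    ]
    # Original rows first, generated rows after them.
    return [list(r) for r in rows] + synthetic
```

For a classification task, one plausible extension would be to call the function once per class so that generated rows inherit that class label; that choice is also an assumption.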

THREE-STAGED RISK EVALUATION MODEL FOR BIDDING ON INTERNATIONAL CONSTRUCTION PROJECTS

  • Wooyong Jung;Seung Heon Han
    • International conference on construction engineering and project management / 2011.02a / pp.534-541 / 2011
  • Risk evaluation approaches for bidding on international construction projects are typically partitioned into three stages: country selection, project classification, and bid-cost evaluation. However, previous studies are frequently criticized for several crucial limitations: 1) a dearth of studies about country selection risk tailored for the overseas construction market at a corporate level; 2) no consideration of uncertainties in the input variables themselves; 3) few probabilistic approaches to estimating a range of cost variance; and 4) limited inclusion of covariance impacts. This study thus suggests a three-staged risk evaluation model to resolve these inherent problems. In the first stage, a country portfolio model that maximizes the expected construction market growth rate and profit rate while decreasing market uncertainty is formulated using multi-objective genetic analysis. Following this, probabilistic approaches for screening bad projects are suggested by applying various data mining methods such as discriminant logistic regression, neural network, C5.0, and support vector machine. For the last stage, the cost overrun prediction model is simulated to determine a reasonable bid cost, while considering non-parametric distribution, effects of systematic risks, and the firm's specific capability accrued in a given country. Through the three consecutive models, this study verifies that international construction risk can be allocated, reduced, and projected to some degree, thereby contributing to sustaining stable profits and revenues in both the short term and the long term.

A Derivation of a Hydrograph by Using Smoothed Dimensionless Unit Kernel Function (평활화된 무차원 단위핵함수를 이용한 단위도의 유도)

  • Seong, Kee-Won
    • Journal of Korea Water Resources Association / v.41 no.6 / pp.559-564 / 2008
  • A practical method is derived for determining the unit hydrograph and S-curve from complex storm events by using a smoothed unit kernel approach. Using a unit kernel yields a more convenient way of constructing a unit hydrograph and its S-curve than the conventional method. However, with real data, the unit kernel oscillates and is unstable, so that a unit hydrograph and S-curve cannot easily be obtained. The use of non-parametric ridge regression with a Laplacian matrix is suggested for deriving an event-averaged unit kernel, which reduces the computational effort when the Nash instantaneous unit hydrograph is used as the basis of the kernel. A method for changing the unit hydrograph duration is also presented. The procedure shown in this work is expected to be useful wherever unit hydrograph work is involved.
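
The smoothing idea in this abstract can be sketched as a penalized least-squares deconvolution. The sketch below is an assumption about what "ridge regression with a Laplacian matrix" means here: it minimizes ||Q - A u||^2 + lam * ||D u||^2, where A is the rainfall convolution matrix, u the kernel ordinates, and D a second-difference (discrete Laplacian) operator. It does not use the Nash instantaneous unit hydrograph basis mentioned in the abstract, and all function names are hypothetical.

```python
def convolution_matrix(rain, n_kernel):
    """Matrix A such that A @ u equals the discrete convolution rain * u."""
    n_out = len(rain) + n_kernel - 1
    A = [[0.0] * n_kernel for _ in range(n_out)]
    for i, p in enumerate(rain):
        for j in range(n_kernel):
            A[i + j][j] = p
    return A


def second_difference(n):
    """(n-2) x n discrete Laplacian: rows are [1, -2, 1] stencils."""
    D = [[0.0] * n for _ in range(n - 2)]
    for i in range(n - 2):
        D[i][i], D[i][i + 1], D[i][i + 2] = 1.0, -2.0, 1.0
    return D


def gram(M):
    """Return M^T M as a list of lists."""
    cols = len(M[0])
    return [[sum(r[a] * r[b] for r in M) for b in range(cols)]
            for a in range(cols)]


def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def smoothed_kernel(rain, runoff, n_kernel, lam):
    """Kernel ordinates from the normal equations (A^T A + lam D^T D) u = A^T Q.

    Expects len(runoff) == len(rain) + n_kernel - 1.
    """
    A = convolution_matrix(rain, n_kernel)
    AtA = gram(A)
    DtD = gram(second_difference(n_kernel))
    lhs = [[AtA[i][j] + lam * DtD[i][j] for j in range(n_kernel)]
           for i in range(n_kernel)]
    rhs = [sum(A[r][i] * runoff[r] for r in range(len(runoff)))
           for i in range(n_kernel)]
    return solve(lhs, rhs)
```

With `lam = 0` the function reduces to ordinary least-squares deconvolution, which is the case that tends to oscillate on noisy data; larger values of `lam` trade fit for a smoother kernel.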

Development of Prediction Model of Groundwater Pollution based on Food Available Water and Validation in Small Watersheds (식품용수 수질자료를 이용한 지하수 오염 예측 모델 개발 및 소규모 유역에서의 검증)

  • Nam, Sungwoo;Park, Eungyu;Yi, Myeong-jae;Jeon, Seonkeum;Jung, Hyemin;Kim, Jeongwoo
    • Journal of Soil and Groundwater Environment / v.26 no.6 / pp.165-175 / 2021
  • Groundwater is used in many areas of the food industry in Korea, such as food manufacturing, food processing, cooking, and the liquor industry. Because groundwater accounts for a large share of the water used by the food industry, it is necessary to predict deterioration of water quality to ensure the safety of food water, since using undrinkable groundwater has a ripple effect that can cause great harm or anxiety to food consumers. In this study, a spatiotemporal data aggregation method was used to obtain spatially representative data, which enables prediction of groundwater quality change in a small watershed. In addition, a highly reliable predictive model was developed to estimate long-term changes in groundwater quality by applying a non-parametric segmented regression technique. Two pilot watersheds were selected where a large number of companies use groundwater for food water, and the appropriateness of the model was assessed by comparing the model-produced values with those obtained by actual measurements. The results of this study can contribute to establishing a customized food water management system that uses big data to respond quickly, accurately, and preemptively to changes in groundwater quality and pollution. They are also expected to contribute to the improvement of food safety management.

Estimation of the Input Wave Height of the Wave Generator for Regular Waves by Using Artificial Neural Networks and Gaussian Process Regression (인공신경망과 가우시안 과정 회귀에 의한 규칙파의 조파기 입력파고 추정)

  • Jung-Eun, Oh;Sang-Ho, Oh
    • Journal of Korean Society of Coastal and Ocean Engineers / v.34 no.6 / pp.315-324 / 2022
  • Experimental data obtained in a wave flume were analyzed using machine learning techniques to establish a model that predicts the input wave height of the wavemaker from waves that have undergone shoaling, and to verify the performance of the established model. For this purpose, an artificial neural network (NN), the most representative machine learning technique, and Gaussian process regression (GPR), one of the non-parametric regression analysis methods, were each applied, and the predictive performance of the two models was compared. The analysis was performed independently for the case of using all the data at once and for the case of classifying the data by a criterion related to the occurrence of wave breaking. When the data were not classified, the error between the input wave height at the wavemaker and the measured value was relatively large for both the NN and GPR models. When the data were divided into non-breaking and breaking conditions, the accuracy of predicting the input wave height was greatly improved. Of the two models, the GPR model performed better overall than the NN model.

Risk indicators related to periimplant disease: an observational retrospective cohort study

  • Poli, Pier Paolo;Beretta, Mario;Grossi, Giovanni Battista;Maiorana, Carlo
    • Journal of Periodontal and Implant Science / v.46 no.4 / pp.266-276 / 2016
  • Purpose: The aim of the present study was to retrospectively investigate the influence of potential risk indicators on the development of peri-implant disease. Methods: Overall, 103 patients referred for implant treatment from 2000 to 2012 were randomly enrolled. The study sample consisted of 421 conventional-length (>6 mm) non-turned titanium implants that were evaluated clinically and radiographically according to preestablished clinical and patient-related parameters by a single investigator. A non-parametric Mann-Whitney U test or Kruskal-Wallis rank test and a logistic regression model were used for the statistical analysis of the recorded data at the implant level. Results: The diagnosis of peri-implant mucositis and peri-implantitis was made for 173 (41.1%) and 19 (4.5%) implants, respectively. Age (${\geq}65$ years), patient adherence (professional hygiene recalls <2/year) and the presence of plaque were associated with higher peri-implant probing-depth values and bleeding-on-probing scores. The logistic regression analysis indicated that age (P=0.001), patient adherence (P=0.03), the absence of keratinized tissue (P=0.03), implants placed in pristine bone (P=0.04), and the presence of peri-implant soft-tissue recession (P=0.000) were strongly associated with the event of peri-implantitis. Conclusions: Within the limitations of this study, patients aged ${\geq}65$ years and non-adherent subjects were more prone to develop peri-implant disease. Therefore, early diagnosis and a systematic maintenance-care program are essential for maintaining peri-implant tissue health, especially in older patients.

Influence of spacers on ultimate strength of intermediate length thin walled columns

  • Anbarasu, M.;Sukumar, S.
    • Steel and Composite Structures / v.16 no.4 / pp.437-454 / 2014
  • The influence of spacers on the behaviour and ultimate capacity of intermediate length CFS open section columns under axial compression is investigated in this paper. The research focuses on cross-sections that fail predominantly by distortional buckling. The paper attempts to either delay or eliminate the distortional buckling mode by introducing transverse elements, referred to herein as spacers. The cross-sections investigated were selected by performing elastic buckling analysis using CUFSM software. The test program considered three different columns having slenderness ratios of 35, 50, and 60 and consisted of 14 pure axial compression tests under hinged-hinged end conditions. Models were analysed using finite element simulations, and the results obtained are compared with the experimental tests. The finite element package ABAQUS was used to carry out non-linear analyses of the columns. The finite element model incorporates material and geometric non-linearities and the initial geometric imperfections of the specimens. The work involves a wide parametric study of columns with spacers of varying depth and number. The results of the study show that the depth and number of spacers have a significant influence on the behaviour and strength of the columns. Based on nonlinear regression analysis, a design equation is proposed for the selected section.

The methodology for developing the 2007 Korean growth charts and blood pressure nomogram in Korean children and adolescents (2007 한국 소아청소년 성장곡선 및 정상혈압 분포 개발 방법론)

  • Lee, Soon Young;Kim, Youn Nam;Kang, Yeon Ji;Jang, Myoung-Jin;Kim, Jinheum;Moon, Jin Soo;Lee, Chong Guk;Oh, Kyungwon;Kim, Young Taek;Nam, Chung Mo
    • Clinical and Experimental Pediatrics / v.51 no.1 / pp.26-32 / 2008
  • Purpose : This study aimed to describe the methods used to develop the growth charts and the blood pressure nomogram for Korean children and adolescents. Methods : The growth charts were developed based on data from the national growth surveys for children and adolescents in 1998 and 2005. The percentile charts were developed in two stages. In the first stage, the selected empirical charts were smoothed through several fitting procedures, including parametric and non-parametric methods. In the second stage, a modified LMS (lambda, mu, sigma) statistical procedure was applied to the smoothed percentile charts. The LMS procedure made it possible to estimate any percentile and to calculate standard deviation units and z-scores. The charts for weight-for-age, height-for-age, BMI-for-age, weight-for-height and head circumference-for-age were developed by sex. Sex-specific nomograms of systolic and diastolic blood pressure, controlled for age and normalized height, were developed with a fixed-effect general regression model using data from the 2005 national growth survey. Results : No significant systematic differences were found between the percentiles of the growth charts and the empirical data. The final output of the study is available from the Korean Center for Disease Control and Prevention homepage, http://www.cdc.go.kr/webcdc/. The blood pressure nomogram was tabulated by height percentile and age using the coefficients of the regression model. Conclusion : The 2007 growth charts and blood pressure nomogram were the first products based on statistical modeling of the national survey data. Further study of the methodology, including data collection, data cleaning and statistical modeling for representative growth charts, is needed.
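
How an LMS fit turns a measurement into a z-score, and a z-score back into a percentile value, can be shown with the standard LMS (Box-Cox) transformation. This is a sketch of the commonly used formula only; the abstract describes a modified LMS procedure whose details are not given here, so the code should not be read as the study's method. The function names and the numbers in the usage lines are illustrative assumptions, not values from the Korean charts.

```python
import math


def lms_zscore(x, L, M, S):
    """Standard LMS z-score for measurement x.

    L is the Box-Cox power (lambda), M the median (mu), and S the
    coefficient of variation (sigma) at the child's age and sex.
    """
    if L == 0:
        return math.log(x / M) / S
    return ((x / M) ** L - 1.0) / (L * S)


def lms_value(z, L, M, S):
    """Inverse transform: the measurement that corresponds to z-score z."""
    if L == 0:
        return M * math.exp(S * z)
    return M * (1.0 + L * S * z) ** (1.0 / L)


# Illustrative parameters only (hypothetical, not from the published charts).
z = lms_zscore(20.0, L=-1.5, M=18.0, S=0.12)
x_back = lms_value(z, L=-1.5, M=18.0, S=0.12)  # recovers 20.0 up to rounding
```

Because the two functions are inverses, any percentile curve can be tabulated by passing the matching standard normal quantile to `lms_value`.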

A comparison of mortality projection by different time period in time series (시계열 이용기간에 따른 사망률 예측 비교)

  • Kim, Soon-Young;Oh, Jinho;Kim, Kee-Whan
    • The Korean Journal of Applied Statistics / v.31 no.1 / pp.41-65 / 2018
  • In Korea, mortality has improved over a shorter period of time than in developed countries, so in mortality projection it is important to consider the selection of the time series as well as the selection of the model. This study therefore proposes a method that uses a multiple regression model to select the time series period. In addition, we investigate the problems that arise when various time series are used with the Lee-Carter (LC) model, LC variants such as Lee-Miller (LM) and Booth-Maindonald-Smith (BMS), and non-parametric models such as the functional data model (FDM) and Coherent FDM, and we examine differences in the projected age-specific mortality rates and life expectancy. Based on the analysis results, the age-specific mortality rate and predicted life expectancy of men and women are calculated for the year 2030 for each model. We also compare the results with the mortality rate and life expectancy of the next generation provided by the Korean Statistical Information Service (KOSIS).