• Title/Summary/Keyword: random data analysis

Search Results: 1,687

Evaluation of chassis component reliability considering variation of fatigue data (피로 자료 분산을 고려한 자동차 부품의 신뢰도 해석)

  • Nam G.W;Lee B.C.
    • Proceedings of the Korean Society of Precision Engineering Conference
    • /
    • 2005.06a
    • /
    • pp.690-693
    • /
    • 2005
  • In this paper, the probabilistic distribution of the fatigue life of a chassis component is determined statistically by applying the design of experiments and the Pearson system. To construct the $p$-$\varepsilon$-$N$ curve, the case in which the fatigue data are random variables is considered. The probability density function (p.d.f.) of fatigue life is obtained from the design of experiments, and using this p.d.f. the fatigue reliability for any target fatigue life can be calculated. A lower control arm and a rear torsion bar are selected as example chassis components for the analysis. Component load histories, obtained by multi-body dynamic simulation of a Belgian block load history, are used. Finite element analysis is performed with the commercial software MSC Nastran, and fatigue analysis is performed with FE Fatigue. When the strain-life curve itself is a random variable, the probability density function of fatigue life differs very little from a log-normal distribution. When the fatigue data are random variables, the probability density functions are well approximated by a Beta distribution. Each p.d.f. is verified by Monte Carlo simulation.


Evaluation of Chassis Component Reliability Considering Variation of Fatigue Data (피로 자료 분산을 고려한 자동차 부품의 신뢰도 해석)

  • Nam, Gi-Won;Lee, Byung-Chai
    • Journal of the Korean Society for Precision Engineering
    • /
    • v.24 no.2 s.191
    • /
    • pp.110-117
    • /
    • 2007
  • In this paper, the probabilistic distribution of chassis component fatigue life is determined statistically by applying the design of experiments and the Pearson system. To construct the $p$-$\varepsilon$-$N$ curve, the case in which the fatigue data are random variables is considered. The probability density function (p.d.f.) of fatigue life is obtained from the design of experiments, and using this p.d.f. the fatigue reliability for any target fatigue life can be calculated. A lower control arm and a rear torsion bar of the chassis are selected as examples for the analysis. Component load histories obtained by multi-body dynamic simulation of a Belgian block load history are used. Finite element analysis is performed using the commercial software MSC Nastran, and fatigue analysis is performed using FE Fatigue. When the strain-life curve itself is a random variable, the probability density function of fatigue life differs very little from a log-normal distribution. When the fatigue data are random variables, the probability density functions are well approximated by a Beta distribution. Each p.d.f. is verified by Monte Carlo simulation.
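The reliability evaluation described in these two abstracts lends itself to a brief numerical illustration. The sketch below is not the authors' Pearson-system procedure; it simply assumes a hypothetical log-normal fatigue-life distribution (the median life and log-scale spread are made-up values) and estimates, by Monte Carlo simulation, the probability that a component survives a chosen target life.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical log-normal fatigue-life model; the parameter values are
# illustrative only and are not taken from the paper.
median_life = 2.0e5          # cycles
log_sigma = 0.35             # spread of log(life)

target_life = 1.0e5          # reliability is evaluated at this life (cycles)
n_samples = 100_000

# Monte Carlo: sample fatigue lives and count how many exceed the target life.
lives = rng.lognormal(mean=np.log(median_life), sigma=log_sigma, size=n_samples)
reliability = np.mean(lives > target_life)

print(f"Estimated reliability at {target_life:.0f} cycles: {reliability:.4f}")
```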

A study on applying random forest and gradient boosting algorithm for Chl-a prediction of Daecheong lake (대청호 Chl-a 예측을 위한 random forest와 gradient boosting 알고리즘 적용 연구)

  • Lee, Sang-Min;Kim, Il-Kyu
    • Journal of Korean Society of Water and Wastewater
    • /
    • v.35 no.6
    • /
    • pp.507-516
    • /
    • 2021
  • In this study, machine learning algorithms that have recently been widely used for prediction were applied. The study site was the CD (Chudong) point, a representative monitoring point of Daecheong Lake. Chlorophyll-a (Chl-a) concentration was used as the target variable for algae prediction. To predict the Chl-a concentration, a data set of water quality and water quantity factors was constructed. Random forest and gradient boosting algorithms were implemented in Python. Before running the algorithms, a correlation analysis between Chl-a and the water quality and quantity data was performed, and ten factors of high importance were extracted. In terms of the performance indices, gradient boosting achieved an RMSE of 2.72 mg/m³, an MSE of 7.40 (mg/m³)², and an R² of 0.66, and the residual analysis also favored gradient boosting. Overall, the gradient boosting algorithm performed best, and it remained the best model after hyperparameter tuning, with an RMSE of 2.44 mg/m³.
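The modeling workflow sketched in this abstract maps naturally onto scikit-learn. The snippet below is a minimal outline under assumed inputs: the file name daecheong_cd_point.csv and the column name chl_a are placeholders, and the hyperparameters are arbitrary rather than the tuned values reported in the paper.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical data file: water quality/quantity factors plus a chl_a target column.
df = pd.read_csv("daecheong_cd_point.csv")
X = df.drop(columns=["chl_a"])
y = df["chl_a"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "random forest": RandomForestRegressor(n_estimators=500, random_state=42),
    "gradient boosting": GradientBoostingRegressor(n_estimators=500, learning_rate=0.05,
                                                   random_state=42),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = mean_squared_error(y_test, pred) ** 0.5
    print(f"{name}: RMSE = {rmse:.2f} mg/m3, R2 = {r2_score(y_test, pred):.2f}")
```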

Clustering and classification to characterize daily electricity demand (시간단위 전력사용량 시계열 패턴의 군집 및 분류분석)

  • Park, Dain;Yoon, Sanghoo
    • Journal of the Korean Data and Information Science Society
    • /
    • v.28 no.2
    • /
    • pp.395-406
    • /
    • 2017
  • The purpose of this study is to identify patterns of daily electricity demand through clustering and classification. Hourly data collected by the Korea Power Exchange between 2008 and 2012 were used. Because electricity demand is time series data, the time trend was removed before analyzing the daily demand patterns. k-means clustering, Gaussian mixture model clustering, and functional clustering were considered in order to find the optimal clustering method. A classification analysis was then conducted to understand the relationship with external factors: day of the week, holidays, and weather. The data were divided into a training set and a test set; the training set consisted of the external factors and the cluster labels for 2008-2011, and the test set consisted of the daily external factors for 2012. Decision tree, random forest, support vector machine, and naive Bayes classifiers were used. As a result, Gaussian mixture model based clustering combined with random forest showed the best prediction performance when the number of clusters was eight.
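A rough two-step sketch of this approach, clustering daily profiles and then predicting cluster membership from external factors only: the arrays below are random placeholders standing in for the detrended hourly demand and the external factors, so the printed numbers are meaningless; the sketch only illustrates the structure of the analysis.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Placeholder data: one detrended 24-hour demand profile per day (2008-2011)
# and external factors (day of week, holiday flag, temperature).
n_days = 1461
profiles = rng.normal(size=(n_days, 24))
external = np.column_stack([
    np.arange(n_days) % 7,              # day of week
    rng.integers(0, 2, n_days),         # holiday indicator
    rng.normal(15, 10, n_days),         # temperature
])

# Step 1: model-based (Gaussian mixture) clustering with 8 clusters, as in the paper.
gmm = GaussianMixture(n_components=8, random_state=1)
labels = gmm.fit_predict(profiles)

# Step 2: learn to predict a day's cluster from its external factors alone.
clf = RandomForestClassifier(n_estimators=300, random_state=1)
clf.fit(external, labels)
print("training accuracy:", clf.score(external, labels))
```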

A Topological Analysis of Large Scale Structure Using the CMASS Sample of SDSS-III

  • Choi, Yun-Young;Kim, Juhan;Kim, Sungsoo
    • The Bulletin of The Korean Astronomical Society
    • /
    • v.38 no.2
    • /
    • pp.56.2-56.2
    • /
    • 2013
  • We study the three-dimensional genus topology of large-scale structure using the CMASS Data Release 11 sample of the SDSS-III Baryon Oscillation Spectroscopic Survey (BOSS). The CMASS sample yields a genus curve that is characteristic of one produced by Gaussian random-phase initial conditions. The data thus support the standard model of inflation, in which random quantum fluctuations in the early universe produced Gaussian random-phase initial conditions. Modest deviations of the observed genus from random phase are as expected from the nonlinear evolution of structure. We construct mock SDSS CMASS surveys along the past light cone from the Horizon Run 3 (HR3) N-body simulations, in which gravitationally bound dark matter subhalos are identified as the sites of galaxy formation. We study the genus topology of the HR3 mock surveys with the same geometry and sampling density as the observational sample, and find the observed genus topology to be consistent with LCDM as simulated by the HR3 mock samples.

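For context, the genus curve that this abstract calls characteristic of Gaussian random-phase initial conditions has a standard analytic form for a Gaussian random field, where $\nu$ is the density threshold in units of the field's standard deviation and the amplitude $A$ depends only on the power spectrum of the smoothed field:

$$g(\nu) = A\,(1-\nu^{2})\,e^{-\nu^{2}/2}.$$

The observed CMASS genus curve following this shape, apart from small nonlinear distortions, is what supports Gaussian initial conditions.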

Bayesian analysis of random partition models with Laplace distribution

  • Kyung, Minjung
    • Communications for Statistical Applications and Methods
    • /
    • v.24 no.5
    • /
    • pp.457-480
    • /
    • 2017
  • We develop a random partition procedure based on a Dirichlet process prior with a Laplace distribution. Gibbs sampling of a Laplace mixture of linear mixed regressions with a Dirichlet process is implemented as a random partition model when the number of clusters is unknown. Our approach provides simultaneous partitioning and parameter estimation, together with the computation of classification probabilities, unlike its counterparts. A full Gibbs-sampling algorithm is developed for efficient Markov chain Monte Carlo posterior computation. The proposed method is illustrated with simulated data and a real data set on energy efficiency from Tsanas and Xifara (Energy and Buildings, 49, 560-567, 2012).
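As a minimal illustration of the random-partition idea (not the paper's Gibbs sampler for the Laplace mixture), the following sketch draws a partition of n items from a Dirichlet process prior via the Chinese restaurant process; the concentration parameter alpha is an assumed value.

```python
import numpy as np

def crp_partition(n, alpha, rng):
    """Draw a random partition of n items via the Chinese restaurant process."""
    assignments = [0]          # the first item opens the first cluster
    counts = [1]               # number of items in each cluster
    for i in range(1, n):
        # Join existing cluster k with probability counts[k] / (i + alpha),
        # or open a new cluster with probability alpha / (i + alpha).
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments

rng = np.random.default_rng(0)
print(crp_partition(20, alpha=1.0, rng=rng))   # cluster labels for 20 items
```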

Bayesian estimation of median household income for small areas with some longitudinal pattern

  • Lee, Jayoun;Kim, Dal Ho
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.3
    • /
    • pp.755-762
    • /
    • 2015
  • One of the main objectives of the U.S. Census Bureau is the proper estimation of median household income for small areas. These estimates play an important role in the formulation of various governmental decisions and policies. Since direct survey estimates are available annually for each state or county, it is desirable to exploit the longitudinal trend in the income observations in the estimation procedure. In this study, we consider Fay-Herriot type small area models that include a time-specific random effect to accommodate any unspecified time-varying income pattern. The analysis is carried out in a hierarchical Bayesian framework using Markov chain Monte Carlo methodology. We evaluate our estimates by comparing them with the corresponding census estimates of 1999 using some commonly used comparison measures. It turns out that, among three types of time-specific random effects, the small area model with a time series random walk component provides estimates that are superior to both the direct estimates and the Census Bureau estimates.
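To fix ideas, a schematic form of a Fay-Herriot type model with a time-specific random effect is (the notation here is generic, not the paper's): $y_{it} = \theta_{it} + e_{it}$, where $y_{it}$ is the direct survey estimate for area $i$ at time $t$ and $e_{it}$ is its sampling error, and

$$\theta_{it} = \mathbf{x}_{it}^{\top}\boldsymbol{\beta} + v_i + u_{it}, \qquad v_i \sim N(0, \sigma_v^2),$$

where, in the random-walk variant singled out in the abstract, the time-specific effect evolves as $u_{it} = u_{i,t-1} + \varepsilon_{it}$ with $\varepsilon_{it} \sim N(0, \sigma_\varepsilon^2)$; the hierarchy is completed with priors on $\boldsymbol{\beta}$ and the variance components and fitted by MCMC.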

Analysis of a Random Shock Model for a System and Its Optimization

  • Park, Jeong-Hun;Choi, Seung-Kyoung;Lee, Eui-Yong
    • Journal of the Korean Data and Information Science Society
    • /
    • v.15 no.4
    • /
    • pp.773-782
    • /
    • 2004
  • In this paper, a random shock model for a system is considered. Each shock, arriving according to a Poisson process, decreases the state of the system by a random amount. A repairman arriving according to another Poisson process of rate $\lambda$ repairs the system only if the state of the system is below a threshold $\alpha$. After assigning various costs to the system, we calculate the long-run average cost per unit time and show that there exist a unique value of the arrival rate $\lambda$ and a unique value of the threshold $\alpha$ that minimize it.

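The long-run average cost in such a model can also be approximated by simulation. The sketch below is only a schematic stand-in for the paper's analysis: the shock-size distribution, the repair policy details, and the two cost terms (a fixed cost per repair plus a running cost proportional to the state deficiency) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (not from the paper).
mu, lam, alpha, s0 = 1.0, 0.5, 0.4, 1.0   # shock rate, repair-visit rate, threshold, initial state
mean_shock = 0.2                           # mean size of each (exponential) shock
repair_cost = 5.0                          # assumed cost per completed repair
deficiency_rate = 2.0                      # assumed cost per unit time per unit of (s0 - state)

T = 1.0e5                                  # simulation horizon
t, state, total_cost = 0.0, s0, 0.0
next_shock = rng.exponential(1 / mu)
next_visit = rng.exponential(1 / lam)

while t < T:
    t_next = min(next_shock, next_visit, T)
    total_cost += deficiency_rate * (s0 - state) * (t_next - t)   # accumulate running cost
    t = t_next
    if t == T:
        break
    if next_shock <= next_visit:
        state -= rng.exponential(mean_shock)                      # a shock degrades the state
        next_shock = t + rng.exponential(1 / mu)
    else:
        if state < alpha:                                         # repair only below the threshold
            state = s0
            total_cost += repair_cost
        next_visit = t + rng.exponential(1 / lam)

print("estimated long-run average cost per unit time:", total_cost / T)
```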

Bayesian Modeling of Random Effects Covariance Matrix for Generalized Linear Mixed Models

  • Lee, Keunbaik
    • Communications for Statistical Applications and Methods
    • /
    • v.20 no.3
    • /
    • pp.235-240
    • /
    • 2013
  • Generalized linear mixed models (GLMMs) are frequently used for the analysis of longitudinal categorical data when subject-specific effects are of interest. In GLMMs, the structure of the random effects covariance matrix is important both for the estimation of the fixed effects and for explaining subject and time variations. Estimation of the matrix is not simple because of its high dimension and the positive-definiteness constraint; in practice, simple structures such as AR(1) are therefore commonly assumed. However, this strong assumption can result in biased estimates of the fixed effects. In this paper, we introduce Bayesian modeling approaches for the random effects covariance matrix using a modified Cholesky decomposition. The modified Cholesky decomposition can describe a heterogeneous random effects covariance matrix, and the resulting estimated covariance matrix is guaranteed to be positive definite. We analyze metabolic syndrome data from a Korean Genomic Epidemiology Study using these methods.
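The key point, that the modified Cholesky decomposition turns the positive-definiteness constraint into unconstrained parameters, can be seen in a few lines of NumPy. This is only a sketch of the decomposition itself, not the paper's Bayesian model: writing $T \Sigma T^{\top} = D$, any real generalized autoregressive parameters (GARPs) and any positive innovation variances reconstruct a valid covariance matrix $\Sigma = T^{-1} D T^{-\top}$.

```python
import numpy as np

rng = np.random.default_rng(0)
q = 4                                    # dimension of the random effects

# Unconstrained parameters: GARPs (any real values) and log innovation variances.
phi = rng.normal(size=(q, q))
log_d = rng.normal(size=q)

# Unit lower-triangular T with -phi below the diagonal; diagonal D of innovation variances.
T = np.eye(q)
for t in range(1, q):
    T[t, :t] = -phi[t, :t]
D = np.diag(np.exp(log_d))

# Modified Cholesky: T @ Sigma @ T.T = D  =>  Sigma = inv(T) @ D @ inv(T).T
T_inv = np.linalg.inv(T)
Sigma = T_inv @ D @ T_inv.T

print(np.linalg.eigvalsh(Sigma))         # all eigenvalues positive by construction
```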

An Exploratory Observation of Analyzing Event-Related Potential Data on the Basis of Random-Resampling Method (무선재추출법에 기초한 사건관련전위 자료분석에 대한 탐색적 고찰)

  • Hyun, Joo-Seok
    • Science of Emotion and Sensibility
    • /
    • v.20 no.2
    • /
    • pp.149-160
    • /
    • 2017
  • In hypothesis testing, the interpretation of a statistic obtained from a data analysis relies on a probability distribution of that statistic constructed according to several statistical theories. For instance, the statistical significance of a mean difference between experimental conditions is determined according to a probability distribution of mean differences (e.g., Student's t) constructed under theoretical assumptions about population characteristics. The present study explored the logic and advantages of the random-resampling approach for analyzing event-related potentials (ERPs), in which a hypothesis is tested against a distribution of empirical statistics built from randomly resampled sets of the real measurements rather than against a theoretical distribution. To help ERP researchers understand the random-resampling approach, the study further introduced a specific example of a data analysis in which a random-permutation procedure is applied according to the random-resampling principle, and discussed several cautions to consider before applying it to ERP data analysis.
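The random-permutation procedure mentioned above is easy to state concretely. The sketch below is a generic two-sample permutation test on a mean difference, with made-up amplitude data standing in for condition-wise ERP measures; it is meant only to illustrate the resampling logic, not any specific analysis from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def permutation_test(x, y, n_perm=10_000, rng=rng):
    """Two-sample permutation test: reshuffle condition labels to build an
    empirical null distribution of the mean difference, instead of relying
    on a theoretical (e.g., Student's t) distribution."""
    observed = x.mean() - y.mean()
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        diff = perm[:len(x)].mean() - perm[len(x):].mean()
        if abs(diff) >= abs(observed):
            count += 1
    return observed, (count + 1) / (n_perm + 1)   # two-sided p-value

# Hypothetical per-subject mean ERP amplitudes in two experimental conditions.
cond_a = rng.normal(2.0, 1.0, size=20)
cond_b = rng.normal(1.4, 1.0, size=20)
print(permutation_test(cond_a, cond_b))
```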