• Title/Summary/Keyword: Statistical predictions

Statistical Characteristics of Fractal Dimension in Turbulent Premixed Flame (난류 예혼합 화염에서의 프랙탈 차원의 통계적 특성)

  • Lee, Dae-Hun;Gwon, Se-Jin
    • Transactions of the Korean Society of Mechanical Engineers B / v.26 no.1 / pp.18-26 / 2002
  • With the introduction of fractal notation, various fields of engineering adopted it to express the characteristics of the geometry involved, and one of the most frequent application areas was turbulence. Following turbulence research that treats surfaces as fractal geometry, attempts to analyze turbulent premixed flames as fractal geometry have also attracted attention as a modeling tool, because the flame surface can be viewed as a fractal. Researchers have carried out experiments aimed at revealing flame characteristics by measuring fractal parameters, but no robust principle or theory has been extracted. The only reported modeling effort using the fractal dimension is the flame speed model by Gouldin. This model gives good predictions of flame speed in the unstrained case but not under highly strained flame conditions. In this research, the practice of treating the fractal dimension of a flame as a single representative value is identified as a reason for the absence of a robust model. As an effort to establish robust modeling, a method that treats the fractal dimension as a statistical variable is presented. With this approach, flame characteristics reported in experiments, such as the effect of the Damköhler number (Da) on flame structure, can be observed quantitatively, which shows the possibility of flame modeling using fractal parameters with a statistical method. A more quantitative model can be derived from this result.

Cluster Analysis and Meteor-Statistical Model Test to Develop a Daily Forecasting Model for Jejudo Wind Power Generation (제주도 일단위 풍력발전예보 모형개발을 위한 군집분석 및 기상통계모형 실험)

  • Kim, Hyun-Goo;Lee, Yung-Seop;Jang, Moon-Seok
    • Journal of Environmental Science International / v.19 no.10 / pp.1229-1235 / 2010
  • Three meteor-statistical forecasting models (the transfer function model, the time-series autoregressive model, and the neural network model) were tested to develop a daily forecasting model for Jejudo, where the demand for wind power forecasting has increased. All the meteorological observation sites in Jejudo were classified into six groups using cluster analysis. Four pairs of observation sites, each pair having a strong wind speed correlation within the same meteorological group, were chosen for the model test. In developing the wind speed forecasting model for Jejudo, it was confirmed that using not only the wind dataset at the target site itself but also a wind dataset from the nearest site with a strong wind speed correlation within the same group enhances the goodness of fit of the forecast. The transfer function model and the neural network model were also confirmed to offer reliable predictions, with similar levels of goodness of fit.
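
The idea of adding a correlated neighboring site as a predictor can be illustrated with a minimal sketch. The abstract does not specify the form of its transfer function model, so the code below uses a simplified stand-in: a first-order autoregressive model with one lagged exogenous input, fitted by least squares. The function names and the synthetic wind speeds are hypothetical and are not taken from the paper.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_arx(target, neighbor):
    """Fit target[t] = a + b * target[t-1] + c * neighbor[t-1] by least squares."""
    rows = [[1.0, target[t - 1], neighbor[t - 1]] for t in range(1, len(target))]
    ys = target[1:]
    k = 3
    # Normal equations: (X^T X) coeffs = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear(xtx, xty)


def forecast_next(coeffs, last_target, last_neighbor):
    """One-step-ahead forecast from the most recent observations."""
    a, b, c = coeffs
    return a + b * last_target + c * last_neighbor


# Synthetic daily mean wind speeds (m/s); assumed values for illustration only.
neighbor_site = [3.0, 5.0, 4.0, 6.0, 2.0, 7.0, 5.0, 3.0, 6.0, 4.0]
target_site = [4.0]
for t in range(1, len(neighbor_site)):
    target_site.append(1.0 + 0.5 * target_site[-1] + 0.3 * neighbor_site[t - 1])

coeffs = fit_arx(target_site, neighbor_site)
print(coeffs, forecast_next(coeffs, target_site[-1], neighbor_site[-1]))
```

Dropping the `neighbor` column from `rows` reduces this to a plain AR(1) model, which gives a simple baseline for checking whether the neighboring site improves the fit.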

Investigating Data Preprocessing Algorithms of a Deep Learning Postprocessing Model for the Improvement of Sub-Seasonal to Seasonal Climate Predictions (계절내-계절 기후예측의 딥러닝 기반 후보정을 위한 입력자료 전처리 기법 평가)

  • Uran Chung;Jinyoung Rhee;Miae Kim;Soo-Jin Sohn
    • Korean Journal of Agricultural and Forest Meteorology / v.25 no.2 / pp.80-98 / 2023
  • This study explores the effectiveness of various data preprocessing algorithms for improving subseasonal to seasonal (S2S) climate predictions from six climate forecast models and their Multi-Model Ensemble (MME) using a deep learning-based postprocessing model. A pipeline of data transformation algorithms was constructed to convert raw S2S prediction data into training data processed with several statistical distributions. A dimensionality reduction algorithm was also applied to select features by ranking the correlation coefficients between the observed and the input data. The training model was designed with the TimeDistributed wrapper applied to all convolutional layers of U-Net. The TimeDistributed wrapper allows a U-Net convolutional layer to be applied directly to 5-dimensional time series data while maintaining the time axis of the data, but every input to U-Net should be at least 3D. We found that the Robust and Standard transformation algorithms are the most suitable for improving S2S predictions. Dimensionality reduction based on feature selection did not significantly improve predictions of daily precipitation for the six climate models and even worsened predictions of daily maximum and minimum temperatures. While deep learning-based postprocessing also improved MME S2S precipitation predictions, it did not have a significant effect on temperature predictions, particularly for lead times of weeks 1 and 2. Further research is needed to develop an optimal deep learning model for improving S2S temperature predictions by testing various models and parameters.
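
To make the two best-performing transformations concrete, here is a minimal sketch. It assumes that "Standard" means centering on the mean and dividing by the standard deviation, and that "Robust" means centering on the median and dividing by the interquartile range, which are the usual meanings of these names. The abstract does not give the exact definitions used in the study, and the sample values are invented.

```python
import statistics


def standard_scale(values):
    """Center on the mean and divide by the population standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def robust_scale(values):
    """Center on the median and divide by the interquartile range (Q3 - Q1)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    median = statistics.median(values)
    return [(v - median) / (q3 - q1) for v in values]


# Daily precipitation-like sample with one extreme value (assumed data).
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
print(standard_scale(sample))
print(robust_scale(sample))
```

Because the median and interquartile range ignore the extreme value, the robust version keeps the four ordinary values well separated, whereas the standard version compresses them into a narrow band.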

Development and Application of Statistical Programs Based on Data and Artificial Intelligence Prediction Model to Improve Statistical Literacy of Elementary School Students (초등학생의 통계적 소양 신장을 위한 데이터와 인공지능 예측모델 기반의 통계프로그램 개발 및 적용)

  • Kim, Yunha;Chang, Hyewon
    • Communications of Mathematical Education / v.37 no.4 / pp.717-736 / 2023
  • The purpose of this study is to develop a statistical program that uses data and artificial intelligence prediction models and to apply it to one sixth-grade elementary school class to see whether it is effective in improving students' statistical literacy. Based on an analysis of problems in current elementary school statistical education, a program of 15 sessions was developed to encourage elementary students to experience the entire process of statistical problem solving and to make sound predictions by incorporating data, the core of the era of the Fourth Industrial Revolution, into AI education. The main features of this program are its recognition of the importance of data, a key element of artificial intelligence education, and its context-aware collection and analysis activities using real-life data provided by public data platforms. In addition, because it consists of activities that predict the future from data using engineering tools such as Entry and Easy Statistics and that create an artificial intelligence prediction model, the program focuses on developing communication skills, information processing capabilities, and critical thinking skills. Applying the program positively affected the statistical literacy of the elementary school students, and we also observed students' interest, critical inquiry, and mathematical communication throughout the process of statistical problem solving.

Prediction of spatio-temporal AQI data

  • KyeongEun Kim;MiRu Ma;KyeongWon Lee
    • Communications for Statistical Applications and Methods / v.30 no.2 / pp.119-133 / 2023
  • With the rapid growth of the economy and fossil fuel consumption, the concentration of air pollutants has increased significantly, and the air pollution problem is no longer limited to small areas. We conduct a statistical analysis of actual air quality data covering all of South Korea using R and Python. Factors such as SO2, CO, O3, NO2, PM10, precipitation, wind speed, wind direction, vapor pressure, local pressure, sea level pressure, temperature, humidity, and others are used as covariates. The main goal of this paper is to predict spatio-temporal air quality index (AQI) data. The observations in big spatio-temporal datasets like AQI data are correlated both spatially and temporally, and computing predictions or forecasts under this dependence structure is often infeasible. The likelihood function based on a spatio-temporal model may therefore be complicated, and specialized modeling approaches are useful for statistically reliable predictions. In this paper, we propose several methods for this big spatio-temporal AQI data. First, a random effects model with spatio-temporal basis functions, a classical statistical approach, is proposed. Next, a neural network model, a deep learning method based on artificial neural networks, is applied. Finally, a random forest model, a machine learning method that is closer to computational science, is introduced. We then compare the forecasting performance of the three methods in terms of predictive diagnostics. The analysis shows that all three methods predicted normal levels of PM2.5 well, but performance appears to be poor at extreme values.

Risk Prediction Using Genome-Wide Association Studies on Type 2 Diabetes

  • Choi, Sungkyoung;Bae, Sunghwan;Park, Taesung
    • Genomics & Informatics / v.14 no.4 / pp.138-148 / 2016
  • The success of genome-wide association studies (GWASs) has enabled us to improve risk assessment and provide novel genetic variants for diagnosis, prevention, and treatment. However, most variants discovered by GWASs have been reported to have very small effect sizes on complex human diseases, which has been a big hurdle in building risk prediction models. Recently, many statistical approaches based on penalized regression have been developed to solve the "large p and small n" problem. In this report, we evaluated the performance of several statistical methods for predicting a binary trait: stepwise logistic regression (SLR), least absolute shrinkage and selection operator (LASSO), and Elastic-Net (EN). We first built a prediction model by combining variable selection and prediction methods for type 2 diabetes using Affymetrix Genome-Wide Human SNP Array 5.0 from the Korean Association Resource project. We assessed the risk prediction performance using area under the receiver operating characteristic curve (AUC) for the internal and external validation datasets. In the internal validation, SLR-LASSO and SLR-EN tended to yield more accurate predictions than other combinations. During the external validation, the SLR-SLR and SLR-EN combinations achieved the highest AUC of 0.726. We propose these combinations as a potentially powerful risk prediction model for type 2 diabetes.
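
As an illustration of the evaluation metric, the sketch below computes AUC through its rank interpretation: the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. The labels and scores are made up for the example and are unrelated to the study's data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    labels: 1 for cases (e.g. type 2 diabetes), 0 for controls.
    scores: predicted risk, where higher means more likely to be a case.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0   # case ranked above control
            elif c == k:
                wins += 0.5   # tie counts as half
    return wins / (len(cases) * len(controls))


# Hypothetical predicted risks for two controls and two cases.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the reported value of 0.726.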

Predictability of Consumer Expectations for Future Changes in Real Growth (소비자 기대심리의 미래 성장 예측력)

  • Kim, Tae-Ho;Lim, La-Hee;Lee, Seung-Eun
    • The Korean Journal of Applied Statistics / v.28 no.3 / pp.457-465 / 2015
  • The long-lasting worldwide recession and slow economic progress have made it more important to predict future economic behavior. Accordingly, it is of interest to explore useful leading indicators, correlated with policy targets, for predicting future economic growth. This study attempts to develop a model to evaluate how well consumer survey results from Statistics Korea predict future economic activity. A statistical model is formulated and estimated to generate predictions from consumer expectations. The prediction is found to improve for the more distant future, and consumer expectations appear to be a useful leading indicator that provides information on future real growth.

Design-oriented strength and strain models for GFRP-wrapped concrete

  • Messaoud, Houssem;Kassoul, Amar;Bougara, Abdelkader
    • Computers and Concrete / v.26 no.3 / pp.293-307 / 2020
  • The aim of this paper is to develop design-oriented models for predicting the ultimate strength and ultimate axial strain of concrete confined with glass fiber-reinforced polymer (GFRP) wraps. Twenty of the most widely used and recent design-oriented models developed to predict the strength and strain of GFRP-confined concrete in circular sections are selected and evaluated based on a database of 163 test results of concrete cylinders confined with GFRP wraps and subjected to uniaxial compression. The models are evaluated using three statistical indices: the coefficient of determination (R²), the root mean square error (RMSE), and the average absolute error (AAE). Based on this study, new strength and strain models for GFRP-wrapped concrete are developed using regression analysis. The results show that the proposed models perform better and provide more accurate predictions than the existing models.
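
The three indices can be sketched as follows. The abstract does not state its formulas, so this assumes the usual definitions: R² as one minus the ratio of residual to total sum of squares, and AAE as the mean absolute error relative to each test value. The strength values in the example are hypothetical.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the same units as the data."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def aae(observed, predicted):
    """Average absolute error, taken relative to each observed value (assumed)."""
    n = len(observed)
    return sum(abs(p - o) / abs(o) for o, p in zip(observed, predicted)) / n


# Hypothetical confined strengths in MPa: test results vs. model predictions.
tested = [10.0, 20.0, 30.0]
modeled = [12.0, 18.0, 33.0]
print(r_squared(tested, modeled), rmse(tested, modeled), aae(tested, modeled))
```

A model that ranks well on all three indices has high R² together with low RMSE and low AAE. RMSE penalizes large individual misses more heavily, while AAE weights every specimen by its relative error.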

Machine Learning Approaches to Corn Yield Estimation Using Satellite Images and Climate Data: A Case of Iowa State

  • Kim, Nari;Lee, Yang-Won
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.34 no.4 / pp.383-390 / 2016
  • Remote sensing data has been widely used to estimate crop yields with statistical methods such as regression models. Machine learning, an efficient empirical method for classification and prediction, is another approach to crop yield estimation. This paper describes corn yield estimation in the state of Iowa using four machine learning approaches: SVM (Support Vector Machine), RF (Random Forest), ERT (Extremely Randomized Trees), and DL (Deep Learning). Comparisons of their validation statistics are also presented. To examine the seasonal sensitivities of the corn yields, three period groups were set up: (1) MJJAS (May to September), (2) JA (July and August), and (3) OC (optimal combination of months). Overall, the DL method showed the highest accuracy in terms of the correlation coefficient for the three period groups. Accuracy was relatively favorable in the OC group, which indicates that the optimal combination of months can be significant in statistical modeling of crop yields. The differences between our predictions and USDA (United States Department of Agriculture) statistics were about 6-8%, which shows that machine learning approaches can be a viable option for crop yield modeling. In particular, DL showed more stable results by overcoming the overfitting problem of generic machine learning methods.

Development of Empirical and Statistical Models for Prediction of Water Quality of Pretreated Wastewater in Pulp and Paper Industry (제지공정 폐수 전처리 수질예측을 위한 실험적 모델과 통계적 모델 개발)

  • Sohn, Jinsik;Han, Jihee;Lee, Sangho
    • Journal of Korean Society of Water and Wastewater / v.31 no.4 / pp.289-296 / 2017
  • The pulp and paper industry produces large volumes of wastewater and residual sludge waste, resulting in many issues related to wastewater treatment and sludge disposal. Contaminants in pulp and paper wastewater include effluent solids, sediments, chemical oxygen demand (COD), and biological oxygen demand (BOD), which should be treated by wastewater treatment processes such as coagulation and biological treatment. However, few studies have attempted to predict the treatment efficiency of pulp and paper wastewater. Accordingly, this study presents empirical models based on experimental data from laboratory-scale coagulation tests and compares them with statistical models such as an artificial neural network (ANN). The results show that water quality parameters such as turbidity, suspended solids (SS), COD, and UVA can be predicted using either linear or exponential regression models. Nevertheless, the accuracies of the turbidity and UVA predictions were lower than those for SS and COD. The ANN, on the other hand, showed higher accuracy than the empirical models for all water quality parameters. However, it seems that the two kinds of models should be used together to provide more accurate information on the treatment efficiency of pulp and paper wastewater.
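
The linear and exponential regression forms mentioned above can be sketched as follows. The abstract does not name the predictor variable or the fitting procedure, so the coagulant dose used as input, the log-transform fitting of the exponential form, and the data values are all assumptions for illustration.

```python
import math


def fit_linear(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by a linear fit to ln(y); requires every y > 0."""
    log_a, b = fit_linear(xs, [math.log(y) for y in ys])
    return math.exp(log_a), b


# Hypothetical coagulant doses and residual turbidity readings.
dose = [0.0, 1.0, 2.0, 3.0, 4.0]
turbidity = [2.0 * math.exp(-0.5 * d) for d in dose]

print(fit_linear(dose, turbidity))
print(fit_exponential(dose, turbidity))
```

Fitting on the log scale minimizes relative rather than absolute error, so it can give different coefficients from a direct nonlinear fit when the measurements are noisy.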