• Title/Summary/Keyword: statistical learning

A Dynamic Correction Technique of Time-Series Data using Anomaly Detection Model based on LSTM-GAN (LSTM-GAN 기반 이상탐지 모델을 활용한 시계열 데이터의 동적 보정기법)

  • Hanseok Jeong;Han-Joon Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.23 no.2
    • /
    • pp.103-111
    • /
    • 2023
  • This paper proposes a new data correction technique that transforms anomalies in time-series data into normal values. With the recent development of information technology, a vast amount of time-series data is being collected through sensors. However, due to sensor failures and abnormal environments, most time-series data contain many anomalies. If we build a predictive model using the original data with its anomalies left in place, we cannot expect highly reliable predictive performance. Therefore, we utilize the LSTM-GAN model to detect anomalies in the original time-series data, and combine DTW (Dynamic Time Warping) and GAN techniques to replace the anomalous data with normal data in partitioned window units. The basic idea is to construct a GAN model serially by applying to DTW the statistical information of the window of normally distributed data adjacent to the window containing the detected anomalies, so as to generate normal time-series data. Through experiments using the open NAB data, we empirically show that our proposed method outperforms two conventional correction methods.
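For readers unfamiliar with DTW, the distance the abstract relies on can be sketched as below. This is the textbook dynamic-programming formulation with an absolute-difference local cost, not the authors' implementation; the function name `dtw_distance` and the sample windows are illustrative assumptions.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance with |x - y| as the local cost."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical windows: the second repeats one sample, which DTW absorbs at no cost.
normal_window = [1.0, 2.0, 3.0]
stretched_window = [1.0, 2.0, 2.0, 3.0]
print(dtw_distance(normal_window, stretched_window))
```

Because DTW tolerates local stretching of the time axis, it can compare a corrected window with a neighbouring normal window even when their patterns are slightly out of phase.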

The Validity and Reliability of the Korean Version of Readiness for Practice Survey for Nursing Students (한국어판 간호학생 간호실무준비도 측정도구의 타당도와 신뢰도)

  • Lee, Tae Wha;Ji, Yoonjung;Yoon, Yea Seul
    • Journal of Korean Academy of Nursing
    • /
    • v.52 no.6
    • /
    • pp.564-581
    • /
    • 2022
  • Purpose: This study aimed to evaluate the validity and reliability of the Korean version of the Readiness for Practice Survey (K-RPS). Method: The English Readiness for Practice Survey was translated into Korean using the Translation, Review, Adjudication, Pretesting, and Documentation (TRAPD) method. Secondary data analysis was performed using the dataset from the New Nurse e-Cohort study (Panel 2020) in South Korea. This study used a nationally representative sample of 812 senior nursing students. Exploratory and confirmatory factor analyses were conducted. Convergent validity within the items and discriminant validity between factors were assessed to evaluate construct validity. Construct validity for hypothesis testing was evaluated using convergent and discriminant validity. Ordinal α was used to assess reliability. Results: The K-RPS comprises 20 items examining four factors: clinical problem solving, learning experience, professional responsibilities, and professional preparation. Although the convergent validity of the items was successfully verified, discriminant validity between the factors was not. The K-RPS construct validity was verified using a bi-factor model (CMIN/DF 2.20, RMSEA .06, TLI .97, CFI .97, and PGFI .59). The K-RPS was significantly correlated with self-esteem (r = .43, p < .001) and anxiety about clinical practicum (r = -.50, p < .001). Internal consistency was reliable based on an ordinal α of .88. Conclusion: The K-RPS is both valid and reliable and can be used as a standardized Korean version of the Readiness for Practice measurement tool.

Forecasting volatility index by temporal convolutional neural network (Causal temporal convolutional neural network를 이용한 변동성 지수 예측)

  • Ji Won Shin;Dong Wan Shin
    • The Korean Journal of Applied Statistics
    • /
    • v.36 no.2
    • /
    • pp.129-139
    • /
    • 2023
  • Forecasting volatility is essential to avoiding the risk caused by the uncertainties of a financial asset. Complicated features of financial volatility, such as ambiguity between non-stationarity and stationarity, asymmetry, long memory, and sudden, fairly large outlier-like values, pose great challenges to volatility forecasting. To address such complicated features implicitly, we consider machine learning models such as LSTM (1997) and GRU (2014), which are known to be suitable for time series forecasting. However, these models suffer from vanishing gradients, an enormous amount of computation, and large memory requirements. To solve these problems, a causal temporal convolutional network (TCN) model, an advanced form of 1D CNN, is also applied. The overall forecasting power of the TCN model is confirmed to be higher than that of the RNN models in forecasting VIX, VXD, and VXN, the daily volatility indices of the S&P 500, DJIA, and Nasdaq, respectively.
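The "causal" property of a TCN layer can be illustrated with a small sketch. The code below is a plain-Python dilated causal convolution, assuming zero padding on the left; the function name `causal_conv1d`, the weights, and the input series are hypothetical and are not taken from the paper's model.

```python
def causal_conv1d(x, weights, dilation=1):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation], treating x as zero before t = 0."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # only current and past samples contribute
                acc += w * x[idx]
        y.append(acc)
    return y


# Hypothetical series and filter; a larger dilation widens the receptive field.
series = [1.0, 2.0, 3.0, 4.0]
print(causal_conv1d(series, [1.0, 1.0], dilation=1))
print(causal_conv1d(series, [1.0, 1.0], dilation=2))
```

Since each output depends only on current and earlier inputs, stacking such layers with growing dilation lets the network see long histories without leaking future values into a forecast.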

Meta-heuristic optimization algorithms for prediction of fly-rock in the blasting operation of open-pit mines

  • Mahmoodzadeh, Arsalan;Nejati, Hamid Reza;Mohammadi, Mokhtar;Ibrahim, Hawkar Hashim;Rashidi, Shima;Mohammed, Adil Hussein
    • Geomechanics and Engineering
    • /
    • v.30 no.6
    • /
    • pp.489-502
    • /
    • 2022
  • In this study, a Gaussian process regression (GPR) model as well as six GPR-based metaheuristic optimization models, namely GPR-PSO, GPR-GWO, GPR-MVO, GPR-MFO, GPR-SCA, and GPR-SSO, were developed to predict fly-rock distance in the blasting operation of open-pit mines. A total of 300 datasets obtained from the Soungun copper mine in Iran were used in the models. These datasets included six input parameters and one output parameter (fly-rock). A number of statistical evaluation indices were used to assess the prediction outcomes. The performance of the ML models in predicting fly-rock, ranked from high to low, was GPR-PSO, GPR-GWO, GPR-MVO, GPR-MFO, GPR-SCA, GPR-SSO, and GPR, with ranking scores of 66, 60, 54, 46, 43, 38, and 30 (for the 5-fold method), respectively. In conclusion, the GPR-PSO model generated the most accurate findings, and it is therefore suggested that this model be used to forecast fly-rock. In addition, the mutual information test (MIT) was used to investigate the influence of each input parameter on fly-rock. The stemming (T) parameter was found to have the greatest influence on fly-rock among all the parameters.

A Meta-Analysis on Effects of Infant's Sociality Development in Forest Experience Activities (숲 체험 활동이 유아의 사회성 발달의 효과에 관한 메타분석)

  • Chan-Woo Kim;Duk-Byeong Park
    • Journal of Agricultural Extension & Community Development
    • /
    • v.29 no.4
    • /
    • pp.225-250
    • /
    • 2022
  • This study examines the effects of forest experience activities on infants' social development through meta-analysis. Nine studies (a total of 165 in the experimental group and 159 in the control group) were finally selected through a systematic review. Meta-analysis covering overall effect size estimation, chi-square test, significance analysis, publication bias analysis, and subgroup analysis was performed using the R program. The overall effect size of the 9 studies was 1.59, indicating a large effect size. In the subgroup analysis of the sub-factors of sociality, autonomy showed the largest effect size at 1.47, the adjusted effect size of cooperation was 1.34, the adjusted effect size of peer interaction was 1.29, and the adjusted effect size of perspective-taking ability was 0.97. All were found to have a statistically significant effect. To analyze the moderating effect, a meta-regression analysis was conducted on the participation period (4, 5~6, 7~8 weeks), the number of sessions (6~10, 11~15, 16~20), the frequency per week (1, 2, 5), and the participation time (40, 60, 90, 120, 150 min), but there was no statistical difference. Although not statistically significant, the effect size was larger when the participation period was 4 weeks, the number of sessions was 16 to 20, the frequency was 2 times per week, and the participation time was 40 minutes. These results can be used by policy makers and forest commentators involved in the vitalization of forest education through forest experience activities.
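As a rough illustration of how an overall effect size is pooled from individual studies, the sketch below uses inverse-variance weighting under a fixed-effect model. The abstract does not state which pooling model was used, so this is an assumption; the function name `pooled_effect` and the input numbers are hypothetical and are not data from the nine studies.

```python
import math


def pooled_effect(effects, variances):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]  # more precise studies get more weight
    total = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)


# Hypothetical per-study effect sizes and sampling variances (illustration only).
estimate, std_error = pooled_effect([1.0, 2.0], [1.0, 1.0])
print(estimate, std_error)
```

A random-effects model would add a between-study variance term to each study's variance before weighting, which widens the confidence interval when studies disagree.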

Futures Price Prediction based on News Articles using LDA and LSTM (LDA와 LSTM를 응용한 뉴스 기사 기반 선물가격 예측)

  • Jin-Hyeon Joo;Keun-Deok Park
    • Journal of Industrial Convergence
    • /
    • v.21 no.1
    • /
    • pp.167-173
    • /
    • 2023
  • Research has been published on predicting future data using regression analysis or artificial intelligence as a method of analyzing economic indicators. In this study, we designed a system that predicts prospective futures prices using artificial intelligence that utilizes topic probability data obtained from past news articles through topic modeling. Topic probability distribution data for each news article were obtained using the Latent Dirichlet Allocation (LDA) method, which can extract the topics of a document from past news articles via unsupervised learning. The topic probability distribution data were then used as the input for a Long Short-Term Memory (LSTM) network, a derivative of Recurrent Neural Networks (RNN), in order to predict prospective futures prices. The method proposed in this study was able to predict the trend of futures prices. In the future, this method should also be able to predict the price trends of derivative products such as options. However, because statistical errors occurred for certain data, further research is required to improve accuracy.

South Korean Elementary Students' Mathematical Listening Ability (초등학생의 수학 청해력 실태 조사 연구)

  • Kim, Rina
    • Communications of Mathematical Education
    • /
    • v.37 no.2
    • /
    • pp.183-197
    • /
    • 2023
  • Mathematical listening ability (MLA) refers to the capability to listen to spoken language that contains mathematical principles and concepts and to understand its meaning, which distinguishes it from listening in daily life and in other subject classes. In this study, I investigated 834 elementary school students' MLA by adapting MLA survey items. Through statistical analysis of the survey results, I confirmed that students' MLA was significantly correlated with gender, grade, and school location. Female students' MLA was statistically significantly higher than that of male students. MLA increased with grade and then decreased in 6th grade. In addition, students' MLA differed statistically significantly according to the location of the school. The results of this study might be used as the basis for follow-up research and for the development of teaching and learning materials related to MLA.

Visualizing Unstructured Data using a Big Data Analytical Tool R Language (빅데이터 분석 도구 R 언어를 이용한 비정형 데이터 시각화)

  • Nam, Soo-Tai;Chen, Jinhui;Shin, Seong-Yoon;Jin, Chan-Yong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2021.05a
    • /
    • pp.151-154
    • /
    • 2021
  • Big data analysis is the process of discovering meaningful new correlations, patterns, and trends in large volumes of data stored in data stores and creating new value. Accordingly, most big data analysis techniques include data mining, machine learning, natural language processing, and pattern recognition, as used in existing statistics and computer science. Also, using the R language, a big data tool, we can express analysis results through various visualization functions applied to pre-processed text data. The data analyzed in this study were 21 papers published in March 2021 in the journals of the Korea Institute of Information and Communication Engineering. In the final analysis results, the most frequently mentioned keyword was "Data", which ranked first with 305 mentions. Based on the results of the analysis, the limitations of the study and theoretical implications are suggested.
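The keyword ranking described above amounts to counting word frequencies in pre-processed text. The study used R; the sketch below shows the same idea in Python with only the standard library. The function name `top_keywords`, the tokenization rule, and the sample text are assumptions made for illustration.

```python
import re
from collections import Counter


def top_keywords(text, n=5):
    """Lowercase the text, keep alphabetic tokens, and return the n most frequent words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words).most_common(n)


# Hypothetical snippet standing in for the pre-processed paper text.
sample = "Data mining and data analysis turn raw data into value."
print(top_keywords(sample, n=1))
```

In practice, stop words such as "and" or "the" would be removed before counting so that only content-bearing keywords appear in the ranking.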

Visualizing Article Material using a Big Data Analytical Tool R Language (빅데이터 분석 도구 R 언어를 이용한 논문 데이터 시각화)

  • Nam, Soo-Tai;Shin, Seong-Yoon;Jin, Chan-Yong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2021.05a
    • /
    • pp.326-327
    • /
    • 2021
  • Recently, the utilization of big data has attracted wide interest in a variety of industrial fields. Big data analysis is the process of discovering meaningful new correlations, patterns, and trends in large volumes of data stored in data stores and creating new value. Accordingly, most big data analysis techniques include data mining, machine learning, natural language processing, and pattern recognition, as used in existing statistics and computer science. Also, using the R language, a big data tool, we can express analysis results through various visualization functions applied to pre-processed text data. The data analyzed in this study were 29 papers in a specific journal. In the final analysis results, the most frequently mentioned keyword was "Research", which ranked first with 743 mentions. Based on the results of the analysis, the limitations of the study and theoretical implications are suggested.

Predictive model for the shear strength of concrete beams reinforced with longitudinal FRP bars

  • Alzabeebee, Saif;Dhahir, Moahmmed K.;Keawsawasvong, Suraparb
    • Structural Engineering and Mechanics
    • /
    • v.84 no.2
    • /
    • pp.143-154
    • /
    • 2022
  • Corrosion of steel reinforcement is considered the main cause of deterioration of concrete structures, especially those in humid environments. Hence, fiber reinforced polymer (FRP) bars are increasingly used as a replacement for conventional steel owing to their non-corrodible characteristics. However, predicting the shear strength of beams reinforced with FRP bars is still challenging due to the lack of a robust shear theory. Thus, this paper aims to develop an explicit data-driven model to predict the shear strength of FRP reinforced beams using multi-objective evolutionary polynomial regression analysis (MOGA-EPR), since data-driven models learn the behavior from the input data without the need to employ a theory to aid the derivation, and thus offer enhanced accuracy. This study also evaluates the accuracy of the predictive models of shear strength of FRP reinforced concrete beams employed by different design codes by calculating and comparing the values of the mean absolute error (MAE), root mean square error (RMSE), mean (𝜇), standard deviation of the mean (𝜎), coefficient of determination (R²), and percentage of predictions within an error range of ±20% (a20-index). An experimental database has been developed and employed in the model learning, validation, and accuracy examination. The statistical analysis illustrated the robustness of the developed model, with MAE, RMSE, 𝜇, 𝜎, R², and a20-index of 14.6, 20.8, 1.05, 0.27, 0.85, and 0.61, respectively, for the training data and 10.4, 14.1, 0.98, 0.25, 0.94, and 0.60, respectively, for the validation data. Furthermore, the developed model achieved much better predictions than the standard predictive models, as it scored lower MAE, RMSE, and 𝜎, and higher R² and a20-index. The new model can be used with confidence in future optimized designs, as its accuracy is higher than that of the standard predictive models.
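Three of the accuracy measures named in the abstract can be sketched as follows. The a20-index is taken here as the fraction of samples whose predicted-to-measured ratio lies within [0.8, 1.2], which is one common reading of "within an error range of ±20%" and is an assumption, as are the function names and the sample values.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def a20_index(y_true, y_pred):
    """Fraction of predictions whose predicted/measured ratio falls in [0.8, 1.2]."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if 0.8 <= p / t <= 1.2)
    return hits / len(y_true)


# Hypothetical measured and predicted shear strengths (not from the paper's database).
measured = [100.0, 200.0, 50.0, 80.0]
predicted = [110.0, 150.0, 55.0, 100.0]
print(mae(measured, predicted), rmse(measured, predicted), a20_index(measured, predicted))
```

Lower MAE and RMSE and a higher a20-index indicate better agreement between predictions and measurements, which is how the abstract compares the new model with the design-code models.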