• Title/Summary/Keyword: Statistical Learning

A Study on Use of Archival Information for Resource-based Learning (자원기반학습을 위한 기록정보의 활용방안에 관한 연구)

  • Han, Hyun-Jin;Lee, Soo-Sang
    • Journal of Korean Society of Archives and Records Management / v.8 no.1 / pp.143-165 / 2008
  • This study examines educational services provided by archives, focusing on classroom programs for teachers and students. Its purposes are to develop an archival-resource based learning model and to determine the influence of archival-resource based learning. The researcher and teachers designed two lesson plans, one for archival-resource based learning and one for general learning. To compare the two, the researcher formed two comparison classes of 6th graders from two elementary schools. Statistical analysis was conducted with analysis of covariance and the t-test using SPSS WIN 12.0.

Evaluation of Attribute Selection Methods and Prior Discretization in Supervised Learning

  • Cha, Woon Ock;Huh, Moon Yul
    • Communications for Statistical Applications and Methods / v.10 no.3 / pp.879-894 / 2003
  • We evaluated the efficiency of applying attribute selection methods and prior discretization to supervised learning, modelled by C4.5 and Naive Bayes. Three databases were obtained from the UCI data archive; each consisted of continuous attributes except for one decision attribute. Four methods were used for attribute selection: MDI, ReliefF, Gain Ratio and the Consistency-based method. MDI and ReliefF can be used for both continuous and discrete attributes, but the other two methods can be used only for discrete attributes. Discretization was performed using the Fayyad and Irani method. To investigate the effect of noise in the database, noise was introduced into the data sets at levels of 10% or 20%, and the data, both with and without noise, were then processed through the steps of attribute selection, discretization and classification. The results indicate that classifying the data on the selected attributes yields higher accuracy than classifying the full data set, and that prior discretization does not lower the accuracy.
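As an illustration of one of the attribute selection criteria named in the abstract above, the following is a minimal sketch of the gain ratio for a single discrete attribute. It uses the standard textbook definition (information gain divided by split information); the function names `entropy` and `gain_ratio` are illustrative choices and are not taken from the paper.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels):
    """Gain ratio of one discrete attribute with respect to the class labels."""
    n = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected class entropy after splitting on the attribute.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0
```

An attribute that separates the classes perfectly scores highest, and an attribute with a single constant value scores zero because its split information is zero.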

Harnessing sparsity in lamb wave-based damage detection for beams

  • Sen, Debarshi;Nagarajaiah, Satish;Gopalakrishnan, S.
    • Structural Monitoring and Maintenance / v.4 no.4 / pp.381-396 / 2017
  • Structural health monitoring (SHM) is a necessity for reliable and efficient functioning of engineering systems. Damage detection (DD) is a crucial component of any SHM system. Lamb waves are a popular means of DD owing to their sensitivity to small damage over a substantial length. This typically involves an active sensing paradigm in a pitch-catch setting with two piezo-sensors, a transmitter and a receiver. In this paper, we propose a data-intensive DD approach for beam structures using high-frequency signals acquired from beams in a pitch-catch setting. The key idea is to develop a statistical learning-based approach that harnesses the inherent sparsity in the problem. The proposed approach performs damage detection and localization in beams. Quantification is also possible with prior calibration. We demonstrate numerically that the proposed approach achieves 100% accuracy in detection and localization even with a signal-to-noise ratio of 25 dB.

A Technique of Statistical Message Filtering for Blocking Spam Message (통계적 기법을 이용한 스팸메시지 필터링 기법)

  • Kim, Seongyoon;Cha, Taesoo;Park, Jeawon;Choi, Jaehyun;Lee, Namyong
    • Journal of Information Technology Services / v.13 no.3 / pp.299-308 / 2014
  • Spam messages received indiscriminately in the information society harm not only individuals but also the wider community. Many spam filtering techniques, such as character blocking, are being actively studied. Most of these studies use content-based spam filtering through machine learning. As spam transmission techniques have developed, spammers have come to send spam messages using term spamming techniques. Spam messages tend to include a large number of nouns, repeated words, and special characters inserted between the words of a sentence. In this paper, we consider these three features, use the SPSS statistical program for parameterization, and derive an equation. Based on this equation, we measured the performance of spam message classification. Compared with previous studies, the method was confirmed to perform well on FP-rate, further minimizing the cost of the product.

A Comparative Study on the Bankruptcy Prediction Power of Statistical Model and AI Models: MDA, Inductive,Neural Network (기업도산예측을 위한 통계적모형과 인공지능 모형간의 예측력 비교에 관한 연구 : MDA,귀납적 학습방법, 인공신경망)

  • 이건창
    • Journal of the Korean Operations Research and Management Science Society / v.18 no.2 / pp.57-81 / 1993
  • This paper analyzes the bankruptcy prediction power of three methods: Multivariate Discriminant Analysis (MDA), Inductive Learning, and Neural Network. MDA is well known for its effectiveness in predicting bankruptcy in the accounting field. However, it requires rigorous statistical assumptions, so violating one of the assumptions may result in biased outputs. We therefore propose the use of two AI models, inductive learning and neural network, as alternatives for bankruptcy prediction. To compare the performance of these two AI models with that of MDA, we performed extensive experiments with a number of Korean bankruptcy cases. Experimental results show that the AI models proposed in this study can yield more robust and better generalizing bankruptcy prediction than the conventional MDA.

Support Vector Machine based on Stratified Sampling

  • Jun, Sung-Hae
    • International Journal of Fuzzy Logic and Intelligent Systems / v.9 no.2 / pp.141-146 / 2009
  • The support vector machine is a classification algorithm based on statistical learning theory. It has produced many results with good performance in data mining. However, the algorithm has some problems, one of which is its heavy computing cost. This has made the support vector machine difficult to use in dynamic and online systems. To overcome this problem, we propose using stratified sampling from statistical sampling theory. Stratified sampling helps reduce the size of the training data. In our paper, the classification accuracy is maintained even though the data size is small. We verify the improved performance with experimental results using data sets from the UCI machine learning repository.
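The idea of shrinking the training set by stratified sampling before fitting a classifier can be sketched as follows. The abstract does not say how the strata or the allocation were chosen, so this sketch assumes the class labels serve as strata and that each stratum is sampled proportionally; the function name `stratified_sample` and its parameters are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(X, y, fraction, seed=0):
    """Draw the same fraction of rows from every class (stratum).

    Assumption: strata are the class labels and allocation is proportional.
    Returns the reduced (X, y), preserving the original row order.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    chosen = []
    for indices in by_class.values():
        # Keep at least one row per class so no class disappears.
        k = max(1, round(fraction * len(indices)))
        chosen.extend(rng.sample(indices, k))
    chosen.sort()
    return [X[i] for i in chosen], [y[i] for i in chosen]
```

The reduced sample keeps the class proportions of the full data, so it could then be passed to any SVM trainer at a fraction of the original training cost.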

Predicting movie audience with stacked generalization by combining machine learning algorithms

  • Park, Junghoon;Lim, Changwon
    • Communications for Statistical Applications and Methods / v.28 no.3 / pp.217-232 / 2021
  • The Korean film industry has matured, and the number of movies watched per capita has reached the highest level in the world. Since then, the growth rate of the movie industry has been decreasing, and even total annual movie sales decreased slightly in 2018. The number of moviegoers is the primary driver of sales in the movie industry and also an important factor influencing additional sales. It is therefore important to predict the size of movie audiences. In this study, we predict the cumulative audience of films using stacking, an ensemble method that combines all the algorithms used in the prediction. We use box office data from the Korea Film Council and web comment data from Daum Movie (www.movie.daum.net). This paper describes the collection and preprocessing of the explanatory variables and explains the regression models used in stacking. The final stacking model shows superior test-set prediction performance in terms of RMSE.
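To make the stacking idea concrete, here is a minimal sketch with two toy base learners and a one-weight linear meta-learner trained on out-of-fold predictions. It is not the paper's model: the base learners (`fit_mean`, `fit_line`), the fold scheme, and the meta-learner are all illustrative assumptions chosen to keep the example short.

```python
def fit_mean(X, y):
    """Base learner 1: always predicts the training mean."""
    m = sum(y) / len(y)
    return lambda x: m

def fit_line(X, y):
    """Base learner 2: least-squares line on the first feature."""
    xs = [row[0] for row in X]
    mx, my = sum(xs) / len(xs), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, y)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x[0] - mx)

def out_of_fold(fit, X, y, k):
    """Predict each row with a model trained on the other k-1 folds."""
    preds = [0.0] * len(X)
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in range(fold, len(X), k):
            preds[i] = model(X[i])
    return preds

def fit_stack(fit_a, fit_b, X, y, k=5):
    """Stack two base learners with a single blending weight w."""
    pa = out_of_fold(fit_a, X, y, k)
    pb = out_of_fold(fit_b, X, y, k)
    den = sum((a - b) ** 2 for a, b in zip(pa, pb))
    # Least-squares solution for w in: y ~ w * pa + (1 - w) * pb.
    w = sum((t - b) * (a - b) for t, a, b in zip(y, pa, pb)) / den if den else 0.5
    # Refit both base learners on all the data for final predictions.
    model_a, model_b = fit_a(X, y), fit_b(X, y)
    return lambda x: w * model_a(x) + (1 - w) * model_b(x)
```

Training the meta-learner on out-of-fold predictions rather than in-sample predictions keeps it from rewarding base learners that merely overfit the training rows.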

Character Recognition Based on Adaptive Statistical Learning Algorithm

  • K.C. Koh;Park, H.J.;Kim, J.S.;K. Koh;H.S. Cho
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2001.10a / pp.109.2-109 / 2001
  • In PCB assembly lines, as components become smaller and more complex, the conventional inspection method using traditional ICT and function tests shows its limitations. Automatic optical inspection (AOI) is gradually becoming the alternative in the PCB assembly line. In particular, PCB inspection machines need more reliable and flexible object recognition algorithms for high inspection accuracy. Conventional AOI machines use algorithmic approaches such as template matching, Fourier analysis, edge analysis, geometric feature recognition or optical character recognition (OCR), most of which require considerable teaching time and operator expertise. To solve this problem, this paper proposes a statistical learning based part recognition method. The performance of the ...

The use of support vector machines in semi-supervised classification

  • Bae, Hyunjoo;Kim, Hyungwoo;Shin, Seung Jun
    • Communications for Statistical Applications and Methods / v.29 no.2 / pp.193-202 / 2022
  • Semi-supervised learning has gained significant attention in recent applications. In this article, we provide a selective overview of popular semi-supervised methods and then propose a simple but effective algorithm for semi-supervised classification using support vector machines (SVM), one of the most popular binary classifiers in the machine learning community. The idea is simple. First, we apply dimension reduction to the unlabeled observations and cluster them to assign labels in the reduced space. SVM is then applied to the combined set of labeled and unlabeled observations to construct a classification rule. The use of SVM enables us to extend the method to its nonlinear counterpart via the kernel trick. Our numerical experiments under various scenarios demonstrate that the proposed method is promising in semi-supervised classification.

A review and comparison of convolution neural network models under a unified framework

  • Park, Jimin;Jung, Yoonsuh
    • Communications for Statistical Applications and Methods / v.29 no.2 / pp.161-176 / 2022
  • There has been active research in image classification using deep learning convolutional neural network (CNN) models. The ImageNet large-scale visual recognition challenge (ILSVRC) (2010-2017) was one of the most important competitions that boosted the development of efficient deep learning algorithms. This paper introduces and compares six monumental models that achieved high prediction accuracy in ILSVRC. First, we review the models to illustrate their unique structures and characteristics. We then compare the models under a unified framework. To this end, additional devices that are not crucial to the structure are excluded. Four popular data sets with different characteristics are then considered to measure the prediction accuracy. By investigating the characteristics of the data sets and the models being compared, we provide some insight into the architectural features of the models.