• Title/Abstract/Keywords: statistical learning approach

Search results: 159 items (processing time: 0.028 s)

The Role of Distributional Cues in the Acquisition of Verb Argument Structures

  • Kim, Mee-Sook
    • 한국언어정보학회지:언어와정보 / Vol. 7, No. 1 / pp. 87-99 / 2003
  • This paper investigates the role of input frequency in the acquisition of verb argument structures, based on distributional information from a corpus of utterances derived from the English CHILDES database (MacWhinney 1993). It has been widely accepted that children successfully learn verb argument structures through innate language mechanisms, such as linking rules that connect verb meanings and their syntactic structures. In contrast, an approach to language acquisition called “statistical language learning” has recently claimed that children could succeed in acquiring syntactic structures in the absence of innate language mechanisms by making use of distributional properties of the input. In this paper, I evaluate the feasibility of statistical learning in acquiring verb argument structures, based on distributional information about locative verbs in parental input. The naturalistic data allow us to investigate to what extent the statistical learning approach can and cannot help children succeed in learning the syntax of locative verbs. Based on the results of the English database analysis, I show that parental input contains rich statistical information for learning the syntactic possibilities of locative verbs, despite some limitations of the statistical learning approach.


A Co-Evolutionary Computing for Statistical Learning Theory

  • Jun Sung-Hae
    • International Journal of Fuzzy Logic and Intelligent Systems / Vol. 5, No. 4 / pp. 281-285 / 2005
  • Learning and evolving are two basic components of data mining. Whereas classical learning theory relies on an objective function that minimizes training error, evolutionary computing has recently offered an efficient way to construct an optimal model without minimizing training error. The global search that evolutionary computing performs over the solution space can resolve the local optima problems of learning models. In this research, we combine a co-evolving algorithm with statistical learning theory and propose a co-evolutionary computing approach that overcomes the local optima problems of statistical learning theory. We apply the proposed model to classification and prediction problems. In the experiments, we verify the improved performance of our model using data sets from the UCI machine learning repository and KDD Cup 2000.

Application of data mining and statistical measurement of agricultural high-quality development

  • Yan Zhou
    • Advances in nano research / Vol. 14, No. 3 / pp. 225-234 / 2023
  • In this study, we aim to use big data resources and statistical analysis to obtain reliable guidance for achieving high-quality, high-yield agricultural production. To this end, soil type data, rainfall and temperature data, and annual wheat production are collected for a specific region. Using statistical methodology, the acquired data were cleaned to remove incomplete and defective records. Afterwards, using several machine learning classification methods, we tried to distinguish between different factors and their influence on the final crop yield. The efficacy of the machine learning methods is discussed by comparing the proposed models' predictions using two statistical quantities, the correlation factor and the mean squared error between predicted and actual crop yields. The results of the analysis show that machine learning methods predict crop yields with high accuracy. Moreover, the random forest (RF) classification approach provides the best results among the classification methods used in this study.
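The two comparison quantities named in this abstract, the correlation factor and the mean squared error between predicted and actual yields, can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the correlation factor is the Pearson correlation coefficient, and the function names and sample yield values are hypothetical.

```python
import math


def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def pearson_correlation(actual, predicted):
    """Pearson correlation coefficient (assumed meaning of 'correlation factor')."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant inputs")
    return cov / math.sqrt(var_a * var_p)


if __name__ == "__main__":
    # Hypothetical yields, for illustration only (not data from the paper).
    actual_yield = [3.1, 2.8, 3.6, 4.0]
    predicted_yield = [3.0, 2.9, 3.4, 4.2]
    print("MSE:", mean_squared_error(actual_yield, predicted_yield))
    print("r:", pearson_correlation(actual_yield, predicted_yield))
```

A lower mean squared error and a correlation closer to 1 would both indicate that a model's predictions track the observed yields more closely.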

Detecting outliers in segmented genomes of flu virus using an alignment-free approach

  • Daoud, Mosaab
    • Genomics & Informatics / Vol. 18, No. 1 / pp. 2.1-2.11 / 2020
  • In this paper, we propose a new approach to detecting outliers in a set of segmented genomes of the flu virus, a data set with a heterogeneous set of sequences. The approach has the following computational phases: feature extraction, which is a mapping into feature space, alignment-free distance measure to measure the distance between any two segmented genomes, and a mapping into distance space to analyze a quantum of distance values. The approach is implemented using supervised and unsupervised learning modes. The experiments show robustness in detecting outliers of the segmented genome of the flu virus.
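The abstract does not specify its feature mapping or distance measure, so the following is only a generic sketch of an alignment-free comparison. It assumes k-mer frequency profiles as the feature space and Euclidean distance as the measure, and it scores each sequence by its mean distance to the others. All function names and the toy sequences are hypothetical, not the author's method.

```python
import math
from collections import Counter
from itertools import product


def kmer_profile(sequence, k=2):
    """Map a nucleotide sequence to a vector of k-mer relative frequencies."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Only k-mers over the alphabet ACGT are counted; others (e.g. with N) are ignored.
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


def profile_distance(seq_a, seq_b, k=2):
    """Euclidean distance between two k-mer profiles (no alignment needed)."""
    profile_a = kmer_profile(seq_a, k)
    profile_b = kmer_profile(seq_b, k)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(profile_a, profile_b)))


def mean_distance_scores(sequences, k=2):
    """Score each sequence by its mean distance to all other sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    scores = []
    for i, seq in enumerate(sequences):
        others = [profile_distance(seq, other, k)
                  for j, other in enumerate(sequences) if j != i]
        scores.append(sum(others) / len(others))
    return scores


if __name__ == "__main__":
    # Toy sequences for illustration only.
    toy = ["ACGTACGT", "ACGTACGA", "TTTTTTTT"]
    print(mean_distance_scores(toy))
```

Under these assumptions, a sequence whose score is much larger than the rest would be a candidate outlier; the cut-off for "much larger" is a separate choice that this sketch leaves open.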

Harnessing sparsity in lamb wave-based damage detection for beams

  • Sen, Debarshi;Nagarajaiah, Satish;Gopalakrishnan, S.
    • Structural Monitoring and Maintenance / Vol. 4, No. 4 / pp. 381-396 / 2017
  • Structural health monitoring (SHM) is a necessity for reliable and efficient functioning of engineering systems. Damage detection (DD) is a crucial component of any SHM system. Lamb waves are a popular means of DD owing to their sensitivity to small damage over a substantial length. This typically involves an active sensing paradigm in a pitch-catch setting with two piezo-sensors, a transmitter and a receiver. In this paper, we propose a data-intensive DD approach for beam structures using high-frequency signals acquired from beams in a pitch-catch setting. The key idea is to develop a statistical learning-based approach that harnesses the inherent sparsity in the problem. The proposed approach performs damage detection and localization in beams. In addition, quantification is possible with prior calibration. We demonstrate numerically that the proposed approach achieves 100% accuracy in detection and localization even with a signal-to-noise ratio of 25 dB.

A Multiple Instance Learning Problem Approach Model to Anomaly Network Intrusion Detection

  • Weon, Ill-Young;Song, Doo-Heon;Ko, Sung-Bum;Lee, Chang-Hoon
    • Journal of Information Processing Systems / Vol. 1, No. 1 / pp. 14-21 / 2005
  • Although statistical methods have mainly been used in anomaly network intrusion detection, machine learning based anomaly detection was introduced to detect various attack types. Machine learning based anomaly detection started from research applying traditional learning algorithms of artificial intelligence to intrusion detection. However, the detection rates of these methods are not satisfactory. In particular, high false positive rates and repeated alarms about the same attack are problems. The main reason is that a single packet is used as the basic learning unit, whereas most attacks consist of more than one packet. In addition, an attack does not lead to a consecutive packet stream. Therefore, a new approach of group-based learning and detection, built on grouping related packets, is needed. This type of approach is similar to multiple-instance problems in the artificial intelligence community, in which a single instance cannot be clearly classified but a group can. We suggest a group generation algorithm that groups related packets, and a learning algorithm that uses such a group as its unit. To verify the usefulness of the suggested algorithms, the 1998 DARPA data set was used, and the results show that our approach is quite useful.

머신러닝 기법을 활용한 공장 에너지 사용량 데이터 분석 (Machine Learning Approach for Pattern Analysis of Energy Consumption in Factory)

  • 성종훈;조영식
    • 정보처리학회논문지:컴퓨터 및 통신 시스템 / Vol. 8, No. 4 / pp. 87-92 / 2019
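  • This study addresses data analysis and pattern extraction for factory energy consumption using machine learning techniques. Whereas statistical and other conventional methods build mathematical models that reflect a few physical characteristics, the machine learning approach determines the model coefficients by learning from data. Conventional methods face the difficulty of having to build a mathematical model with a specific structure, and it has been questioned whether such models reflect the characteristics of the data well. Machine learning methods, by contrast, readily handle tasks that are difficult for a person to construct by hand, which makes them more efficient at identifying relationships in the data. Several factors directly affect a factory's energy consumption, and this power consumption appears as time-series data. Sensors were installed on each factor to measure the power it consumes and to build a database. The acquired data were preprocessed and statistically analyzed, and patterns were then analyzed using machine learning. In this way, pattern analysis was carried out on the factory's power consumption data.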

A Machine Learning Approach to Korean Language Stemming

  • Cho, Se-hyeong
    • 한국지능시스템학회논문지 / Vol. 11, No. 6 / pp. 549-557 / 2001
  • Morphological analysis and POS tagging require a dictionary for the language at hand. In this fashion, though, it is impossible to analyze a language without a dictionary. We also have difficulty if a significant portion of the vocabulary is new or unknown. This paper explores the possibility of learning the morphology of an agglutinative language, in particular Korean, without any prior lexical knowledge of the language. We use unsupervised learning in that there is no instructor to guide the outcome of the learner, nor any tagged corpus. Here are the main characteristics of the approach. First, we use only a raw corpus without any tags attached or any dictionary. Second, unlike many heuristics that are theoretically ungrounded, this method is based on statistical methods, which are widely accepted. The method is currently applied only to Korean, but since it is essentially language-neutral it can easily be adapted to other agglutinative languages.


Emotional Correlation Test from Binary Gender Perspective using Kansei Engineering Approach on IVML Prototype

  • Nur Faraha Mohd, Naim;Mintae, Hwang
    • Journal of information and communication convergence engineering / Vol. 21, No. 1 / pp. 68-74 / 2023
  • This study examines users' emotional responses toward interactive video mobile learning (IVML) from a gender perspective. An IVML prototype was developed for the Android platform, allowing users to install and use the app for m-learning purposes. This study aims to measure the level of feelings toward the IVML prototype and examine the differences between gender perspectives, to identify the most responsive feelings among male and female users as prominent feelings, and to measure the correlation between user-friendly feeling traits, as an independent variable, and gender attributes. The feelings response could then be extracted from the user experience, user interface, and human-computer interaction based on gender perspectives, using the Kansei engineering approach as the measurement method. The statistical results showed that the different emotional reactions of male and female users toward the IVML prototype may or may not correlate with the user-friendly trait, and that the two groups may share similar emotional responses.

Input Variable Importance in Supervised Learning Models

  • Huh, Myung-Hoe;Lee, Yong Goo
    • Communications for Statistical Applications and Methods / Vol. 10, No. 1 / pp. 239-246 / 2003
  • Statisticians, or data miners, are often asked to assess the importance of input variables in a given supervised learning model. For this purpose, one may rely on separate ad hoc measures depending on the model type, such as linear regression, neural networks, or trees. Consequently, input variable importance measures lack conceptual consistency, so they cannot be used directly to compare different types of models, which is often done in data mining processes. In this short communication, we propose a unified approach to measuring the importance of input variables. Our method uses sensitivity analysis, which begins by perturbing the values of input variables and monitors the change in the output. The research scope is limited to models for continuous output, although it is not difficult to extend the method to supervised learning models for categorical outcomes.
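The idea of perturbing input values and monitoring the output change can be sketched in a model-agnostic way. The abstract does not give the perturbation scheme, so this sketch assumes a fixed additive shift `delta` applied to one variable at a time, with the mean absolute change in a continuous output as the importance score. The function names and the toy model are hypothetical, not the authors' procedure.

```python
def perturbation_importance(model, rows, delta=1.0):
    """Mean absolute output change when each input variable is shifted by delta.

    `model` is any callable that maps a list of input values to a number,
    so the same measure applies regardless of the model type.
    """
    if not rows:
        raise ValueError("rows must be non-empty")
    n_vars = len(rows[0])
    baseline = [model(list(row)) for row in rows]
    importances = []
    for j in range(n_vars):
        total_change = 0.0
        for row, base in zip(rows, baseline):
            perturbed = list(row)
            perturbed[j] += delta  # perturb only variable j
            total_change += abs(model(perturbed) - base)
        importances.append(total_change / len(rows))
    return importances


def toy_model(x):
    """Hypothetical fitted model: output depends strongly on x[0], weakly on x[1]."""
    return 3.0 * x[0] + 0.5 * x[1]


if __name__ == "__main__":
    data = [[0.0, 0.0], [1.0, 2.0], [2.0, -1.0]]
    print(perturbation_importance(toy_model, data))
```

Because the score only needs model outputs, it can be computed the same way for a regression, a neural network, or a tree, which is the kind of cross-model comparability the abstract argues for. In practice the shift would usually be scaled to each variable's spread, which this sketch leaves out.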