• Title/Summary/Keyword: statistical learning approach

The Role of Distributional Cues in the Acquisition of Verb Argument Structures

  • Kim, Mee-Sook
    • Language and Information / v.7 no.1 / pp.87-99 / 2003
  • This paper investigates the role of input frequency in the acquisition of verb argument structures, based on distributional information from a corpus of utterances drawn from the English CHILDES database (MacWhinney 1993). It has been widely accepted that children successfully learn verb argument structures through innate language mechanisms, such as linking rules that connect verb meanings to their syntactic structures. In contrast, an approach to language acquisition called “statistical language learning” has recently claimed that children could succeed in acquiring syntactic structures in the absence of innate language mechanisms by making use of distributional properties of the input. In this paper, I evaluate the feasibility of statistical learning for acquiring verb argument structures, based on distributional information about locative verbs in parental input. The naturalistic data allow us to investigate to what extent the statistical learning approach can and cannot help children learn the syntax of locative verbs. Based on the results of the English database analysis, I show that parental input contains rich statistical information for learning the syntactic possibilities of locative verbs, despite some limitations of the statistical learning approach.

A Co-Evolutionary Computing for Statistical Learning Theory

  • Jun Sung-Hae
    • International Journal of Fuzzy Logic and Intelligent Systems / v.5 no.4 / pp.281-285 / 2005
  • Learning and evolving are two basic components of data mining. Compared with classical learning theory, which is based on an objective function that minimizes training errors, evolutionary computing has recently offered an efficient approach to constructing an optimal model without minimizing training errors. The global search that evolutionary computing performs over the solution space can resolve the local optima problems of learning models. In this research, we combine a co-evolving algorithm with statistical learning theory and propose a co-evolutionary computing method for statistical learning theory that overcomes its local optima problems. We apply the proposed model to classification and prediction problems. In the experimental results, we verify the improved performance of our model using data sets from the UCI machine learning repository and KDD Cup 2000.

Application of data mining and statistical measurement of agricultural high-quality development

  • Yan Zhou
    • Advances in nano research / v.14 no.3 / pp.225-234 / 2023
  • In this study, we aim to use big data resources and statistical analysis to obtain reliable guidance for achieving high-quality, high-yield agricultural production. To this end, soil type data, rainfall and temperature data, and annual wheat production are collected for a specific region. Using statistical methodology, the acquired data were cleaned to remove incomplete and defective records. Afterwards, using several machine learning classification methods, we tried to distinguish between different factors and their influence on the final crop yields. The efficacy of the machine learning methods is discussed by comparing the models' predictions using two statistical quantities, the correlation factor and the mean squared error between the predicted and actual crop yields. The results of the analysis show that machine learning methods predict crop yields with high accuracy. Moreover, the random forest (RF) classification approach provides the best results among the classification methods used in this study.
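
The two comparison statistics named in this abstract can be sketched as follows. This is a minimal illustration, assuming that "correlation factor" means the Pearson correlation coefficient; the function names and the yield figures are hypothetical and do not come from the paper.

```python
from math import sqrt
from typing import Sequence


def mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average squared difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient (assumed reading of 'correlation factor')."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical yields (tonnes per hectare), for illustration only.
    actual_yield = [3.1, 3.4, 2.9, 3.8]
    predicted_yield = [3.0, 3.5, 3.0, 3.6]
    print("MSE:", mean_squared_error(actual_yield, predicted_yield))
    print("r:", pearson_r(actual_yield, predicted_yield))
```

A lower mean squared error and a correlation closer to 1 would both point to a model whose predictions track the observed yields more closely.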

Detecting outliers in segmented genomes of flu virus using an alignment-free approach

  • Daoud, Mosaab
    • Genomics & Informatics / v.18 no.1 / pp.2.1-2.11 / 2020
  • In this paper, we propose a new approach to detecting outliers in a set of segmented genomes of the flu virus, a data set with a heterogeneous set of sequences. The approach has the following computational phases: feature extraction, which is a mapping into feature space; an alignment-free distance measure that gives the distance between any two segmented genomes; and a mapping into distance space to analyze a quantum of distance values. The approach is implemented using supervised and unsupervised learning modes. The experiments show robustness in detecting outliers of the segmented genome of the flu virus.
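
The abstract does not say which features or distance the authors use. As a rough sketch of what an alignment-free comparison can look like, the block below maps each sequence to a vector of k-mer frequencies (a common alignment-free feature, assumed here) and takes the Euclidean distance between the vectors. The function names and the choice of k are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt
from typing import List


def kmer_profile(seq: str, k: int = 2) -> List[float]:
    """Map a nucleotide sequence to relative frequencies of all 4**k k-mers."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Normalise only over k-mers made of A, C, G, T; others (e.g. with N) are ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[m] / total for m in kmers]


def kmer_distance(a: str, b: str, k: int = 2) -> float:
    """Euclidean distance between the k-mer profiles of two sequences."""
    pa, pb = kmer_profile(a, k), kmer_profile(b, k)
    return sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))
```

For a segmented genome, one option (again an assumption) would be to compute this distance segment by segment and then combine the per-segment values before looking for outliers.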

Harnessing sparsity in lamb wave-based damage detection for beams

  • Sen, Debarshi;Nagarajaiah, Satish;Gopalakrishnan, S.
    • Structural Monitoring and Maintenance / v.4 no.4 / pp.381-396 / 2017
  • Structural health monitoring (SHM) is a necessity for reliable and efficient functioning of engineering systems. Damage detection (DD) is a crucial component of any SHM system. Lamb waves are a popular means of DD owing to their sensitivity to small damage over a substantial length. This typically involves an active sensing paradigm in a pitch-catch setting that uses two piezo-sensors, a transmitter and a receiver. In this paper, we propose a data-intensive DD approach for beam structures using high-frequency signals acquired from beams in a pitch-catch setting. The key idea is to develop a statistical learning-based approach that harnesses the inherent sparsity in the problem. The proposed approach performs damage detection and localization in beams. In addition, quantification is also possible with prior calibration. We demonstrate numerically that the proposed approach achieves 100% accuracy in detection and localization even with a signal-to-noise ratio of 25 dB.

A Multiple Instance Learning Problem Approach Model to Anomaly Network Intrusion Detection

  • Weon, Ill-Young;Song, Doo-Heon;Ko, Sung-Bum;Lee, Chang-Hoon
    • Journal of Information Processing Systems / v.1 no.1 s.1 / pp.14-21 / 2005
  • Although statistical methods have mainly been used in anomaly-based network intrusion detection, machine learning based anomaly detection was introduced to detect various attack types. Machine learning based anomaly detection started from research applying traditional learning algorithms of artificial intelligence to intrusion detection. However, the detection rates of these methods are not satisfactory. In particular, high false positive rates and repeated alarms about the same attack are problems. The main reason is that a single packet is used as the basic learning unit, whereas most attacks consist of more than one packet. In addition, an attack does not form a consecutive packet stream. Therefore, a new approach of group-based learning and detection, built on grouping related packets, is needed. This type of approach is similar to multiple-instance problems in the artificial intelligence community, in which a single instance cannot be clearly classified but a group can. We suggest a group generation algorithm that groups related packets, and a learning algorithm that uses such a group as its unit. To verify the usefulness of the suggested algorithms, the 1998 DARPA data set was used, and the results show that our approach is quite useful.
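
The group-based idea can be illustrated with a small sketch. The abstract does not give the authors' group generation or learning algorithms, so everything below is an assumption: packets are grouped by source, destination, and a fixed time window, each packet carries a hypothetical anomaly score from some upstream model, and a group is flagged under the standard multiple-instance assumption that one suspicious instance makes the whole bag positive.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Packet(NamedTuple):
    time: float   # seconds since capture start
    src: str      # source address
    dst: str      # destination address
    score: float  # hypothetical per-packet anomaly score in [0, 1]


GroupKey = Tuple[str, str, int]


def group_packets(packets: List[Packet], window: float = 1.0) -> Dict[GroupKey, List[Packet]]:
    """Group packets that share (src, dst) and fall in the same time window.

    This grouping rule is an illustrative assumption, not the paper's algorithm.
    """
    groups: Dict[GroupKey, List[Packet]] = defaultdict(list)
    for p in packets:
        groups[(p.src, p.dst, int(p.time // window))].append(p)
    return dict(groups)


def bag_is_anomalous(bag: List[Packet], threshold: float = 0.5) -> bool:
    """Standard multiple-instance rule: a bag is positive if any instance is."""
    return max(p.score for p in bag) >= threshold
```

Raising one alarm per group rather than per packet is one way such a scheme could reduce repeated alarms about the same attack.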

Machine Learning Approach for Pattern Analysis of Energy Consumption in Factory

  • Sung, Jong Hoon;Cho, Yeong Sik
    • KIPS Transactions on Computer and Communication Systems / v.8 no.4 / pp.87-92 / 2019
  • This paper describes a pattern analysis of factory energy consumption data using machine learning methods. While conventional statistical methods require specific equations to represent the physical characteristics of the plant, a machine learning based approach uses historical data and calculates the result effectively. Although a rule-based approach calculates energy usage with physical equations, it is hard to identify the exact equations that represent the factory's characteristics and the hidden variables affecting the results. The machine learning approach, by contrast, is relatively useful for quickly finding relations within the data. The factory has several components that directly affect electricity consumption: machines, lighting, computers, and indoor systems such as HVAC (heating, ventilation and air conditioning). The energy loads from those components are generated in real time, and these data can be represented as time series. Various sensors were installed in the factory to build a database by collecting energy usage data from the components. After a preliminary statistical analysis for data mining, time-series clustering techniques are applied to extract the energy load patterns. This research can contribute to the development of a Factory Energy Management System (FEMS).

A Machine Learning Approach to Korean Language Stemming

  • Cho, Se-hyeong
    • Journal of the Korean Institute of Intelligent Systems / v.11 no.6 / pp.549-557 / 2001
  • Morphological analysis and POS tagging require a dictionary for the language at hand. In this fashion, though, it is impossible to analyze a language without a dictionary. We also have difficulty if a significant portion of the vocabulary is new or unknown. This paper explores the possibility of learning the morphology of an agglutinative language, in particular Korean, without any prior lexical knowledge of the language. We use unsupervised learning in that there is no instructor to guide the outcome of the learner, nor any tagged corpus. The main characteristics of the approach are as follows. First, we use only a raw corpus without any tags attached or any dictionary. Second, unlike many heuristics that are theoretically ungrounded, this method is based on statistical methods, which are widely accepted. The method is currently applied only to Korean, but since it is essentially language-neutral it can easily be adapted to other agglutinative languages.

Emotional Correlation Test from Binary Gender Perspective using Kansei Engineering Approach on IVML Prototype

  • Nur Faraha Mohd, Naim;Mintae, Hwang
    • Journal of information and communication convergence engineering / v.21 no.1 / pp.68-74 / 2023
  • This study examines users' emotional responses toward interactive video mobile learning (IVML) from a gender perspective. An IVML prototype was developed for the Android platform, allowing users to install and use the app for m-learning purposes. The study aims to measure the level of feelings toward the IVML prototype and examine differences between gender perspectives, to identify the most responsive feelings among male and female users as prominent feelings, and to measure the correlation between gender attributes and user-friendly feeling traits as an independent variable. The feelings response could then be extracted from the user experience, user interface, and human-computer interaction based on gender perspectives, using the Kansei engineering approach as the measurement method. The statistical results demonstrated that the different emotional reactions of male and female users toward the IVML prototype may or may not correlate with the user-friendly trait, and that the emotional responses may be similar from one group to the other.

Input Variable Importance in Supervised Learning Models

  • Huh, Myung-Hoe;Lee, Yong Goo
    • Communications for Statistical Applications and Methods / v.10 no.1 / pp.239-246 / 2003
  • Statisticians, or data miners, are often asked to assess the importance of input variables in a given supervised learning model. For this purpose, one may rely on separate ad hoc measures depending on the model type, such as linear regression, neural networks, or trees. Consequently, input variable importance measures lack conceptual consistency, so the measures cannot be used directly to compare different types of models, which is often done in data mining processes. In this short communication, we propose a unified approach to measuring the importance of input variables. Our method uses sensitivity analysis, which begins by perturbing the values of input variables and monitors the change in output. The research scope is limited to models for continuous output, although it is not difficult to extend the method to supervised learning models for categorical outcomes.
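
The perturb-and-monitor idea in this abstract can be sketched in a model-agnostic way. The block below is an illustration under stated assumptions, not the authors' exact measure: each input variable is shifted by a multiple of its own standard deviation, and the mean absolute change in a continuous prediction is recorded. The function name, the `step` parameter, and the toy model are hypothetical.

```python
from statistics import pstdev
from typing import Callable, List, Sequence

Row = Sequence[float]


def perturbation_importance(
    predict: Callable[[Row], float],
    rows: List[Row],
    step: float = 1.0,
) -> List[float]:
    """Score each input variable by the mean absolute output change
    when that variable alone is shifted by `step` standard deviations."""
    if not rows:
        raise ValueError("rows must not be empty")
    baseline = [predict(r) for r in rows]
    scores = []
    for j in range(len(rows[0])):
        # Scale the shift to the variable's spread so units are comparable.
        delta = step * pstdev([r[j] for r in rows])
        total = 0.0
        for r, y0 in zip(rows, baseline):
            shifted = list(r)
            shifted[j] += delta
            total += abs(predict(shifted) - y0)
        scores.append(total / len(rows))
    return scores


if __name__ == "__main__":
    # Toy model for illustration: the third variable has no effect.
    def toy_model(x: Row) -> float:
        return 3.0 * x[0] + 0.5 * x[1] + 0.0 * x[2]

    data = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (2.0, 4.0, 6.0), (3.0, 6.0, 9.0)]
    print(perturbation_importance(toy_model, data))
```

Because the function only calls `predict`, the same scores can be computed for a regression, a neural network, or a tree, which is the kind of cross-model comparison the abstract argues for.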