• Title/Summary/Keyword: Statistical feature

Search Result 664, Processing Time 0.024 seconds

Face Recognition Using Feature Information and Neural Network

  • Chung, Jae-Mo;Bae, Hyeon;Kim, Sung-Shin
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2001.10a
    • /
    • pp.55.2-55
    • /
    • 2001
  • The statistical analysis of the feature extraction and the neural networks are proposed to recognize a human face. In the preprocessing step, the normalized skin color map with Gaussian functions is employed to extract the region efface candidate. The feature information in the region of face candidate is used to detect a face region. In the recognition step, as a tested, the 360 images of 30 persons are trained by the backpropagation algorithm. The images of each person are obtained from the various direction, pose, and facial expression, Input variables of the neural networks are the feature information that comes from the eigenface spaces. The simulation results of 30 persons show that the proposed method yields high recognition rates.

  • PDF

Use of Crown Feature Analysis to Separate the Two Pine Species in QuickBird Imagery

  • Kim, Cheon
    • Korean Journal of Remote Sensing
    • /
    • v.24 no.3
    • /
    • pp.267-272
    • /
    • 2008
  • Tree species-specific estimates with spacebome high-resolution imagery improve estimation of forest biomass which is needed to predict the long term planning for the sustainable forest management(SFM). This paper is a contribution to develop crown distinguishing coniferous species, Pinus densiflora and Pinus koraiensis, from QuickBird imagery. The proposed feature analysis derived from shape parameters and first and second-order statistical texture features of the same test area were compared for the two species separation and delineation. As expected, initial studies have shown that both formfactor and compactness shape parameters provided the successful differentiating method between the pine species within the compartment for single crown identification from spaceborne high resolution imagery. Another result revealed that the selected texture parameters - the mean, variance, angular second moment(ASM) - in the infrared band image could produce good subset combination of texture features for representing detailed tree crown outline.

Landslide susceptibility assessment using feature selection-based machine learning models

  • Liu, Lei-Lei;Yang, Can;Wang, Xiao-Mi
    • Geomechanics and Engineering
    • /
    • v.25 no.1
    • /
    • pp.1-16
    • /
    • 2021
  • Machine learning models have been widely used for landslide susceptibility assessment (LSA) in recent years. The large number of inputs or conditioning factors for these models, however, can reduce the computation efficiency and increase the difficulty in collecting data. Feature selection is a good tool to address this problem by selecting the most important features among all factors to reduce the size of the input variables. However, two important questions need to be solved: (1) how do feature selection methods affect the performance of machine learning models? and (2) which feature selection method is the most suitable for a given machine learning model? This paper aims to address these two questions by comparing the predictive performance of 13 feature selection-based machine learning (FS-ML) models and 5 ordinary machine learning models on LSA. First, five commonly used machine learning models (i.e., logistic regression, support vector machine, artificial neural network, Gaussian process and random forest) and six typical feature selection methods in the literature are adopted to constitute the proposed models. Then, fifteen conditioning factors are chosen as input variables and 1,017 landslides are used as recorded data. Next, feature selection methods are used to obtain the importance of the conditioning factors to create feature subsets, based on which 13 FS-ML models are constructed. For each of the machine learning models, a best optimized FS-ML model is selected according to the area under curve value. Finally, five optimal FS-ML models are obtained and applied to the LSA of the studied area. The predictive abilities of the FS-ML models on LSA are verified and compared through the receive operating characteristic curve and statistical indicators such as sensitivity, specificity and accuracy. The results showed that different feature selection methods have different effects on the performance of LSA machine learning models. FS-ML models generally outperform the ordinary machine learning models. The best FS-ML model is the recursive feature elimination (RFE) optimized RF, and RFE is an optimal method for feature selection.

Fault Location and Classification of Combined Transmission System: Economical and Accurate Statistic Programming Framework

  • Tavalaei, Jalal;Habibuddin, Mohd Hafiz;Khairuddin, Azhar;Mohd Zin, Abdullah Asuhaimi
    • Journal of Electrical Engineering and Technology
    • /
    • v.12 no.6
    • /
    • pp.2106-2117
    • /
    • 2017
  • An effective statistical feature extraction approach of data sampling of fault in the combined transmission system is presented in this paper. The proposed algorithm leads to high accuracy at minimum cost to predict fault location and fault type classification. This algorithm requires impedance measurement data from one end of the transmission line. Modal decomposition is used to extract positive sequence impedance. Then, the fault signal is decomposed by using discrete wavelet transform. Statistical sampling is used to extract appropriate fault features as benchmark of decomposed signal to train classifier. Support Vector Machine (SVM) is used to illustrate the performance of statistical sampling performance. The overall time of sampling is not exceeding 1 1/4 cycles, taking into account the interval time. The proposed method takes two steps of sampling. The first step takes 3/4 cycle of during-fault and the second step takes 1/4 cycle of post fault impedance. The interval time between the two steps is assumed to be 1/4 cycle. Extensive studies using MATLAB software show accurate fault location estimation and fault type classification of the proposed method. The classifier result is presented and compared with well-established travelling wave methods and the performance of the algorithms are analyzed and discussed.

Analysis of Market Trajectory Data using k-NN

  • Park, So-Hyun;Ihm, Sun-Young;Park, Young-Ho
    • Journal of Multimedia Information System
    • /
    • v.5 no.3
    • /
    • pp.195-200
    • /
    • 2018
  • Recently, as the sensor and big data analysis technology have been developed, there have been a lot of researches that analyze the purchase-related data such as the trajectory information and the stay time. Such purchase-related data is usefully used for the purchase pattern prediction and the purchase time prediction. Because it is difficult to find periodic patterns in large-scale human data, it is necessary to look at actual data sets, find various feature patterns, and then apply a machine learning algorithm appropriate to the pattern and purpose. Although existing papers have been used to analyze data using various machine learning methods, there is a lack of statistical analysis such as finding feature patterns before applying the machine learning algorithm. Therefore, we analyze the purchasing data of Songjeong Maeil Market, which is a data gathering place, and finds some characteristic patterns through statistical data analysis. Based on the results of 1, we derive meaningful conclusions by applying the machine learning algorithm and present future research directions. Through the data analysis, it was confirmed that the number of visits was different according to the regional characteristics around Songjeong Maeil Market, and the distribution of time spent by consumers could be grasped.

Bearing fault detection through multiscale wavelet scalogram-based SPC

  • Jung, Uk;Koh, Bong-Hwan
    • Smart Structures and Systems
    • /
    • v.14 no.3
    • /
    • pp.377-395
    • /
    • 2014
  • Vibration-based fault detection and condition monitoring of rotating machinery, using statistical process control (SPC) combined with statistical pattern recognition methodology, has been widely investigated by many researchers. In particular, the discrete wavelet transform (DWT) is considered as a powerful tool for feature extraction in detecting fault on rotating machinery. Although DWT significantly reduces the dimensionality of the data, the number of retained wavelet features can still be significantly large. Then, the use of standard multivariate SPC techniques is not advised, because the sample covariance matrix is likely to be singular, so that the common multivariate statistics cannot be calculated. Even though many feature-based SPC methods have been introduced to tackle this deficiency, most methods require a parametric distributional assumption that restricts their feasibility to specific problems of process control, and thus limit their application. This study proposes a nonparametric multivariate control chart method, based on multiscale wavelet scalogram (MWS) features, that overcomes the limitation posed by the parametric assumption in existing SPC methods. The presented approach takes advantage of multi-resolution analysis using DWT, and obtains MWS features with significantly low dimensionality. We calculate Hotelling's $T^2$-type monitoring statistic using MWS, which has enough damage-discrimination ability. A bootstrap approach is used to determine the upper control limit of the monitoring statistic, without any distributional assumption. Numerical simulations demonstrate the performance of the proposed control charting method, under various damage-level scenarios for a bearing system.

A note on the distance distribution paradigm for Mosaab-metric to process segmented genomes of influenza virus

  • Daoud, Mosaab
    • Genomics & Informatics
    • /
    • v.18 no.1
    • /
    • pp.7.1-7.7
    • /
    • 2020
  • In this paper, we present few technical notes about the distance distribution paradigm for Mosaab-metric using 1, 2, and 3 grams feature extraction techniques to analyze composite data points in high dimensional feature spaces. This technical analysis will help the specialist in bioinformatics and biotechnology to deeply explore the biodiversity of influenza virus genome as a composite data point. Various technical examples are presented in this paper, in addition, the integrated statistical learning pipeline to process segmented genomes of influenza virus is illustrated as sequential-parallel computational pipeline.

Detecting outliers in segmented genomes of flu virus using an alignment-free approach

  • Daoud, Mosaab
    • Genomics & Informatics
    • /
    • v.18 no.1
    • /
    • pp.2.1-2.11
    • /
    • 2020
  • In this paper, we propose a new approach to detecting outliers in a set of segmented genomes of the flu virus, a data set with a heterogeneous set of sequences. The approach has the following computational phases: feature extraction, which is a mapping into feature space, alignment-free distance measure to measure the distance between any two segmented genomes, and a mapping into distance space to analyze a quantum of distance values. The approach is implemented using supervised and unsupervised learning modes. The experiments show robustness in detecting outliers of the segmented genome of the flu virus.

Cross-architecture Binary Function Similarity Detection based on Composite Feature Model

  • Xiaonan Li;Guimin Zhang;Qingbao Li;Ping Zhang;Zhifeng Chen;Jinjin Liu;Shudan Yue
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.8
    • /
    • pp.2101-2123
    • /
    • 2023
  • Recent studies have shown that the neural network-based binary code similarity detection technology performs well in vulnerability mining, plagiarism detection, and malicious code analysis. However, existing cross-architecture methods still suffer from insufficient feature characterization and low discrimination accuracy. To address these issues, this paper proposes a cross-architecture binary function similarity detection method based on composite feature model (SDCFM). Firstly, the binary function is converted into vector representation according to the proposed composite feature model, which is composed of instruction statistical features, control flow graph structural features, and application program interface calling behavioral features. Then, the composite features are embedded by the proposed hierarchical embedding network based on a graph neural network. In which, the block-level features and the function-level features are processed separately and finally fused into the embedding. In addition, to make the trained model more accurate and stable, our method utilizes the embeddings of predecessor nodes to modify the node embedding in the iterative updating process of the graph neural network. To assess the effectiveness of composite feature model, we contrast SDCFM with the state of art method on benchmark datasets. The experimental results show that SDCFM has good performance both on the area under the curve in the binary function similarity detection task and the vulnerable candidate function ranking in vulnerability search task.

Feature Filtering Methods for Web Documents Clustering (웹 문서 클러스터링에서의 자질 필터링 방법)

  • Park Heum;Kwon Hyuk-Chul
    • The KIPS Transactions:PartB
    • /
    • v.13B no.4 s.107
    • /
    • pp.489-498
    • /
    • 2006
  • Clustering results differ according to the datasets and the performance worsens even while using web documents which are manually processed by an indexer, because although representative clusters for a feature can be obtained by statistical feature selection methods, irrelevant features(i.e., non-obvious features and those appearing in general documents) are not eliminated. Those irrelevant features should be eliminated for improving clustering performance. Therefore, this paper proposes three feature-filtering algorithms which consider feature values per document set, together with distribution, frequency, and weights of features per document set: (l) features filtering algorithm in a document (FFID), (2) features filtering algorithm in a document matrix (FFIM), and (3) a hybrid method combining both FFID and FFIM (HFF). We have tested the clustering performance by feature selection using term frequency and expand co link information, and by feature filtering using the above methods FFID, FFIM, HFF methods. According to the results of our experiments, HFF had the best performance, whereas FFIM performed better than FFID.