• Title/Summary/Keyword: Data Classification Systems

Classification of Remote Sensing Data using Random Selection of Training Data and Multiple Classifiers (훈련 자료의 임의 선택과 다중 분류자를 이용한 원격탐사 자료의 분류)

  • Park, No-Wook;Yoo, Hee Young;Kim, Yihyun;Hong, Suk-Young
    • Korean Journal of Remote Sensing / v.28 no.5 / pp.489-499 / 2012
  • In this paper, a classifier ensemble framework for remote sensing data classification is presented that combines classification results generated from both different training sets and different classifiers. The core of the framework is to increase the diversity between classification results by using both different training sets and different classifiers, and thereby to improve classification accuracy. First, training sets with different sampling densities are generated and used as inputs for supervised classification by classifiers with different discrimination capabilities. The preliminary classification results are then combined via a majority voting scheme to generate a final classification result. A case study of land-cover classification using multi-temporal ENVISAT ASAR data sets illustrates the potential of the framework. In the case study, nine classification results were combined, generated from three training sets and three classifiers: a maximum likelihood classifier, a multi-layer perceptron classifier, and a support vector machine. The results showed that complementary information on the discrimination of the land-cover classes of interest could be extracted within the proposed framework, and that it achieved the best classification accuracy. Among the combinations compared, combining classification results from classifiers with little diversity did not improve classification accuracy. It is therefore recommended to ensure greater diversity between classifiers when designing multiple classifier systems.
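
The majority voting step described in this abstract can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function name `majority_vote`, the tie-breaking rule and the toy labels are assumptions made for the example.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine several classification results into one by plurality vote.

    predictions: list of equal-length label sequences, one per
    (training set, classifier) pair, each covering the same pixels.
    Assumption: ties go to the label encountered first among the inputs.
    """
    if not predictions:
        raise ValueError("need at least one classification result")
    n_pixels = len(predictions[0])
    if any(len(p) != n_pixels for p in predictions):
        raise ValueError("all classification results must cover the same pixels")

    combined = []
    for votes in zip(*predictions):  # labels proposed for one pixel
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Toy example (hypothetical labels): three results for four pixels.
results = [
    ["water", "rice", "forest", "urban"],
    ["water", "rice", "rice", "urban"],
    ["water", "forest", "forest", "rice"],
]
final_map = majority_vote(results)
```

With nine inputs, as in the case study, the same function applies unchanged; only the length of `results` differs.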

Convolutional neural network-based data anomaly detection considering class imbalance with limited data

  • Du, Yao;Li, Ling-fang;Hou, Rong-rong;Wang, Xiao-you;Tian, Wei;Xia, Yong
    • Smart Structures and Systems / v.29 no.1 / pp.63-75 / 2022
  • The raw data collected by structural health monitoring (SHM) systems may suffer from multiple patterns of anomalies, which pose a significant barrier to automatic and accurate structural condition assessment. The detection and classification of these anomalies are therefore an essential pre-processing step for SHM systems. However, heterogeneous data patterns, scarce anomalous samples and severe class imbalance make data anomaly detection difficult. This study proposes a convolutional neural network-based data anomaly detection method. The time-domain and frequency-domain data are converted to images and used as the input of the neural network for training. ResNet18 is adopted as the feature extractor to avoid training with massive labelled data. In addition, the focal loss function is adopted to mitigate the classification bias induced by class imbalance. The effectiveness of the proposed method is validated using acceleration data collected on a long-span cable-stayed bridge. The proposed approach detects and classifies data anomalies with high accuracy.
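
The abstract does not state the focal loss settings used. The sketch below shows the standard focal loss definition, -alpha_t (1 - p_t)^gamma log(p_t), for a single sample in plain Python. The function name, the default gamma of 2.0 and the example probabilities are illustrative assumptions, not values from the paper.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=None):
    """Focal loss for one sample.

    probs:  predicted class probabilities (assumed to sum to 1)
    target: index of the true class
    gamma:  focusing parameter; gamma = 0 reduces to cross-entropy
    alpha:  optional per-class weights, e.g. larger for rare classes
    """
    # Clamp to avoid log(0) for a confidently wrong prediction.
    p_t = min(max(probs[target], 1e-12), 1.0)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# A well-classified sample contributes far less than a poorly classified one.
easy = focal_loss([0.9, 0.05, 0.05], target=0)
hard = focal_loss([0.2, 0.7, 0.1], target=0)
cross_entropy_easy = focal_loss([0.9, 0.05, 0.05], target=0, gamma=0.0)
```

The factor (1 - p_t)^gamma shrinks the loss of samples the network already classifies well, so the abundant normal class dominates training less than it would under plain cross-entropy.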

Research on Comparing the Size of the Data Workforce Across Countries (국가간 데이터직무 인력 규모 비교 연구)

  • Hyemi Um
    • Journal of Information Technology Applications and Management / v.31 no.1 / pp.79-95 / 2024
  • In modern society, as data plays a crucial role at the levels of businesses, industries, and nations, the utilization of data becomes increasingly important. Consequently, governments are prioritizing the development and implementation of plans to cultivate a data workforce, viewing the data industry as a cornerstone of national strategy. To enhance domestic capabilities and nurture the workforce in the data industry, an objective comparative analysis with major foreign countries is needed. This study therefore analyzes cases of domestic and international data industries and explores methods for quantitatively comparing the data industry workforce across nations. First, the study distinguishes between the "data industry workforce" and the "data job-related workforce," focusing on professionals who handle data-related tasks. It then compares the size of the data job-related workforce across nations, using standardized occupational classification codes based on the International Standard Classification of Occupations (ISCO). However, countries that employ their own occupational classification systems often require job titles with similar meanings to be matched for an accurate comparison. The study is expected to help policymakers establish future directions for cultivating a data workforce on the basis of comparable statistics.

Domain Adaptation Image Classification Based on Multi-sparse Representation

  • Zhang, Xu;Wang, Xiaofeng;Du, Yue;Qin, Xiaoyan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.5 / pp.2590-2606 / 2017
  • Research on classical image classification algorithms generally assumes that training data and testing data are derived from the same domain with the same distribution. Unfortunately, in practical applications, this assumption is rarely met. To address this problem, a domain adaptation image classification approach based on multi-sparse representation is proposed in this paper. Intermediate domains are hypothesized to exist between the source and target domains, and each intermediate subspace is modeled through online dictionary learning with target data updating. On the one hand, the reconstruction error of the target data is guaranteed; on the other, the transition from the source domain to the target domain is as smooth as possible. An augmented feature representation produced by invariant sparse codes across the source, intermediate and target domain dictionaries is employed for cross-domain recognition. Experimental results verify the effectiveness of the proposed algorithm.

Integrated GUI Environment of Parallel Fuzzy Inference System for Pattern Classification of Remote Sensing Images

  • Lee, Seong-Hoon;Lee, Sang-Gu;Son, Ki-Sung;Kim, Jong-Hyuk;Lee, Byung-Kwon
    • International Journal of Fuzzy Logic and Intelligent Systems / v.2 no.2 / pp.133-138 / 2002
  • In this paper, we propose an integrated GUI environment of a parallel fuzzy inference system for pattern classification of remote sensing data. Because 4 fuzzy variables in the condition part and 104 fuzzy rules are used, a real-time, parallel approach is required. For fast fuzzy computation, we use the scan line conversion algorithm to convert the lines of each fuzzy linguistic term to the closest integer pixels. We design four fuzzy processor units that operate in parallel on an FPGA. The GUI environment covers PCI transmission, image data pre-processing, integer pixel mapping and fuzzy membership tuning. This system can be used in pattern classification systems that require rapid inference in real time.
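
One way to read the integer pixel mapping is as a lookup table built by rasterizing the edges of a membership function. The sketch below does this for a triangular linguistic term in plain Python. It is an assumption-laden illustration: the triangular shape, the function names, the round-half-up rule and the `height` scale are not taken from the paper.

```python
def rasterize_edge(x0, y0, x1, y1):
    """Map each integer x between x0 and x1 to the nearest integer y on the line."""
    if x0 == x1:
        return {x0: max(y0, y1)}
    table = {}
    step = 1 if x1 > x0 else -1
    for x in range(x0, x1 + step, step):
        t = (x - x0) / (x1 - x0)
        # Round half up; valid here because all y values are non-negative.
        table[x] = int(y0 + t * (y1 - y0) + 0.5)
    return table


def triangular_term(left, peak, right, height=255):
    """Integer lookup table for a triangular membership function (assumed shape)."""
    table = rasterize_edge(left, 0, peak, height)
    table.update(rasterize_edge(peak, height, right, 0))
    return table


def membership(table, x):
    """Integer membership degree of x; zero outside the term's support."""
    return table.get(x, 0)


# Hypothetical term spanning inputs 0..8 with its peak at 4.
term = triangular_term(0, 4, 8, height=8)
```

A table of this kind replaces floating-point evaluation with an integer lookup, which is the sort of operation that maps naturally onto parallel hardware units.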

THE MODIFIED UNSUPERVISED SPECTRAL ANGLE CLASSIFICATION (MUSAC) OF HYPERION, HYPERION-FLAASH AND ETM+ DATA USING UNIT VECTOR

  • Kim, Dae-Sung;Kim, Yong-Il
    • Proceedings of the KSRS Conference / 2005.10a / pp.134-137 / 2005
  • Unsupervised spectral angle classification (USAC) is an algorithm that extracts ground object information by using the minimum 'Spectral Angle' operation instead of the 'Spectral Euclidean Distance' in the clustering process. In this study, our algorithm uses the unit vector instead of the spectral distance to compute the cluster mean in unsupervised classification. The proposed algorithm (MUSAC) is applied to Hyperion and ETM+ data, and the results are compared with those of K-Means and the former USAC algorithm (FUSAC). USAC clearly classifies water and dark forest areas and produces more accurate results than K-Means. Atmospheric correction was applied to the Hyperion data (Hyperion-FLAASH) in pursuit of more accurate results, but it had no effect on the accuracy. We therefore anticipate that the 'Spectral Angle' can be one of the most accurate classifiers for both multispectral and hyperspectral images. Furthermore, the cluster unit vector can be an efficient technique for determining each cluster mean in USAC.
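
The two ingredients named in this abstract, the spectral angle and a cluster mean computed from unit vectors, can be sketched in plain Python as follows. The abstract does not give the exact update rule, so averaging the members' unit vectors and renormalizing is an assumption, as are all function names.

```python
import math


def unit(vector):
    """Scale a spectrum to unit length so only its direction remains."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("zero spectrum has no direction")
    return [x / norm for x in vector]


def spectral_angle(a, b):
    """Angle in radians between two spectra; insensitive to overall brightness."""
    cosine = sum(x * y for x, y in zip(unit(a), unit(b)))
    return math.acos(max(-1.0, min(1.0, cosine)))  # clamp rounding error


def unit_vector_mean(spectra):
    """Cluster centre as the renormalized sum of member unit vectors (assumed rule)."""
    total = [0.0] * len(spectra[0])
    for spectrum in spectra:
        for i, x in enumerate(unit(spectrum)):
            total[i] += x
    return unit(total)


def assign(pixel, centres):
    """Index of the cluster centre with the smallest spectral angle to the pixel."""
    return min(range(len(centres)), key=lambda k: spectral_angle(pixel, centres[k]))


# Hypothetical two-band example: a bright and a dark pixel with the same shape.
same_shape = spectral_angle([1, 2], [10, 20])
centre = unit_vector_mean([[2, 0], [0, 5]])
```

Because every spectrum is normalized first, a bright and a dark pixel of the same material contribute equally to the centre, whereas a Euclidean mean would be pulled toward the brighter one.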

Binary Classification Method using Invariant CSP for Hand Movements Analysis in EEG-based BCI System

  • Nguyen, Thanh Ha;Park, Seung-Min;Ko, Kwang-Eun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems / v.23 no.2 / pp.178-183 / 2013
  • In this study, we propose a method for electroencephalogram (EEG) classification that uses invariant CSP at special channels to improve classification accuracy. Given the naive EEG signals from a left-hand and right-hand movement experiment, the noise in the contaminated data set should be eliminated, and the proposed method can handle de-noising of the data set. The data sets considered were collected from the special channels for right-hand and left-hand movements around the motor cortex area. The proposed method fits an adjustable parameter to reduce the effect of invariant parts in the raw signals, which can increase classification accuracy. We ran the simulation hundreds of times for each parameter and averaged the values to obtain the final result for comparison. The experimental results show that accuracy improves over the original method, with the highest result reaching 89.74%.

A Study of Landscape Construction Work Classification for System Instruction of New Estimation System based on Historical Construction data. - With regard to Housing Landscape Construction - (실적공사비 적산방식 도입을 위한 조경공사 공종분류체계에 관한 연구 -주택단지 조경공사를 중심으로-)

  • 박원규;김두하;안동만
    • Journal of the Korean Institute of Landscape Architecture / v.25 no.1 / pp.82-99 / 1997
  • The purpose of this study is to establish a work classification system for landscape construction in order to provide the basis for a new estimation system for public landscape construction. The new estimation system is based on historical construction data. Applying it requires a standard work classification system, because extensive cost data must be accumulated under a unified construction work classification system. In the study of the new estimation system carried out by KICT (Korea Institute of Construction Technology), landscaping works belong to the earth work of civil engineering. This classification appears unreasonable, because landscape architecture has its own specialties and professional domain. In this study, information classification systems in the construction industry and the various landscaping works of housing developments are analysed. As a result, a standard work classification system for housing landscape construction is proposed in section VI-3. This standard work classification structure consists of three levels of division (i.e. large work division, middle work division, small work division). In this study, housing landscape construction works are divided into four large works and twenty-six middle works. Depending on work attributes, the middle and small work divisions can be subdivided further.

Construction of Customer Appeal Classification Model Based on Speech Recognition

  • Sheng Cao;Yaling Zhang;Shengping Yan;Xiaoxuan Qi;Yuling Li
    • Journal of Information Processing Systems / v.19 no.2 / pp.258-266 / 2023
  • To address the problems of poor customer satisfaction and poor customer classification accuracy, this paper proposes a customer classification model based on speech recognition. First, this paper analyzes the temporal data characteristics of customer demand data, identifies the factors influencing customer demand behavior, and determines the process of feature extraction from customer voice signals. Then, the emotional association rules of customer demands are designed, and the classification model of customer demands is constructed through cluster analysis. Next, the Euclidean distance method is used to preprocess customer behavior data, and the fuzzy clustering characteristics of customer demands are obtained by the fuzzy clustering method. Finally, on the basis of the naive Bayesian algorithm, a customer demand classification model based on speech recognition is completed. Experimental results show that the proposed method improves the accuracy of customer demand classification to more than 80% and improves customer satisfaction to more than 90%. It addresses the poor customer satisfaction and low classification accuracy of existing classification methods and has practical application value.

A Model-based Collaborative Filtering Through Regularized Discriminant Analysis Using Market Basket Data

  • Lee, Jong-Seok;Jun, Chi-Hyuck;Lee, Jae-Wook;Kim, Soo-Young
    • Management Science and Financial Engineering / v.12 no.2 / pp.71-85 / 2006
  • Collaborative filtering, among other recommender systems, has been known as the most successful recommendation technique. However, it requires the user-item rating data, which may not be easily available. As an alternative, some collaborative filtering algorithms have been developed recently by utilizing the market basket data in the form of the binary user-item matrix. Viewing the recommendation scheme as a two-class classification problem, we proposed a new collaborative filtering scheme using a regularized discriminant analysis applied to the binary user-item data. The proposed discriminant model was built in terms of the major principal components and was used for predicting the probability of purchasing a particular item by an active user. The proposed scheme was illustrated with two modified real data sets and its performance was compared with the existing user-based approach in terms of the recommendation precision.