• Title/Summary/Keyword: Multi-class Classification

Development of System Model for Integrated Information Management of Construction Material (건설자재 통합정보 관리를 위한 시스템 모델 구현)

  • Han, Choong-Han;Ju, Ki-Bum
    • The KIPS Transactions:PartD / v.16D no.3 / pp.433-440 / 2009
  • With recent advances in information technology in the construction sector, web-based online systems that provide information on diverse construction materials are rapidly increasing, with the aim of enhancing productivity and reducing costs. However, because the materials information these systems provide (quality, specifications, and so on) is not standardized, staff on construction sites face considerable difficulty when looking up specific materials, for example having to use several information systems or repeat similar tasks. This research therefore typified the information items of construction materials on the basis of GDAS and designed a multi-system model for integrated management of construction materials information. The system can manage and utilize materials information efficiently by supporting automatic classification of construction materials using OmniClass Part-22 and UNSPSC, conditional complex retrieval of materials information, real-time automatic generation of electronic catalogs, and RFID retrieval and control.

An Analytical Study on Automatic Classification of Domestic Journal articles Using Random Forest (랜덤포레스트를 이용한 국내 학술지 논문의 자동분류에 관한 연구)

  • Kim, Pan Jun
    • Journal of the Korean Society for Information Management / v.36 no.2 / pp.57-77 / 2019
  • Random Forest (RF), a representative ensemble technique, was applied to the automatic classification of journal articles in the field of library and information science. In particular, I ran various experiments on the main factors affecting classification performance, such as the number of trees, feature selection, and training set size, for the task of automatically assigning class labels to domestic journal articles. Through these, I explored ways to optimize RF performance for imbalanced datasets in real environments. The results suggest that, for the automatic classification of domestic journal articles, RF can be expected to perform best with a tree count in the 100~1000 interval (C), a small feature set (10%) selected by the chi-square statistic (CHI), and the largest training sets (9-10 years).
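The chi-square (CHI) feature selection mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names, the 2x2 term/class contingency layout, and the toy counts are assumptions made for illustration, and only the 10% fraction echoes the abstract.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that lack the term
    n00: documents outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    numerator = n * (n11 * n00 - n10 * n01) ** 2
    denominator = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    # An empty row or column makes the statistic undefined; treat it as 0.
    return numerator / denominator if denominator else 0.0


def select_top_fraction(scores, fraction=0.10):
    """Keep the highest-scoring fraction of features (at least one)."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Hypothetical counts for three terms against one class.
term_counts = {
    "retrieval": (30, 5, 10, 55),
    "library": (20, 20, 20, 40),
    "the": (40, 60, 0, 0),
}
term_scores = {term: chi_square(*c) for term, c in term_counts.items()}
print(select_top_fraction(term_scores, fraction=0.10))
```

The selected terms would then serve as the input features of the classifier; the abstract reports the 10% setting only for its own journal-article corpus.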

Weighted Least Squares Based on Feature Transformation using Distance Computation for Binary Classification (이진 분류를 위하여 거리계산을 이용한 특징 변환 기반의 가중된 최소 자승법)

  • Jang, Se-In;Park, Choong-Shik
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.2 / pp.219-224 / 2020
  • Binary classification has been broadly investigated in machine learning, and it can be easily extended to multi-class problems. To apply machine learning methods successfully to classification tasks, preprocessing and feature extraction steps are essential, since they improve classification performance. In this paper, we propose a new learning method based on weighted least squares. In weighted least squares, the design of the weights plays a significant role. We therefore also propose a new technique for obtaining weights that achieve feature transformation. Based on this weighting technique, we further propose a method that combines learning and feature extraction so that both are performed simultaneously in one step. The proposed method shows promising performance on five UCI machine learning data sets.
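As background for the abstract above, the following Python sketch shows plain weighted least squares for a one-dimensional binary classifier with labels -1 and +1. It is only an illustration of the general technique: the abstract does not specify the paper's distance-based weighting, so the weights here are arbitrary placeholders, and the function names are hypothetical.

```python
def weighted_least_squares(xs, ys, weights):
    """Fit y ~ a + b*x by minimizing sum(w_i * (y_i - a - b*x_i)**2)."""
    sw = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / sw
    y_bar = sum(w * y for w, y in zip(weights, ys)) / sw
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(weights, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    if sxx == 0:
        raise ValueError("all weighted x values are identical")
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def predict_label(a, b, x):
    """Binary decision: the sign of the fitted linear score."""
    return 1 if a + b * x >= 0 else -1


# Toy data: class -1 on the left, class +1 on the right, uniform weights.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [-1, -1, 1, 1]
a, b = weighted_least_squares(xs, ys, weights=[1.0, 1.0, 1.0, 1.0])
print(predict_label(a, b, 0.5), predict_label(a, b, 2.5))
```

Raising the weight of a sample pulls the fitted boundary toward classifying that sample correctly, which is why the design of the weights matters.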

Light-weight Classification Model for Android Malware through the Dimensional Reduction of API Call Sequence using PCA

  • Jeon, Dong-Ha;Lee, Soo-Jin
    • Journal of the Korea Society of Computer and Information / v.27 no.11 / pp.123-130 / 2022
  • Recently, studies on the detection and classification of Android malware based on API call sequences have been actively carried out. However, API call sequence based malware classification has serious limitations, such as excessive time and resource consumption in malware analysis and learning model construction, because of the vast amount of data and the high dimensionality of the features. In this study, we analyzed various classification models, such as LightGBM, Random Forest, and k-Nearest Neighbors, after significantly reducing the feature dimension using PCA (Principal Component Analysis) on the CICAndMal2020 dataset, which contains a vast amount of API call information. The experimental results show that PCA significantly reduces the feature dimension while maintaining the characteristics of the original data, and achieves efficient malware classification performance. Both binary and multi-class classification achieve higher accuracy than previous studies, even when the features are reduced to less than 1% of the original size.

Classification of Pollution Patterns in High School Classrooms using Disjoint Principal Component Analysis (분산주성분 분석을 이용한 고등학교교실 내 오염패턴분류에 관한 연구)

  • Jang, Choul-Soon;Lee, Tae-Jung;Kim, Dong-Sool
    • Journal of Korean Society for Atmospheric Environment / v.22 no.6 / pp.808-820 / 2006
  • With regard to indoor air quality, the government introduced various policies on managing and monitoring indoor air quality as a major task, and also enacted the 'Indoor Air Quality Management Act', which was introduced in May 2004. However, schools were not yet included among the multi-use facilities regulated by the Act. The goal of this study was to investigate PM 10 pollution patterns in high school classrooms using a pattern recognition method based on cluster analysis and disjoint principal component analysis, and further to survey the levels of inorganic elements in May, June, and September 2004. A hierarchical clustering method was used to identify candidate objects in pseudo-homogeneous sample classes by transforming the raw data and applying various distance measures. Disjoint principal component analysis was then used to define homogeneous sample classes after deleting outliers. Three homogeneous patterns were obtained. Objects in the first class were considered to have been sampled under semi-open conditions; this class had high concentrations of Ca, Fe, Mg, K, Al, and Na, which are associated with soil and chalk compounds. Objects in the second class were sampled while air conditioners were running, and this class showed low concentrations of PM 10 and elements. Objects in the third class were sampled on rainy days; this class was influenced by chalk, soil elements, and various anthropogenic sources, including combustion and industrial sources. This methodology is considered helpful for classifying indoor air quality patterns and indoor environmental categories when managing indoor air quality.

An Anomaly Detection Framework Based on ICA and Bayesian Classification for IaaS Platforms

  • Wang, GuiPing;Yang, JianXi;Li, Ren
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.8 / pp.3865-3883 / 2016
  • Infrastructure as a Service (IaaS) encapsulates computer hardware into a large number of virtual, manageable instances, mainly in the form of virtual machines (VMs), and provides them to users as a rental service. Currently, VM anomaly incidents occasionally occur, which lead to performance issues and even downtime. This paper aims at detecting anomalous VMs based on VM performance metrics data. Due to the dynamic nature and increasing scale of IaaS, detecting anomalous VMs from voluminous, correlated, and non-Gaussian monitored performance data is a challenging task. This paper designs an anomaly detection framework to address this challenge. First, it collects 53 performance metrics to reflect the running state of each VM. The collected performance metrics are verified not to follow the Gaussian distribution. Then, it employs independent component analysis (ICA) instead of principal component analysis (PCA) to extract independent components from the collected non-Gaussian performance metric data. For anomaly detection, it employs multi-class Bayesian classification to determine the current state of each VM. To evaluate the performance of the designed detection framework, four types of anomalies are separately or jointly injected into randomly selected VMs in a campus-wide testbed. The experimental results show that the ICA-based detection mechanism outperforms PCA-based and LDA-based detection mechanisms in terms of sensitivity and specificity.
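The two evaluation measures named at the end of the abstract above, sensitivity and specificity, can be computed as in the short Python sketch below. The label encoding (1 for an anomalous VM, 0 for a normal one) and the toy predictions are assumptions for illustration and are not data from the paper.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Sensitivity = TP / (TP + FN): share of true anomalies that are detected.
    Specificity = TN / (TN + FP): share of normal cases that are left alone.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical ground truth and detector output for eight VMs.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 1, 0]
print(sensitivity_specificity(truth, predicted))
```

For a multi-class detector, the same function can be applied once per anomaly type by treating that type as the positive label and all other states as negative.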

Hippocampus Segmentation and Classification in Alzheimer's Disease and Mild Cognitive Impairment Applied on MR Images

  • Madusanka, Nuwan;Choi, Yu Yong;Choi, Kyu Yeong;Lee, Kun Ho;Choi, Heung-Kook
    • Journal of Korea Multimedia Society / v.20 no.2 / pp.205-215 / 2017
  • Brain magnetic resonance imaging (MRI) is an important imaging biomarker in Alzheimer's disease (AD), as cerebral atrophy has been shown to be strongly associated with cognitive symptoms. The decrease in volume estimates of different memory-related structures of the medial temporal lobe correlates with the decline of cognitive functions in neurodegenerative diseases. During the past decades, several methods have been developed for quantifying disease-related atrophy of the hippocampus from MRI. Special effort has been dedicated to separating AD and mild cognitive impairment (MCI) related modifications from normal aging for the purpose of early detection and prediction. We trained a multi-class support vector machine (SVM) with probabilistic outputs on a sample (n = 58) of 20 normal controls (NC), 19 individuals with MCI, and 19 individuals with AD. The model was then evaluated by cross-validation on the same data set, predicting samples whose labels were withheld. This study presents data on the association between quantitative MRI parameters of the hippocampus and its structural changes, and on their use in classifying the diseases.

Malware classification using statistical techniques (통계적 기법을 이용한 악성 소프트웨어 분류)

  • Won, Sungmin;Kim, Hyunjoo;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.30 no.6 / pp.851-865 / 2017
  • Ransomware such as WannaCry is a global issue, and methods to defend against malware attacks are important. To minimize the damage from malware, we have to be able to classify malware types efficiently. This study builds models that classify malware using various statistical techniques. Several classification techniques, such as logistic regression, random forest, gradient boosting, and support vector machine, are used to construct the models. This study also helps us understand which variables are key to classifying the type of malicious software.

Diagnosis of Alzheimer's Disease using Wrapper Feature Selection Method

  • Vyshnavi Ramineni;Goo-Rak Kwon
    • Smart Media Journal / v.12 no.3 / pp.30-37 / 2023
  • Alzheimer's disease (AD) is currently managed through early diagnosis, which can only slow the symptoms, and research is still ongoing. Accordingly, several machine learning classification models that use T1-weighted images have been proposed to identify AD. In this paper, we consider an improved feature selection approach that reduces complexity by using wrapper techniques and a Restricted Boltzmann Machine (RBM). This work used the subcortical and cortical features of 278 subjects from the ADNI dataset to identify AD using sMRI. Multi-class classification is used for the experiment, i.e., AD, EMCI, LMCI, and HC. The proposed feature selection consists of forward feature selection, backward feature selection, and combined PCA & RBM. Forward and backward feature selection are iterative methods: forward selection starts with no features, while backward selection starts with all features included. PCA is used to reduce the dimensions, and RBM is used to select the best features without interpreting them. We compared the three models with PCA in our analysis. The experiment shows that combined PCA & RBM and backward feature selection give the best accuracy with the RF classification model, i.e., 88.65% and 88.56%, respectively.
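The forward wrapper selection described in the abstract above (start with no features and add one per iteration) can be sketched in Python as follows. This is a generic greedy loop, not the authors' code: the stopping rule, the function names, and the toy scoring function are assumptions. In practice the score would typically be the cross-validated accuracy of a classifier such as RF trained on the candidate subset.

```python
def forward_selection(features, score_fn, max_features=None):
    """Greedy forward wrapper selection.

    Starts from an empty subset and, at each step, adds the single feature
    that gives the highest score; stops when no addition improves the score.
    """
    selected = []
    remaining = list(features)
    best_score = float("-inf")
    while remaining and (max_features is None or len(selected) < max_features):
        scored = [(score_fn(selected + [f]), f) for f in remaining]
        score, feature = max(scored, key=lambda pair: pair[0])
        if score <= best_score:
            break  # no candidate improves on the current subset
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


# Hypothetical stand-in for a cross-validated accuracy: each feature adds a
# fixed gain, and every selected feature costs a small complexity penalty.
GAIN = {"a": 0.5, "b": 0.3, "c": 0.05, "d": 0.0}


def toy_score(subset):
    return sum(GAIN[f] for f in subset) - 0.1 * len(subset)


print(forward_selection(["a", "b", "c", "d"], toy_score))
```

Backward selection mirrors this loop: it starts from the full feature set and removes, at each step, the feature whose removal gives the best score.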

EEG Feature Classification for Precise Motion Control of Artificial Hand (의수의 정확한 움직임 제어를 위한 동작 별 뇌파 특징 분류)

  • Kim, Dong-Eun;Yu, Je-Hun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.1 / pp.29-34 / 2015
  • Brain-computer interfaces (BCI) are being studied to make life more convenient in various application fields. The purpose of this study is to investigate changes in electroencephalography (EEG) signals for precise motion of a robot or an artificial arm. Three subjects participated in the experiment and performed three tasks: Grip, Move, and Relax. Feature data were extracted from the acquired EEG data using two feature extraction algorithms (power spectrum analysis and multi-common spatial pattern). A support vector machine (SVM) was applied to the extracted feature data for classification. Classification accuracy was highest for the Grip class in two subjects. The results of this research are expected to be useful for patients who require a prosthetic limb controlled using EEG.