• Title/Summary/Keyword: Multi-category classification

A Fingerprint Classification Method Based on the Combination of Gray Level Co-Occurrence Matrix and Wavelet Features (명암도 동시발생 행렬과 웨이블릿 특징 조합에 기반한 지문 분류 방법)

  • Kang, Seung-Ho
    • Journal of Korea Multimedia Society
    • /
    • v.16 no.7
    • /
    • pp.870-878
    • /
    • 2013
  • In this paper, we propose a novel fingerprint classification method to enhance the accuracy and efficiency of fingerprint identification systems, one type of biometric system. According to previous research, fingerprints can be categorized into several classes based on their patterns of ridges and valleys. Once a fingerprint database has been organized by these patterns, classification can accelerate fingerprint recognition, because it reduces the search space to the fingerprints of the same category before matching. First, we suggest a method to extract the region of interest (ROI), the part of the image that carries the actual fingerprint information. We then propose a feature extraction method that combines gray level co-occurrence matrix (GLCM) and wavelet features. Finally, we compare the performance of the proposed method with an existing method that uses only GLCM features, using the multi-layer perceptron and the support vector machine as classifiers.
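As a rough illustration of what combining GLCM and wavelet features can look like, the following sketch computes a few texture statistics from a co-occurrence matrix and appends the sub-band energies of a one-level Haar transform. The abstract does not say which GLCM statistics, which wavelet, or which offsets the authors used, so the choices here (contrast, energy, homogeneity, a Haar wavelet, a horizontal offset) and all function names are assumptions made for illustration only.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised gray level co-occurrence matrix for one pixel offset.

    `image` is a list of rows of integer gray levels in [0, levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Contrast, energy and homogeneity of a normalised GLCM (assumed choice)."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    contrast = sum(p[i][j] * (i - j) ** 2 for i, j in cells)
    energy = sum(p[i][j] ** 2 for i, j in cells)
    homogeneity = sum(p[i][j] / (1 + abs(i - j)) for i, j in cells)
    return [contrast, energy, homogeneity]


def haar_subband_energies(image):
    """Mean energy of the LL, LH, HL, HH sub-bands of a one-level 2-D Haar transform."""
    sums = [0.0, 0.0, 0.0, 0.0]
    blocks = 0
    for r in range(0, len(image) - 1, 2):
        for c in range(0, len(image[0]) - 1, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            coeffs = (
                (a + b + d + e) / 2,  # LL: local average
                (a - b + d - e) / 2,  # LH: horizontal detail
                (a + b - d - e) / 2,  # HL: vertical detail
                (a - b - d + e) / 2,  # HH: diagonal detail
            )
            for k, v in enumerate(coeffs):
                sums[k] += v * v
            blocks += 1
    return [s / blocks for s in sums]


def combined_features(image, levels):
    """Concatenate GLCM statistics and wavelet energies into one feature vector."""
    return glcm_features(glcm(image, levels)) + haar_subband_energies(image)
```

The resulting vector could then be passed to any classifier, such as the multi-layer perceptron or support vector machine mentioned in the abstract.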

Assessing Spatial Uncertainty Distributions in Classification of Remote Sensing Imagery using Spatial Statistics (공간 통계를 이용한 원격탐사 화상 분류의 공간적 불확실성 분포 추정)

  • Park No-Wook;Chi Kwang-Hoon;Kwon Byung-Doo
    • Korean Journal of Remote Sensing
    • /
    • v.20 no.6
    • /
    • pp.383-396
    • /
    • 2004
  • The application of spatial statistics to obtain the spatial uncertainty distributions in classification of remote sensing images is investigated in this paper. Two quantitative methods are presented for describing two kinds of uncertainty; one related to class assignment and the other related to the connection of reference samples. Three quantitative indices are addressed for the first category of uncertainty. Geostatistical simulation is applied both to integrate the exhaustive classification results with the sparse reference samples and to obtain the spatial uncertainty or accuracy distributions connected to those reference samples. To illustrate the proposed methods and to discuss the operational issues, the experiment was done on a multi-sensor remote sensing data set for supervised land-cover classification. As an experimental result, the two quantitative methods presented in this paper could provide additional information for interpreting and evaluating the classification results and more experiments should be carried out for verifying the presented methods.

A Propose of New Classification Indication about Work of Art through Numeric and Multivariate Data Analysis - Focused on the Specialist - (예술작품의 수치화와 다변량분석에 의한 새로운 분류 제안 - 전문가를 중심으로 -)

  • Suh, Myung-Ae;Ree, Sang-Bok
    • Journal of Korean Society for Quality Management
    • /
    • v.35 no.4
    • /
    • pp.67-77
    • /
    • 2007
  • In this paper, we attempt a new way of interpreting works of art. Until now, a work of art has been interpreted by respecting and reading the intention of the artist who made it. Critics have divided works by period and region and interpreted them according to the philosophical thought of the time. Interpretation has thus rested on the individual subjectivity of the artist who made the work and of the appreciator. In this paper, by contrast, we bring together various criteria for appreciating a work of art and try to present the intimacy (closeness) among works in a new way. In other words, we attempt to objectify individual subjectivity through statistical techniques. The introduction looks into culture and art, and the main body discusses the interpretation of works of art. We set six category areas and explain each criterion and its assessment method. We propose a new interpretation in terms of the intimacy obtained from multivariate data analysis of the assessment results.

Korea Emissions Inventory Processing Using the US EPA's SMOKE System

  • Kim, Soon-Tae;Moon, Nan-Kyoung;Byun, Dae-Won W.
    • Asian Journal of Atmospheric Environment
    • /
    • v.2 no.1
    • /
    • pp.34-46
    • /
    • 2008
  • Emissions inputs for use in air quality modeling of Korea were generated with the emissions inventory data from the National Institute of Environmental Research (NIER), maintained under the Clean Air Policy Support System (CAPSS) database. Source Classification Codes (SCC) in the Korea emissions inventory were adapted for use with the U.S. EPA's Sparse Matrix Operator Kernel Emissions (SMOKE) system by finding the best-matching SMOKE default SCCs for the chemical speciation and temporal allocation. A set of 19 surrogate spatial allocation factors for South Korea was developed utilizing the Multi-scale Integrated Modeling System (MIMS) Spatial Allocator and Korean GIS databases. The mobile and area source emissions data, after temporal allocation, show typical sinusoidal diurnal variations with high peaks during daytime, while point source emissions show weak diurnal variations. The model-ready emissions are speciated for the carbon bond version 4 (CB-4) chemical mechanism. Volatile organic carbon (VOC) emissions from painting-related industries in the area source category contribute significantly to TOL (Toluene) and XYL (Xylene) emissions. ETH (Ethylene) emissions come largely from point industrial incineration facilities and various mobile sources. On the other hand, a large portion of OLE (Olefin) emissions is speciated from mobile sources, in addition to those contributed by the polypropylene industry among point sources. It was found that FORM (Formaldehyde) is mostly emitted from the petroleum industry and heavy-duty diesel vehicles. Chemical speciation of PM2.5 emissions shows that PEC (primary fine elemental carbon) and POA (primary fine organic aerosol) are the most abundant species from diesel and gasoline vehicles. To reduce uncertainties in processing the Korea emissions inventory due to the mapping of Korean SCCs to those of the U.S., it would be practical to develop and use domestic source profiles for the top 10 SCCs for area and point sources and the top 5 SCCs for on-road mobile sources when VOC emissions from those sources are more than 90% of the total.
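The closing recommendation depends on how much of the total VOC emissions the top-ranked SCCs cover. A minimal sketch of that check follows; the function name and the dictionary layout (SCC code mapped to annual VOC emissions) are hypothetical and are not taken from SMOKE or CAPSS.

```python
def top_n_share(emissions_by_scc, n):
    """Return the n largest SCCs by emissions and their share of the total."""
    ranked = sorted(emissions_by_scc.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(emissions_by_scc.values())
    top = ranked[:n]
    share = sum(value for _, value in top) / total
    return [scc for scc, _ in top], share
```

With such a helper, domestic source profiles would be worth developing for the returned SCCs when the share exceeds 0.9, which is the threshold the abstract states.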

An Experimental Study on Automatic Summarization of Multiple News Articles (복수의 신문기사 자동요약에 관한 실험적 연구)

  • Kim, Yong-Kwang;Chung, Young-Mee
    • Journal of the Korean Society for information Management
    • /
    • v.23 no.1 s.59
    • /
    • pp.83-98
    • /
    • 2006
  • This study proposes a template-based method for automatically summarizing multiple news articles using the semantic categories of sentences. First, the semantic categories for core information to be included in a summary are identified from a training set of documents and their summaries. Then, cue words for each slot of the template are selected for later classification of news sentences into relevant slots. When a news article is input, its event/accident category is identified, and key sentences are extracted from the article and filled into the relevant slots. The template, filled with simple sentences rather than the original long sentences, is used to generate a summary for an event/accident. In the user evaluation of the generated summaries, the results showed a 54.1% recall ratio and a 58.1% precision ratio in essential information extraction, and an 11.6% redundancy ratio.
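The slot-filling step can be pictured as assigning each sentence to the template slot whose cue words it matches best. The sketch below assumes whitespace tokenisation and a simple overlap count; the slot names and cue words in the usage example are invented for illustration, and the study's actual selection and scoring procedure is not described in the abstract.

```python
def fill_template(sentences, cue_words):
    """Assign each sentence to the slot with the most matching cue words.

    `cue_words` maps a slot name to a set of lowercase cue words.
    Sentences that match no cue word are left out of the template.
    """
    template = {slot: [] for slot in cue_words}
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        best_slot, best_hits = None, 0
        for slot, cues in cue_words.items():
            hits = len(tokens & cues)
            if hits > best_hits:
                best_slot, best_hits = slot, hits
        if best_slot is not None:
            template[best_slot].append(sentence)
    return template
```

For example, with `cue_words = {"cause": {"caused", "because"}, "damage": {"injured", "destroyed"}}`, the sentence "Three people were injured" would land in the `damage` slot.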

Application of Text-Classification Based Machine Learning in Predicting Psychiatric Diagnosis (텍스트 분류 기반 기계학습의 정신과 진단 예측 적용)

  • Pak, Doohyun;Hwang, Mingyu;Lee, Minji;Woo, Sung-Il;Hahn, Sang-Woo;Lee, Yeon Jung;Hwang, Jaeuk
    • Korean Journal of Biological Psychiatry
    • /
    • v.27 no.1
    • /
    • pp.18-26
    • /
    • 2020
  • Objectives: The aim was to find effective vectorization and classification models for predicting a psychiatric diagnosis from text-based medical records. Methods: Electronic medical records (n = 494) of present illness were collected retrospectively from inpatient admission notes with one of three diagnoses: major depressive disorder, type 1 bipolar disorder, or schizophrenia. The data were split into 400 training records and 94 independent validation records. The data were vectorized by two different models, term frequency-inverse document frequency (TF-IDF) and Doc2vec. Machine learning models for classification, including stochastic gradient descent, logistic regression, support vector classification, and deep learning (DL), were applied to predict the three psychiatric diagnoses. Five-fold cross-validation was used to find an effective model. Metrics such as accuracy, precision, recall, and F1-score were measured to compare the models. Results: Five-fold cross-validation on the training data showed that the DL model with Doc2vec was the most effective model for predicting the diagnosis (accuracy = 0.87, F1-score = 0.87). However, these metrics dropped on the independent test data set with the final working DL models (accuracy = 0.79, F1-score = 0.79), while logistic regression and the support vector machine with Doc2vec showed slightly better performance (accuracy = 0.80, F1-score = 0.80) than the DL models with Doc2vec and the other models with TF-IDF. Conclusions: The current results suggest that the vectorization may have more impact on classification performance than the machine learning model. However, the data set had a number of limitations, including small sample size, imbalance among the categories, and limited generalizability. In this regard, multi-site research with large samples is needed to improve the machine learning models.
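To make the TF-IDF vectorization step concrete, here is a small sketch of the textbook formulation (term frequency times the logarithm of inverse document frequency). It assumes whitespace tokenisation and non-empty documents, and it is not the study's pipeline; library implementations typically add smoothing and normalisation that are omitted here.

```python
from collections import Counter
from math import log


def tfidf(docs):
    """Return (vocabulary, vectors) using plain tf * log(N / df) weights."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vocab = sorted(df)
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        vectors.append(
            [tf[term] / len(tokens) * log(n_docs / df[term]) for term in vocab]
        )
    return vocab, vectors
```

A term that occurs in every record gets weight zero under this formulation, which is why words shared across all admission notes would contribute nothing to distinguishing the diagnoses.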

Study on New Classification Indication about Work of Art through Multi-variate Data Analysis;On Focused Specialist (다변량분석에 의한 예술작품 분류 시도 연구;전문가를 중심으로)

  • Suh, Myung-Ae;Ree, Sang-Bok
    • Proceedings of the Korean Society for Quality Management Conference
    • /
    • 2006.11a
    • /
    • pp.251-259
    • /
    • 2006
  • An evaluation of a work of art that differs from the artist's intention cannot escape the limitation of being an estimate based on the values of the appreciator. In this paper, we attempt a new way of interpreting works of art. Until now, a work of art has been interpreted by respecting and reading the intention of the artist who made it. Critics have divided works by period and region and interpreted them according to the philosophical thought of the time. Interpretation has thus rested on the individual subjectivity of the artist who made the work and of the appreciator. In this paper, by contrast, we bring together various criteria for appreciating a work of art and try to present the intimacy (closeness) among works in a new way. In other words, we attempt to objectify individual subjectivity through statistical techniques. The introduction looks into culture and art, and the main body discusses the interpretation of works of art. We set six category areas and explain each criterion and its assessment method. We propose a new interpretation in terms of the intimacy obtained from multivariate data analysis of the assessment results. Going beyond simply viewing and knowing a work of art, this research may serve as a meaningful starting point.

Intrusion Detection: Supervised Machine Learning

  • Fares, Ahmed H.;Sharawy, Mohamed I.;Zayed, Hala H.
    • Journal of Computing Science and Engineering
    • /
    • v.5 no.4
    • /
    • pp.305-313
    • /
    • 2011
  • Due to the expansion of high-speed Internet access, the need for secure and reliable networks has become more critical. The sophistication of network attacks, as well as their severity, has also increased recently. As such, more and more organizations are becoming vulnerable to attack. The aim of this research is to classify network attacks using neural networks (NN), which leads to a higher detection rate and a lower false alarm rate in a shorter time. This paper focuses on two classification types: a single class (normal or attack), and a multi class (normal, DoS, PRB, R2L, U2R), where the category of attack is also detected by the NN. Extensive analysis is conducted in order to assess the translation of symbolic data, the partitioning of the training data, and the complexity of the architecture. This paper investigates two engines: the first is the back-propagation neural network intrusion detection system (BPNNIDS) and the second is the radial basis function neural network intrusion detection system (RBFNNIDS). The two engines proposed in this paper are tested against traditional and other machine learning algorithms using a common dataset: the DARPA 98 KDD99 benchmark dataset from International Knowledge Discovery and Data Mining Tools. BPNNIDS shows a superior response compared to the other techniques reported in the literature, especially in terms of response time, detection rate, and false positive rate.
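The difference between the two classification types comes down to how the target label is encoded. The sketch below shows one way to derive both targets from a connection's attack name; the mapping covers only a handful of KDD99 attack names, assumes labels have already been stripped of any trailing punctuation, and the one-hot encoding is an assumption rather than the paper's stated scheme.

```python
# Partial, illustrative mapping from KDD99 attack names to the five categories.
ATTACK_CATEGORY = {
    "normal": "normal",
    "smurf": "DoS", "neptune": "DoS",
    "portsweep": "PRB", "ipsweep": "PRB",
    "guess_passwd": "R2L", "ftp_write": "R2L",
    "buffer_overflow": "U2R", "rootkit": "U2R",
}
MULTI_CLASSES = ["normal", "DoS", "PRB", "R2L", "U2R"]


def binary_target(label):
    """Single-class setting: 0 for normal traffic, 1 for any attack."""
    return 0 if label == "normal" else 1


def multiclass_target(label):
    """Multi-class setting: one-hot vector over the five categories."""
    vector = [0] * len(MULTI_CLASSES)
    vector[MULTI_CLASSES.index(ATTACK_CATEGORY[label])] = 1
    return vector
```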

Prediction of Software Fault Severity using Deep Learning Methods (딥러닝을 이용한 소프트웨어 결함 심각도 예측)

  • Hong, Euyseok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.22 no.6
    • /
    • pp.113-119
    • /
    • 2022
  • In software fault prediction, a multi-class classification model that predicts the fault severity category of a module can be much more useful than a binary classification model that simply predicts the presence or absence of faults. A small number of severity-based fault prediction models have been proposed, but no classifier using deep learning techniques has been proposed. In this paper, we construct MLP models with 3 or 5 hidden layers, with either a fixed or a variable number of nodes per hidden layer. In the model evaluation experiment, the MLP-based deep learning models show significantly better performance in both accuracy and AUC than the MLPs that had shown the best performance among models not using deep learning. In particular, the model structure with 3 hidden layers, a batch size of 32, and 64 nodes shows the best performance.
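For orientation, the best-performing structure reported above (3 hidden layers of 64 nodes) can be sketched as a plain forward pass. The input size, the number of severity classes, the ReLU activation, the softmax output, and the random initialisation are all assumptions; the abstract specifies none of them, and training is omitted entirely.

```python
import math
import random


def make_mlp(n_inputs, hidden_sizes, n_classes, seed=0):
    """Build a list of (weights, biases) layers with small random weights."""
    rng = random.Random(seed)
    sizes = [n_inputs, *hidden_sizes, n_classes]
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, x):
    """Forward pass: ReLU on hidden layers, softmax on the output layer."""
    for index, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    peak = max(x)
    exps = [math.exp(v - peak) for v in x]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]
```

Calling `make_mlp(10, [64, 64, 64], 4)` would give a network for 10 hypothetical module metrics and 4 hypothetical severity classes.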

Multi-Dimensional Analysis Method of Product Reviews for Market Insight (마켓 인사이트를 위한 상품 리뷰의 다차원 분석 방안)

  • Park, Jeong Hyun;Lee, Seo Ho;Lim, Gyu Jin;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.57-78
    • /
    • 2020
  • With the development of the Internet, consumers have had an opportunity to check product information easily through E-Commerce. Product reviews used in the process of purchasing goods are based on user experience, allowing consumers to engage as producers of information as well as refer to information. From the consumer's perspective this can increase the efficiency of purchasing decisions, and from the seller's perspective it can help develop products and strengthen their competitiveness. However, it takes a lot of time and effort for consumers to grasp the overall assessment, and the assessment along the dimensions they consider important, by reading the vast number of product reviews offered on E-Commerce sites for the products they want to compare. This is because product reviews are unstructured information, and it is difficult to read the sentiment and assessment dimension of a review immediately. For example, consumers who want to purchase a laptop would like to check the assessment of the compared products along each dimension, such as performance, weight, delivery, speed, and design. Therefore, in this paper, we propose a method to automatically generate multi-dimensional product assessment scores from the reviews of the products to be compared. The method presented in this study consists largely of two phases: the pre-preparation phase and the individual product scoring phase. In the pre-preparation phase, a dimensioned classification model and a sentiment analysis model are created based on reviews of the large-category product group. By combining word embedding and association analysis, the dimensioned classification model compensates for a limitation of existing studies, in which word embedding methods for finding the relevance between dimensions and words consider only the distance between words in sentences. The sentiment analysis model is a CNN trained on data tagged as positive or negative at the phrase level for accurate polarity detection. In the individual product scoring phase, the pre-prepared models are applied to reviews split into phrases. Phrases judged to describe a specific dimension are grouped, and multi-dimensional assessment scores are obtained by aggregating them by assessment dimension according to the proportion of reviews in each group. In the experiment of this paper, approximately 260,000 reviews of the large-category product group are collected to build the dimensioned classification model and the sentiment analysis model. In addition, reviews of the laptops of companies S and L sold through E-Commerce are collected and used as experimental data. The dimensioned classification model classified individual product reviews, broken down into phrases, into six assessment dimensions, and combined the existing word embedding method with an association analysis indicating the frequency between words and dimensions. As a result of combining word embedding and association analysis, the accuracy of the model increased by 13.7%. The sentiment analysis model analyzed the assessment more closely when it was trained on phrases rather than on sentences; its accuracy was confirmed to be 29.4% higher than that of the sentence-based model. Through this study, both sellers and consumers can expect efficient decision making in purchasing and product development, given that they can make multi-dimensional comparisons of products. In addition, text reviews, which are unstructured data, were transformed into objective values such as frequency and morpheme, and were analysed together using word embedding and association analysis to improve the objectivity of more precise multi-dimensional analysis and research. This will be an attractive analysis model, not only enabling more effective service deployment amid the evolving E-Commerce market and fierce competition, but also satisfying both sellers and consumers.
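The individual product scoring phase can be sketched as an aggregation over phrase-level predictions. The sketch assumes each phrase has already been labelled with a dimension and a polarity by the two pre-prepared models, and it uses the share of positive phrases as the score; the abstract says only that scores are aggregated by proportion, so the exact formula here is an assumption.

```python
from collections import defaultdict


def dimension_scores(phrase_predictions):
    """Share of positive phrases per assessment dimension.

    `phrase_predictions` is an iterable of (dimension, polarity) pairs,
    where polarity is "positive" or "negative".
    """
    counts = defaultdict(lambda: [0, 0])  # dimension -> [positive, total]
    for dimension, polarity in phrase_predictions:
        counts[dimension][1] += 1
        if polarity == "positive":
            counts[dimension][0] += 1
    return {dim: pos / total for dim, (pos, total) in counts.items()}
```

Running this separately on the reviews of two laptops would give per-dimension scores that can be compared side by side.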