• Title/Abstract/Keywords: Data classification

Search results: 7,933 items (processing time 0.033 s)

Re-classifying Method for Face Recognition

  • 배경률
    • 지능정보연구 / Vol. 10, No. 3 / pp.105-114 / 2004
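  • With growing interest in biometrics, its application to security fields such as access control and user authentication is actively under way. Face recognition in particular is seeing wider use among biometric technologies because it is convenient for users and causes little aversion to contact, but compared with other recognition technologies it is weak in the accuracy of recognition results and in its re-attempt rate. To address these weaknesses, this paper proposes an approach that re-classifies recognition results with a data classification algorithm. The experiments used PCA, a representative appearance-based algorithm. Applying the proposed re-classification approach to 200 subjects (200 face images in total) confirmed that performance improved in the re-recognition case.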


A Predictive Model to identify possible affected Bipolar disorder students using Naive Baye's, Random Forest and SVM machine learning techniques of data mining and Building a Sequential Deep Learning Model using Keras

  • Peerbasha, S.;Surputheen, M. Mohamed
    • International Journal of Computer Science & Network Security / Vol. 21, No. 5 / pp.267-274 / 2021
  • Medical care practice involves gathering a wide range of data on students who experience manic episodes and depression, which helps specialists diagnose the students' health conditions correctly. It also lets the instructors of those students identify them and take appropriate care of them. The data collected from the students may be straightforward symptoms that they observe themselves. Naive Bayes, random forest, and SVM classification algorithms were applied to the collected datasets to determine whether a student is affected by bipolar disorder. The performance of the algorithms on the disease data is calculated and compared. In addition, a sequential deep learning model is built using Keras. The simulation results show the efficacy of the classification techniques on the dataset, as well as the nature and complexity of the dataset used.

ConvXGB: A new deep learning model for classification problems based on CNN and XGBoost

  • Thongsuwan, Setthanun;Jaiyen, Saichon;Padcharoen, Anantachai;Agarwal, Praveen
    • Nuclear Engineering and Technology / Vol. 53, No. 2 / pp.522-531 / 2021
  • We describe a new deep learning model, Convolutional eXtreme Gradient Boosting (ConvXGB), for classification problems, based on convolutional neural nets and Chen et al.'s XGBoost. In addition to image data, ConvXGB supports general classification problems through a data preprocessing module. ConvXGB consists of several stacked convolutional layers that learn the features of the input automatically, followed by XGBoost in the last layer, which predicts the class labels. The ConvXGB model is simplified by reducing the number of parameters under appropriate conditions, since it is not necessary to re-adjust the weight values in a back propagation cycle. Experiments on several data sets from the UCI Repository, including images and general data sets, showed that our model handled the classification problems slightly better than CNN and XGBoost alone on all the tested data sets, and was sometimes significantly better.

Toward Practical Augmentation of Raman Spectra for Deep Learning Classification of Contamination in HDD

  • Seksan Laitrakun;Somrudee Deepaisarn;Sarun Gulyanon;Chayud Srisumarnk;Nattapol Chiewnawintawat;Angkoon Angkoonsawaengsuk;Pakorn Opaprakasit;Jirawan Jindakaew;Narisara Jaikaew
    • Journal of information and communication convergence engineering / Vol. 21, No. 3 / pp.208-215 / 2023
  • Deep learning techniques provide powerful solutions to several pattern-recognition problems, including Raman spectral classification. However, these networks require large amounts of labeled data to perform well. The scarcity of labeled data, which are typically obtained in a laboratory, can potentially be alleviated by data augmentation. This study investigated various data augmentation techniques and applied multiple deep learning methods to Raman spectral classification. Raman spectra yield fingerprint-like information about chemical compositions, but are prone to noise when the particles of the material are small. Five augmentation models were investigated to build robust deep learning classifiers: weighted sums of spectral signals, imitated chemical backgrounds, extended multiplicative signal augmentation, and generated Gaussian and Poisson-distributed noise. We compared the performance of nine state-of-the-art convolutional neural networks with all the augmentation techniques. The LeNet5 models with background noise augmentation yielded the highest accuracy, 88.33%, when tested on real-world Raman spectral classification. A class activation map of the model was generated to provide a qualitative observation of the results.
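
Two of the augmentation ideas named in the abstract, weighted sums of spectra and added Gaussian noise, can be illustrated with a minimal sketch. The function names, the noise level `sigma`, and the choice to blend two randomly picked same-class spectra are assumptions made for illustration; the paper's actual augmentation procedure is not described in this listing.

```python
import random


def weighted_sum(spec_a, spec_b, weight):
    """Blend two equal-length spectra: weight * a + (1 - weight) * b."""
    if len(spec_a) != len(spec_b):
        raise ValueError("spectra must have the same length")
    return [weight * a + (1.0 - weight) * b for a, b in zip(spec_a, spec_b)]


def add_gaussian_noise(spectrum, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to each channel."""
    return [x + rng.gauss(0.0, sigma) for x in spectrum]


def augment(spectra, n_new, sigma=0.01, seed=0):
    """Create n_new synthetic spectra from a list of same-class spectra.

    Hypothetical recipe: blend two randomly chosen spectra with a random
    weight, then perturb the blend with Gaussian noise.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.choice(spectra), rng.choice(spectra)
        blended = weighted_sum(a, b, rng.random())
        synthetic.append(add_gaussian_noise(blended, sigma, rng))
    return synthetic
```

Fixing the seed makes the augmented set reproducible, which helps when comparing classifiers trained on the same synthetic data.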

Particulate Distribution Map of Tidal Flat using Unsupervised Classification of Multi-Temporary Satellite Data

  • 정종철
    • 대한원격탐사학회지 / Vol. 18, No. 2 / pp.71-79 / 2002
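  • This study presents a grain-size distribution map of the Hampyeong Bay tidal flat, using the sediment grain composition obtained from field surveys and reflectance values extracted from satellite imagery of the same period. Spectra corresponding to the tidal-flat grain composition were extracted from Landsat TM data and analyzed, and seven satellite images were classified with the ISODATA and K-MEANS methods. The accuracy of the unsupervised classification was evaluated against field observations; the classification accuracies of the ISODATA and K-MEANS methods were 84.3% and 85.7%, respectively. To verify the multi-temporal classification results, the May 1999 TM image, classified using field survey data, was used as reference data and compared with the classification results of the multi-temporal images.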
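
The K-MEANS step and the accuracy check against field observations can be sketched as follows. This is a plain k-means on per-pixel reflectance vectors, not the study's actual processing chain; the random initialization, the fixed iteration count, and all function names are assumptions.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two reflectance vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(pixels, k, n_iter=20, seed=0):
    """Cluster pixels (sequences of band reflectances) into k groups.

    Returns (labels, centers). Cluster indices are arbitrary and carry no
    class meaning until they are matched to field observations.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(pixels, k)]
    labels = [0] * len(pixels)
    for _ in range(n_iter):
        # Assignment step: each pixel joins its nearest center.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in pixels
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                centers[c] = [sum(band) / len(members) for band in zip(*members)]
    return labels, centers


def overall_accuracy(predicted, reference):
    """Fraction of samples whose predicted class matches the reference class."""
    hits = sum(1 for p, r in zip(predicted, reference) if p == r)
    return hits / len(reference)
```

Because cluster indices are arbitrary, each cluster must first be assigned a grain-size class (for example from field samples) before `overall_accuracy` is meaningful.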

Hierarchical classification of Fingerprints using Discrete Wavelet Transform

  • 권용호;이정문
    • 산업기술연구 / Vol. 19 / pp.403-408 / 1999
  • An efficient method is developed for classifying fingerprint data based on the 2-D discrete wavelet transform. Fingerprint data is first converted to a binary image. Then a multi-level 2-D wavelet transform is performed. The vertical and horizontal subbands of the transformed data show typical energy distribution patterns relevant to the fingerprint categories. The proposed method, with a moderate level of wavelet transform, is successful in classifying fingerprints into 5 different types. Finer classification is possible by using higher frequency subbands and closer analysis of the energy distribution.
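
The subband energy idea can be illustrated with a single level of the 2-D Haar transform, the simplest discrete wavelet. The abstract does not name the wavelet or the number of levels, so the Haar choice, the single level, and the subband names below are assumptions.

```python
def haar_subband_energies(image):
    """Energy of the four subbands of a one-level 2-D Haar transform.

    image is a list of rows with even height and width (for example a
    binarized fingerprint with values 0 and 1). Energy is the sum of
    squared coefficients in a subband.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    energy = {"approx": 0.0, "horizontal": 0.0, "vertical": 0.0, "diagonal": 0.0}
    for r in range(0, rows, 2):
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            energy["approx"] += ((a + b + d + e) / 2) ** 2
            # Top row minus bottom row: responds to horizontal edges.
            energy["horizontal"] += ((a + b - d - e) / 2) ** 2
            # Left column minus right column: responds to vertical edges.
            energy["vertical"] += ((a - b + d - e) / 2) ** 2
            energy["diagonal"] += ((a - b - d + e) / 2) ** 2
    return energy
```

The transform is orthonormal, so the four energies sum to the energy of the input image. A multi-level version would reapply the same step to the approximation coefficients.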


Pre-Adjustment of Incomplete Group Variable via K-Means Clustering

  • Hwang, S.Y.;Hahn, H.E.
    • Journal of the Korean Data and Information Science Society / Vol. 15, No. 3 / pp.555-563 / 2004
  • In classification and discrimination, we often face an incomplete group variable, typically arising from many missing values and/or cases that are not credible. This paper suggests using K-means clustering to pre-adjust for the incompleteness, after which classification based on a generalized statistical distance is performed. To illustrate the proposed procedure, a simulation study compares it with CART from data mining and with traditional techniques that ignore the incompleteness of the group variable. The simulation study shows that the proposed methodology outperforms these alternatives.


Automatic Extraction of Initial Training Data Using National Land Cover Map and Unsupervised Classification and Updating Land Cover Map

  • 이승기;최석근;노신택;임노열;최주원
    • 한국측량학회지 / Vol. 33, No. 4 / pp.267-275 / 2015
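  • Land cover maps are widely used in fields such as the environment, the military, and decision-making. This study proposes a method that automatically extracts training data from a single satellite image and the national land cover map provided by the Ministry of Environment, and uses the data to classify land cover. The initial training data were obtained with ISODATA, an unsupervised classifier, and the existing land cover map. To address problems such as selecting and naming classes in unsupervised classification and selecting training data in supervised classification, the class information of the existing land cover map was used to classify and name the classes automatically. The extracted initial training data were used as MLC training data for land cover classification of the target satellite image. The training data were then updated iteratively to improve classification accuracy, and the final land cover map was produced. In addition, MRF was applied at each iteration to reduce the salt-and-pepper noise that arises in pixel-based classification, which improved classification accuracy. Applying the proposed method to the study area confirmed, both quantitatively and visually, that it can generate land cover maps effectively.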

Application of Multispectral Remotely Sensed Imagery for the Characterization of Complex Coastal Wetland Ecosystems of southern India: A Special Emphasis on Comparing Soft and Hard Classification Methods

  • Shanmugam, Palanisamy;Ahn, Yu-Hwan;Sanjeevi, Shanmugam
    • 대한원격탐사학회지 / Vol. 21, No. 3 / pp.189-211 / 2005
  • This paper compares the recently developed soft classification method based on Linear Spectral Mixture Modeling (LSMM) with traditional hard classification methods based on the Iterative Self-Organizing Data Analysis (ISODATA) and Maximum Likelihood Classification (MLC) algorithms, with the aim of achieving appropriate results for mapping, monitoring and preserving valuable coastal wetland ecosystems of southern India using Indian Remote Sensing Satellite (IRS) 1C/1D LISS-III and Landsat-5 Thematic Mapper image data. The ISODATA and MLC methods were applied to these satellite image data to produce maps of 5, 10, 15 and 20 wetland classes for each of three contrasting coastal wetland sites: Pitchavaram, Vedaranniyam and Rameswaram. The accuracy of the derived classes was assessed with the simplest descriptive statistic, overall accuracy, and with a discrete multivariate technique, KAPPA accuracy. ISODATA classification produced maps with poorer accuracy than MLC classification. However, overall accuracy and KAPPA accuracy decreased systematically as more classes were derived from the IRS-1C/1D and Landsat-5 TM imagery by either ISODATA or MLC. Two principal factors explained the decreased classification accuracy: spectral overlap/confusion and inadequate spatial resolution of the sensors. Compared with the former, the limited instantaneous field of view (IFOV) of these sensors produced a number of mixed pixels (mixels) in the image, and their effect on the classification process was a major obstacle to deriving accurate wetland cover types, in spite of the increasing spatial resolution of new-generation Earth Observation Sensors (EOS).
In order to improve the classification accuracy, a soft classification method based on Linear Spectral Mixture Modeling (LSMM) was used to calculate the spectral mixture and classify the IRS-1C/1D LISS-III and Landsat-5 TM imagery. This method considers the number of reflectance end-members that form the scene spectra, then determines their nature, and finally decomposes the spectra into their end-members. To evaluate the LSMM areal estimates, the resulting end-member fractions were compared with the normalized difference vegetation index (NDVI), ground truth data, and estimates derived from the traditional hard classifier (MLC). The findings revealed that NDVI values and vegetation fractions were positively correlated ($r^2$ = 0.96, 0.95 and 0.92 for Rameswaram, Vedaranniyam and Pitchavaram, respectively) and that NDVI and soil fraction values were negatively correlated ($r^2$ = 0.53, 0.39 and 0.13), indicating the reliability of the sub-pixel classification. Compared with ground truth data, the precision of LSMM was 92% for the moisture fraction and 96% for the soil fraction. In general, LSMM seems well suited to locating small wetland habitats that occur as sub-pixel inclusions and to representing continuous gradations between different habitat types.
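
Two quantities from this abstract lend themselves to a short sketch: the sub-pixel fraction from a linear mixture model, and the overall accuracy and kappa statistics computed from a confusion matrix. The unmixing function below covers only the two-end-member case with a sum-to-one constraint, where the least-squares solution has a closed form; the paper's model uses more end-members. The function names and the clamping of the fraction to [0, 1] are assumptions.

```python
def unmix_two_endmembers(pixel, endmember_a, endmember_b):
    """Least-squares fraction f of endmember_a, assuming
    pixel = f * endmember_a + (1 - f) * endmember_b plus noise.

    All arguments are equal-length lists of band reflectances.
    The result is clamped to [0, 1] (an illustrative design choice).
    """
    diff = [a - b for a, b in zip(endmember_a, endmember_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("endmembers must differ")
    f = sum((x - b) * d for x, b, d in zip(pixel, endmember_b, diff)) / denom
    return min(1.0, max(0.0, f))


def overall_accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix
    (rows: one labeling, columns: the other labeling)."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Agreement expected by chance, from row and column totals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n)
    ) / (total * total)
    return observed, (observed - expected) / (1.0 - expected)
```

For example, the matrix `[[45, 5], [10, 40]]` has an overall accuracy of 0.85 and a kappa of 0.70, because half of the agreement would be expected by chance.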

Land Cover Classification over East Asian Region Using Recent MODIS NDVI Data (2006-2008)

  • 강전호;서명석;곽종흠
    • 대기 / Vol. 20, No. 4 / pp.415-426 / 2010
  • A land cover map over the East Asian region (Kongju National University Land Cover map: KLC) is produced using a support vector machine (SVM) and evaluated with ground truth data. The basic input data are the most recent three years (2006-2008) of MODIS (MODerate resolution Imaging Spectroradiometer) NDVI (normalized difference vegetation index) data. The spatial resolution and temporal frequency of MODIS NDVI are 1 km and 16 days, respectively. To minimize the number of cloud-contaminated pixels in the MODIS NDVI data, maximum value compositing is applied to the 16-day data, and a correction of cloud-contaminated pixels based on the spatiotemporal continuity assumption is applied to the monthly NDVI data. To reduce the dataset and improve the classification quality, nine phenological variables, such as NDVI maximum, amplitude, and average, are derived from the corrected monthly NDVI data. Three land cover maps (International Geosphere Biosphere Programme: IGBP, University of Maryland: UMd, and MODIS) were used to build a "quasi" ground truth data set composed of pixels that all three maps classify as the same land cover type. The classification results show that the fractions of broadleaf trees and grasslands are greater, and those of croplands and needleleaf trees smaller, than in the IGBP or UMd maps. Validation against an in-situ observation database shows that the percentages of pixels in agreement with the observations are 80%, 77%, 63%, and 57% for the MODIS, KLC, IGBP, and UMd land cover data, respectively. The significant differences in land cover types among MODIS, IGBP, UMd and KLC occur mainly in southern China and Manchuria, where most pixels are contaminated by cloud and snow during summer and winter, respectively. This shows that the quality of raw data is one of the most important factors in land cover classification.
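
Three preprocessing steps from this abstract can be sketched in a few lines: maximum value compositing, deriving simple phenological variables, and building the "quasi" ground truth from pixels where three maps agree. Images are represented as flat lists of pixel values. The definition of amplitude as maximum minus minimum, the assumption that the three maps already share one class legend, and all function names are illustrative assumptions rather than details from the paper.

```python
def maximum_value_composite(ndvi_images):
    """Per-pixel maximum over several NDVI images of the same area.

    Cloud contamination lowers NDVI, so taking the maximum tends to keep
    the least contaminated observation for each pixel.
    """
    return [max(values) for values in zip(*ndvi_images)]


def phenology_features(monthly_ndvi):
    """Three simple phenological variables for one pixel's monthly NDVI."""
    highest, lowest = max(monthly_ndvi), min(monthly_ndvi)
    return {
        "maximum": highest,
        "amplitude": highest - lowest,  # assumed definition: max minus min
        "average": sum(monthly_ndvi) / len(monthly_ndvi),
    }


def quasi_ground_truth(map_a, map_b, map_c):
    """Keep a pixel's class only where all three land cover maps agree.

    Assumes the maps use one harmonized legend. Pixels with any
    disagreement are set to None and excluded from training.
    """
    return [a if a == b == c else None for a, b, c in zip(map_a, map_b, map_c)]
```

The non-None pixels returned by `quasi_ground_truth` would then serve as training samples for the classifier, paired with their phenological features.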