• Title/Abstract/Keywords: Data Classification Systems

Search results: 1,432 items; processing time: 0.025 s

디자인 패턴을 적용한 위성영상처리를 위한 군집화 분류시스템의 설계 (A Design of Clustering Classification Systems using Satellite Remote Sensing Images Based on Design Patterns)

  • 김동연;김진일
    • 정보처리학회논문지B
    • /
    • Vol. 9B, No. 3
    • /
    • pp.319-326
    • /
    • 2002
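  • In this paper, we design and implement a cluster classification system, an unsupervised classification technique, for processing satellite images. To make it easy to support new satellite image formats and clustering techniques, and to keep the design extensible, the system applies a variety of design patterns, including the Factory pattern and the Strategy pattern. The system implements sequential clustering, K-Means clustering, ISODATA, and Fuzzy C-Means clustering, and was tested with Landsat TM satellite images as the classifier input. The results show that clustering can be used effectively both to extract sample areas for classifying satellite images for which no prior knowledge is available and to classify satellite images in real time, and that the resulting system offers good reusability and extensibility.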
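The combination of the Strategy and Factory patterns described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `ClusteringStrategy`, `KMeansStrategy`, `SequentialStrategy`, and `create_strategy` are hypothetical, pixels are plain tuples of band values, and only two simplified techniques are shown.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Sequence

Pixel = Sequence[float]  # one value per spectral band (assumed representation)


def _sq_dist(a: Pixel, b: Pixel) -> float:
    """Squared Euclidean distance between two pixels."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class ClusteringStrategy(ABC):
    """Strategy interface: every clustering technique exposes the same method."""

    @abstractmethod
    def cluster(self, pixels: List[Pixel]) -> List[int]:
        """Return one cluster label per pixel."""


class KMeansStrategy(ClusteringStrategy):
    def __init__(self, k: int = 2, iterations: int = 10) -> None:
        self.k = k
        self.iterations = iterations

    def cluster(self, pixels: List[Pixel]) -> List[int]:
        # Deterministic initialisation: the first k pixels are the initial centers.
        centers = [list(p) for p in pixels[: self.k]]
        labels = [0] * len(pixels)
        for _ in range(self.iterations):
            # Assignment step: nearest center for every pixel.
            labels = [
                min(range(len(centers)), key=lambda c: _sq_dist(p, centers[c]))
                for p in pixels
            ]
            # Update step: each center moves to the mean of its members.
            for c in range(len(centers)):
                members = [p for p, lab in zip(pixels, labels) if lab == c]
                if members:
                    centers[c] = [sum(v) / len(members) for v in zip(*members)]
        return labels


class SequentialStrategy(ClusteringStrategy):
    def __init__(self, threshold: float = 10.0) -> None:
        self.threshold = threshold

    def cluster(self, pixels: List[Pixel]) -> List[int]:
        centers: List[List[float]] = []
        labels: List[int] = []
        for p in pixels:
            best = min(
                range(len(centers)),
                key=lambda c: _sq_dist(p, centers[c]),
                default=None,
            )
            # Open a new cluster when no existing center is close enough.
            if best is None or _sq_dist(p, centers[best]) > self.threshold ** 2:
                centers.append(list(p))
                best = len(centers) - 1
            labels.append(best)
        return labels


# Factory: supporting a new technique only requires a new registry entry.
_REGISTRY: Dict[str, Callable[..., ClusteringStrategy]] = {
    "kmeans": KMeansStrategy,
    "sequential": SequentialStrategy,
}


def create_strategy(name: str, **params) -> ClusteringStrategy:
    if name not in _REGISTRY:
        raise ValueError(f"unknown clustering technique: {name}")
    return _REGISTRY[name](**params)
```

Client code calls `create_strategy("kmeans", k=2).cluster(pixels)` and never depends on a concrete class, which is the extensibility property the abstract attributes to the pattern-based design.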

Supervised Learning-Based Collaborative Filtering Using Market Basket Data for the Cold-Start Problem

  • Hwang, Wook-Yeon;Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems
    • /
    • Vol. 13, No. 4
    • /
    • pp.421-431
    • /
    • 2014
  • The market basket data in the form of a binary user-item matrix or a binary item-user matrix can be modelled as a binary classification problem. The binary logistic regression approach tackles the binary classification problem, where principal components are predictor variables. If users or items are sparse in the training data, the binary classification problem can be considered as a cold-start problem. The binary logistic regression approach may not function appropriately if the principal components are inefficient for the cold-start problem. Assuming that the market basket data can also be considered as a special regression problem whose response is either 0 or 1, we propose three supervised learning approaches: random forest regression, random forest classification, and elastic net to tackle the cold-start problem, comparing the performance in a variety of experimental settings. The experimental results show that the proposed supervised learning approaches outperform the conventional approaches.

엔트로피 기반 분할과 중심 인스턴스를 이용한 분류기법의 데이터 감소 (Data Reduction for Classification using Entropy-based Partitioning and Center Instances)

  • 손승현;김재련
    • 산업경영시스템학회지
    • /
    • Vol. 29, No. 2
    • /
    • pp.13-19
    • /
    • 2006
  • Instance-based learning is a machine learning technique that has proven successful over a wide range of classification problems. Despite its high classification accuracy, however, it has a relatively high storage requirement, and because it must search through all instances to classify unseen cases, it is slow to perform classification. In this paper, we present a new data reduction method for instance-based learning that integrates the strengths of instance partitioning and attribute selection. Experimental results show that reducing the amount of data for instance-based learning reduces data storage requirements, lowers computational costs, minimizes noise, and can facilitate a more rapid search.
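The storage and search savings claimed above can be illustrated with a small Python sketch that replaces the stored instances with one center instance per partition and classifies by nearest center. This is a simplification under stated assumptions, not the authors' method: each class is treated as a single partition (the entropy-based partitioning and attribute selection steps are omitted), and the function names `center_instances` and `classify` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

Instance = Tuple[Sequence[float], Hashable]  # (feature vector, class label)


def center_instances(data: List[Instance]) -> Dict[Hashable, List[float]]:
    """Reduce the stored instances to one mean vector per class."""
    groups = defaultdict(list)
    for features, label in data:
        groups[label].append(features)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def classify(centers: Dict[Hashable, List[float]], query: Sequence[float]) -> Hashable:
    """1-nearest-neighbour search over the reduced set (squared Euclidean distance)."""
    return min(
        centers,
        key=lambda label: sum((q - c) ** 2 for q, c in zip(query, centers[label])),
    )
```

With this reduction a query is compared against one vector per class instead of every training instance, which is where the lower storage and search cost comes from.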

골 성숙도 판별을 위한 심층 메타 학습 기반의 분류 문제 학습 방법 (Deep Meta Learning Based Classification Problem Learning Method for Skeletal Maturity Indication)

  • 민정원;강동중
    • 한국멀티미디어학회논문지
    • /
    • Vol. 21, No. 2
    • /
    • pp.98-107
    • /
    • 2018
  • In this paper, we propose a method to classify skeletal maturity from a small number of hand-wrist X-ray images using deep learning-based meta-learning. General deep learning techniques require large amounts of data, but in many cases such data sets are not available for practical applications. The lack of training data is usually addressed through transfer learning with models pre-trained on large data sets. However, transfer learning performance may degrade because of overfitting on an unknown new task with little data, which results in poor generalization. In addition, obtaining labeled medical images is costly, requiring professional manpower and much time. Therefore, in this paper, we use meta-learning, in which models pre-trained on various learning tasks can classify using only a small amount of new data. First, we train the meta-model on a separate data set composed of various learning tasks. The network then learns to classify bone maturity using bone maturity data composed of wrist radiographs. Finally, we compare the classification results of the conventional learning algorithm with those of meta-learning using the same number of training data sets.

One-dimensional CNN Model of Network Traffic Classification based on Transfer Learning

  • Lingyun Yang;Yuning Dong;Zaijian Wang;Feifei Gao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 18, No. 2
    • /
    • pp.420-437
    • /
    • 2024
  • Network traffic classification (NTC) suffers from problems such as complicated statistical features and insufficient training samples, which can lead to poor classification performance. An NTC architecture based on a one-dimensional Convolutional Neural Network (CNN) and transfer learning is proposed to tackle these problems and improve fine-grained classification performance. The key points of the proposed architecture are: (1) model classification, in which a normalized rate feature set extracted from the original data is combined with existing statistical features to optimize the CNN NTC model; and (2) applying transfer learning to the classification to improve NTC performance. We collect two typical network flow data sets from Youku and YouTube and verify the proposed method through extensive experiments. The results show that, compared with existing methods, our method can improve classification accuracy by around 3-5% for Youku and by about 7 to 27% for YouTube.

신경회로망을 이용한 분류모형 개발 (Development of Classification Model Using Neural Network)

  • 박광박;박영만;황승국
    • 한국지능시스템학회논문지
    • /
    • Vol. 18, No. 5
    • /
    • pp.638-641
    • /
    • 2008
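  • In this paper, we develop a method that preprocesses the data and then classifies it using Fuzzy TAM. In the preprocessing step, categorical features are used to decompose the problem; for quantitative features, a range is set for each class, and if a feature has a region that does not overlap with the other classes, the data in that region are fixed and excluded from classification. After this preprocessing, classification is performed with Fuzzy TAM.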
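One possible reading of the preprocessing step for a quantitative feature is sketched below in Python. It assumes that each class range is the [min, max] interval of the feature over that class, and that a sample is "fixed" when its value falls inside exactly one class range. The abstract does not give these details, so both the interpretation and the function names (`class_ranges`, `fixed_label`, `split_fixed`) are assumptions.

```python
from typing import Dict, Hashable, List, Optional, Sequence, Tuple

Range = Tuple[float, float]


def class_ranges(values: Sequence[float], labels: Sequence[Hashable]) -> Dict[Hashable, Range]:
    """Per-class [min, max] interval of a single quantitative feature."""
    ranges: Dict[Hashable, Range] = {}
    for v, lab in zip(values, labels):
        lo, hi = ranges.get(lab, (v, v))
        ranges[lab] = (min(lo, v), max(hi, v))
    return ranges


def fixed_label(x: float, ranges: Dict[Hashable, Range]) -> Optional[Hashable]:
    """Return the class if x lies in exactly one class range, otherwise None."""
    hits = [lab for lab, (lo, hi) in ranges.items() if lo <= x <= hi]
    return hits[0] if len(hits) == 1 else None


def split_fixed(values: Sequence[float], labels: Sequence[Hashable]) -> Tuple[List[int], List[int]]:
    """Indices decided by the ranges alone, and indices left for the classifier."""
    ranges = class_ranges(values, labels)
    fixed: List[int] = []
    remaining: List[int] = []
    for i, v in enumerate(values):
        (fixed if fixed_label(v, ranges) is not None else remaining).append(i)
    return fixed, remaining
```

Only the `remaining` samples would then be passed to the downstream classifier (Fuzzy TAM in the paper), which shrinks the problem it has to solve.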

분류체계에 관한 인용분석 - 국제서지를 바탕으로 - (A Reference Study on International Literature of Classification Systems During the Period 1981-1990)

  • 정연경
    • 한국문헌정보학회지
    • /
    • Vol. 26
    • /
    • pp.187-212
    • /
    • 1994
  • The present study examines the characteristics of the international literature of classification systems published in the period 1981-1990. The references in the 'Classification Literature' sections of International Classification and the references in these source items were examined. The present study focused on analyzing each of the following characteristics: format, subject, language, geographical origin, age, authorship and number of references. The findings from the data analyses show clearly that in the literature of classification systems, 1) books were the most frequently cited format; 2) library and information science was the most frequently cited subject; 3) English was the major language; 4) the literature of each classification system was written predominantly in English except for Library Bibliographic Classification; 5) the language of each source item was the same as that of the greatest number of references of that source item; 6) the U.S., Germany, India, Russia, and the U.K. were the major geographic origins of publication; 7) there was a very close relationship between country of publication and language; 8) the country of origin of the documents was cited more than any other country except for the U.S.; 9) Price's Index of the literature revealed that the literature was a soft science and the half-life of the literature was about 7.5 years; 10) there was a preponderance of single authorships; 11) the literature was not a scholarly or scientific literature, according to the average number of references in source items and the percentage of unreferenced items. The findings of this reference study provide a better understanding of the characteristics of the classification systems literature. They prove useful for collection development, help classification systems researchers prepare linguistically for their careers, and encourage international communication efforts.

Support Vector Machine based on Stratified Sampling

  • Jun, Sung-Hae
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 9, No. 2
    • /
    • pp.141-146
    • /
    • 2009
  • The support vector machine is a classification algorithm based on statistical learning theory. It has produced many results with good performance in data mining. The algorithm has some problems, however, one of which is its heavy computing cost. This makes it difficult to use the support vector machine in dynamic and online systems. To overcome this problem, we propose using stratified sampling from statistical sampling theory. Stratified sampling helps reduce the size of the training data. In our paper, classification accuracy is maintained even though the data size is small. We verify the improved performance with experimental results on data sets from the UCI machine learning repository.
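The idea of shrinking the training set while keeping class proportions can be sketched in Python as follows. The sketch assumes that the strata are the class labels and that allocation is proportional; the abstract states neither, and `stratified_sample` is a hypothetical helper rather than the authors' procedure.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def stratified_sample(
    data: Sequence,
    labels: Sequence[Hashable],
    fraction: float,
    seed: int = 0,
) -> Tuple[List, List[Hashable]]:
    """Draw the same fraction from every class so class proportions are preserved."""
    rng = random.Random(seed)  # seeded for reproducibility
    strata = defaultdict(list)
    for i, lab in enumerate(labels):
        strata[lab].append(i)

    chosen: List[int] = []
    for indices in strata.values():
        # Keep at least one instance per class so no class disappears.
        n = max(1, round(fraction * len(indices)))
        chosen.extend(rng.sample(indices, n))

    chosen.sort()  # keep the original ordering of the retained instances
    return [data[i] for i in chosen], [labels[i] for i in chosen]
```

The reduced `(data, labels)` pair would then be passed to whatever SVM trainer is in use; training cost drops with the sample size while every class stays represented.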

통계적 정보기반 계층적 퍼지-러프 분류기법 (Statistical Information-Based Hierarchical Fuzzy-Rough Classification Approach)

  • 손창식;서석태;정환묵;권순학
    • 한국지능시스템학회논문지
    • /
    • Vol. 17, No. 6
    • /
    • pp.792-798
    • /
    • 2007
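  • In this paper, we propose a statistical information-based hierarchical fuzzy-rough classification method that, without using a learning technique, maximizes pattern classification performance while reducing the number of rules. In the proposed method, statistical information is used to extract the partition intervals of the input fuzzy sets at each layer of the hierarchical fuzzy-rough classification system, and rough sets are used to minimize the number of fuzzy if-then rules associated with the partition intervals extracted from the statistical information. To demonstrate the effectiveness of the proposed method, we compared its classification accuracy and number of rules with those of existing pattern classification methods on Fisher's IRIS data. The results confirm that the proposed method achieves classification performance similar to that of the existing methods.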

데이터 마이닝에서 패턴 분류를 위한 다중 SVM 분류기 (Multiple SVM Classifier for Pattern Classification in Data Mining)

  • 김만선;이상용
    • 한국지능시스템학회논문지
    • /
    • Vol. 15, No. 3
    • /
    • pp.289-293
    • /
    • 2005
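  • Pattern classification extracts pattern information, in its various forms, that represents real-world objects and determines which class each pattern belongs to. As a computer application software technology for data mining, industrial automation, and office automation, pattern classification is now used in many fields. Its primary goal is to improve classification performance, and over the past 40 years many researchers have tried a variety of approaches toward that goal. Commonly used single classification methods include Bayes classifiers based on probabilistic inference over patterns, decision trees, distance-function-based methods, neural networks, and clustering, but these are not efficient for analyzing large, high-dimensional data. Consequently, multiple classifier systems, which improve performance by combining several complementary classifiers, are being actively studied. This paper points out the problems of existing research on multiple SVM (Support Vector Machine) classifiers and proposes a new model. To extend the SVM to a multi-class classifier, we propose BORSE (Bootstrap Resampling SVM by Ensemble), a model based on the one-vs-rest strategy that treats each SVM's output as a signal with a nonlinear pattern, trains a neural network on these signals, and combines them into the final classification result.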
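The one-vs-rest strategy mentioned in this abstract can be sketched in Python as follows. Two substitutions are made to keep the example self-contained, and neither reflects BORSE itself: each binary SVM is replaced by a simple linear scorer built from class means, and the per-class outputs are combined with a plain argmax instead of a trained neural network. The bootstrap resampling step is omitted, and all function names are hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Sequence

Vector = Sequence[float]
Scorer = Callable[[Vector], float]


def _mean(rows: List[Vector]) -> List[float]:
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit_binary_scorer(pos: List[Vector], neg: List[Vector]) -> Scorer:
    """Stand-in for a binary SVM: linear score w.x + b from the two class means."""
    mp, mn = _mean(pos), _mean(neg)
    w = [a - c for a, c in zip(mp, mn)]
    # The decision boundary passes through the midpoint of the two means.
    b = -sum(wi * (a + c) / 2 for wi, a, c in zip(w, mp, mn))
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b


def fit_one_vs_rest(X: List[Vector], y: List[Hashable]) -> Dict[Hashable, Scorer]:
    """Train one 'this class versus all others' scorer per class."""
    scorers: Dict[Hashable, Scorer] = {}
    for cls in dict.fromkeys(y):  # unique labels in first-seen order
        pos = [x for x, lab in zip(X, y) if lab == cls]
        neg = [x for x, lab in zip(X, y) if lab != cls]
        scorers[cls] = fit_binary_scorer(pos, neg)
    return scorers


def predict(scorers: Dict[Hashable, Scorer], x: Vector) -> Hashable:
    """Argmax combination: pick the class whose scorer is most confident."""
    return max(scorers, key=lambda cls: scorers[cls](x))
```

In the model described by the abstract, the vector of per-class outputs would instead be fed to a neural network that learns the final decision, so the argmax in `predict` marks the place where that combiner would go.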