• Title/Abstract/Keywords: Machine classification

2,055 search results (processing time: 0.03 s)

Development of Galaxy Image Classification Based on Hand-crafted Features and Machine Learning

  • 오윤주;정희철
    • 대한임베디드공학회논문지 / Vol. 16, No. 1 / pp. 17-27 / 2021
  • In this paper, we develop a galaxy image classification method based on hand-crafted features and machine learning techniques. We also provide an empirical analysis of which combinations of techniques are effective for galaxy image classification. To this end, we developed a framework consisting of four modules: preprocessing, feature extraction, feature post-processing, and classification. We found that the best-performing combination uses a median filter, ORB vector features, and a voting classifier based on an RBF SVM, a random forest, and logistic regression. Because the final method is efficient, we believe it is applicable to embedded environments.
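The voting step named in the abstract can be illustrated with a minimal sketch. The abstract does not say whether voting is hard or soft, so hard (majority) voting is assumed here; the class labels and per-classifier predictions are hypothetical stand-ins for the outputs of the RBF SVM, random forest, and logistic regression models.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists by hard (majority) voting.

    predictions: list of label lists, one per classifier, all the same length.
    Ties go to the label predicted by the earliest classifier in the list.
    """
    voted = []
    for sample_preds in zip(*predictions):
        counts = Counter(sample_preds)
        best = max(counts.values())
        # First label, in classifier order, that reaches the top count.
        voted.append(next(p for p in sample_preds if counts[p] == best))
    return voted

# Hypothetical predictions for four galaxy images.
svm_pred = ["spiral", "elliptical", "spiral", "irregular"]
rf_pred = ["spiral", "spiral", "elliptical", "irregular"]
lr_pred = ["elliptical", "spiral", "spiral", "elliptical"]

ensemble_pred = majority_vote([svm_pred, rf_pred, lr_pred])
print(ensemble_pred)
```

Hard voting needs only the predicted labels, which keeps the combination step cheap; soft voting would instead average class probabilities.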

A Data-centric Analysis to Evaluate Suitable Machine-Learning-based Network-Attack Classification Schemes

  • Huong, Truong Thu;Bac, Ta Phuong;Thang, Bui Doan;Long, Dao Minh;Quang, Le Anh;Dan, Nguyen Minh;Hoang, Nguyen Viet
    • International Journal of Computer Science & Network Security / Vol. 21, No. 6 / pp. 169-180 / 2021
  • Since the advent of machine learning, many machine-learning-based algorithms, from shallow learning to deep learning models, have been proposed for classification tasks. This variety makes it difficult to choose a classification algorithm that improves classification/detection efficiency in a given network context. It also raises the questions of whether an algorithm performs well and why it works for some problems but not for others. In this paper, we present a data-centric analysis that provides a way to select a suitable classification algorithm. This data-centric approach offers a new viewpoint for exploring relationships between classification performance and the facts and figures of data sets.

Power Quality Disturbances Identification Method Based on Novel Hybrid Kernel Function

  • Zhao, Liquan;Gai, Meijiao
    • Journal of Information Processing Systems / Vol. 15, No. 2 / pp. 422-432 / 2019
  • A hybrid kernel function for the support vector machine is proposed to improve classification performance for power quality disturbances. The mathematical model of the kernel function directly affects the classification performance of a support vector machine. Different types of kernel functions differ in generalization ability and learning ability, and a single kernel function cannot be strong in both. To overcome this problem, we propose a hybrid kernel function composed of two single kernel functions to improve both generalization and learning ability. In simulations, we used single and multiple power quality disturbances to test the classification performance of the support vector machine with the proposed hybrid kernel function. Compared with other support vector machine algorithms, the improved algorithm performs better in classifying power quality signals with single and multiple disturbances.
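A hybrid kernel of the general kind described here can be sketched as a weighted sum of two single kernels. The abstract does not name the two kernels or how they are combined, so the RBF and polynomial kernels, the `weight` parameter, and all default values below are assumptions for illustration only.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: a local kernel with strong learning ability."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: a global kernel with strong generalization ability."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree

def hybrid_kernel(x, y, weight=0.7):
    """Convex combination of the two kernels, with weight in [0, 1]."""
    return weight * rbf_kernel(x, y) + (1.0 - weight) * poly_kernel(x, y)

x = [1.0, 2.0]
y = [2.0, 0.0]
print(hybrid_kernel(x, y))
```

A convex combination of two valid (positive semi-definite) kernels is itself a valid kernel, so a function like this could be passed to any SVM implementation that accepts a custom kernel.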

Feature Selection and Hyper-Parameter Tuning for Optimizing Decision Tree Algorithm on Heart Disease Classification

  • Tsehay Admassu Assegie;Sushma S.J;Bhavya B.G;Padmashree S
    • International Journal of Computer Science & Network Security / Vol. 24, No. 2 / pp. 150-154 / 2024
  • In recent years, there has been extensive research on applying machine learning to automation and decision support for medical experts during disease detection. However, the performance of machine learning still needs improvement so that models produce more accurate and reliable results for disease detection. Selecting the hyper-parameters that yield the maximum possible classification accuracy on a medical dataset is the most challenging task in developing decision support systems with machine learning algorithms. Moreover, selecting the features that best characterize a disease is another challenge in developing a machine learning model with better classification accuracy. In this study, we propose an optimized decision tree model for heart disease classification using a heart disease dataset collected from the Kaggle data repository. The proposed model is evaluated, and experiments reveal that the performance of the decision tree improves when an optimal number of features is used for training. Overall, the accuracy of the proposed decision tree model is 98.2% for heart disease classification.

Design of Low Complexity Human Anxiety Classification Model based on Machine Learning

  • 홍은재;박형곤
    • 전기학회논문지 / Vol. 66, No. 9 / pp. 1402-1408 / 2017
  • Recently, services for personal biometric data analysis based on real-time monitoring systems have been increasing, and many of them focus on emotion recognition. In this paper, we propose a model that classifies anxiety using biometric data collected from people. We deploy a support vector machine to build the classification model. To improve classification accuracy, we propose two data pre-processing procedures: normalization and data deletion. The proposed algorithms are implemented on the Real-time Traffic Flow Measurement structure, which consists of a data collection module, a data preprocessing module, and a classification model creation module. Our experimental results show that the proposed classification model can infer people's anxiety with an accuracy of 65.18%. With the proposed pre-processing techniques, the accuracy improves to 78.77%. We therefore conclude that the classification model combined with pre-processing improves classification accuracy at lower computational complexity.
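The two pre-processing procedures can be sketched as follows. The abstract does not define them precisely, so this sketch assumes that "data deletion" means discarding records with missing readings and that "normalization" means per-feature standardization; the biometric feature names and values are hypothetical.

```python
from statistics import mean, pstdev

def drop_incomplete(records):
    """Data deletion (assumed): discard records containing a missing reading."""
    return [r for r in records if all(v is not None for v in r)]

def standardize(records):
    """Normalization (assumed): rescale each feature to zero mean, unit variance."""
    columns = list(zip(*records))
    means = [mean(c) for c in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in records]

# Hypothetical readings: (heart rate, skin conductance).
raw = [(72, 0.31), (95, None), (88, 0.52), (64, 0.28)]
clean = standardize(drop_incomplete(raw))
print(clean)
```

Rescaling matters for an SVM because features with large numeric ranges would otherwise dominate the kernel's distance or dot-product computation.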

Classifying Windows Executables using API-based Information and Machine Learning

  • 조대희;임경환;조성제;한상철;황영섭
    • 정보과학회 논문지 / Vol. 43, No. 12 / pp. 1325-1333 / 2016
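  • Software classification techniques can be used for copyright infringement detection, malware classification, and automatic categorization of software in software repositories, as well as in software filtering systems that block the transmission of illegal software. When a software filtering system identifies illegal software through similarity measurement, using software classification to narrow the search range reduces the average number of comparisons. This paper studies the classification of Windows executables using API call information and machine learning. We evaluate classification performance by applying various methods of refining API call information and various machine learning algorithms. In our experiments, an SVM (Support Vector Machine) with PolyKernel achieved the highest success rate. API call information can be extracted from binary executables, and applying machine learning to it makes it possible to identify tampered programs and classify executables quickly. Therefore, software classification based on API call information and machine learning is suitable for use in software filtering systems.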

An Improved Text Classification Method for Sentiment Classification

  • Wang, Guangxing;Shin, Seong Yoon
    • Journal of information and communication convergence engineering / Vol. 17, No. 1 / pp. 41-48 / 2019
  • In recent years, sentiment analysis research has become popular. Its results have been applied with remarkable success in practice, for example in Amazon's book recommendation system and the North American movie box office evaluation system. Analyzing big data based on user preferences and evaluations and recommending hot-selling books and hot-rated movies to users in a targeted manner greatly improve book sales and movie attendance [1, 2]. However, traditional machine learning-based sentiment analysis methods such as the Classification and Regression Tree (CART), Support Vector Machine (SVM), and k-nearest neighbor classification (kNN) have performed poorly in accuracy. In this paper, an improved kNN classification method is proposed. The improved method, combined with data normalization, raises accuracy. The three classification algorithms and the improved algorithm were then compared on experimental data. Experiments show that the improved method performs best in the kNN classification method, with an accuracy rate of 11.5% and a precision rate of 20.3%.
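The role of normalization in kNN can be shown with a minimal sketch of the plain algorithm. The abstract does not describe the improved variant, so this is standard kNN with min-max scaling, not the authors' method; the two features (positive-word count, review length) and the data are hypothetical.

```python
import math
from collections import Counter

def fit_min_max(rows):
    """Return a function that scales a row to [0, 1] per feature."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]

    def scale(row):
        return [(v - lo) / span for v, lo, span in zip(row, lows, spans)]

    return scale

def knn_predict(train_x, train_y, query, k=3):
    """Label the query by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

# Hypothetical features: (positive-word count, review length in words).
train_x = [(5, 120), (4, 300), (0, 150), (1, 280)]
train_y = ["pos", "pos", "neg", "neg"]
query = (3, 200)

scale = fit_min_max(train_x)
scaled_pred = knn_predict([scale(r) for r in train_x], train_y, scale(query))
raw_pred = knn_predict(train_x, train_y, query)
print(scaled_pred, raw_pred)
```

On the raw data the review length dominates the Euclidean distance, so the two predictions can differ; scaling puts both features on the same footing.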

A Novel Thresholding for Prediction Analytics with Machine Learning Techniques

  • Shakir, Khan;Reemiah Muneer, Alotaibi
    • International Journal of Computer Science & Network Security / Vol. 23, No. 1 / pp. 33-40 / 2023
  • Machine learning techniques perform effectively in data analytics. Classification and regression support prediction on different kinds of data, and various classification techniques are used depending on the nature of the data. Threshold determination is essential for building a better model for unlabelled data. In this paper, threshold values are applied as ranges, based on the min-max normalization technique, to create labels, and multiclass classification is performed on rainfall data. Binary classification is applied to autism data, and classification techniques are applied to child abuse data. The performance of each technique is analysed with evaluation metrics.
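Creating labels from threshold ranges on min-max normalized values might look like the sketch below. The abstract gives no threshold values or class names, so the equal-width cut points, the three class names, and the rainfall figures are all assumptions.

```python
def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid division by zero for constant input
    return [(v - low) / span for v in values]

def label_by_threshold(values, thresholds=(1 / 3, 2 / 3),
                       names=("low", "medium", "high")):
    """Assign each value the class whose normalized range it falls into."""
    labels = []
    for v in min_max_normalize(values):
        # Count how many thresholds the normalized value meets or exceeds.
        index = sum(v >= t for t in thresholds)
        labels.append(names[index])
    return labels

# Hypothetical unlabelled rainfall measurements in millimetres.
rainfall_mm = [0.0, 30.0, 60.0, 90.0, 120.0]
rainfall_labels = label_by_threshold(rainfall_mm)
print(rainfall_labels)
```

The resulting labels can then serve as targets for a multiclass classifier trained on other attributes of the same records.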

A Machine Learning Approach for Knowledge Base Construction Incorporating GIS Data for Land Cover Classification of Landsat ETM+ Image

  • 김화환;구자용
    • 대한지리학회지 / Vol. 43, No. 5 / pp. 761-774 / 2008
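  • As digital processing techniques for satellite imagery have advanced in remote sensing, interest in integrating GIS data with knowledge-based expert systems has grown. In this study, we applied a machine learning technique and a rule-based classification technique to integrate GIS data into the land cover classification of satellite imagery. For the case study area, a Landsat ETM+ image was used together with GIS data such as elevation, slope, aspect, distance to water bodies, distance to roads, and population density. Using the C5.0 inductive machine learning algorithm, a decision tree and classification rules were generated from 350 sample points. Classification with the derived rules showed that GIS data such as elevation, distance to water bodies, and population density were effective for rule-based classification. The machine learning and knowledge-based classification techniques proposed in this study make it possible to integrate diverse GIS data and classify satellite imagery more effectively.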
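Decision tree learners in the C4.5/C5.0 family choose splits with entropy-based criteria. The sketch below computes plain information gain for one threshold on one GIS attribute; it is an illustration of that idea rather than the C5.0 algorithm itself, and the sample points, attribute, and threshold are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting the samples at value <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    weighted = sum(len(part) / len(labels) * entropy(part)
                   for part in (left, right) if part)
    return entropy(labels) - weighted

# Hypothetical sample points: elevation in metres and land cover class.
elevation_m = [5, 12, 30, 210, 340, 400]
land_cover = ["water", "water", "urban", "forest", "forest", "forest"]

gain = information_gain(elevation_m, land_cover, threshold=100)
print(gain)
```

A split with high gain translates directly into a readable rule, for example "elevation above 100 m implies forest" in this toy data.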

Terrain Cover Classification Technique Based on Support Vector Machine

  • 성기열;박준성;유준
    • 전자공학회논문지SC / Vol. 45, No. 6 / pp. 55-59 / 2008
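  • For effective maneuver control in the autonomous driving of unmanned vehicles in outdoor environments, recognizing and classifying the material type of detected obstacles and terrain surfaces is important, in addition to obstacle detection and geometric information about the terrain. An image-based ground surface classification algorithm proceeds through preprocessing of the input image, feature extraction, classification, and post-processing. This paper presents a terrain classification technique that uses color and texture information from outdoor terrain images acquired with a color CCD camera. Color space conversion is performed in the preprocessing stage, wavelet transform features are used to exploit color and texture information, and an SVM (support vector machine) is applied as the classifier. Experiments on real images acquired in outdoor environments were used to evaluate the classification performance of the presented algorithm and confirmed its potential for effective classification of off-road terrain.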
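A wavelet-based texture feature can be sketched with one level of the 1-D Haar transform applied to a row of pixel intensities. The abstract does not state which wavelet or which statistics are used, and the actual method works on 2-D color images, so the Haar wavelet, the energy feature, and the pixel rows below are assumptions.

```python
import math

def haar_step(signal):
    """One level of the orthonormal 1-D Haar transform (even-length input)."""
    root2 = math.sqrt(2)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / root2 for a, b in pairs]  # low-pass: local averages
    detail = [(a - b) / root2 for a, b in pairs]  # high-pass: local changes
    return approx, detail

def detail_energy(signal):
    """Mean squared detail coefficient, used here as a simple texture feature."""
    _, detail = haar_step(signal)
    return sum(d * d for d in detail) / len(detail)

# Hypothetical grayscale rows: a smooth surface and a rough surface.
smooth_row = [100, 101, 100, 99, 100, 101, 100, 100]
rough_row = [100, 140, 90, 150, 80, 145, 95, 135]

print(detail_energy(smooth_row), detail_energy(rough_row))
```

Rough textures put more energy into the detail coefficients, so features of this kind give a classifier such as an SVM a way to separate surface types.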