• Title/Abstract/Keywords: feature importance

408 search results, processing time 0.023 s

인간의 인지도에 근거한 질의를 통한 영상 검색의 성능 향상 (Performance Improvement of Image Retrieval System by Presenting Query based on Human Perception)

  • 유헌우;장동식;오근태
    • 한국정보과학회논문지:컴퓨팅의 실제 및 레터 / Vol. 9, No. 2 / pp.158-165 / 2003
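  • Similarity between images is generally judged by computing the distance in vector space between feature vectors extracted from the images. Although such feature vectors offer one way to compute similarity, they do not always faithfully reflect the human notion of similarity. Most existing image retrieval systems therefore assign an importance to each feature and reflect it in the similarity measure. This paper proposes a new algorithm for setting and updating initial weights for image retrieval. The database images are first grouped according to human perceptual judgment; internal and external queries are then performed, and the group to which each similar retrieved image belongs is identified in order to compute, for each image, the optimal feature weights needed for similarity computation. Experiments on 2,000 images demonstrate the superiority of the proposed algorithm.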
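
The idea of reflecting per-feature importance in an image similarity measure can be illustrated with a weighted Euclidean distance. This is only a minimal sketch: the abstract does not state the distance function or how the weights are learned, so the function names, the toy vectors, and the weight values below are assumptions made for illustration.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two equal-length feature vectors."""
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("vectors and weights must have the same length")
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def rank_by_similarity(query, database, weights):
    """Return database indices ordered from most to least similar to the query."""
    return sorted(
        range(len(database)),
        key=lambda i: weighted_distance(query, database[i], weights),
    )

# Hypothetical two-dimensional feature vectors (e.g. one colour and one texture feature).
query = [0.0, 0.0]
database = [[1.0, 0.0], [0.0, 2.0]]

# Equal weights: image 0 is closer to the query than image 1.
equal = rank_by_similarity(query, database, [1.0, 1.0])

# Down-weighting the second feature reverses the ranking.
second_feature_down = rank_by_similarity(query, database, [1.0, 0.1])
```

Changing only the weights changes which image is retrieved first, which is why the choice and update of feature weights matters for matching human judgments of similarity.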

기계학습을 이용한 밴드갭 예측과 소재의 조성기반 특성인자의 효과 (Compositional Feature Selection and Its Effects on Bandgap Prediction by Machine Learning)

  • 남충희
    • 한국재료학회지 / Vol. 33, No. 4 / pp.164-174 / 2023
  • The bandgap characteristics of semiconductor materials are an important factor when utilizing semiconductor materials for various applications. In this study, based on data provided by AFLOW (Automatic-FLOW for Materials Discovery), the bandgap of a semiconductor material was predicted using only the material's compositional features. The compositional features were generated using the Python modules 'Pymatgen' and 'Matminer'. Pearson's correlation coefficients (PCC) between the compositional features were calculated, and those with a correlation coefficient larger than 0.95 were removed in order to avoid overfitting. The bandgap prediction performance was compared using the R² score and the root-mean-squared error. When the bandgap was predicted with random forest and XGBoost as representative ensemble algorithms, XGBoost gave better results after cross-validation and hyper-parameter tuning. To investigate the effect of compositional feature selection on the bandgap prediction of the machine learning model, the prediction performance was studied as a function of the number of features, selected using feature importance methods. No significant change in prediction performance was found beyond an appropriate number of features. Furthermore, artificial neural networks were employed to compare the prediction performance while adjusting the number of features guided by the PCC values, resulting in a best R² score of 0.811. By comparing and analyzing the bandgap distribution and prediction performance for material groups containing specific elements (F, N, Yb, Eu, Zn, B, Si, Ge, Fe, Al), various information for material design was obtained.
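
The correlation-based filtering step described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the feature names and values are hypothetical, the absolute value of the PCC is compared against 0.95, and of each highly correlated pair the later feature is the one dropped. The abstract specifies none of these details.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation; treat as 0
    return cov / math.sqrt(var_x * var_y)

def drop_correlated(features, threshold=0.95):
    """Keep a feature only if |PCC| with every already-kept feature is <= threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical compositional features for four materials.
features = {
    "mean_atomic_number": [1.0, 2.0, 3.0, 4.0],
    "mean_atomic_mass": [2.1, 4.0, 6.2, 7.9],        # nearly proportional to the first
    "electronegativity_range": [0.5, 0.1, 0.9, 0.3],
}
kept = drop_correlated(features)
```

Here `mean_atomic_mass` is removed because it is almost perfectly correlated with `mean_atomic_number`, while the weakly correlated third feature is retained.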

퍼지 원 클래스 서포트 벡터 머신 (Fuzzy One Class Support Vector Machine)

  • 김기주;최영식
    • 인터넷정보학회논문지 / Vol. 6, No. 3 / pp.159-170 / 2005
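  • The OC-SVM (One Class Support Vector Machine) is a technique that, instead of estimating the full distribution of the given data, estimates the support of the data distribution, finding the optimal support vectors that best describe the given data. OC-SVM is an excellent approach for representing a data distribution, but it is difficult for it to reflect a person's subjective sense of importance. In this paper, we derive the FOC-SVM (Fuzzy One Class Support Vector Machine), which applies a fuzzy membership to each data point so that the user's subjective importance can be expressed in the conventional OC-SVM. Rather than treating all data equally, FOC-SVM treats data according to the importance of each data object. That is, the magnitude of each feature vector is scaled according to the object's importance so that the feature vectors of less important data contribute less to the OC-SVM procedure. To verify this, we ran experiments on synthetic data, and the results agreed with the expected behavior.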

Identification of Topological Entities and Naming Mapping for Parametric CAD Model Exchanges

  • Mun, Duh-Wan;Han, Soon-Hung
    • International Journal of CAD/CAM / Vol. 5, No. 1 / pp.69-81 / 2005
  • As collaborative design and configuration design gain increasing importance in product development, it becomes essential to exchange parametric CAD models among participants. Parametric CAD models can be represented and exchanged in the form of a macro file or a part file that contains the modeling history of a product. The modeling history of a parametric CAD model contains feature specifications and each feature has selection information that records the name of the referenced topological entities. Translating this selection information requires solving the problems of how to identify the referenced topological entities of a feature (persistent naming problem) and how to convert the selection information into the format of the receiving CAD system (naming mapping problem). The present paper introduces the problem of exchanging parametric CAD models and proposes a solution to naming mapping.

Improved image alignment algorithm based on projective invariant for aerial video stabilization

  • Yi, Meng;Guo, Bao-Long;Yan, Chun-Man
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 8, No. 9 / pp.3177-3195 / 2014
  • In many moving object detection problems in aerial video, accurate and robust stabilization is of critical importance. In this paper, a novel, accurate image alignment algorithm for aerial electronic image stabilization (EIS) is described. The feature points are first selected using a Harris detector based on optimal derivative filters, which improves differentiation accuracy and yields precise feature point coordinates. Delaunay triangulation edges are then used to find matching pairs between feature points in overlapping images. The most "useful" matching points, those belonging to the background, are used to find the global transformation parameters using the projective invariant. Finally, the intentional motion of the camera is accumulated for correction by Sage-Husa adaptive filtering. Experimental results on aerial video sequences with various dynamic scenes demonstrate the performance of the proposed algorithm.

Explainable Machine Learning Based a Packed Red Blood Cell Transfusion Prediction and Evaluation for Major Internal Medical Condition

  • Lee, Seongbin;Lee, Seunghee;Chang, Duhyeuk;Song, Mi-Hwa;Kim, Jong-Yeup;Lee, Suehyun
    • Journal of Information Processing Systems / Vol. 18, No. 3 / pp.302-310 / 2022
  • Efficient use of limited blood products is becoming very important in terms of socioeconomic status and patient recovery. To predict the appropriateness of patient-specific transfusions for intensive care unit (ICU) patients who require real-time monitoring, we evaluated a model that dynamically predicts the possibility of transfusion using the Medical Information Mart for Intensive Care III (MIMIC-III), an ICU admission record at Harvard Medical School. In this study, we developed an explainable machine learning model to predict the possibility of red blood cell transfusion for major medical diseases in the ICU. Target disease groups that received packed red blood cell transfusions at high frequency were selected, and 16,222 patients were finally extracted. The prediction model achieved an area under the ROC curve of 0.9070 and an F1-score of 0.8166 (LightGBM). To explain the performance of the machine learning model, feature importance analysis and a partial dependence plot were used. The results of our study can be used as basic data for recommendations related to the adequacy of blood transfusions and are expected to ultimately contribute to the recovery of patients and the prevention of excessive consumption of blood products.

운영 데이터를 활용한 제3자 물류 환경에서의 배송 트럭 무게 예측 (Truck Weight Estimation using Operational Statistics at 3rd Party Logistics Environment)

  • 이유진;최경민;김송은;박경수;정승환
    • 산업경영시스템학회지 / Vol. 45, No. 4 / pp.127-133 / 2022
  • Many manufacturers that use third-party logistics (3PL) face challenges in increasing their logistics efficiency. This study introduces an effort to estimate the weight of the delivery trucks provided by 3PL providers, which allows the manufacturer to package and load products in trailers in advance to reduce delivery time. The accuracy of the weight estimation is all the more important because of total weight regulations. This study uses not only data from the company but also many general prediction variables such as weather, oil prices, and the population of destinations. In addition, operational statistics variables are developed to indicate the availability of trucks in a specific weight category for each 3PL provider. The prediction model, using an XGBoost regressor and the permutation feature importance method, provides highly acceptable performance with a MAPE of 2.785% and shows the effectiveness of the developed operational statistics variables.

Application of Random Forests to Assessment of Importance of Variables in Multi-sensor Data Fusion for Land-cover Classification

  • Park No-Wook;Chi Kwang-Hoon
    • 대한원격탐사학회지 / Vol. 22, No. 3 / pp.211-219 / 2006
  • A random forests classifier is applied to multi-sensor data fusion for supervised land-cover classification in order to account for the importance of variables. The random forests approach is a non-parametric ensemble classifier based on CART-like trees. Its distinguishing feature is that the importance of a variable can be estimated by randomly permuting the variable of interest in all the out-of-bag samples for each classifier. Two different multi-sensor data sets for supervised classification were used to illustrate the applicability of random forests: one with optical and polarimetric SAR data and the other with multi-temporal Radarsat-1 and ENVISAT ASAR data sets. In the experiments, the random forests approach could extract important variables or bands for land-cover discrimination and showed reasonably good performance in terms of classification accuracy.
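
The permutation idea mentioned in this abstract can be sketched in a few lines. This is a simplified illustration, not the random forests procedure itself: a single fixed, hypothetical classifier (`threshold_model`) stands in for a trained tree, and a small hand-made evaluation set stands in for the out-of-bag samples. Importance is taken as the average drop in accuracy after shuffling one column.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean accuracy drop per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the data with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

def threshold_model(row):
    """Hypothetical classifier that looks only at feature 0."""
    return 1 if row[0] > 0.5 else 0

# Toy evaluation set: feature 0 determines the label, feature 1 is noise.
X = [[0.1, 5.0], [0.2, 1.0], [0.3, 4.0], [0.4, 2.0],
     [0.6, 3.0], [0.7, 5.0], [0.8, 1.0], [0.9, 2.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

importances = permutation_importance(threshold_model, X, y)
```

Shuffling the feature the classifier ignores leaves accuracy unchanged, so its importance is zero, whereas shuffling the informative feature lowers accuracy and yields a positive importance.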

MIMO-OFDM 시스템에서 에너지 효율성을 위한 기계 학습 기반 적응형 전송 기술 및 Feature Space 연구 (Machine-Learning-Based Link Adaptation for Energy-Efficient MIMO-OFDM Systems)

  • 오명석;김기범;박현철
    • 한국전자파학회논문지 / Vol. 27, No. 5 / pp.407-415 / 2016
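  • Recent trends in wireless communication emphasize the importance of energy-efficient transmission. This paper considers link adaptation that uses machine learning to maximize energy efficiency in multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) wireless systems. We propose a two-dimensional capacity (2D-CAP) feature space that effectively represents the channel state of a MIMO-OFDM system, and a machine-learning-based bit and power adaptation (ML-BPA) algorithm that performs energy-efficient adaptive transmission through classification. Simulation results confirm that 2D-CAP accurately represents the wireless channel states considered in this paper and thereby improves the performance of adaptive transmission. A direct comparison with other feature spaces, including the ordered postprocessing signal-to-noise ratio (ordSNR), further confirms that 2D-CAP offers a clear gain in both transmission performance and complexity.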