• Title/Abstract/Keywords: Data Sets

A Note on Linear Regression Model Using Non-Symmetric Triangular Fuzzy Number Coefficients

  • Hong, Dug-Hun;Kim, Kyung-Tae
    • Journal of the Korean Data and Information Science Society / Vol. 16, No. 2 / pp. 445-449 / 2005
  • Yen et al. [Fuzzy Sets and Systems 106 (1999) 167-177] calculated the fuzzy membership function for the output to find the non-symmetric triangular fuzzy number coefficients of a linear regression model for all given input-output data sets. In this note, we show that the result they obtained in their paper is invalid.

Comparison of deep learning-based autoencoders for recommender systems

  • 이효진;정윤서
    • The Korean Journal of Applied Statistics / Vol. 34, No. 3 / pp. 329-345 / 2021
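  • Recommender systems use customer data to recommend personalized products. They fall broadly into three categories: collaborative filtering, content-based filtering, and hybrid methods that combine the two. This study introduces autoencoder-based recommender systems built on deep learning methodology and compares the models. The autoencoder is a deep learning-based unsupervised learning model that can effectively handle data matrices containing many zeros. We compare five autoencoder-based models on three real data sets. The first three are collaborative filtering models, and the remaining two are hybrid models. The real data consist of customer ratings, and because most ratings are missing, the sparsity ratio is high.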
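
As an illustration of the sparsity this abstract describes, the following minimal Python sketch computes the sparsity ratio of a small rating matrix and a reconstruction error restricted to observed ratings. It is not the authors' implementation: the function names `sparsity_ratio` and `masked_mse`, the convention that 0 codes a missing rating, and the toy data are all assumptions made for illustration. Restricting the error to observed entries is one common way to fit autoencoders to sparse ratings; the abstract does not say which loss the compared models use.

```python
def sparsity_ratio(ratings):
    """Fraction of entries in a rating matrix that are missing.

    Assumption: a missing rating is coded as 0.
    """
    total = sum(len(row) for row in ratings)
    missing = sum(1 for row in ratings for r in row if r == 0)
    return missing / total


def masked_mse(ratings, reconstruction):
    """Mean squared error computed over observed (non-zero) ratings only."""
    sq_errors = [
        (r - p) ** 2
        for row, rec_row in zip(ratings, reconstruction)
        for r, p in zip(row, rec_row)
        if r != 0  # skip missing ratings so they do not pull predictions to 0
    ]
    return sum(sq_errors) / len(sq_errors)


# Hypothetical toy data: 2 customers x 3 items, 0 = no rating given.
ratings = [[5, 0, 0],
           [0, 3, 0]]
# Hypothetical output of some autoencoder for the same matrix.
reconstruction = [[4, 1, 1],
                  [2, 3, 2]]

toy_sparsity = sparsity_ratio(ratings)
toy_error = masked_mse(ratings, reconstruction)
```

In this toy case only two of the six entries are observed, so the error is averaged over those two entries rather than over the whole matrix.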

Neural network rule extraction for credit scoring

  • Bart Baesens;Rudy Setiono;Lille, Valerina-De;Stijn Viaene
    • Korea Intelligent Information Systems Society: Conference Proceedings / Korea Intelligent Information Systems Society, 2001, The Pacific Asian Conference on Intelligent Systems 2001 / pp. 128-132 / 2001
  • In this paper, we evaluate and contrast four neural network rule extraction approaches for credit scoring. Experiments are carried out on three real-life credit scoring data sets. Both the continuous and the discretised versions of all data sets are analysed. The rule extraction algorithms, Neurolinear, Neurorule, Trepan and Nefclass, have different characteristics with respect to their perception of the neural network and their way of representing the generated rules or knowledge. It is shown that Neurolinear, Neurorule and Trepan are able to extract very concise rule sets or trees with a high predictive accuracy when compared to classical decision tree (rule) induction algorithms like C4.5(rules). In particular, Neurorule extracted easy-to-understand and powerful propositional if-then rules for all discretised data sets. Hence, the Neurorule algorithm may offer a viable alternative for rule generation and knowledge discovery in the domain of credit scoring.

Decision method for rule-based physical activity status using rough sets

  • 이영동;손창식;정완영;박희준;김윤년
    • Journal of Sensor Science and Technology / Vol. 18, No. 6 / pp. 432-440 / 2009
  • This paper presents an accelerometer-based system for physical activity decision that is capable of recognizing three types of physical activity (standing, walking, and running) using rough sets. To collect acceleration data, we developed a body sensor node consisting of two custom boards for physical activity monitoring applications: a wireless sensor node and an accelerometer sensor module. The physical activity decision is based on the acceleration data collected from the body sensor node attached to the user's chest. We propose a method for classifying physical activities using rough sets, which generates rules from the attributes of the preprocessed data and reduces the rules by constructing a new decision table. Our experimental results show that the rule patterns obtained after removing redundant attribute values perform the same as or better than those before removal.

GA-Based Construction of Fuzzy Classifiers Using Information Granules

  • Kim Do-Wan;Lee Ho-Jae;Park Jin-Bae;Joo Young-Hoon
    • International Journal of Control, Automation, and Systems / Vol. 4, No. 2 / pp. 187-196 / 2006
  • A new GA-based methodology using information granules is suggested for the construction of fuzzy classifiers. The proposed scheme consists of three steps: selection of information granules, construction of the associated fuzzy sets, and tuning of the fuzzy rules. First, the genetic algorithm (GA) is applied to develop adequate information granules. The fuzzy sets are then constructed from an analysis of the developed information granules. An interpretable fuzzy classifier is designed using the constructed fuzzy sets. Finally, the GA is used to tune the fuzzy rules, which can enhance the classification performance on misclassified data (e.g., data with unusual patterns or on the boundaries of the classes). To show the effectiveness of the proposed method, an example, the classification of the Iris data, is provided.

A Study on the Detection of Obstructive Sleep Apnea Using ECG

  • 조성필;최호선;이경중
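    • The Institute of Electronics Engineers of Korea: Conference Proceedings / The Institute of Electronics Engineers of Korea, 2003 Summer Conference Proceedings V / pp. 2879-2882 / 2003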
  • Obstructive sleep apnea (OSA) is a representative symptom of sleep disorder caused by airway obstruction. OSA is usually diagnosed through laboratory-based polysomnography (PSG), which is uncomfortable and expensive. In this paper, a method for detecting OSA events using ECG has been developed. The proposed method uses the ECG data sets provided by PhysioNet. The features for OSA event detection are the average and standard deviation of the R-R interval over 1 minute, the power spectrum of the R-R interval, and the S-pulse amplitude from the data sets. These features are applied to the input of a neural network. To evaluate the method, we used other ECG data sets and achieved a sensitivity of 89.66% and a specificity of 95.25%. These results indicate that the features proposed in this paper are important for detecting OSA.
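
The first two features named in this abstract, the average and standard deviation of the R-R interval over 1 minute, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `rr_features_per_minute`, the input format (R-peak times in seconds), the rule that an interval belongs to the minute in which it ends, the use of the population standard deviation, and the toy beat times are all hypothetical.

```python
from statistics import mean, pstdev


def rr_features_per_minute(r_peak_times, window_s=60.0):
    """Return {minute_index: (mean_rr, std_rr)} from R-peak times in seconds.

    Assumption: each R-R interval is assigned to the window in which
    its second (later) beat occurs.
    """
    windows = {}
    for prev, cur in zip(r_peak_times, r_peak_times[1:]):
        idx = int(cur // window_s)            # which 1-minute window
        windows.setdefault(idx, []).append(cur - prev)
    # Population standard deviation is used here; a sample estimate
    # would be an equally reasonable choice.
    return {idx: (mean(rr), pstdev(rr)) for idx, rr in sorted(windows.items())}


# Hypothetical R-peak times (seconds): steady 1 s beats, then a long pause.
beats = [0.0, 1.0, 2.0, 3.0, 61.0, 62.0]
features = rr_features_per_minute(beats)
```

Each per-minute pair could then form part of the input vector to a classifier; the spectral and S-pulse amplitude features mentioned in the abstract are not covered by this sketch.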

The Qualifications for the Application of the Rainfall Spatial Distribution Analysis Technique

  • 황세운;박승우;조영경
    • Korea Water Resources Association: Conference Proceedings / Korea Water Resources Association, 2005 Conference Proceedings / pp. 943-947 / 2005
  • This study raises an objection to analyzing rainfall spatial distribution without a proper standard and offers an improved approach based on the geostatistical analysis method. To this end, spatially distributed daily rainfall data sets were collected for 41 weather stations in the study area, and variogram and correlation analyses were conducted. The correlation analysis showed that a longer distance between stations reduces the correlation of the rainfall data and shapes the characteristics of the rainfall spatial distribution. The variogram analysis shows that the correlation range was less than 50 km for 17 of the 91 daily rainfall data sets. This indicates that determining the application method for rainfall spatial distribution without some qualifications involves risk; hence application standards for the rainfall spatial distribution analysis technique are essential, and they depend on the characteristics of the rainfall and the landscape.

COMPARISON OF GLOBAL SEA SURFACE TEMPERATURE PRODUCTS

  • Kubota, Masahisa;Iwasaki, Shinzuke
    • The Korean Society of Remote Sensing: Conference Proceedings / The Korean Society of Remote Sensing, 2006, Proceedings of ISRS 2006 PORSEC Volume II / pp. 993-996 / 2006
  • The NOAA operational bulk SST product (Reynolds et al., 2002) is a very popular global SST data set and is used extensively in various studies. However, its original time resolution is weekly, which is relatively coarse. On the other hand, many new global SST data sets now exist. In this study, we compare many global SST data sets, including the NOAA operational bulk SST product, the CAOS OI SST product, Microwave Optimum Interpolation (MWOI) SST, Real Time Global (RTG) SST, and JMA merged satellite and in situ Global Daily (MGD) SST.

Feature reduction for classifying high dimensional data sets using support vector machine

  • 고석하;이현주
    • The Institute of Electronics Engineers of Korea: Conference Proceedings / The Institute of Electronics Engineers of Korea, 2008 Summer Conference / pp. 877-878 / 2008
  • We suggest a feature reduction method to classify mouse function data sets, which integrate several biological data sets represented as high-dimensional vectors. To increase classification accuracy and decrease computational overhead, it is important to reduce the dimension of the features. To do this, we employed the Hybrid Huberized Support Vector Machine with kernels used for a kernel logistic regression method. Compared with the support vector machine, this approach shows better accuracy with useful features for each mouse function.

Productivity Evaluation and Comparison of Korean Provincial Hospitals

  • 안태식;박정식
    • Korea Journal of Hospital Management / Vol. 2, No. 1 / pp. 22-47 / 1997
  • This paper evaluated the relative efficiency of 33 provincial medical centers using Data Envelopment Analysis (DEA) and compared the DEA efficiency results with those of the current method conducted by the management evaluation team. DEA was selected as an alternative efficiency evaluation method since it can handle multiple inputs and multiple outputs simultaneously and identify the sources of inefficiency. To analyze the sensitivity of productivity values to the variable sets, four different sets of input and output variables were identified. Results showed that most of the medical centers operate far from the efficiency frontier, supporting previous results. Some centers showed 100% efficiency regardless of the selected variable sets. DEA results were compared with the current management evaluation results. Some inconsistencies between the results of the two methods were found for some DMUs, showing the existence of methodology bias. DEA results and ratio analysis results mostly agree for the 1992 data.
