• Title/Summary/Keyword: data-sets


A Note on Linear Regression Model Using Non-Symmetric Triangular Fuzzy Number Coefficients

  • Hong, Dug-Hun;Kim, Kyung-Tae
    • Journal of the Korean Data and Information Science Society / v.16 no.2 / pp.445-449 / 2005
  • Yen et al. [Fuzzy Sets and Systems 106 (1999) 167-177] calculated the fuzzy membership function for the output to find the non-symmetric triangular fuzzy number coefficients of a linear regression model for all given input-output data sets. In this note, we show that the result they obtained in their paper is invalid.
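
For readers unfamiliar with the term, a non-symmetric triangular fuzzy number can be described by a center and two different spreads. The sketch below shows the textbook membership function only; it is not the derivation of Yen et al. or of this note, and the name `triangular_membership` and the sample values are assumptions made for illustration.

```python
def triangular_membership(x, center, left_spread, right_spread):
    """Membership grade of x in a non-symmetric triangular fuzzy number.

    The grade is 1 at the center and falls linearly to 0 at
    center - left_spread and at center + right_spread.
    """
    if left_spread <= 0 or right_spread <= 0:
        raise ValueError("spreads must be positive")
    if x <= center:
        grade = 1.0 - (center - x) / left_spread
    else:
        grade = 1.0 - (x - center) / right_spread
    return max(0.0, grade)


# Hypothetical coefficient: peak at 2.0, left spread 1.0, right spread 3.0.
grades = [triangular_membership(x, 2.0, 1.0, 3.0) for x in (1.0, 1.5, 2.0, 3.5, 5.0)]
print(grades)  # [0.0, 0.5, 1.0, 0.5, 0.0]
```

Because the two spreads differ, the grade drops to 0.5 at a distance of 0.5 on the left of the center but only at a distance of 1.5 on the right.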


Neural network rule extraction for credit scoring

  • Bart Baesens;Rudy Setiono;Lille, Valerina-De;Stijn Viaene
    • Proceedings of the Korea Intelligent Information System Society Conference / 2001.01a / pp.128-132 / 2001
  • In this paper, we evaluate and contrast four neural network rule extraction approaches for credit scoring. Experiments are carried out on three real-life credit scoring data sets. Both the continuous and the discretised versions of all data sets are analysed. The rule extraction algorithms, Neurolinear, Neurorule, Trepan and Nefclass, have different characteristics with respect to their perception of the neural network and their way of representing the generated rules or knowledge. It is shown that Neurolinear, Neurorule and Trepan are able to extract very concise rule sets or trees with a high predictive accuracy when compared to classical decision tree (rule) induction algorithms like C4.5(rules). Neurorule in particular extracted easy-to-understand and powerful propositional if-then rules for all discretised data sets. Hence, the Neurorule algorithm may offer a viable alternative for rule generation and knowledge discovery in the domain of credit scoring.


Decision method for rule-based physical activity status using rough sets (러프집합을 이용한 규칙기반 신체활동상태 결정방법)

  • Lee, Young-Dong;Son, Chang-Sik;Chung, Wan-Young;Park, Hee-Joon;Kim, Yoon-Nyun
    • Journal of Sensor Science and Technology / v.18 no.6 / pp.432-440 / 2009
  • This paper presents an accelerometer-based system for physical activity decision that is capable of recognizing three types of physical activity, i.e., standing, walking and running, using rough sets. To collect physical acceleration data, we developed a body sensor node consisting of two custom boards for physical activity monitoring applications: a wireless sensor node and an accelerometer sensor module. The physical activity decision is based on the acceleration data collected from the body sensor node attached to the user's chest. We propose a method to classify physical activities using rough sets, which generates rules from the attributes of the preprocessed data and then reduces the rules by constructing a new decision table. Our experimental results show that the rule patterns obtained after removing redundant attribute values perform as well as or better than the original ones.
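
As background on the rough-set idea mentioned in this abstract, the sketch below computes the lower and upper approximations of one decision class from a small decision table. It is a generic illustration, not the authors' system: the attribute names (`magnitude`, `variance`), the discretised values, and the function `approximations` are all hypothetical.

```python
from collections import defaultdict


def approximations(table, condition_attrs, decision_attr, target):
    """Return (lower, upper) approximations of the rows whose decision is `target`.

    Rows with identical condition-attribute values are indiscernible and
    form one equivalence class.
    """
    classes = defaultdict(set)
    for idx, row in enumerate(table):
        key = tuple(row[a] for a in condition_attrs)
        classes[key].add(idx)

    target_set = {i for i, row in enumerate(table) if row[decision_attr] == target}
    lower, upper = set(), set()
    for members in classes.values():
        if members <= target_set:   # class lies entirely inside the target
            lower |= members
        if members & target_set:    # class overlaps the target
            upper |= members
    return lower, upper


# Hypothetical discretised accelerometer features (not data from the paper).
table = [
    {"magnitude": "low", "variance": "low", "activity": "standing"},
    {"magnitude": "mid", "variance": "mid", "activity": "walking"},
    {"magnitude": "mid", "variance": "mid", "activity": "running"},
    {"magnitude": "high", "variance": "high", "activity": "running"},
]
lower, upper = approximations(table, ["magnitude", "variance"], "activity", "running")
print(lower, upper)  # {3} {1, 2, 3}
```

Rows 1 and 2 share the same condition values but have different activities, so they fall only in the upper approximation; row 3 is certainly "running" and is the only member of the lower approximation.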

Comparison of deep learning-based autoencoders for recommender systems (오토인코더를 이용한 딥러닝 기반 추천시스템 모형의 비교 연구)

  • Lee, Hyo Jin;Jung, Yoonsuh
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.329-345 / 2021
  • Recommender systems use data from customers to suggest personalized products. Recommender systems can be categorized into three types: collaborative filtering, content-based filtering, and hybrid recommender systems that combine the first two filtering methods. In this work, we introduce and compare deep learning-based recommender systems using autoencoders. An autoencoder is an unsupervised deep learning model that can effectively address the problem of sparsity in the data matrix. Five versions of autoencoder-based deep learning models are compared via three real data sets. The first three methods are collaborative filtering methods and the others are hybrid methods. The data sets are composed of customers' ratings with integer values from one to five. All three data sets are sparse data matrices with many zeros due to non-responses.
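
To make the sparsity issue concrete, the sketch below measures the share of non-responses in a small ratings matrix and evaluates a reconstruction only on the observed entries. Restricting the error to observed ratings is a common practice for autoencoder recommenders, assumed here rather than taken from the abstract; `sparsity`, `masked_mse`, and the sample numbers are hypothetical.

```python
def sparsity(ratings):
    """Fraction of cells equal to 0, where 0 marks a non-response."""
    cells = [r for row in ratings for r in row]
    return cells.count(0) / len(cells)


def masked_mse(ratings, reconstruction):
    """Mean squared error computed over observed (non-zero) ratings only."""
    total, count = 0.0, 0
    for row_r, row_p in zip(ratings, reconstruction):
        for r, p in zip(row_r, row_p):
            if r != 0:
                total += (r - p) ** 2
                count += 1
    return total / count if count else 0.0


# Hypothetical 2 users x 3 items; ratings are integers from 1 to 5.
ratings = [[5, 0, 3],
           [0, 4, 0]]
# Hypothetical output of some reconstruction model.
reconstruction = [[4.5, 2.0, 3.0],
                  [1.0, 4.5, 3.0]]

print(sparsity(ratings))                    # 0.5
print(masked_mse(ratings, reconstruction))  # average of 0.25, 0.0 and 0.25
```

The predictions at the zero cells do not affect the error; they are the values a recommender would use to rank unseen items.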

GA-Based Construction of Fuzzy Classifiers Using Information Granules

  • Kim Do-Wan;Lee Ho-Jae;Park Jin-Bae;Joo Young-Hoon
    • International Journal of Control, Automation, and Systems / v.4 no.2 / pp.187-196 / 2006
  • A new GA-based methodology using information granules is suggested for the construction of fuzzy classifiers. The proposed scheme consists of three steps: selection of information granules, construction of the associated fuzzy sets, and tuning of the fuzzy rules. First, the genetic algorithm (GA) is applied to develop adequate information granules. The fuzzy sets are then constructed from the analysis of the developed information granules. An interpretable fuzzy classifier is designed using the constructed fuzzy sets. Finally, the GA is used to tune the fuzzy rules, which can enhance the classification performance on misclassified data (e.g., data with unusual patterns or on the boundaries of the classes). To show the effectiveness of the proposed method, an example, the classification of the Iris data, is provided.

A Study on the Detection of Obstructive Sleep Apnea Using ECG (ECG를 이용한 수면 무호흡 검출에 관한 연구)

  • 조성필;최호선;이경중
    • Proceedings of the IEEK Conference / 2003.07c / pp.2879-2882 / 2003
  • Obstructive sleep apnea (OSA) is a representative symptom of sleep disorder caused by airway obstruction. OSA is usually diagnosed through laboratory-based polysomnography (PSG), which is uncomfortable and expensive. In this paper, a detection method for OSA events using ECG has been developed. The proposed method uses the ECG data sets provided by Physionet. The features for OSA event detection are the average and standard deviation of the R-R interval over 1 minute, the power spectrum of the R-R interval, and the S-pulse amplitude from the data sets. These features are used as the input to a neural network. To evaluate the method, we used other ECG data sets. We achieved a sensitivity of 89.66% and a specificity of 95.25%. These results indicate that the features proposed in this paper are important for detecting OSA.
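
Two of the quantities named in this abstract are simple to state in code: the mean and standard deviation of R-R intervals within a window, and sensitivity and specificity of a binary detector. The sketch below is a minimal illustration under assumptions; it is not the authors' pipeline, the helper names are hypothetical, the R-peak times and labels are made up, and the population standard deviation is one possible choice.

```python
from statistics import mean, pstdev


def rr_features(r_peak_times_ms):
    """Mean and (population) standard deviation of R-R intervals in one window."""
    intervals = [b - a for a, b in zip(r_peak_times_ms, r_peak_times_ms[1:])]
    return mean(intervals), pstdev(intervals)


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical R-peak timestamps in milliseconds (a real window spans 1 minute).
mean_rr, sd_rr = rr_features([0, 800, 1800, 2600, 3600])
print(mean_rr, sd_rr)  # 900 100.0

# Hypothetical per-minute labels: 1 = OSA event, 0 = normal.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

With these made-up labels, two of three apnea minutes are detected (sensitivity 2/3) and four of five normal minutes are correctly rejected (specificity 0.8).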


The Qualifications for the Application of the Rainfall Spatial Distribution Analysis Technique (강우량 공간분포 분석기법의 적용조건에 관한 연구)

  • Hwang Sye-Woon;Park Seung-Woo;Cho Young-Kyoung
    • Proceedings of the Korea Water Resources Association Conference / 2005.05b / pp.943-947 / 2005
  • This study questions the analysis of rainfall spatial distribution without a proper standard and offers an improved approach that uses the geostatistical analysis method. For this purpose, spatially distributed daily rainfall data sets were collected for 41 weather stations in the study area, and variogram and correlation analyses were conducted. The correlation analysis showed that a longer distance between stations reduces the correlation of the rainfall data and shapes the characteristics of the rainfall spatial distribution. The variogram analysis shows that the correlation range was less than 50 km for 17 of the 91 daily rainfall data sets. This suggests that choosing an application method for rainfall spatial distribution without qualifying conditions involves some risk; hence, application standards for the rainfall spatial distribution analysis technique are essential, and they depend on the characteristics of the rainfall and the landscape.
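
The variogram analysis mentioned here can be illustrated with a classical empirical semivariogram, which averages half the squared difference between station values within distance bins. The sketch below is generic and assumes planar coordinates in kilometres; the function name, bin width, station positions, and rainfall values are hypothetical and are not taken from the study.

```python
import math
from collections import defaultdict


def empirical_semivariogram(coords, values, bin_width):
    """Map each distance bin (lower edge) to gamma(h) = sum((zi - zj)^2) / (2 * N)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            dist = math.dist(coords[i], coords[j])
            b = int(dist // bin_width)
            sums[b] += (values[i] - values[j]) ** 2
            counts[b] += 1
    return {b * bin_width: sums[b] / (2 * counts[b]) for b in sorted(counts)}


# Hypothetical stations on a line, 10 km apart, with one day's rainfall in mm.
coords = [(0, 0), (10, 0), (20, 0), (30, 0)]
rain = [10, 12, 15, 11]
gamma = empirical_semivariogram(coords, rain, bin_width=10)
for lag, g in gamma.items():
    print(lag, round(g, 3))
```

In practice, the lag at which the semivariogram levels off is read as the correlation range; a real analysis would fit a variogram model to many more station pairs than this toy example contains.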


COMPARISON OF GLOBAL SEA SURFACE TEMPERATURE PRODUCTS

  • Kubota, Masahisa;Iwasaki, Shinzuke
    • Proceedings of the KSRS Conference / v.2 / pp.993-996 / 2006
  • The NOAA operational bulk SST product (Reynolds et al., 2002) is a very popular global SST data set and is extensively used for various studies. However, its original time resolution is weekly, which is relatively coarse. On the other hand, many new global SST data sets are now available. In this study, we compare several global SST data sets, including the NOAA operational bulk SST product, the CAOS OI SST product, Microwave Optimum Interpolation (MWOI) SST, Real Time Global (RTG) SST, and JMA merged satellite and in situ Global Daily (MGD) SST.


Feature reduction for classifying high dimensional data sets using support vector machine (고차원 데이터의 분류를 위한 서포트 벡터 머신을 이용한 피처 감소 기법)

  • Ko, Seok-Ha;Lee, Hyun-Ju
    • Proceedings of the IEEK Conference / 2008.06a / pp.877-878 / 2008
  • We suggest a feature reduction method to classify mouse function data sets, which integrate several biological data sets represented as high-dimensional vectors. To increase classification accuracy and decrease computational overhead, it is important to reduce the dimension of the features. To do this, we employed a Hybrid Huberized Support Vector Machine with kernels used for a kernel logistic regression method. Compared with a support vector machine, this approach shows better accuracy with useful features for each mouse function.


Productivity Evaluation and Comparison of Korean Provincial Hospitals (한국 지방공사 의료원의 생산성 평가와 비교)

  • Ahn, Tae-Sik;Park, Jung-Sik
    • Korea Journal of Hospital Management / v.2 no.1 / pp.22-47 / 1997
  • This paper evaluated the relative efficiency of 33 provincial medical centers using Data Envelopment Analysis (DEA) and compared the DEA efficiency results with those of the current method conducted by the management evaluation team. DEA was selected as an alternative efficiency evaluation method since it can handle multiple inputs and multiple outputs simultaneously and identify the sources of inefficiency. To analyze the sensitivity of productivity values to the variable sets, four different sets of input and output variables were identified. Results showed that most of the medical centers are operating far from the efficiency frontier, supporting previous results. Some centers showed 100% efficiency regardless of the selected variable sets. DEA results were compared with the current management evaluation results. Inconsistencies between the results of the two methods were found for some DMUs, showing the existence of methodology bias. DEA results and ratio analysis results mostly agree for the 1992 data.
