• Title/Summary/Keyword: UCI

Search Result 194, Processing Time 0.02 seconds

Attribute Selection in Mixed Databases for Data Mining (데이터마이닝을 위한 혼합 데이터베이스에서의 속성선택)

  • Cha, Un-Ok;Heo, Mun-Yeol
    • Proceedings of the Korean Statistical Society Conference / 2003.05a / pp.103-108 / 2003
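  • Attribute selection is widely used to reduce large databases for data mining. In this paper, we use three attribute selection methods to reduce the number of condition attributes by more than 60%, apply the reduced data to decision trees and logistic regression models, and compare their efficiency. The three attribute selection methods are MDI, information gain, and ReliefF. The decision tree methods are QUEST, CART, and C4.5. The classification accuracy of the attribute selection methods was evaluated by 10-fold cross-validation on the Credit Approval database and the German Credit database from the UCI database.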

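Of the three attribute selection methods named in the abstract above, information gain is the simplest to illustrate. The following is a minimal sketch, not the authors' implementation: it assumes categorical attributes, and the toy rows, attribute names, and the helper `rank_attributes` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels):
    """Entropy of the labels minus the expected entropy after
    splitting on one categorical attribute."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(rows, labels):
    """Return attribute names sorted by decreasing information gain,
    together with the gain of each attribute."""
    names = list(rows[0].keys())
    gains = {n: information_gain([r[n] for r in rows], labels) for n in names}
    return sorted(gains, key=gains.get, reverse=True), gains


# Hypothetical toy data: "employed" separates the classes, "region" does not.
rows = [
    {"employed": "yes", "region": "north"},
    {"employed": "yes", "region": "south"},
    {"employed": "no", "region": "north"},
    {"employed": "no", "region": "south"},
]
labels = ["approve", "approve", "reject", "reject"]
ranking, gains = rank_attributes(rows, labels)
```

Keeping only the top-ranked attributes from `ranking` would be one way to reach a reduction target such as the 60% mentioned in the abstract; continuous attributes would need to be discretized first.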

Improvement of SOM using Stratification

  • Jun, Sung-Hae
    • International Journal of Fuzzy Logic and Intelligent Systems / v.9 no.1 / pp.36-41 / 2009
  • The self-organizing map (SOM) is an unsupervised method based on competitive learning. Many clustering studies have been performed using SOM. It also visualizes the data according to its result, and the visualized result has been used in the decision process of descriptive data mining as exploratory data analysis. In this paper, we propose an improvement of SOM using stratified sampling from statistics. The stratification improves the performance of SOM. To verify the improvement, we conduct comparative experiments using data sets from the UCI Machine Learning Repository and simulation data.
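The abstract above does not say how the strata are formed or how the sample feeds into SOM training, so the following is only a sketch of proportional stratified sampling in general. The function `stratified_sample`, the `stratum_of` callback, and the toy data are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(records, stratum_of, fraction, seed=0):
    """Draw the same fraction from every stratum (proportional allocation),
    taking at least one record from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))  # sampling without replacement
    return sample


# Hypothetical data: 80 records in stratum "A" and 20 in stratum "B".
data = [(i, "A") for i in range(80)] + [(i, "B") for i in range(20)]
subset = stratified_sample(data, stratum_of=lambda rec: rec[1], fraction=0.1)
```

The resulting `subset` preserves the 80/20 balance of the strata, which is the property a stratified design is meant to guarantee compared with simple random sampling.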

Construction of Multiple Classifier Systems based on a Classifiers Pool (인식기 풀 기반의 다수 인식기 시스템 구축방법)

  • Kang, Hee-Joong
    • Journal of KIISE:Software and Applications / v.29 no.8 / pp.595-603 / 2002
  • Only a few studies have addressed how to select multiple classifiers from a pool of available classifiers so as to achieve good classification performance. The problem of which classifiers to select, and how many, therefore remains an important research issue. In this paper, assuming that the number of selected classifiers is fixed in advance, a variety of selection criteria are proposed and applied to the construction of multiple classifier systems, and the criteria are then evaluated by the performance of the constructed systems. All possible sets of classifiers are examined by the selection criteria, and some of these sets are selected as candidate multiple classifier systems. The candidates were evaluated in experiments on recognizing unconstrained handwritten numerals obtained from Concordia University and the UCI Machine Learning Repository. Among the selection criteria, the candidates chosen by the information-theoretic criteria based on conditional entropy showed more promising results than those chosen by the other criteria.
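The abstract above does not give the exact form of the conditional-entropy criterion. One plausible reading, sketched here as an assumption, is to score every subset of a fixed size k by the conditional entropy of the true label given the joint decisions of the subset, and to keep the subset with the lowest score. The function names and the toy classifier outputs are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def conditional_entropy(labels, decisions):
    """H(label | decisions) in bits, estimated from co-occurrence counts."""
    n = len(labels)
    joint = Counter(zip(decisions, labels))
    marginal = Counter(decisions)
    return -sum((c / n) * math.log2(c / marginal[d]) for (d, _), c in joint.items())


def select_classifiers(outputs, labels, k):
    """Exhaustively examine all size-k subsets of the pool and return the
    one whose joint decisions leave the least uncertainty about the label."""
    best = None
    for subset in combinations(sorted(outputs), k):
        # One tuple of decisions per sample, e.g. (c1's output, c2's output).
        decisions = list(zip(*(outputs[name] for name in subset)))
        h = conditional_entropy(labels, decisions)
        if best is None or h < best[1]:
            best = (subset, h)
    return best


# Hypothetical pool: c1 and c3 make identical decisions, c2 is complementary.
labels = [0, 1, 2, 3]
outputs = {
    "c1": [0, 0, 2, 2],
    "c2": [0, 1, 0, 1],
    "c3": [0, 0, 2, 2],
}
best_subset, best_score = select_classifiers(outputs, labels, k=2)
```

In this toy pool the pair of complementary classifiers is preferred over the pair of redundant ones, which is the behavior such a criterion is intended to reward. Exhaustive search grows combinatorially with the pool size, so this sketch only suits small pools.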

Medical Diagnosis Problem Solving Based on the Combination of Genetic Algorithms and Local Adaptive Operations (유전자 알고리즘 및 국소 적응 오퍼레이션 기반의 의료 진단 문제 자동화 기법 연구)

  • Lee, Ki-Kwang;Han, Chang-Hee
    • Journal of Intelligence and Information Systems / v.14 no.2 / pp.193-206 / 2008
  • Medical diagnosis can be considered a classification task that classifies disease types from patient condition data represented by a set of pre-defined attributes. This study proposes a hybrid genetic-algorithm-based classification method to develop classifiers for multidimensional pattern classification problems related to medical decision making. The classification problem can be solved by identifying separation boundaries that distinguish the various classes in the data pattern. The proposed method fits a finite number of regional agents to the data pattern by combining genetic algorithms and local adaptive operations. The local adaptive operations of an agent include expansion, avoidance, and relocation, one of which is performed according to the agent's fitness value. The classifier system has been tested on well-known medical data sets from the UCI machine learning database, showing superior performance to other methods such as nearest neighbor, decision trees, and neural networks.


Characteristics of Soil Stress using Expansion Liquid Sheet (팽창약액시트를 이용한 지중응력 특성에 관한 연구)

  • Kang, Hyounhoi;Kim, Juho;Chung, Yoonseok;Park, Jeongjun
    • Journal of the Society of Disaster Information / v.13 no.1 / pp.43-50 / 2017
  • In this study, to investigate the strength enhancement and stress transfer effect of the expansion chemicals used to restore soft ground or partial settlement, an expansion solution was prepared and classified by measuring the density and the earth pressure in sandy ground. The expansion reinforcing agent was divided into a relatively high expansion group and a low expansion group and injected into a separate impervious vacuum sheet, and a cementation experiment was performed in the lower part of a homogeneously formed model ground. The results showed a reinforcing effect up to about 15 cm above the expansion reinforcement, and the soil pressure showed a compaction tendency similar to that under a concentrated load of $1.150 \sim 11.298\ \mathrm{t/m^2}$.

Neural networks optimization for multi-dimensional digital signal processing in IoT devices (IoT 디바이스에서 다차원 디지털 신호 처리를 위한 신경망 최적화)

  • Choi, KwonTaeg
    • Journal of Digital Contents Society / v.18 no.6 / pp.1165-1173 / 2017
  • Deep learning, one of the best-known families of machine learning algorithms, has proven its applicability in various applications and is widely used in digital signal processing. However, it is difficult to apply deep learning to IoT devices with limited CPU performance and memory capacity, because a large number of training samples requires a lot of memory and computation time. In particular, when an Arduino with a very small memory capacity of 2 KB to 8 KB is used, there are many limitations in implementing the algorithm. In this paper, we propose a method to optimize the ELM algorithm, which has been shown to be accurate and efficient in various fields, on an Arduino board. Experiments show that multi-class learning is possible for up to 15-dimensional data on an Arduino UNO with 2 KB of memory and for up to 42-dimensional data on an Arduino MEGA with 8 KB of memory. We validated the proposed algorithm using data sets generated by Gaussian mixture modeling and public UCI data sets.

Discretization of Continuous-Valued Attributes considering Data Distribution (데이터 분포를 고려한 연속 값 속성의 이산화)

  • Lee, Sang-Hoon;Park, Jung-Eun;Oh, Kyung-Whan
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.4 / pp.391-396 / 2003
  • This paper proposes a new approach that converts continuous-valued attributes to categorical ones by considering the distribution of the target attributes (classes). With this approach, optimal interval boundaries can be obtained from the distribution of the data itself, without requiring any parameters. For each attribute, the distribution of the target attributes is projected onto a one-dimensional space. This space is then clustered according to criteria such as the density value of each target attribute and the amount of overlap among the density values of the target attributes. The clusters formed in this way are based on the probabilities with which the target attribute of an instance can be predicted, so the resulting interval boundaries minimize the loss of information from the original data. The improved performance of the proposed discretization method is validated using the C4.5 algorithm and data sets from the UCI Machine Learning Repository.

Evaluation on Behavior Characteristics of a Pocketable Expansion Material for Ground Cavity Based on Wheel Tracking Test Results (휠트래킹 시험을 통한 포켓형 지반공동 긴급복구 팽창재료의 거동특성 평가)

  • Park, Jeong-Jun;Kim, Ju-Ho;Kim, Ki-Sung;Kim, Dongwook;Hong, Gigwon
    • Journal of the Korean Geosynthetics Society / v.17 no.1 / pp.75-83 / 2018
  • This paper describes the results of a dynamic stability evaluation using the wheel tracking test and the unconfined compression test, carried out to evaluate the behavior characteristics of the developed pocketable expansion material for emergency restoration of ground cavities. The wheel tracking test showed that, under high load, the settlement increment ratio of ground restored with the expansion material was lower than that of sandy ground. That is, the expansion material was confirmed to restrain settlement owing to its stiffness, and the dynamic stability evaluation gave the same result. The unconfined compression test showed that the pocketable expansion material was able to fully support the load on the restored cavity.

Fuzzy Clustering Model using Principal Components Analysis and Naive Bayesian Classifier (주성분 분석과 나이브 베이지안 분류기를 이용한 퍼지 군집화 모형)

  • Jun, Sung-Hae
    • The KIPS Transactions:PartB / v.11B no.4 / pp.485-490 / 2004
  • In data representation, clustering is a grouping process that combines given data into clusters of similar objects. Various similarity measures have been used in many studies. However, the validity of clustering results is subjective and ambiguous because objective criteria for clustering are difficult to define and in short supply. Fuzzy clustering provides a good method for subjective clustering problems. It performs clustering through a similarity matrix that holds a fuzzy membership value for assigning each object. In this paper, for objective fuzzy clustering, we propose a clustering algorithm that combines principal components analysis as a dimension reduction model with Bayesian learning as a statistical learning theory. To evaluate the performance of the proposed algorithm, the Iris and Glass Identification data from the UCI Machine Learning Repository are used. The experimental results show favorable performance for the proposed model.

Web Mining Using Fuzzy Integration of Multiple Structure Adaptive Self-Organizing Maps (다중 구조적응 자기구성지도의 퍼지결합을 이용한 웹 마이닝)

  • 김경중;조성배
    • Journal of KIISE:Software and Applications / v.31 no.1 / pp.61-70 / 2004
  • It is difficult to find an appropriate web site because the exponentially growing web contains millions of web documents. Web search can be personalized by recommending suitable web sites based on a user profile, but a more efficient method is needed for estimating preference because a user's evaluation of web content reflects many aspects of his or her characteristics. Because a user profile is non-linear, it must be estimated by a classifier, and a combination of classifiers is necessary to capture its diverse properties. The structure adaptive self-organizing map (SASOM), which is suitable for pattern classification and visualization, is an enhanced model of SOM and might be useful for web mining. The fuzzy integral is a combination method that uses the relevance of each classifier, which is defined subjectively. In this paper, the user profile is estimated using an ensemble of independently trained SASOMs combined by the fuzzy integral, and the method is evaluated on the Syskill & Webert UCI benchmark data. Experimental results show that the proposed method performs better than the previous naive Bayes classifier as well as voting of SASOMs.
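The abstract above does not state which fuzzy integral or fuzzy measure was used. As an illustration only, the sketch below computes a Sugeno fuzzy integral per class and picks the class with the highest value. It assumes a simple additive measure built from relevance densities that sum to 1, which is a simplification chosen for brevity; the classifier names, support values, and densities are hypothetical.

```python
def sugeno_integral(supports, densities):
    """Sugeno fuzzy integral of classifier supports with respect to an
    additive fuzzy measure g(A) = sum of the densities of the members of A.

    Sources are visited in decreasing order of support; at each step the
    measure of the sources seen so far is compared with the current support.
    """
    order = sorted(supports, key=supports.get, reverse=True)
    best, measure = 0.0, 0.0
    for name in order:
        measure += densities[name]
        best = max(best, min(supports[name], measure))
    return best


def combine(class_supports, densities):
    """Score every class with the Sugeno integral and return the winner."""
    scores = {c: sugeno_integral(s, densities) for c, s in class_supports.items()}
    return max(scores, key=scores.get), scores


# Hypothetical relevance of three classifiers (assumed to sum to 1).
densities = {"som1": 0.5, "som2": 0.3, "som3": 0.2}
# Hypothetical support each classifier gives to each class for one page.
class_supports = {
    "interesting": {"som1": 0.9, "som2": 0.6, "som3": 0.3},
    "not_interesting": {"som1": 0.1, "som2": 0.4, "som3": 0.7},
}
decision, scores = combine(class_supports, densities)
```

Unlike plain voting, the integral weights each classifier's support by the relevance of the group of classifiers that agree at least that strongly, so a highly relevant classifier can outweigh a less relevant majority.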