• Title/Abstract/Keywords: Data Sets


Design of Fuzzy Model for Data Mining

  • Kim, Do-Wan;Joo, Young-Hoon;Park, Jin-Bae
    • Journal of Korean Institute of Intelligent Systems
    • /
    • Vol. 13, No. 1
    • /
    • pp.107-113
    • /
    • 2003
  • A new GA-based methodology using information granules is suggested for the construction of fuzzy classifiers. The proposed scheme consists of three steps: selection of information granules, construction of the associated fuzzy sets, and tuning of the fuzzy rules. First, the genetic algorithm (GA) is applied to develop adequate information granules. The fuzzy sets are then constructed from an analysis of the developed information granules, and an interpretable fuzzy classifier is designed using the constructed fuzzy sets. Finally, the GA is used to tune the fuzzy rules, which can enhance classification performance on misclassified data (e.g., data with unusual patterns or data on the boundaries of the classes). To show the effectiveness of the proposed method, an example, the classification of the Iris data, is provided.
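The abstract above does not give the shape of the fuzzy sets or the rule base, so the following is only a minimal sketch of how a classifier can pick the class whose fuzzy set gives the highest membership degree. The triangular shape, the single feature, and every numeric parameter are illustrative assumptions, not values from the paper; in the paper the sets come from GA-selected information granules.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set (0 outside [left, right])."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets over one feature (petal length in cm).
# These parameters are illustrative only and are not taken from the paper.
FUZZY_SETS = {
    "setosa": (0.5, 1.5, 2.5),
    "versicolor": (2.5, 4.3, 5.5),
    "virginica": (4.5, 5.8, 7.5),
}


def classify(x):
    """Return the label with the highest membership degree, plus all degrees."""
    degrees = {label: triangular(x, *params) for label, params in FUZZY_SETS.items()}
    best_label = max(degrees, key=degrees.get)
    return best_label, degrees
```

A tuning step such as the one the abstract describes would adjust the `(left, peak, right)` parameters so that samples near the class boundaries are classified correctly.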

Modeling sulfuric acid induced swell in carbonate clays using artificial neural networks

  • Sivapullaiah, P.V.;Guru Prasad, B.;Allam, M.M.
    • Geomechanics and Engineering
    • /
    • Vol. 1, No. 4
    • /
    • pp.307-321
    • /
    • 2009
  • The paper employs a feed-forward neural network with the back-propagation algorithm to model time-dependent swell in clays containing carbonate in the presence of sulfuric acid. The oedometer swell percent is estimated at a nominal surcharge pressure of 6.25 kPa to develop 612 data sets for modeling. The input parameters used in the network include time, sulfuric acid concentration, carbonate percentage, and liquid limit. Of the total data sets, 280 (46%) were assigned to training, 175 (29%) to testing, and the remaining 157 (25%) to cross-validation. The network was programmed to process this information and predict the percent swell at any time, given the variables involved. The study demonstrates that it is possible to develop a general BPNN model that predicts time-dependent swell with relatively high accuracy against observed data ($R^2$=0.9986). The results are also compared with a generated non-linear regression model.
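The three-way split and the $R^2$ score reported above can be sketched as follows. The abstract does not say how records were assigned to each subset or which definition of $R^2$ was used, so the seeded random shuffle and the standard coefficient of determination below are assumptions; `split_data` and `r_squared` are hypothetical helper names.

```python
import random


def split_data(records, n_train=280, n_test=175, seed=0):
    """Shuffle records and split them into training, testing, and validation subsets.

    The default sizes follow the counts in the abstract (280 / 175 / remainder).
    The random assignment itself is an assumption.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return train, test, validation


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

With 612 records, the defaults leave 157 records for the validation subset, matching the counts in the abstract.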

Quick Evaluations of the KOMPSAT-1 Orbit Maneuvers Using Small Sets of Real-time GPS Navigation Solutions

  • Lee, Byoung-Sun;Lee, Jeong-Sook;Kim, Jae-Hoon
    • Transactions on Control, Automation and Systems Engineering
    • /
    • Vol. 3, No. 3
    • /
    • pp.196-202
    • /
    • 2001
  • Quick evaluations of two in-plane orbit maneuvers using small sets of real-time GPS navigation solutions were performed for the KOMPSAT-1 spacecraft operation. Real-time GPS navigation solutions of KOMPSAT-1 were collected during the Korean Ground Station (KGS) pass. Only a few sets of position and velocity data, taken after completion of the thruster firing, were used for the quick maneuver evaluations. The results were used to predict antenna pointing data for the next station contact. Normal orbit maneuver evaluations using large sets of playback GPS navigation solutions were also performed, and the results were compared with the quick evaluation results.
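The abstract does not state which orbital quantities were evaluated. As one plausible illustration, an in-plane maneuver changes the semi-major axis, which can be estimated from a single position and velocity sample with the vis-viva relation $a = 1 / (2/r - v^2/\mu)$. The sketch below assumes two-body motion and inertial-frame state vectors; `semi_major_axis` and `mean_semi_major_axis` are hypothetical helper names, not functions from the paper.

```python
import math

MU_EARTH = 398600.4418  # Earth's gravitational parameter, km^3/s^2


def semi_major_axis(position_km, velocity_km_s):
    """Semi-major axis (km) from one inertial position/velocity sample via vis-viva."""
    r = math.sqrt(sum(c * c for c in position_km))
    v_squared = sum(c * c for c in velocity_km_s)
    return 1.0 / (2.0 / r - v_squared / MU_EARTH)


def mean_semi_major_axis(samples):
    """Average the estimate over a small set of (position, velocity) samples."""
    estimates = [semi_major_axis(pos, vel) for pos, vel in samples]
    return sum(estimates) / len(estimates)
```

Comparing `mean_semi_major_axis` for samples taken before and after the thruster firing gives a rough measure of the achieved change. A real evaluation would also have to account for perturbations that the two-body relation ignores.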


Deep Meta Learning Based Classification Problem Learning Method for Skeletal Maturity Indication

  • 민정원;강동중
    • Journal of Korea Multimedia Society
    • /
    • Vol. 21, No. 2
    • /
    • pp.98-107
    • /
    • 2018
  • In this paper, we propose a method for classifying skeletal maturity from a small number of hand-wrist X-ray images using deep learning-based meta-learning. General deep learning techniques require large amounts of data, but in many practical applications such data sets are not available. A lack of training data is usually addressed through transfer learning with models pre-trained on large data sets. However, transfer learning can overfit on a new, unknown task with little data, which results in poor generalization. In addition, obtaining labeled medical images is costly, requiring professional manpower and much time. Therefore, we use meta-learning, in which models pre-trained on a variety of learning tasks can classify using only a small amount of new data. First, we train the meta-model on a separate data set composed of various learning tasks. The network then learns to classify bone maturity using bone maturity data composed of wrist radiographs. Finally, we compare the classification results of a conventional learning algorithm with those of meta-learning, using the same number of training data sets.

Performance evaluation of principal component analysis for clustering problems

  • Kim, Jae-Hwan;Yang, Tae-Min;Kim, Jung-Tae
    • Journal of Advanced Marine Engineering and Technology
    • /
    • Vol. 40, No. 8
    • /
    • pp.726-732
    • /
    • 2016
  • Clustering analysis is widely used in data mining to classify data into categories on the basis of their similarity. Over the decades, many clustering techniques have been developed, including hierarchical and non-hierarchical algorithms. In gene profiling problems, because of the large number of genes and the complexity of biological networks, dimensionality reduction techniques are critical exploratory tools for clustering analysis of gene expression data. Recently, clustering analysis that applies dimensionality reduction techniques has also been proposed. PCA (principal component analysis) is a popular dimensionality reduction method for clustering problems. However, previous studies analyzed the performance of PCA only on full data sets. In this paper, to evaluate the performance of PCA for clustering analysis specifically and robustly, we apply an improved FCBF (fast correlation-based filter), a feature selection method, to supervised clustering data sets, and employ two well-known clustering algorithms: k-means and k-medoids. Computational results from supervised data sets show that the performance of PCA is very poor for large-scale features.
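Of the two clustering algorithms named above, k-means is the simpler to sketch. The version below is a plain Lloyd's-algorithm implementation in standard-library Python, offered as an illustration under stated assumptions rather than the authors' setup: the squared Euclidean distance, the seeded random initialization, and the `initial_centroids` parameter are all assumptions, and the PCA and FCBF steps are not reproduced.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, initial_centroids=None, iterations=100, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    if initial_centroids is not None:
        centroids = list(initial_centroids)
    else:
        centroids = random.Random(seed).sample(points, k)
    assignments = None
    for _ in range(iterations):
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # assignments are stable, so the centroids will not move
        assignments = new_assignments
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignments, centroids
```

To mirror the comparison in the abstract, the same routine would be run once on the original features and once on PCA-reduced features, and the resulting assignments scored against the known class labels.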

Neural Network Applications to Determining Suitable Tree Species for Site-Specific Conditions

  • 김형호;정주상
    • Journal of Korean Society of Forest Science
    • /
    • Vol. 90, No. 4
    • /
    • pp.437-444
    • /
    • 2001
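  • This study applied artificial neural network techniques to derive the forest site environmental factors that determine which tree species suit a given site, and to propose a method for making that determination by analyzing the relationships among those factors. Five major conifer species (P. densiflora for. erecta, L. leptolepis, P. koraiensis, P. densiflora, P. thunbergii) were selected as the target species. First, from a total of 1,320 sample plots, the 40 plots with the highest site index for each species were extracted, giving 200 sample plots in all. Each record holds forest site environmental information on 13 factors for the corresponding plot. The results show that the artificial neural network technique is highly effective for the computerized processing of forest site environmental survey data through pattern classification. By applying the technique to analyze whether the patterns needed for determining suitable species are present, it was possible to identify and remove data that carry patterns with almost no influence on the determination, or that could disrupt the pattern classification process itself because of irregular patterns. In addition, depending on the combination of site factors, the fit of the determination ranged from 77.6% to 91.8%, showing that the technique has very high potential for determining suitable tree species from forest site environmental survey data.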


Design of Service Provision Framework using Medical Big Data

  • 신봉희;전혜경
    • Journal of Convergence for Information Technology
    • /
    • Vol. 9, No. 2
    • /
    • pp.1-6
    • /
    • 2019
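  • In this paper, we design a framework for creating new services by linking medical big data with business. Rather than merely describing the stages of data analysis, the framework clarifies the purpose for which the data are used, performs the corresponding analysis, extracts value from it, and covers the process through to the operation of an actual business or service. The designed framework extends from the basic architecture to the social system model. It is designed so that it can be applied to social systems by reference to the framework, with medical big data as the core data. We expect that a framework designed around basic medical data will enable a variety of medical business partnerships and new services.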

VARIABILITY OF THE LATENT HEAT FLUX DURING 1988-2005

  • Iwasaki, Shinsuke;Kubota, Masahisa
    • Korean Society of Remote Sensing: Conference Proceedings
    • /
    • Korean Society of Remote Sensing, 2008 International Symposium on Remote Sensing
    • /
    • pp.289-292
    • /
    • 2008
  • Recently, several satellite data analysis projects and numerical weather prediction (NWP) reanalysis projects have produced ocean surface Latent Heat Flux (LHF) data sets with global coverage. Comparisons of these LHF data sets showed substantial discrepancies in the LHF values. An increase of LHF over the global ocean during the 1970s-1990s has recently been shown by the LHF data developed in the Objective Analyzed Air-Sea Fluxes (OAFlux) project. It is therefore of interest to investigate whether this increase of LHF over the global ocean also appears in the other LHF products. In this study, we assessed the consistencies and discrepancies of the inter-annual variability and decadal trend for the period 1988-2005 among six LHF products (J-OFURO2, HOAPS3, IFREMER, NCEP1/2 and OAFlux) over the global ocean. As a result, all LHF products showed a positive trend. In particular, the positive trend in the satellite-based data analyses (J-OFURO2, HOAPS3, IFREMER) is larger than that in the reanalysis products (NCEP1/2). Consistencies and discrepancies also appear in the spatial patterns of the LHF trends across the six data sets. The positive trend of LHF is remarkable in the regions of western boundary currents, such as the Kuroshio and the Gulf Stream, in all LHF data sets. However, the spatial patterns of the LHF trends differ in the tropics and subtropics. These discrepancies are primarily caused by differences in the input meteorological state variables used to calculate LHF, particularly the air specific humidity.
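The abstract does not say how the decadal trend was estimated. A common choice, assumed here, is the ordinary least-squares slope of the annual means against the year, scaled to a per-decade value. The helper names `linear_trend` and `decadal_trend` are hypothetical.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (units per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx


def decadal_trend(years, values):
    """Trend expressed per decade (ten times the per-year slope)."""
    return 10.0 * linear_trend(years, values)
```

Applying `decadal_trend` to the 1988-2005 annual means of each product, either globally averaged or grid point by grid point, would give comparable trend values for the kind of inter-product comparison the abstract describes.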


A Study on the Description of Archival Datasets

  • 김포옥;윤수영
    • Journal of the Korean BIBLIA Society for Library and Information Science
    • /
    • Vol. 18, No. 2
    • /
    • pp.39-59
    • /
    • 2007
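  • As the fields that collect and process data using database systems expand rapidly, the need to collect, appraise, preserve, and use datasets in the same way as ordinary records is growing. Nevertheless, interest in datasets within the records management field in Korea remains very low. This paper therefore treats datasets as records and proposes basic items for managing them systematically. Following the international standard ISAD(G), we closely examined and analyzed the description elements of RAD, MAD, and NDAD, the last of which recognizes datasets as records and provides services for them. Using the description areas of ISAD(G) as the basis, we propose the description areas needed for describing datasets in Korea and the main description elements within each area.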

Tests for Equality of Two Distributions with Life-Table Model

  • 강신수
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 12, No. 2
    • /
    • pp.71-82
    • /
    • 2001
  • There are several ways to test the equality of two survival distributions under a variety of situations. Tests for the equality of two distributions with the life-table model for univariate independent response times are reviewed and introduced. A methodology is then developed to test this equality for correlated response times where treatments are applied to different independent sets of cohorts. Data from an angioplasty study in which more than one procedure is performed on some patients, which can be separated into two independent sets, are used to illustrate this methodology.
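The abstract does not describe the test statistic, so it is not reproduced here. As background for the life-table model it refers to, the standard actuarial estimate of survival can be sketched as follows: within each interval, withdrawals are counted as at risk for half the interval. The function name and input layout are assumptions made for illustration.

```python
def life_table_survival(n_at_start, intervals):
    """Actuarial (life-table) survival estimates.

    intervals: list of (events, withdrawals) pairs, one per interval, in time order.
    Returns the estimated probability of surviving past the end of each interval.
    """
    survival = []
    at_risk = n_at_start
    cumulative = 1.0
    for events, withdrawals in intervals:
        # Withdrawals are assumed to be at risk for half of the interval.
        effective = at_risk - withdrawals / 2.0
        q = events / effective if effective > 0 else 0.0
        cumulative *= 1.0 - q
        survival.append(cumulative)
        at_risk -= events + withdrawals
    return survival
```

Computing this curve separately for each of the two independent sets gives the two survival distributions whose equality a test of the kind the abstract describes would compare.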
