• Title/Abstract/Keywords: Decision trees

299 search results (processing time: 0.035 s)

A Dynamic Feature Weighting Method for Case-based Reasoning

  • 이재식;전용준
    • 지능정보연구 / Vol. 7, No. 1 / pp.47-61 / 2001
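  • Lazy learning techniques such as case-based reasoning have several advantages over eager learning techniques such as artificial neural networks and decision trees. However, the performance of lazy learning techniques degrades when the case representation includes features of low relevance. To overcome this drawback, feature weighting methods have been studied. Most existing feature weighting methods assign feature weights globally. This study proposes CBDFW, a new local feature weighting method. CBDFW stores whether randomly generated feature weights led to successful classification and, when a new case is given, retrieves the weights that produced successful classifications and dynamically generates new weights from them. In experiments on credit evaluation data, CBDFW achieved higher classification accuracy than that reported in previous studies.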

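The abstract above describes the idea only in outline, so the following is a minimal sketch under stated assumptions, not the authors' CBDFW implementation. It assumes a leave-one-out 1-nearest-neighbor check to decide whether a random weight vector "succeeded", and it assumes that new weights are produced by averaging the successful vectors stored for the cases nearest to the query. All function names and parameters (`build_weight_memory`, `dynamic_weights`, `n_trials`, `k`) are hypothetical.

```python
import random

def weighted_dist(a, b, w):
    # Squared Euclidean distance with one weight per feature.
    return sum(wi * (x - y) ** 2 for wi, x, y in zip(w, a, b))

def nearest_label(query, cases, w, skip=None):
    # Label of the stored case closest to `query` under weights `w`.
    candidates = [i for i in range(len(cases)) if i != skip]
    best = min(candidates, key=lambda i: weighted_dist(query, cases[i][0], w))
    return cases[best][1]

def build_weight_memory(cases, n_trials=20, seed=0):
    # Assumption: a random weight vector "succeeds" for a case when
    # leave-one-out 1-NN with those weights recovers the case's label.
    rng = random.Random(seed)
    n_features = len(cases[0][0])
    memory = []
    for i, (features, label) in enumerate(cases):
        for _ in range(n_trials):
            w = [rng.random() for _ in range(n_features)]
            if nearest_label(features, cases, w, skip=i) == label:
                memory.append((features, w))
    return memory

def dynamic_weights(query, memory, k=5):
    # Assumption: average the successful weight vectors stored for the
    # k memory entries closest to the query (unweighted distance).
    n_features = len(query)
    if not memory:
        return [1.0] * n_features
    uniform = [1.0] * n_features
    nearest = sorted(memory, key=lambda m: weighted_dist(query, m[0], uniform))[:k]
    return [sum(w[j] for _, w in nearest) / len(nearest) for j in range(n_features)]

def classify(query, cases, memory, k=5):
    # Classify with weights generated locally for this particular query.
    return nearest_label(query, cases, dynamic_weights(query, memory, k))

# Toy data (hypothetical): each case is (feature tuple, class label).
cases = [((0.0, 0.0), 0), ((0.5, 0.2), 0), ((10.0, 10.0), 1), ((9.5, 9.8), 1)]
memory = build_weight_memory(cases, n_trials=10, seed=0)
print(dynamic_weights((0.1, 0.1), memory))
print(classify((0.1, 0.1), cases, memory))
```

The point of the sketch is the local character of the weights: they are recomputed for every query from what worked near that query, rather than fixed once for the whole case base.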

Analysis on Visual Preference Elements of Riverscape Axis

  • 김용수;정계순;김수봉
    • 한국조경학회지 / Vol. 26, No. 2 / pp.101-109 / 1998
  • Recently, improving the quality of the urban riverscape environment has been emphasized not only in landscape architecture but also by professionals in planning and ecology. In line with this movement, this paper aims to identify the major visual elements of the riverscape axis, using the Shinchon River in Taegu City as a case study, and to suggest basic guidelines for arranging riverscapes in urban areas. The study was mainly based on the Repertory Grid Development method, which was developed in Japan. The method consists of three steps: deciding the element landscapes in the study area for slide photos, selecting evaluation items for interviews, and abstracting proper evaluation factors. The major findings are as follows. 1) The 12 major visual elements that could improve the riverscape, based on the abstraction of proper evaluation factors, are, in order: Dunchi, surface of the water, equipment of river, buildings near the riverside, vertical and horizontal river facilities such as bridges, fine view, riverbed, water plants, naturalness, water's edge line, harmony, and street trees. 2) A total of 25 adjectives describe the 12 common factors, such as clean, open, stable, quiet, comfortable, friendly, bright, and natural. In addition, Dunchi was described 337 times by various adjectives, surface of the water 200 times, and arrangement of river 146 times, which is similar to the order of the 12 influential common factors. 3) Therefore, Dunchi, surface of the water, and equipment of river are the three most important factors for creating a better riverscape. These three factors suggest how a good-quality urban river environment can be provided for urban residents.


Classification of Piperazinylalkylisoxazole Library by Recursive Partitioning

  • Kim, Hye-Jung;Park, Woo-Kyu;Cho, Yong-Seo;No, Kyoung-Tai;Koh, Hun-Yeong;Choo, Hyun-Ah;Pae, Ae-Nim
    • Bulletin of the Korean Chemical Society / Vol. 29, No. 1 / pp.111-116 / 2008
  • A piperazinylalkylisoxazole library containing 86 compounds was constructed and evaluated for binding affinity to dopamine (D3) and serotonin (5-HT2A/2C) receptors in order to develop antipsychotics. Dopamine antagonists (DA) showing selectivity for the D3 receptor over the D2 receptor, serotonin antagonists (SA), and serotonin-dopamine dual antagonists (SDA) were identified based on their binding affinity and selectivity. The analogues were divided into three groups: 7 DAs (D3), 33 SAs (5-HT2A/2C), and 46 SDAs (D3 and 5-HT2A/2C). A classification model was generated to identify the structural characteristics of antagonists with different affinity profiles. On the basis of the results of our previous study, we generated decision trees by the recursive-partitioning (RP) method using Cerius2 2D descriptors, and identified and interpreted the descriptors that discriminate the in-house antipsychotic compounds.
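As a rough illustration of what recursive partitioning does, the sketch below grows a small classification tree by repeatedly choosing the binary split with the lowest weighted Gini impurity. It is a generic textbook version, not the Cerius2 procedure used in the paper; the split criterion, the `max_depth` stopping rule, and the two numeric "descriptor" columns in the toy data are all assumptions made for the example. Only the class labels (DA, SA, SDA) come from the abstract.

```python
from collections import Counter

def gini(labels):
    # Gini impurity of a list of class labels.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    # Return (feature index, threshold, weighted impurity) of the best
    # binary split, or None when no feature has two distinct values.
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best

def grow(rows, labels, depth=0, max_depth=3):
    # Recursively partition until a node is pure, cannot be split,
    # or max_depth is reached. Leaves hold the majority class.
    split = best_split(rows, labels)
    if gini(labels) == 0.0 or depth == max_depth or split is None:
        return Counter(labels).most_common(1)[0][0]
    j, t, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return (j, t,
            grow([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            grow([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    # Internal nodes are tuples (feature, threshold, left, right);
    # anything else is a leaf label.
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

# Toy data: two hypothetical numeric descriptors per compound.
rows = [(1.0, 3.0), (2.0, 7.0), (5.0, 1.0), (6.0, 2.0), (5.5, 8.0), (6.5, 9.0)]
labels = ["DA", "DA", "SA", "SA", "SDA", "SDA"]
tree = grow(rows, labels)
print(tree)  # (0, 3.5, 'DA', (1, 5.0, 'SA', 'SDA'))
print(predict(tree, (6.0, 8.5)))  # SDA
```

Reading the printed tree from the root down gives the kind of interpretable rule the abstract refers to: which descriptor is tested first, at what threshold, and which class each branch leads to.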

An Empirical Analysis of Boosting of Neural Networks for Bankruptcy Prediction

  • 김명종;강대기
    • 한국정보통신학회논문지 / Vol. 14, No. 1 / pp.63-69 / 2010
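  • Among the various methods recently proposed in machine learning to improve classifier accuracy, ensemble learning is one of those that have attracted the most attention. However, while ensemble learning markedly improves the performance of unstable learning algorithms such as decision trees, studies of its effect on stable learning algorithms such as artificial neural networks have reached conflicting conclusions depending on the application domain and implementation method. In this study, we applied an artificial neural network classifier and a boosting classifier, a representative ensemble learning technique, to the problem of predicting the failure of Korean firms, and verified that ensemble learning can improve the performance of a conventional artificial neural network in bankruptcy prediction.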

Data Mining for High Dimensional Data in Drug Discovery and Development

  • Lee, Kwan R.;Park, Daniel C.;Lin, Xiwu;Eslava, Sergio
    • Genomics & Informatics / Vol. 1, No. 2 / pp.65-74 / 2003
  • Data mining differs from traditional data analysis primarily on one important dimension, namely the scale of the data. That is why not only statistical but also computer science principles are needed to extract information from large data sets. In this paper we briefly review data mining, its characteristics, typical data mining algorithms, and potential and ongoing applications of data mining in the biopharmaceutical industry. The distinguishing characteristics of data mining lie in its understandability, scalability, its problem-driven nature, and its analysis of retrospective or observational data in contrast to experimentally designed data. At a high level one can identify three types of problems for which data mining is useful: description, prediction, and search. The data mining algorithms briefly reviewed include decision trees and rules, nonlinear classification methods, memory-based methods, model-based clustering, and graphical dependency models. Application areas covered are discovery compound libraries, clinical trial and disease management data, genomics and proteomics, structural databases for candidate drug compounds, and other applications of pharmaceutical relevance.

Multi-criteria Decision Making Method for Developing Greenhouse Gas Technologies Strategically Considering Scale Efficiency: AHP/DEA CCR-I and BCC-I Integrated Model Approach

  • 이성곤;겐토모기;김종욱
    • 한국수소및신에너지학회논문집 / Vol. 19, No. 6 / pp.552-560 / 2008
  • In 1997, the Korean government established the National Energy and Resources Plan, which covered 1997 to 2005 and targeted strategic energy technology development. At the end of 2005, the Korean government drew up a New National Energy and Resources Plan for the following 10 years, from 2006 to 2015, based on energy technology trees, in contrast to the previous plan, which was based on energy R&D projects. In this research, as an extension of earlier work from an econometric viewpoint, we prioritize relative preferences and efficiency using an integrated AHP/DEA CCR-I and BCC-I model that considers scale efficiency, in order to support well-focused R&D and the efficient development of greenhouse gas technologies.

A Study on the Analysis of Comparison of Churn Prediction Models in Mobile Telecommunication Services

  • 김충영;장남식;김준우
    • Asia pacific journal of information systems / Vol. 12, No. 1 / pp.139-158 / 2002
  • As the telecommunication market in Korea matures, severe competition has already begun. While service providers struggled for the last couple of years to acquire as many new customers as possible, they are now putting more effort into retaining current customers. Churn management based on analysis of customers' demographic and transactional data has become one of the key customer retention strategies that most companies pursue. However, customer data analysis in the industry has remained at a basic level, even though it has considerable potential as a tool for understanding customer behavior. This paper develops several churn prediction models using data mining techniques such as logistic regression, decision trees, and neural networks. For model building, real data collected from one of the major telecommunication companies in Korea were used. This paper explores various ways of comparing model performance, whereas previous research focused mainly on the hit ratio. The comparison criteria used in this study include gain ratio, Kolmogorov-Smirnov statistics, distribution of the predicted values, and explanation ability. This paper also suggests some guidance for model selection when applying data mining techniques.
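To make two of the comparison criteria named above concrete, here is a small sketch that computes the hit ratio and the Kolmogorov-Smirnov statistic from a model's predicted churn scores. The abstract does not say how the paper computed them, so this uses the usual definitions: the hit ratio is the share of customers classified correctly at a cutoff, and the K-S statistic is the largest gap between the cumulative score distributions of churners and non-churners. The scores, labels, and the 0.5 cutoff are made-up values for illustration.

```python
def hit_ratio(scores, labels, threshold=0.5):
    # Fraction of customers whose thresholded score matches the label
    # (label 1 = churner, 0 = non-churner).
    hits = sum(1 for s, y in zip(scores, labels) if (s >= threshold) == (y == 1))
    return hits / len(labels)

def ks_statistic(scores, labels):
    # Largest gap between the empirical cumulative score distributions
    # of churners and non-churners.
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(1 for s in pos if s <= t) / len(pos)
        cdf_neg = sum(1 for s in neg if s <= t) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best

# Hypothetical scores from one model; every churner outranks every
# non-churner, but one non-churner scores above the 0.5 cutoff.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0, 0, 0]
print(hit_ratio(scores, labels))     # 0.875
print(ks_statistic(scores, labels))  # 1.0
```

In this toy case the two criteria disagree about how good the model is: the ranking separates the classes perfectly (K-S of 1.0) while the hit ratio at the fixed cutoff is only 0.875, which is one reason to compare models on more than the hit ratio.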

Analysis of Elementary Students' Smartphone Addiction Level by Demographic Features

  • 이수정
    • 컴퓨터교육학회논문지 / Vol. 17, No. 6 / pp.1-8 / 2014
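  • The use of smartphones has recently increased rapidly across all age groups, giving rise to the problem of smartphone addiction. This study analyzes the factors influencing smartphone addiction among elementary school students, focusing on demographic variables. First, we analyzed differences in the distribution of addiction groups by each factor and differences in the distribution of the most frequently used smartphone functions. The distribution of addiction groups differed most by grade and academic performance, and the functions used differed by gender, grade, and academic performance. The distribution of functions used also differed significantly across addiction groups. In addition, we analyzed the factors affecting smartphone addiction using logit regression and a decision tree; the most influential factors were, in order, grade, academic performance, whether both parents work, and residential area.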


Minimizing the MOLAP/ROLAP Divide: You Can Have Your Performance and Scale It Too

  • Eavis, Todd;Taleb, Ahmad
    • Journal of Computing Science and Engineering / Vol. 7, No. 1 / pp.1-20 / 2013
  • Over the past generation, data warehousing and online analytical processing (OLAP) applications have become the cornerstone of contemporary decision support environments. Typically, OLAP servers are implemented either on top of proprietary array-based storage engines (MOLAP) or as extensions to conventional relational DBMSs (ROLAP). While MOLAP systems do indeed provide impressive performance on common analytics queries, they tend to have limited scalability. Conversely, ROLAP's table-oriented model scales quite nicely, but offers mediocre performance at best relative to MOLAP systems. In this paper, we describe a storage and indexing framework that aims to provide both MOLAP-like performance and ROLAP-like scalability by essentially combining some of the best features of both. Based upon a combination of R-trees and bitmap indexes, the storage engine has been integrated with a robust OLAP query engine prototype that is able to fully exploit the efficiency of the proposed storage model. Specifically, it utilizes an OLAP algebra, coupled with a domain-specific query optimizer, to map user queries directly to the storage and indexing framework. Experimental results demonstrate that not only does the design improve upon more naive approaches, but that it does indeed offer the potential to optimize both query performance and scalability.

IMPROVEMENT OF THE LOCA PSA MODEL USING A BEST-ESTIMATE THERMAL-HYDRAULIC ANALYSIS

  • Lee, Dong Hyun;Lim, Ho-Gon;Yoon, Han Young;Jeong, Jae Jun
    • Nuclear Engineering and Technology / Vol. 46, No. 4 / pp.541-546 / 2014
  • Probabilistic Safety Assessment (PSA) has been widely used to estimate the overall safety of nuclear power plants (NPP) and it provides base information for risk informed application (RIA) and risk informed regulation (RIR). For the effective and correct use of PSA in RIA/RIR related decision making, the risk estimated by a PSA model should be as realistic as possible. In this work, a best-estimate thermal-hydraulic analysis of loss-of-coolant accidents (LOCAs) for the Hanul Nuclear Units 3&4 is first carried out in a systematic way. That is, the behaviors of peak cladding temperature (PCT) were analyzed with various combinations of break sizes, the operating conditions of safety systems, and the operator's action time for aggressive secondary cooling. Thereafter, the results of the thermal-hydraulic analysis have been reflected in the improvement of the PSA model by changing both accident sequences and success criteria of the event trees for the LOCA scenarios.