Integrated Search | Korea Science

Comparison of Binary Discretization Algorithms for Data Mining

Na, Jong-Hwa;Kim, Jeong-Mi;Cho, Wan-Sup
- Journal of the Korean Data and Information Science Society / Vol. 16, No. 4 / pp.769-780 / 2005
Discretization algorithms for continuous data have been actively studied in recent years, but few articles compare their efficiency. In this paper we introduce the principles of several binary discretization algorithms, including C4.5, CART and QUEST, and investigate their efficiency through a numerical study. For various underlying distributions, we compare the algorithms in terms of misclassification rate and MSE. Real data examples are also included.
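As a rough illustration of what a binary discretization step does, the sketch below picks a single cut point for one continuous attribute by maximizing information gain, in the spirit of the entropy-based splitting associated with C4.5. It is not taken from the paper: the function names `entropy` and `best_binary_split`, the use of midpoints as candidate thresholds, and the toy data are all assumptions made for illustration. CART and QUEST typically rely on other criteria (Gini impurity and statistical tests, respectively), which are not shown here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_binary_split(values, labels):
    """Return (threshold, information_gain) for the best cut `x <= threshold`.

    Candidate thresholds are midpoints between consecutive distinct
    sorted values (an assumption of this sketch).
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    base = entropy([y for _, y in pairs])
    best_threshold, best_gain = None, 0.0
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no cut point between equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best_gain:
            best_threshold, best_gain = (lo + hi) / 2, gain
    return best_threshold, best_gain


# Toy data: two well-separated classes on one attribute.
threshold, gain = best_binary_split([1, 2, 3, 10, 11, 12], ["a", "a", "a", "b", "b", "b"])
print(threshold, gain)
```

On data where no cut improves on the unsplit entropy, the function returns `None` as the threshold, which a caller would treat as "do not discretize this attribute".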

Discretization of Continuous Attributes Based on Rough Set Theory and SOM

서완석;김재련
- 산업경영시스템학회지 / Vol. 28, No. 1 / pp.1-7 / 2005
In recent years, data mining has been widely used in the information industry to turn huge amounts of data into useful information and knowledge. When analyzing a data set with continuous values in order to gain knowledge through data mining, we often apply a process called discretization, which divides an attribute's values into intervals. These intervals form new values for the attribute and reduce the size of the data set. In addition, discretization based on rough set theory has the advantage of being easy to apply. In this paper, we suggest a discretization algorithm based on rough set theory and SOM (Self-Organizing Map) as a means of extracting valuable information from large data sets, which can be employed even when expert knowledge of the field is lacking.

Time-Discretization of Nonlinear Systems with Delayed Multi-Input Using Taylor Series

Park, Ji-Hyang;Chong, Kil-To;Nikolaos Kazantzis;Alexander G. Parlos
- Journal of Mechanical Science and Technology / Vol. 18, No. 7 / pp.1107-1120 / 2004
This study proposes a new scheme for the sampled-data representation of nonlinear systems with time-delayed multi-input. The proposed scheme is based on the Taylor-series expansion and zero-order hold assumption. The mathematical structure of a new discretization scheme is explored. On the basis of this structure, the sampled-data representation of nonlinear systems including time-delay is derived. The new scheme is applied to nonlinear systems with two inputs and then the delayed multi-input general equation is derived. The resulting time-discretization provides a finite-dimensional representation of nonlinear control systems with time-delay enabling existing controller design techniques to be applied to them. In order to evaluate the tracking performance of the proposed scheme, an algorithm is tested for some of the examples including maneuvering of an automobile and a 2-DOF mechanical system.

A Continuous Optimization Algorithm Using Equal Frequency Discretization Applied to a Fictitious Play

이창용
- 산업경영시스템학회지 / Vol. 36, No. 2 / pp.8-16 / 2013
In this paper, we propose a new method for determining the strategies required by a continuous optimization algorithm based on fictitious play theory. To apply fictitious play theory to continuous optimization problems, the continuous values of a variable must be expressed in terms of discrete strategies. We propose a method in which the selected real values are sorted by magnitude and every strategy contains an equal number of them. To compare the characteristics and performance of the proposed strategy representation with those of the conventional method, we applied it to two types of benchmark functions: separable and inseparable. The experimental results suggest that, for separable functions, the proposed method not only outperforms the conventional method but is also more stable. For inseparable functions, by contrast, the optimization performance depends on the benchmark function. In particular, there is a rather strong correlation between performance and stability regardless of the benchmark function.
https://doi.org/10.11627/jkise.2013.36.2.8
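The equal-frequency idea described in the abstract above (sort the sampled real values, then give every strategy the same number of them) can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `equal_frequency_bins`, the integer-arithmetic boundary rule, and the sample values are hypothetical and are not the author's implementation.

```python
def equal_frequency_bins(values, k):
    """Split `values` into `k` bins holding (nearly) equal numbers of points.

    Values are sorted first; bin i receives the slice
    [i*n//k, (i+1)*n//k) of the sorted list, so bin sizes differ by at
    most one when n is not divisible by k.
    """
    ordered = sorted(values)
    n = len(ordered)
    if k <= 0 or k > n:
        raise ValueError("k must satisfy 1 <= k <= len(values)")
    return [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]


# Six sampled values of one variable, grouped into three "strategies".
bins = equal_frequency_bins([5.0, 1.0, 4.0, 2.0, 6.0, 3.0], 3)
print(bins)  # [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
```

Unlike equal-width binning, the interval boundaries here follow the data: dense regions of the search space get narrow bins and sparse regions get wide ones.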

A Study on an Effective Higher-Order Taylor-Galerkin Method for the Analysis of Structural Dynamics

윤성기;박상훈
- 소음진동 / Vol. 3, No. 4 / pp.353-359 / 1993
In this study, the Taylor-Galerkin method is modified to take into account the third-order term in the Taylor series of the fundamental variable. In the Taylor-Galerkin method, after the governing equation of motion is expressed in conservation form, temporal discretization is performed first and spatial discretization follows, in contrast to conventional approaches. A predictor-corrector type algorithm was developed previously by the same author. A new, computationally efficient direct algorithm is proposed in this study. The convergence and accuracy of the solution are examined. Numerical examples show that the new algorithm achieves the same order of accuracy with less computational effort.

Discretization of Numerical Attributes and Approximate Reasoning by using Rough Membership Function

권은아;김홍기
- 한국정보과학회논문지:데이타베이스 / Vol. 28, No. 4 / pp.545-557 / 2001
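This paper proposes a hierarchical approximate classification algorithm that introduces the concept of rough membership function values, in order to refine the information system of a stored database into understandable information and to enable approximate reasoning about new objects. The proposed algorithm follows fuzzy reasoning, one method of approximate reasoning, in which linguistic uncertainty is expressed as fuzzy membership function values of the attributes and inference is performed by composing the membership function values of the condition attributes, but it uses rough membership function values in place of fuzzy ones. This has the advantage that the step of generating fuzzy rules from fuzzy membership function values can be omitted. We also study discretization of the numerical attributes in an information system and propose a discretization method that uses rough membership function values together with the information-theoretic concept of entropy. In experiments on the IRIS data, which is used as a standard in pattern classification problems, the proposed algorithm achieved a classification rate of 96% to 98%, and on other experimental data it outperformed existing algorithms in both numerical discretization and approximate reasoning.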

Discretization of Continuous-Valued Attributes for Classification Learning

이창환
- 한국정보처리학회논문지 / Vol. 4, No. 6 / pp.1541-1549 / 1997
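Most machine learning methods require discrete data as the input format for learning. Continuous data must therefore be converted into discrete form before these methods can be applied. Despite its importance, this discretization step has received relatively little research attention. This paper proposes a new method that uses information theory to convert continuous data into discrete form. The amount of information that each attribute value provides about the target attribute is computed with the Hellinger divergence, a kind of entropy function, and for each attribute the discretization boundaries are computed so as to minimize the loss of that information. The performance of the proposed method was compared with that of existing discretization methods using ID3 and a neural network algorithm, and it showed better accuracy in nearly all cases.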

Decomposition-based Process Planning for Layered Manufacturing of Functionally Gradient Materials

신기훈;김성환
- 한국CDE학회논문집 / Vol. 11, No. 3 / pp.223-233 / 2006
Layered manufacturing (LM) is emerging as a new technology that enables the fabrication of three-dimensional heterogeneous objects such as multi-materials and Functionally Gradient Materials (FGMs). Among the various types of heterogeneous objects, the fabrication of FGMs has recently received more attention because of their potential in engineering applications. The necessary steps for LM fabrication of FGMs include representation and process planning of the material information inside an FGM. This paper introduces a new process planning algorithm that takes into account the processing of material information. The detailed tasks are discretization (i.e., decomposition-based approximation of volume fraction), orientation (build direction selection), and adaptive slicing of heterogeneous objects. In particular, this paper focuses on the discretization process, which converts all of the material information inside an FGM into material features analogous to geometric features. Because total build time depends on the complexity of the features, it then becomes possible to choose an optimal build direction among several pre-selected ones by approximately estimating build time. The discretization process also allows adaptive slicing of heterogeneous objects to minimize surface finish and material composition errors. In addition, tool path planning can be simplified into fill pattern generation. Specific examples are shown to illustrate the overall procedure.

Goal-oriented multi-collision source algorithm for discrete ordinates transport calculation

Wang, Xinyu;Zhang, Bin;Chen, Yixue
- Nuclear Engineering and Technology / Vol. 54, No. 7 / pp.2625-2634 / 2022
Discretization errors are extremely challenging conundrums of discrete ordinates calculations for radiation transport problems with void regions. In previous work, we have presented a multi-collision source method (MCS) to overcome discretization errors, but the efficiency needs to be improved. This paper proposes a goal-oriented algorithm for the MCS method to adaptively determine the partitioning of the geometry and dynamically change the angular quadrature in remaining iterations. The importance factor based on the adjoint transport calculation obtains the response function to get a problem-dependent, goal-oriented spatial decomposition. The difference in the scalar fluxes from one high-order quadrature set to a lower one provides the error estimation as a driving force behind the dynamic quadrature. The goal-oriented algorithm allows optimizing by using ray-tracing technology or high-order quadrature sets in the first few iterations and arranging the integration order of the remaining iterations from high to low. The algorithm has been implemented in the 3D transport code ARES and was tested on the Kobayashi benchmarks. The numerical results show a reduction in computation time on these problems for the same desired level of accuracy as compared to the standard ARES code, and it has clear advantages over the traditional MCS method in solving radiation transport problems with reflective boundary conditions.
https://doi.org/10.1016/j.net.2022.01.020

Incremental Generation of a Decision Tree Using Global Discretization for Large Data

한경식;이수원
- 정보처리학회논문지B / Vol. 12B, No. 4 / pp.487-498 / 2005
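Recently, much attention has focused on tree generation methods that can handle large volumes of data. However, most algorithms for large data process the data in batch mode, so when new data is added the decision tree must be rebuilt from scratch to reflect it. A more efficient way to address the cost of this regeneration is to generate the decision tree incrementally. Representative algorithms include BOAT and ITI, both of which use local discretization to handle numerical data. Because discretization requires numerical data in sorted order, however, global discretization, which sorts the entire data set only once, is better suited to large data than local discretization, which sorts at every node. This paper proposes an incremental tree generation method that efficiently regenerates a tree built with global discretization for numerical data. When new data is added, incrementally generating a tree based on global discretization requires, first, regenerating the intervals to reflect the new data and, second, changing the tree structure to match the changed intervals. For efficient interval regeneration, we propose a technique that extracts sample split points and performs discretization from them; to adapt the tree structure to the changed intervals, we use confidence intervals and a tree restructuring technique. We compared the proposed method experimentally with the existing approach based on local discretization, using the People database.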
https://doi.org/10.3745/KIPSTB.2005.12B.4.487

