Search | Korea Science

Estimation of Drought Index Using CART Algorithm and Satellite Data (CART기법과 위성자료를 이용한 향상된 공간가뭄지수 산정)

Kim, Gwang-Seob;Park, Han-Gyun
Journal of the Korean Association of Geographic Information Studies / v.13 no.1 / pp.128-141 / 2010
Drought indices such as the SPI (Standardized Precipitation Index) and PDSI (Palmer Drought Severity Index), estimated from ground observations, are insufficient to describe the detailed spatial distribution of drought conditions. In this study, a drought index with improved spatial resolution was estimated using the CART algorithm and ancillary data such as MODIS NDVI, MODIS LST, land cover, rainfall, average air temperature, SPI, and PDSI data. The drought index estimated with the proposed approach for the year 2008 provides better spatial information than traditional approaches. The results show that the availability of satellite imagery and associated data makes it possible to obtain improved spatial drought information using a data mining technique, leading to a better understanding of drought conditions and their prediction.
https://doi.org/10.11108/kagis.2010.13.1.128

An Effective Recruits' Assignment Method for Early Job Adaptation of Air-munition Maintenance Airmen Using Datamining Technique (데이터마이닝을 이용한 공군 무기정비병의 조기 숙달을 위한 배속방안 연구)

Kang, Kew-Young;Yoon, Bong-Kyoo
Journal of the military operations research society of Korea / v.37 no.1 / pp.147-159 / 2011
Recently, the military service period has been shortened continuously. Meanwhile, more skilled airmen are needed as the complexity of weapon systems increases. This could lead to disastrous results such as degraded readiness and fighting power. We suggest a method to rapidly improve recruits' maintenance capability by assigning airmen to jobs appropriate to their characteristics using data mining methods (k-means and CART). We focus on the assignment of the air force's air-munition maintenance airmen, since they are required to be more skilled than other airmen. By grouping airmen with the k-means method and deriving classification rules with the CART algorithm, we found that the time airmen need to reach proficiency could be shortened by 1.79 months when they are assigned in the suggested way.

A Study for Improving the Performance of Data Mining Using Ensemble Techniques (앙상블기법을 이용한 다양한 데이터마이닝 성능향상 연구)

Jung, Yon-Hae;Eo, Soo-Heang;Moon, Ho-Seok;Cho, Hyung-Jun
Communications for Statistical Applications and Methods / v.17 no.4 / pp.561-574 / 2010
We studied the performance of 8 data mining algorithms, including decision trees, logistic regression, LDA, QDA, neural networks, and SVM, and their combinations with 2 ensemble techniques, bagging and boosting. In this study, we used 13 data sets with binary responses. Sensitivity, specificity, and misclassification error were used as criteria for comparison.
https://doi.org/10.5351/CKSS.2010.17.4.561
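
The three comparison criteria named in this abstract can be computed from predicted and true binary labels. The following is a minimal sketch, not the authors' code: the function name `binary_metrics` and the sample labels are hypothetical, and the positive class is assumed to be coded as 1.

```python
def binary_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, misclassification error) for 0/1 labels.

    Assumes both classes occur in y_true; otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    sensitivity = tp / (tp + fn)        # share of actual positives detected
    specificity = tn / (tn + fp)        # share of actual negatives detected
    error = (fp + fn) / len(pairs)      # share of all cases misclassified
    return sensitivity, specificity, error


# Made-up labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(binary_metrics(y_true, y_pred))  # (0.75, 0.75, 0.25)
```

Reporting sensitivity and specificity alongside the overall error is useful when the two classes are unbalanced, since a low error rate alone can hide poor detection of the rarer class.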

Predicting Stock Liquidity by Using Ensemble Data Mining Methods

Bae, Eun Chan;Lee, Kun Chang
Journal of the Korea Society of Computer and Information / v.21 no.6 / pp.9-19 / 2016
In the finance literature, stock liquidity, which indicates how readily stocks can be cashed out in the market, has received considerable attention from both academics and practitioners. The reasons are many. First, stock liquidity is known to significantly affect asset pricing. Second, macroeconomic announcements influence liquidity in the stock market. Stock liquidity therefore affects the decisions of both investors and managers. Although there is a great deal of literature on stock liquidity, it is quite clear that no studies have attempted to investigate it as a decision-making problem. Most stock liquidity studies have taken limited views, such as how much liquidity influences stock prices or which variables are significantly associated with it. This paper, however, posits that stock liquidity can be a serious decision-making problem that can be handled with data mining techniques to estimate its future extent with statistical validity. To this end, we collected financial data from a number of manufacturing companies listed on the KRX (Korea Exchange) for the period 2010 to 2013. We chose data from 2010 onward to avoid the aftershocks of the financial crisis that occurred in 2008. We used the Fn-GuidPro system to gather a total of 5,700 financial data sets. The stock liquidity measure was computed following the procedure proposed by Amihud (2002), which is known to provide the best metric for showing the relationship with daily returns. We applied five data mining techniques (or classifiers): Bayesian network, support vector machine (SVM), decision tree, neural network, and ensemble method. The Bayesian networks include GBN (General Bayesian Network), NBN (Naive BN), and TAN (Tree Augmented NBN). The decision trees use CART and C4.5. Regression results were used as the performance benchmark. The ensemble method comes in two types, integrating either two or three classifiers, and uses voting to combine them. Among the single classifiers, CART showed the best performance with 48.2%, compared with 37.18% for regression. Among the ensemble methods, integrating TAN, CART, and SVM gave the best result with 49.25%. In additional analysis by individual industry, relatively stable industries such as electronic appliances, wholesale and retailing, wood, and leather, bags, and shoes showed better performance, above 50%.
https://doi.org/10.9708/jksci.2016.21.6.009
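
The abstract states that the ensemble combines classifiers by voting but gives no further detail. The sketch below shows one plain plurality-vote scheme under stated assumptions: the function name `majority_vote`, the class labels ("high", "mid", "low"), the sample predictions, and the tie-breaking rule (the first-listed classifier wins) are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label list by plurality vote.

    predictions: list of equal-length label lists, one per classifier.
    Ties are broken in favor of the classifier listed first (an assumption).
    """
    combined = []
    for votes in zip(*predictions):          # votes cast for one sample
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first vote, in classifier order, that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


# Made-up predictions from three classifiers on four samples.
tan = ["high", "low", "low", "mid"]
cart = ["high", "high", "low", "low"]
svm = ["low", "high", "low", "high"]
print(majority_vote([tan, cart, svm]))  # ['high', 'high', 'low', 'mid']
```

With three voters and more than two classes, three-way ties are possible, as in the last sample, so some explicit tie-breaking rule is needed in any such scheme.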

A Combinatorial Optimization for Influential Factor Analysis: a Case Study of Political Preference in Korea

Yun, Sung Bum;Yoon, Sanghyun;Heo, Joon
Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.35 no.5 / pp.415-422 / 2017
Finding influential factors from a given clustering result is a typical data science problem. A genetic-algorithm-based method is proposed to derive influential factors, and its performance is compared with two conventional methods, Classification and Regression Tree (CART) and Chi-Squared Automatic Interaction Detection (CHAID), using Dunn's index. To extract the factors influencing preference for political parties in South Korea, the voting results of the 18th presidential election and 'Demographic', 'Health and Welfare', 'Economic', and 'Business' related data were used. Based on the analysis, reverse engineering was implemented. A reverse-engineering-based approach to influential factor analysis can provide a new set of influential variables, which can offer new insight to the data mining field.
https://doi.org/10.7848/ksgpc.2017.35.5.415
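
Dunn's index, used here as the comparison measure, rewards clusterings whose clusters are compact and well separated. The sketch below implements one common variant (smallest distance between points of different clusters divided by the largest within-cluster diameter); the abstract does not say which variant the authors used, and the function name `dunn_index` and the sample points are hypothetical.

```python
import math
from itertools import combinations, product


def dunn_index(clusters):
    """Dunn's index for a list of clusters, each a list of coordinate tuples.

    Requires at least two clusters and at least one cluster with two
    distinct points, so that both the minimum and the maximum are defined.
    """
    # Smallest Euclidean distance between points in different clusters.
    min_between = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p, q in product(a, b)
    )
    # Largest Euclidean distance between points in the same cluster.
    max_within = max(
        math.dist(p, q)
        for c in clusters
        for p, q in combinations(c, 2)
    )
    return min_between / max_within


# Two tight, well-separated clusters (made-up points).
clusters = [[(0, 0), (0, 1)], [(3, 0), (3, 1)]]
print(dunn_index(clusters))  # 3.0
```

A higher value indicates a better clustering under this measure, which is how it can be used to rank sets of candidate factors.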

Modeling of Environmental Survey by Decision Trees

Park, Hee-Chang;Cho, Kwang-Hyun
Journal of the Korean Data and Information Science Society / v.15 no.4 / pp.759-771 / 2004
The decision tree approach is most useful in classification problems; it divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains, such as retail target marketing, fraud detection, data reduction and variable screening, and category merging. We analyze Gyeongnam social indicator survey data using decision tree techniques to obtain environmental information. These decision tree outputs can be used for environmental preservation and improvement.
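
A decision tree divides the search space into rectangular regions because each node splits on a single variable at a single threshold. The sketch below shows one such split chosen by Gini impurity, the criterion commonly used by CART for classification. It is illustrative only: the abstract does not say which algorithm or criterion the authors used, and the names `gini` and `best_split` and the sample data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the list is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best split 'value <= threshold'.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, inf) if all values are identical.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        # Impurity of the two children, weighted by their sizes.
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Made-up one-variable example with two classes.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
print(best_split(values, labels))  # (6.5, 0.0)
```

Repeating this search over every variable and then recursing into the two resulting subsets yields the nested axis-aligned boxes that make up the tree's regions.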

Wireless Internet Service Classification using Data Mining (데이터 마이닝을 이용한 무선 인터넷 서비스 분류기법)

Lee, Seong-Jin;Song, Jong-Woo;Ahn, Soo-Han;Won, You-Jip;Chang, Jae-Sung
Journal of KIISE:Information Networking / v.36 no.3 / pp.153-162 / 2009
Accurately classifying different services that run on various wireless networks and numerous platforms is challenging for service operators. This work focuses on the design and implementation of a classifier that accurately classifies applications captured from the WiBro network. The classifier is built on the notion of a session instead of the commonly used flow. Based on the session information of the given traffic, two classification algorithms are presented: Classification and Regression Tree (CART) and Support Vector Machine (SVM). Both algorithms classify accurately and effectively, with misclassification rates of 0.85% and 0.94%, respectively. This work shows that the CART-based classifier offers ease of interpretation and implementation.

Evaluation on Performance of Accuracy for Analysis and Classification of Data Related to Industrial Accidents (산업재해 데이터의 분석 및 분류를 위한 정확도 성능 평가)

Leem, Young-Moon;Ryu, Chang-Hyun
Proceedings of the Safety Management and Science Conference / 2006.04a / pp.51-56 / 2006
Recently, data mining techniques have been used for the analysis and classification of data related to industrial accidents. The main objective of this study is to compare the performance of algorithms for analyzing industrial accident data. The paper provides a comparative analysis of 5 algorithms, CHAID, CART, C4.5, LR (Logistic Regression), and NN (Neural Network), using ROC charts, lift charts, and response thresholds. Data on 67,278 accidents were analyzed to create risk groups for a number of complications, including the risk of disease and accident. The sample was chosen from data on manufacturing industries over three years (2002-2004) in Korea. According to the results, NN shows excellent performance for the analysis and classification of industrial accident data.

A Comparative Study of Medical Data Classification Methods Based on Decision Tree and System Reconstruction Analysis

Tang, Tzung-I;Zheng, Gang;Huang, Yalou;Shu, Guangfu;Wang, Pengtao
Industrial Engineering and Management Systems / v.4 no.1 / pp.102-108 / 2005
This paper studies medical data classification methods, comparing decision trees and system reconstruction analysis as applied to heart disease medical data mining. The data were collected from patients with coronary heart disease and comprise 1,723 records of 71 attributes each. We use the system reconstruction method to weight the data. We use decision tree algorithms such as induction of decision trees (ID3), C4.5, classification and regression tree (CART), chi-square automatic interaction detector (CHAID), and exhaustive CHAID. We compare the correct classification rate, leaf number, and tree depth of the different decision tree algorithms. The experiments show that weighting the data can improve the correct classification rate on coronary heart disease data but has little effect on tree depth and leaf number.

Modeling of Environmental Survey by Decision Trees

Park, Hee-Chang;Cho, Kwang-Hyun
Korean Data and Information Science Society: Conference Proceedings / 2004.10a / pp.63-75 / 2004
The decision tree approach is most useful in classification problems; it divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains, such as retail target marketing, fraud detection, data reduction and variable screening, and category merging. We analyze Gyeongnam social indicator survey data using decision tree techniques to obtain environmental information. These decision tree outputs can be used for environmental preservation and improvement.
