Search | Korea Science

A Comparison Study of Classification Algorithms in Data Mining

Lee, Seung-Joo; Jun, Sung-Rae
- International Journal of Fuzzy Logic and Intelligent Systems / v.8 no.1 / pp.1-5 / 2008
Data mining tools generally fall into two learning types: supervised and unsupervised. Classification and prediction are the main analysis tools for supervised learning. In this paper, we compare popular classification algorithms in data mining: LDA, QDA, the kernel method, K-nearest neighbor, naive Bayes, SVM, and CART. We use almost all of the classification data sets in the UCI Machine Learning Repository for our experiments. Based on the results, we are able to select suitable algorithms for given classification data sets.
https://doi.org/10.5391/IJFIS.2008.8.1.001
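
As a rough illustration of what comparing classifiers on a data set involves, the sketch below scores two simple classifiers with leave-one-out validation. It is not the paper's experimental setup: the toy data, the nearest-centroid baseline, and all function names are assumptions made for this example, and the paper's UCI data sets and full algorithm list are not reproduced.

```python
from collections import Counter
from math import dist

# Toy 2-D data set (hypothetical, not taken from the UCI repository).
DATA = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.8, 1.1), "a"), ((1.1, 1.3), "a"),
    ((3.0, 3.0), "b"), ((3.2, 2.9), "b"), ((2.8, 3.1), "b"), ((3.1, 3.3), "b"),
]

def knn_predict(train, x, k=3):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda item: dist(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def centroid_predict(train, x):
    """Assign x to the class whose mean point is closest."""
    groups = {}
    for point, label in train:
        groups.setdefault(label, []).append(point)
    centroids = {
        label: tuple(sum(c) / len(pts) for c in zip(*pts))
        for label, pts in groups.items()
    }
    return min(centroids, key=lambda label: dist(centroids[label], x))

def loo_accuracy(predict, data):
    """Leave-one-out accuracy: hold out each point once, train on the rest."""
    hits = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += predict(train, x) == y
    return hits / len(data)

results = {
    "3-NN": loo_accuracy(knn_predict, DATA),
    "nearest centroid": loo_accuracy(centroid_predict, DATA),
}
```

Running the same validation loop over every algorithm and data set gives a table of accuracies from which a suitable algorithm can be chosen per data set.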

Temporal Classification Method for Forecasting Load Patterns from AMR Data

Lee, Heon-Gyu; Shin, Jin-Ho; Ryu, Keun-Ho
- Proceedings of the KSRS Conference / 2007.10a / pp.594-597 / 2007
We present a novel mid- and long-term power load prediction method that uses temporal pattern mining on AMR (Automatic Meter Reading) data. Because power load patterns vary over time and differ considerably by hour, day, week, and so on, traditional data mining alone yields uninformative results. Prior research on data mining for electric load patterns has focused on cluster analysis and classification. However, despite the usefulness of rules that include a temporal dimension and the fact that AMR data has temporal attributes, those methods were limited to static pattern extraction and did not consider temporal attributes. We therefore propose a new classification method for predicting power load patterns. It consists of a clustering step and a temporal classification step. Cluster analysis is used to create load pattern classes and a representative load profile for each class. The classification step then uses the representative load profiles to build a classifier that assigns different load patterns to the existing classes. The proposed classification method is calendar-based temporal mining, which discovers electric load patterns at multiple time granularities. Finally, we show that the proposed method, applied to AMR data, discovered more interesting patterns.

Directed Association Rules Mining and Classification (목표 속성을 고려한 연관규칙과 분류 기법)

한경록; 김재련
- Journal of Korean Society of Industrial and Systems Engineering / v.24 no.63 / pp.23-31 / 2001
Data mining can be either directed or undirected. One way of thinking about it is that undirected data mining is used to recognize relationships in the data and directed data mining to explain those relationships once they have been found. Several data mining techniques have received considerable research attention. In this paper, we propose an algorithm that discovers association rules as a form of directed data mining and applies them to classification. In the first phase, we find frequent closed itemsets and association rules. We then construct decision trees from the discovered association rules. The algorithm is applicable to customer relationship management.
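
The idea of mining rules toward a target attribute and then classifying with them can be sketched as follows. This is a simplified stand-in, not the paper's algorithm: it enumerates small itemsets by brute force instead of mining frequent closed itemsets, and it classifies by picking the best matching rule instead of building decision trees. The transactions, thresholds, and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical transactions: (set of items, target class).
TRANSACTIONS = [
    ({"young", "high_income"}, "buy"),
    ({"young", "low_income"}, "skip"),
    ({"senior", "high_income"}, "buy"),
    ({"young", "high_income"}, "buy"),
    ({"senior", "low_income"}, "skip"),
]

def class_rules(transactions, min_support=0.4, min_confidence=0.8):
    """Return rules (itemset -> class) that meet both thresholds."""
    n = len(transactions)
    items = sorted(set().union(*(t for t, _ in transactions)))
    classes = sorted({c for _, c in transactions})
    rules = []
    for size in (1, 2):  # brute-force enumeration of 1- and 2-item sets
        for itemset in combinations(items, size):
            covered = [c for t, c in transactions if set(itemset) <= t]
            if not covered:
                continue
            for cls in classes:
                both = covered.count(cls)
                support = both / n                # share of all records
                confidence = both / len(covered)  # share of matching records
                if support >= min_support and confidence >= min_confidence:
                    rules.append((itemset, cls, support, confidence))
    return rules

def classify(rules, record, default="skip"):
    """Predict with the matching rule of highest confidence, then support."""
    matching = [r for r in rules if set(r[0]) <= record]
    if not matching:
        return default
    return max(matching, key=lambda r: (r[3], r[2]))[1]

rules = class_rules(TRANSACTIONS)
prediction = classify(rules, {"senior", "high_income"})
```

Fixing the rule consequent to the target class is what makes the mining "directed": itemsets that say nothing about the class are never reported.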

Using Genetic Rule-Based Classifier System for Data Mining (유전자 알고리즘을 이용한 데이터 마이닝의 분류 시스템에 관한 연구)

Han, Myung-Mook
- Journal of Internet Computing and Services / v.1 no.1 / pp.63-72 / 2000
Data mining is the nontrivial extraction of hidden knowledge or potentially useful information from large databases. It is a multidisciplinary field of research to which machine learning, statistics, and computer science all contribute. Data mining methods can be categorized by the kinds of tasks to be implemented and the classes of application to be served, and classification has been identified as an important task in the emerging field of data mining. Because classification is a basic element of human thinking, it is a well-studied problem in a wide variety of applications. In this paper, we propose a classifier system based on a genetic algorithm with robust properties, and we evaluate it by applying it to the nDmC problem, a classification task in data mining.

Biomedical Ontologies and Text Mining for Biomedicine and Healthcare: A Survey

Yoo, Ill-Hoi; Song, Min
- Journal of Computing Science and Engineering / v.2 no.2 / pp.109-136 / 2008
In this survey paper, we discuss biomedical ontologies and major text mining techniques applied to biomedicine and healthcare. Biomedical ontologies such as UMLS are currently being adopted in text mining approaches because they provide domain knowledge. In addition, biomedical ontologies enable us to resolve many linguistic problems that arise when text mining approaches handle biomedical literature. As the first example of text mining, document clustering is surveyed. Because a document set normally covers multiple topics, text mining approaches use document clustering as a preprocessing step to group similar documents. Additionally, document clustering is able to inform the biomedical literature searches required for the practice of evidence-based medicine. We introduce Swanson's UnDiscovered Public Knowledge (UDPK) model, which generates biomedical hypotheses from biomedical literature such as MEDLINE by discovering novel connections among logically related biomedical concepts. Another important area of text mining is document classification. Document classification is a valuable tool for biomedical tasks that involve large amounts of text. We survey well-known classification techniques in biomedicine. As the last example of text mining in biomedicine and healthcare, we survey information extraction. Information extraction is the process of scanning text for information relevant to some interest, including extracting entities, relations, and events. We also address techniques and issues in evaluating text mining applications in biomedicine and healthcare.
https://doi.org/10.5626/JCSE.2008.2.2.109

Experimental investigation on multi-parameter classification predicting degradation model for rock failure using Bayesian method

Wang, Chunlai; Li, Changfeng; Chen, Zeng; Liao, Zefeng; Zhao, Guangming; Shi, Feng; Yu, Weijian
- Geomechanics and Engineering / v.20 no.2 / pp.113-120 / 2020
Rock damage is the main cause of accidents in underground engineering. It is difficult to predict rock damage accurately using only one parameter. In this study, a rock failure prediction model was established using stress, energy, and damage. The prediction was divided into three levels according to the ratio of the damage threshold stress to the peak stress. A classification prediction model incorporating stress, energy, damage, and AE impact rate was established using the Bayesian method. The results show that the model is practical and effective in predicting the degree of rock failure. On this basis, a multi-parameter classification model for predicting the deterioration of rock failure was established. The results provide a new approach to classifying and predicting rockburst.
https://doi.org/10.12989/gae.2020.20.2.113

A Knowledge Based Physical Activity Evaluation Model Using Associative Classification Mining Approach (연관 분류 마이닝 기법을 활용한 지식기반 신체활동 평가 모델)

Son, Chang-Sik; Choi, Rock-Hyun; Kang, Won-Seok
- IEMEK Journal of Embedded Systems and Applications / v.13 no.4 / pp.215-223 / 2018
Recently, as interest in wearable devices has increased, commercially available smart wristbands and applications have come into use as tools for personal health management. However, most previous studies have focused on evaluating the accuracy and reliability of the technical aspects of wearable devices, especially the step counts, walking distance, and energy consumption measured by smart wristbands. In this study, we propose a physical activity evaluation model using classification rules induced by the associative classification mining approach. These rules, associated with five physical activities, were generated by considering activities and walking times in target heart rate zones such as 'Out-of Zone', 'Fat Burn Zone', 'Cardio Zone', and 'Peak Zone'. In the experiment, we evaluated the predictive power of the classification rules and verified the model's effectiveness by comparing classification accuracy between the proposed model and a support vector machine.
https://doi.org/10.14372/IEMEK.2018.13.4.215

Genetic Algorithm Application to Machine Learning

Han, Myung-mook; Lee, Yill-byung
- Journal of the Korean Institute of Intelligent Systems / v.11 no.7 / pp.633-640 / 2001
In this paper we examine the machine learning issues raised by the domain of Intrusion Detection Systems (IDS), which have difficulty successfully classifying intruders. These systems also require a significant amount of computational overhead, making it difficult to create a robust real-time IDS. Machine learning techniques can reduce the human effort required to build these systems and can improve their performance. Genetic algorithms are used to improve the performance of search problems, while data mining has been used for data analysis. Data mining is the exploration and analysis of large quantities of data to discover meaningful patterns and rules. Among the tasks of data mining, we concentrate on classification. Because classification is a basic element of human thinking, it is a well-studied problem in a wide variety of applications. In this paper, we propose a classifier system based on a genetic algorithm and evaluate it by applying it to an IDS problem framed as a classification task in data mining. We report our experiments using these methods on KDD audit data.

Genetics-Based Machine Learning for Generating Classification Rule in Data Mining (데이터 마이닝의 분류 규칙 발견을 위한 유전자알고리즘 학습방법)

김대희; 박상호
- Proceedings of the Korea Multimedia Society Conference / 2001.11a / pp.429-434 / 2001
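In today's environment of data floods and information poverty, there is great interest in automated data analysis that uses information technology to filter data, analyze it, and interpret the results, and data mining is an application of information technology that meets this need. In particular, classification has become an important area of data mining. The core of the classification task lies in how to define appropriate decision rules, which requires an algorithm capable of learning. In this paper, we present a robust learning method based on a genetic algorithm and, through this learning, propose a classification system for data mining.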

Temporal Associative Classification based on Calendar Patterns (캘린더 패턴 기반의 시간 연관적 분류 기법)

Lee Heon Gyu; Noh Gi Young; Seo Sungbo; Ryu Keun Ho
- Journal of KIISE:Databases / v.32 no.6 / pp.567-584 / 2005
Temporal data mining, the incorporation of temporal semantics into existing data mining techniques, refers to a set of techniques for discovering implicit and useful temporal knowledge from temporal data. Association rules and classification, which are typical data mining problems, are applied in various applications. However, these approaches do not consider temporal attributes and have been pursued for discovering knowledge from static data, although a large proportion of data contains a temporal dimension. In addition, data mining research on temporal data treats the problem as discovering knowledge from data stamped with a time point and adding time constraints. Such work therefore does not consider the temporal semantics and temporal relationships contained in the data. This paper proposes a temporal associative classification technique based on temporal class association rules. The technique applies rules discovered by temporal class association rules, which extend existing associative classification with a temporal dimension, to generate temporal classification rules. As a result, it can discover more useful knowledge than typical classification techniques.
