• Title/Summary/Keyword: Classification Methods

Search Result 5,249, Processing Time 0.033 seconds

Classification Strategies for High Resolution Images of Korean Forests: A Case Study of Namhansansung Provincial Park, Korea

  • Park, Chong-Hwa;Choi, Sang-Il
    • Proceedings of the KSRS Conference
    • /
    • 2002.10a
    • /
    • pp.708-708
    • /
    • 2002
  • Recent developments in sensor technologies have provided remotely sensed data with very high spatial resolution. In order to fully utilize the potential of high resolution images, new image classification strategies are necessary. Unfortunately, the high resolution images increase the spectral within-field variability, and the classification accuracy of traditional methods based on pixel-based classification algorithms such as Maximum-Likelihood method may be decreased (Schiewe 2001). Recent development in Object Oriented Classification based on image segmentation algorithms can be used for the classification of forest patches on rugged terrain of Korea. The objectives of this paper are as follows. First, to compare the pros and cons of image classification methods based on pixel-based and object oriented classification algorithm for the forest patch classification. Landsat ETM+ data and IKONOS data will be used for the classification. Second, to investigate ways to increase classification accuracy of forest patches. Supplemental data such as DTM and Forest Type Map of 1:25,000 scale are used for topographic correction and image segmentation. Third, to propose the best classification strategy for forest patch classification in terms of accuracy and data requirement. The research site for this paper is Namhansansung Provincial Park located at the eastern suburb of Seoul Metropolitan City for its diverse forest patch types and data availability. Both Landsat ETM+ and IKONOS data are used for the classification. Preliminary results can be summarized as follows. First, topographic correction of reflectance is essential for the classification of forest patches on rugged terrain. Second, object oriented classification of IKONOS data enables higher classification accuracy compared to Landsat ETM+ and pixel-based classification. Third, multi-stage segmentation is very useful to investigate landscape ecological aspect of forest communities of Korea.

  • PDF

Rock Mass Classification and Its Use in Blast Design for Tunneling (암분류기법과 터널굴착을 위한 발파설계에의 활용)

  • Ryu Chang-Ha;SunWoo Choon;Choi Byung-Hee
    • Explosives and Blasting
    • /
    • v.24 no.1
    • /
    • pp.63-69
    • /
    • 2006
  • Building tunnels means dealing with what rock is encountered. Relocation of the site of the underground structure is rarely possible. Tunneling engineers and miners have to cope with the quality of the rock mass as it is. Different tunneling philosophies and different rock classification methods have been developed in various countries. Most of the rock classification methods are based on the response of the rock mass to the excavation. Tunnel support requirements could be assessed analytically, supplemented by rock mass classification predictions, and verified by measurements during construction. Rock mass classifications on their own should only be used for preliminary, planning purposes and not for final tunnel support. Design of blast pattern in tunneling projects in Korea is also mostly prepared according to the general rock classification methods such as RMR or Q. They, however, do not take into account the blast performance, and as a consequence, produce poor blasting results. In this paper, the methods of general rock classification and blast design for tunnel excavation in Korea are reviewed, and efforts to develop a new classification method, reflecting the blasting performance, are presented.

The Difference Analysis between Maturity Stages of Venture Firms by Classification Techniques of Big Data (빅데이터 분류 기법에 따른 벤처 기업의 성장 단계별 차이 분석)

  • Jung, Byoungho
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.15 no.4
    • /
    • pp.197-212
    • /
    • 2019
  • The purpose of this study is to identify the maturity stages of venture firms through classification analysis, which is widely used as a big data technique. Venture companies should develop a competitive advantage in the market. And the maturity stage of a company can be classified into five stages. I will analyze a difference in the growth stage of venture firms between the survey response and the statistical classification methods. The firm growth level distinguished five stages and was divided into the period of start-up and declines. A classification method of big data uses popularly k-mean cluster analysis, hierarchical cluster analysis, artificial neural network, and decision tree analysis. I used variables that asset increase, capital increase, sales increase, operating profit increase, R&D investment increase, operation period and retirement number. The research results, each big data analysis technique showed a large difference of samples sized in the group. In particular, the decision tree and neural networks' methods were classified as three groups rather than five groups. The groups size of all classification analysis was all different by the big data analysis methods. Furthermore, according to the variables' selection and the sample size may be dissimilar results. Also, each classed group showed a number of competitive differences. The research implication is that an analysts need to interpret statistics through management theory in order to interpret classification of big data results correctly. In addition, the choice of classification analysis should be determined by considering not only management theory but also practical experience. Finally, the growth of venture firms needs to be examined by time-series analysis and closely monitored by individual firms. And, future research will need to include significant variables of the company's maturity stages.

Comparison of graph clustering methods for analyzing the mathematical subject classification codes

  • Choi, Kwangju;Lee, June-Yub;Kim, Younjin;Lee, Donghwan
    • Communications for Statistical Applications and Methods
    • /
    • v.27 no.5
    • /
    • pp.569-578
    • /
    • 2020
  • Various graph clustering methods have been introduced to identify communities in social or biological networks. This paper studies the entropy-based and the Markov chain-based methods in clustering the undirected graph. We examine the performance of two clustering methods with conventional methods based on quality measures of clustering. For the real applications, we collect the mathematical subject classification (MSC) codes of research papers from published mathematical databases and construct the weighted code-to-document matrix for applying graph clustering methods. We pursue to group MSC codes into the same cluster if the corresponding MSC codes appear in many papers simultaneously. We compare the MSC clustering results based on the several assessment measures and conclude that the Markov chain-based method is suitable for clustering the MSC codes.

A review of tree-based Bayesian methods

  • Linero, Antonio R.
    • Communications for Statistical Applications and Methods
    • /
    • v.24 no.6
    • /
    • pp.543-559
    • /
    • 2017
  • Tree-based regression and classification ensembles form a standard part of the data-science toolkit. Many commonly used methods take an algorithmic view, proposing greedy methods for constructing decision trees; examples include the classification and regression trees algorithm, boosted decision trees, and random forests. Recent history has seen a surge of interest in Bayesian techniques for constructing decision tree ensembles, with these methods frequently outperforming their algorithmic counterparts. The goal of this article is to survey the landscape surrounding Bayesian decision tree methods, and to discuss recent modeling and computational developments. We provide connections between Bayesian tree-based methods and existing machine learning techniques, and outline several recent theoretical developments establishing frequentist consistency and rates of convergence for the posterior distribution. The methodology we present is applicable for a wide variety of statistical tasks including regression, classification, modeling of count data, and many others. We illustrate the methodology on both simulated and real datasets.

An Empirical Study on Improving the Performance of Text Categorization Considering the Relationships between Feature Selection Criteria and Weighting Methods (자질 선정 기준과 가중치 할당 방식간의 관계를 고려한 문서 자동분류의 개선에 대한 연구)

  • Lee Jae-Yun
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.39 no.2
    • /
    • pp.123-146
    • /
    • 2005
  • This study aims to find consistent strategies for feature selection and feature weighting methods, which can improve the effectiveness and efficiency of kNN text classifier. Feature selection criteria and feature weighting methods are as important factor as classification algorithms to achieve good performance of text categorization systems. Most of the former studies chose conflicting strategies for feature selection criteria and weighting methods. In this study, the performance of several feature selection criteria are measured considering the storage space for inverted index records and the classification time. The classification experiments in this study are conducted to examine the performance of IDF as feature selection criteria and the performance of conventional feature selection criteria, e.g. mutual information, as feature weighting methods. The results of these experiments suggest that using those measures which prefer low-frequency features as feature selection criterion and also as feature weighting method. we can increase the classification speed up to three or five times without loosing classification accuracy.

Classification of e-mail Using Dynamic Category Hierarchy and Automatic category generation (자동 카테고리 생성과 동적 분류 체계를 사용한 이메일 분류)

  • Ahn Chan Min;Park Sang Ho;Lee Ju-Hong;Choi Bum-Ghi;Park Sun
    • Journal of Intelligence and Information Systems
    • /
    • v.10 no.2
    • /
    • pp.79-89
    • /
    • 2004
  • Since the amount of E-mail messages has increased , we need a new technique for efficient e-mail classification. E-mail classifications are grouped into two classes: binary classification, multi-classification. The current binary classification methods are mostly spm mail classification methods which are based on rule driven, bayesian, SVM, etc. The current multi- classification methods are based on clustering which groups e-mails by similarity. In this paper, we propose a novel method for e-mail classification. It combines the automatic category generation method based on the vector model and the dynamic category hierarchy construction method. This method can multi-classify e-mail automatically and manage a large amount of e-mail efficiently. In addition, this method increases the search accuracy by dynamic reclassification of e-mails.

  • PDF

An Analytical Study on Performance Factors of Automatic Classification based on Machine Learning (기계학습에 기초한 자동분류의 성능 요소에 관한 연구)

  • Kim, Pan Jun
    • Journal of the Korean Society for information Management
    • /
    • v.33 no.2
    • /
    • pp.33-59
    • /
    • 2016
  • This study examined the factors affecting the performance of automatic classification for the domestic conference papers based on machine learning techniques. In particular, In view of the classification performance that assigning automatically the class labels to the papers in Proceedings of the Conference of Korean Society for Information Management using Rocchio algorithm, I investigated the characteristics of the key factors (classifier formation methods, training set size, weighting schemes, label assigning methods) through the diversified experiments. Consequently, It is more effective that apply proper parameters (${\beta}$, ${\lambda}$) and training set size (more than 5 years) according to the classification environments and properties of the document set. and If the performance is equivalent, I discovered that the use of the more simple methods (single weighting schemes) is very efficient. Also, because the classification of domestic papers is corresponding with multi-label classification which assigning more than one label to an article, it is necessary to develop the optimum classification model based on the characteristics of the key factors in consideration of this environment.

Text Classification for Patents: Experiments with Unigrams, Bigrams and Different Weighting Methods

  • Im, ChanJong;Kim, DoWan;Mandl, Thomas
    • International Journal of Contents
    • /
    • v.13 no.2
    • /
    • pp.66-74
    • /
    • 2017
  • Patent classification is becoming more critical as patent filings have been increasing over the years. Despite comprehensive studies in the area, there remain several issues in classifying patents on IPC hierarchical levels. Not only structural complexity but also shortage of patents in the lower level of the hierarchy causes the decline in classification performance. Therefore, we propose a new method of classification based on different criteria that are categories defined by the domain's experts mentioned in trend analysis reports, i.e. Patent Landscape Report (PLR). Several experiments were conducted with the purpose of identifying type of features and weighting methods that lead to the best classification performance using Support Vector Machine (SVM). Two types of features (noun and noun phrases) and five different weighting schemes (TF-idf, TF-rf, TF-icf, TF-icf-based, and TF-idcef-based) were experimented on.

A study on data mining techniques for soil classification methods using cone penetration test results

  • Junghee Park;So-Hyun Cho;Jong-Sub Lee;Hyun-Ki Kim
    • Geomechanics and Engineering
    • /
    • v.35 no.1
    • /
    • pp.67-80
    • /
    • 2023
  • Due to the nature of the conjunctive Cone Penetration Test(CPT), which does not verify the actual sample directly, geotechnical engineers commonly classify the underground geomaterials using CPT results with the classification diagrams proposed by various researchers. However, such classification diagrams may fail to reflect local geotechnical characteristics, potentially resulting in misclassification that does not align with the actual stratification in regions with strong local features. To address this, this paper presents an objective method for more accurate local CPT soil classification criteria, which utilizes C4.5 decision tree models trained with the CPT results from the clay-dominant southern coast of Korea and the sand-dominant region in South Carolina, USA. The results and analyses demonstrate that the C4.5 algorithm, in conjunction with oversampling, outlier removal, and pruning methods, can enhance and optimize the decision tree-based CPT soil classification model.