• Title/Summary/Keyword: NUMERICAL CLASSIFICATION

The use of support vector machines in semi-supervised classification

  • Bae, Hyunjoo;Kim, Hyungwoo;Shin, Seung Jun
    • Communications for Statistical Applications and Methods / v.29 no.2 / pp.193-202 / 2022
  • Semi-supervised learning has gained significant attention in recent applications. In this article, we provide a selective overview of popular semi-supervised methods and then propose a simple but effective algorithm for semi-supervised classification using support vector machines (SVM), one of the most popular binary classifiers in the machine learning community. The idea is simple. First, we apply dimension reduction to the unlabeled observations and cluster them to assign labels in the reduced space. SVM is then applied to the combined set of labeled and unlabeled observations to construct a classification rule. The use of SVM enables us to extend the method to its nonlinear counterpart via the kernel trick. Our numerical experiments under various scenarios demonstrate that the proposed method is promising for semi-supervised classification.
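
The two steps described in this abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the dimension reduction is taken to be a projection onto the first principal component, the clustering is 1-D 2-means, clusters are mapped to classes by the nearest projected labeled class mean, and the SVM is a linear one trained by subgradient descent (no kernel). The toy data and all function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def first_pc(rows, iters=100):
    """Mean and leading principal direction, found by power iteration."""
    n, d = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mu[j] for j in range(d)] for r in rows]
    v = [1.0] * d
    for _ in range(iters):
        scores = [dot(r, v) for r in centered]  # X v
        w = [sum(s * r[j] for s, r in zip(scores, centered)) for j in range(d)]  # X^T X v
        norm = dot(w, w) ** 0.5
        v = [a / norm for a in w]
    return mu, v

def two_means_1d(zs, iters=20):
    """1-D k-means with k = 2, initialised at the two extremes."""
    centers = [min(zs), max(zs)]
    for _ in range(iters):
        groups = [[], []]
        for z in zs:
            k = 0 if abs(z - centers[0]) <= abs(z - centers[1]) else 1
            groups[k].append(z)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularised hinge loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [lam * a for a in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # point violates the margin
                gw = [g - yi * c / n for g, c in zip(gw, xi)]
                gb -= yi / n
        w = [a - lr * g for a, g in zip(w, gw)]
        b -= lr * gb
    return w, b

def predict(w, b, x):
    return 1 if dot(w, x) + b >= 0 else -1

# Hypothetical toy data: a few labeled points and more unlabeled ones.
labeled_X = [[2.0, 2.0], [2.5, 1.5], [-2.0, -2.0], [-2.5, -1.5]]
labeled_y = [1, 1, -1, -1]
unlabeled_X = [[1.5, 2.5], [3.0, 2.0], [2.0, 3.0],
               [-1.5, -2.5], [-3.0, -2.0], [-2.0, -3.0]]

# Step 1: reduce the unlabeled points to one dimension and cluster them.
mu, v = first_pc(unlabeled_X)

def project(x):
    return dot([a - m for a, m in zip(x, mu)], v)

centers = two_means_1d([project(x) for x in unlabeled_X])

# Assumption: a cluster takes the class whose labeled mean projects closest to it.
class_proj = {}
for c in (1, -1):
    zs = [project(x) for x, yy in zip(labeled_X, labeled_y) if yy == c]
    class_proj[c] = sum(zs) / len(zs)
cluster_label = [min(class_proj, key=lambda c: abs(class_proj[c] - ctr))
                 for ctr in centers]
pseudo_y = [cluster_label[0 if abs(project(x) - centers[0])
                          <= abs(project(x) - centers[1]) else 1]
            for x in unlabeled_X]

# Step 2: train the SVM on labeled plus pseudo-labeled observations.
w, b = train_linear_svm(labeled_X + unlabeled_X, labeled_y + pseudo_y)
```

Calling `predict(w, b, x)` then classifies a new point. A kernel SVM would replace `train_linear_svm` for the nonlinear extension the abstract mentions.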

A Study of the Application of Relative Location System and Minute Classification System in the DDC (DDC의 상관식 배가법 적용과 분류체계 세분화에 대한 연구)

  • Kwak, Chul-Wan
    • Journal of Korean Library and Information Science Society / v.48 no.3 / pp.45-61 / 2017
  • The objective of this study is to understand the application of the relative location system and the minute classification system in the DDC and to identify their effect during the late 19th century. To achieve this objective, four main investigation areas were chosen: relative location system, minute classification system, and the DDC's influence on other libraries and classification systems. First, the DDC made the revolutionary move of applying a relative location system instead of a fixed location system for arranging books on the shelves, which opened the period of modern library classification systems. Second, it used a minute classification system, so it could classify books on minute subjects. Third, it applied form as a criterion for dividing divisions and sections, which helped in classifying books. Fourth, it used a numerical decimal system as a classification system, so people could use it economically and practically. Last, the DDC influenced modern classification systems such as the Expansive Classification and the Subject Classification. The DDC is a library classification system suited to the needs of the times, and it is a practical classification system for each library.

A Study on the Relationship between Class Similarity and the Performance of Hierarchical Classification Method in a Text Document Classification Problem (텍스트 문서 분류에서 범주간 유사도와 계층적 분류 방법의 성과 관계 연구)

  • Jang, Soojung;Min, Daiki
    • The Journal of Society for e-Business Studies / v.25 no.3 / pp.77-93 / 2020
  • The literature has reported that hierarchical classification methods generally outperform flat classification methods for a multi-class document classification problem. Unlike the literature that has constructed a class hierarchy, this paper evaluates the performance of hierarchical and flat classification methods in a situation where the class hierarchy is predefined. We conducted numerical evaluations on two data sets: research papers on climate change adaptation technologies in the water sector and the 20NewsGroup open data set. The evaluation results show that the hierarchical classification method outperforms the flat classification methods only under a certain condition, which differs from the literature. The advantage of the hierarchical classification method over the flat classification method depends on class similarities at each level of the class structure. More importantly, the hierarchical classification method works better when the upper-level similarity is less than the lower-level similarity.
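
The condition in the last sentence can be made concrete with a small calculation. The abstract does not say how class similarity is measured, so the sketch below assumes cosine similarity between class centroids of term-count vectors. The two-level hierarchy, its class names, and the document vectors are all hypothetical.

```python
from itertools import combinations
from math import sqrt

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def centroid(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def mean_pairwise_similarity(centroids):
    pairs = list(combinations(centroids, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

# Hypothetical hierarchy: parent class -> child class -> term-count vectors.
hierarchy = {
    "science": {
        "physics":   [[3, 1, 1, 0], [4, 2, 0, 1]],
        "chemistry": [[1, 3, 0, 1], [2, 4, 1, 0]],
    },
    "sports": {
        "soccer": [[1, 0, 3, 1], [0, 1, 4, 2]],
        "tennis": [[0, 1, 1, 3], [1, 0, 2, 4]],
    },
}

# Upper level: similarity between parent classes (all documents pooled per parent).
parent_centroids = [
    centroid([doc for docs in children.values() for doc in docs])
    for children in hierarchy.values()
]
upper_similarity = mean_pairwise_similarity(parent_centroids)

# Lower level: similarity between sibling classes, averaged over parents.
lower_similarity = sum(
    mean_pairwise_similarity([centroid(docs) for docs in children.values()])
    for children in hierarchy.values()
) / len(hierarchy)

# The abstract's condition favouring the hierarchical method.
hierarchical_favoured = upper_similarity < lower_similarity
```

In this toy hierarchy the parent classes are easy to tell apart while the siblings resemble each other, which is the situation the abstract describes as favouring the hierarchical method.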

Optimal Production Planning for Remanufacturing with Quality Classification Errors under Uncertainty in Quality of Used Products

  • Iwao, Masatoshi;Kusukawa, Etsuko
    • Industrial Engineering and Management Systems / v.13 no.2 / pp.231-249 / 2014
  • This paper discusses a green supply chain with a manufacturer and a collection trader, and it proposes optimal production planning for remanufacturing parts from used products when the collection trader makes quality classification errors. When a manufacturer accepts an order for parts from a retailer and procures used products from a collection trader, the collection trader might make quality classification errors due to a lack of equipment or expert knowledge regarding quality classification. After procuring the used products, the manufacturer inspects them for classification errors. If errors are detected, the manufacturer reclassifies the misclassified (overestimated) used products at a cost. Accordingly, the manufacturer decides whether to remanufacture from the higher-quality used products based on a remanufacturing ratio or to produce parts from new materials. This paper develops a mathematical model to find how quality classification errors affect the optimal decisions for a lower limit of procurement quality of used products and a remanufacturing ratio under the lower limit, as well as the expected profit of the manufacturer. Numerical analysis investigates how the quality of used products, the reclassification cost, and the remanufacturing cost of used products affect the optimal production planning and the expected profit of the manufacturer.

Automatic Recognition of Digital Modulation Types using Wavelet Transformation (웨이브릿 변환을 이용한 디지털 변조타입 자동 인식)

  • Park, Cheol-Sun;Nah, Sun-Phil;Yang, Jong-Won;Choi, Jun-Ho
    • Journal of the Institute of Electronics Engineers of Korea TC / v.45 no.4 / pp.22-30 / 2008
  • In this paper, we present a modulation classification method using the wavelet transform (WT) that can classify incident digital signals without a priori information. The key features should be sensitive to modulation type and insensitive to SNR variation. The 4 key features for modulation recognition are selected using WT coefficients, which are insensitive to changes in noise. Numerical simulations for classifying 8 digital modulation types using these features are performed. Numerical simulations of 3 types of modulation classifiers (DTC, MDC, and SVMC) are performed to investigate classification accuracy and execution time, with a view to designing the modulation classification module in software radio. The simulation results indicate that MDC and DTC had the best execution times, while MDC and SVMC showed good classification performance.

Modified Version of SVM for Text Categorization

  • Jo, Tae-Ho
    • International Journal of Fuzzy Logic and Intelligent Systems / v.8 no.1 / pp.52-60 / 2008
  • This research proposes a new strategy in which documents are encoded into string vectors for text categorization, together with a modified version of SVM that is adaptable to string vectors. Traditionally, when SVM is used for pattern classification, raw data must be encoded into numerical vectors. This encoding may be difficult, depending on the application area of pattern classification. For example, in text categorization, encoding full texts given as raw data into numerical vectors leads to two main problems: huge dimensionality and sparse distribution. In this research, we encode full texts into string vectors and apply the modified version of SVM adaptable to string vectors to text categorization.
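
The two problems named in this abstract can be seen on a toy corpus. The sketch below builds an ordinary bag-of-words count vector and, for contrast, a string vector. The abstract does not define string vectors, so the definition used here (the `d` most frequent words of a document, in rank order) is an assumption for illustration only. The documents and function names are hypothetical.

```python
from collections import Counter

# Hypothetical toy corpus, already tokenised by whitespace.
docs = [
    "svm svm kernel margin kernel svm",
    "cluster centroid cluster distance cluster centroid",
    "index posting index term posting index",
]
tokens = [d.split() for d in docs]
vocab = sorted({w for toks in tokens for w in toks})

def numeric_vector(toks):
    """Bag-of-words counts: one dimension per vocabulary word."""
    counts = Counter(toks)
    return [counts[w] for w in vocab]

def string_vector(toks, d=3):
    """Assumed definition: the d most frequent words, ties broken alphabetically."""
    counts = Counter(toks)
    return sorted(counts, key=lambda w: (-counts[w], w))[:d]

numeric = [numeric_vector(t) for t in tokens]
strings = [string_vector(t) for t in tokens]

# Dimensionality grows with the vocabulary, and most entries are zero.
dimension = len(vocab)
zero_fraction = sum(x == 0 for vec in numeric for x in vec) / (len(numeric) * dimension)
```

Even with three short documents, two thirds of the numerical entries are zero, while each string vector stays at a fixed length of `d` words.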

Inverted Index based Modified Version of K-Means Algorithm for Text Clustering

  • Jo, Tae-Ho
    • Journal of Information Processing Systems / v.4 no.2 / pp.67-76 / 2008
  • This research proposes a new strategy in which documents are encoded into string vectors, together with a modified version of the k-means algorithm that is adaptable to string vectors, for text clustering. Traditionally, when the k-means algorithm is used for pattern classification, raw data must be encoded into numerical vectors. This encoding may be difficult, depending on the application area of pattern classification. For example, in text clustering, encoding full texts given as raw data into numerical vectors leads to two main problems: huge dimensionality and sparse distribution. In this research, we encode full texts into string vectors and modify the k-means algorithm to be adaptable to string vectors for text clustering.

Lipid analysis of streptomycetes isolated from volcanic soil

  • Kim, Seung-Bum;Kim, Min-Young;Seong, Chi-Nam;Ouk, Kang-Sa;Hah, Yung-Chil
    • Journal of Microbiology / v.34 no.2 / pp.184-191 / 1996
  • The cellular fatty acids and quinones of streptomycetes isolated from volcanic soils were analysed. The strains contained fatty acids with chains of 14 to 17 carbons, and 12-methyltetradecanoic acid and 14-methylpentadecanoic acid were dominant in most strains. The total profiles consisted of 74% branched fatty acid family, 16.8% linear family and 8.2% unsaturated family. The largest cluster of grey spore masses defined by numerical classification was separated from the remainder in the principal component analysis, but the other clusters overlapped with one another. In the analysis of respiratory quinones, all of the strains contained the menaquinone of 9 isoprene units with either 6 hydrogenations or 8 hydrogenations as the major species. The distribution of menaquinones among the clusters could provide an important key in the chemotaxonomy of streptomycetes.

Inverted Index based Modified Version of KNN for Text Categorization

  • Jo, Tae-Ho
    • Journal of Information Processing Systems / v.4 no.1 / pp.17-26 / 2008
  • This research proposes a new strategy in which documents are encoded into string vectors, together with a modified version of KNN that is adaptable to string vectors, for text categorization. Traditionally, when KNN is used for pattern classification, raw data must be encoded into numerical vectors. This encoding may be difficult, depending on the application area of pattern classification. For example, in text categorization, encoding full texts given as raw data into numerical vectors leads to two main problems: huge dimensionality and sparse distribution. In this research, we encode full texts into string vectors and modify the supervised learning algorithm to be adaptable to string vectors for text categorization.
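
A nearest-neighbour classifier only needs a similarity between examples, which is why it can work directly on string vectors. The sketch below shows that idea with a stand-in similarity: the Jaccard overlap between the word sets of two string vectors. The abstract does not give the paper's similarity measure (the title suggests it is derived from an inverted index), so this measure, the training data, and the function names are assumptions for illustration.

```python
from collections import Counter

def overlap(u, v):
    """Stand-in similarity: Jaccard overlap of the words in two string vectors."""
    a, b = set(u), set(v)
    return len(a & b) / len(a | b)

def knn_predict(query, train, k=3):
    """Majority vote among the k training string vectors most similar to query."""
    ranked = sorted(train, key=lambda item: overlap(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training set of (string vector, category) pairs.
train = [
    (["goal", "match", "team"], "sports"),
    (["team", "coach", "league"], "sports"),
    (["match", "league", "goal"], "sports"),
    (["stock", "market", "bank"], "finance"),
    (["bank", "loan", "rate"], "finance"),
    (["market", "rate", "stock"], "finance"),
]

sports_guess = knn_predict(["team", "goal", "league"], train)
finance_guess = knn_predict(["loan", "stock", "rate"], train)
```

No numerical encoding of the documents is needed at any point. Only the similarity function would change if a different measure were used.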

A model-free soft classification with a functional predictor

  • Lee, Eugene;Shin, Seung Jun
    • Communications for Statistical Applications and Methods / v.26 no.6 / pp.635-644 / 2019
  • Class probability is a fundamental target in classification that contains complete classification information. In this article, we propose a class probability estimation method for the case where the predictor is functional. Motivated by Wang et al. (Biometrika, 95, 149-167, 2007), our estimator is obtained by training a sequence of functional weighted support vector machines (FWSVM) with different weights, which can be justified by the Fisher consistency of the hinge loss. The proposed method can be extended to multiclass classification via the pairwise coupling proposed by Wu et al. (Journal of Machine Learning Research, 5, 975-1005, 2004). The use of FWSVM makes our method model-free as well as computationally efficient, owing to the piecewise linearity of the FWSVM solutions as functions of the weight. Numerical investigations on both synthetic and real data show the advantageous performance of the proposed method.
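
The Fisher consistency argument behind this estimator can be illustrated at the population level, without training any SVM. The sketch assumes the usual weighting convention for weighted SVMs, with weight 1 - pi on the positive class and pi on the negative class. Under it, the minimiser of the expected weighted hinge loss at a point with class probability p has the sign of p - pi, so scanning pi over a grid and locating the sign change brackets p. This is not the FWSVM itself: the true probability is supplied directly, and all names are hypothetical.

```python
def weighted_hinge_risk(f, p, pi):
    """Expected weighted hinge loss at a point where P(y = +1) = p."""
    return (1 - pi) * p * max(0.0, 1 - f) + pi * (1 - p) * max(0.0, 1 + f)

def best_sign(p, pi):
    """Sign of the risk minimiser, searched over a grid of decision values."""
    candidates = [k / 10 for k in range(-20, 21)]
    f_star = min(candidates, key=lambda f: weighted_hinge_risk(f, p, pi))
    return 1 if f_star > 0 else -1

def estimate_probability(p_true, m=20):
    """Bracket p_true between the last weight voting +1 and the first voting -1."""
    grid = [j / m for j in range(1, m)]
    signs = [best_sign(p_true, pi) for pi in grid]
    positive = [pi for pi, s in zip(grid, signs) if s > 0]
    negative = [pi for pi, s in zip(grid, signs) if s < 0]
    lo = max(positive) if positive else 0.0
    hi = min(negative) if negative else 1.0
    return (lo + hi) / 2

estimate = estimate_probability(0.73)
```

With `m` grid points the estimate is accurate to within 1/(2m), provided the true probability does not fall exactly on a grid value, where the minimiser is not unique. In the method described above, each `best_sign` call would be replaced by the prediction of a weighted SVM fitted to data with that weight.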