• Title/Summary/Keyword: Multi-label classification

Search Result 59, Processing Time 0.039 seconds

HANDWRITTEN HANGUL RECOGNITION MODEL USING MULTI-LABEL CLASSIFICATION

  • HANA CHOI
    • Journal of the Korean Society for Industrial and Applied Mathematics
    • /
    • v.27 no.2
    • /
    • pp.135-145
    • /
    • 2023
  • Recently, as deep learning technology has developed, various deep learning technologies have been introduced in handwritten recognition, greatly contributing to performance improvement. The recognition accuracy of handwritten Hangeul recognition has also improved significantly, but prior research has focused on recognizing 520 Hangul characters or 2,350 Hangul characters using SERI95 data or PE92 data. In the past, most of the expressions were possible with 2,350 Hangul characters, but as globalization progresses and information and communication technology develops, there are many cases where various foreign words need to be expressed in Hangul. In this paper, we propose a model that recognizes and combines the consonants, medial vowels, and final consonants of a Korean syllable using a multi-label classification model, and achieves a high recognition accuracy of 98.38% as a result of learning with the public data of Korean handwritten characters, PE92. In addition, this model learned only 2,350 Hangul characters, but can recognize the characters which is not included in the 2,350 Hangul characters

Prediction of Protein Subcellular Localization using Label Power-set Classification and Multi-class Probability Estimates (레이블 멱집합 분류와 다중클래스 확률추정을 사용한 단백질 세포내 위치 예측)

  • Chi, Sang-Mun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.18 no.10
    • /
    • pp.2562-2570
    • /
    • 2014
  • One of the important hints for inferring the function of unknown proteins is the knowledge about protein subcellular localization. Recently, there are considerable researches on the prediction of subcellular localization of proteins which simultaneously exist at multiple subcellular localization. In this paper, label power-set classification is improved for the accurate prediction of multiple subcellular localization. The predicted multi-labels from the label power-set classifier are combined with their prediction probability to give the final result. To find the accurate probability estimates of multi-classes, this paper employs pair-wise comparison and error-correcting output codes frameworks. Prediction experiments on protein subcellular localization show significant performance improvement.

Applying Coarse-to-Fine Curriculum Learning Mechanism to the multi-label classification task (다중 레이블 분류 작업에서의 Coarse-to-Fine Curriculum Learning 메카니즘 적용 방안)

  • Kong, Heesan;Park, Jaehun;Kim, Kwangsu
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2022.07a
    • /
    • pp.29-30
    • /
    • 2022
  • Curriculum learning은 딥러닝의 성능을 향상시키기 위해 사람의 학습 과정과 유사하게 일종의 'curriculum'을 도입해 모델을 학습시키는 방법이다. 대부분의 연구는 학습 데이터 중 개별 샘플의 난이도를 기반으로 점진적으로 모델을 학습시키는 방안에 중점을 두고 있다. 그러나, coarse-to-fine 메카니즘은 데이터의 난이도보다 학습에 사용되는 class의 유사도가 더욱 중요하다고 주장하며, 여러 난이도의 auxiliary task를 차례로 학습하는 방법을 제안했다. 그러나, 이 방법은 혼동행렬 기반으로 class의 유사성을 판단해 auxiliary task를 생성함으로 다중 레이블 분류에는 적용하기 어렵다는 한계점이 있다. 따라서, 본 논문에서는 multi-label 환경에서 multi-class와 binary task를 생성하는 방법을 제안해 coarse-to-fine 메카니즘 적용을 위한 방안을 제시하고, 그 결과를 분석한다.

  • PDF

Sparse and low-rank feature selection for multi-label learning

  • Lim, Hyunki
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.7
    • /
    • pp.1-7
    • /
    • 2021
  • In this paper, we propose a feature selection technique for multi-label classification. Many existing feature selection techniques have selected features by calculating the relation between features and labels such as a mutual information scale. However, since the mutual information measure requires a joint probability, it is difficult to calculate the joint probability from an actual premise feature set. Therefore, it has the disadvantage that only a few features can be calculated and only local optimization is possible. Away from this regional optimization problem, we propose a feature selection technique that constructs a low-rank space in the entire given feature space and selects features with sparsity. To this end, we designed a regression-based objective function using Nuclear norm, and proposed an algorithm of gradient descent method to solve the optimization problem of this objective function. Based on the results of multi-label classification experiments on four data and three multi-label classification performance, the proposed methodology showed better performance than the existing feature selection technique. In addition, it was showed by experimental results that the performance change is insensitive even to the parameter value change of the proposed objective function.

Improving Accuracy of Multi-label Naive Bayes Classifier (다중 레이블 나이브 베이지안 분류기의 정확도 개선 연구)

  • Kim, Hae-Choen;Lee, Jae-Sung
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2018.01a
    • /
    • pp.147-148
    • /
    • 2018
  • 다중 레이블 분류 문제는 다중 레이블 데이터를 입력받았을 때 연관된 다수의 레이블을 추측하는 문제이다. 본 논문에서는 다중 레이블 분류 문제의 기법 중 하나인 나이브 베이지안 분류기에 레이블 의존성을 계산하여 결과에 반영한 결과 다중 레이블 분류 문제의 성능이 개선됨을 확인하였다.

  • PDF

An Efficient Deep Learning Ensemble Using a Distribution of Label Embedding

  • Park, Saerom
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.1
    • /
    • pp.27-35
    • /
    • 2021
  • In this paper, we propose a new stacking ensemble framework for deep learning models which reflects the distribution of label embeddings. Our ensemble framework consists of two phases: training the baseline deep learning classifier, and training the sub-classifiers based on the clustering results of label embeddings. Our framework aims to divide a multi-class classification problem into small sub-problems based on the clustering results. The clustering is conducted on the label embeddings obtained from the weight of the last layer of the baseline classifier. After clustering, sub-classifiers are constructed to classify the sub-classes in each cluster. From the experimental results, we found that the label embeddings well reflect the relationships between classification labels, and our ensemble framework can improve the classification performance on a CIFAR 100 dataset.

A Study on Facial Skin Disease Recognition Using Multi-Label Classification (다중 레이블 분류를 활용한 안면 피부 질환 인식에 관한 연구)

  • Lim, Chae Hyun;Son, Min Ji;Kim, Myung Ho
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.12
    • /
    • pp.555-560
    • /
    • 2021
  • Recently, as people's interest in facial skin beauty has increased, research on skin disease recognition for facial skin beauty is being conducted by using deep learning. These studies recognized a variety of skin diseases, including acne. Existing studies can recognize only the single skin diseases, but skin diseases that occur on the face can enact in a more diverse and complex manner. Therefore, in this paper, complex skin diseases such as acne, blackheads, freckles, age spots, normal skin, and whiteheads are identified using the Inception-ResNet V2 deep learning mode with multi-label classification. The accuracy was 98.8%, hamming loss was 0.003, and precision, recall, F1-Score achieved 96.6% or more for each single class.

IPC Multi-label Classification based on Functional Characteristics of Fields in Patent Documents (특허문서 필드의 기능적 특성을 활용한 IPC 다중 레이블 분류)

  • Lim, Sora;Kwon, YongJin
    • Journal of Internet Computing and Services
    • /
    • v.18 no.1
    • /
    • pp.77-88
    • /
    • 2017
  • Recently, with the advent of knowledge based society where information and knowledge make values, patents which are the representative form of intellectual property have become important, and the number of the patents follows growing trends. Thus, it needs to classify the patents depending on the technological topic of the invention appropriately in order to use a vast amount of the patent information effectively. IPC (International Patent Classification) is widely used for this situation. Researches about IPC automatic classification have been studied using data mining and machine learning algorithms to improve current IPC classification task which categorizes patent documents by hand. However, most of the previous researches have focused on applying various existing machine learning methods to the patent documents rather than considering on the characteristics of the data or the structure of patent documents. In this paper, therefore, we propose to use two structural fields, technical field and background, considered as having impacts on the patent classification, where the two field are selected by applying of the characteristics of patent documents and the role of the structural fields. We also construct multi-label classification model to reflect what a patent document could have multiple IPCs. Furthermore, we propose a method to classify patent documents at the IPC subclass level comprised of 630 categories so that we investigate the possibility of applying the IPC multi-label classification model into the real field. The effect of structural fields of patent documents are examined using 564,793 registered patents in Korea, and 87.2% precision is obtained in the case of using title, abstract, claims, technical field and background. From this sequence, we verify that the technical field and background have an important role in improving the precision of IPC multi-label classification in IPC subclass level.

Recognition of Multi Label Fashion Styles based on Transfer Learning and Graph Convolution Network (전이학습과 그래프 합성곱 신경망 기반의 다중 패션 스타일 인식)

  • Kim, Sunghoon;Choi, Yerim;Park, Jonghyuk
    • The Journal of Society for e-Business Studies
    • /
    • v.26 no.1
    • /
    • pp.29-41
    • /
    • 2021
  • Recently, there are increasing attempts to utilize deep learning methodology in the fashion industry. Accordingly, research dealing with various fashion-related problems have been proposed, and superior performances have been achieved. However, the studies for fashion style classification have not reflected the characteristics of the fashion style that one outfit can include multiple styles simultaneously. Therefore, we aim to solve the multi-label classification problem by utilizing the dependencies between the styles. A multi-label recognition model based on a graph convolution network is applied to detect and explore fashion styles' dependencies. Furthermore, we accelerate model training and improve the model's performance through transfer learning. The proposed model was verified by a dataset collected from social network services and outperformed baselines.

Two-stage Deep Learning Model with LSTM-based Autoencoder and CNN for Crop Classification Using Multi-temporal Remote Sensing Images

  • Kwak, Geun-Ho;Park, No-Wook
    • Korean Journal of Remote Sensing
    • /
    • v.37 no.4
    • /
    • pp.719-731
    • /
    • 2021
  • This study proposes a two-stage hybrid classification model for crop classification using multi-temporal remote sensing images; the model combines feature embedding by using an autoencoder (AE) with a convolutional neural network (CNN) classifier to fully utilize features including informative temporal and spatial signatures. Long short-term memory (LSTM)-based AE (LAE) is fine-tuned using class label information to extract latent features that contain less noise and useful temporal signatures. The CNN classifier is then applied to effectively account for the spatial characteristics of the extracted latent features. A crop classification experiment with multi-temporal unmanned aerial vehicle images is conducted to illustrate the potential application of the proposed hybrid model. The classification performance of the proposed model is compared with various combinations of conventional deep learning models (CNN, LSTM, and convolutional LSTM) and different inputs (original multi-temporal images and features from stacked AE). From the crop classification experiment, the best classification accuracy was achieved by the proposed model that utilized the latent features by fine-tuned LAE as input for the CNN classifier. The latent features that contain useful temporal signatures and are less noisy could increase the class separability between crops with similar spectral signatures, thereby leading to superior classification accuracy. The experimental results demonstrate the importance of effective feature extraction and the potential of the proposed classification model for crop classification using multi-temporal remote sensing images.