• Title/Summary/Keyword: Multi-label Classification


Small-Scale Object Detection Label Reassignment Strategy

  • An, Jung-In;Kim, Yoon;Choi, Hyun-Soo
    • Journal of the Korea Society of Computer and Information / v.27 no.12 / pp.77-84 / 2022
  • In this paper, we propose a Label Reassignment Strategy to improve the performance of an object detection algorithm. Our approach involves two stages: an inference stage and an assignment stage. In the inference stage, we perform multi-scale inference with predefined scale sizes on a trained model and re-infer masked images to obtain robust classification results. In the assignment stage, we calculate the IoU between bounding boxes to remove duplicates. We also check box and class occurrence between the detection result and annotation label to re-assign the dominant class type. We trained the YOLOX-L model with the re-annotated dataset to validate our strategy. The model achieved a 3.9% improvement in mAP and 3x better performance on AP_S compared to the model trained with the original dataset. Our results demonstrate that the proposed Label Reassignment Strategy can effectively improve the performance of an object detection model.
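
The duplicate-removal step in the assignment stage can be illustrated with a short sketch. This is not the authors' implementation: the box format `(x1, y1, x2, y2)`, the function names `iou` and `remove_duplicates`, the 0.5 threshold, and the keep-highest-score rule are all assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # assumed format: (x1, y1, x2, y2)
Detection = Tuple[Box, float, str]        # (box, confidence score, class name)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def remove_duplicates(dets: List[Detection],
                      iou_threshold: float = 0.5) -> List[Detection]:
    """Keep the highest-scoring box among boxes that overlap heavily.

    The threshold and the greedy keep-best rule are illustrative
    assumptions, not values taken from the paper.
    """
    kept: List[Detection] = []
    for det in sorted(dets, key=lambda d: d[1], reverse=True):
        if all(iou(det[0], k[0]) < iou_threshold for k in kept):
            kept.append(det)
    return kept
```

Detections gathered from several inference scales could be pooled into one list and passed through `remove_duplicates` before being compared with the annotation labels.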

Weakly-supervised Semantic Segmentation using Exclusive Multi-Classifier Deep Learning Model (독점 멀티 분류기의 심층 학습 모델을 사용한 약지도 시맨틱 분할)

  • Choi, Hyeon-Joon;Kang, Dong-Joong
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.6 / pp.227-233 / 2019
  • With the recent development of deep learning techniques, neural networks have achieved success in the computer vision field. Convolutional neural networks have shown outstanding performance not only on simple image classification but also on more difficult tasks such as object segmentation and detection. However, many such deep learning models rely on supervised learning, which requires annotations more detailed than image-level labels. Semantic segmentation models in particular require pixel-level annotations for training, which are very costly to obtain. To solve these problems, this paper proposes a weakly-supervised semantic segmentation method that requires only image-level labels to train the network. Existing weakly-supervised learning methods are limited in that they detect only specific areas of an object. In contrast, we use a multi-classifier deep learning architecture so that our model recognizes more of the different parts of objects. The proposed method is evaluated on the VOC 2012 validation dataset.

An Analytical Study on Automatic Classification of Domestic Journal articles Using Random Forest (랜덤포레스트를 이용한 국내 학술지 논문의 자동분류에 관한 연구)

  • Kim, Pan Jun
    • Journal of the Korean Society for Information Management / v.36 no.2 / pp.57-77 / 2019
  • Random Forest (RF), a representative ensemble technique, was applied to the automatic classification of journal articles in the field of library and information science. In particular, I ran experiments on the main factors, such as the number of trees, feature selection, and training set size, and measured their effect on the performance of automatically assigning class labels to domestic journal articles. Through this, I explored ways to optimize the performance of RF for imbalanced datasets in real environments. Consequently, for the automatic classification of domestic journal articles, RF can be expected to perform best with a tree number in the interval 100~1000 (C), a small feature set (10%) selected by the chi-square statistic (CHI), and the largest training sets (9-10 years).
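
As a rough illustration of chi-square feature selection, the sketch below scores each term with the standard 2x2 chi-square statistic and keeps a fraction of the top-scoring terms. The function names, the set-of-terms document representation, and the max-over-classes scoring rule are assumptions for illustration; the abstract does not specify how the study computed or aggregated the statistic.

```python
from typing import Dict, List, Set


def chi_square(n11: int, n10: int, n01: int, n00: int) -> float:
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: docs in the class containing the term
    n10: docs outside the class containing the term
    n01: docs in the class without the term
    n00: docs outside the class without the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    if denom == 0:
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_top_features(docs: List[Set[str]], labels: List[str],
                        fraction: float = 0.10) -> List[str]:
    """Return the top `fraction` of terms by max chi-square over classes."""
    vocab = sorted(set().union(*docs))
    scores: Dict[str, float] = {}
    for term in vocab:
        best = 0.0
        for c in set(labels):
            pairs = list(zip(docs, labels))
            n11 = sum(1 for d, y in pairs if term in d and y == c)
            n10 = sum(1 for d, y in pairs if term in d and y != c)
            n01 = sum(1 for d, y in pairs if term not in d and y == c)
            n00 = sum(1 for d, y in pairs if term not in d and y != c)
            best = max(best, chi_square(n11, n10, n01, n00))
        scores[term] = best
    k = max(1, round(len(vocab) * fraction))
    return sorted(vocab, key=lambda t: scores[t], reverse=True)[:k]
```

The selected terms would then define the reduced feature set passed to a random forest classifier; the 10% default mirrors the "small feature set" mentioned in the abstract.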

Web Application for Creating Emotional ID Photos using Deep Learning (딥러닝을 활용한 감성 증명사진 제작 웹 애플리케이션)

  • Kim, Do Young;Kang, In Yeong;Kim, Yeon Su;Park, Goo man
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.1261-1264 / 2022
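  • Emotional ID photos, which are shot against a background color that suits the subject, have recently become popular. Because finding each person's personal color and applying it as the background color is demanding in time, cost, and labor, we built a deep learning-based system that automatically finds a background color suited to the individual and composites the photo to produce an emotional ID photo. In this paper, we use deep learning based on a Convolution Neural Network to perform Image Matting and Multi-Label Classification, train a model on existing emotional ID photos, and propose a web application that uses this system to provide users with an emotional ID photo with a new background color applied.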

Multi-label Open Intent Classification using Known Intent Information (의도 정보를 활용한 다중 레이블 오픈 의도 분류)

  • Nahyeon Park;Seongmin Cho;Hyun-Je Song
    • Annual Conference on Human and Language Technology / 2023.10a / pp.479-484 / 2023
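  • Multi-label open intent classification combines multi-intent classification and open intent classification: it is a multi-intent classification problem that assumes an open domain. An utterance can contain several intents. In this setting, rather than only determining whether an intent is predefined, the model must be able to classify which intent it is, at least for the predefined intents. In this paper, we propose a model that classifies multi-label open intents using the intent information in the utterance. First, the number of intents in the sentence is predicted. Then a multi-label intent classifier performs multi-label intent classification to obtain intent information. The number of intents in the obtained intent information is compared with the total number of intents, and if the total number is larger, an open intent is judged to be present. In experiments, the proposed method achieves an accuracy of 94.49 and an F1 of 97.44 in the 75% intent setting of MixATIS, and an accuracy of 86.92 and an F1 of 92.96 on MixSNIPS.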
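
The count-comparison rule described in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' model: the function name `detect_open_intent`, the per-intent probability dictionary, the 0.5 threshold, and the intent names in the usage example are all hypothetical.

```python
from typing import Dict, List, Tuple


def detect_open_intent(predicted_count: int,
                       intent_probs: Dict[str, float],
                       threshold: float = 0.5) -> Tuple[List[str], bool]:
    """Flag an open (unknown) intent by comparing intent counts.

    predicted_count: output of a separate intent-count predictor (assumed).
    intent_probs: per-intent scores from a multi-label classifier over the
        known intents (assumed to be independent probabilities).
    Returns the detected known intents and whether an open intent is present.
    """
    known = [name for name, p in intent_probs.items() if p >= threshold]
    # More intents predicted in total than known intents detected
    # means at least one intent lies outside the predefined set.
    has_open = predicted_count > len(known)
    return known, has_open


if __name__ == "__main__":
    probs = {"book_flight": 0.9, "check_fare": 0.8, "order_meal": 0.1}
    print(detect_open_intent(3, probs))
```

In this usage example, three intents are predicted but only two known intents pass the threshold, so the utterance is flagged as containing an open intent.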

Smart Contract Vulnerability Detection Study Based on Control Flow Graphs (제어 흐름 그래프 기반 스마트 컨트랙트 취약성 탐지 연구)

  • Yoo-Young Cheong;La Yeon Choi;Dong-Hyuk Im
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.1247-1249 / 2023
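  • A smart contract is a program that runs on a blockchain and can handle complex business logic. However, malicious use that exploits the integrity of the blockchain and the condition-driven execution of contracts has made smart contracts an urgent problem in blockchain security. As a result, smart contract vulnerability detection has recently been studied extensively. Most existing studies, however, focus only on detecting whether a single type of vulnerability is present, which makes it difficult to identify several types of vulnerabilities at the same time. To address this problem, this study proposes a model that builds on the control flow graph of smart contract source code, learns the graph structure with a neural network that considers the graph's forward edges and backward edges, and then performs graph multi-label classification to detect multiple vulnerabilities.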

Proposal of Git's Commit Message Complex Classification Model for Efficient S/W Maintenance (효율적인 S/W 유지관리를 위한 Git의 커밋메시지 복합 분류모델 제안)

  • Choi, Ji-Hoon;Kim, Jae-Woong;Lee, Youn-Yeoul;Chae, Yi-Geun;Kim, Joon-Yong
    • Proceedings of the Korean Society of Computer Information Conference / 2022.07a / pp.123-125 / 2022
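  • Git commit messages store and manage the issues that arise and the code change history as a project progresses, so they are closely tied to software maintenance and the project life cycle. Accurate analysis of Git commit messages strongly affects the efficient management of time and cost during software development and maintenance. Previous studies attempted accurate analysis by classifying Git commit messages into three types of software maintenance and mapping them accordingly, but reported an accuracy of at most 87%. Because this accuracy is low, applying these approaches to the development and maintenance of real projects is risky and difficult. In this paper, we survey prior work on commit message classification, extract the process and characteristics of each study, and propose a composite commit classification model that uses them to improve classification accuracy.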

Malicious Traffic Classification Using Mitre ATT&CK and Machine Learning Based on UNSW-NB15 Dataset (마이터 어택과 머신러닝을 이용한 UNSW-NB15 데이터셋 기반 유해 트래픽 분류)

  • Yoon, Dong Hyun;Koo, Ja Hwan;Won, Dong Ho
    • KIPS Transactions on Software and Data Engineering / v.12 no.2 / pp.99-110 / 2023
  • This study proposes a method for classifying malicious network traffic using a cyber threat framework (Mitre ATT&CK) and machine learning to address the real-time traffic detection problems faced by current security monitoring systems. We mapped a network traffic dataset called UNSW-NB15 to the Mitre ATT&CK framework to transform its labels and generated the final dataset through rare class processing. After training several boosting-based ensemble models on the final dataset, we evaluated how these models classify network traffic using various performance metrics. Based on the F1 score, XGBoost with no rare class processing performed best in the multi-class traffic environment. Machine learning ensemble models built with Mitre ATT&CK label conversion and oversampling differ from existing studies, but they have limitations due to (1) the inability to match labels perfectly when converting between existing datasets and Mitre ATT&CK labels and (2) the presence of excessively sparse classes. Nevertheless, CatBoost with B-SMOTE achieved a classification accuracy of 0.9526, which suggests that it can automatically detect normal and abnormal network traffic.

No-Reference Image Quality Assessment based on Quality Awareness Feature and Multi-task Training

  • Lai, Lijing;Chu, Jun;Leng, Lu
    • Journal of Multimedia Information System / v.9 no.2 / pp.75-86 / 2022
  • Existing image quality assessment (IQA) datasets contain a small number of samples, and some methods based on transfer learning or data augmentation cannot make good use of image quality-related features. A No-Reference (NR)-IQA method based on multi-task training and quality awareness is proposed. First, single or multiple distortion types and levels are imposed on the original image, and different strategies are used to augment the datasets for different distortion types. Following the idea of weak supervision, we use Full-Reference (FR)-IQA methods to obtain pseudo-score labels for the generated images. Then, we combine the classification information for distortion type and level with the image quality score information. The ResNet50 network is trained in the pre-training stage on the augmented dataset to obtain more quality-aware pre-trained weights. Finally, fine-tuning is performed on the target IQA dataset using the quality-aware weights to predict the final quality score. Experiments on synthetic and authentic distortion datasets (LIVE, CSIQ, TID2013, LIVEC, KonIQ-10K) show that the proposed method utilizes image quality-related features better than a method using only single-task training. The extracted quality-aware features improve the accuracy of the model.

Identifying sources of heavy metal contamination in stream sediments using machine learning classifiers (기계학습 분류모델을 이용한 하천퇴적물의 중금속 오염원 식별)

  • Min Jeong Ban;Sangwook Shin;Dong Hoon Lee;Jeong-Gyu Kim;Hosik Lee;Young Kim;Jeong-Hun Park;ShunHwa Lee;Seon-Young Kim;Joo-Hyon Kang
    • Journal of Wetlands Research / v.25 no.4 / pp.306-314 / 2023
  • Stream sediments are an important component of water quality management because they are receptors of various pollutants, such as heavy metals and organic matter emitted from upland sources, and can act as secondary pollution sources that adversely affect the water environment. Managing stream sediments effectively requires identifying the primary sources of sediment contamination and developing source-associated control strategies. We evaluated the performance of machine learning models in identifying primary sources of sediment contamination based on the physico-chemical properties of stream sediments. A total of 356 stream sediment data sets covering 18 quality parameters, including 10 heavy metal species (Cd, Cu, Pb, Ni, As, Zn, Cr, Hg, Li, and Al), 3 soil parameters (clay, silt, and sand fractions), and 5 water quality parameters (water content, loss on ignition, total organic carbon, total nitrogen, and total phosphorus), were collected near abandoned metal mines and industrial complexes across the four major river basins in Korea. Two machine learning algorithms, linear discriminant analysis (LDA) and support vector machine (SVM) classifiers, were used to classify the sediments into four cases defined by combinations of sampling period and location (i.e., mine in dry season, mine in wet season, industrial complex in dry season, and industrial complex in wet season). Both models performed well in the classification, with SVM outperforming LDA; the accuracies of LDA and SVM were 79.5% and 88.1%, respectively. An SVM ensemble model was used for multi-label classification of multiple contamination sources, including land uses in the upland areas within a 1 km radius of the sampling sites. The results showed that the multi-label classifier performed comparably to the single-label SVM in classifying mines and industrial complexes but was less accurate in classifying dominant land uses (50~60%). The poor performance of the multi-label SVM is likely due to overfitting caused by the small data set relative to the complexity of the model. A larger data set might improve the performance of the machine learning models in identifying contamination sources.