• Title/Summary/Keyword: classification model

Design and implementation of malicious comment classification system using graph structure (그래프 구조를 이용한 악성 댓글 분류 시스템 설계 및 구현)

  • Sung, Ji-Suk;Lim, Heui-Seok
    • Journal of the Korea Convergence Society, v.11 no.6, pp.23-28, 2020
  • A comment system is essential for communication on the Internet. However, some users exploit online anonymity to post malicious comments, such as inappropriate remarks about others. To protect users from malicious comments, comments must be classified as malicious or normal, and this can be implemented as text classification. Text classification is an important topic in natural language processing, and studies using pre-trained models such as BERT and graph-based models such as GCN and GAT have been actively conducted. In this study, we implemented comment classification systems using BERT, GCN, and GAT on actual published comments and compared their performance. The system using the graph-based model outperformed the one using BERT.

A new classification method using penalized partial least squares (벌점 부분최소자승법을 이용한 분류방법)

  • Kim, Yun-Dae;Jun, Chi-Hyuck;Lee, Hye-Seon
    • Journal of the Korean Data and Information Science Society, v.22 no.5, pp.931-940, 2011
  • Classification generates a rule for assigning objects to one of several categories based on a learning sample. A good classification model should classify new objects with a low misclassification error. Many types of classification methods have been developed, including logistic regression, discriminant analysis, and trees. This paper presents a new classification method using penalized partial least squares. Penalized partial least squares can make the model more robust and remedy the multicollinearity problem. This paper compares the proposed method with logistic regression and PCA-based discriminant analysis on real and artificial data. It is concluded that the new method has better power than the other methods.

Classification of Tor network traffic using CNN (CNN을 활용한 Tor 네트워크 트래픽 분류)

  • Lim, Hyeong Seok;Lee, Soo Jin
    • Convergence Security Journal, v.21 no.3, pp.31-38, 2021
  • Tor, known as the Onion Router, guarantees strong anonymity. For this reason, Tor is actively used not only for criminal activities but also for hacking attempts such as rapid port scans and the exfiltration of stolen credentials. Therefore, fast and accurate detection of Tor traffic is critical to prevent such crime attempts in advance and to secure an organization's information system. This paper proposes a novel classification model, based on a CNN (Convolutional Neural Network), that can detect Tor traffic and classify the traffic types. We use the UNB Tor 2016 dataset to evaluate the performance of our model. The experimental results show that the accuracy is 99.98% in binary classification and 97.27% in multiclass classification.

Ensemble Knowledge Distillation for Classification of 14 Thorax Diseases using Chest X-ray Images (흉부 X-선 영상을 이용한 14 가지 흉부 질환 분류를 위한 Ensemble Knowledge Distillation)

  • Ho, Thi Kieu Khanh;Jeon, Younghoon;Gwak, Jeonghwan
    • Proceedings of the Korean Society of Computer Information Conference, 2021.07a, pp.313-315, 2021
  • Timely and accurate diagnosis of lung diseases using chest X-ray images has gained much attention from the computer vision and medical imaging communities. Although previous studies have demonstrated the capability of deep convolutional neural networks by achieving competitive binary classification results, their models seemed unreliable at distinguishing multiple disease groups using a large number of X-ray images. In this paper, we aim to build an advanced approach, called Ensemble Knowledge Distillation (EKD), that significantly boosts classification accuracy compared to traditional KD methods by distilling knowledge from a cumbersome teacher model into an ensemble of lightweight student models with parallel branches trained with ground truth labels. Learning features at different branches of the student models could thus enable the network to learn diverse patterns and improve the quality of the final predictions through ensemble learning. Experiments on the well-established ChestX-ray14 dataset showed that traditional KD improves classification over the base transfer learning approach, and EKD is expected to potentially enhance classification accuracy and model generalization, especially given the imbalanced dataset and the interdependency of the 14 weakly annotated thorax diseases.
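
The distillation idea in this abstract can be illustrated with a minimal sketch. It assumes the common temperature-softened softmax formulation of knowledge distillation in a single-label setting, which may differ from the loss the authors actually used; the function names, `temperature`, and `alpha` are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Blend of a soft-target term (teacher) and a hard-label term.

    Assumption: this weighting scheme is a generic KD loss, not the
    paper's confirmed formulation.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0)
    # Cross-entropy against the ground-truth label at temperature 1
    ce = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


def ensemble_predict(branch_logits):
    """Average the class probabilities of several student branches."""
    probs = [softmax(logits) for logits in branch_logits]
    n_classes = len(probs[0])
    return [sum(p[i] for p in probs) / len(probs) for i in range(n_classes)]
```

In this sketch each student branch would be trained with `distillation_loss` against the same teacher, and `ensemble_predict` would combine the branches at inference time by simple averaging, which is only one of several possible ensembling choices.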

Machine Learning based Open Source Software Category Classification Model (머신러닝 기반의 오픈소스 SW 카테고리 분류 모델 연구)

  • Back, Seung-Chan;Choi, Hyunjae;Yun, Ho-Yeong;Joe, Yong-Joon;Shin, Dong-Myung
    • Journal of Software Assessment and Valuation, v.14 no.1, pp.9-17, 2018
  • In many respects, the use and importance of open source software among companies and individuals are increasing day by day. However, existing research on software evaluation for users and on software classification and filtering cannot deal flexibly with the characteristics of open source software, because it uses a fixed classification system. In this research, we provide a classification model that can deal flexibly with the classification of open source software and with the software category of new open source software.

Development of a Classification Model for Driver's Drowsiness and Waking Status Using Heart Rate Variability and Respiratory Features

  • Kim, Sungho;Choi, Booyong;Cho, Taehwan;Lee, Yongkyun;Koo, Hyojin;Kim, Dongsoo
    • Journal of the Ergonomics Society of Korea, v.35 no.5, pp.371-381, 2016
  • Objective: This study aims to evaluate the features of heart rate variability (HRV) and respiratory signals as indices of a driver's drowsiness and waking status, in order to develop a classification model for a driver's drowsiness and waking status using those features. Background: Driver drowsiness is one of the major causal factors in traffic accidents. This study hypothesized that applying combined bio-signals to monitor the alertness level of drivers would improve the effectiveness of drowsiness classification techniques. Method: The features of three HRV measurements (low frequency (LF), high frequency (HF), and LF/HF ratio) and two respiratory measurements (peak and rate) were acquired in monotonous car driving simulation experiments using photoplethysmogram (PPG) and respiration sensors. The experiments were repeated a total of 50 times on five healthy male participants in their 20s to 50s. The classification model was developed by selecting the optimal measurements, applying a binary logistic regression method, and performing 3-fold cross validation. Results: The power of LF, HF, and LF/HF ratio and the respiration peak in the drowsy state were reduced by 38%, 22%, 31%, and 7%, respectively, compared to those in the waking state, while the respiration rate increased by 3%. The classification sensitivity of the model using both HRV and respiratory features (91.4%) was higher than that of the model using only HRV features (89.8%) and that of the model using only respiratory features (83.6%). Conclusion: This study suggests that the classification of a driver's drowsiness and waking status may be improved by utilizing a combination of HRV and respiratory features. Application: The results of this study can be applied to the development of driver drowsiness prevention systems.
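
The two evaluation ingredients named in this abstract, classification sensitivity and 3-fold cross validation, can be sketched as follows. This is an illustrative sketch only: the helper names are hypothetical, the interleaved fold assignment is an assumption (the abstract does not say how the folds were formed), and the logistic regression model itself is omitted.

```python
def sensitivity(y_true, y_pred, positive=1):
    """Sensitivity (recall of the positive class): TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred)
             if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred)
             if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def k_fold_indices(n_samples, k=3):
    """Return k (train_indices, test_indices) pairs.

    Assumption: samples are assigned to folds in an interleaved way;
    a real study might shuffle or stratify instead.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f_idx, fold in enumerate(folds) if f_idx != i
                 for j in fold]
        splits.append((train, test))
    return splits
```

With the 50 experimental runs mentioned in the abstract, `k_fold_indices(50, 3)` would give three train/test splits; a model would be fitted on each training part and `sensitivity` computed on the held-out part, with drowsiness treated as the positive class.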

The Case of Proportional Cell Frequencies for the Two-Way Cross-Classification with Interaction

  • Kim, Jong-Duk
    • Journal of the Korean Data and Information Science Society, v.9 no.2, pp.119-138, 1998
  • The case of proportional cell frequencies for the two-way cross-classification with interaction is considered. Several types of hypotheses for general unbalanced data that are commonly used in the literature are shown, and they are written out for this particular case. A reparameterized form of the cell means model is defined to establish the reparameterized model, the orthogonal property of the model is shown using the augmented matrix, and the numerator sums of squares are computed. Different ways of producing the same analysis of variance tables are shown in both orthogonal and nonorthogonal situations.

Optimizing Intrusion Detection Pattern Model for Improving Network-based IDS Detection Efficiency

  • Kim, Jai-Myong;Lee, Kyu-Ho;Kim, Jong-Seob;Kim, Kuinam J.
    • Convergence Security Journal, v.1 no.1, pp.37-45, 2001
  • In this paper, a separated and optimized pattern database model is proposed. To improve the efficiency of a network-based IDS, the pattern database is classified on a proper basis. The classification basis is determined by the validity of a specific intrusion against a specific target. Using this model, the IDS searches only the valid patterns in the pattern database for each captured packet. As a result, the IDS can reduce the system resources spent on searching the pattern database and can therefore analyze more packets on the network. In this paper, a proper classification basis is proposed, and a pattern database classified on that basis is constructed. Its performance is verified by experimental results.
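
The partitioning idea described in this abstract can be sketched as a lookup keyed by target, so that each packet is matched only against the patterns that are valid for it. This is a hedged illustration, not the paper's design: the use of a service name as the classification basis, the function names, and the example signatures are all assumptions.

```python
from collections import defaultdict


def build_partitioned_db(patterns):
    """Group (target, signature) pairs by target.

    Assumption: the 'target' key (here a service name) stands in for
    the paper's classification basis, which the abstract does not detail.
    """
    db = defaultdict(list)
    for target, signature in patterns:
        db[target].append(signature)
    return db


def match_packet(db, target, payload):
    """Search only the signatures registered for this packet's target."""
    return [sig for sig in db.get(target, []) if sig in payload]


# Hypothetical signatures, for illustration only
patterns = [
    ("http", b"/etc/passwd"),
    ("http", b"cmd.exe"),
    ("ftp", b"SITE EXEC"),
]
db = build_partitioned_db(patterns)
hits = match_packet(db, "http", b"GET /scripts/cmd.exe HTTP/1.0")
```

Here a packet addressed to the HTTP service is compared with two signatures instead of all three, which is the kind of reduction in search work the abstract attributes to the classified pattern database.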

New Fashion Clothing Image Classification (새로운 패션 의류 이미지 분류)

  • Shin, Seong-Yoon;Lee, Hyun-Chang;Shin, Kwang-Seong;Kim, Hyung-Jin;Lee, Jae-Wan
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2021.10a, pp.555-556, 2021
  • We propose a novel method based on a deep learning model with an optimized dynamic decay learning rate and improved model structure to achieve fast and accurate classification of fashion clothing images.

Development of Image Classification Model for Urban Park User Activity Using Deep Learning of Social Media Photo Posts (소셜미디어 사진 게시물의 딥러닝을 활용한 도시공원 이용자 활동 이미지 분류모델 개발)

  • Lee, Ju-Kyung;Son, Yong-Hoon
    • Journal of the Korean Institute of Landscape Architecture, v.50 no.6, pp.42-57, 2022
  • This study aims to create a basic deep learning model for classifying the activity photos that urban park users share on social media. For the social media data, photos related to urban parks were collected through a Naver search and used for the classification model. Based on the indicators of Naturalness, Potential Attraction, and Activity, which can be used to evaluate the characteristics of urban parks, 21 classification categories were created. Urban park photos shared on Naver were collected by category, and annotated datasets were created. A custom CNN model and a transfer learning model utilizing a CNN pre-trained on the collected photo datasets were designed and subsequently analyzed. As a result, the Xception transfer learning model, which demonstrated the best performance, was selected as the urban park user activity image classification model and evaluated with several evaluation indicators. This study is meaningful in that it builds an AI model as an index for evaluating the characteristics of urban parks from user-shared photos on social media. The deep learning classification model mitigates the limitations of manual classification and can efficiently classify large numbers of urban park photos. It can therefore serve as a useful method for the monitoring and management of city parks in the future.