• Title/Summary/Keyword: graph-based semi-supervised learning (그래프 기반 준지도 학습)

Search Result 6

A Label Inference Algorithm Considering Vertex Importance in Semi-Supervised Learning (준지도 학습에서 꼭지점 중요도를 고려한 레이블 추론)

  • Oh, Byonghwa; Yang, Jihoon; Lee, Hyun-Jin
    • Journal of KIISE / v.42 no.12 / pp.1561-1567 / 2015
  • Semi-supervised learning is an area of machine learning that uses both labeled and unlabeled data to train a model, and it has the potential to improve prediction performance compared with supervised learning. Graph-based semi-supervised learning, which has recently come into focus, consists of two phases: graph construction, which converts the input data into a graph, and label inference, which predicts labels for the unlabeled data using the constructed graph. The inference relies on the smoothness assumption of semi-supervised learning. In this study, we propose an enhanced label inference algorithm that incorporates the importance of each vertex. In addition, we prove the convergence of the proposed algorithm and verify its effectiveness.
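
The two phases named in this abstract, graph construction and label inference, can be illustrated with a generic label-spreading scheme. This is a minimal sketch and not the vertex-importance algorithm proposed in the paper: the Gaussian affinity, the symmetric normalization, and the names `propagate_labels`, `sigma`, `alpha`, and `n_iter` are all assumptions chosen for illustration.

```python
import numpy as np


def propagate_labels(X, y, n_classes, sigma=1.0, alpha=0.9, n_iter=200):
    """Generic label spreading; y holds class indices, with -1 for unlabeled points."""
    # Phase 1 (graph construction): dense Gaussian affinity, an assumed choice.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops

    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Phase 2 (label inference): one-hot rows for labeled points, zeros otherwise.
    Y = np.zeros((X.shape[0], n_classes))
    labeled = y >= 0
    Y[labeled, y[labeled]] = 1.0

    # Smoothness assumption: each vertex repeatedly takes on its neighbors'
    # scores while staying anchored to the initial labels.
    F = Y.copy()
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1.0 - alpha) * Y
    return F.argmax(axis=1)
```

Called with two well-separated clusters and one labeled point per cluster, the function should assign every point the label of its own cluster. A method that weights vertices by importance, as the abstract describes, would modify the update step; the abstract does not say how.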

Graph Construction Based on Fast Low-Rank Representation in Graph-Based Semi-Supervised Learning (그래프 기반 준지도 학습에서 빠른 낮은 계수 표현 기반 그래프 구축)

  • Oh, Byonghwa; Yang, Jihoon
    • Journal of KIISE / v.45 no.1 / pp.15-21 / 2018
  • Low-Rank Representation (LRR) based methods are widely used in many practical applications, such as face clustering and object detection, because they can achieve high prediction accuracy when used to construct graphs in graph-based semi-supervised learning. However, solving the LRR problem requires a singular value decomposition of a square matrix whose size equals the number of data points at every iteration of the algorithm, which makes the computation inefficient. To solve this problem, we propose an improved and faster LRR method based on the recently published Fast LRR (FaLRR), and we suggest ways to introduce and optimize additional constraints on the underlying optimization goal, because FaLRR is fast but performs poorly on classification problems. Our experiments confirm that the proposed method finds a better solution than LRR does. We also propose Fast MLRR (FaMLRR), which shows better results when the goal of minimizing is added.
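
The per-iteration cost mentioned in this abstract can be made concrete with a short sketch. It assumes the standard nuclear-norm formulation of LRR, in which each solver iteration applies singular value thresholding to an n × n matrix, where n is the number of data points. The function name `singular_value_threshold` and the parameter `tau` are illustrative and do not come from the paper.

```python
import numpy as np


def singular_value_threshold(M, tau):
    """Shrink every singular value of M by tau, clipping at zero."""
    # The full SVD is the expensive step: roughly cubic in n for an n x n matrix.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Rebuild the matrix from the shrunken spectrum; small singular values vanish,
    # which is what lowers the rank.
    return (U * s_shrunk) @ Vt
```

Repeating this step on a matrix whose side length equals the number of data points is the inefficiency the abstract refers to. How FaLRR and the proposed method avoid it is not described in the abstract.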

Semi-supervised classification with LS-SVM formulation (최소제곱 서포터벡터기계 형태의 준지도분류)

  • Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society / v.21 no.3 / pp.461-470 / 2010
  • Semi-supervised classification, which uses both labeled and unlabeled data, has received considerable attention in recent years. Among the various methods, graph-based manifold regularization has proved to be an attractive approach. The least squares support vector machine (LS-SVM) is gaining popularity for analyzing nonlinear data. We propose a semi-supervised classification algorithm that uses least squares support vector machines and is based on manifold regularization. In this paper, we show that the proposed method can use unlabeled data efficiently.

The Construction of a Domain-Specific Sentiment Dictionary Using Graph-based Semi-supervised Learning Method (그래프 기반 준지도 학습 방법을 이용한 특정분야 감성사전 구축)

  • Kim, Jung-Ho; Oh, Yean-Ju; Chae, Soo-Hoan
    • Science of Emotion and Sensibility / v.18 no.1 / pp.103-110 / 2015
  • A sentiment lexicon is an essential element for expressing sentiment in a text or recognizing sentiment from a text. We propose a graph-based semi-supervised learning method for constructing a sentiment dictionary as a set of sentiment lexicons. In particular, we focus on constructing a domain-specific sentiment dictionary. The proposed method builds a graph from lexicons and the proximity among them, and the sentiments of lexicons whose sentiment values are already known are propagated to all lexicons on the graph. There are two typical types of sentiment lexicon, sentiment words and sentiment phrases; we construct a sentiment dictionary by building a graph for each type and inferring the sentiment of all sentiment lexicons. To verify the proposed method, we constructed a sentiment dictionary specific to the movie domain and conducted sentiment classification experiments with it. The results show that classification performance with this sentiment dictionary is better than with a typical general-purpose sentiment dictionary.

Ethereum Phishing Scam Detection based on Graph Embedding and Semi-Supervised Learning (그래프 임베딩 및 준지도 기반의 이더리움 피싱 스캠 탐지)

  • Cheong, Yoo-Young; Kim, Gyoung-Tae; Im, Dong-Hyuk
    • KIPS Transactions on Computer and Communication Systems / v.12 no.5 / pp.165-170 / 2023
  • With the recent rise of blockchain technology, the number of cryptocurrency platforms built on it is increasing, and currency transactions are being actively conducted. However, crimes that abuse the characteristics of cryptocurrency are also increasing. In particular, phishing scams account for more than half of Ethereum cybercrime and are considered a major security threat. Effective phishing scam detection methods are therefore urgently needed. However, it is difficult to provide sufficient data for supervised learning because labeled phishing addresses are scarce among Ethereum account addresses, which causes data imbalance. To address this, this paper proposes a phishing scam detection method that combines Trans2vec, an effective graph embedding technique that takes Ethereum transaction networks into account, with the semi-supervised learning model Tri-training in order to make the most of both labeled and unlabeled data.

Ethereum Phishing Scam Detection Based on Graph Embedding (그래프 임베딩 기반의 이더리움 피싱 스캠 탐지 연구)

  • Cheong, Yoo-Young; Kim, Gyoung-Tae; Im, Dong-Hyuk
    • Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.266-268 / 2022
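  • With the recent rise of blockchain technology, cryptocurrencies built on it have become a target of crime. In particular, phishing scams account for more than half of Ethereum cybercrime and are considered a major security threat. An effective phishing scam detection method is therefore urgently needed. However, because labeled phishing addresses are scarce among all nodes, the resulting data imbalance makes it difficult to provide sufficient data for supervised learning. To address this, this paper proposes a phishing scam detection method that combines trans2vec, an efficient network embedding technique that takes the Ethereum transaction network into account, with the semi-supervised learning model tri-training to make the most of unlabeled as well as labeled data.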