Ethereum Phishing Scam Detection Based on Graph Embedding and Semi-Supervised Learning

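  • Y. Y. Cheong (정유영; Department of Artificial Intelligence Applications, Kwangwoon University) ;
  • K. T. Kim (김경태; Department of Artificial Intelligence Applications, Kwangwoon University) ;
  • D. H. Im (임동혁; School of Information Convergence, Kwangwoon University)
  • Received : 2022.12.19
  • Reviewed : 2023.02.14
  • Published : 2023.05.31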

Abstract

With the recent rise of blockchain technology, the number of cryptocurrency platforms built on it has grown and cryptocurrency trading has become increasingly active. At the same time, crimes that exploit the characteristics of cryptocurrency are also increasing. Phishing scams in particular account for more than half of Ethereum cybercrime and are regarded as a major security threat, so effective phishing scam detection methods are urgently needed. However, labeled phishing addresses are scarce relative to the full set of Ethereum account addresses, and the resulting data imbalance makes it difficult to provide enough data for supervised learning. To address this, this paper proposes a phishing scam detection method that combines trans2vec, an effective graph embedding technique that takes the Ethereum transaction network into account, with the semi-supervised learning model tri-training, so as to make the most of unlabeled data as well as labeled data.
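The abstract does not spell out the training procedure, so the following is only a minimal sketch of the general tri-training idea: three classifiers are trained, and each one is retrained on unlabeled samples that the other two agree on. It is a simplified variant that omits the error-rate checks of the full algorithm, and it is not the authors' implementation. The names `NearestCentroid`, `stratified_bootstrap`, `tri_training`, and `predict_majority`, the fixed number of rounds, and the toy 2-D vectors standing in for trans2vec account embeddings are all assumptions made for illustration.

```python
import random
from collections import Counter


class NearestCentroid:
    """Tiny stand-in base classifier (hypothetical; any classifier could be used)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
        # One centroid per class: the mean of that class's vectors.
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids, key=lambda c: dist(x, self.centroids[c]))
                for x in X]


def stratified_bootstrap(X, y, rng):
    """Resample with replacement inside each class so no class disappears."""
    by_class = {}
    for x, label in zip(X, y):
        by_class.setdefault(label, []).append(x)
    Xb, yb = [], []
    for label, rows in by_class.items():
        for _ in rows:
            Xb.append(rng.choice(rows))
            yb.append(label)
    return Xb, yb


def tri_training(X_lab, y_lab, X_unl, rounds=5, seed=0):
    """Simplified tri-training: fixed rounds, no error-rate conditions."""
    rng = random.Random(seed)
    # Step 1: three initial classifiers from bootstrap samples of the labeled set.
    models = [NearestCentroid().fit(*stratified_bootstrap(X_lab, y_lab, rng))
              for _ in range(3)]
    for _ in range(rounds):
        preds = [m.predict(X_unl) for m in models]
        new_models = []
        for i in range(3):
            j, k = [t for t in range(3) if t != i]
            X_new, y_new = list(X_lab), list(y_lab)
            # Step 2: pseudo-label samples on which the other two classifiers agree.
            for x, pj, pk in zip(X_unl, preds[j], preds[k]):
                if pj == pk:
                    X_new.append(x)
                    y_new.append(pj)
            # Step 3: retrain classifier i on labeled plus pseudo-labeled data.
            new_models.append(NearestCentroid().fit(X_new, y_new))
        models = new_models
    return models


def predict_majority(models, X):
    """Final prediction is the majority vote of the three classifiers."""
    votes = zip(*(m.predict(X) for m in models))
    return [Counter(v).most_common(1)[0][0] for v in votes]


# Toy data (assumed): 0 = normal account, 1 = phishing account.
X_lab = [[0, 0], [1, 0], [0, 1], [10, 10], [9, 10], [10, 9]]
y_lab = [0, 0, 0, 1, 1, 1]
X_unl = [[0.5, 0.5], [1, 1], [9.5, 9.5], [9, 9]]

models = tri_training(X_lab, y_lab, X_unl)
labels = predict_majority(models, [[0.2, 0.3], [9.8, 9.7]])
```

In a real pipeline the input vectors would come from a graph embedding of the transaction network, and the base classifier would be a stronger model. The sketch only shows how agreement between two classifiers lets unlabeled addresses contribute to training the third.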

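Funding Information

This work was supported by the National Research Foundation of Korea, funded by the Korean government (Ministry of Science and ICT) (No.NRF-2021R1F1A1054739). This research was also carried out as a result of the University ICT Research Center support program of the Ministry of Science and ICT and the Institute for Information & Communications Technology Promotion (IITP-2023-2018-0-01417).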
