• Title/Summary/Keyword: Detecting illegal gambling sites

Development of an Intelligent Illegal Gambling Site Detection Model Based on Tag2Vec (Tag2vec 기반의 지능형 불법 도박 사이트 탐지 모형 개발)

  • Song, ChanWoo; Ahn, Hyunchul
    • Journal of Intelligence and Information Systems, v.28 no.4, pp.211-227, 2022
  • Illegal gambling through online gambling sites has become a significant social problem. The development of Internet technology and the spread of smartphones have led to a proliferation of illegal gambling sites, making illegal online gambling accessible to anyone. To mitigate its negative effects, the Korean government tries to detect illegal gambling sites using self-monitoring agents and reporting systems such as 'Nuricops.' However, limitations such as a lack of staffing make it difficult to detect all illegal sites. Accordingly, several scholars have proposed intelligent techniques for detecting illegal gambling sites. Xu et al. (2019) found that fake or illegal websites generally have distinctive features in their HTML tag structure, which implies that the HTML tag structure can be important for detecting illegal sites. However, few prior studies have used the HTML tag structure to improve the performance of illegal site detection models. Against this background, our study aims to improve detection performance by utilizing the HTML tag structure and proposes Tag2Vec, a modified version of Doc2Vec, as a method for properly vectorizing that structure. To validate the proposed model, we performed an empirical analysis using a data set consisting of harmful sites listed on 'The Cheat' and normal sites collected through Google search. The results confirmed that the proposed Tag2Vec-based detection model achieved better classification accuracy, recall, and F1 score than the URL-based detection model used as the comparison model. The proposed model is expected to be used effectively to improve the health of our society through intelligent technology.
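
The abstract does not spell out how Tag2Vec prepares its input, so the following is only a minimal sketch of one plausible preprocessing step: reducing a page to its sequence of HTML tag names, which could then serve as the "words" of a document for a Doc2Vec-style model. The class name `TagSequenceParser`, the function `html_to_tag_tokens`, the `/tag` convention for closing tags, and the sample page are all assumptions made for illustration, not the authors' implementation.

```python
from html.parser import HTMLParser


class TagSequenceParser(HTMLParser):
    """Collects opening and closing tag names in document order (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Attributes and text content are ignored; only the structure is kept.
        self.tags.append(tag)

    def handle_endtag(self, tag):
        # Assumed convention: mark closing tags with a leading slash.
        self.tags.append("/" + tag)


def html_to_tag_tokens(html):
    """Return the page's tag names as a flat token list."""
    parser = TagSequenceParser()
    parser.feed(html)
    parser.close()
    return parser.tags


# Made-up sample page, used only to show the shape of the output.
sample_html = "<html><body><div><a href='#'>x</a></div></body></html>"
tokens = html_to_tag_tokens(sample_html)
print(tokens)
```

Each page's token list could then be treated as one document and passed to an existing Doc2Vec implementation to obtain a fixed-length vector for a classifier. How Tag2Vec actually modifies Doc2Vec is not described in the abstract.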