• Title/Summary/Keyword: HTML selector

A Scraping Method of In-Frame Web Sources Using Python (파이썬을 이용한 프레임내 웹 페이지 스크래핑 기법)

  • Yun, Sujin;Seung, Li;Woo, Young Woon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • 2019.05a / pp.271-274 / 2019
  • In this paper, we propose a scheme for acquiring the detailed address of a web page that is displayed inside a frame and is therefore difficult to reach through ordinary web access, so that its data can be collected automatically. By combining the proposed address acquisition technique with HTML selectors, using Python and the Beautiful Soup library, we were able to automatically collect all of the bulletin board text spread across several pages. With the proposed method, a Python web scraping program can automatically collect large amounts of data from web pages regardless of the form of their address, and we expect it to be useful for big data analysis.

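The abstract does not give the paper's actual code, so the following is only a minimal sketch of the general idea of in-frame address acquisition: read the `src` attribute of each `<frame>` or `<iframe>` in the outer page and resolve it against the outer page's URL to obtain the address of the document that really holds the content. To stay self-contained it uses Python's standard-library `html.parser` as a stand-in for Beautiful Soup; with Beautiful Soup the same tags could be picked out with a selector such as `soup.select("frame, iframe")`. The names `FrameSrcParser`, `resolve_frame_urls`, `OUTER_URL`, and `OUTER_HTML`, as well as the example URLs, are hypothetical and not taken from the paper.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameSrcParser(HTMLParser):
    """Collects the src attribute of every <frame> and <iframe> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        if tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def resolve_frame_urls(outer_html, outer_url):
    """Return the absolute URL of each framed document in outer_html."""
    parser = FrameSrcParser()
    parser.feed(outer_html)
    # Frame addresses are often relative, so resolve them against the
    # address of the page that contains the frames.
    return [urljoin(outer_url, src) for src in parser.sources]


# Hypothetical outer page: the visible address never changes, while the
# bulletin board itself lives in the "main" frame.
OUTER_URL = "http://example.com/board/index.html"
OUTER_HTML = """
<html><frameset cols="20%,80%">
  <frame name="menu" src="menu.html">
  <frame name="main" src="/bbs/list.php?page=1">
</frameset></html>
"""

frame_urls = resolve_frame_urls(OUTER_HTML, OUTER_URL)
print(frame_urls)
```

Each resolved address could then be fetched and parsed on its own, with HTML selectors applied to the framed document rather than to the outer page.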