• Title/Summary/Keyword: Crawler

Issue Analysis on Gas Safety Based on a Distributed Web Crawler Using Amazon Web Services (AWS를 활용한 분산 웹 크롤러 기반 가스 안전 이슈 분석)

  • Kim, Yong-Young;Kim, Yong-Ki;Kim, Dae-Sik;Kim, Mi-Hye
    • Journal of Digital Convergence / v.16 no.12 / pp.317-325 / 2018
  • Governments and major private companies around the world continue to take an interest in big data and invest boldly in it, aiming to create new economic value and strengthen national competitiveness. Collecting objective data, such as news, requires that data integrity and quality be secured first. For researchers and practitioners who wish to make decisions or analyze trends based on large volumes of objective data, such as portal news, the problem with existing crawler methods is that data collection itself gets blocked. In this study, we implemented a method of collecting web data that addresses the problems of existing crawlers by using the cloud service platform provided by Amazon Web Services (AWS). In addition, we collected 'gas safety' articles and analyzed issues related to gas safety. The research confirmed that, to ensure gas safety, strategies should be established and systematically operated based on five categories: accident/occurrence, prevention, maintenance/management, government/policy, and target.

Web crawler designed utilizing server overhead optimization system (웹크롤러의 서버 오버헤드 최적화 시스템 설계)

  • Lee, Jong-Won;Kim, Min-Ji;Kim, A-Yong;Ban, Tae-Hak;Jung, Hoe-Kyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.05a / pp.582-584 / 2014
  • Optimization measures for conventional web crawlers have been developed continuously to reduce the overhead they place on servers while ensuring data integrity. As the amount of data grows exponentially, the web crawler has become indispensable for collecting the data that is needed. This paper compares and analyzes the efficiency of existing web crawler approaches. Based on the results, it proposes an optimized technique and designs a system that reduces server overhead by dynamically adjusting the web crawler's data collection cycle. This web crawler approach is expected to be used in the field of search systems.

Motion Control Algorithm for Crawler Type In-Pipe Robot (크롤러 방식 터널로봇의 모션제어 알고리즘)

  • Bae, Ki-Man;Lee, Sang-Ryong;Lee, Sang-il;Lee, Choon-Young
    • IEMEK Journal of Embedded Systems and Applications / v.3 no.2 / pp.66-73 / 2008
  • Pipes have been laid underground as industry has developed, and maintenance is required when they crack or rupture. Checking pipes for cracks is a very difficult job because the pipes are narrow and buried underground. An in-pipe robot can inspect the condition of the inner section of pipes, so we designed a crawler-type robot to search for cracked pipes. In this paper, we focus on controlling the robot with a differential drive algorithm so that it can move through curved sections of pipe. The detailed design of the robot and the experimental results show the effectiveness of the robot in pipe maintenance.
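
The differential drive idea mentioned in this abstract can be illustrated with a minimal kinematic sketch. It is not the authors' algorithm: the function name `track_speeds`, its parameters, and the units are assumptions made for illustration. It only shows the standard relation that, in a bend, the outer track must run faster than the inner one in proportion to their distances from the bend centre.

```python
def track_speeds(center_speed, turn_radius, track_width):
    """Return (inner, outer) track speeds for a turn of the given radius.

    Hypothetical helper: center_speed is the speed of the robot's centre
    line, turn_radius is measured from the bend centre to that centre line,
    and track_width is the distance between the two tracks (same length unit).
    """
    if turn_radius <= track_width / 2:
        raise ValueError("turn radius must exceed half the track width")
    # Both tracks share one angular rate about the centre of the bend.
    angular_rate = center_speed / turn_radius
    inner = angular_rate * (turn_radius - track_width / 2)
    outer = angular_rate * (turn_radius + track_width / 2)
    return inner, outer


if __name__ == "__main__":
    inner, outer = track_speeds(center_speed=1.0, turn_radius=2.0, track_width=1.0)
    print(inner, outer)
```

On a straight section the two speeds are equal; the tighter the bend, the larger the speed difference the controller has to command.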

Efficient Internet Information Extraction Using Hyperlink Structure and Fitness of Hypertext Document (웹의 연결구조와 웹문서의 적합도를 이용한 효율적인 인터넷 정보추출)

  • Hwang Insoo
    • Journal of Information Technology Applications and Management / v.11 no.4 / pp.49-60 / 2004
  • While the World-Wide Web offers an incredibly rich base of information, it is organized as hypertext and does not provide a uniform and efficient way to retrieve specific information. An efficient web crawler is therefore needed to gather useful information in an acceptable amount of time. In this paper, we studied the order in which a web crawler should visit URLs to obtain more important web pages rapidly. We also developed an internet agent for efficient web crawling that uses the hyperlink structure and the fitness of hypertext documents. An experiment on a website shows that the proposed agent outperforms other web crawlers that use the BackLink and PageRank algorithms.
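
To make the idea of URL ordering concrete, here is a small sketch of a backlink-count ordering, one of the baselines the abstract names. It is an illustration under assumptions, not the paper's agent: the in-memory `graph` dictionary stands in for fetched pages, the function name `crawl_order` is hypothetical, and the paper's document-fitness measure is not modelled because the abstract does not define it.

```python
from collections import defaultdict


def crawl_order(graph, seed):
    """Visit pages reachable from seed, always picking the frontier URL
    with the most backlinks seen so far (ties broken alphabetically)."""
    backlinks = defaultdict(int)   # URL -> number of links seen pointing to it
    frontier = {seed}              # discovered but not yet visited
    visited = []
    while frontier:
        url = min(frontier, key=lambda u: (-backlinks[u], u))
        frontier.remove(url)
        visited.append(url)
        for target in graph.get(url, []):
            backlinks[target] += 1
            if target not in visited:
                frontier.add(target)
    return visited


if __name__ == "__main__":
    # Toy link graph (assumed data): every page links to "d".
    graph = {"a": ["b", "c", "d"], "b": ["d"], "c": ["d"], "d": []}
    print(crawl_order(graph, "a"))
```

With this toy graph, page `d` is visited before `c` because it has accumulated more backlinks by then, whereas a plain breadth-first crawl would visit `c` first.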

Driving and Swing Analysis of a Crawler Type Construction Equipment Using Flexible Multibody Dynamics (탄성 다물체 해석기법을 이용한 크롤러형 건설장비의 주행 및 선회 동특성 해석)

  • 김형근;서민석
    • Transactions of the Korean Society of Automotive Engineers / v.5 no.1 / pp.101-109 / 1997
  • A tool for the dynamic simulation and design technique of the excavator plays an important role in the prediction of dynamic behavior of the excavator in the initial design stage. In this paper, a flexible multibody dynamic analysis model including track of the crawler type excavator is developed using DADS and ANSYS. Through the driving simulation of the excavator travelling over rough road track, frequency characteristics of the upper frame and cabin are obtained, and the reaction forces acting on the track rollers are also presented for the fatigue life estimation. The effect of boom vibration modes on the joint reaction forces and accelerations is presented from the swing simulation.

Design and Implementation of Web Crawler with Real-Time Keyword Extraction based on the RAKE Algorithm

  • Zhang, Fei;Jang, Sunggyun;Joe, Inwhee
    • Proceedings of the Korea Information Processing Society Conference / 2017.11a / pp.395-398 / 2017
  • In this paper, we propose a web crawler system with a keyword extraction function. Existing text-mining research on keyword extraction is mostly based on databases of documents or corpora that have already been collected, whereas the purpose of this paper is to establish a real-time keyword extraction system that extracts the keywords of a web page's text while crawling it and stores them in the database together with the text. We design and implement a crawler that incorporates the RAKE keyword extraction algorithm. The performance of the RAKE algorithm is improved by increasing the weight of important features (such as nouns appearing in the title). The experimental results show that this method is superior to the existing method and extracts keywords satisfactorily.
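
For readers unfamiliar with RAKE, the following is a minimal sketch of its basic scoring: candidate phrases are split at stopwords and punctuation, each word is scored as degree divided by frequency, and a phrase scores the sum of its words. The tiny stopword list, the function names, and the `title_boost` multiplier are assumptions for illustration; the abstract does not say how the authors weight title words.

```python
import re
from collections import defaultdict

# Deliberately tiny stopword list (assumption; real lists are much longer).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "is", "are", "to", "with", "from"}


def candidate_phrases(text):
    """Split text into runs of non-stopwords, breaking at punctuation."""
    phrases = []
    for fragment in re.split(r"[^\w\s]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)
    return phrases


def rake(text, title_words=(), title_boost=1.0):
    """Return (phrase, score) pairs sorted by descending score."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences in the phrase, itself included
    boosted = {w.lower() for w in title_words}
    scores = {}
    for phrase in phrases:
        scores[" ".join(phrase)] = sum(
            degree[w] / freq[w] * (title_boost if w in boosted else 1.0)
            for w in phrase
        )
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    text = "Web crawler for keyword extraction. The crawler stores keywords in a database."
    print(rake(text)[:2])
    print(rake(text, title_words=["keyword", "extraction"], title_boost=3.0)[:2])
```

In a crawler, a function like `rake` would be called on each page's text right after it is fetched, so keywords can be stored alongside the page.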

An Implementation and Performance Evaluation of Fast Web Crawler with Python

  • Kim, Cheong Ghil
    • Journal of the Semiconductor & Display Technology / v.18 no.3 / pp.140-143 / 2019
  • The Internet has expanded constantly and greatly, producing a vast number of web pages that change dynamically. In particular, the rapid development of wireless communication technology and the wide spread of smart devices allow information to be created quickly and changed anywhere, anytime. In this situation, web crawling, also known as web scraping, an organized and automated way of systematically navigating pages on the web and automatically searching and indexing information, has come to be used broadly in many fields. This paper aims to implement a prototype web crawler in Python and to improve its execution speed using threads on a multicore CPU. The implementation was confirmed to work by crawling reference web sites, and the performance improvement was evaluated by measuring execution speed under different thread configurations on a multicore CPU.
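
A thread-based crawler of the kind this abstract describes can be sketched with the standard library alone. This is not the paper's prototype: the names `fetch` and `crawl`, the default of four workers, and the example URL are assumptions. The sketch only shows why threads help here: page downloads mostly wait on the network, so several can be in flight at once.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one page and return its raw bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def crawl(urls, fetch_page=fetch, max_workers=4):
    """Fetch all URLs concurrently and return a {url: body} mapping.

    fetch_page is injectable so the function can be exercised offline.
    An exception raised by any fetch propagates to the caller.
    """
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        bodies = pool.map(fetch_page, urls)  # results come back in input order
        return dict(zip(urls, bodies))


if __name__ == "__main__":
    pages = crawl(["https://example.com/"], max_workers=2)
    for url, body in pages.items():
        print(url, len(body))
```

Varying `max_workers` and timing `crawl` over a fixed URL list is one simple way to compare thread configurations.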

Running stability analysis of the Semi-Crawler Type Mini-Forwarder by Using a Dynamic Analysis Program (동역학분석 프로그램을 이용한 반궤도식 임내작업차의 주행안정성 분석)

  • Kim, Jae-Hwan;Park, Sang-Jun
    • Journal of Korean Society of Forest Science / v.104 no.1 / pp.98-103 / 2015
  • This study was conducted to analyze the running stability of a semi-crawler type mini-forwarder. The running stability analysis was performed using a dynamic analysis program, RecurDyn. The physical properties of the semi-crawler type mini-forwarder were determined using a 3D CAD modeler, AutoCAD 3D. The computer simulation of stationary sideways overturning showed that the mini-forwarder runs safely on a road with a slope of no more than $20^{\circ}$ regardless of whether it is empty or loaded, whereas on a road with a slope greater than $20^{\circ}$ it is assumed to be difficult for the vehicle to run safely. In addition, it was found that the critical slope for sideways overturning gets much smaller when empty since the location of its center of gravity is elevated and much higher when it is loaded. The computer simulation of its hill-climbing ability showed that the running speed is unstable on a road with a vertical slope of $28^{\circ}$ or more, so it is assumed to be safe to drive on a road with a slope of no more than $28^{\circ}$. In the analysis of running safety when passing an obstacle, tracking the center of gravity of the front tire showed that a front tire comes off the ground at a running speed of 5 km/h when the vehicle is empty and 4 km/h when it is loaded. Tracking the location of the center of gravity of the rear wheel crawler shaft showed that the shaft does not come off the ground at the test speeds, whether the vehicle is empty or loaded.

Development and Validation of Simulation Model for Traction Power and Driving Torque Prediction of Upland Multipurpose Platform (밭농업용 다목적 플랫폼의 견인동력 및 구동토크 예측을 위한 시뮬레이션 모델 개발 및 검증)

  • Hyeon Ho Jeon;Seung Min Baek;Seung Yun Baek;Yi Su Hong;Taek Jin Kim;Yong Choi;Young Keun Kim;Sang Hee Lee;Yong Joo Kim
    • Journal of Drive and Control / v.20 no.1 / pp.16-26 / 2023
  • Although the upland field area of Korea is as high as 44.8%, platforms optimized for upland fields are insufficient. An optimized platform is needed because the upland field environment is irregular, with many slopes. In addition, because of the characteristics of agricultural operations, the platform must have sufficient traction power and torque. Therefore, in this study, a simulation model that can predict the traction power and driving torque of a crawler-type platform for upland fields was developed and validated. The simulation model was developed in Amesim (19.1, Siemens, Germany) using the specifications of the crawler platform. A measurement system was developed to validate the simulation model. The traction power of the simulation model was validated against the measured traction force and vehicle speed, and the driving torque was validated against the torque of the sprocket on the crawler system. The analysis showed that the error between the measurement and simulation results was within 10%, and it was concluded that the model can predict the traction power and driving torque of the crawler platform.
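
The validation described in this abstract rests on a simple relation: traction power is traction force multiplied by vehicle speed. The sketch below shows that calculation and a percentage error check. The function names, units, and the sample numbers are assumptions for illustration and are not data from the paper.

```python
def traction_power_kw(force_n, speed_kmh):
    """Traction power in kW from traction force (N) and vehicle speed (km/h)."""
    speed_ms = speed_kmh / 3.6          # km/h -> m/s
    return force_n * speed_ms / 1000.0  # W -> kW


def relative_error_pct(measured, simulated):
    """Absolute error of the simulated value as a percentage of the measured value."""
    return abs(simulated - measured) / abs(measured) * 100.0


if __name__ == "__main__":
    # Made-up example values, only to show how the two helpers combine.
    measured_kw = traction_power_kw(force_n=3600.0, speed_kmh=3.6)
    simulated_kw = 3.8
    print(measured_kw, relative_error_pct(measured_kw, simulated_kw))
```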

Web crawler Improvement and Dynamic process Design and Implementation for Effective Data Collection (효과적인 데이터 수집을 위한 웹 크롤러 개선 및 동적 프로세스 설계 및 구현)

  • Wang, Tae-su;Song, JaeBaek;Son, Dayeon;Kim, Minyoung;Choi, Donggyu;Jang, Jongwook
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.11 / pp.1729-1740 / 2022
  • Recently, large amounts of data have been generated as information has become more diverse and more widely used; the importance of big data analysis for collecting, storing, processing, and predicting data has increased, and the ability to collect only the necessary information is required. More than half of the web consists of text, and a lot of data is generated through the organic interaction of users. Crawling is a representative technique for collecting text data, but many crawlers are developed without consideration for web servers or administrators because they focus only on how to obtain data. In this paper, we design and implement an improved dynamic web crawler that fetches data efficiently, after examining the problems that may occur during crawling and the precautions to be taken. The crawler, which resolves the problems of the existing crawler, was designed as a multi-process program, and the work time was reduced by a factor of 4 on average.
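
The abstract does not list which precautions the authors consider, so the following is only one commonly used example of respecting web servers: checking a site's robots.txt before fetching. It uses Python's standard `urllib.robotparser`; the helper names, the user agent string `example-crawler`, and the one-second default delay are assumptions.

```python
from urllib.robotparser import RobotFileParser


def build_policy(robots_txt):
    """Parse the text of a robots.txt file into a reusable policy object."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser, url, user_agent="example-crawler"):
    """True if the robots.txt rules let this user agent fetch the URL."""
    return parser.can_fetch(user_agent, url)


def delay_for(parser, user_agent="example-crawler", default=1.0):
    """Seconds to wait between requests: the site's Crawl-delay, else a default."""
    delay = parser.crawl_delay(user_agent)
    return default if delay is None else delay


if __name__ == "__main__":
    # Sample rules (assumed), as a site might publish them.
    rules = "User-agent: *\nDisallow: /private/\nCrawl-delay: 2\n"
    policy = build_policy(rules)
    print(allowed(policy, "https://example.com/index.html"))
    print(allowed(policy, "https://example.com/private/report.html"))
    print(delay_for(policy))
```

Each worker process in a multi-process crawler could consult such a policy before requesting a page and sleep for `delay_for(...)` seconds between requests to the same host.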