• Title/Summary/Keyword: Web searching robot

Search Result 8, Processing Time 0.041 seconds

Efficient Design of Web Searching Robot Engine Using Distributed Processing Method with Javascript Function (자바스크립트 함수처리 기능을 포함한 분산처리 방식의 웹 수집 로봇의 설계)

  • Kim, Dae-Yu;Kim, Jung-Tae
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.12 / pp.2595-2602 / 2009
  • In this paper, we propose and implement a web robot that uses the Active Script Engine of Internet Explorer to process JavaScript function links, which conventional web searching robots cannot follow. The robot was developed to measure the collection volume achieved when JavaScript function links are processed. To build it, we analyzed the architecture of existing web searching robots such as Googlebot and Naybot, implemented the components required for a searching robot, and designed the system as a distributed processing architecture. We then added the proposed JavaScript processing model to the robot and compared collection volumes on web site boards that use JavaScript. For a board with 1,000 posts, the proposed robot collected more than 1,000 web pages, whereas the conventional method collected only a single page.

Design of Web Robot Engine Using Distributed Collection Model Processing (분산수집 모델을 이용한 웹 로봇의 설계 및 구현)

  • Kim, Dae-Yu;Kim, Jung-Tae
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.1 / pp.115-121 / 2010
  • As the Internet becomes widespread, a great deal of information is made public, and Internet users can access it effectively through web search services. Building a web search service requires a collection method for gathering web pages. As the number of web pages grows, high-quality information must be collected for search, and a variety of web search engines and collection mechanisms have therefore been developed. This paper presents a method for extracting JavaScript links from dynamic web pages and the design of a web searching robot around it. For performance evaluation we measured the proposed method against a fixed target: searching took 2 minutes 67 seconds for 299 web pages with one searching model, and 12.33 seconds with 10 searching models.
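
The comparison of one searching model against ten suggests the collection work is fanned out across parallel collectors. A minimal sketch of that distribution, assuming a caller-supplied `fetch` function (stubbed here; a real robot would issue HTTP requests and extract links):

```python
from concurrent.futures import ThreadPoolExecutor

def crawl(urls, fetch, workers=10):
    """Distribute page fetches across `workers` parallel collectors.

    `fetch` is supplied by the caller; the thread pool stands in for
    the paper's multiple searching models running concurrently.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with urls.
        return list(pool.map(fetch, urls))
```

Because page fetching is I/O-bound, even this thread-based sketch overlaps the network waits that dominate a single sequential collector.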

A Study on Design and Development of Web Information Collection System Based Compare and Merge Method (웹 페이지 비교통합 기반의 정보 수집 시스템 설계 및 개발에 대한 연구)

  • Jang, Jin-Wook
    • Journal of Information Technology Services / v.13 no.1 / pp.147-159 / 2014
  • Recently, the quantity of information accessible from the Internet has increased dramatically, so searching the Web for useful information has become increasingly difficult. Much research has therefore been done on web robots that filter Internet information based on user interest. When a web site that a user wants to visit is found, its content is searched by following the search list or the site's links in order. This process takes longer as the number of pages or sites increases, so its performance needs to be improved. To minimize unnecessary searching by web robots, this paper proposes an efficient information collection system based on a compare-and-merge method. In the proposed system, a web robot initially collects information from the web sites that users register. On each subsequent visit, the robot compares what it collected with what the sites currently hold, and if they differ, it updates its copy. Only updated web page information is classified by subject and provided to users, so users can reach updated information quickly.
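
The compare-and-merge cycle described above amounts to keeping a snapshot per registered URL and re-processing a page only when its content differs from the snapshot. A minimal sketch, using a content hash for the comparison (the paper does not specify its comparison mechanism, so the hashing detail is an assumption):

```python
import hashlib

class CompareMergeCollector:
    """Keep a per-URL snapshot; on a revisit, update only changed pages."""

    def __init__(self):
        self.snapshots = {}  # url -> (digest, content)

    def visit(self, url: str, content: str) -> bool:
        """Record `content` for `url`; return True if it was new or changed."""
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        old = self.snapshots.get(url)
        if old is not None and old[0] == digest:
            return False          # unchanged: skip re-classification
        self.snapshots[url] = (digest, content)
        return True               # new or updated: classify and serve to users
```

Only visits that return `True` feed the subject-classification step, which is exactly where the proposed system saves the unnecessary work.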

Implementation of Web Searching Robot for Detecting of Phishing and Pharming in Homepage (홈페이지에 삽입된 악성코드 및 피싱과 파밍 탐지를 위한 웹 로봇의 설계 및 구현)

  • Kim, Dae-Yu;Kim, Jung-Tae
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.11 / pp.1993-1998 / 2008
  • This paper proposes a web robot engine for finding web server vulnerabilities and malicious code. Its main function is based on a search technique derived from an analysis of threats to private information. We implemented a method for detecting phishing, pharming, and malicious code on home pages in vulnerable environments. We propose a novel approach that is independent of any specific phishing implementation; the idea is to examine anomalies in web pages.

Numerical Formula and Verification of Web Robot for Collection Speedup of Web Documents

  • Kim Weon;Kim Young-Ki;Chin Yong-Ok
    • Journal of Internet Computing and Services / v.5 no.6 / pp.1-10 / 2004
  • A web robot is software that tracks and collects web documents on the Internet [1]. The performance scalability of recent web robots has reached its limit as the number of web documents has increased sharply with the continuing rapid growth of the Internet. Accordingly, the performance scalability of searching and collecting documents on the web strongly demands study. This thesis presents the design of a Multi-Agent based web robot that speeds up document collection, in contrast to a sequentially executing web robot based on the existing Fork-Join method, together with an analysis of its performance scalability. For collection speedup, the Multi-Agent based web robot divides inactive ("dead-link") URLs, which arise from overloaded web documents or temporary network or web-server failures, among the agents and processes them independently. Each agent consists of four components: Loader, Extractor, Active URL Scanner, and Inactive URL Scanner. The thesis models the Multi-Agent based web robot on Amdahl's Law, introduces a numerical formula for collection speedup, and verifies the performance improvement by comparing the formula's predictions with experimental data. Moreover, a Dynamic URL Partition algorithm is introduced and realized to minimize the workload on any one web server by maximizing the interval between requests to the same collection-target server.
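
The speedup model the thesis builds on is Amdahl's Law: with n agents and a parallelizable fraction p of the collection work, the speedup is S(n) = 1 / ((1 - p) + p/n). A direct sketch of that formula (the value of p for a real crawl is an empirical input, not something given here):

```python
def amdahl_speedup(parallel_fraction: float, agents: int) -> float:
    """Amdahl's Law speedup S(n) = 1 / ((1 - p) + p / n).

    `parallel_fraction` is the share p of collection work that the
    agents can perform concurrently; the serial remainder (1 - p)
    bounds the achievable speedup regardless of agent count.
    """
    p = parallel_fraction
    return 1.0 / ((1.0 - p) + p / agents)
```

For example, a fully parallel workload (p = 1.0) scales linearly with the agent count, while p = 0.9 caps the speedup below 10 no matter how many agents are added, which is why the dead-link handling that inflates the serial fraction matters so much for collection speedup.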


Development of an online robot education community based on Web 2.0 (웹2.0 기반 온라인 로봇교육 커뮤니티의 개발)

  • Sung, Young-Hoon;Ha, Seok-Wun
    • Journal of The Korean Association of Information Education / v.13 no.3 / pp.273-280 / 2009
  • The Internet has become a new communication tool in the knowledge and information society, giving people a wider place to exchange information and opinions. Recently, robot education institutions have provided official homepages to introduce their robot education resources. However, because these sites offer only restrictive search functions and general resources, and provide no place where teachers can express their thoughts and share common interests with other users, an online community for robot education has not formed among teachers and users. In this paper, we propose an Online Robot Education Community (OREC) in which teachers and users from different robot education institutions can exchange and share technical information, learn robot techniques, discuss their work experiences, share common interests, and receive the latest news in real time.


The comparative effectiveness and evaluation study of user groups of the various web search tools (다양한 형태의 웹 탐색도구의 이용자집단간 비교효용성 및 평가에 관한 연구)

  • 박일종;윤명순
    • Journal of Korean Library and Information Science Society / v.31 no.1 / pp.87-114 / 2000
The purpose of this study is to offer an appropriate system and training program to help system designers and trainers, in addition to analyzing information-use behavior around web search tools and evaluating those tools by user group. The results of the study are as follows. (1) For web search tools, age is more worth considering than other demographic variables. (2) A directory search tool is the desirable design when the tool serves a student user group. (3) An intelligent search tool is more appropriate for students using keyword search tools than any other tool. (4) A discussion about a standard classification of web information should take place soon, because users feel confused when using web search tools owing to the absence of a standard mode of classification. (5) Librarians need to recognize data on the Internet as a source of information and to provide active service and user training programs for it, because student users hardly get help from librarians or library orientation in learning how to use web search tools. (6) Internet experience and years of computer use affected users' ability with web search tools, whereas computer use experience, library use experience, and Online Public Access Catalog (OPAC) use experience had no effect. In particular, OPAC experience had no effect on the student group's ability because students had no background in the Internet or web search tools and did not recognize the difference in search method between a web search tool and an OPAC. (7) For web search tools, it is important that a searching robot automatically index the growing body of web resources; for student users, however, indexing by an indexing expert is much more needed, owing to their weakness in selecting and combining keywords.


Design of Web Searching Robot Engine Using Distributed Processing Method Application to Javascript Function Processing (자바스크립트 함수 처리가 가능한 분산처리 방식의 웹 수집 로봇의 설계)

  • Kim, Dae-Yu;Nam, Ki-Hyo;Kim, Jung-Tae
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.05a / pp.289-292 / 2008
  • To process JavaScript function links, which existing web collection robots cannot handle, we used the Active Script Engine of Internet Explorer. We also developed a web collection robot to measure the collection volume achieved when JavaScript function links are processed. To develop it, we studied the structure of web collection robots such as Googlebot and Naybot, implemented the components used in a collection robot, and designed and built a distributed-processing web collection robot. We then added the proposed JavaScript processing model to the robot and evaluated its performance by comparing collection volumes on the boards of web sites that use JavaScript. For a web site board with 1,000 posts, an ordinary web robot collected only one page, while the proposed robot collected more than 1,000 web pages.
