Web Page Similarity based on Size and Frequency of Tokens (토큰 크기 및 출현 빈도에 기반한 웹 페이지 유사도)

  • Lee, Eun-Joo;Jung, Woo-Sung
    • Journal of Information Technology Services / v.11 no.4 / pp.263-275 / 2012
  • It is becoming hard to maintain web applications because of the high complexity and duplication of web pages. However, most research on code clones focuses on code hunks, and its targets are limited to a specific language. We therefore propose GSIM, a language-independent statistical approach that detects similar pages based on the scarcity and frequency of customized tokens. The tokens, obtained by splitting pages with a given set of separators, are defined as the atomic elements for calculating the similarity between two pages. This paper gives the domain definition for web applications and the algorithms for collecting tokens, building matrices, and calculating similarity. We also conducted experiments on open source code with our GSIM tool for evaluation. The results show the applicability of the proposed method and the effects of parameters such as threshold, toughness, and token length on quality and performance.
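
The token-based comparison described in this abstract can be illustrated with a minimal sketch. This is not the GSIM algorithm itself: the separator set, the `min_len` parameter, and the length-weighted Jaccard formula are all assumptions chosen for illustration, since the abstract does not give the actual definitions.

```python
import re
from collections import Counter

# Hypothetical separator set; the paper's actual separators are not given here.
SEPARATORS = r"[\s<>/=\"']+"

def tokenize(page: str, min_len: int = 2) -> Counter:
    """Split a page on the separators and count tokens of at least min_len chars."""
    tokens = [t for t in re.split(SEPARATORS, page) if len(t) >= min_len]
    return Counter(tokens)

def similarity(page_a: str, page_b: str, min_len: int = 2) -> float:
    """Weighted Jaccard similarity (an assumed formula): each token
    occurrence contributes its length, so longer tokens weigh more."""
    a, b = tokenize(page_a, min_len), tokenize(page_b, min_len)
    shared = sum(min(a[t], b[t]) * len(t) for t in a.keys() & b.keys())
    total = sum(max(a[t], b[t]) * len(t) for t in a.keys() | b.keys())
    return shared / total if total else 0.0
```

Two pages would then be reported as similar when `similarity(page_a, page_b)` exceeds a chosen threshold, which is one of the parameters the abstract says affects quality and performance.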

Estimation of Web Page Change Behavior (웹 문서 변경 예측)

  • Kim, Sung-Jin
    • Journal of Internet Computing and Services / v.8 no.4 / pp.149-158 / 2007
  • This paper presents estimation methods that compute the probabilities of how many times web pages will be downloaded and modified, respectively, in future crawls. The methods let web database administrators avoid unnecessarily requesting undownloadable and unmodified web pages in a page group. We postulated that the change behavior of web pages is strongly related to their past change behavior. We gathered the change histories of approximately three million web pages at two-day intervals for 100 days and estimated the future change behavior of those pages. Our estimation, evaluated against the actual change behavior of the pages, worked well.
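
As a rough illustration of estimating future change behavior from past behavior, the sketch below uses the observed change rate of a page as a per-crawl probability and a binomial model for the number of modifications in future crawls. The binomial model, the independence assumption behind it, and the function names are assumptions for illustration; the abstract does not state which estimator the paper uses.

```python
from math import comb

def change_rate(history: list[bool]) -> float:
    """Fraction of past crawls in which the page was found modified."""
    return sum(history) / len(history) if history else 0.0

def modification_distribution(history: list[bool], future_crawls: int) -> list[float]:
    """Probability of exactly k modifications in the next `future_crawls`
    crawls, for k = 0..future_crawls, assuming independent crawls."""
    p = change_rate(history)
    return [
        comb(future_crawls, k) * p**k * (1 - p) ** (future_crawls - k)
        for k in range(future_crawls + 1)
    ]
```

A page whose distribution puts most of its mass on zero modifications is a candidate for being skipped in the next crawl.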

Asynchronous Web Crawling Algorithm (링크 분석을 통한 비동기 웹 페이지 크롤링 알고리즘)

  • Won, Dong-Hyun;Park, Hyuk-Gyu;Kang, Yun-Jeong;Lee, Min-Hye
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.10a / pp.364-366 / 2022
  • The web uses asynchronous methods to deliver, together, various pieces of information that have different processing speeds. The asynchronous method has the advantage of being able to respond to other events before a task is completed, but a typical crawler, which collects the information present at the time it visits a web page, has difficulty collecting information that is provided asynchronously. In addition, asynchronous web pages often keep the same web address even when the page content changes, which makes them difficult to crawl. In this paper, we propose a web crawling algorithm that accounts for asynchronous page movement by analyzing links in the web. With the proposed algorithm, it was possible to collect dictionary information on TTA terms, which is provided asynchronously.

Implementation of Web-page & Development of Size Informational Model on Fashion Electronic Commerce (패션전자상거래 치수정보모델 개발 및 웹페이지 구현)

  • Kang, Myoung-Hui;Nam, Yun-Ja;Choi, Young-Lim
    • Fashion & Textile Research Journal / v.13 no.2 / pp.205-214 / 2011
  • The purpose of this study is to develop a size information model that customers can easily recognize and use. The study also implemented a web page that applies the size information model. The web page was implemented using Apache Web Server and JAVA client-side scripting. A survey of the actual state of fashion electronic commerce shows that most firms still use the old size codes that date from the 1980s. Under the same size code, sizing systems differ by firm or item. Size intervals range from 2 to 5 cm and differ by firm. The size information gives only the size code (55, 66, etc.) or the garment size, and it is unclear whether the marked size is a body size or a garment size. Much of the marked size information was wrong. The sizing system of KS K5001 (2009) is not widely used. These problems increase losses for customers and firms through returns, exchanges, mending costs, stock, etc. Therefore, the problems should be addressed by providing correct and detailed information on size and garment, as well as by standardizing sizing systems based on KS K5001.

A Micro-Webpage Stored in NFC Tag (NFC태그에 저장 가능한 마이크로 웹페이지)

  • Choi, BokDong;Eun, SeongBae
    • IEMEK Journal of Embedded Systems and Applications / v.7 no.1 / pp.1-7 / 2012
  • A smartphone can access the Internet through a URL stored in an NFC (Near Field Communication) tag, which is used to store information about items, blogs, and web pages. Because the system works over the Internet with a URL, however, it incurs costs such as communication fees and time. If the web page itself could be stored on the tag, this communication overhead could be saved, but tags have too little memory to store it. In this paper, we introduce the Micro-Webpage technology, which can be stored in an NFC tag or a QR (Quick Response) code. To make a Micro-Webpage, we remove control tags from the web page, leaving the user's original content. The removed control tags are stored in our smartphone application as a template. The user content is also compressed with a lossless compression algorithm. When a tag is read, the stored content is decompressed and combined with the template to rebuild the original web page. We implemented a prototype of the Micro-Webpage system on the Android platform and confirmed that the prototype achieves reasonable improvements in memory savings and web page loading time.
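
The store-and-rebuild cycle described in this abstract can be sketched as follows. The template string, the `{content}` placeholder, and the choice of zlib as the lossless compressor are assumptions for illustration; the abstract does not say how the prototype represents its template or which compression algorithm it uses.

```python
import zlib

# Hypothetical template kept in the app; "{content}" marks where user text goes.
TEMPLATE = "<html><body><h1>Item</h1><p>{content}</p></body></html>"

def pack(content: str) -> bytes:
    """Compress the user content losslessly for storage on the tag."""
    return zlib.compress(content.encode("utf-8"), 9)

def unpack(payload: bytes, template: str = TEMPLATE) -> str:
    """Decompress the stored content and merge it into the template."""
    content = zlib.decompress(payload).decode("utf-8")
    return template.replace("{content}", content)
```

Only the bytes returned by `pack` would need to fit on the tag, while the markup stays on the phone, which is the memory saving the abstract refers to.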

A Study of Web 2.0 Trend & Service View (웹 2.0 전망 및 서비스 동향에 관한 연구)

  • Kim, Seong-Ho;Kim, Heung-Do
    • Journal of Digital Convergence / v.5 no.2 / pp.135-154 / 2007
  • Web 2.0 is a platform that produces various services from data offered by many people, and it is an active open space built on user-centered communities in a diverse web environment. Web 2.0 goes beyond a simple link structure that connects web sites page to page; in other words, it means the development of semantic and contextual connections. The core of Web 2.0 is user participation. RSS, mash-ups, and UCC are the main Web 2.0 services: they let users participate directly, shifting the center from the service provider to user-produced information, which is then offered again to other areas, and users freely express what they want through UCC and blogs. Recently, however, skeptical arguments about the problems of Web 2.0 have emerged. In this research, I therefore try to give a correct understanding and a more accurate forecast of Web 2.0 by studying open web services, which are the key to the future success of Web 2.0.

Web Server Design For Web Page Fingerprinting (Web Page Fingerprinting을 위한 Web Server 구현)

  • Park, Su-Bin;Cho, Dong-Sub
    • Proceedings of the Korean Information Science Society Conference / 2008.06b / pp.375-379 / 2008
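  • Digital fingerprinting is a technique that embeds a buyer's information in an imperceptible way. In this paper, we apply the fingerprinting technique to every web page requested from a web server, in a manner that satisfies the requirements of fingerprinting, and present an algorithm that embeds the user's IP information in the web page and transmits it without changing how the displayed page looks. We also examine the advantages of using a server that performs web page fingerprinting.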

An Enhancing Caching Technique by the SOP(Shared Object Page) for Content Adaptation Systems (콘텐츠 적응화 시스템에 SOP(Shared Object Page)를 도입한 개선된 캐싱 기법)

  • Jang, Seo-Young;Jeong, Ho-Yeong;Kang, Su-Yong;Cha, Jae-Hyeok
    • Journal of Digital Contents Society / v.8 no.1 / pp.41-50 / 2007
  • People access web content via PCs and many other devices. In other words, they access information not only from a PC connected to the Internet but also through a mobile phone, a PDA, or even a D-TV. In this article, to resolve the problem, we propose a new web caching mechanism called 'SOP (Shared Object Page)', based on applying the metadata of web page information and storing adapted objects.

A Study on Referential Integrity Constraint on Web Page (웹 페이지에서 참조무결성 제약조건에 관한 연구)

  • 유남현;손철수;김원중
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2004.05b / pp.569-572 / 2004
  • On a Web site where many people manage and maintain a large number of HTML pages, it is difficult to resolve broken links such as dangling references and inaccurate content references. That is, when the connection structure between HTML documents is updated or deleted, it is difficult to guarantee the referential integrity of Web pages. In this chapter, we surveyed and defined integrity constraints between parent and child pages and studied a method to express them with a UML extension. Keywords: WWW, Constraints, UML
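
A dangling reference of the kind this abstract mentions can be detected with a simple check, sketched below. The sketch assumes the site is available as a mapping from page name to HTML text and that links are plain page names within the same site; the `LinkCollector` class and `dangling_links` function are hypothetical helpers, not the UML-based method the paper studies.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href values of <a> tags in one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def dangling_links(site: dict[str, str]) -> dict[str, list[str]]:
    """Map each parent page to the child links that no page in `site` satisfies."""
    broken = {}
    for page, html in site.items():
        collector = LinkCollector()
        collector.feed(html)
        missing = [link for link in collector.links if link not in site]
        if missing:
            broken[page] = missing
    return broken
```

Running the check after every update or deletion of a page is one way to notice a violated parent-child constraint before visitors do.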

Design and Implementation of Web-based Text Summarization System for Mobile Device (이동 단말을 위한 웹 기반 텍스트 요약 시스템의 설계 및 구현)

  • Cha, Ji-Eun;Chun, Seung-Man;Park, Jong-Tae
    • The KIPS Transactions:PartC / v.16C no.6 / pp.725-730 / 2009
  • Recently, interest in web access from mobile hosts has been increasing due to the explosive growth of mobile Internet terminals such as smartphones. However, the small displays of mobile hosts make it difficult to browse the full content of a web page at once. To overcome this limitation, we designed and implemented a Web-based text summarization system. The proposed system can summarize the text of a Web page that contains abundant text. This can reduce the amount of data transmitted and minimize unnecessary data output during browsing on a mobile host. Through implementation, we confirmed the functions of the proposed system.