
Implementation of Multi-Protocol Interface for Web-based Collaborative Service (웹 기반 공동작업을 위한 다중 프로토콜 인터페이스 방법의 구현)

  • 이은령;김지용;설동명;김두현;임기욱
    • Journal of Korea Multimedia Society
    • /
    • v.6 no.2
    • /
    • pp.340-351
    • /
    • 2003
  • We introduce our experience of designing and implementing the Page Together system, which expands the hyperlink metaphor to utilize human resources on the web. The system lets a user connect with others on the web, communicate through video/audio channels, navigate the same web pages simultaneously, and cooperate on work over the Internet. For these functions, it comprises a Collaborative Browsing Module (CBM), a Multimedia Conferencing Module (MCM), a Data Conferencing Module (DCM), and a Multi-Protocol Interface (MPI). We adopted three standard protocols, IEC, H.323, and T.120, one for each module, which allows developers to use them easily. We also defined the MPI to synchronize session information among the modules. The modules exchange information with one another during session creation and session termination; once a session has been created, each module works independently under its own protocol. Interference among modules is reduced by minimizing the information exchanged. We also introduce a web site that provides a bulletin-board service based on the Page Together system. A user may post a notice with a link to himself or herself on the board; anyone who reads the notice and has a question about it can connect to the writer by clicking the link in the notice and communicate directly. This service site shows that our system can be applied to diverse Internet services such as distance learning and distance conferencing.

Research on Training and Implementation of Deep Learning Models for Web Page Analysis (웹페이지 분석을 위한 딥러닝 모델 학습과 구현에 관한 연구)

  • Jung Hwan Kim;Jae Won Cho;Jin San Kim;Han Jin Lee
    • The Journal of the Convergence on Culture Technology
    • /
    • v.10 no.2
    • /
    • pp.517-524
    • /
    • 2024
  • This study aims to train and implement a deep learning model for the fusion of website creation and artificial intelligence, in the era known as the AI revolution following the launch of the ChatGPT service. The deep learning model was trained using 3,000 collected web page images, processed based on a system of component and layout classification. This process was divided into three stages. First, prior research on AI models was reviewed to select the most appropriate algorithm for the model we intended to implement. Second, suitable web page and paragraph images were collected, categorized, and processed. Third, the deep learning model was trained, and a serving interface was integrated to verify the actual outcomes of the model. This implemented model will be used to detect multiple paragraphs on a web page, analyzing the number of lines, elements, and features in each paragraph, and deriving meaningful data based on the classification system. This process is expected to evolve, enabling more precise analysis of web pages. Furthermore, it is anticipated that the development of precise analysis techniques will lay the groundwork for research into AI's capability to automatically generate perfect web pages.

Main Content Extraction from Web Pages Based on Node Characteristics

  • Liu, Qingtang;Shao, Mingbo;Wu, Linjing;Zhao, Gang;Fan, Guilin;Li, Jun
    • Journal of Computing Science and Engineering
    • /
    • v.11 no.2
    • /
    • pp.39-48
    • /
    • 2017
  • Main content extraction from web pages is widely used in search engines, web content aggregation, and mobile Internet browsing. However, a mass of irrelevant information such as advertisements, irrelevant navigation, and other noise is included in web pages. Such irrelevant information reduces the efficiency of web content processing in content-based applications. The purpose of this paper is to propose an automatic main content extraction method for web pages. In this method, we use two indicators to describe the characteristics of web pages: text density and hyperlink density. Exploiting the continuous distribution of similar content on a page, we use an estimation algorithm to judge whether a node is a content node or a noisy node based on the characteristics of the node and its neighboring nodes. This algorithm enables us to filter out advertisement nodes and irrelevant navigation. Experimental results on 10 news websites showed that our algorithm achieves a 96.34% average acceptable rate.
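The two indicators described above can be sketched as a small classifier. This is an illustrative sketch, not the authors' implementation; the density formulas, thresholds, and node fields are assumptions made for the example.

```python
# Minimal sketch (not the published algorithm): score DOM-like nodes by
# text density and hyperlink density to separate content from noise.

def text_density(text, tag_count):
    """Characters of text per tag under the node (hypothetical definition)."""
    return len(text) / max(tag_count, 1)

def link_density(link_text_len, text_len):
    """Fraction of a node's text that sits inside hyperlinks."""
    return link_text_len / max(text_len, 1)

def classify(node, min_density=10.0, max_link_ratio=0.5):
    """Label a node 'content' when it is text-rich and link-poor."""
    td = text_density(node["text"], node["tags"])
    ld = link_density(node["link_text"], len(node["text"]))
    return "content" if td >= min_density and ld <= max_link_ratio else "noise"

# A long, link-poor article node versus a short, link-heavy navigation bar.
article = {"text": "x" * 400, "tags": 5, "link_text": 20}
navbar = {"text": "x" * 60, "tags": 12, "link_text": 55}
```

The paper additionally smooths the decision using neighboring nodes, which this per-node sketch omits.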

Multimedia UCC Services as a Web 2.0 and Consumer Participation (웹2.0의 동영상 UCC 서비스현황과 소비자 참여)

  • Kim, Yeon-Jeong;Park, Sun-Young
    • Journal of Families and Better Life
    • /
    • v.26 no.1
    • /
    • pp.95-105
    • /
    • 2008
  • This paper identifies the current status of, and key determinants of participation in, multimedia UCC services under the Web 2.0 paradigm. The significant factors comprise the IT digital-convergence environment and the younger generation's values and attitudes toward the web (human relations, visual expressionism, arousal, etc.). The paper analyzes the status of multimedia UCC services and the current participation level of UCC users. The research analyzed customer click-stream data (inter-temporal page views, unique users) from a small-to-mid-sized multimedia UCC company, together with secondary data (www.rankey.com) on page views and unique users, to examine participation tendencies by age and sex across the total volume of multimedia UCC participation. For the young generation, who are familiar with new Internet services, the web is both an important information-seeking medium and a one-person medium for connecting to new web networks as prosumers. In UCC-centered Internet business, web-based customers play the role of prosumers, both generating web content and consuming it through networking.

The Use of Reinforcement Learning and The Reference Page Selection Method to improve Web Spidering Performance (웹 탐색 성능 향상을 위한 강화학습 이용과 기준 페이지 선택 기법)

  • 이기철;이선애
    • Journal of the Korea Computer Industry Society
    • /
    • v.3 no.3
    • /
    • pp.331-340
    • /
    • 2002
  • The web world is getting so huge and intractable that without an intelligent information extractor we would become more and more helpless. Conventional web spidering techniques for general-purpose search engines may be too slow for specific search engines, which concentrate only on specific areas or keywords. In this paper a new model for improving web spidering capabilities is suggested and experimented with. How to select adequate reference web pages from the initial web page set relevant to a given specific area (or keywords) can be very important for reducing the spidering time. Our reference web page selection method, DOPS, selects web pages dynamically and orthogonally, and it can also decide the appropriate number of reference pages using a newly defined measure. Even for a very specific area, this method worked comparably well, almost at the level of experts. Considering that experts cannot work on a huge initial page set, and that they still have difficulty deciding the optimal number of reference web pages, this method seems very promising. We also applied reinforcement learning to the web environment, and DOPS-based reinforcement learning experiments show that our method works quite favorably in terms of both the number of hyperlinks and time.
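Applying reinforcement learning to spidering typically means scoring hyperlinks by expected future reward. The following is a generic tabular Q-learning sketch of that idea, not the paper's method; the reward definition and parameters are assumptions.

```python
# Illustrative only: a tabular Q-learning update for scoring hyperlinks
# during focused spidering, where the reward is the relevance of the
# page fetched by following the link.

def q_update(q, link, reward, future_best, alpha=0.5, gamma=0.8):
    """Standard Q-learning step: move Q(link) toward reward + gamma * best future value."""
    old = q.get(link, 0.0)
    q[link] = old + alpha * (reward + gamma * future_best - old)
    return q[link]

q = {}
# Fetching the page behind link "a" yielded a relevant page (reward 1.0)
# whose own best outgoing link currently scores 0.5.
q_update(q, "a", 1.0, 0.5)  # q["a"] = 0.0 + 0.5 * (1.0 + 0.8*0.5 - 0.0) = 0.7
```

The spider would then expand the frontier link with the highest Q-value first.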

HTML Text Extraction Using Tag Path and Text Appearance Frequency (태그 경로 및 텍스트 출현 빈도를 이용한 HTML 본문 추출)

  • Kim, Jin-Hwan;Kim, Eun-Gyung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.12
    • /
    • pp.1709-1715
    • /
    • 2021
  • In order to accurately extract the necessary text from a web page, the approach of specifying to the web crawler the tag and style attributes where the main content resides has the problem that the extraction logic must be modified whenever the web page layout changes. To address this, a previous study proposed extracting the text by analyzing its appearance frequency, but that method had the limitation that performance varied widely depending on the collection channel of the web pages. Therefore, in this paper, we propose a method of extracting text with high accuracy from various collection channels by analyzing not only the appearance frequency of the text but also the parent tag paths of the text nodes extracted from the DOM tree of web pages.
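The tag-path idea above can be sketched as follows: group a page's text nodes by their parent tag path and keep the path whose accumulated text dominates, since boilerplate paths (navigation, footer) carry short repeated strings while body text concentrates under one path. This is a hedged simplification; the paper combines path analysis with frequency statistics across pages, and the node data here is invented.

```python
from collections import defaultdict

def extract_main_text(text_nodes):
    """text_nodes: list of (tag_path, text) pairs from a DOM traversal."""
    by_path = defaultdict(list)
    for path, text in text_nodes:
        by_path[path].append(text)
    # Choose the tag path carrying the most characters of text.
    main_path = max(by_path, key=lambda p: sum(len(t) for t in by_path[p]))
    return " ".join(by_path[main_path])

# Hypothetical traversal of a news page.
nodes = [
    ("html/body/nav/a", "Home"),
    ("html/body/nav/a", "News"),
    ("html/body/div/p", "First paragraph of the article body."),
    ("html/body/div/p", "Second, much longer paragraph of the article."),
    ("html/body/footer/span", "Copyright"),
]
```

Here the two `p` nodes share the path `html/body/div/p`, which accumulates far more text than the navigation or footer paths.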

A development of the Conversion System of Mobile Web Contents for the Web Service (웹서비스를 위한 모바일 웹 컨텐츠 변환 시스템 개발)

  • 황인숙;김윤중;김승연
    • Proceedings of the Korea Multimedia Society Conference
    • /
    • 2003.11b
    • /
    • pp.517-520
    • /
    • 2003
  • In this paper, we design and implement a web service provider that converts XML pages into .NET mobile web pages. A web service provider has the advantage that anyone can use it, regardless of platform or programming language. The conversion library offered by the implemented web service provider consists of a lexical analyzer module and a syntax-directed translator module. The lexical analyzer module takes an XML page as input, analyzes each tag, and emits tokens. The syntax-directed translator module takes the tokens produced by the lexical analyzer, checks the grammar, and, once parsing is complete, converts them into the corresponding mobile .NET controls to generate the final .NET mobile web page. The lexical analyzer module and the syntax-directed translator module were implemented using a lexical-analyzer generator (LEX) and a parser generator (YACC), respectively. Using LEX and YACC has the advantage that, when the tag set is extended, only the changed parts need to be redesigned.
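The tokenize-then-translate pipeline can be illustrated with a toy version. The actual system is generated with LEX/YACC; the tag names and .NET control names below are invented for the sketch and do not come from the paper.

```python
import re

# One pattern for the lexer: an opening/closing tag, or a run of text.
TOKEN_RE = re.compile(r"<(/?)(\w+)[^>]*>|([^<]+)")

# Hypothetical mapping from source XML tags to mobile .NET controls.
TAG_TO_CONTROL = {"text": "mobile:Label", "button": "mobile:Command"}

def tokenize(xml):
    """Lexical-analyzer stage: emit (kind, value) tokens."""
    for close, tag, text in TOKEN_RE.findall(xml):
        if tag:
            yield ("CLOSE" if close else "OPEN", tag)
        elif text.strip():
            yield ("TEXT", text.strip())

def translate(xml):
    """Syntax-directed-translation stage: map recognized tags to controls."""
    out = []
    for kind, value in tokenize(xml):
        if kind == "OPEN" and value in TAG_TO_CONTROL:
            out.append("<" + TAG_TO_CONTROL[value] + ">")
        elif kind == "CLOSE" and value in TAG_TO_CONTROL:
            out.append("</" + TAG_TO_CONTROL[value] + ">")
        elif kind == "TEXT":
            out.append(value)
    return "".join(out)
```

As in the paper's design, extending the tag set only requires adding entries to the mapping, not rewriting the lexer.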

Design and Implementation of Web Crawler with Real-Time Keyword Extraction based on the RAKE Algorithm

  • Zhang, Fei;Jang, Sunggyun;Joe, Inwhee
    • Annual Conference of KIPS
    • /
    • 2017.11a
    • /
    • pp.395-398
    • /
    • 2017
  • We propose a web crawler system with a keyword extraction function in this paper. Existing research on keyword extraction in text mining is mostly based on databases of documents or corpora that have already been collected; the purpose of this paper, by contrast, is to build a real-time keyword extraction system that extracts the keywords of a page and stores them in the database while the crawler is fetching its text. We design and implement a crawler that incorporates the RAKE keyword extraction algorithm, so that keywords are extracted from the content of a web page as it is crawled. In addition, the performance of the RAKE algorithm is improved by increasing the weight of important features (such as nouns appearing in the title). The experimental results show that this method is superior to the existing method and extracts keywords satisfactorily.
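The core RAKE scoring step the crawler applies can be condensed as follows. This is a minimal sketch of standard RAKE (split text into candidate phrases at stopwords, score words by degree/frequency, score phrases as word-score sums); the stopword list is a tiny stand-in, and the title-weighting extension from the paper is omitted.

```python
import re
from collections import defaultdict

STOPWORDS = {"a", "an", "the", "of", "and", "is", "in", "to", "for"}

def rake_keywords(text):
    # 1. Split into candidate phrases at stopwords (and non-letter characters).
    words = re.findall(r"[a-zA-Z]+", text.lower())
    phrases, current = [], []
    for w in words:
        if w in STOPWORDS:
            if current:
                phrases.append(current)
            current = []
        else:
            current.append(w)
    if current:
        phrases.append(current)
    # 2. Word score = degree / frequency over all candidate phrases.
    freq, degree = defaultdict(int), defaultdict(int)
    for phrase in phrases:
        for w in phrase:
            freq[w] += 1
            degree[w] += len(phrase)  # co-occurrence degree, incl. the word itself
    # 3. Phrase score = sum of its word scores; best phrases first.
    scored = [(" ".join(p), sum(degree[w] / freq[w] for w in p)) for p in phrases]
    return sorted(scored, key=lambda x: -x[1])

results = rake_keywords("keyword extraction is the core of the web crawler")
```

Multi-word phrases outscore single words because each member word inherits the phrase's degree, which is why RAKE favors "keyword extraction" over "core".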

Design and Implementation for Web Page Visualization Techniques using Android Smartphone

  • Song, Teuk-Seob;Choi, Seung-Ho
    • Journal of information and communication convergence engineering
    • /
    • v.8 no.4
    • /
    • pp.416-420
    • /
    • 2010
  • The smartphone has become a big issue in South Korea and worldwide. In this work we propose techniques that display web contents according to user preferences and service providers' interests on small-screen devices such as smartphones. A number of studies have been conducted on generating suitable web contents for small-screen devices, but such previous attempts have failed to reflect the interests of users and service providers. Our proposed system is simulated on the Google Android emulator, which is free, well-known, and open source.

Design and Implementation of Real-time ECG Monitoring System for Personal Health Records (개인건강기록을 위한 실시간 심전도 모니터링 시스템 설계 및 구현)

  • Kim, Heung Ki;Cho, Jin Soo
    • Journal of the Semiconductor & Display Technology
    • /
    • v.11 no.3
    • /
    • pp.45-50
    • /
    • 2012
  • In this paper, we propose a real-time ECG monitoring system for personal health records. This study aims to provide services that help patients monitor their own physical condition and manage their own health records consistently, whereas existing medical services follow a medical-institute-centric model. The system is composed of a web server, a smartphone, an ECG meter, and a web page. Free of time and space constraints, it supports the management of personal health records through the patient's ECG measurement and real-time monitoring. Real-time bidirectional communication between the smartphone and the web page is achieved by implementing the ECG monitoring with WebSocket technology, which follows the HTML5 standard. Through this system, patients in need can be attended to immediately.
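A WebSocket channel like the one described carries some application-defined message per ECG window. The sketch below shows one plausible JSON payload shape for such a message; the field names and sample values are invented for illustration, as the abstract does not specify a message format.

```python
import json
import time

def ecg_message(patient_id, samples, sample_rate_hz=250):
    """Package one window of ECG samples for transmission over a WebSocket."""
    return json.dumps({
        "patient": patient_id,          # identifier of the monitored patient
        "ts": int(time.time() * 1000),  # millisecond timestamp of the window
        "rate": sample_rate_hz,         # sampling rate of the ECG meter
        "samples": samples,             # raw ADC values for this window
    })

msg = ecg_message("p001", [512, 518, 530, 610, 540])
decoded = json.loads(msg)
```

The web page's script would decode each frame and append the samples to a live chart, giving the bidirectional, low-latency behavior the paper attributes to WebSocket over HTML5.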