• Title/Summary/Keyword: web content extraction

Novel Intent based Dimension Reduction and Visual Features Semi-Supervised Learning for Automatic Visual Media Retrieval

  • Kunisetti, Subramanyam;Ravichandran, Suban
    • International Journal of Computer Science & Network Security / v.22 no.6 / pp.230-240 / 2022
  • Sharing online videos over the internet is an emerging and important part of applications such as surveillance and mobile video search. A personalized web video retrieval system is therefore needed to surface relevant videos and to help people searching for videos related to specific big data content. To support this, dimensionality-reduced attributes/features are computed from videos to capture the discriminative aspects of a scene based on shape, histogram, texture, object annotation, co-ordination, color, and contour data. Dimensionality reduction depends mainly on feature extraction and feature selection in multi-label retrieval from multimedia data. Many researchers have implemented techniques to reduce dimensionality based on the visual features of video data, but each has advantages and disadvantages when applied to advanced features in video retrieval. In this research, we present a Novel Intent based Dimension Reduction Semi-Supervised Learning Approach (NIDRSLA) that examines dimensionality reduction for exact and fast video retrieval based on different visual features. For dimensionality reduction, NIDRSLA learns the projection matrix by increasing the dependence between the enlarged data and the projected space features. The proposed approach also addresses the aforementioned issue (i.e., segmentation of video with frame selection using low-level and high-level features) with efficient object annotation for video representation. Experiments performed on a synthetic data set demonstrate the efficiency of the proposed approach compared with traditional state-of-the-art video retrieval methodologies.

Video Browsing Service Using An Efficient Scene Change Detection (효율적인 장면전환 검출을 이용한 비디오 브라우징 서비스)

  • Seong-Yoon Shin;Yang-Won Rhee
    • Journal of Internet Computing and Services / v.3 no.2 / pp.69-77 / 2002
  • Digital video has recently become one of the important information media delivered on the Internet and plays an increasingly important role in multimedia. This paper proposes a Video Browsing Service (VBS) that provides both video content retrieval and video browsing through a real-time user interface on the Web. For scene segmentation and key frame extraction from a video sequence, we propose an efficient scene change detection method that combines the RGB color histogram with the $\chi^2$ (chi-square) histogram. The resulting key frames are linked by both physical and logical indexing. The system provides the video editing and retrieval functions of a VCR. Three elements (the date, the field, and the subject) are used for video browsing. The Video Browsing Service is implemented with MySQL, PHP, and JMF under Apache Web Server.
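
The abstract above does not spell out how the RGB histogram and the chi-square histogram are combined, so the following is only a minimal sketch under one plausible reading: consecutive frames are compared by the chi-square distance between their per-channel RGB histograms, and a cut is declared when the distance exceeds a threshold. The function names, the bin count, and the threshold value are all assumptions made for illustration, not the paper's method.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each value in 0..255


def rgb_histogram(frame: Sequence[Pixel], bins: int = 8) -> List[float]:
    """Concatenated R, G and B histograms, each normalized by the pixel count."""
    hist = [0.0] * (3 * bins)
    width = 256 // bins
    for pixel in frame:
        for channel, value in enumerate(pixel):
            hist[channel * bins + min(value // width, bins - 1)] += 1.0
    total = float(len(frame)) or 1.0  # avoid dividing by zero on an empty frame
    return [count / total for count in hist]


def chi_square_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Symmetric chi-square distance; bins that are empty in both are skipped."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def detect_scene_changes(
    frames: Sequence[Sequence[Pixel]],
    threshold: float = 0.5,  # hypothetical value; would need tuning on real video
    bins: int = 8,
) -> List[int]:
    """Return the indices of frames that start a new scene."""
    cuts: List[int] = []
    previous = None
    for index, frame in enumerate(frames):
        current = rgb_histogram(frame, bins)
        if previous is not None and chi_square_distance(previous, current) > threshold:
            cuts.append(index)
        previous = current
    return cuts
```

In a sketch like this, the first frame of each detected scene could serve as a key frame candidate; the abstract's physical and logical indexing are not modeled here.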

The Design and Implementation of OWL Ontology Construction System through Information Extraction of Unstructured Documents (비정형 문서의 정보추출을 통한 OWL 온톨로지 구축 시스템의 설계 및 구현)

  • Jo, Dae Woong;Choi, Ji Woong;Kim, Myung Ho
    • Journal of the Korea Society of Computer and Information / v.19 no.10 / pp.23-33 / 2014
  • Research in information retrieval is shifting from quickly finding large amounts of information to accurately finding the right information. Personalization and semantic web technology are key technologies in this shift. Automatic indexing and processing of web documents have moved beyond the research stage and appear in practical services. However, little research addresses information retrieval for attached documents other than web documents. This paper describes a method that analyzes the text content of unstructured documents prepared in text, Word, and HWP formats and constructs an OWL ontology from it. We build the TBox of the document ontology, select the resources that can be obtained from a document, and implement a system that uses them as instances of the constructed document ontology. Automatic ontology construction from this kind of unstructured document can be used effectively in information retrieval and document management systems based on semantic technology.

Video Browsing Service (비디오 브라우징 서비스)

  • Shin, Seong-Yoon;Shin, Kwang-Sung;Lee, Hyun-Chang;Jin, Chan-Yong;Rhee, Yang-Won
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2012.05a / pp.139-140 / 2012
  • This paper proposes a Video Browsing Service that provides both video content retrieval and video browsing through a real-time user interface on the Web. For scene segmentation and key frame extraction from a video sequence, we propose an efficient scene change detection method that combines the RGB color histogram with the $\chi^2$ histogram.

News Video Browser (뉴스 비디오 브라우저)

  • Shin, Seong-Yoon;Kang, Oh-Hyung;Kim, Hyung-Jin;Jang, Dai-Hyun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.336-337 / 2021
  • In this paper, we propose a video browsing service that provides both video content search and video browsing through a real-time user interface on the web. We propose an efficient scene change detection method that combines an RGB color histogram and a $\chi^2$ histogram for scene segmentation and key frame extraction of image sequences.

An Optimized e-Lecture Video Search and Indexing framework

  • Medida, Lakshmi Haritha;Ramani, Kasarapu
    • International Journal of Computer Science & Network Security / v.21 no.8 / pp.87-96 / 2021
  • The demand for e-learning through video lectures is rapidly increasing because of its diverse advantages over traditional learning methods. This has led to massive volumes of web-based lecture videos. Indexing and retrieving a lecture video or a lecture video topic has thus proved to be an exceptionally challenging problem. Many techniques in the literature are either visual or audio based, but not both. Since the visual and audio components are equally important for content-based indexing and retrieval, the current work focuses on both. A framework for automatic topic-based indexing and search based on the innate content of the lecture videos is presented. The text from the slides is extracted using the proposed Merged Bounding Box (MBB) text detector. Text is extracted from the audio component using Google Speech Recognition (GSR) technology. This hybrid approach generates the indexing keywords from the merged transcripts of both the video and audio component extractors. The search within the indexed documents is optimized using Naïve Bayes (NB) classification and K-Means clustering models. This optimized search retrieves results by searching only the relevant document cluster in the predefined categories rather than the whole lecture video corpus. The work is carried out on a dataset generated by assigning categories to lecture video transcripts gathered from e-learning portals. Search performance is assessed in terms of accuracy and time taken. Further, the accuracy of the proposed indexing technique is compared with that of the accepted chain indexing technique.
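
The idea of searching only the relevant category instead of the whole corpus can be illustrated with a small sketch. It assumes a multinomial Naive Bayes classifier with Laplace smoothing to pick the category of a query, followed by a plain keyword match restricted to that category; the K-Means step from the abstract is left out, and the class and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple


def tokenize(text: str) -> List[str]:
    return text.lower().split()


class NaiveBayes:
    """Tiny multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.doc_counts: Counter = Counter()
        self.vocabulary: Set[str] = set()

    def fit(self, docs: List[Tuple[str, str]]) -> None:
        # docs holds (category, transcript) pairs.
        for category, text in docs:
            self.doc_counts[category] += 1
            for word in tokenize(text):
                self.word_counts[category][word] += 1
                self.vocabulary.add(word)

    def predict(self, text: str) -> str:
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = "", -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            denominator = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_category, best_score = category, score
        return best_category


def search(query: str, docs: List[Tuple[str, str]], classifier: NaiveBayes) -> List[str]:
    """Match query terms only against transcripts in the predicted category."""
    category = classifier.predict(query)
    terms = set(tokenize(query))
    return [text for cat, text in docs if cat == category and terms & set(tokenize(text))]
```

With a large corpus, the restriction means each query scans one category's transcripts rather than all of them, which is where the time saving described in the abstract would come from.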

Metadata extraction using AI and advanced metadata research for web services (AI를 활용한 메타데이터 추출 및 웹서비스용 메타데이터 고도화 연구)

  • Sung Hwan Park
    • The Journal of the Convergence on Culture Technology / v.10 no.2 / pp.499-503 / 2024
  • Broadcast programs are provided to various media such as Internet replay, OTT, and IPTV services as well as the broadcaster's own channels. In this setting, it is very important to provide search keywords that represent the characteristics of the content well. Broadcasters mainly enter the main keywords manually during the production and archiving processes. This method yields too few keywords to secure core metadata, and it also shows limitations when content is recommended and used in other media services. This study supports securing a large amount of metadata by using closed caption data pre-archived through the DTV closed captioning server developed at EBS. First, core metadata was automatically extracted by applying Google's natural language AI technology. The next step, which is the core of the research, proposes a method of finding core metadata that reflects priorities and content characteristics. To obtain differentiated metadata weights, importance was classified by applying the TF-IDF calculation method. The experiment yielded successful weight data. The string metadata obtained in this study, when combined with future studies on string similarity measurement, can become the basis for securing sophisticated content recommendation metadata for content services provided to other media.
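
As an illustration of the TF-IDF weighting mentioned above, here is a minimal sketch that treats each program's closed captions as one tokenized document and ranks its terms by weight. It uses the textbook formulation (term frequency times the natural log of inverse document frequency); the exact variant used in the study is not stated, and the function names are assumptions.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(documents: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document."""
    n_docs = len(documents)
    doc_freq: Counter = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document

    weights: List[Dict[str, float]] = []
    for tokens in documents:
        counts = Counter(tokens)
        length = len(tokens) or 1
        weights.append({
            term: (count / length) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def top_keywords(weights: Dict[str, float], k: int = 3) -> List[str]:
    """Highest-weighted terms first; ties are broken alphabetically."""
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]
```

A term that appears in every caption file gets weight zero under this formulation, which is why common words drop out of the candidate metadata.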

A Study on Features Analysis for Retrieving Image Containing Personal Information on the Web (인터넷상에서 개인식별정보가 포함된 영상 검색을 위한 특징정보 분석에 관한 연구)

  • Kim, Jong-Bae
    • Journal of the Institute of Electronics Engineers of Korea CI / v.48 no.3 / pp.91-101 / 2011
  • The Internet is becoming increasingly popular due to the rapid development of information and communication technology. It has enabled convenient social activities such as the mutual exchange of information, e-commerce, and internet banking through cyberspace. However, alongside this convenience, personal IDs (identity cards, driving licenses, passports, student IDs, etc.) in electronic form are frequently exposed on the internet. This study therefore proposes a feature extraction method that analyzes the characteristics of image files containing personal information and an image retrieval method that finds such images using the extracted features. The proposed method selects feature information from the color, texture, and shape of the images, and images are retrieved by similarity analysis between feature information. In experiments on images acquired from a web-based image DB, the correct image retrieval rate was 89% and the computing time per frame was 0.17 seconds. The proposed method can be applied efficiently in a system that searches for image files containing personal information and determines the criteria for exposure of personal information.
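
Retrieval by similarity between feature vectors can be sketched as follows. The abstract does not name the similarity measure or the feature layout, so this example assumes cosine similarity over a hypothetical vector that concatenates color, texture, and shape descriptors; the cutoff value is likewise an assumption.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    min_similarity: float = 0.9,  # hypothetical cutoff
) -> List[Tuple[str, float]]:
    """Return (image name, similarity) pairs above the cutoff, best match first."""
    scored = [(name, cosine_similarity(query, features)) for name, features in database.items()]
    hits = [(name, score) for name, score in scored if score >= min_similarity]
    return sorted(hits, key=lambda item: -item[1])
```

Here `query` would hold the features of a reference ID image and `database` the features of crawled web images; both are placeholders for whatever descriptors the actual system extracts.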

A Study on the Advanced Electronic Book System Based in Web (웹기반의 전자원문 관리 시스템에 관한 연구)

  • Nam, Young-Joon;Jeong, Eui-Seob;Yoo, Jae-Young;Cho, Hyun-Yang
    • Journal of the Korean BIBLIA Society for library and Information Science / v.16 no.2 / pp.139-156 / 2005
  • In this paper, we design and implement an electronic book system that provides a web-based interface for ebooks. The aim of this study is to optimize the reading and management of electronic text for its users (readers and librarians). The advanced functions of the electronic book system are the following: 1) It does not depend on specific software or tools. 2) It can minimize images (tables, images, icons, etc.) to improve the meaning and readability of information. 3) It can reduce the effort of index extraction and of constructing the table of contents. 4) It can collect, from various points of view, the user log files created while an ebook is being read. 5) During reading, it applies DRM by decoding and encoding the ebook.

A WWW Images Automatic Annotation Based On Multi-cues Integration (멀티-큐 통합을 기반으로 WWW 영상의 자동 주석)

  • Shin, Seong-Yoon;Moon, Hyung-Yoon;Rhee, Yang-Won
    • Journal of the Korea Society of Computer and Information / v.13 no.4 / pp.79-86 / 2008
  • With the rapid development of the Internet, images embedded in HTML web pages have become predominant. Because they describe content and attract attention so effectively, images have become substantially important in web pages. Together these images constitute a considerable database. Moreover, the semantic meaning of an image is well represented by its surrounding text and links. However, only a small minority of these images have precisely assigned keyphrases, and manually assigning keyphrases to existing images is very laborious. It is therefore highly desirable to automate the keyphrase extraction process. In this paper, we first introduce WWW image annotation methods based on low-level features, page tags, overall word frequency, and local word frequency. We then put forward our multi-cue integration method for image annotation. Finally, we show through an experiment that the multi-cue image annotation method is superior to the other methods.
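
One simple way to integrate several cues is a weighted sum of per-cue keyphrase scores. The abstract does not describe the actual integration rule, so the sketch below is only an assumed linear combination; the cue names follow the abstract, while the weights and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def integrate_cues(
    cue_scores: Dict[str, Dict[str, float]],
    cue_weights: Dict[str, float],
    top_k: int = 3,
) -> List[str]:
    """Combine per-cue keyphrase scores linearly and return the top phrases."""
    combined: Dict[str, float] = defaultdict(float)
    for cue, scores in cue_scores.items():
        weight = cue_weights.get(cue, 0.0)  # cues without a weight are ignored
        for phrase, score in scores.items():
            combined[phrase] += weight * score
    ranked = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return [phrase for phrase, _ in ranked[:top_k]]
```

A phrase supported by several cues (for example, both a page tag and the text around the image) can outrank one that scores highly on a single cue, which is the intuition behind combining cues rather than relying on any one of them.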
