• Title/Summary/Keyword: Semantic segment

Toward a Structural and Semantic Metadata Framework for Efficient Browsing and Searching of Web Videos

  • Kim, Hyun-Hee
    • Journal of the Korean Society for Library and Information Science / v.51 no.1 / pp.227-243 / 2017
  • This study proposed a structural and semantic framework for characterizing events and segments in Web videos that permits content-based searches and dynamic video summarization. Although MPEG-7 supports multimedia structural and semantic descriptions, it is not currently suitable for describing multimedia content on the Web. The proposed metadata framework, designed with Web environments in mind, therefore provides a thorough yet simple way to describe Web video content. Specifically, the framework was constructed on the basis of Chatman's narrative theory, three multimedia metadata formats (PBCore, MPEG-7, and TV-Anytime), and social metadata. It consists of event information, eventGroup information, segment information, and video (program) information. This study also discusses how to automatically extract metadata elements, including structural and semantic metadata elements, from Web videos.

Boundary-Aware Dual Attention Guided Liver Segment Segmentation Model

  • Jia, Xibin;Qian, Chen;Yang, Zhenghan;Xu, Hui;Han, Xianjun;Ren, Hao;Wu, Xinru;Ma, Boyang;Yang, Dawei;Min, Hong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.1 / pp.16-37 / 2022
  • Accurate liver segment segmentation based on radiological images is indispensable for the preoperative analysis of liver tumor resection surgery. However, most existing segmentation methods cannot be applied directly to this task, because the clinical segmentation criterion relies on tiny, slender vessels, which makes exact edge prediction difficult. To address this problem, we propose a novel deep learning based segmentation model, called the Boundary-Aware Dual Attention Liver Segment Segmentation Model (BADA). This model improves the segmentation accuracy of liver segments by enhancing the edges, including the vessels that serve as segment boundaries. The model introduces a dual gated attention mechanism composed of a spatial attention module and a semantic attention module. The spatial attention module increases the weights of key edge regions by attending to salient intensity changes, while the semantic attention module amplifies the contribution of filters that extract more discriminative feature information by weighting the significant convolution channels. In addition, we build a dataset of liver segments comprising 59 clinical cases with dynamic contrast-enhanced MRI (Magnetic Resonance Imaging) at the portal vein stage, annotated by several professional radiologists. Compared with several state-of-the-art methods and baseline segmentation methods, our model achieves the best results on this clinical liver segment segmentation dataset, with Mean Dice, Mean Sensitivity, and Mean Positive Predicted Value reaching 89.01%, 87.71%, and 90.67%, respectively.

Efficient Fast Motion Estimation algorithm and Image Segmentation For Low-bit-rate Video Coding (저 전송율 비디오 부호화를 위한 효율적인 고속 움직임추정 알고리즘과 영상 분할기법)

  • 이병석;한수영;이동규;이두수
    • Proceedings of the IEEK Conference / 2001.06d / pp.211-214 / 2001
  • This paper presents an efficient fast motion estimation algorithm and an image segmentation method for low bit-rate coding. First, using region split information, the algorithm splits the image into homogeneous and semantic regions, such as the face. Then, within these regions, we find the motion vector using adaptive search window adjustment. Additionally, with this new segment-based fast motion estimation, we reduce blocking artifacts by intensively coding the region of interest (face or arm) in the input image. The simulation results show improvements in coding performance and image quality.
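
The motion vector search mentioned in this abstract can be illustrated with a plain block-matching sketch. The abstract does not state the matching cost or the rule for adapting the search window, so this example assumes a sum-of-absolute-differences (SAD) cost and takes the window size as an ordinary parameter (`search_range`) that an adaptive scheme could change per region. The function names are hypothetical and this is not the authors' algorithm.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in the current frame
    and the block displaced by (dx, dy) in the reference frame."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def find_motion_vector(cur, ref, bx, by, size, search_range):
    """Full search inside a square window of +/- search_range pixels.

    Returns ((dx, dy), cost) for the best-matching reference block.
    A smaller search_range means fewer candidates and a faster search.
    """
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, size)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if cost < best_cost:
                best_cost = cost
                best = (dx, dy)
    return best, best_cost
```

In such a sketch, a segmentation step could pass a larger `search_range` for blocks inside a region of interest and a smaller one elsewhere; that policy is an assumption, since the abstract does not describe it.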

Semantic Segmentation of Urban Scenes Using Location Prior Information (사전위치정보를 이용한 도심 영상의 의미론적 분할)

  • Wang, Jeonghyeon;Kim, Jinwhan
    • The Journal of Korea Robotics Society / v.12 no.3 / pp.249-257 / 2017
  • This paper proposes a method to segment urban scenes semantically based on location prior information. Since major scene elements in urban environments, such as roads, buildings, and vehicles, are often found at specific locations, using the location prior information of these elements can improve segmentation performance. The location priors are defined in special 2D coordinates, referred to as road-normal coordinates, which are perpendicular to the orientation of the road. With the help of depth information for each element, all the possible pixels in the image are projected into these coordinates and the learned prior information is applied to those pixels. The proposed location prior is modeled by defining the unary potential of a conditional random field (CRF) as a sum of two sub-potentials: an appearance feature-based potential and a location potential. The proposed method was validated using the publicly available KITTI dataset, which contains urban images and corresponding 3D depth measurements.
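
The unary potential described here, a sum of an appearance potential and a location potential, can be sketched as follows. The abstract does not give the exact form of either term, so this example assumes both are negative log-probabilities and adds a hypothetical `location_weight` parameter; the pairwise terms of the CRF are omitted.

```python
import math


def unary_potential(appearance_prob, location_prob, location_weight=1.0):
    """Unary energy for one pixel and one label (lower is better).

    Assumption: each sub-potential is a negative log-probability.
    """
    eps = 1e-12  # guards against log(0)
    appearance = -math.log(max(appearance_prob, eps))
    location = -math.log(max(location_prob, eps))
    return appearance + location_weight * location


def best_label(appearance_probs, location_probs, location_weight=1.0):
    """Pick the label with the lowest unary energy for a single pixel."""
    return min(
        appearance_probs,
        key=lambda label: unary_potential(
            appearance_probs[label], location_probs[label], location_weight
        ),
    )


# A pixel that looks slightly more like a building, but lies where roads
# are usually found according to the (hypothetical) location prior.
appearance = {"road": 0.4, "building": 0.6}
location = {"road": 0.9, "building": 0.1}
label = best_label(appearance, location)
```

With these illustrative numbers the location term outweighs the small appearance difference, which is the kind of correction a location prior is meant to provide.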

Automatic Indexing Algorithm of Golf Video Using Audio Information (오디오 정보를 이용한 골프 동영상 자동 색인 알고리즘)

  • Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.28 no.5 / pp.441-446 / 2009
  • This paper proposes an automatic indexing algorithm for golf video using audio information. In the proposed algorithm, the input stream is demultiplexed into video and audio streams. Using an AdaBoost-cascade classifier, the continuous audio stream is classified into five segment types: announcer's speech recorded in the studio, music accompanying players' names on the TV screen, audience reaction to the play, reporter's speech with field background, and field noise such as wind or waves. Golf swing sounds, including drive, iron, and putting shots, are detected by impulse onset detection and modulation spectrum verification. The detected swings and applause are then used to index action or highlight units. Compared with video-based semantic analysis, the main advantage of the proposed system is its small computational requirement, which makes the technology easier to apply to embedded consumer electronic devices for fast browsing.

Crack segmentation in high-resolution images using cascaded deep convolutional neural networks and Bayesian data fusion

  • Tang, Wen;Wu, Rih-Teng;Jahanshahi, Mohammad R.
    • Smart Structures and Systems / v.29 no.1 / pp.221-235 / 2022
  • Manual inspection of steel box girders on long span bridges is time-consuming and labor-intensive, and the quality of inspection relies on the subjective judgements of the inspectors. This study proposes an automated approach to detect and segment cracks in high-resolution images. An end-to-end cascaded framework is proposed to first detect the existence of cracks using a deep convolutional neural network (CNN) and then segment the crack using a modified U-Net encoder-decoder architecture. A Naïve Bayes data fusion scheme is proposed to reduce the false positives and false negatives effectively. To generate the binary crack mask, the original images are first divided into 448 × 448 overlapping image patches, which are classified as crack or non-crack using a deep CNN. Next, a modified U-Net is trained from scratch for segmentation using only the crack patches. A customized loss function that consists of the binary cross entropy loss and the Dice loss is introduced to enhance the segmentation performance. Additionally, a Naïve Bayes fusion strategy is employed to integrate the crack score maps from different overlapping crack patches and to decide whether a pixel belongs to a crack. Comprehensive experiments have demonstrated that the proposed approach achieves an 81.71% mean intersection over union (mIoU) score across 5 different training/test splits, which is 7.29% higher than the baseline reference implemented with the original U-Net.
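
A loss that combines binary cross entropy with the Dice loss, as this abstract describes, can be sketched in plain Python. The abstract does not state how the two terms are weighted or smoothed, so `dice_weight` and `smooth` below are hypothetical parameters, and the inputs are flattened lists of per-pixel probabilities and 0/1 labels rather than tensors.

```python
import math


def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross entropy over all pixels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(pred)


def dice_loss(pred, target, smooth=1.0):
    """Soft Dice loss: 1 minus the smoothed Dice coefficient."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)


def combined_loss(pred, target, dice_weight=1.0):
    """BCE plus weighted Dice loss (the weighting is an assumption)."""
    return bce_loss(pred, target) + dice_weight * dice_loss(pred, target)
```

The Dice term depends on the overlap between the predicted and true crack regions rather than on per-pixel agreement, which is a common reason for pairing it with cross entropy when the foreground class occupies only a small part of the image.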

A Knowledge-based Model for Semantic Oriented Contextual Advertising

  • Maree, Mohammed;Hodrob, Rami;Belkhatir, Mohammed;Alhashmi, Saadat M.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.5 / pp.2122-2140 / 2020
  • Proper and precise embedding of commercial ads within Webpages requires ad hoc analysis and understanding of their content. When this step is implemented successfully, publishers and advertisers benefit from increased revenues while the user experience improves. In this research work, we propose a novel multi-level context-based ad serving approach through which ads are served at generic publisher websites based on their contextual relevance. In the proposed approach, knowledge encoded in domain-specific and generic semantic repositories is exploited in order to analyze and segment Webpages into sets of contextually-relevant segments. Semantically-enhanced indexes are also constructed to index ads based on the textual descriptions provided by advertisers. A modified cosine similarity matching algorithm is employed to embed each ad from the ads repository into one or more contextually-relevant segments. To validate our proposal, we implemented a prototype of an ad serving system with two datasets that consist of (11429 ads and 93 documents) and (11000 documents and 15 ads), respectively. To demonstrate the effectiveness of the proposed techniques, we experimentally tested the proposed method and compared the produced results against five baseline metrics that can be used in the context of ad serving systems. In addition, we compared the results produced by our system with other state-of-the-art models. The findings demonstrate that the accuracy of conventional ad matching techniques is improved by exploiting the proposed semantically-enhanced context-based ad serving model.
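
The matching step can be illustrated with ordinary cosine similarity over bag-of-words term counts. The abstract refers to a modified cosine similarity whose details are not given, so the sketch below uses the standard formula only; the helper names and the sample texts are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Standard cosine similarity between two term-count vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_segment(ad_text, segments):
    """Index of the page segment most similar to the ad description."""
    return max(
        range(len(segments)),
        key=lambda i: cosine_similarity(ad_text, segments[i]),
    )


segments = [
    "stock market news today",
    "best running shoes for marathon training",
]
match = best_segment("cheap running shoes", segments)
```

A semantically enhanced index, as described in the abstract, would presumably expand or reweight these term vectors using the knowledge repositories before the similarity is computed; that step is not shown here.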

Grading System of Movie Review through the Use of An Appraisal Dictionary and Computation of Semantic Segments (감정어휘 평가사전과 의미마디 연산을 이용한 영화평 등급화 시스템)

  • Ko, Min-Su;Shin, Hyo-Pil
    • Korean Journal of Cognitive Science / v.21 no.4 / pp.669-696 / 2010
  • Assuming that the whole meaning of a document is a composition of the meanings of its parts, this paper studies the automatic grading of movie reviews that contain sentimental expressions. This is accomplished by calculating the values of semantic segments and performing data classification for each review. The ARSSA (Automatic Rating System for Sentiment analysis using an Appraisal dictionary) system is an effort to model decision making processes in a manner similar to that of the human mind. It aims to resolve the discontinuity between the numerical ranking and the textual rationalization present in the binary structure of the current review rating system: {rate: review}. This model can be realized by performing analysis on the abstract means extracted from each review. The performance of the system was measured experimentally with a 10-fold cross-validation test on 1000 reviews obtained from the Naver Movie site. The system achieved an 85% F1 score when compared to predefined values using a predefined appraisal dictionary.

Dynamic Query Processing Using Description-Based Semantic Prefetching Scheme in Location-Based Services (위치 기반 서비스에서 서술 기반의 시멘틱 프리페칭 기법을 이용한 동적 질의 처리)

  • Kang, Sang-Won;Song, Ui-Sung
    • Journal of KIISE:Databases / v.34 no.5 / pp.448-464 / 2007
  • Location-Based Services (LBSs) provide results to queries according to the location of the client issuing the query. In LBSs, techniques such as caching and prefetching are effective approaches to reducing both the data transmission from a server and the query response time. However, they can lead to cache inefficiency and network overload due to the client's mobility and query pattern. To address these drawbacks, we propose a semantic prefetching (SP) scheme that uses a prefetching segment concept and improved cache replacement policies. When a mobile client enters a new service area, called a semantic prefetching area, the proposed scheme fetches the necessary semantic information from the server in advance. The mobile client maintains the information in its own cache for query processing of location-dependent data (LDD) in a mobile computing environment. The performance of the proposed scheme is investigated in relation to various environmental variables, such as the mobility and query pattern of the user, the distribution of LDD, and the applied cache replacement strategies. Simulation results show that the proposed scheme is more efficient than the well-known existing scheme for range queries and nearest neighbor queries. In addition, applying the two queries dynamically to query processing improves the performance of the proposed scheme.

Corneal Ulcer Region Detection With Semantic Segmentation Using Deep Learning

  • Im, Jinhyuk;Kim, Daewon
    • Journal of the Korea Society of Computer and Information / v.27 no.9 / pp.1-12 / 2022
  • Traditional methods of measuring corneal ulcers have difficulty providing an objective basis for diagnosis, because they rely on the subjective judgment of medical staff viewing photographs taken with special equipment. In this paper, we propose a method to detect the ulcer area on a pixel basis in corneal ulcer images using a semantic segmentation model. To this end, we conducted experiments to detect the ulcer area based on the DeepLab model, which has the highest performance among semantic segmentation models. For the experiments, training and test data were selected, and DeepLab models with Xception and ResNet as the backbone network were evaluated and their performance compared. We used the Dice similarity coefficient and the IoU value as indicators to evaluate performance. Experimental results show that when 'crop & resized' images are added to the dataset, the DeepLab model with ResNet101 as the backbone network segments the ulcer area with an average Dice similarity coefficient of about 93%. This study shows that the semantic segmentation model used for object detection can also produce meaningful results when classifying objects with irregular shapes such as corneal ulcers. In future studies, we will extend the datasets and experiment with adaptive learning methods so that the approach can be applied in real medical diagnosis environments.
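
The two evaluation indicators named in this abstract have standard definitions for binary masks: Dice = 2|A ∩ B| / (|A| + |B|) and IoU = |A ∩ B| / |A ∪ B|. The sketch below computes both on flattened 0/1 masks; the function name and the convention of returning 1.0 when both masks are empty are assumptions made for the example.

```python
def dice_and_iou(pred, truth):
    """Dice similarity coefficient and IoU for two binary masks.

    pred and truth are equal-length sequences of 0/1 pixel labels.
    """
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_area = sum(1 for p in pred if p)
    truth_area = sum(1 for t in truth if t)
    union = pred_area + truth_area - overlap
    if union == 0:
        # Both masks are empty; treated here as a perfect match.
        return 1.0, 1.0
    dice = 2.0 * overlap / (pred_area + truth_area)
    iou = overlap / union
    return dice, iou


# One of two predicted pixels overlaps one of two true pixels.
dice, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0])
```

For the same pair of masks the Dice value is never lower than the IoU value, so scores reported under the two metrics are not directly comparable.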