• Title/Summary/Keyword: visual saliency

An Intelligent Display Scheme of Soccer Video for Multimedia Mobile Devices (멀티미디어 이동형 단말을 위한 축구경기 비디오의 지능적 디스플레이 방법)

  • Seo Kee-Won; Kim Chang-Ick
    • Journal of Broadcast Engineering / v.11 no.2 s.31 / pp.207-221 / 2006
  • A fully automatic and computationally efficient method is proposed for the intelligent display of soccer video on small multimedia mobile devices. Rapid progress in multimedia signal processing has contributed to the widespread use of multimedia devices with small LCD panels. On these small mobile devices, video sequences captured for standard-definition or HDTV broadcasting can make it hard for viewers to follow what is happening in a scene. For instance, in a soccer video sequence taken with a long-shot camera technique, tiny objects (e.g., the soccer ball and players) may not be clearly visible on a small LCD panel. An intelligent display technique is therefore needed for small-display viewers. One of the key technologies for this is determining the region of interest (ROI), the part of the scene that viewers pay more attention to than other regions. This paper focuses on soccer video display for mobile devices. Instead of relying on visual saliency, we take a domain-specific approach that exploits the characteristics of soccer video. The proposed scheme consists of three modules: ground color learning, shot classification, and ROI determination. The experimental results show that the proposed scheme is capable of intelligent video display on mobile devices.
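
The abstract names the three modules but gives no implementation details, so the following is only a minimal sketch of how such a pipeline could be wired together. The hue-histogram learning rule, the 0.6 long-shot threshold, and every function name here are illustrative assumptions, not the authors' method.

```python
from collections import Counter

# Hypothetical helpers: the abstract does not specify any of these rules.

def learn_ground_hue(hues, bin_width=10):
    """Assumed ground color learning: take the most frequent hue bin."""
    bins = Counter(int(h) // bin_width for h in hues)
    dominant_bin, _ = bins.most_common(1)[0]
    low = dominant_bin * bin_width
    return low, low + bin_width

def ground_ratio(hues, ground_range):
    """Fraction of pixels whose hue falls inside the learned ground range."""
    low, high = ground_range
    return sum(1 for h in hues if low <= h < high) / len(hues)

def classify_shot(hues, ground_range, long_shot_threshold=0.6):
    """Assumed shot classification: mostly-ground frames are long shots."""
    ratio = ground_ratio(hues, ground_range)
    return "long" if ratio >= long_shot_threshold else "non-long"

def roi_window(frame_size, center, roi_size):
    """Place an ROI window around `center`, clamped to the frame bounds."""
    (fw, fh), (cx, cy), (rw, rh) = frame_size, center, roi_size
    left = min(max(cx - rw // 2, 0), fw - rw)
    top = min(max(cy - rh // 2, 0), fh - rh)
    return left, top, rw, rh

# Toy frame: 70% of pixels share a grass-like hue, 30% do not.
frame_hues = [95] * 70 + [20] * 30
ground = learn_ground_hue(frame_hues)        # (90, 100)
shot = classify_shot(frame_hues, ground)     # "long"
# For a long shot, crop around an assumed ball position near the corner.
roi = roi_window((1920, 1080), (100, 1000), (640, 360))  # (0, 720, 640, 360)
```

In this sketch only long shots are cropped to an ROI; other shots would be shown unchanged, on the assumption that their objects are already large enough to see on a small panel.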

Segmentation of Objects of Interest for Video Content Analysis (동영상 내용 분석을 위한 관심 객체 추출)

  • Park, So-Jung; Kim, Min-Hwan
    • Journal of Korea Multimedia Society / v.10 no.8 / pp.967-980 / 2007
  • Video objects of interest play an important role in representing video content and are useful for improving the performance of video retrieval and compression. An object of interest may be the main object describing the content of a video shot, or a core object that the video producer wants to present in the shot. An object that strongly attracts the eye in a video shot is not necessarily an object of interest, and a non-moving object may be an object of interest just as a moving one may be. However, it is not easy to define an object of interest clearly, because human interest is difficult to describe procedurally. In this paper, a set of four filtering conditions for extracting moving objects of interest is suggested, defined by considering the variation in location, size, and motion pattern of moving objects in a video shot. Non-moving objects of interest are defined by another set of four extraction conditions related to the color/texture saliency, location, size, and occurrence frequency of static objects in a video shot. In a test on 50 video shots, the segmentation method based on these two sets of conditions extracted the manually chosen moving and non-moving objects of interest with an accuracy of 84%.

Image-based Soft Drink Type Classification and Dietary Assessment System Using Deep Convolutional Neural Network with Transfer Learning

  • Rubaiya Hafiz; Mohammad Reduanul Haque; Aniruddha Rakshit; Amina khatun; Mohammad Shorif Uddin
    • International Journal of Computer Science & Network Security / v.24 no.2 / pp.158-168 / 2024
  • There is hardly anyone in modern times who has not had a soft drink instead of water. Because soft drink consumption is surprisingly high, researchers around the world have repeatedly cautioned that these drinks lead to weight gain, raise the risk of non-communicable diseases, and so on. Therefore, in this work an image-based tool is developed to monitor the nutritional information of soft drinks using a deep convolutional neural network with transfer learning. First, visual saliency, mean shift segmentation, thresholding, and noise reduction techniques, collectively referred to as 'pre-processing', are used to locate the drink region. After the background is removed and only the desired area is segmented out of the image, a Discrete Wavelet Transform (DWT)-based resolution enhancement technique is applied to improve image quality. Next, a transfer learning model is employed to classify the drinks. Finally, the nutrition value of each drink is estimated using Bag-of-Features (BoF)-based classification and a Euclidean distance-based ratio calculation technique. For this purpose, a dataset was built covering the ten most consumed soft drinks in Bangladesh. The images were collected from the ImageNet dataset as well as the internet, and the proposed method detects and recognizes the different types of drinks with an accuracy of 98.51%.

Extraction of Landmarks Using Building Attribute Data for Pedestrian Navigation Service (보행자 내비게이션 서비스를 위한 건물 속성정보를 이용한 랜드마크 추출)

  • Kim, Jinhyeong; Kim, Jiyoung
    • KSCE Journal of Civil and Environmental Engineering Research / v.37 no.1 / pp.203-215 / 2017
  • Interest in Pedestrian Navigation Service (PNS) has recently increased with the spread of smartphones and improvements in location determination technology. Given the characteristics of pedestrian movement and the success rate of path finding, landmarks are an efficient aid in route guidance for pedestrians, and research on landmark extraction has progressed accordingly. However, previous studies are limited in that they considered only the differences between buildings and did not consider the visual attention drawn by the map displayed in a PNS. This study addresses the problem by defining building attributes as local variables and global variables. Local variables reflect the saliency of buildings by representing the differences between buildings, and global variables reflect visual attention by representing the inherent characteristics of buildings. The study also considers network connectivity and resolves the overlap between landmark candidate groups using a network Voronoi diagram. To extract landmarks, we first defined building attribute data based on previous studies. Next, we selected choice points for pedestrians in the pedestrian network data and determined a landmark candidate group at each choice point. Building attribute data were then calculated for the extracted landmark candidate groups, and finally landmarks were extracted by principal component analysis. We applied the proposed method to a part of Gwanak-gu, Seoul, and evaluated the extracted landmarks by comparing them with the labels and landmarks used by portal sites such as NAVER and DAUM. The proposed method extracted 132 (60.3%) of the 219 landmarks used by NAVER and DAUM, and we confirmed that 228 further landmarks, which have no labels or landmarks in NAVER and DAUM, were helpful for determining changes of direction in local-level path finding.