• Title/Summary/Keyword: scenes clustering

Search Result 24, Processing Time 0.028 seconds

Video Scene Detection using Shot Clustering based on Visual Features (시각적 특징을 기반한 샷 클러스터링을 통한 비디오 씬 탐지 기법)

  • Shin, Dong-Wook;Kim, Tae-Hwan;Choi, Joong-Min
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.47-60 / 2012
  • Video data is unstructured and has a complex structure. As efficient management and retrieval of video data become more important, video parsing based on the visual features of video content is being studied as a way to reconstruct video data into a meaningful structure. Early studies on video parsing focused on splitting video data into shots, but a shot boundary is a physical boundary, and detecting it does not consider the semantic association of video data. Recently, studies have actively used clustering methods to organize semantically associated video shots into video scenes, which are defined by semantic boundaries. Previous studies on video scene detection apply clustering algorithms based on a similarity measure between video shots that depends mainly on color features. However, correctly identifying a video shot or scene and detecting gradual transitions such as dissolve, fade and wipe are difficult, because the color features of video data contain noise and change abruptly when an unexpected object intervenes. To solve these problems, this paper proposes the Scene Detector by using Color histogram, corner Edge and Object color histogram (SDCEO), which detects video scenes by clustering similar shots that belong to the same event based on visual features including the color histogram, the corner edge and the object color histogram. SDCEO is notable in that it uses the edge feature together with the color feature and, as a result, effectively detects gradual transitions as well as abrupt ones. SDCEO consists of the Shot Bound Identifier and the Video Scene Detector. The Shot Bound Identifier comprises the Color Histogram Analysis step and the Corner Edge Analysis step. In the Color Histogram Analysis step, SDCEO uses the color histogram feature to organize shot boundaries. The color histogram, which records the percentage of each quantized color among all pixels in a frame, is chosen for its good performance, as also reported in other work on content-based image and video analysis. To organize shot boundaries, SDCEO joins associated sequential frames into shot boundaries by measuring the similarity of the color histogram between frames. In the Corner Edge Analysis step, SDCEO identifies the final shot boundaries by using the corner edge feature: it detects associated shot boundaries by comparing the corner edge feature between the last frame of the previous shot boundary and the first frame of the next shot boundary. In the Key-frame Extraction step, SDCEO compares each frame with all other frames in the same shot boundary, measures similarity using the histogram Euclidean distance, and selects the frame most similar to all the others as the key-frame. The Video Scene Detector clusters associated shots that belong to the same event using the hierarchical agglomerative clustering method based on visual features including the color histogram and the object color histogram. After detecting video scenes, SDCEO organizes the final video scenes by repeating the clustering until the similarity distance between shot boundaries is less than the threshold h. We built a prototype of SDCEO and ran experiments against manually constructed baseline data; the results are satisfactory, with a precision of 93.3% for shot boundary detection and 83.3% for video scene detection.
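
The Color Histogram Analysis step described in the abstract above can be illustrated with a minimal sketch. It is not the SDCEO implementation: the function names, the 4-bins-per-channel quantization, the use of histogram intersection as the similarity measure, and the 0.7 threshold are all assumptions chosen for illustration. A frame is represented as a flat list of (R, G, B) tuples.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Frame = Sequence[Pixel]        # a frame flattened to a sequence of pixels


def color_histogram(frame: Frame, bins_per_channel: int = 4) -> List[float]:
    """Return the fraction of pixels that fall in each quantized RGB bin."""
    step = 256 // bins_per_channel
    last = bins_per_channel - 1
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in frame:
        # Clamp so that bin counts that do not divide 256 stay in range.
        ri, gi, bi = min(r // step, last), min(g // step, last), min(b // step, last)
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
    total = len(frame)
    return [h / total for h in hist]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1]; 1.0 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def split_into_shots(frames: Sequence[Frame], threshold: float = 0.7) -> List[List[int]]:
    """Join consecutive frames into shots; a new shot starts when the
    histogram similarity to the previous frame drops below `threshold`
    (a hypothetical value, not taken from the paper)."""
    if not frames:
        return []
    shots = [[0]]
    prev = color_histogram(frames[0])
    for i in range(1, len(frames)):
        cur = color_histogram(frames[i])
        if histogram_intersection(prev, cur) < threshold:
            shots.append([i])      # similarity dropped: open a new shot
        else:
            shots[-1].append(i)    # still similar: extend the current shot
        prev = cur
    return shots
```

For instance, two all-red frames followed by two all-blue frames would be split into two shots, since the histograms of a red and a blue frame do not overlap at all.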

Text Detection and Binarization using Color Variance and an Improved K-means Color Clustering in Camera-captured Images (카메라 획득 영상에서의 색 분산 및 개선된 K-means 색 병합을 이용한 텍스트 영역 추출 및 이진화)

  • Song Young-Ja;Choi Yeong-Woo
    • The KIPS Transactions:PartB / v.13B no.3 s.106 / pp.205-214 / 2006
  • Text in images carries significant and detailed information about the scene, and the ability to detect and recognize that text automatically in real time can be used in various applications. In this paper, we propose a new text detection method that finds text in various camera-captured images, and a text segmentation method that operates on the detected text regions. The detection method uses color variance in RGB color space as its detection feature, and the segmentation method uses an improved K-means color clustering in RGB color space. We tested the proposed methods on various document-style and natural scene images captured by digital cameras and mobile-phone cameras, and also on a portion of the ICDAR[1] contest images.
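
The two ingredients named in the abstract above, color variance as a detection feature and K-means clustering of RGB colors, can be sketched as follows. This uses plain Lloyd's K-means, not the improved variant the paper proposes, whose details the abstract does not give. The function names, the definition of color variance as the sum of per-channel variances, and the caller-supplied initial centers are assumptions.

```python
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]


def color_variance(pixels: Sequence[Color]) -> float:
    """Sum of the per-channel (R, G, B) population variances of a window.
    High values suggest mixed colors, e.g. text on a background."""
    n = len(pixels)
    total = 0.0
    for c in range(3):
        mean = sum(p[c] for p in pixels) / n
        total += sum((p[c] - mean) ** 2 for p in pixels) / n
    return total


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two RGB colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_rgb(pixels: Sequence[Color], centers: Sequence[Color],
               iterations: int = 10) -> Tuple[List[Color], List[int]]:
    """Plain K-means in RGB space. `centers` are the initial cluster colors.
    Returns the final centers and the cluster label of each pixel."""
    centers = [tuple(float(v) for v in c) for c in centers]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel goes to its nearest center.
        labels = [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
                  for p in pixels]
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                centers[k] = tuple(sum(m[c] for m in members) / len(members)
                                   for c in range(3))
    return centers, labels
```

With two clusters, the labels give a binarization of a detected region: one cluster for text pixels and one for background.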

Abnormal Behavior Recognition Based on Spatio-temporal Context

  • Yang, Yuanfeng;Li, Lin;Liu, Zhaobin;Liu, Gang
    • Journal of Information Processing Systems / v.16 no.3 / pp.612-628 / 2020
  • This paper presents a new approach for detecting abnormal behaviors in complex surveillance scenes where anomalies are subtle and difficult to distinguish due to the intricate correlations among multiple objects' behaviors. Specifically, a cascaded probabilistic topic model was put forward for learning the spatial context of local behavior and the temporal context of global behavior in two different stages. In the first stage of topic modeling, unlike the existing approaches using either optical flows or complete trajectories, spatio-temporal correlations between the trajectory fragments in video clips were modeled by the latent Dirichlet allocation (LDA) topic model based on Markov random fields to obtain the spatial context of local behavior in each video clip. The local behavior topic categories were then obtained by exploiting the spectral clustering algorithm. Based on the construction of a dictionary through the process of local behavior topic clustering, the second phase of the LDA topic model learns the correlations of global behaviors and temporal context. In particular, an abnormal behavior recognition method was developed based on the learned spatio-temporal context of behaviors. The specific identification method adopts a top-down strategy and consists of two stages: anomaly recognition of video clips and anomalous behavior recognition within each video clip. The approach was evaluated on the validity of spatio-temporal context learning for local behavior topics and on abnormal behavior recognition. The results show that the proposed approach significantly improves abnormal behavior recognition in complex surveillance scenes.

Visual Model of Pattern Design Based on Deep Convolutional Neural Network

  • Jingjing Ye;Jun Wang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.2 / pp.311-326 / 2024
  • The rapid development of neural network technology promotes the neural network model driven by big data to overcome the texture effect of complex objects. Due to the limitations in complex scenes, it is necessary to establish custom template matching and apply it to the research of many fields of computational vision technology. The dependence on high-quality small label sample database data is not very strong, and the machine learning system of deep feature connection to complete the task of texture effect inference and speculation is relatively poor. The style transfer algorithm based on neural network collects and preserves the data of patterns, extracts and modernizes their features. Through the algorithm model, it is easier to present the texture color of patterns and display them digitally. In this paper, according to the texture effect reasoning of custom template matching, the 3D visualization of the target is transformed into a 3D model. The high similarity between the scene to be inferred and the user-defined template is calculated by the user-defined template of the multi-dimensional external feature label. The convolutional neural network is adopted to optimize the external area of the object to improve the sampling quality and computational performance of the sample pyramid structure. The results indicate that the proposed algorithm can accurately capture the significant target, achieve more ablation noise, and improve the visualization results. The proposed deep convolutional neural network optimization algorithm has good rapidity, data accuracy and robustness. The proposed algorithm can adapt to the calculation of more task scenes, display the redundant vision-related information of image conversion, enhance the powerful computing power, and further improve the computational efficiency and accuracy of convolutional networks, which has a high research significance for the study of image information conversion.

Unsupervised Motion Learning for Abnormal Behavior Detection in Visual Surveillance (영상감시시스템에서 움직임의 비교사학습을 통한 비정상행동탐지)

  • Jeong, Ha-Wook;Chang, Hyung-Jin;Choi, Jin-Young
    • Journal of the Institute of Electronics Engineers of Korea SC / v.48 no.5 / pp.45-51 / 2011
  • In this paper, we propose an unsupervised learning method for modeling motion trajectory patterns effectively. In our approach, observations of an object on a trajectory are treated as words in a document for the latent Dirichlet allocation (LDA) algorithm, which is used in natural language processing to cluster words by topic. This allows topics (e.g. go straight, turn left, turn right) to be clustered effectively in complex scenes, such as crossroads. After this procedure, we learn the patterns of word sequences in each cluster using the Baum-Welch algorithm, which finds the unknown parameters of a hidden Markov model. Abnormality can then be evaluated with the forward algorithm by comparing the learned sequences with an input sequence. Experimental results show that the modeling of semantic regions is robust against noise in various scenes.
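
The scoring step mentioned in the abstract above, evaluating an input sequence with the forward algorithm of a hidden Markov model, can be sketched as below. The model parameters would come from Baum-Welch training, which is not shown. The function name, the list-based parameter layout, and the idea of flagging a sequence as abnormal when its log-likelihood is low are assumptions for illustration, not details from the paper.

```python
import math
from typing import List, Sequence


def forward_log_likelihood(obs: Sequence[int],
                           start: Sequence[float],
                           trans: Sequence[Sequence[float]],
                           emit: Sequence[Sequence[float]]) -> float:
    """Return log P(obs | model) using the scaled forward algorithm.

    obs   : observation symbols encoded as integer indices
    start : start[s] is the probability of starting in state s
    trans : trans[p][s] is the probability of moving from state p to s
    emit  : emit[s][o] is the probability of state s emitting symbol o
    """
    if not obs:
        return 0.0  # the empty sequence has probability 1
    n = len(start)
    alpha: List[float] = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step and weight by the emission probability.
            alpha = [sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][obs[t]]
                     for s in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under the model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik
```

An input sequence whose log-likelihood under every learned cluster model is far below that of typical training sequences could then be treated as abnormal; where to place that cut-off is a design choice left open here.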

Target Object Detection Based on Robust Feature Extraction (강인한 특징 추출에 기반한 대상물체 검출)

  • Jang, Seok-Woo;Huh, Moon-Haeng
    • Journal of the Korea Academia-Industrial cooperation Society / v.15 no.12 / pp.7302-7308 / 2014
  • Detecting target objects robustly in natural environments is a difficult problem in computer vision and image processing. This paper suggests a method for robustly detecting target objects in environments where reflection exists. The suggested algorithm first captures scenes with a stereo camera and extracts the line and corner features representing the target objects. It then eliminates the reflected features from the extracted ones using a homographic transform. Subsequently, it robustly detects the target objects by clustering only the real features. The experimental results showed that the suggested algorithm detects target objects in reflective environments more effectively than existing algorithms.

On the Recognition of the Occluded Objects Using Matching Probability (정합확률을 이용한 겹쳐진 물체의 인식에 대하여)

  • Nam, Ki-Gon;Lee, Soo-Dong;Lee, Ryang-Sung
    • Journal of the Korean Institute of Telematics and Electronics / v.26 no.1 / pp.20-28 / 1989
  • The recognition of partially occluded objects is of prime importance for industrial machine vision applications and for solving real problems in factory automation. This paper describes a method to solve the problem of occlusion in a two-dimensional scene. The technique consists of three steps: searching for borders, extracting line segments, and clustering hypotheses by matching probability. In computer simulations on 20 scenes containing 80 models, the method achieved a correct recognition rate of 95% on average.


The Abstraction Retrieval System of Cultural Videos using Scene Change Detection (장면전환검출을 이용한 교양비디오 개요 검색 시스템)

  • Kang Oh-Hyung;Lee Ji-Hyun;Rhee Yang-Won
    • The KIPS Transactions:PartB / v.12B no.7 s.103 / pp.761-766 / 2005
  • This paper proposes a video model for implementing a cultural video database system. We use an efficient scene change detection method that segments cultural video into semantic units for efficient indexing and retrieval. Because video is large in volume and takes a long time to play, viewing an entire video is difficult. To solve this problem, a cultural video abstraction was made to save time and widen the choice of videos. The video abstract is a summarization of scenes that includes important events, produced by setting up an abstraction rule.

Level 3 Type Land Use Land Cover (LULC) Characteristics Based on Phenological Phases of North Korea (생물계절 상 분석을 통한 Level 3 type 북한 토지피복 특성)

  • Yu, Jae-Shim;Park, Chong-Hwa;Lee, Seung-Ho
    • Korean Journal of Remote Sensing / v.27 no.4 / pp.457-466 / 2011
  • The objectives of this study are to produce a level 3 type LULC map and to analyze the phenological features of North Korea. ISODATA clustering was carried out on 88 scenes of MVC of MODIS NDVI from 2008 and 8 scenes from 2009, and a mapping method based on the analysis of phenological phases was applied. For the level 2 type map, the confusion matrix was summarized and the Kappa coefficient was calculated. A total of 27 typical habitat types were derived, representing the dominant species or vegetation density covering the land surface of North Korea in 2008. The 27 classes comprise 17 forest biotopes, 7 different croplands, 2 built-up types and one water body. The dormancy phase in winter (${\sigma}^2$ = 0.348) and the green-up phase in spring (${\sigma}^2$ = 0.347) display the phenological dynamics, as these are the periods when most changes in vegetation growth take place. Overall accuracy is (851/955) 85.85% and the Kappa coefficient is 0.84. The phenological phase based mapping method made it possible to minimize classification error when analyzing the inaccessible land of North Korea.
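
The two accuracy measures reported in the abstract above, overall accuracy and the Kappa coefficient, are both computed from a confusion matrix. The sketch below shows the standard definitions (Cohen's kappa); the function names and the 2x2 example matrix are hypothetical and are not the study's data.

```python
from typing import Sequence

Matrix = Sequence[Sequence[int]]  # matrix[i][j]: samples of true class i labeled as class j


def overall_accuracy(matrix: Matrix) -> float:
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix: Matrix) -> float:
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e).
    Undefined (division by zero) when the chance agreement p_e equals 1."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p_o = overall_accuracy(matrix)
    row_sums = [sum(matrix[i]) for i in range(n)]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement: product of the marginal proportions, summed over classes.
    p_e = sum(row_sums[k] * col_sums[k] for k in range(n)) / (total * total)
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical two-class example, not taken from the study.
example = [[40, 10],
           [5, 45]]
accuracy = overall_accuracy(example)   # 85 of 100 samples are on the diagonal
kappa = cohen_kappa(example)
```

Kappa is lower than overall accuracy whenever some of the observed agreement could be explained by the class proportions alone, which is why the two values are usually reported together.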

Vegetation Cover Type Mapping Over The Korean Peninsula Using Multitemporal AVHRR Data (시계열(時系列) AVHRR 위성자료(衛星資料)를 이용한 한반도 식생분포(植生分布) 구분(區分))

  • Lee, Kyu-Sung
    • Journal of Korean Society of Forest Science / v.83 no.4 / pp.441-449 / 1994
  • The two reflective channels (red and near-infrared spectrum) of advanced very high resolution radiometer (AVHRR) data were used to classify primary vegetation cover types in the Korean Peninsula. From the NOAA-11 satellite data archive of 1991, 27 daytime scenes with relatively little cloud coverage were obtained. After the initial radiometric calibration, the normalized difference vegetation index (NDVI) was calculated for each of the 27 data sets. Four or five daily NDVI data sets were then overlaid for each of the six months from February to November, and the maximum NDVI value was retained at every pixel location to make a monthly composite. The six bands of monthly NDVI composites were nearly cloud free and were used for the computer classification of vegetation cover. Based on the temporal signatures of different vegetation cover types, which were generated by an unsupervised block clustering algorithm, every pixel was classified into one of six cover type categories. The classification result was evaluated by both qualitative interpretation and quantitative comparison with existing forest statistics. Considering the frequent data acquisition, low data cost and volume, and large area coverage, AVHRR data are believed to be effective for vegetation cover type mapping at the regional scale.
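
The NDVI calculation and the maximum value compositing described in the abstract above can be sketched as follows. NDVI uses its standard definition, (NIR - red) / (NIR + red). The function names and the representation of an image as a flat list of pixel values are assumptions for illustration.

```python
from typing import List, Sequence


def ndvi(red: float, nir: float) -> float:
    """Normalized difference vegetation index for one pixel.
    Returns 0.0 when both reflectances are zero, to avoid dividing by zero."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom


def ndvi_image(red_band: Sequence[float], nir_band: Sequence[float]) -> List[float]:
    """Per-pixel NDVI for two co-registered bands given as flat lists."""
    return [ndvi(r, n) for r, n in zip(red_band, nir_band)]


def max_value_composite(images: Sequence[Sequence[float]]) -> List[float]:
    """Keep the maximum NDVI at every pixel location across several dates.
    Clouds lower NDVI, so the maximum tends to select cloud-free observations."""
    return [max(values) for values in zip(*images)]
```

Applying `max_value_composite` to the four or five daily NDVI images of one month would give that month's composite; stacking the monthly composites gives each pixel the temporal signature used for clustering.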
