• Title/Summary/Keyword: temporal feature

Acoustic scene classification using recurrence quantification analysis (재발량 분석을 이용한 음향 상황 인지)

  • Park, Sangwook;Choi, Woohyun;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea / v.35 no.1 / pp.42-48 / 2016
  • Because a variety of sounds occur in the same place and similar sounds occur in different places, the performance of acoustic scene classification is not guaranteed when training data are insufficient. A Bag of Words (BOW) based histogram feature is regarded as one way to overcome this problem. However, because the histogram feature is built from the distribution of features, the order of the feature sequence is ignored. Temporal information such as periodicity and stationarity is also important for acoustic scene classification. In this paper, temporal features describing periodicity and stationarity are extracted using recurrence quantification analysis. In the experiments, the proposed method performs better than the baseline methods.
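
As a rough illustration of recurrence quantification analysis, the sketch below builds a binary recurrence matrix from a scalar series and computes two standard RQA measures, recurrence rate and determinism. The abstract does not say which measures or embedding the authors use, so the function names, the `radius` threshold, and the absence of time-delay embedding are all assumptions made for illustration.

```python
def recurrence_matrix(series, radius):
    """Binary recurrence matrix: entry (i, j) is 1 when |x_i - x_j| <= radius."""
    n = len(series)
    return [[1 if abs(series[i] - series[j]) <= radius else 0 for j in range(n)]
            for i in range(n)]


def recurrence_rate(matrix):
    """Fraction of recurrent points, excluding the trivial main diagonal."""
    n = len(matrix)
    if n < 2:
        return 0.0
    total = sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


def determinism(matrix, min_length=2):
    """Share of recurrent points that lie on diagonal lines of at least min_length.

    Only the upper triangle is scanned because the matrix is symmetric.
    """
    n = len(matrix)
    on_lines = 0
    total = 0
    for offset in range(1, n):
        run = 0
        for i in range(n - offset):
            if matrix[i][i + offset]:
                run += 1
                total += 1
            else:
                if run >= min_length:
                    on_lines += run
                run = 0
        if run >= min_length:  # close a run that reaches the matrix border
            on_lines += run
    return on_lines / total if total else 0.0


# Toy input (not data from the paper): a strictly periodic signal.
periodic = [0, 1, 0, 1, 0, 1, 0, 1]
periodic_matrix = recurrence_matrix(periodic, radius=0.1)
periodic_rr = recurrence_rate(periodic_matrix)
periodic_det = determinism(periodic_matrix)
```

For a periodic signal every recurrent point falls on a long diagonal line, so determinism is high; an irregular signal produces few or no diagonal lines. This is the kind of periodicity cue a histogram feature discards.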

An Adaptive ROI Detection System for Spatiotemporal Features (시.공간특징에 대해 적응할 수 있는 ROI 탐지 시스템)

  • Park Min-Chul;Cheoi Kyung-Joo
    • The Journal of the Korea Contents Association / v.6 no.1 / pp.41-53 / 2006
  • In this paper, an adaptive ROI (region of interest) detection system for spatiotemporal features is proposed. It uses spatiotemporal features to detect ROIs. Motion, which represents temporal visual conspicuity between adjacent frames, is assumed to take priority over spatial visual conspicuity, because objects or regions in motion usually draw stronger attention than others in motion pictures. For still images, the visual features that constitute topographic feature maps are used as spatial features. Comparative experiments against human subjective evaluation show that the correct detection rate of visual attention regions improves when both spatial and temporal features are exploited, compared with exploiting either feature alone.

The Change Detection from High-resolution Satellite Imagery Using Floating Window Method (이동창 방식에 의한 고해상도 위성영상에서의 변화탐지)

  • Im, Yeong-Jae;Ye, Cheol-Su;Kim, Gyeong-Ok
    • 한국지형공간정보학회:학술대회논문집 / 2002.11a / pp.117-122 / 2002
  • Change detection is a useful technology that can be applied in various fields; it obtains temporal change information by comparing and analyzing multi-temporal satellite images. In particular, change detection with high-resolution satellite imagery can extract useful change information for many purposes, such as environmental inspection, analysis of disaster damage, inspection of illegal buildings, and military use, which cannot be achieved with low- or medium-resolution satellite imagery. However, because of the special characteristics of high-resolution satellite imagery, the pixel-based methods used for low-resolution imagery cannot be applied. Instead, a feature-based algorithm based on geographical and morphological features must be used. This paper presents a system that builds a change map by digitizing the boundaries of changed objects. In this system, the change map is produced by manual or semi-automatic digitizing through a user interface implemented with a floating window, which makes it possible to detect signs of change, such as construction or demolition, more efficiently.

Vision-Based Activity Recognition Monitoring Based on Human-Object Interaction at Construction Sites

  • Chae, Yeon;Lee, Hoonyong;Ahn, Changbum R.;Jung, Minhyuk;Park, Moonseo
    • International conference on construction engineering and project management / 2022.06a / pp.877-885 / 2022
  • Vision-based activity recognition has been widely attempted at construction sites to estimate productivity and enhance workers' health and safety. Previous studies have focused on extracting an individual worker's postural information from sequential image frames for activity recognition. However, workers in different trades perform different tasks with similar postural patterns, which degrades the performance of activity recognition based on postural information. To address this, this research exploited the concept of human-object interaction, the interaction between a worker and their surrounding objects, considering the fact that trade workers interact with specific objects (e.g., working tools or construction materials) relevant to their trades. This research developed an approach to understand the context from sequential image frames based on four features: posture, object, spatial, and temporal features. The posture and object features were used to analyze the interaction between the worker and the target object, and the other two features were used to detect movements across the entire region of the image frames in both the temporal and spatial domains. The developed approach used convolutional neural networks (CNNs) as feature extractors and activity classifiers; long short-term memory (LSTM) was also used as an activity classifier. The developed approach provided an average accuracy of 85.96% for classifying 12 target construction tasks performed by two trades of workers, which was higher than that of two benchmark models. This experimental result indicated that integrating the concept of human-object interaction offers great benefits in activity recognition when workers from various trades coexist in a scene.

Classification of Multi-temporal SAR Data by Using Data Transform Based Features and Multiple Classifiers (자료변환 기반 특징과 다중 분류자를 이용한 다중시기 SAR자료의 분류)

  • Yoo, Hee Young;Park, No-Wook;Hong, Sukyoung;Lee, Kyungdo;Kim, Yeseul
    • Korean Journal of Remote Sensing / v.31 no.3 / pp.205-214 / 2015
  • In this study, a novel land-cover classification framework for multi-temporal SAR data is presented that combines multiple features extracted through data transforms with multiple classifiers. First, data transforms using principal component analysis (PCA) and a 3D wavelet transform are applied to the multi-temporal SAR dataset to extract new features that differ from the original dataset. Then, three classifiers, namely a maximum likelihood classifier (MLC), a neural network (NN), and a support vector machine (SVM), are applied to three datasets comprising the data-transform-based features and the original backscattering coefficients, which yields diverse preliminary classification results. These results are combined via a majority voting rule to generate the final classification result. In an experiment with a multi-temporal ENVISAT ASAR dataset, the preliminary classification results showed very different classification accuracies depending on the feature and classifier used. The final result, which combined the nine preliminary classification results, showed the best classification accuracy because each preliminary result provided complementary information on land cover. The improvement in classification accuracy was mainly attributed to the diversity gained by combining not only different features based on data transforms but also different classifiers. Therefore, the land-cover classification framework presented in this study could be applied effectively to the classification of multi-temporal SAR data and could be extended to multi-sensor remote sensing data fusion.
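
The majority voting step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `majority_vote` is a hypothetical helper, the class labels are invented, and the tie-breaking rule (prefer the earliest listed result) is an assumption, since the abstract does not say how ties are handled.

```python
from collections import Counter


def majority_vote(label_maps):
    """Combine per-pixel labels from several preliminary results by majority vote.

    label_maps: list of equal-length label sequences, one per preliminary result.
    Ties go to the label that appears first in input order (an assumption).
    """
    lengths = {len(labels) for labels in label_maps}
    if len(lengths) != 1:
        raise ValueError("all label maps must cover the same pixels")
    combined = []
    for pixel_labels in zip(*label_maps):
        counts = Counter(pixel_labels)
        top = max(counts.values())
        # First label, in input order, that reaches the top count.
        combined.append(next(l for l in pixel_labels if counts[l] == top))
    return combined


# Three toy preliminary results over three pixels (the study combines nine).
preliminary = [
    ["water", "forest", "urban"],
    ["water", "paddy", "urban"],
    ["forest", "paddy", "paddy"],
]
final_map = majority_vote(preliminary)
```

Each pixel takes the label most of the preliminary results agree on, which is how complementary errors from different feature and classifier pairs can cancel out.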

Analysis of Relationships between Features Extracted from SAR Data and Land-cover Classes (SAR 자료에서 추출한 특징들과 토지 피복 항목 사이의 연관성 분석)

  • Park, No-Wook;Chi, Kwang-Hoon;Lee, Hoon-Yol
    • Korean Journal of Remote Sensing / v.23 no.4 / pp.257-272 / 2007
  • This paper analyzes the relationships between land-cover classes and various features extracted from SAR data with multiple acquisition dates and modes (frequency, polarization, and incidence angle). Two typical types of features were extracted by considering the acquisition conditions of currently available SAR data. First, coherence, temporal variability, and principal component transform-based features were extracted from multi-temporal, single-mode SAR data. C-band ERS-1/2, ENVISAT ASAR, and Radarsat-1 data and L-band JERS-1 SAR data were used for these features, and the differing characteristics of the SAR sensors were discussed in terms of land-cover discrimination capability. Overall, tandem coherence showed the best discrimination capability among the features. Long-term coherence from C-band SAR data provided useful information for discriminating urban areas from other classes. Paddy fields showed the highest temporal variability in all SAR sensor data. Features from the principal component transform contained information relevant to specific land-cover classes. Second, for multiple-mode SAR data acquired at similar dates, the polarization ratio and multi-channel variability were considered as features. The VH/VV polarization ratio was a useful feature for discriminating between forest and dry fields, for which the distributions of coherence and temporal variability overlapped significantly. These case study results are expected to be useful for improving the accuracy of land-cover classification with SAR data, provided that the main findings are confirmed by extensive case studies based on multi-temporal SAR data with various modes and by ground-based SAR experiments.

Mapping Burned Forests Using a k-Nearest Neighbors Classifier in Complex Land Cover (k-Nearest Neighbors 분류기를 이용한 복합 지표 산불피해 영역 탐지)

  • Lee, Hanna;Yun, Konghyun;Kim, Gihong
    • KSCE Journal of Civil and Environmental Engineering Research / v.43 no.6 / pp.883-896 / 2023
  • Because human activities in Korea extend throughout the mountains, forest fires often affect residential areas, infrastructure, and other facilities. Hence, it is necessary to detect fire-damaged areas quickly to enable support and recovery, and remote sensing is the most efficient tool for this purpose. Fire damage detection experiments were conducted on the east coast of Korea. Because this area comprises a mixture of forest and artificial land cover, low-resolution data are not suitable. We used Sentinel-2 multispectral instrument (MSI) data, which provide adequate temporal and spatial resolution, together with the k-nearest neighbor (kNN) algorithm. Six Sentinel-2 MSI bands and two indices, the normalized difference vegetation index (NDVI) and the normalized burn ratio (NBR), were used as features for kNN classification. The kNN classifier was trained using 2,000 randomly selected samples from the fire-damaged and undamaged areas. Outliers were removed and a forest type map was used to improve classification performance. Numerous experiments with various numbers of neighbors for kNN and various feature combinations were conducted using bi-temporal and uni-temporal approaches. The bi-temporal classification performed better than the uni-temporal classification; however, the uni-temporal classification was able to detect severely damaged areas.
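
A minimal sketch of the two spectral indices and a kNN vote is given below. NDVI and NBR follow their standard definitions, (NIR - Red) / (NIR + Red) and (NIR - SWIR) / (NIR + SWIR). Everything else is an assumption for illustration: the helper names, the use of only two features instead of the study's six bands plus two indices, and the reflectance values, which are invented and not taken from the paper.

```python
import math
from collections import Counter


def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total else 0.0


def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return normalized_difference(nir, red)


def nbr(nir, swir):
    """Normalized burn ratio."""
    return normalized_difference(nir, swir)


def knn_predict(train_features, train_labels, query, k=3):
    """Label a query by majority vote among its k nearest samples (Euclidean)."""
    ranked = sorted(zip(train_features, train_labels),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented (NDVI, NBR) training samples; real training data would come from imagery.
train_x = [(0.70, 0.50), (0.65, 0.45), (0.60, 0.55),
           (0.20, -0.30), (0.15, -0.25), (0.25, -0.35)]
train_y = ["unburned"] * 3 + ["burned"] * 3

# A hypothetical pixel with low NIR and high SWIR reflectance after a fire.
pixel = (ndvi(nir=0.18, red=0.12), nbr(nir=0.18, swir=0.32))
pixel_label = knn_predict(train_x, train_y, pixel, k=3)
```

A bi-temporal variant would compute the same features on pre-fire and post-fire images and feed both, or their differences, to the classifier.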

Detection of Moving Objects in Crowded Scenes using Trajectory Clustering via Conditional Random Fields Framework (Conditional Random Fields 구조에서 궤적군집화를 이용한 혼잡 영상의 이동 객체 검출)

  • Kim, Hyeong-Ki;Lee, Gwang-Gook;Kim, Whoi-Yul
    • Journal of Korea Multimedia Society / v.13 no.8 / pp.1128-1141 / 2010
  • This paper proposes a method for detecting moving objects in crowded scenes using clustered trajectories. Unlike previous appearance-based approaches, the proposed method employs only motion information to isolate moving objects. Feature points are first extracted from the input frames and then tracked to create feature trajectories. Based on the assumption that feature points originating from the same object show similar motion as the object moves, the proposed method detects moving objects by clustering trajectories with similar motion. For this purpose, an energy function based on spatial proximity, motion coherence, and temporal continuity is defined to measure the similarity between two trajectories, and clustering is achieved by minimizing this energy function in a CRF (conditional random field) framework. Compared with previous methods, which cannot separate falsely merged trajectories during clustering, the proposed method can rearrange falsely merged trajectories during iteration because the clustering is solved by energy minimization in the CRF. Experimental results on three different crowded scenes show a detection rate of about 94% with a false alarm rate of 7%.

Dynamic Hand Gesture Recognition Using CNN Model and FMM Neural Networks (CNN 모델과 FMM 신경망을 이용한 동적 수신호 인식 기법)

  • Kim, Ho-Joon
    • Journal of Intelligence and Information Systems / v.16 no.2 / pp.95-108 / 2010
  • In this paper, we present a hybrid neural network model for dynamic hand gesture recognition. The model consists of two modules: a feature extraction module and a pattern classification module. We first propose a modified CNN (convolutional neural network) as the pattern recognition model for the feature extraction module. We then introduce a weighted fuzzy min-max (WFMM) neural network for the pattern classification module. The data representation proposed in this research is a spatiotemporal template based on the motion information of the target object. To minimize the influence of spatial and temporal variation in the feature points, we extend the receptive field of the CNN model to a three-dimensional structure. We discuss the learning capability of the WFMM neural network, in which a weight concept is added to represent the frequency factor in the training pattern set. The model can overcome the performance degradation that may be caused by the hyperbox contraction process of conventional FMM neural networks. The validity of the proposed models is discussed using experimental results on human action recognition and on dynamic hand gesture recognition for remote control of electric home appliances.

Combining Dynamic Time Warping and Single Hidden Layer Feedforward Neural Networks for Temporal Sign Language Recognition

  • Thi, Ngoc Anh Nguyen;Yang, Hyung-Jeong;Kim, Sun-Hee;Kim, Soo-Hyung
    • International Journal of Contents / v.7 no.1 / pp.14-22 / 2011
  • Temporal Sign Language Recognition (TSLR) from hand motion is an active area of gesture recognition research that aims to facilitate efficient communication with deaf people. TSLR systems consist of two stages: a motion sensing step, which extracts useful features from the signer's motion, and a classification step, which classifies these features as a performed sign. This work focuses on two research problems: the unknown, time-varying signals of sign language in the feature extraction stage, and the computational complexity and time consumption of the classification stage caused by a very large database of sign sequences. In this paper, we propose combining Dynamic Time Warping (DTW) with Single hidden Layer Feedforward Neural networks (SLFNs) trained by Extreme Learning Machine (ELM) to cope with these limitations. DTW has the advantage over other approaches that it can align time series data to a common, predetermined length, while ELM is a useful technique for classifying the warped features. Our experiments demonstrate the efficiency of the proposed method, with recognition accuracy of up to 98.67%. The proposed approach can be generalized to more detailed measurements so as to recognize hand gestures, body motion, and facial expressions.
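
To show how DTW handles sequences of different lengths, the sketch below implements the textbook dynamic-programming DTW distance with an absolute-difference local cost. It is not the length-normalizing procedure or the ELM classifier from the paper; the one-dimensional sequences, the template names, and the nearest-template lookup are hypothetical and serve only to illustrate the alignment idea.

```python
def dtw_distance(a, b):
    """Textbook DTW distance between two numeric sequences.

    cost[i][j] holds the cheapest alignment cost of a[:i] with b[:j].
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical one-dimensional motion templates for two signs.
templates = {
    "hello": [0, 1, 2, 1, 0],
    "thanks": [2, 2, 0, 0, 2],
}

# The same "hello" motion performed more slowly (one sample repeated).
query = [0, 1, 1, 2, 1, 0]
nearest = min(templates, key=lambda name: dtw_distance(query, templates[name]))
```

Because DTW lets one sample match several samples of the other sequence, a slower or faster performance of the same sign still aligns at low cost, which a fixed sample-by-sample comparison would not allow.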