• Title/Summary/Keyword: spatial recognition


Multiscale Spatial Position Coding under Locality Constraint for Action Recognition

  • Yang, Jiang-feng;Ma, Zheng;Xie, Mei
    • Journal of Electrical Engineering and Technology
    • /
    • v.10 no.4
    • /
    • pp.1851-1863
    • /
    • 2015
  • To address the traditional bag-of-features model's neglect of the spatial relationships among local features in human action recognition, we propose a Multiscale Spatial Position Coding under Locality Constraint method. Specifically, to describe these spatial relationships, we propose a mixed feature that combines a motion feature with a multi-spatial-scale configuration. To exploit temporal information between features, sub spatial-temporal volumes (sub-STVs) are built. Next, the pooled features of the sub-STVs are obtained by max-pooling. In the classification stage, Locality-Constrained Group Sparse Representation is adopted to exploit the intrinsic group information of the sub-STV features. Experimental results on the KTH, Weizmann, and UCF Sports datasets show that our action recognition system outperforms recently published classical local spatio-temporal (ST) feature-based recognition systems.
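The max-pooling step over sub-STVs can be illustrated with a minimal sketch. It assumes, purely for illustration, that sub-STVs are formed by splitting the video along the time axis only and that each local feature has already been coded into a fixed-length vector; the names `max_pool` and `pool_sub_stvs` are hypothetical and do not come from the paper.

```python
def max_pool(codes):
    """Element-wise maximum over a list of equal-length code vectors."""
    if not codes:
        raise ValueError("need at least one code vector")
    return [max(column) for column in zip(*codes)]


def pool_sub_stvs(coded_features, n_frames, n_segments):
    """Max-pool coded features inside each temporal segment.

    coded_features: list of (frame_index, code_vector) pairs.
    Assumption: a sub-STV is one of n_segments equal slices of the
    frame range; the paper's actual partition may differ.
    """
    dim = len(coded_features[0][1])
    buckets = [[] for _ in range(n_segments)]
    for frame, code in coded_features:
        idx = min(frame * n_segments // n_frames, n_segments - 1)
        buckets[idx].append(code)
    # A segment without any feature pools to a zero vector.
    return [max_pool(b) if b else [0.0] * dim for b in buckets]


# Toy input: three coded features in an 8-frame clip, two segments.
features = [(0, [0.1, 0.9]), (3, [0.5, 0.2]), (7, [0.3, 0.4])]
pooled = pool_sub_stvs(features, n_frames=8, n_segments=2)
```

Each pooled vector keeps, per codeword, the strongest response seen inside that segment, which is what makes the representation insensitive to how many features fall in a segment.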

Using Spatial Pyramid Based Local Descriptor for Face Recognition (공간 계층적 구조 기반 지역 기술자 활용 얼굴인식 기술)

  • Kim, Kyeong Tae;Choi, Jae Young
    • Journal of Korea Multimedia Society
    • /
    • v.20 no.5
    • /
    • pp.758-768
    • /
    • 2017
  • In this paper, we present a novel method for extracting a face representation based on a multi-resolution spatial pyramid. In our method, a face is subdivided into increasingly finer sub-regions (local regions) and represented by histograms at multiple levels. To cope with the misalignment problem, a novel patch-based local descriptor extraction is also developed. To preserve multiple levels of detail in local characteristics and to encode the holistic spatial configuration, histograms from all levels of the spatial pyramid are integrated using dimensionality reduction and feature combination, leading to our spatial-pyramid face feature representation. We incorporate the proposed face features into a general face recognition pipeline and achieve state-of-the-art results on challenging face recognition problems.
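The idea of histograms at increasingly finer sub-regions can be sketched as follows. This is a generic spatial-pyramid histogram, not the authors' pipeline: it assumes the face has already been converted to a 2-D grid of integer descriptor codes (for example, quantized local descriptor labels), it omits the dimensionality reduction and feature combination steps, and the function names are hypothetical.

```python
def cell_histogram(codes, r0, r1, c0, c1, n_bins):
    """Histogram of integer codes inside rows [r0, r1) and columns [c0, c1)."""
    hist = [0] * n_bins
    for r in range(r0, r1):
        for c in range(c0, c1):
            hist[codes[r][c]] += 1
    return hist


def spatial_pyramid(codes, n_levels, n_bins):
    """Concatenate cell histograms over pyramid levels 0 .. n_levels - 1.

    Level l splits the grid into 2**l x 2**l cells, so level 0 is the
    holistic histogram and higher levels capture local detail.
    """
    rows, cols = len(codes), len(codes[0])
    feature = []
    for level in range(n_levels):
        cells = 2 ** level
        for i in range(cells):
            for j in range(cells):
                r0, r1 = i * rows // cells, (i + 1) * rows // cells
                c0, c1 = j * cols // cells, (j + 1) * cols // cells
                feature.extend(cell_histogram(codes, r0, r1, c0, c1, n_bins))
    return feature


# Toy 2x2 "face" with two possible codes, two pyramid levels.
toy_feature = spatial_pyramid([[0, 1], [1, 1]], n_levels=2, n_bins=2)
```

The feature length grows as `n_bins * (1 + 4 + 16 + ...)`, which is one reason a dimensionality reduction step is typically applied afterwards.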

A Spatial Regularization of LDA for Face Recognition

  • Park, Lae-Jeong
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.10 no.2
    • /
    • pp.95-100
    • /
    • 2010
  • This paper proposes a new spatial regularization of Fisher linear discriminant analysis (LDA) to reduce the overfitting caused by the small sample size (SSS) problem in face recognition. Many regularized LDAs have been proposed to alleviate overfitting by regularizing an estimate of the within-class scatter matrix. Spatial regularization methods have also been suggested that make the discriminant vectors spatially smooth, which mitigates overfitting. As a generalization of spatially regularized LDA, the proposed method exploits the non-uniformity of spatial correlation structures in face images when adding a spatial smoothness constraint to the LDA framework. The region-dependent spatial regularization is advantageous for capturing the non-flat spatial correlation structure within a face image as well as for obtaining a spatially smooth LDA projection. Experimental results on public face databases such as ORL and CMU PIE show that, compared with other regularized LDAs, the proposed regularized LDA performs well, especially when the number of training images per individual is quite small.
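For orientation, a spatially regularized Fisher criterion is commonly written in the form below. This is a generic sketch rather than the paper's exact objective: the penalty matrix $\Omega$, the spatial difference operator $D$, the diagonal region-weight matrix $\Lambda$, and the trade-off parameter $\lambda$ are notation assumed here.

```latex
\[
\mathbf{w}^{*} = \arg\max_{\mathbf{w}}
\frac{\mathbf{w}^{\top} S_b\, \mathbf{w}}
     {\mathbf{w}^{\top} S_w\, \mathbf{w} + \lambda\, \mathbf{w}^{\top} \Omega\, \mathbf{w}},
\qquad
\Omega = D^{\top} \Lambda D
\]
```

Here $S_b$ and $S_w$ are the between-class and within-class scatter matrices. Under this assumed form, $\Lambda = I$ penalizes roughness uniformly over the image, while a non-constant $\Lambda$ is one way to express a region-dependent smoothness constraint.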

Statistical Model-Based Voice Activity Detection Using Spatial Cues for Dual-Channel Noisy Speech Recognition (이중채널 잡음음성인식을 위한 공간정보를 이용한 통계모델 기반 음성구간 검출)

  • Shin, Min-Hwa;Park, Ji-Hun;Kim, Hong-Kook;Lee, Yeon-Woo;Lee, Seong-Ro
    • Phonetics and Speech Sciences
    • /
    • v.2 no.3
    • /
    • pp.141-148
    • /
    • 2010
  • In this paper, a voice activity detection (VAD) method that employs spatial cues is proposed for dual-channel noisy speech recognition. In the proposed method, a probability model for speech presence/absence is constructed using spatial cues obtained from a dual-channel input signal, and speech activity intervals are detected through this probability model. In particular, the spatial cues are composed of the interaural time differences and interaural level differences of the dual-channel speech signals, and the probability model for speech presence/absence is based on a Gaussian kernel density. To evaluate the performance of the proposed VAD method, speech recognition is performed on speech segments that include only the speech intervals detected by the proposed method. Its performance is compared with that of several methods, namely an SNR-based method, a direction of arrival (DOA) based method, and a phase vector based method. The speech recognition experiments show that the proposed method outperforms the conventional methods, providing relative word error rate reductions of 11.68%, 41.92%, and 10.15% compared with the SNR-based, DOA-based, and phase vector based methods, respectively.
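The two spatial cues and the kernel-density decision can be sketched as below. This is a simplified illustration under several assumptions: cues are computed per frame in the time domain, the decision uses a one-dimensional cue with a plain likelihood-ratio threshold, and the training cue values are made up. All function names are hypothetical.

```python
import math


def ild_db(left, right, eps=1e-12):
    """Interaural level difference of one frame, in decibels."""
    e_left = sum(s * s for s in left) + eps
    e_right = sum(s * s for s in right) + eps
    return 10.0 * math.log10(e_left / e_right)


def itd_samples(left, right, max_lag):
    """Lag (in samples) by which `right` trails `left`, via cross-correlation."""
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = sum(left[n] * right[n + lag]
                    for n in range(len(left)) if 0 <= n + lag < len(right))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate as a function."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)
    return density


def is_speech(cue, speech_density, noise_density, threshold=1.0):
    """Declare speech when the likelihood ratio exceeds the threshold."""
    return speech_density(cue) > threshold * noise_density(cue)


# Made-up ILD training values (dB) for speech and noise frames.
speech_model = gaussian_kde([5.0, 6.0, 5.5], bandwidth=1.0)
noise_model = gaussian_kde([0.0, -0.5, 0.5], bandwidth=1.0)
```

A joint model over both cues would use a two-dimensional kernel instead; the one-dimensional version is kept here only to make the decision rule easy to follow.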


Differential Effects of Scopolamine on Memory Processes in the Object Recognition Test and the Morris Water Maze Test in Mice

  • Kim, Dong-Hyun;Ryu, Jong-Hoon
    • Biomolecules & Therapeutics
    • /
    • v.16 no.3
    • /
    • pp.173-178
    • /
    • 2008
  • Several lines of evidence indicate that scopolamine, a nonselective muscarinic antagonist, disrupts object recognition performance and spatial working memory when administered systemically. In the present study, we investigated the differential effects of scopolamine on the acquisition, consolidation, and retrieval phases of object recognition performance and spatial working memory using the object recognition and Morris water maze tasks in mice. In the acquisition phase test, scopolamine decreased the recognition index in the object recognition task and the trial 1 to trial 2 differences in the Morris water maze task. In the consolidation and retrieval phase tests, scopolamine also decreased the recognition index in the object recognition task, whereas it did not exhibit any effect in the Morris water maze task.

A Video Expression Recognition Method Based on Multi-mode Convolution Neural Network and Multiplicative Feature Fusion

  • Ren, Qun
    • Journal of Information Processing Systems
    • /
    • v.17 no.3
    • /
    • pp.556-570
    • /
    • 2021
  • Existing video expression recognition methods focus mainly on extracting spatial features from video expression images and tend to ignore the dynamic features of video sequences. To solve this problem, a multi-mode convolutional neural network method is proposed to improve the performance of facial expression recognition in video. First, OpenFace 2.0 is used to detect face images in the video, and two deep convolutional neural networks are used to extract spatiotemporal expression features: a spatial convolutional neural network extracts the spatial information features of each static expression image, and a temporal convolutional neural network extracts dynamic information features from the optical flow of multiple expression images. Then, the spatiotemporal features learned by the two networks are fused by multiplication. Finally, the fused features are input into a support vector machine to classify the facial expression. Experimental results show that the recognition accuracy of the proposed method reaches 64.57% and 60.89% on the RML and BAUM-1s datasets, respectively, which is better than that of the other methods compared.
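The fusion step can be illustrated with a minimal sketch. It assumes that "fused by multiplication" means an element-wise product of two equal-length feature vectors, one from each network; the abstract does not specify the exact form, and the function name is hypothetical.

```python
def multiplicative_fusion(spatial, temporal):
    """Element-wise product of a spatial and a temporal feature vector."""
    if len(spatial) != len(temporal):
        raise ValueError("feature vectors must have the same length")
    return [s * t for s, t in zip(spatial, temporal)]


# Toy feature vectors standing in for the two networks' outputs.
fused = multiplicative_fusion([1.0, 2.0, 0.5], [2.0, 0.5, 4.0])
```

Under this assumption a fused component is large only when both streams respond strongly, so the product acts as a soft agreement between spatial and temporal evidence before classification.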

Effects of JPEG Compression on Joint Transform Correlator

  • Widjaja, Joewono;Suripon, Ubon
    • 제어로봇시스템학회:학술대회논문집 (Institute of Control, Robotics and Systems: Conference Proceedings)
    • /
    • 2004.08a
    • /
    • pp.1662-1665
    • /
    • 2004
  • A real-time joint transform correlator that uses JPEG-compressed reference images is proposed as a practical solution to the storage problem and as a way to improve the processing time of automatic target recognition systems [1]. The effects of compression on the recognition performance of the joint transform correlator are quantitatively investigated in situations where the target is corrupted by noise and differs in contrast from the reference. Two images with different spatial-frequency contents and contrast were used as test scenes. The simulation results show that the recognition performance of the joint transform correlator is more sensitive to noise and contrast difference when the compressed reference image contains high spatial-frequency components than when it is a low spatial-frequency image.
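The principle of a joint transform correlator can be sketched in one dimension. This is a digital toy model, not the authors' optical setup or simulation: the reference and target are placed side by side, the joint power spectrum is computed, and a second transform yields correlation peaks whose position encodes the separation between the two signals. JPEG compression is not modeled, and all names are hypothetical.

```python
import cmath


def dft(values, inverse=False):
    """Naive discrete Fourier transform (O(n^2)), adequate for a toy example."""
    n = len(values)
    sign = 1 if inverse else -1
    result = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                    for j, v in enumerate(values))
        result.append(total / n if inverse else total)
    return result


def jtc_correlation(reference, target, gap):
    """Correlation-plane output of a 1-D joint transform correlator."""
    joint = list(reference) + [0.0] * gap + list(target)
    joint += [0.0] * len(joint)               # zero-pad against wrap-around
    power = [abs(c) ** 2 for c in dft(joint)]  # joint power spectrum
    return [c.real for c in dft(power, inverse=True)]


def cross_peak_lag(output, min_lag):
    """Strongest peak away from the zero-lag (autocorrelation) region."""
    half = len(output) // 2
    return max(range(min_lag, half), key=lambda lag: output[lag])


# Identical reference and target separated by a gap of 5 samples.
pattern = [1.0, 3.0, 2.0]
plane = jtc_correlation(pattern, pattern, gap=5)
peak = cross_peak_lag(plane, min_lag=len(pattern))
```

The cross-correlation peak appears at a lag equal to the distance between the start of the reference and the start of the target, and its height drops as the target departs from the reference, which is the quantity a recognition-performance study would track.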


Utilization of Visual Context for Robust Object Recognition in Intelligent Mobile Robots (지능형 이동 로봇에서 강인 물체 인식을 위한 영상 문맥 정보 활용 기법)

  • Kim, Sung-Ho;Kim, Jun-Sik;Kweon, In-So
    • The Journal of Korea Robotics Society
    • /
    • v.1 no.1
    • /
    • pp.36-45
    • /
    • 2006
  • In this paper, we introduce visual contexts, in terms of their types and utilization methods, for robust object recognition in intelligent mobile robots. One of the core technologies for intelligent robots is visual object recognition. Robust techniques are strongly required because there are many sources of visual variation, such as geometric and photometric changes and noise. To meet these requirements, we define spatial context, hierarchical context, and temporal context, which can be selected according to the object recognition domain. We also propose a unified framework that can utilize all of these contexts and validate it in a real working environment. Finally, we discuss future research directions for object recognition technologies for intelligent robots.


A Proposal of Shuffle Graph Convolutional Network for Skeleton-based Action Recognition

  • Jang, Sungjun;Bae, Han Byeol;Lee, HeanSung;Lee, Sangyoun
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.14 no.4
    • /
    • pp.314-322
    • /
    • 2021
  • Skeleton-based action recognition has attracted considerable attention in human action recognition. Recent methods for skeleton-based action recognition employ spatiotemporal graph convolutional networks (GCNs) and achieve remarkable performance. However, most of them incur heavy computational complexity to achieve robust action recognition. To solve this problem, we propose a shuffle graph convolutional network (SGCN), a lightweight graph convolutional network that uses pointwise group convolution rather than pointwise convolution to reduce computational cost. Our SGCN is composed of a spatial GCN and a temporal GCN. The spatial shuffle GCN contains pointwise group convolution and a part shuffle module that enhances local and global information between correlated joints. In addition, the temporal shuffle GCN contains depthwise convolution to maintain a large receptive field. Our model achieves comparable performance with the lowest computational cost and exceeds the baseline by 0.3% and 1.2% on the NTU RGB+D and NTU RGB+D 120 datasets, respectively.
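The cost saving from pointwise group convolution, and the kind of shuffle that lets information cross group boundaries, can be sketched as follows. Two assumptions apply: bias terms are ignored in the parameter count, and the shuffle shown is the generic channel shuffle known from ShuffleNet-style networks, which may differ from the paper's part shuffle module. The function names are hypothetical.

```python
def pointwise_params(c_in, c_out, groups=1):
    """Weight count of a 1x1 convolution split into `groups` groups."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * (c_out // groups) * groups


def channel_shuffle(channels, groups):
    """Interleave channels so each group receives inputs from every group."""
    n = len(channels)
    if n % groups:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group) for g in range(groups)]


dense = pointwise_params(64, 64)              # ordinary pointwise convolution
grouped = pointwise_params(64, 64, groups=4)  # pointwise group convolution
shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=2)
```

With `g` groups the weight count falls by a factor of `g`, but each output channel then sees only one group of inputs; a shuffle between layers is a common way to restore the exchange of information across groups.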

A Recognition Method for Korean Spatial Background in Historical Novels (한국어 역사 소설에서 공간적 배경 인식 기법)

  • Kim, Seo-Hee;Kim, Seung-Hoon
    • Journal of Information Technology Services
    • /
    • v.15 no.1
    • /
    • pp.245-253
    • /
    • 2016
  • Along with characters and events, the background is one of the most important elements of a novel; it refers to the time, place, and situation in which the characters appear. Of these, the spatial background can help convey the topic of a novel, so it may help readers choose a novel they want to read. In this paper, we target Korean historical novels. In English text, spatial background is easy to recognize because of capitalization and words that accompany spatial information, such as Bank, University, and City. In Korean text, however, spatial background is difficult to recognize because letter usage provides little such information. Previous studies used machine learning or dictionaries and rules to recognize spatial information in text such as news and text messages. In this paper, we build nation dictionaries that draw on sources such as 'Korean history' and 'Google Maps.' In contrast to previous work, we also propose a method for recognizing spatial background based on patterns of postpositions in Korean sentences, identifying how postpositions are used with spatial background, which is a characteristic of the Korean language. We further propose a method based on the results of morphological analysis and frequency in the novel text to improve the accuracy of spatial background recognition. The recognized spatial background can help readers grasp the atmosphere of a novel and understand its events through the spatial background of the scenes in which the characters appear.
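The combination of a dictionary, postposition patterns, and frequency can be sketched as below. This is a deliberately simplified illustration: it replaces morphological analysis with plain suffix matching on whitespace-separated tokens, the postposition list is a small assumed subset of Korean locative markers, and the sentence and dictionary entries are made-up examples. The function name is hypothetical.

```python
from collections import Counter

# Assumed subset of locative postpositions; longer forms are checked first.
LOCATIVE_POSTPOSITIONS = ("에서", "으로", "에", "로")


def spatial_candidates(text, place_dictionary):
    """Count dictionary place names that occur with a locative postposition."""
    counts = Counter()
    for token in text.split():
        token = token.strip(".,!?\"'")
        for post in LOCATIVE_POSTPOSITIONS:
            stem = token[:-len(post)]
            if token.endswith(post) and stem in place_dictionary:
                counts[stem] += 1
                break
    return counts


# Made-up sentence and dictionary for illustration only.
places = {"한양", "평양"}
sample = "왕은 한양에서 떠나 평양으로 갔다. 그는 한양에 돌아왔다."
found = spatial_candidates(sample, places)
```

Ranking candidates by count, for example with `found.most_common(1)`, gives a simple frequency-based guess at the dominant spatial background of a passage; a real system would rely on a morphological analyzer to separate stems from postpositions reliably.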