• Title/Abstract/Keywords: Video recognition


Depth Images-based Human Detection, Tracking and Activity Recognition Using Spatiotemporal Features and Modified HMM

  • Kamal, Shaharyar; Jalal, Ahmad; Kim, Daijin
    • Journal of Electrical Engineering and Technology / Vol. 11, No. 6 / pp. 1857-1862 / 2016
  • Human activity recognition using depth information is an emerging and challenging technology in computer vision, having attracted considerable attention from many practical applications such as smart home/office systems, personal health care, and 3D video games. This paper presents a novel framework for 3D human body detection, tracking, and recognition from depth video sequences using spatiotemporal features and a modified HMM. To detect the human silhouette, raw depth data is examined, extracting the silhouette by considering spatial continuity and the constraints of human motion information, while frame differencing is used to track human movements. The feature extraction mechanism combines spatial depth shape features with temporal joint features to improve classification performance; both feature types are fused to recognize different activities using the modified hidden Markov model (M-HMM). The proposed approach is evaluated on two challenging depth video datasets. Moreover, our system handles rotated and missing body parts, which is a major contribution to human activity recognition.
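
A minimal sketch of the classification stage described above, assuming the fused spatiotemporal feature sequences have already been extracted: one HMM is trained per activity, and a test sequence is assigned to the model with the highest log-likelihood. hmmlearn's GaussianHMM stands in here for the paper's modified M-HMM.

```python
# Per-class HMM activity classification on fused spatiotemporal features.
import numpy as np
from hmmlearn.hmm import GaussianHMM

def train_activity_hmms(sequences_by_activity, n_states=5):
    """sequences_by_activity: {activity: list of (T_i x D) feature arrays}."""
    models = {}
    for activity, seqs in sequences_by_activity.items():
        X = np.concatenate(seqs)          # stack all frames of this class
        lengths = [len(s) for s in seqs]  # per-sequence lengths for fit()
        m = GaussianHMM(n_components=n_states, covariance_type="diag",
                        n_iter=100)
        m.fit(X, lengths)
        models[activity] = m
    return models

def classify(models, seq):
    # Pick the activity whose HMM assigns the highest log-likelihood.
    return max(models, key=lambda a: models[a].score(seq))
```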

Character Recognition and Search for Media Editing

  • 박용석; 김현식
    • 방송공학회논문지 / Vol. 27, No. 4 / pp. 519-526 / 2022
  • When editing video content, distinguishing and identifying the characters who appear is a task that demands considerable time and effort. Applying artificial intelligence to this labor-intensive editing work can dramatically reduce media production time and improve the efficiency of the creative process. This paper proposes a technique that combines multiple AI methods to automate character identification and search for video editing. Object detection, face detection, and pose estimation are used to collect feature information about person objects, and based on the collected information, face recognition and color-space analysis are used to generate person identification information. The person features and identification information are collected for every frame of the target video and serve as metadata for frame-level search during video editing.
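
A rough sketch of the per-frame indexing loop the abstract describes, using OpenCV's Haar cascade for face detection; the face_embedding helper is a hypothetical placeholder for the paper's face recognition and color-space analysis steps.

```python
# Per-frame face detection and metadata collection for character search.
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def face_embedding(face_img):
    # Hypothetical stand-in: a real system would use a face recognizer
    # plus color-space analysis here.
    return cv2.resize(face_img, (16, 16)).flatten()

def index_video(path):
    cap = cv2.VideoCapture(path)
    metadata = {}                 # frame index -> list of face records
    frame_no = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = detector.detectMultiScale(gray, 1.1, 5)
        metadata[frame_no] = [
            {"bbox": (x, y, w, h),
             "feature": face_embedding(gray[y:y + h, x:x + w])}
            for (x, y, w, h) in faces]
        frame_no += 1
    cap.release()
    return metadata
```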

New Developmental Directions of Telecommunications for Disability Welfare

  • 박민수
    • 한국정보통신학회논문지 / Vol. 4, No. 1 / pp. 35-43 / 2000
  • This study investigated development directions of telecommunications for disability welfare, so that people with disabilities can adapt to the information society on equal terms with the general public. The Delphi technique was adopted as the research method, and literature review and interviews were combined and analyzed within an analytical framework. The problems people with disabilities face regarding telecommunications were identified as: inconvenient access to telecommunications, degraded universal service, low use of PC communication, underdeveloped disability welfare, burdensome telecommunication fees, insufficient informatization education, a lack of information for people with disabilities, and their absence from welfare policy decision-making. As for the telecommunication technologies needed: people with physical disabilities need speech recognition, image recognition, and breath-pressure sensing technologies; people with visual impairments need display, speech recognition, character recognition, intelligent conversion processing, and image-recognition/speech-synthesis technologies; and people with hearing or speech impairments need speech signal processing, speech recognition, intelligent conversion processing, character recognition, image recognition, and speech synthesis technologies. To develop telecommunications for disability welfare, the following are required: establishing a telecommunications committee for people with disabilities, providing universal service, conducting informatization education, supporting research and development, fostering small and medium telecommunication companies, fostering the software industry, and promoting standardization for people with disabilities.


Image Processing for Video Images of Buoy Motion

  • Kim, Baeck-Oon; Cho, Hong-Yeon
    • Ocean Science Journal / Vol. 40, No. 4 / pp. 213-220 / 2005
  • In this paper, we investigate an image processing technique that reduces video images of buoy motion to time series of the image coordinates of buoy objects. The buoy motion images are noisy due to time-varying brightness as well as non-uniform background illumination, and the occurrence of boats, wakes, and wind-induced whitecaps interferes significantly with recognition of the buoy objects. We therefore adopt semi-automated procedures consisting of an object recognition stage and an image measurement stage, which offer more satisfactory results than a manual process. Spectral analysis shows that the image coordinates of the buoy objects represent wave motion well, indicating their usefulness in the analysis of wave characteristics.
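
A small sketch of the reduction step under simple assumptions (a bright buoy against a darker background, a fixed threshold): each frame yields one centroid, and the resulting coordinate series can then be examined spectrally with an FFT, as the abstract describes.

```python
# Reduce frames to buoy image coordinates, then compute a motion spectrum.
import numpy as np

def buoy_centroid(frame, threshold=128):
    """frame: 2-D grayscale array; returns (row, col) centroid or None."""
    mask = frame > threshold
    if not mask.any():
        return None
    rows, cols = np.nonzero(mask)
    return rows.mean(), cols.mean()

def motion_spectrum(y_coords, fps=30.0):
    """Amplitude spectrum of the vertical-coordinate time series."""
    y = np.asarray(y_coords, dtype=float)
    y -= y.mean()                       # remove the DC offset
    amp = np.abs(np.fft.rfft(y))
    freqs = np.fft.rfftfreq(len(y), d=1.0 / fps)
    return freqs, amp
```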

Decomposed "Spatial and Temporal" Convolution for Human Action Recognition in Videos

  • Sediqi, Khwaja Monib; Lee, Hyo Jong
    • 한국정보처리학회 학술대회논문집 / 2019 Spring Conference / pp. 455-457 / 2019
  • In this paper we study the effect of decomposed spatiotemporal convolutions for action recognition in videos. Our motivation emerges from the empirical observation that spatial convolution applied to individual frames of a video already provides good performance in action recognition. We empirically show the accuracy of factorized convolution on individual video frames for action classification. We take 3D ResNet-18 as the baseline model for our experiments and factorize its 3D convolutions into 2D (spatial) and 1D (temporal) convolutions. We train the model from scratch on the Kinetics video dataset, then fine-tune it on the UCF-101 dataset and evaluate its performance. Our results show accuracy comparable to that of state-of-the-art algorithms on the Kinetics and UCF-101 datasets.
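
The factorization being studied can be sketched as follows, assuming PyTorch: a full 3-D convolution is replaced by a (1, k, k) spatial convolution followed by a (k, 1, 1) temporal one (the R(2+1)D idea). The intermediate channel width is left as a free choice here.

```python
# Decomposed spatial + temporal convolution block.
import torch.nn as nn

class SpatioTemporalConv(nn.Module):
    def __init__(self, in_ch, out_ch, k=3, mid_ch=None):
        super().__init__()
        mid_ch = mid_ch or out_ch
        # (1, k, k): convolve over height/width within each frame.
        self.spatial = nn.Conv3d(in_ch, mid_ch, kernel_size=(1, k, k),
                                 padding=(0, k // 2, k // 2))
        self.relu = nn.ReLU(inplace=True)
        # (k, 1, 1): convolve over time at each spatial location.
        self.temporal = nn.Conv3d(mid_ch, out_ch, kernel_size=(k, 1, 1),
                                  padding=(k // 2, 0, 0))

    def forward(self, x):                # x: (N, C, T, H, W)
        return self.temporal(self.relu(self.spatial(x)))
```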

Video augmentation technique for human action recognition using genetic algorithm

  • Nida, Nudrat; Yousaf, Muhammad Haroon; Irtaza, Aun; Velastin, Sergio A.
    • ETRI Journal / Vol. 44, No. 2 / pp. 327-338 / 2022
  • Classification models for human action recognition require robust features and large training sets for good generalization. Data augmentation methods are commonly employed on imbalanced training sets to achieve higher accuracy, but the samples they generate only reflect existing samples within the training set; their feature representations are less diverse and hence contribute to less precise classification. This paper presents new data augmentation and action representation approaches to grow training sets. The proposed approach is based on two fundamental concepts: virtual video generation for augmentation, and representation of the action videos through robust features. Virtual videos are generated from the motion history templates of action videos and convolved by a convolutional neural network to produce deep features. Guided by the objective function of a genetic algorithm, the spatiotemporal features of different samples are then combined to generate the representations of the virtual videos, which are classified by an extreme learning machine on the MuHAVi-Uncut, IXMAS, and IAVID-1 datasets.
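
One way the genetic-algorithm step could look, as a hedged sketch: binary chromosomes select which samples' deep features are blended into a virtual representation, scored by a caller-supplied fitness function. The selection, crossover, and mutation scheme here is illustrative, not the paper's exact operators.

```python
# GA-driven combination of deep features into a virtual-video feature.
import numpy as np

rng = np.random.default_rng(0)

def evolve_virtual_feature(feats, fitness, pop=30, gens=50, p_mut=0.05):
    """feats: N x D array of one class's deep features;
    fitness(mask, feats) -> float, higher is better."""
    n = len(feats)
    population = rng.integers(0, 2, size=(pop, n))
    for _ in range(gens):
        scores = np.array([fitness(m, feats) for m in population])
        order = np.argsort(scores)[::-1]
        parents = population[order[:pop // 2]]            # selection
        cut = n // 2                                      # one-point crossover
        children = np.concatenate(
            [parents[:, :cut], parents[::-1, cut:]], axis=1)
        flips = rng.random(children.shape) < p_mut        # mutation
        children = np.where(flips, 1 - children, children)
        population = np.concatenate([parents, children])
    best = population[np.argmax([fitness(m, feats) for m in population])]
    if not best.any():                 # degenerate mask: fall back to all
        return feats.mean(axis=0)
    return feats[best.astype(bool)].mean(axis=0)          # virtual feature
```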

Recognition of Human Facial Expression in a Video Image using the Active Appearance Model

  • Jo, Gyeong-Sic; Kim, Yong-Guk
    • Journal of Information Processing Systems / Vol. 6, No. 2 / pp. 261-268 / 2010
  • Tracking human facial expression within a video image has many useful applications, such as surveillance and teleconferencing. The Active Appearance Model (AAM) was initially proposed for facial recognition; however, it turns out that the AAM has many advantages for continuous facial expression recognition. We have implemented a continuous facial expression recognition system using the AAM; in this study, we adopt an independent AAM fitted with the Inverse Compositional Image Alignment method. The system was evaluated on the standard Cohn-Kanade facial expression database, and the results show that it could have numerous potential applications.
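
A minimal sketch of the recognition stage, assuming the AAM fitting itself is available elsewhere: the fitted parameter vectors are compact per-frame features, here fed to a generic SVM (the paper's own classifier may differ). fit_aam is a hypothetical placeholder.

```python
# Expression classification on AAM parameter vectors.
import numpy as np
from sklearn.svm import SVC

def fit_aam(frame):
    """Hypothetical stand-in: a real implementation runs inverse
    compositional alignment and returns the fitted AAM parameters."""
    return np.asarray(frame, dtype=float).ravel()[:50]

def train_expression_classifier(param_vectors, labels):
    clf = SVC(kernel="rbf")
    clf.fit(np.asarray(param_vectors), labels)
    return clf

def recognize(clf, frames):
    # Per-frame expression labels over a video sequence.
    return [clf.predict(fit_aam(f).reshape(1, -1))[0] for f in frames]
```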

Temporal matching prior network for vehicle license plate detection and recognition in videos

  • Yoo, Seok Bong; Han, Mikyong
    • ETRI Journal / Vol. 42, No. 3 / pp. 411-419 / 2020
  • In real-world intelligent transportation systems, accuracy in vehicle license plate detection and recognition is critical. Many algorithms have been proposed for still images, but their accuracy on actual videos is not satisfactory. This stems from several problematic conditions in videos, such as vehicle motion blur, variety in viewpoints, outliers, and the lack of publicly available video datasets. In this study, we focus on these challenges and propose a license plate detection and recognition scheme for videos based on a temporal matching prior network. Specifically, to improve the robustness of detection and recognition accuracy in the presence of motion blur and outliers, forward and bidirectional matching priors between consecutive frames are combined with layer structures specifically designed for plate detection. We also built our own video dataset for deep training of the proposed network. During training, we perform data augmentation based on image rotation to increase robustness to the varied viewpoints found in videos.
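
The rotation-based augmentation mentioned above might look like the following sketch with OpenCV; the angle range and the number of variants per image are assumptions.

```python
# Rotation augmentation for plate images, mimicking varied viewpoints.
import cv2
import numpy as np

def rotate_augment(img, max_deg=15.0, n=4, seed=0):
    rng = np.random.default_rng(seed)
    h, w = img.shape[:2]
    out = []
    for _ in range(n):
        angle = rng.uniform(-max_deg, max_deg)
        M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
        out.append(cv2.warpAffine(img, M, (w, h),
                                  borderMode=cv2.BORDER_REPLICATE))
    return out
```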

Collaborative Place and Object Recognition in Video Using Bidirectional Context Information

  • 김성호; 권인소
    • 로봇학회논문지 / Vol. 1, No. 2 / pp. 172-179 / 2006
  • In this paper, we present a practical place and object recognition method for guiding visitors in building environments. Recognizing places or objects in the real world can be difficult due to motion blur and camera noise. We present a modeling method based on the bidirectional interaction between places and objects, in which each reinforces the other for robust recognition, and we also unify visual context, including scene context, object context, and temporal context. The proposed system has been tested for guiding visitors in a large-scale building environment (10 topological places, 80 3D objects).
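
A toy sketch of the bidirectional interaction, assuming a known likelihood table P(object | place): place beliefs re-weight object detections, and the updated object beliefs re-weight places, iterated a few times until the two belief vectors stabilize.

```python
# Iterative bidirectional place-object belief update.
import numpy as np

def co_recognize(place_prior, obj_scores, obj_given_place, iters=5):
    """place_prior: (P,), obj_scores: (O,), obj_given_place: (P, O)."""
    place = place_prior.copy()
    for _ in range(iters):
        # Object beliefs: detector scores modulated by current place belief.
        obj = obj_scores * (place @ obj_given_place)
        obj /= obj.sum()
        # Place beliefs: prior modulated by how well each place explains
        # the current object beliefs.
        place = place_prior * (obj_given_place @ obj)
        place /= place.sum()
    return place, obj
```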


Hand Gesture Recognition Using DP Matching from USB Camera Video

  • 하진영; 변민우; 김진식
    • 산업기술연구 / Vol. 29, No. A / pp. 47-54 / 2009
  • In this paper, we propose hand detection and hand gesture recognition from USB camera video. First, we extract the hand region using skin color information from difference images; a background image is stored initially and subtracted from the input images to reduce problems caused by complex backgrounds. A 16-directional chain code sequence is then computed by tracking the hand motion, and these chain code sequences are compared with pre-trained models using DP matching. Our hand gesture recognition system can be used to control PowerPoint slides or be applied to multimedia education systems. We achieved 92% hand region extraction accuracy and 82.5% gesture recognition accuracy.
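
A compact sketch of the two core steps, under assumed unit insertion/deletion costs: the tracked hand trajectory is quantized into a 16-directional chain code, then compared to a stored model sequence with a DP edit-distance style match.

```python
# 16-directional chain code extraction and DP matching.
import math

def chain_code(points, n_dirs=16):
    """points: list of (x, y) hand positions over time."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        ang = math.atan2(y1 - y0, x1 - x0) % (2 * math.pi)
        codes.append(int(round(ang / (2 * math.pi / n_dirs))) % n_dirs)
    return codes

def dp_distance(a, b, n_dirs=16):
    # Substitution cost = cyclic distance between direction codes.
    def sub(i, j):
        d = abs(a[i] - b[j])
        return min(d, n_dirs - d)
    m, n = len(a), len(b)
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = i
    for j in range(1, n + 1):
        D[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            D[i][j] = min(D[i - 1][j] + 1,              # deletion
                          D[i][j - 1] + 1,              # insertion
                          D[i - 1][j - 1] + sub(i - 1, j - 1))
    return D[m][n]
```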
