• Title/Summary/Keyword: kinect

Detection of User Behavior Using Real-Time User Joints and YOLOv3 (실시간 사용자 관절과 YOLOv3를 이용한 사용자 행동 검출)

  • Oh, Ye-Jun;Kim, Sang-Joon;Choi, Hee-Jo;Park, Goo-Man
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2021.06a / pp.228-231 / 2021
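  • Recognizing a person's actions and movement can be applied in many fields. By understanding a person's behavior, a system can anticipate needs and provide customized content, or predict behavior to prevent crime and violence, among other uses. However, predicting a person's behavior from movement and current location information alone has limits. In this paper, to recognize a person's movement and behavior in real time, we built a system that recognizes human behavior in real time using the joint information provided by Kinect v2 together with YOLOv3.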

3D Human Keypoint Detection With RGB and Depth Image (RGB 이미지와 Depth 이미지를 이용한 3D 휴먼 키포인트 탐지)

  • Jeong, Keunseok;Lee, Yegi;Yoon, Kyoungro
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2021.06a / pp.239-241 / 2021
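  • With the outbreak of COVID-19 in 2019 restricting leisure activities worldwide, people have turned to home training to manage their health. In addition, with recent advances in computing, much research is under way in which computers, rather than human observers, interpret human behavior through keypoint detection. Accordingly, this paper estimates 3D keypoints from RGB and depth images captured with an Azure Kinect. A 2D keypoint detector locates the keypoint coordinates in the RGB image. The detected 2D coordinates are then projected onto the depth image, and the depth values extracted there are used to obtain the 3D keypoints.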
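
The step of lifting 2D keypoints into 3D with a depth image can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a pinhole camera model, a depth image already aligned to the RGB image and stored in millimetres (0 meaning no measurement), and the function name `backproject_keypoints` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical.

```python
import numpy as np

def backproject_keypoints(keypoints_2d, depth_mm, fx, fy, cx, cy):
    """Lift (u, v) pixel keypoints to 3D camera coordinates in metres.

    Assumptions (not from the paper): pinhole intrinsics fx, fy, cx, cy;
    depth aligned to the RGB frame, in millimetres, 0 = no measurement;
    keypoints lie inside the image (no bounds checking is done here).
    """
    points = []
    for u, v in keypoints_2d:
        # Read the depth at the nearest pixel and convert to metres.
        z = depth_mm[int(round(v)), int(round(u))] / 1000.0
        if z <= 0:
            # Missing depth: mark the keypoint as unknown.
            points.append((np.nan, np.nan, np.nan))
            continue
        # Invert the pinhole projection u = fx * x / z + cx (same for v).
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        points.append((x, y, z))
    return np.array(points, dtype=float)
```

In practice the intrinsics would come from the camera's calibration, and a keypoint with missing depth might be filled in from neighboring pixels rather than left as NaN.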

A Design and Implementation of English Word Learning Application (영어 단어 학습 애플리케이션 설계 및 구현)

  • Lee, Won Joo;Lee, Ki Won;Lee, Min Cheol;Lee, Jin Ho;Heo, Min Ho
    • Proceedings of the Korean Society of Computer Information Conference / 2022.01a / pp.59-60 / 2022
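  • This paper designs and implements an English word learning application for young children. The application uses the speech recognition function of the Kinect sensor to provide word learning in the categories of animals and food. When the learner says the English word corresponding to the image shown on screen, the Kinect sensor recognizes the speech and judges whether the word is pronounced correctly. The application is implemented so that the learner earns a higher score by correctly pronouncing a variety of words within a given time.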

An Intelligent Emotion Recognition Model Using Facial and Bodily Expressions

  • Jae Kyeong Kim;Won Kuk Park;Il Young Choi
    • Asia Pacific Journal of Information Systems / v.27 no.1 / pp.38-53 / 2017
  • As sensor technologies and image processing technologies make collecting information on users' behavior easy, many researchers have examined automatic emotion recognition based on facial expressions, body expressions, and tone of voice, among others. Specifically, many multimodal studies using facial and body expressions have relied on normal cameras. As a result, previous studies drew on a limited amount of information, because normal cameras generally produce only two-dimensional images. In the present research, we propose an artificial neural network-based model that uses a high-definition webcam and Kinect to recognize users' emotions from facial and bodily expressions while they watch a movie trailer. We validate the proposed model in a naturally occurring field environment rather than in an artificially controlled laboratory environment. The results of this research will help broaden the use of emotion recognition models in advertisements, exhibitions, and interactive shows.

The Individual Discrimination Location Tracking Technology for Multimodal Interaction at the Exhibition (전시 공간에서 다중 인터랙션을 위한 개인식별 위치 측위 기술 연구)

  • Jung, Hyun-Chul;Kim, Nam-Jin;Choi, Lee-Kwon
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.19-28 / 2012
  • After the internet era, we are moving toward a ubiquitous society. People are increasingly interested in multimodal interaction technology, which lets audiences interact naturally with the computing environment at exhibitions such as galleries, museums, and parks. There are also attempts to provide additional services based on the audience's location information, or to improve and deploy interaction between subjects and audience by analyzing people's usage patterns. To provide multimodal interaction services to an exhibition audience, it is important to distinguish individuals and trace their locations and routes. For outdoor location tracking, GPS is widely used. GPS can obtain the real-time location of fast-moving subjects, so it is one of the key technologies in fields that require location tracking. However, because GPS relies on satellites, it cannot be used indoors, where the satellite signal cannot be received. For this reason, indoor location tracking is being studied using very short range communication such as ZigBee, UWB, and RFID, as well as mobile communication networks and wireless LAN. However, these technologies have shortcomings: the audience needs to carry an additional sensor device, and deployment becomes more difficult and expensive as the density of the target area increases. In addition, a typical exhibition environment contains many obstacles to the network, which degrades system performance. Above all, the biggest problem is that interaction methods based on these older device technologies cannot provide a natural service to users. Moreover, because such systems rely on sensor recognition, every user must be equipped with a device, which limits the number of users who can use the system simultaneously.
To make up for these shortcomings, this study proposes a technology that obtains the exact location of users through location mapping using smartphone Wi-Fi and 3D cameras. We use the signal amplitude of wireless LAN access points (APs) to build a lower-cost indoor location tracking system. An AP is cheaper than the devices used in other tracking techniques, and once the software is installed on a user's mobile device, that device can serve directly as the tracking device. We used the Microsoft Kinect sensor as the 3D camera. Kinect can discriminate depth and human information within the shooting area, so it is suitable for extracting a user's body, vector, and acceleration information at low cost. We confirm the location of the audience using the cell ID obtained from the Wi-Fi signal. By using smartphones as the basic device for the location service, we remove the need for an additional tagging device and provide an environment in which multiple users can receive the interaction service simultaneously. The 3D cameras located in each cell area obtain the exact location and status information of the users. The 3D cameras are connected to the Camera Client, which calculates the mapping information aligned to each cell and obtains the exact location, status, and pattern information of the audience. The Camera Client's location mapping technique decreases the error rate of the indoor location service, increases the accuracy of individual discrimination within the area by using body information, and establishes the foundation for multimodal interaction technology at exhibitions. The calculated data and information enable users to receive the appropriate interaction service through the main server.

HEVC Encoder Optimization using Depth Information (깊이정보를 이용한 HEVC의 인코더 고속화 방법)

  • Lee, Yoon Jin;Bae, Dong In;Park, Gwang Hoon
    • Journal of Broadcast Engineering / v.19 no.5 / pp.640-655 / 2014
  • Many of today's video systems include an additional depth camera to provide extra features such as 3D support. As a result, it is now much easier to obtain depth information for a video. Depth information can be used for tasks such as object classification and background area recognition, and it can yield higher coding efficiency than conventional methods alone. In this paper, we therefore propose a 2D video coding algorithm that uses depth information on top of the next-generation 2D video codec HEVC. Background areas can be recognized from the depth information, and exploiting this in HEVC reduces coding complexity. When the current CU is a background area, we propose three methods: 1) early termination of the CU split structure with PU SKIP mode, 2) limiting the CU split structure using CU information at the temporal position, and 3) limiting the motion search range. We implemented our proposal in the HEVC HM 12.0 reference software. The results show that encoding complexity is reduced by more than 40% with only a 0.5% BD-Bitrate loss. In particular, for video acquired through the Kinect developed by Microsoft Corp., encoding complexity is reduced by up to 53% without a loss of quality. These techniques are therefore expected to be applicable to real-time online communication and to mobile or handheld video services.

Template-Matching-based High-Speed Face Tracking Method using Depth Information (깊이 정보를 이용한 템플릿 매칭 기반의 고속 얼굴 추적 방법)

  • Kim, Wooyoul;Seo, Youngho;Kim, Dongwook
    • Journal of Broadcast Engineering / v.18 no.3 / pp.349-361 / 2013
  • This paper proposes a fast face tracking method that uses only depth information. It is basically a template matching method, but it uses an early termination scheme and a sparse search scheme to reduce the long execution time that is the main drawback of template matching. A refinement process using neighboring pixels is also incorporated to reduce tracking error. Changes in the depth of the tracked face are compensated by predicting the depth of the face and resizing the template, and the search area is adjusted on the basis of the resized template. The parameters used in face tracking are determined empirically with home-made test sequences. The proposed algorithm and the extracted parameters are then applied to other home-made test sequences and an MPEG multi-view test sequence. In the experiments, the home-made Kinect sequences ($640 \times 480$) showed an average tracking error of about 3% and an execution time of 2.45 ms, while the MPEG test sequence ($1024 \times 768$) showed a tracking error of about 1% and an execution time of 7.46 ms.
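
The two speed-up ideas named in this abstract, early termination and sparse search, can be illustrated with a small sketch. It is not the paper's algorithm: the sum of absolute differences (SAD) cost, the fixed grid step, and the function names `sad_early_stop` and `track` are assumptions made for illustration, and the refinement, depth prediction, and template resizing steps are omitted.

```python
import numpy as np

def sad_early_stop(patch, template, best_so_far):
    """Row-wise SAD that gives up once it can no longer beat best_so_far."""
    total = 0.0
    for row_p, row_t in zip(patch, template):
        total += float(np.abs(row_p - row_t).sum())
        if total >= best_so_far:
            return None  # early termination: this candidate cannot win
    return total

def track(depth, template, step=2):
    """Return (row, col, cost) of the best match on a sparse search grid.

    `step` > 1 skips candidate positions (sparse search); SAD as the
    matching cost is an assumption, not taken from the paper.
    """
    depth = depth.astype(np.float64)
    template = template.astype(np.float64)
    th, tw = template.shape
    best = (None, None, np.inf)
    for r in range(0, depth.shape[0] - th + 1, step):
        for c in range(0, depth.shape[1] - tw + 1, step):
            cost = sad_early_stop(depth[r:r + th, c:c + tw], template, best[2])
            if cost is not None:
                best = (r, c, cost)
    return best
```

A sparse grid can miss the true position when it falls between grid points, which is one reason a refinement pass over neighboring pixels, as the abstract mentions, is useful after the coarse search.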

Depth Upsampling Method Using Total Generalized Variation (일반적 총변이를 이용한 깊이맵 업샘플링 방법)

  • Hong, Su-Min;Ho, Yo-Sung
    • Journal of Broadcast Engineering / v.21 no.6 / pp.957-964 / 2016
  • Acquisition of reliable depth maps is a critical requirement in many applications such as 3D video and free-viewpoint TV. Depth information can be obtained directly from the object using physical sensors, such as infrared (IR) sensors. Recently, Time-of-Flight (ToF) range cameras, including the KINECT depth camera, have become popular alternatives for dense depth sensing. Although ToF cameras can capture depth information in real time, their output is noisy and low in resolution. Filter-based depth upsampling algorithms such as joint bilateral upsampling (JBU) and the noise-aware filter for depth upsampling (NAFDU) have recently been proposed to obtain high-quality depth information. However, these methods often lead to texture copying in the upsampled depth map. To overcome this limitation, we formulate a convex optimization problem using higher-order regularization for depth map upsampling. We reduce texture copying in the upsampled depth map by using an edge weighting term chosen from the edge information. Experimental results show that our scheme produces more reliable depth maps than previous methods.
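
For context, the filter-based baseline mentioned in this abstract, joint bilateral upsampling, can be sketched as below. This shows the prior approach the paper compares against, not the proposed total generalized variation method. The Gaussian kernels, the parameter names `radius`, `sigma_s`, and `sigma_r`, and the integer `scale` factor are assumptions for illustration.

```python
import numpy as np

def joint_bilateral_upsample(depth_lo, guide_hi, scale, radius=1,
                             sigma_s=1.0, sigma_r=0.1):
    """Upsample depth_lo to the size of guide_hi (a grayscale guide image).

    Each output pixel is a weighted mean of nearby low-resolution depths.
    The weight combines spatial closeness (in low-res coordinates) with
    similarity of the guide image, so depth edges follow guide edges.
    Assumes an integer scale factor and radius >= 1.
    """
    H, W = guide_hi.shape
    h, w = depth_lo.shape
    out = np.zeros((H, W), dtype=np.float64)
    for y in range(H):
        for x in range(W):
            yl, xl = y / scale, x / scale            # position in low-res grid
            y0, x0 = int(round(yl)), int(round(xl))
            num = den = 0.0
            for qy in range(max(0, y0 - radius), min(h, y0 + radius + 1)):
                for qx in range(max(0, x0 - radius), min(w, x0 + radius + 1)):
                    # Guide pixel that corresponds to low-res sample (qy, qx).
                    gy = min(H - 1, qy * scale)
                    gx = min(W - 1, qx * scale)
                    w_s = np.exp(-((qy - yl) ** 2 + (qx - xl) ** 2)
                                 / (2 * sigma_s ** 2))
                    diff = float(guide_hi[y, x]) - float(guide_hi[gy, gx])
                    w_r = np.exp(-(diff ** 2) / (2 * sigma_r ** 2))
                    num += w_s * w_r * depth_lo[qy, qx]
                    den += w_s * w_r
            out[y, x] = num / den
    return out
```

Because the range weight is driven by the guide image, texture in the guide that has no counterpart in the scene geometry also shapes the output, which is the texture copying problem the abstract refers to.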

Augmented Reality Authoring Tool with Marker & Gesture Interactive Features (마커 및 제스처 상호작용이 가능한 증강현실 저작도구)

  • Shim, Jinwook;Kong, Minje;Kim, Hayoung;Chae, Seungho;Jeong, Kyungho;Seo, Jonghoon;Han, Tack-Don
    • Journal of Korea Multimedia Society / v.16 no.6 / pp.720-734 / 2013
  • In this paper, we propose an augmented reality authoring tool that lets users easily create augmented reality content using hand gesture and marker-based interaction methods. Previous augmented reality authoring tools focused on augmenting a virtual object, and to interact with such content the user had to rely on a marker or a sensor. We address this limited interaction by combining a marker-based interaction method with a gesture interaction method that uses a depth-sensing camera, Kinect. In the proposed system, users can easily develop simple marker-based augmented reality content through the interface. Beyond providing fragmentary content, the system also offers methods by which users can actively interact with the augmented reality content. This research provides two marker-based interaction methods: one uses two markers and the other uses marker occlusion. In addition, by recognizing and tracking the user's bare hand, the system provides a gesture interaction method that can zoom in, zoom out, move, and rotate an object. A heuristic evaluation of the authoring tool and a usability comparison of marker and gesture interaction confirmed positive results.