• Title/Summary/Keyword: keypoints

Search Result 69

Sign2Gloss2Text-based Sign Language Translation with Enhanced Spatial-temporal Information Centered on Sign Language Movement Keypoints (수어 동작 키포인트 중심의 시공간적 정보를 강화한 Sign2Gloss2Text 기반의 수어 번역)

  • Kim, Minchae;Kim, Jungeun;Kim, Ha Young
    • Journal of Korea Multimedia Society / v.25 no.10 / pp.1535-1545 / 2022
  • In sign language, the same gesture can carry a completely different meaning depending on the direction of the hand or a change in facial expression. It is therefore crucial to capture the spatial-temporal structure of each movement. However, sign language translation studies based on Sign2Gloss2Text convey only overall spatial-temporal information about the sign language movement as a whole. Consequently, the detailed information of each movement (facial expressions, gestures, etc.) that matters for translation is not emphasized. In this paper, we propose Spatial-temporal Keypoints Centered Sign2Gloss2Text Translation, named STKC-Sign2Gloss2Text, which supplements the sequential and semantic information of keypoints, the core of recognizing and translating sign language. STKC-Sign2Gloss2Text consists of two steps: Spatial Keypoints Embedding, which extracts 121 major keypoints from each image, and Temporal Keypoints Embedding, which uses a Bi-GRU to emphasize the sequential information of the extracted keypoints. The proposed model outperformed the Sign2Gloss2Text baseline on all Bilingual Evaluation Understudy (BLEU) scores on both the development (DEV) and test (TEST) sets. In particular, it achieved a TEST BLEU-4 of 23.19, an improvement of 1.87, which demonstrates the effectiveness of the proposed methodology.

Keypoints-Based 2D Virtual Try-on Network System

  • Pham, Duy Lai;Ngyuen, Nhat Tan;Chung, Sun-Tae
    • Journal of Korea Multimedia Society / v.23 no.2 / pp.186-203 / 2020
  • Image-based virtual try-on systems, which fit a target garment onto an image of a model person, are among the most promising solutions for virtual fitting and have therefore attracted considerable research effort. In many cases, however, current solutions fail to produce a natural-looking fitted image in which the target clothes are transferred onto the body area of a model person of any shape and pose while preserving clothing details such as texture, text, and logos without distortion or artifacts. In this paper, we propose an improved image-based virtual try-on network system based on keypoints, which we name KP-VTON. KP-VTON first detects keypoints in the target clothes and reliably predicts the corresponding keypoints in the clothes of the model person image by utilizing dense human pose estimation. Then, a TPS transformation computed with the keypoints as control points yields the warped target clothes image, matched to the body area where the target clothes are to be worn. Finally, a new try-on module adopting Attention U-Net handles the more detailed synthesis of the virtual fitted image. Extensive experiments on a well-known dataset show that the proposed KP-VTON performs better than state-of-the-art virtual try-on systems.

Performance Comparison and Analysis between Keypoints Extraction Algorithms using Drone Images (드론 영상을 이용한 특징점 추출 알고리즘 간의 성능 비교)

  • Lee, Chung Ho;Kim, Eui Myoung
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.40 no.2 / pp.79-89 / 2022
  • Images taken with drones are applied in fields that require rapid decision-making because they allow high-quality 3D spatial information to be constructed quickly for small regions. To construct spatial information from drone images, the relationship between images must be determined by extracting keypoints from adjacent drone images and performing image matching. In this study, three regions photographed with a drone were selected: a region where parking lots and a lake coexist, a downtown region with buildings, and a field region of natural terrain. On these regions, the performance of the AKAZE (Accelerated-KAZE), BRISK (Binary Robust Invariant Scalable Keypoints), KAZE, ORB (Oriented FAST and Rotated BRIEF), SIFT (Scale Invariant Feature Transform), and SURF (Speeded Up Robust Features) algorithms was analyzed. The keypoint extraction algorithms were compared in terms of the distribution of extracted keypoints, the distribution of matched points, processing time, and matching accuracy. In the region where parking lots and a lake coexist, the processing speed of the BRISK algorithm was fast, and the SURF algorithm showed excellent performance in the distribution of keypoints and matched points and in matching accuracy. In the downtown region with buildings, the processing speed of the AKAZE algorithm was fast, and the SURF algorithm again showed excellent performance in the distribution of keypoints and matched points and in matching accuracy. In the field region of natural terrain, the keypoints and matched points of the SURF algorithm were evenly distributed throughout the drone image, but the AKAZE algorithm showed the highest matching accuracy and processing speed.

Deep Learning-based Keypoint Filtering for Remote Sensing Image Registration (원격 탐사 영상 정합을 위한 딥러닝 기반 특징점 필터링)

  • Sung, Jun-Young;Lee, Woo-Ju;Oh, Seoung-Jun
    • Journal of Broadcast Engineering / v.26 no.1 / pp.26-38 / 2021
  • This paper proposes DLKF (Deep Learning Keypoint Filtering), a deep learning-based keypoint filtering method that speeds up image registration for remote sensing images. The complexity of conventional feature-based image registration arises in the feature matching step. To reduce this complexity, the proposed method keeps only those keypoints, among all keypoints found by the keypoint detector, that lie on artificial structures, so that feature matching is performed only with keypoints detected on the artificial structures in the image. To reduce the number of keypoints while preserving essential ones, we preserve keypoints adjacent to the boundaries of artificial structures, use reduced images, and crop overlapping image patches to eliminate the noise that the image segmentation method produces at patch boundaries. The proposed method improves the speed and accuracy of registration. To verify the performance of DLKF, its speed and accuracy were compared with those of conventional keypoint extraction methods using remote sensing images from the KOMPSAT-3 satellite. Relative to the commonly used SIFT-based registration method, the SURF-based registration method, which improves on the speed of SIFT, was 2.6 times faster while reducing the number of keypoints by about 18%, but its accuracy degraded from 3.42 to 5.43. With the proposed DLKF, the number of keypoints was reduced by about 82% and the speed improved by about 20.5 times, while the accuracy degraded to 4.51.
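The filtering idea in this abstract can be illustrated with a minimal Python sketch. It assumes a binary segmentation mask (1 for artificial structure, 0 otherwise) is already available and keeps only keypoints that fall on masked pixels. The function name `filter_keypoints`, the mask layout, and the rounding of coordinates are illustrative assumptions, not details taken from the paper, and the boundary-adjacency and patch-cropping steps are not modeled.

```python
def filter_keypoints(keypoints, structure_mask):
    """Keep only keypoints that land on pixels marked 1 in structure_mask.

    keypoints: list of (x, y) coordinates (hypothetical detector output).
    structure_mask: list of rows, each a list of 0/1 values (assumed to
    come from a segmentation model; row index is y, column index is x).
    """
    height = len(structure_mask)
    width = len(structure_mask[0]) if height else 0
    kept = []
    for x, y in keypoints:
        col, row = int(round(x)), int(round(y))
        # Discard keypoints outside the image or off the structure mask.
        if 0 <= row < height and 0 <= col < width and structure_mask[row][col]:
            kept.append((x, y))
    return kept
```

Fewer keypoints means fewer descriptor comparisons in the matching step, which is where the abstract locates the cost of feature-based registration.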

Detection of video editing points using facial keypoints (얼굴 특징점을 활용한 영상 편집점 탐지)

  • Joshep Na;Jinho Kim;Jonghyuk Park
    • Journal of Intelligence and Information Systems / v.29 no.4 / pp.15-30 / 2023
  • Recently, various services using artificial intelligence (AI) have been emerging in the media field as well. However, most video editing, which involves finding edit points and joining the video, is still carried out manually, requiring a lot of time and human resources. Therefore, this study proposes a methodology that detects the edit points of a video according to whether the person in the video is speaking, using Video Swin Transformer. To this end, the proposed structure first detects facial keypoints through face alignment. Through this process, the temporal and spatial changes of the face in the input video data are reflected. Then, the Video Swin Transformer-based model proposed in this study classifies the behavior of the person in the video. Specifically, the feature map generated by Video Swin Transformer from the video data is combined with the facial keypoints detected through face alignment, and utterance is classified through convolution layers. As a result, the performance of the proposed video edit point detection model using facial keypoints improved from 87.46% to 89.17% compared to the model without facial keypoints.

Finger-Knuckle-Print Verification Using Vector Similarity Matching of Keypoints (특징점간의 벡터 유사도 정합을 이용한 손가락 관절문 인증)

  • Kim, Min-Ki
    • Journal of Korea Multimedia Society / v.16 no.9 / pp.1057-1066 / 2013
  • Personal verification using the finger-knuckle-print (FKP) relies on the lines and creases in the finger-knuckle area, so the orientation information of the texture is an important feature. In this paper, we propose an effective FKP verification method that extracts keypoints using the SIFT algorithm and matches them by vector similarity. The vector is defined as the direction vector connecting a keypoint extracted from a query image to the corresponding keypoint extracted from a reference image. Since each direction vector is created from a single pair of local keypoints, it represents only a local feature by itself. However, comparing the similarity among the direction vectors of two images has the advantage of expanding the local feature into a global one. The experimental results show that the proposed method is superior to previous methods based on orientation codes.
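A minimal Python sketch of the direction-vector idea follows. It builds one direction vector per matched keypoint pair and scores their agreement with the mean pairwise cosine similarity. The scoring rule and the function names are assumptions made for illustration; the abstract does not specify how the vector similarities are aggregated.

```python
import math
from itertools import combinations


def direction_vectors(matches):
    """matches: list of ((xq, yq), (xr, yr)) query/reference keypoint pairs."""
    return [(xr - xq, yr - yq) for (xq, yq), (xr, yr) in matches]


def cosine(u, v):
    """Cosine similarity of two 2D vectors; 0.0 if either has zero length."""
    nu, nv = math.hypot(*u), math.hypot(*v)
    if nu == 0 or nv == 0:
        return 0.0
    return (u[0] * v[0] + u[1] * v[1]) / (nu * nv)


def mean_pairwise_similarity(matches):
    """Average cosine similarity over all pairs of direction vectors.

    Values near 1.0 mean the matched keypoints moved consistently, which is
    how purely local matches can be turned into a global agreement score
    (this aggregation is an assumption, not the paper's stated rule).
    """
    pairs = list(combinations(direction_vectors(matches), 2))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)
```

A genuine query/reference pair would be expected to yield direction vectors that point roughly the same way, whereas spurious matches scatter in direction and pull the score down.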

A Scheme for Matching Satellite Images Using SIFT (SIFT를 이용한 위성사진의 정합기법)

  • Kang, Suk-Chen;Whoang, In-Teck;Choi, Kwang-Nam
    • Journal of Internet Computing and Services / v.10 no.4 / pp.13-23 / 2009
  • In this paper, we propose an approach for localizing objects in satellite images. Our method exploits feature matching based on description vectors, applying the Scale Invariant Feature Transform (SIFT) to object localization. First, we find keypoints in the satellite images and in the objects, and generate description vectors for the keypoints. Next, we calculate the similarity between description vectors to obtain matched keypoints. Finally, we weight the pixels adjacent to the keypoints and determine the location of the matched object. Experiments on object localization using SIFT show good results on images at various scales and under affine transformations. The proposed method uses Google Earth satellite images.
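The matching step described here (comparing description vectors to obtain matched keypoints) can be sketched in Python as a nearest-neighbor search. This is an illustrative sketch only: Euclidean distance as the similarity measure, the `max_dist` threshold, and the function name are assumptions, and the final pixel-weighting step for localization is not modeled.

```python
import math


def match_descriptors(object_desc, scene_desc, max_dist=0.5):
    """Return (object_index, scene_index) pairs of matched descriptors.

    Each object descriptor is paired with its nearest scene descriptor
    (Euclidean distance) and kept only if that distance is at most
    max_dist. Both the metric and the threshold are assumed here.
    """
    matches = []
    for i, d in enumerate(object_desc):
        best_j, best_dist = None, float("inf")
        for j, e in enumerate(scene_desc):
            dist = math.dist(d, e)
            if dist < best_dist:
                best_j, best_dist = j, dist
        if best_j is not None and best_dist <= max_dist:
            matches.append((i, best_j))
    return matches
```

Real SIFT descriptors are 128-dimensional; the function works for any dimension as long as both descriptor lists use the same one.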

Multi-Object Tracking Based on Keypoints Using Homography in Mobile Environments (모바일 환경 Homography를 이용한 특징점 기반 다중 객체 추적)

  • Han, Woo ri;Kim, Young-Seop;Lee, Yong-Hwan
    • Journal of the Semiconductor & Display Technology / v.14 no.3 / pp.67-72 / 2015
  • This paper proposes a keypoint-based object tracking system that uses homography in mobile environments. The proposed system is based on markerless tracking and comprises four modules: recognition, tracking, detecting, and learning. The recognition module detects and identifies the object in the current frame that matches an entry in the database, using LSH over SURF features, and then generates standard object information. The tracking module tracks an object using the homography obtained by matching the learned object keypoints to the current object keypoints, and then updates the window containing the object to define the object's pose. When the system fails to track, the detecting module finds the object using the best available knowledge among the learned object information. The experimental results show that the proposed system is able to recognize and track objects while updating the object's pose on a mobile platform.
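The window update in the tracking module can be illustrated with a short Python sketch that maps window corners through a 3x3 homography. It assumes the homography `H` has already been estimated from the keypoint matches; the function names and the list-of-lists matrix layout are illustrative assumptions rather than details from the paper.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to image coordinates.
    return (xh / w, yh / w)


def update_window(H, corners):
    """Project the object window corners into the current frame."""
    return [apply_homography(H, c) for c in corners]
```

The projected corners outline the object in the current frame, so their shape reflects the object's pose relative to the learned view.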

Deep Local Multi-level Feature Aggregation Based High-speed Train Image Matching

  • Li, Jun;Li, Xiang;Wei, Yifei;Wang, Xiaojun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.5 / pp.1597-1610 / 2022
  • At present, the main method of high-speed train chassis detection uses computer vision technology to first extract keypoints from two related chassis images, then match these keypoints to find the pixel-level correspondence between the two images, and finally perform detection and other steps. The quality and accuracy of image matching are very important for subsequent defect detection. Traditional matching methods struggle to meet practical requirements for generalization across complex scene conditions such as weather, illumination, and seasonal changes. It is therefore of great significance to study high-speed train image matching based on deep learning. This paper establishes a high-speed train chassis image matching dataset that includes random perspective changes and optical distortion, in order to simulate the changes in the actual working environment of the high-speed rail system as closely as possible. This work designs a convolutional neural network that intensively extracts keypoints to alleviate the problems of current methods. With multi-level features, the network restores low-level details, thereby improving the localization accuracy of keypoints, and also generates robust keypoint descriptors. Detailed experiments show a large improvement of the proposed network over traditional methods.

Correction of Rotated Region in Medical Images Using SIFT Features (SIFT 특징을 이용한 의료 영상의 회전 영역 보정)

  • Kim, Ji-Hong;Jang, Ick-Hoon
    • Journal of Korea Multimedia Society / v.18 no.1 / pp.17-24 / 2015
  • In this paper, a novel scheme for correcting a rotated region in medical images using the SIFT (Scale Invariant Feature Transform) algorithm is presented. Using the feature extraction function of SIFT, the rotation angle of a rotated object in medical images is calculated as follows. First, keypoints of both the reference and the rotated medical images are extracted by SIFT. Second, matching is performed on the keypoints located in a predetermined ROI (Region Of Interest) in which objects are not cropped or added by rotating the image. Finally, the orientation angles of the matched keypoints are calculated, and the rotation angle of the rotated object is determined by averaging the differences between those angles. The simulation results show that the proposed scheme has excellent performance for correcting the rotated region in medical images.
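The final step, averaging the angle differences of matched keypoints, can be sketched in Python as follows. The sketch assumes the keypoint orientations are given in degrees and already paired by the matching step. It uses a circular mean rather than a plain arithmetic mean so that differences near the 0/360 wraparound do not cancel out; that choice, like the function name, is an assumption and not something the abstract states.

```python
import math


def estimate_rotation(ref_angles, rot_angles):
    """Estimate rotation (degrees, in [0, 360)) from matched orientations.

    ref_angles[i] and rot_angles[i] are the orientations of the i-th matched
    keypoint in the reference and rotated images (hypothetical inputs).
    """
    if not ref_angles or len(ref_angles) != len(rot_angles):
        raise ValueError("need equally sized, non-empty angle lists")
    diffs = [math.radians(r - a) for a, r in zip(ref_angles, rot_angles)]
    # Circular mean: average the unit vectors, then take the resulting angle.
    s = sum(math.sin(d) for d in diffs)
    c = sum(math.cos(d) for d in diffs)
    return math.degrees(math.atan2(s, c)) % 360.0
```

Rotating the image back by the estimated angle would then correct the rotated region.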