• Title/Summary/Keyword: Computer vision technology

Search results: 666 items (processing time: 0.024 s)

Three-stream network with context convolution module for human-object interaction detection

  • Siadari, Thomhert S.;Han, Mikyong;Yoon, Hyunjin
    • ETRI Journal / Vol. 42, No. 2 / pp. 230-238 / 2020
  • Human-object interaction (HOI) detection is a popular computer vision task that detects interactions between humans and objects. This task can be useful in many applications that require a deeper understanding of semantic scenes. Current HOI detection networks typically consist of a feature extractor followed by detection layers comprising small filters (e.g., 1 × 1 or 3 × 3). Although small filters can capture local spatial features with few parameters, their small receptive fields prevent them from capturing the larger context needed to recognize interactions between humans and distant objects. Hence, we propose a three-stream HOI detection network that employs a context convolution module (CCM) in each stream branch. The CCM captures larger contexts from input feature maps by combining large separable convolution layers with residual-based convolution layers, and it avoids increasing the number of parameters by using fewer large separable filters. We evaluate our HOI detection method on two benchmark datasets, V-COCO and HICO-DET, and demonstrate its state-of-the-art performance.
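
The parameter argument in this abstract can be illustrated with simple arithmetic. The sketch below is not the paper's CCM: it assumes "separable" means a spatially separable pair (a k × 1 convolution followed by a 1 × k convolution), and the kernel size 15 and the channel widths are hypothetical values chosen only to show how a large separable layer with fewer filters can stay below the weight count of a dense 3 × 3 layer.

```python
def conv_params(k_h, k_w, c_in, c_out):
    """Number of weights in a 2D convolution layer (bias terms ignored)."""
    return k_h * k_w * c_in * c_out


def separable_params(k, c_in, c_mid, c_out):
    """Weights of a k x 1 convolution followed by a 1 x k convolution."""
    return conv_params(k, 1, c_in, c_mid) + conv_params(1, k, c_mid, c_out)


# Hypothetical sizes, not taken from the paper.
c_in, c_out = 256, 256  # assumed input/output channel widths
c_mid = 64              # assumed narrower intermediate width ("fewer filters")
k_large = 15            # assumed large kernel size

small = conv_params(3, 3, c_in, c_out)                    # dense 3 x 3 layer
large_full = conv_params(k_large, k_large, c_in, c_out)   # dense 15 x 15 layer
large_sep = separable_params(k_large, c_in, c_mid, c_out) # separable 15 x 15
```

Under these assumed sizes the separable large-kernel pair needs fewer weights than the dense 3 × 3 layer, while the dense 15 × 15 layer needs far more than either.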

Repeatability Test for the Asymmetry Measurement of Human Appearance using General-purpose Depth Cameras

  • 장준수
    • 동의생리병리학회지 / Vol. 30, No. 3 / pp. 184-189 / 2016
  • Human appearance analysis is an important part of both Eastern and Western medicine, in fields such as Sasang constitutional medicine, rehabilitation medicine, and dental medicine. With the rapid growth of depth camera technology, 3D measurement has become popular in many applications, including the medical field. In this study, the possibility of using depth cameras for asymmetry analysis of human appearance is examined. We introduce a 3D measurement system that uses two Microsoft Kinect depth cameras and fully automated asymmetry analysis algorithms based on computer vision technology. We compare the proposed automated method with the manual method that is usually used in asymmetry analysis. As a measure of repeatability, the standard deviations of the asymmetry indices are examined over 10 repeated experiments. Experimental results show that the standard deviation of the automated method (1.00 mm for the face, 1.22 mm for the body) is lower than that of the manual method (2.06 mm for the face, 3.44 mm for the body) for the same 3D measurement. We conclude that the automated method using depth cameras can be successfully applied to practical asymmetry analysis and can contribute to reliable human appearance analysis.
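
The repeatability measure described in this abstract, the standard deviation of an asymmetry index over repeated measurements, could be computed roughly as follows. This is a minimal sketch: the abstract does not say whether the sample or population standard deviation was used (the sample form is assumed here), and the two series are made-up placeholder values, not the study's data.

```python
import statistics


def repeatability(indices_mm):
    """Sample standard deviation (mm) of repeated asymmetry-index measurements."""
    return statistics.stdev(indices_mm)


# Made-up placeholder values for 10 repeated measurements of one asymmetry index.
tight_series = [10.0, 11.0] * 5  # small spread between repetitions
loose_series = [8.0, 13.0] * 5   # larger spread between repetitions

tight_sd = repeatability(tight_series)
loose_sd = repeatability(loose_series)
more_repeatable = "tight" if tight_sd < loose_sd else "loose"
```

A lower value means the method returns more consistent indices for the same subject, which is the sense in which the abstract compares the automated and manual methods.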

Wood Classification of Japanese Fagaceae using Partial Sample Area and Convolutional Neural Networks

  • FATHURAHMAN, Taufik;GUNAWAN, P.H.;PRAKASA, Esa;SUGIYAMA, Junji
    • Journal of the Korean Wood Science and Technology / Vol. 49, No. 5 / pp. 491-503 / 2021
  • Wood identification is regularly performed by observing the wood anatomy, such as colour, texture, fibre direction, and other characteristics. The manual process, however, can be time consuming, especially when a large volume of identification work is required. Considering this condition, a convolutional neural network (CNN)-based program is applied to improve the image classification results. The research focuses on the accuracy and efficiency of the algorithm in dealing with the dataset limitations. To this end, a sample selection process is proposed that takes only a small portion of the existing image, which is still expected to represent the overall picture and thus maintain and improve the generalisation capabilities of the CNN method in the classification stages. The experiments yielded an average F1 score of up to 93.4% for medium sample area sizes (200 × 200 pixels) on each CNN architecture (VGG16, ResNet50, MobileNet, DenseNet121, and Xception based). The DenseNet121-based architecture was found to be the best at maintaining the generalisation of its model for each sample area size (100, 200, and 300 pixels). The experimental results showed that the proposed algorithm can be an accurate and reliable solution.

A Study on Visual Emotion Classification using Balanced Data Augmentation

  • 정치윤;김무섭
    • 한국멀티미디어학회논문지 / Vol. 24, No. 7 / pp. 880-889 / 2021
  • In everyday life, recognizing people's emotions from their frames is essential and is a popular research domain in the area of computer vision. Visual emotion data have a severe class imbalance in which most of the data are distributed in specific categories. Existing methods do not consider class imbalance and use accuracy as the performance metric, which is not suitable for evaluating performance on an imbalanced dataset. Therefore, we propose a method for recognizing visual emotion that uses balanced data augmentation to address the class imbalance. The proposed method generates a balanced dataset by adopting random over-sampling and image transformation methods. The proposed method also uses the focal loss as its loss function, which can mitigate the class imbalance by down-weighting well-classified samples. EfficientNet, a state-of-the-art method for image classification, is used to recognize visual emotion. We compare the performance of the proposed method with that of conventional methods on a public dataset. The experimental results show that the proposed method increases the F1 score by 40% compared with the method without data augmentation, mitigating class imbalance without loss of classification accuracy.
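
The two imbalance remedies named in this abstract can be sketched in a few lines. The focal loss below is the standard single-sample form, FL(p) = -(1 - p)^γ · log(p), where p is the predicted probability of the true class; the value γ = 2 is an assumption, since the abstract does not give the setting used. The `oversample` helper is a hypothetical illustration of random over-sampling and omits the image transformations the authors also apply.

```python
import math
import random
from collections import Counter


def cross_entropy(p_true):
    """Cross-entropy for one sample, given the probability of its true class."""
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0):
    """Focal loss: cross-entropy scaled by (1 - p)^gamma (gamma is assumed)."""
    return ((1.0 - p_true) ** gamma) * cross_entropy(p_true)


def oversample(samples, labels, seed=0):
    """Randomly repeat minority-class samples until every class is equally large."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend((sample, label) for sample in items + extra)
    return balanced


# Toy dataset with hypothetical file names and labels.
balanced = oversample(["a.jpg", "b.jpg", "c.jpg", "d.jpg"],
                      ["joy", "joy", "joy", "fear"])
class_counts = Counter(label for _, label in balanced)
```

With γ = 2, a well-classified sample (p = 0.9) keeps only about 1% of its cross-entropy loss, whereas a poorly classified one (p = 0.1) keeps 81%, which is the down-weighting effect the abstract refers to.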

Human Activity Recognition with LSTM Using the Egocentric Coordinate System Key Points

  • Wesonga, Sheilla;Park, Jang-Sik
    • 한국산업융합학회 논문집 / Vol. 24, No. 6_1 / pp. 693-698 / 2021
  • As technology advances, there is an increasing need for research in the different fields where this technology is applied. One of the most researched topics in computer vision is human activity recognition (HAR), which has been widely implemented in various fields, including healthcare, video surveillance, and education. In this paper, we therefore present a human activity recognition system that is invariant to scale and rotation and that employs Kinect depth sensors to obtain the human skeleton joints. In contrast to previous approaches that use joint angles, we propose that each limb has an angle with the X, Y, and Z axes, which we employ as feature vectors. The use of these angles makes our system scale invariant. We further calculate the body's relative direction in egocentric coordinates in order to provide rotation invariance. For the system parameters, we employ 8 limbs, each with its corresponding angles to the X, Y, and Z axes of the coordinate system, as feature vectors. The extracted features are finally trained and tested with a long short-term memory (LSTM) network, which gives an average accuracy of 98.3%.
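
The limb-angle features described in this abstract can be read as the angles between a limb vector and the three coordinate axes. The sketch below follows that reading and is not the authors' implementation: the function name and joint coordinates are hypothetical, and the egocentric rotation step is not shown.

```python
import math


def limb_axis_angles(joint_a, joint_b):
    """Angles (radians) between the limb vector a -> b and the X, Y, Z axes."""
    vector = [b - a for a, b in zip(joint_a, joint_b)]
    norm = math.sqrt(sum(c * c for c in vector))
    if norm == 0.0:
        raise ValueError("joints coincide; limb direction is undefined")
    # Clamp each direction cosine to [-1, 1] to guard against rounding error.
    return tuple(math.acos(max(-1.0, min(1.0, c / norm))) for c in vector)


# Hypothetical joint positions: the same limb direction at two body sizes.
short_limb = limb_axis_angles((0.0, 0.0, 0.0), (1.0, 1.0, 0.0))
long_limb = limb_axis_angles((0.0, 0.0, 0.0), (2.0, 2.0, 0.0))
```

Because each angle depends only on the direction of the limb vector, scaling the skeleton leaves the features unchanged, which matches the scale invariance claimed in the abstract. If each of the 8 limbs contributes three angles, a frame would yield 24 values.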

A Study on Image Labeling Technique for Deep-Learning-Based Multinational Tanks Detection Model

  • Kim, Taehoon;Lim, Dongkyun
    • International Journal of Internet, Broadcasting and Communication / Vol. 14, No. 4 / pp. 58-63 / 2022
  • Recently, improvements in processing power driven by the rapid development of computing technology have greatly advanced the field of artificial intelligence, and research on applying it in various domains is active. In the national defense field in particular, attention is being paid to intelligent recognition among machine learning techniques, and efforts are being made to develop object identification and monitoring systems using artificial intelligence. To this end, various image processing technologies and object identification algorithms are applied to create a model that can identify friendly and enemy weapon systems and personnel in real time. In this paper, we conducted image processing and object identification focused on tanks among various weapon systems. We first processed the tank images using a convolutional neural network (CNN), a deep learning technique. The feature map was examined, and the important characteristics of the tanks that are crucial for learning were derived. Then, using the YOLOv5 network, a CNN-based object detection network, we created a model trained by labeling the entire tank and a model trained by labeling only the turret of the tank, and compared the results. The model and labeling technique proposed in this paper can identify the type of tank more accurately and can contribute to the intelligent recognition systems to be developed in the future.

Capturing research trends in structural health monitoring using bibliometric analysis

  • Yeom, Jaesun;Jeong, Seunghoo;Woo, Han-Gyun;Sim, Sung-Han
    • Smart Structures and Systems / Vol. 29, No. 2 / pp. 361-374 / 2022
  • As civil infrastructure has continued to age worldwide, its structural integrity has been threatened by material deterioration and continual loading from the external environment. Structural health monitoring (SHM) has emerged as a cost-efficient method for ensuring structural safety and durability. As SHM research has gradually addressed an increasing number of structure-related problems, it has become difficult to understand the changing trends in research topics. Although previous review papers have analyzed research trends on specific SHM topics, these studies have faced challenges in providing (1) consistent insights regarding macroscopic SHM research trends, (2) empirical evidence for research topic changes across SHM fields, and (3) methodological validation of the insights. To overcome these challenges, this study proposes a framework tailored to capturing the trends of research topics in SHM through bibliometric and network analysis. The framework is applied to track SHM research topics over 15 years by identifying both quantitative and relational changes in the author keywords provided by representative SHM journals. The results of this study confirm that overall SHM research has become diversified and multidisciplinary. In particular, the rapidly growing research topics are closely related to applying machine learning and computer vision techniques to solve SHM-related issues. In addition, the research topic network indicates that damage detection and vibration control have been studied both steadily and actively in SHM research.

Jointly Learning of Heavy Rain Removal and Super-Resolution in Single Images

  • ;김문철
    • 한국방송∙미디어공학회: Conference Proceedings / 한국방송∙미디어공학회 2020 Fall Conference / pp. 113-117 / 2020
  • Images taken under various weather conditions such as rain, haze, and snow often show low visibility, which can dramatically decrease the accuracy of computer vision tasks such as object detection and segmentation. Moreover, previous image enhancement work usually downsamples the image to obtain consistent features but lacks a good upsampling algorithm to recover the original size. In this research, we therefore jointly perform rain streak removal on heavy-rain images and super-resolution using a deep network. We put forth a two-stage network: a multi-model network followed by a refinement network. The first stage, which uses a rain formula for the single image and two operation layers (addition and multiplication), removes rain streaks and noise to obtain a clean low-resolution image. The second stage uses a refinement network to recover damaged background information and to upsample, yielding a high-resolution image. Our method improves visual image quality and gains accuracy on a human action recognition task. Extensive experiments show that our network outperforms state-of-the-art (SoTA) methods.


DISTANCE MEASUREMENT IN THE AEC/FM INDUSTRY: AN OVERVIEW OF TECHNOLOGIES

  • Jasmine Hines;Abbas Rashidi;Ioannis Brilakis
    • International Conference Proceedings / The 5th International Conference on Construction Engineering and Project Management / pp. 616-623 / 2013
  • One of the oldest, most common engineering problems is measuring the dimensions of different objects and the distances between locations. In AEC/FM, related uses vary from large-scale applications such as measuring distances between cities to small-scale applications such as measuring the depth of a crack or the width of a welded joint. Within the last few years, advances in applying new technologies have prompted the development of new measuring devices such as ultrasound and laser-based measurers. Because of the wide variety in type, associated costs, and levels of accuracy, selecting an optimal measuring technology is challenging for construction engineers and facility managers. To tackle this issue, we present an overview of various measuring technologies adopted by experts in the area of AEC/FM. As the next step, to evaluate the performance of these technologies, we select one indoor and one outdoor case and measure several dimensions using six categories of technologies: tapes, total stations, laser measurers, ultrasound devices, laser scanners, and image-based technologies. We then evaluate the results according to metrics such as accuracy, ease of use, operation time, and associated costs; compare these results; and recommend optimal technologies for specific applications. The results also reveal that in most applications, computer vision-based technologies outperform traditional devices in terms of ease of use, associated costs, and accuracy.


Banner Control Automation System Using YOLO and OpenCV

  • 김덕원;이지훈
    • 반도체디스플레이기술학회지 / Vol. 22, No. 4 / pp. 48-52 / 2023
  • From the past to the present, banners have consistently been used as an effective means of advertising. In Korea, there are frequent situations in which hidden advertisements are installed. Such hidden advertisement materials may damage urban aesthetics and, moreover, incur unnecessary manpower consumption and waste of money. The proposed method classifies the detected banners into good banners and bad banners, based on whether they are installed in compliance with legal guidelines. In the process, YOLO and the Open Computer Vision library (OpenCV) are used to determine from various perspectives whether banners in CCTV images comply with the guidelines. YOLO is used to detect the banner area in CCTV images, and OpenCV is used to detect the color values in that area for color comparison. If a banner is detected in the video, the proposed method calculates the location of the banner and its distance from the designated bulletin board to determine whether it was installed within the designated location, and then checks whether the colors used in the banner comply with local government guidelines.
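
The decision step outlined in this abstract (a location check followed by a color check) might look like the sketch below. It is only an illustration under stated assumptions: the detector and the color extraction are not shown, and every name, threshold, and coordinate here is hypothetical rather than taken from the paper. Boxes are assumed to be pixel coordinates in `(x1, y1, x2, y2)` form, and colors are assumed to be mean BGR triples of the detected banner area.

```python
import math


def box_center(box):
    """Center point of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def within_designated_area(banner_box, board_box, max_dist_px):
    """True if the banner center lies within max_dist_px of the board center."""
    bx, by = box_center(banner_box)
    dx, dy = box_center(board_box)
    return math.hypot(bx - dx, by - dy) <= max_dist_px


def color_allowed(mean_bgr, allowed_bgr, tol=40.0):
    """True if the banner's mean color is within tol of any permitted color."""
    return any(math.dist(mean_bgr, color) <= tol for color in allowed_bgr)


def classify_banner(banner_box, mean_bgr, board_box, allowed_bgr, max_dist_px=50.0):
    """Label a banner 'good' only if both the location and color checks pass."""
    located = within_designated_area(banner_box, board_box, max_dist_px)
    return "good" if located and color_allowed(mean_bgr, allowed_bgr) else "bad"


# Hypothetical values for one CCTV frame.
board = (120, 100, 220, 160)
permitted = [(255, 255, 255)]
near_label = classify_banner((100, 100, 200, 150), (250, 250, 250), board, permitted)
far_label = classify_banner((900, 100, 1000, 150), (250, 250, 250), board, permitted)
```

Comparing box centers is the simplest possible distance rule; an overlap measure between the banner box and the designated area would be a reasonable alternative if the guidelines are defined by region rather than by distance.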
