• Title/Summary/Keyword: Computer Vision

Keypoint-based Fast CU Depth Decision for HEVC Intra Coding (HEVC 인트라 부호화를 위한 특징점 기반의 고속 CU Depth 결정)

  • Kim, Namuk;Lim, Sung-Chang;Ko, Hyunsuk;Jeon, Byeungwoo
    • Journal of the Institute of Electronics and Information Engineers, v.53 no.2, pp.89-96, 2016
  • High Efficiency Video Coding (MPEG-H HEVC/ITU-T H.265) is the newest video coding standard and uses a quadtree-structured coding unit (CU). The quadtree structure splits a CU adaptively, and the optimum CU depth can be determined by rate-distortion optimization. Deciding the CU depth this way makes HEVC encoding computationally very expensive. Blob detection, a well-known computer vision algorithm, detects keypoints in pictures, and the CU depth decision needs to consider the distribution of high-frequency energy; motivated by this, we propose to use these keypoints for fast CU depth decision. Experimental results show that 20% of the encoding time can be saved while BDBR increases by only 0.45% in the all-intra case.
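
The abstract does not state the actual decision rule, so the following is only a minimal sketch of how keypoints could drive a quadtree depth decision. It assumes keypoints are already available as (x, y) positions from some blob detector, and that a CU is split only while it still contains at least `min_keypoints` of them. The function name `cu_depths`, its parameters, and the splitting rule are hypothetical illustrations, not the authors' method.

```python
from typing import List, Tuple

Keypoint = Tuple[float, float]        # (x, y) position in pixels
Leaf = Tuple[int, int, int, int]      # (x0, y0, size, depth)


def cu_depths(keypoints: List[Keypoint], x0: int = 0, y0: int = 0,
              size: int = 64, depth: int = 0, max_depth: int = 3,
              min_keypoints: int = 1) -> List[Leaf]:
    """Return the leaf CUs of a hypothetical keypoint-driven quadtree.

    Assumed rule (not taken from the paper): a CU is split into four
    quadrants only if it contains at least `min_keypoints` keypoints
    and the maximum depth has not been reached.
    """
    # Keep only the keypoints that fall inside the current CU.
    inside = [(x, y) for x, y in keypoints
              if x0 <= x < x0 + size and y0 <= y < y0 + size]

    # Smooth region (too few keypoints) or smallest CU: stop splitting.
    if depth == max_depth or len(inside) < min_keypoints:
        return [(x0, y0, size, depth)]

    half = size // 2
    leaves: List[Leaf] = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += cu_depths(inside, x0 + dx, y0 + dy, half,
                                depth + 1, max_depth, min_keypoints)
    return leaves


if __name__ == "__main__":
    # One keypoint near the top-left corner of a 64x64 block.
    for leaf in cu_depths([(5.0, 5.0)]):
        print(leaf)
```

In an encoder, a rule of this kind would most plausibly be used to limit which depths the rate-distortion search evaluates rather than to fix the final partition outright; that usage is also an assumption here.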

Cracks Detection of Concrete Slab Surface using ART2 based Quantization (ART2 기반 양자화를 이용한 콘크리트 슬래브 표면의 균열 검출)

  • Kim, Kwang-Baek;Cho, Jae-Hyun
    • Journal of the Korea Institute of Information and Communication Engineering, v.12 no.10, pp.1897-1902, 2008
  • Detecting cracks on concrete slab surfaces with computer vision involves many difficulties. Target images are often distorted by lighting conditions and other external factors. Another difficulty is that there is no clear intensity difference between the crack and the surface, since the surface is often irregular. In this paper, we apply ART2-based quantization to classify target concrete slab surface images into several areas according to light intensity. From those quantized areas, we investigate the distribution of real cracks and noise. Then, we extract candidate crack areas after applying a noise removal process to areas that contain both cracks and noise. Finally, crack areas are recognized from these candidate areas by using the morphological features of cracks. In experiments with real-world images of concrete slab structures, our algorithm recognizes cracks more accurately than other algorithms, especially in relatively brighter areas of the concrete surface.
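
To make the quantization step more concrete, here is a minimal sketch of a simplified ART2-style clustering of pixel intensities. It is not the full ART2 network and is not taken from the paper: it assumes a single vigilance threshold on the absolute intensity difference and a running-mean update of the winning cluster center. The name `art2_quantize` and the default `vigilance` value are hypothetical.

```python
from typing import List, Sequence, Tuple


def art2_quantize(intensities: Sequence[float],
                  vigilance: float = 30.0) -> Tuple[List[int], List[float]]:
    """Cluster intensities with a simplified ART2-style rule (assumption).

    Each value joins the nearest existing cluster if it lies within
    `vigilance` of that cluster's center; otherwise it starts a new
    cluster. Returns one label per input value and the cluster centers.
    """
    centers: List[float] = []   # mean intensity of each cluster
    counts: List[int] = []      # number of members in each cluster
    labels: List[int] = []

    for raw in intensities:
        value = float(raw)
        if centers:
            # Winner = cluster whose center is closest to the input.
            winner = min(range(len(centers)),
                         key=lambda k: abs(value - centers[k]))
            if abs(value - centers[winner]) <= vigilance:
                # Vigilance test passed: move the center toward the input
                # (incremental mean update).
                counts[winner] += 1
                centers[winner] += (value - centers[winner]) / counts[winner]
                labels.append(winner)
                continue
        # No cluster is close enough: create a new one.
        centers.append(value)
        counts.append(1)
        labels.append(len(centers) - 1)

    return labels, centers


if __name__ == "__main__":
    labels, centers = art2_quantize([10, 12, 200, 205, 11])
    print(labels, centers)
```

For an image, the grayscale pixels would be flattened into one sequence before calling the function, and the resulting labels reshaped back to the image size to obtain the intensity areas. Note that the result of this kind of online clustering depends on the order in which values are presented.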

Relational matching for solving initial approximation (관계영상정합을 이용한 초기근사값 결정)

  • 조우석
    • Korean Journal of Remote Sensing, v.12 no.1, pp.43-59, 1996
  • The objective of this research is to investigate the potential of relational matching in one of the fundamental photogrammetric processes, namely the initial approximation problem. Automatic relative orientation procedures for aerial stereopairs are investigated. Existing methods suffer from approximations, geometric and radiometric distortions, occlusions, and breaklines, which motivates the investigation of relational matching as what appears to be a much more general solution. An elegant way of solving the initial approximation problem by using distinct (special) relationships from the relational description is suggested and tested. A cost function was implemented as the evaluation function. The detection of erroneous matches is incorporated as part of the proposed relational matching scheme. Experiments with real urban-area images, which contain large numbers of repetitive patterns, breaklines, and occluded areas, demonstrate the feasibility of implementing the proposed relational matching scheme. The investigation of relational matching in the image matching domain reveals its advantages and disadvantages relative to existing image matching methods and indicates future directions for developing and implementing relational matching in digital photogrammetry.

A review on deep learning-based structural health monitoring of civil infrastructures

  • Ye, X.W.;Jin, T.;Yun, C.B.
    • Smart Structures and Systems, v.24 no.5, pp.567-585, 2019
  • In the past two decades, structural health monitoring (SHM) systems have been widely installed on civil infrastructure to track structural health and detect structural damage or abnormality through long-term monitoring of environmental conditions, structural loadings, and responses. An SHM system uses many sensors to acquire a huge amount of monitoring data, which reflects the in-service condition of the target structure. To bridge the gap between SHM and structural maintenance and management (SMM), advanced data processing methods are needed to convert the original multi-source, heterogeneous field monitoring data into specific physical indicators that support effective decisions on inspection, maintenance, and management. Conventional approaches to data analysis face challenges from environmental noise, the volume of measurement data, the complexity of computation, etc., which severely constrain the pervasive application of SHM technology. In recent years, with the rapid progress of computing hardware and image acquisition equipment, deep learning-based data processing has offered a new channel for mining the massive data from an SHM system, moving towards autonomous, accurate, and robust processing of the monitoring data. Many researchers in the SHM community have explored applications of deep learning-based approaches to structural damage detection and structural condition assessment. This paper reviews deep learning-based SHM of civil infrastructure, including a brief history of the development of deep learning, the applications of deep learning-based data processing approaches in the SHM of many kinds of civil infrastructure, and the key challenges and future trends of deep learning-based SHM.

Passive 3D motion optical data in shaking table tests of a SRG-reinforced masonry wall

  • De Canio, Gerardo;de Felice, Gianmarco;De Santis, Stefano;Giocoli, Alessandro;Mongelli, Marialuisa;Paolacci, Fabrizio;Roselli, Ivan
    • Earthquakes and Structures, v.10 no.1, pp.53-71, 2016
  • Unconventional computer vision and image processing techniques offer significant advantages for shaking table testing, as they overcome the most typical problems of traditional sensors, such as encumbrance, limits on the number of devices, range restrictions, and the risk of damage to the instruments if the specimen fails. In this study, a 3D motion optical system was applied to analyze shake table tests carried out, up to failure, on a natural-scale masonry structure retrofitted with steel reinforced grout (SRG). The system uses wireless passive spherical retro-reflecting markers positioned at several points on the specimen, whose spatial displacements are recorded by near-infrared digital cameras. Analyses in the time domain allowed the deformations of the wall and the development of cracks to be monitored through a displacement data processing (DDP) procedure implemented ad hoc. Fundamental frequencies and modal shapes were calculated in the frequency domain through a methodology that integrates experimental/operational modal analysis (EMA/OMA) techniques with 3D finite element analysis (FEA). Meaningful information on the structural response (e.g., displacements, damage development, and dynamic properties) was obtained, usefully complementing the results from conventional measurements. Furthermore, the comparison between the 3D motion system and traditional instruments (i.e., displacement transducers and accelerometers) permitted a mutual validation of both the experimental data and the measurement methods.

Development a Meal Support System for the Visually Impaired Using YOLO Algorithm (YOLO알고리즘을 활용한 시각장애인용 식사보조 시스템 개발)

  • Lee, Gun-Ho;Moon, Mi-Kyeong
    • The Journal of the Korea institute of electronic communication sciences, v.16 no.5, pp.1001-1010, 2021
  • Sighted people are rarely aware of how much they depend on sight when eating. Because visually impaired people cannot tell what food is on the table, an assistant sitting next to them holds their spoon and explains the position of the food in a clockwise direction, front and rear, left and right, etc. In this paper, we describe the development of a meal assistance system that recognizes each food image and announces the name of the food by voice when a visually impaired person points a smartphone camera at the table. The system uses a YOLO model trained on images of food and tableware (a spoon) to extract the food on which the spoon is placed, recognizes what the food is, and announces it by voice. With this system, visually impaired people are expected to be able to eat without the help of a meal assistant, increasing their self-reliance and satisfaction.

Extraction of Workers and Heavy Equipment and Muliti-Object Tracking using Surveillance System in Construction Sites (건설 현장 CCTV 영상을 이용한 작업자와 중장비 추출 및 다중 객체 추적)

  • Cho, Young-Woon;Kang, Kyung-Su;Son, Bo-Sik;Ryu, Han-Guk
    • Journal of the Korea Institute of Building Construction, v.21 no.5, pp.397-408, 2021
  • The construction industry has the highest rate of occupational accidents and injuries and the most fatalities of any industry. The Korean government has installed surveillance camera systems at construction sites to reduce occupational accident rates. Construction safety managers monitor potential hazards at the sites through these surveillance systems; however, monitoring by human eyes has critical limitations. Monitoring a surveillance system for long periods causes severe physical fatigue and makes it difficult to catch all accidents in real time. Therefore, this study aims to build a deep learning-based safety monitoring system that obtains information on the recognition, location, and identification of workers and heavy equipment at construction sites by applying multiple object tracking with instance segmentation. To evaluate the system's performance, we used the Microsoft Common Objects in Context (COCO) metrics and the Multiple Object Tracking (MOT) Challenge metrics. The results show that the system is well suited to efficiently automating surveillance monitoring tasks at construction sites.

Recent Trends in Human Pose Estimation Based on a Single Image (단일 이미지에 기반을 둔 사람의 포즈 추정에 대한 연구 동향)

  • Cho, Jungchan
    • The Journal of Korean Institute of Next Generation Computing, v.15 no.5, pp.31-42, 2019
  • With the recent development of deep learning technology, remarkable achievements have been made in many research areas of computer vision. Deep learning has also brought dramatic improvements in two-dimensional and three-dimensional human pose estimation based on a single image, and many researchers have been expanding the scope of this problem. Human pose estimation is one of the most important research fields because it has many applications; in particular, it is a key factor in understanding the behavior, state, and intention of people in image or video analysis. Against this background, this paper surveys research trends in estimating human poses from a single image. Because there are many research results on robust and accurate human pose estimation, this paper introduces them in two separate subsections: 2D human pose estimation and 3D human pose estimation. Moreover, this paper summarizes well-known data sets used in this field and introduces various studies that use human poses to solve their own problems.

Crowd Behavior Detection using Convolutional Neural Network (컨볼루션 뉴럴 네트워크를 이용한 군중 행동 감지)

  • Ullah, Waseem;Ullah, Fath U Min;Baik, Sung Wook;Lee, Mi Young
    • The Journal of Korean Institute of Next Generation Computing, v.15 no.6, pp.7-14, 2019
  • The automatic monitoring and detection of crowd behavior in surveillance videos has received significant attention in the field of computer vision because of its many applications, such as security, safety, and the protection of assets. Crowd analysis is also a growing field in the research community. For these applications, it is necessary to detect and analyze crowd behavior. In this paper, we propose a deep learning-based method that detects abnormal activities captured by surveillance cameras installed in a smart city. A fine-tuned VGG-16 model is trained on a publicly available benchmark crowd dataset and is tested on real-time streaming. The CCTV camera captures the video stream; when abnormal activity is detected, an alert is generated and sent to the nearest police station so that immediate action can be taken before further loss. Our experiments show that the proposed method outperforms existing state-of-the-art techniques.

Deep Learning-based Real-Time Super-Resolution Architecture Design (경량화된 딥러닝 구조를 이용한 실시간 초고해상도 영상 생성 기술)

  • Ahn, Saehyun;Kang, Suk-Ju
    • Journal of Broadcast Engineering, v.26 no.2, pp.167-174, 2021
  • Deep learning is now widely used in computer vision applications such as object recognition, classification, and image generation. In particular, deep learning-based super-resolution has achieved significant performance improvements. The fast super-resolution convolutional neural network (FSRCNN) is a well-known deep learning-based super-resolution algorithm in which the output image is generated by a deconvolutional layer. In this paper, we propose an FPGA-based convolutional neural network accelerator designed for parallel computing efficiency. We also propose Optimal-FSRCNN, a modified version of the FSRCNN structure. It reduces the number of multipliers by a factor of 3.47 compared to FSRCNN while achieving a similar PSNR. The result is a real-time image processing technology implemented on an FPGA.
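
The abstract compares the two models by PSNR without defining it. As a reminder, the standard definition is PSNR = 10 · log10(MAX² / MSE), where MSE is the mean squared error between the reference and reconstructed images and MAX is the largest possible pixel value. The sketch below assumes 8-bit images (MAX = 255) passed as flat sequences of pixel values; the function name `psnr` and its parameters are illustrative and not taken from the paper.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], reconstructed: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are given as flat sequences of pixel values. The default
    `max_value` of 255 assumes 8-bit pixels.
    """
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("images must be non-empty and have the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2
              for a, b in zip(reference, reconstructed)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    print(psnr([100, 100], [101, 99]))
```

A higher PSNR means the reconstructed image is closer to the reference, so "similar PSNR" in the abstract indicates that the reduction in multipliers did not noticeably degrade reconstruction quality by this measure.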