• Title/Summary/Keyword: Object detection and tracking


Design of a Vision Chip for Edge Detection with an Elimination Function of Output Offset due to MOSFET Mismatch (MOSFET의 부정합에 의한 출력옵셋 제거기능을 가진 윤곽검출용 시각칩의 설계)

  • Park, Jong-Ho;Kim, Jung-Hwan;Lee, Min-Ho;Shin, Jang-Kyoo
    • Journal of Sensor Science and Technology / v.11 no.5 / pp.255-262 / 2002
  • The human retina detects the edges of objects effectively. We designed a CMOS vision chip by modeling the retinal cells involved in edge detection as hardware. Several sources of variation affect MOSFET characteristics during the CMOS fabrication process, and their effect appears as an output offset in the vision chip, which is composed of pixel arrays and readout circuits. Because a vision chip that detects edge information from an input image serves as the input stage of other systems, its output offset determines the performance of the entire system. To eliminate the offset at the output stage, we designed the vision chip using the CDS (Correlated Double Sampling) technique. Because it uses a standard CMOS process, the chip can be integrated with other circuits. With its reliable output characteristics, the chip can be used as the input stage in many applications, such as target tracking systems, fingerprint recognition systems, and human-friendly robot systems.
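
The abstract names CDS but does not describe the circuit, so the following is only a minimal numerical sketch of the idea, assuming each pixel adds a fixed offset that is identical in its reset sample and its signal sample. The function names and the offset model are hypothetical and are not taken from the paper.

```python
def correlated_double_sample(reset_level, signal_level):
    """Subtract the reset sample from the signal sample of the same pixel."""
    return signal_level - reset_level


def read_pixels(true_signals, offsets):
    """Simulate a readout in which every pixel carries its own fixed offset.

    Assumption: the offset (e.g. from transistor mismatch) is the same in
    both samples of a pixel, so it cancels in the difference.
    """
    outputs = []
    for signal, offset in zip(true_signals, offsets):
        reset_sample = offset             # sampled without signal: offset only
        signal_sample = signal + offset   # sampled with signal: signal + offset
        outputs.append(correlated_double_sample(reset_sample, signal_sample))
    return outputs


if __name__ == "__main__":
    # Three pixels with different offsets; the offsets drop out of the result.
    print(read_pixels([10, 20, 30], [3, -2, 5]))
```

Under this model the per-pixel offsets cancel exactly; a real chip would also have noise that differs between the two samples and does not cancel.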

Automatic Detection of Dissimilar Regions through Multiple Feature Analysis (다중의 특징 분석을 통한 비 유사 영역의 자동적인 검출)

  • Jang, Seok-Woo;Jung, Myunghee
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.2 / pp.160-166 / 2020
  • As mobile-based hardware technology develops, many kinds of applications are also being developed. In addition, there is an increasing demand to automatically check that the interfaces of these applications work correctly. In this paper, we describe a method for accurately detecting faulty images from applications by comparing major characteristics of input color images. For this purpose, our method first extracts major characteristics of the input image, then calculates the differences between the extracted features, and decides whether the test image is a normal image or a faulty image dissimilar to the reference image. Experimental results show that the suggested approach robustly distinguishes similar and dissimilar images by comparing major characteristics of input color images. The suggested method is expected to be useful in many real application areas related to computer vision, such as video indexing, object detection and tracking, and image surveillance.
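
The abstract does not say which features are compared, so the sketch below substitutes a coarse per-channel color histogram as a stand-in feature and a fixed threshold on the L1 distance as the decision rule. Both choices, and all function names, are assumptions for illustration only.

```python
def color_histogram(pixels, bins=4):
    """Coarse per-channel histogram of (r, g, b) pixels with values 0..255.

    Returns 3 * bins fractions; each channel's bins sum to 1.
    """
    hist = [0] * (bins * 3)
    for r, g, b in pixels:
        for channel, value in enumerate((r, g, b)):
            bin_index = min(value * bins // 256, bins - 1)
            hist[channel * bins + bin_index] += 1
    total = len(pixels)
    return [count / total for count in hist]


def feature_distance(features_a, features_b):
    """L1 distance between two feature vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(features_a, features_b))


def is_dissimilar(reference_pixels, test_pixels, threshold=0.5):
    """Flag the test image as faulty if its features differ too much.

    The threshold is a hypothetical value that would need tuning on real data.
    """
    distance = feature_distance(
        color_histogram(reference_pixels), color_histogram(test_pixels)
    )
    return distance > threshold


if __name__ == "__main__":
    reference = [(255, 0, 0)] * 4   # all-red reference image
    faulty = [(0, 0, 255)] * 4      # all-blue test image
    print(is_dissimilar(reference, reference), is_dissimilar(reference, faulty))
```

A real system would likely combine several features, as the title suggests; the single histogram here only shows the extract, compare, decide sequence.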

Offline In-Hand 3D Modeling System Using Automatic Hand Removal and Improved Registration Method (자동 손 제거와 개선된 정합방법을 이용한 오프라인 인 핸드 3D 모델링 시스템)

  • Kang, Junseok;Yang, Hyeonseok;Lim, Hwasup;Ahn, Sang Chul
    • Journal of the HCI Society of Korea / v.12 no.3 / pp.13-23 / 2017
  • In this paper, we propose a new in-hand 3D modeling system that improves user convenience. Because traditional modeling systems are inconvenient to use, in-hand modeling systems, in which the object is handled by hand, have been studied. However, these systems require additional equipment or specific constraints to remove the hands for good modeling. We propose a contact state change detection algorithm for automatic hand removal and an improved ICP algorithm that handles outliers and additionally uses color for accurate registration. The proposed algorithms enable accurate modeling without additional equipment or constraints. Through experiments using real data, we show that the proposed system achieves accurate modeling under general conditions without any constraints.

Background Subtraction Algorithm Based on Multiple Interval Pixel Sampling (다중 구간 샘플링에 기반한 배경제거 알고리즘)

  • Lee, Dongeun;Choi, Young Kyu
    • KIPS Transactions on Software and Data Engineering / v.2 no.1 / pp.27-34 / 2013
  • Background subtraction is one of the key techniques for automatic video content analysis, especially for visual detection and tracking of moving objects. In this paper, we present a new sample-based technique for background extraction that provides a background image as well as a background model. To handle both high-frequency and low-frequency events at the same time, multiple interval background models are adopted. The main innovation is the use of a confidence factor to select the best model from the multiple interval background models. To our knowledge, this is the first time that a confidence factor has been used to merge several background models in the field of background extraction. Experimental results show that our approach based on multiple interval sampling works well in complicated situations that contain objects moving at various speeds together with environmental changes.
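
As a rough single-pixel illustration of sampling at several intervals and picking one model by confidence, the sketch below keeps one sample buffer per interval. The abstract does not define the confidence factor, so the definition used here (the fraction of samples close to the buffer median) is an assumption, as are the class and function names.

```python
from collections import deque
from statistics import median


class IntervalModel:
    """Background samples for one pixel, taken every `interval` frames."""

    def __init__(self, interval, capacity=5):
        self.interval = interval
        self.samples = deque(maxlen=capacity)  # oldest samples drop out

    def update(self, frame_index, value):
        if frame_index % self.interval == 0:
            self.samples.append(value)

    def background(self):
        return median(self.samples)

    def confidence(self, tolerance=10):
        # Assumed definition: share of samples within `tolerance` of the median.
        bg = self.background()
        close = sum(1 for s in self.samples if abs(s - bg) <= tolerance)
        return close / len(self.samples)


def best_background(models):
    """Return the background estimate of the most confident non-empty model."""
    usable = [m for m in models if m.samples]
    if not usable:
        raise ValueError("no model has collected samples yet")
    return max(usable, key=lambda m: m.confidence()).background()


def is_foreground(value, background, threshold=25):
    return abs(value - background) > threshold


if __name__ == "__main__":
    models = [IntervalModel(1), IntervalModel(4)]
    for i in range(20):
        value = 100 if i % 2 == 0 else 180   # pixel flickers every other frame
        for m in models:
            m.update(i, value)
    bg = best_background(models)
    print(bg, is_foreground(180, bg), is_foreground(105, bg))
```

In this toy sequence the short-interval buffer mixes both values and gets a low confidence, while the long-interval buffer happens to sample only the stable value and is selected.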

A Collaborative Video Annotation and Browsing System using Linked Data (링크드 데이터를 이용한 협업적 비디오 어노테이션 및 브라우징 시스템)

  • Lee, Yeon-Ho;Oh, Kyeong-Jin;Sean, Vi-Sal;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.203-219 / 2011
  • In the past, typical users simply wanted to watch video content without any specific requirements or purposes. Today, however, users watching a video want to learn and discover more about the things that appear in it. As a result, the demand for finding multimedia or browsing information about objects of interest is growing along with the use of multimedia such as video, which is available not only on internet-capable devices such as computers but also on smart TVs and smartphones. To meet users' requirements, labor-intensive annotation of objects in video content is inevitable. For this reason, many researchers have actively studied methods for annotating the objects that appear in video. In keyword-based annotation, related information about an object that appears in the video content is added directly, and annotation data including all related information about the object must be managed individually. Users have to enter all information related to the object themselves. Consequently, when a user browses for information related to the object, the user can find only the limited resources that exist in the annotation data. In addition, placing annotations on objects requires a huge workload from the user. To reduce this workload and minimize the work involved in annotation, existing object-based annotation attempts automatic annotation using computer vision techniques such as object detection, recognition, and tracking. With such techniques, the wide variety of objects that appear in the video content must all be detected and recognized, but automated annotation still faces difficulties. To overcome these difficulties, we propose a system that consists of two modules.
The first module is the annotation module, which enables many annotators to collaboratively annotate the objects in the video content in order to access semantic data using Linked Data. Annotation data managed by the annotation server is represented using an ontology so that the information can easily be shared and extended. Because the annotation data does not include all the relevant information about an object, objects in Linked Data and objects that appear in the video content are simply connected to each other to obtain all the related information. In other words, annotation data containing only the URI and metadata such as position, time, and size is stored on the annotation server. When a user needs other information about the object, it is retrieved from Linked Data through the relevant URI. The second module enables viewers to browse interesting information about an object while watching the video, using the annotation data collaboratively generated by many users. With this system, a query is generated automatically through simple user interaction, all the related information is retrieved from Linked Data, and the additional information about the object is finally offered to the user. In the future Semantic Web environment, the proposed system is expected to establish a better video content service environment by offering users relevant information about the objects that appear on the screen of any internet-capable device such as a PC, smart TV, or smartphone.

A study on the design of an efficient hardware and software mixed-mode image processing system for detecting patient movement (환자움직임 감지를 위한 효율적인 하드웨어 및 소프트웨어 혼성 모드 영상처리시스템설계에 관한 연구)

  • Seungmin Jung;Euisung Jung;Myeonghwan Kim
    • Journal of Internet Computing and Services / v.25 no.1 / pp.29-37 / 2024
  • In this paper, we propose an efficient image processing system for detecting and tracking the movement of specific objects such as patients. The proposed system extracts the outline area of an object from a binarized difference image by applying a thinning algorithm that enables more precise detection than previous algorithms and is advantageous for mixed-mode design. The computationally intensive binarization and thinning steps are designed at RTL (Register Transfer Level) and replaced with optimized hardware blocks through logic circuit synthesis. The designed binarization and thinning block was synthesized into a logic circuit using a standard 180 nm CMOS library, and its operation was verified through simulation. For comparison with software-based performance, the binarization and thinning operations were also analyzed on sample images with 640 × 360 resolution in a 32-bit FPGA embedded system environment. The verification confirmed that the mixed-mode design can improve the processing speed of the binarization and thinning stages by 93.8% compared to the previous software-only processing. The proposed mixed-mode system for object recognition is expected to monitor patient movements efficiently even in an edge computing environment where artificial intelligence networks are not applied.
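
The software side of the first step, building a binarized difference image, can be sketched as follows. This is an illustrative version that assumes grayscale frames stored as nested lists and a hypothetical threshold of 30. It omits the thinning step and the RTL implementation described in the abstract, and the bounding-box helper is an added convenience, not part of the paper.

```python
def binarized_difference(prev_frame, curr_frame, threshold=30):
    """Return a 0/1 mask marking pixels whose intensity changed by more than threshold."""
    return [
        [1 if abs(curr - prev) > threshold else 0
         for prev, curr in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the changed pixels, or None if none changed."""
    coords = [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value
    ]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


if __name__ == "__main__":
    previous = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    current = [[0, 0, 0, 0], [0, 200, 200, 0], [0, 0, 0, 0]]
    mask = binarized_difference(previous, current)
    print(mask, bounding_box(mask))
```

Both loops touch every pixel independently, which is one reason this kind of step maps well onto dedicated hardware blocks.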

Scaling Attack Method for Misalignment Error of Camera-LiDAR Calibration Model (카메라-라이다 융합 모델의 오류 유발을 위한 스케일링 공격 방법)

  • Yi-ji Im;Dae-seon Choi
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.6 / pp.1099-1110 / 2023
  • Recognition systems for autonomous driving and robot navigation perform vision tasks such as object recognition, tracking, and lane detection after multi-sensor fusion to improve performance. Research on deep learning models based on the fusion of camera and LiDAR sensors is currently being actively conducted. However, deep learning models are vulnerable to adversarial attacks that manipulate the input data. Attacks on existing multi-sensor-based autonomous driving recognition systems focus on inducing obstacle detection by lowering the confidence score of the object recognition model. However, such attacks are limited in that they are possible only on the target model. Attacks on the sensor fusion stage, in contrast, can cascade errors into the vision tasks that follow fusion, and this risk needs to be considered. In addition, an attack on LiDAR point cloud data, which is difficult to judge visually, is hard to identify as an attack. In this study, we propose an image scaling-based attack method that reduces the accuracy of LCCNet, a camera-LiDAR fusion model (camera-LiDAR calibration model). The proposed method performs a scaling attack on the points of the input LiDAR data. In attack performance experiments across sizes with a scaling algorithm, the attack caused fusion errors at an average rate of more than 77%.

Evaluation of Video Codec AI-based Multiple tasks (인공지능 기반 멀티태스크를 위한 비디오 코덱의 성능평가 방법)

  • Kim, Shin;Lee, Yegi;Yoon, Kyoungro;Choo, Hyon-Gon;Lim, Hanshin;Seo, Jeongil
    • Journal of Broadcast Engineering / v.27 no.3 / pp.273-282 / 2022
  • MPEG-VCM (Video Coding for Machine) aims to standardize a video codec for machines. VCM provides data sets and anchors, which serve as reference data for comparison, for several machine vision tasks including object detection, object segmentation, and object tracking. The evaluation template can be used to compare compression and machine vision task performance between the anchor data and various proposed video codecs. However, performance is currently compared separately for each machine vision task, and no information is provided on the performance of multiple machine vision tasks on a single bitstream. In this paper, we propose a performance evaluation method of a video codec for AI-based multi-tasks. Based on bits per pixel (BPP), which measures the size of a single bitstream, and mean average precision (mAP), which measures the accuracy of each task, we define three criteria for multi-task performance evaluation, namely the arithmetic average, the weighted average, and the harmonic average, and calculate the multi-task performance results from the mAP values. In addition, because the dynamic range of mAP may differ greatly from task to task, the multi-task performance results are calculated and evaluated using normalized mAP to prevent problems caused by the differences in dynamic range.
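
The three averaging criteria named in the abstract can be written out as below. The abstract does not state how mAP is normalized, so dividing each task's mAP by its anchor mAP is an assumption made only for this sketch, and the task values and weights in the example are made up.

```python
def normalize(map_value, anchor_map):
    """Assumed normalization: express a task's mAP relative to its anchor mAP."""
    return map_value / anchor_map


def arithmetic_average(scores):
    return sum(scores) / len(scores)


def weighted_average(scores, weights):
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


def harmonic_average(scores):
    # Dominated by the weakest task; requires all scores to be positive.
    return len(scores) / sum(1.0 / s for s in scores)


if __name__ == "__main__":
    # Made-up (codec mAP, anchor mAP) pairs for two tasks on one bitstream.
    tasks = [(0.40, 0.80), (0.60, 0.60)]
    scores = [normalize(m, a) for m, a in tasks]
    print(arithmetic_average(scores))
    print(weighted_average(scores, [1, 3]))
    print(harmonic_average(scores))
```

The harmonic average penalizes a codec that does well on one task but poorly on another more strongly than the arithmetic average does, which is the usual reason to report both.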

A Bursty Traffics Friendly MAC Protocol in Wireless Sensor Networks (무선센서 네트워크에서 버스티 트래픽에 적합한 MAC 프로토콜)

  • Lee, Jin-young;Kim, Seong-cheol
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.5 / pp.772-778 / 2018
  • Due to recent advances in computing, communication, and micro-electromechanical technology, Wireless Sensor Network (WSN) applications have been extended from military use to many commercial areas such as object tracking, wire detection, and vehicular sensor networks. In some applications, many sensor nodes may generate bursty data, and the data generated in the monitoring area may need to be sent within a limited time to the final destination, the sink node. In this paper, we present BTF-MAC, a protocol suited to WSN applications in which bursty data packets must be transmitted within a limited time. BTF-MAC is a synchronous duty-cycle MAC protocol that uses a slot-reservation and operational period extension mechanism adapted to the traffic. Our numerical analysis and simulation results show that BTF-MAC outperforms related protocols such as DW-MAC and SR-MAC in terms of energy consumption and transmission delay.

Depthmap Generation with Registration of LIDAR and Color Images with Different Field-of-View (다른 화각을 가진 라이다와 칼라 영상 정보의 정합 및 깊이맵 생성)

  • Choi, Jaehoon;Lee, Deokwoo
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.6 / pp.28-34 / 2020
  • This paper proposes an approach to the fusion of two heterogeneous sensors with different fields-of-view (FOV): LIDAR and an RGB camera. Registration between the data captured by the LIDAR and the RGB camera provided the fusion results. Registration was completed once a depthmap corresponding to a 2-dimensional RGB image was generated. For this fusion, an RPLIDAR-A3 (manufactured by Slamtec) and a general digital camera were used to acquire depth and image data, respectively. The LIDAR sensor provided distance information between the sensor and nearby objects in the scene, and the RGB camera provided a 2-dimensional image with color information. Fusion of the 2D image and depth information enabled better performance in object detection and tracking applications. For instance, automatic driver assistance systems, robotics, or other systems that require visual information processing might find the work in this paper useful. Because the LIDAR provides only depth values, processing and generating a depthmap that corresponds to an RGB image is recommended. To validate the proposed approach, experimental results are provided.
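
One common way to build a depthmap aligned with an image is to project range points through a pinhole camera model. The sketch below shows that generic technique and is not the registration procedure of the paper. It assumes the points are already expressed in the camera coordinate frame (x right, y down, z forward) and uses hypothetical intrinsics `fx`, `fy`, `cx`, `cy`.

```python
def project_to_depthmap(points, fx, fy, cx, cy, width, height):
    """Project (x, y, z) camera-frame points into a height x width depth grid.

    Pixels that receive no point stay None. When several points fall on the
    same pixel, the nearest depth is kept.
    """
    depth = [[None] * width for _ in range(height)]
    for x, y, z in points:
        if z <= 0:
            continue  # behind the camera, cannot be seen
        u = int(round(fx * x / z + cx))   # image column
        v = int(round(fy * y / z + cy))   # image row
        if 0 <= u < width and 0 <= v < height:
            if depth[v][u] is None or z < depth[v][u]:
                depth[v][u] = z
    return depth


if __name__ == "__main__":
    points = [
        (0.0, 0.0, 2.0),    # lands on the image center
        (1.0, 0.0, 2.0),    # one pixel to the right of center
        (0.0, 0.0, 1.0),    # closer point on the center pixel, replaces 2.0
        (0.0, 0.0, -1.0),   # behind the camera, skipped
        (10.0, 0.0, 1.0),   # outside the field of view, skipped
    ]
    for row in project_to_depthmap(points, 2.0, 2.0, 2, 1, 5, 3):
        print(row)
```

Because the two sensors have different fields-of-view, many image pixels receive no depth value at all; the `None` entries make that gap explicit instead of filling it with a guess.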