Visual Detection


Improving visual relationship detection using linguistic and spatial cues

  • Jung, Jaewon; Park, Jongyoul
    • ETRI Journal / v.42 no.3 / pp.399-410 / 2020
  • Detecting visual relationships in an image is an important image-understanding task: it enables higher-level tasks such as predicting the next scene and understanding what occurs in an image. A visual relationship comprises a subject, a predicate, and an object, and draws on visual, language, and spatial cues. The predicate describes the relationship between the subject and object and falls into categories such as prepositions and verbs; a large visual gap can exist even among relationships that share the same predicate. This study improves on a previous approach, which used language cues through two losses and a spatial cue containing only per-object information, by adding relative information about the subject and object. An architectural limitation is demonstrated and overcome so that all zero-shot visual relationships can be detected, and a newly discovered problem that degrades performance is explained. Experiments on the VRD and VG datasets show a significant improvement over previous results. A sketch of cue fusion follows below.
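The paper's exact architecture is not reproduced here; the following is a minimal sketch of how per-predicate scores from visual, language, and spatial cues might be fused, with all shapes, weights, and feature definitions being illustrative assumptions.

```python
import numpy as np

def fuse_predicate_scores(visual_logits, language_prior, spatial_logits,
                          weights=(1.0, 0.5, 0.5)):
    """Combine per-predicate scores from three cues (shape: [num_predicates]).

    visual_logits:  scores from an appearance model of the union box
    language_prior: log-prior of P(predicate | subject class, object class)
    spatial_logits: scores from relative-geometry features
    """
    wv, wl, ws = weights
    fused = wv * visual_logits + wl * language_prior + ws * spatial_logits
    e = np.exp(fused - fused.max())  # softmax over predicates
    return e / e.sum()

# Hypothetical relative spatial feature for a (subject, object) box pair,
# in the spirit of the "relative information" the abstract describes.
def relative_spatial_feature(sub_box, obj_box):
    sx, sy, sw, sh = sub_box  # (x, y, w, h)
    ox, oy, ow, oh = obj_box
    return np.array([(ox - sx) / sw, (oy - sy) / sh,
                     np.log(ow / sw), np.log(oh / sh)])
```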

Real-Time Comprehensive Assistance for Visually Impaired Navigation

  • Amal Al-Shahrani; Amjad Alghamdi; Areej Alqurashi; Raghad Alzahrani; Nuha Imam
    • International Journal of Computer Science & Network Security / v.24 no.5 / pp.1-10 / 2024
  • Individuals with visual impairments face numerous challenges in their daily lives, with navigating streets and public spaces being particularly daunting. The inability to identify safe crossing locations and assess the feasibility of crossing significantly restricts their mobility and independence. Globally, an estimated 285 million people suffer from visual impairment, 39 million of whom are categorized as blind and 246 million as visually impaired, according to the World Health Organization. In Saudi Arabia alone, there are approximately 159,000 blind individuals, according to unofficial statistics. The profound impact of visual impairment on daily activities underscores the urgent need for solutions that improve mobility and enhance safety. This study addresses this issue by leveraging computer vision and deep learning techniques to enhance object detection. Two models were trained: one focused on street-crossing obstacles and the other on general object search. The first was trained on a dataset of 5283 images of road obstacles and traffic signals, annotated to create a labeled dataset, using both YOLOv8 and YOLOv5, with YOLOv5 achieving a satisfactory accuracy of 84%. The second was trained on the COCO dataset using YOLOv5, yielding an accuracy of 94%. By improving object detection through advanced technology, this research seeks to empower individuals with visual impairments, enhancing their mobility, independence, and overall quality of life. A training sketch in this style follows below.
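A minimal sketch of the train-then-detect flow described above, using the Ultralytics API. The dataset YAML, image file name, and hyperparameters are hypothetical; the paper's exact training configuration is not reproduced.

```python
from ultralytics import YOLO

# Fine-tune a pretrained detector on a custom labeled dataset
# (e.g., road obstacles and traffic signals). "road_obstacles.yaml"
# is a hypothetical dataset description file.
model = YOLO("yolov8n.pt")
model.train(data="road_obstacles.yaml", epochs=100, imgsz=640)

# Run inference on a street image and inspect detections.
results = model("street_scene.jpg")
for box in results[0].boxes:
    print(int(box.cls), float(box.conf), box.xyxy.tolist())
```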

Accurate PCB Outline Extraction and Corner Detection for High Precision Machine Vision

  • Ko, Dong-Min; Choi, Kang-Sun
    • Journal of the Semiconductor & Display Technology / v.16 no.3 / pp.53-58 / 2017
  • Recent advances in technology have increased the importance of visual inspection in semiconductor inspection areas. In PCB visual inspection, accurate line estimation is critical to the accuracy of the entire process, since it is utilized in preprocessing steps such as calibration and alignment. We propose a line estimation method that weights line candidates differently using a histogram of gradient information, given approximate initial corner positions. Using the obtained line equations of the outline, corner points can be calculated accurately. The proposed method is compared with an existing method in terms of the accuracy of the detected corner points, and it detects corner points accurately even when the existing method fails. For high-resolution frames of 3.5 megapixels, the proposed method runs in 89.01 ms. A sketch of the corner computation follows below.
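A minimal sketch of the final corner step: once two outline lines have been estimated, the corner is their intersection. Here each line is assumed to be given in the form a*x + b*y = c; the example coefficients are illustrative.

```python
import numpy as np

def line_intersection(l1, l2):
    """Intersect two lines given as (a, b, c) with a*x + b*y = c.

    Returns the corner point (x, y), or None if the lines are parallel.
    """
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    A = np.array([[a1, b1], [a2, b2]], dtype=float)
    if abs(np.linalg.det(A)) < 1e-9:
        return None  # parallel or degenerate lines
    return np.linalg.solve(A, np.array([c1, c2], dtype=float))

# Example: a horizontal top edge and a vertical left edge meet at a corner.
top = (0.0, 1.0, 12.0)    # y = 12
left = (1.0, 0.0, 34.0)   # x = 34
print(line_intersection(top, left))  # -> [34. 12.]
```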


Visual Sensing of Fires Using Color and Dynamic Features

  • Do, Yong-Tae
    • Journal of Sensor Science and Technology / v.21 no.3 / pp.211-216 / 2012
  • Fires are among the most common disasters, and early fire detection is of great importance for minimizing the consequent damage. Simple sensors, including smoke detectors, are widely used for this purpose, but they can sense fires only at close proximity. Recently, owing to rapid advances in the relevant technologies, vision-based fire sensing has attracted growing attention. In this paper, a novel visual sensing technique for automatic fire detection is presented. The proposed technique consists of multiple image-processing steps at the pixel, block, and frame levels. First, flame pixel candidates are selected based on their color values in YIQ space from the image of a camera installed as a vision sensor at a fire scene. Second, the dynamic parts of the flames are extracted by comparing two consecutive images; these parts are then represented in regularly divided image blocks to reduce pixel-level detection error and simplify subsequent processing. Finally, the temporal change of the detected blocks is analyzed to confirm the spread of fire. The technique was tested on real fire images and worked quite reliably. A sketch of the pixel- and block-level stages follows below.
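A minimal sketch of the pixel- and block-level stages described above. The YIQ conversion is the standard NTSC transform; the color and motion thresholds and the block size are illustrative assumptions, not the paper's values.

```python
import numpy as np

def rgb_to_yiq(img):
    """Convert an HxWx3 float RGB image (values 0..1) to YIQ (NTSC transform)."""
    m = np.array([[0.299,  0.587,  0.114],
                  [0.596, -0.274, -0.322],
                  [0.211, -0.523,  0.312]])
    return img @ m.T

def flame_candidate_blocks(frame, prev_frame, y_min=0.5, i_min=0.05, block=16):
    yiq = rgb_to_yiq(frame)
    # Pixel level: bright, reddish pixels in YIQ space (thresholds assumed).
    color_mask = (yiq[..., 0] > y_min) & (yiq[..., 1] > i_min)
    # Dynamic parts: difference between two consecutive frames.
    motion_mask = np.abs(frame - prev_frame).sum(axis=-1) > 0.1
    mask = color_mask & motion_mask
    # Block level: mark blocks whose candidate-pixel ratio is high enough.
    h, w = mask.shape
    trimmed = mask[:h - h % block, :w - w % block]
    tiles = trimmed.reshape(h // block, block, w // block, block)
    return tiles.mean(axis=(1, 3)) > 0.3  # per-block fire decision
```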

Robust Position Tracking for Position-Based Visual Servoing and Its Application to Dual-Arm Task

  • Kim, Chan-O; Choi, Sung; Cheong, Joo-No; Yang, Gwang-Woong; Kim, Hong-Seo
    • The Journal of Korea Robotics Society / v.2 no.2 / pp.129-136 / 2007
  • This paper introduces a position-based robust visual servoing method developed for the operation of a human-like robot with two arms. The proposed method uses the SIFT algorithm for object detection and the CAMSHIFT algorithm for object tracking. While conventional CAMSHIFT has mainly been used for object tracking in a 2D image plane, we extend its use to object tracking in 3D space by combining the CAMSHIFT results from the two image planes of a stereo camera. This approach yields robust and dependable results. Once the robot's task is defined from the extracted 3D information, the robot is commanded to carry out the task. We conduct several position-based visual servoing tasks and compare performance under different conditions. The results show that the proposed visual tracking algorithm is simple but very effective for position-based visual servoing. A sketch of the stereo CAMSHIFT idea follows below.
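A minimal sketch of running CAMSHIFT independently on the two views of a stereo pair and recovering depth from the horizontal disparity of the tracked windows. The calibration values (focal length, baseline) and histogram setup are illustrative assumptions.

```python
import cv2
import numpy as np

criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1.0)

def make_hue_hist(roi_bgr):
    """Hue histogram of the initial object ROI, used for back-projection."""
    hsv = cv2.cvtColor(roi_bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0], None, [16], [0, 180])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    return hist

def camshift_step(frame_bgr, window, hist):
    """One CAMSHIFT update on a single view; returns the new search window."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    backproj = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    _, window = cv2.CamShift(backproj, window, criteria)
    return window  # (x, y, w, h)

def depth_from_stereo(win_left, win_right, f=700.0, baseline=0.12):
    """Estimate depth (meters) from the disparity of the two tracked windows."""
    cx_l = win_left[0] + win_left[2] / 2.0
    cx_r = win_right[0] + win_right[2] / 2.0
    disparity = cx_l - cx_r
    return f * baseline / disparity if disparity > 0 else None
```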


Lip Reading Method Using CNN for Utterance Period Detection

  • Kim, Yong-Ki; Lim, Jong Gwan; Kim, Mi-Hye
    • Journal of Digital Convergence / v.14 no.8 / pp.233-243 / 2016
  • Because speech recognition degrades in noisy environments, Audio-Visual Speech Recognition (AVSR) systems, which combine acoustic and visual information, have been studied since the mid-1990s, and lip reading plays a significant role in them. This study aims to improve word recognition rates using only lip-shape information for an efficient AVSR system. After preprocessing for lip region detection, Convolutional Neural Network (CNN) techniques are applied for utterance period detection and lip-shape feature vector extraction, and Hidden Markov Models (HMMs) are then used for recognition. The utterance period detection achieves a 91% success rate, outperforming conventional threshold methods. In lip reading recognition, the user-dependent experiment records 88.5% and the user-independent experiment 80.2%, both improvements over previous studies. A sketch of a small CNN feature extractor follows below.
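The paper's exact CNN is not specified here; this is a minimal PyTorch sketch of a small network that maps a grayscale lip-region crop to a feature vector (which could feed an HMM) plus an utterance/non-utterance score. The architecture and input size are hypothetical.

```python
import torch
import torch.nn as nn

class LipCNN(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(32 * 8 * 8, feat_dim), nn.ReLU(),
        )
        self.utterance_head = nn.Linear(feat_dim, 2)  # speaking vs. silent

    def forward(self, x):          # x: [batch, 1, 32, 32] lip crops
        feat = self.features(x)    # lip-shape feature vector for the HMM
        return feat, self.utterance_head(feat)

model = LipCNN()
feat, logits = model(torch.randn(4, 1, 32, 32))
print(feat.shape, logits.shape)  # torch.Size([4, 64]) torch.Size([4, 2])
```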

Detection of Visual Attended Regions in Road Images for Assisting Safety Driving

  • Kim, Jong-Bae
    • Journal of the Institute of Electronics Engineers of Korea SC / v.49 no.1 / pp.94-102 / 2012
  • As society ages, the number of elderly drivers is increasing. Traffic accidents involving elderly drivers are often caused by driver inattention, such as poor vehicle control due to aging, visual information retrieval problems caused by presbyopia, and object identification problems caused by low contrast sensitivity. In this paper, a method for detecting visually attended regions of interest (ROIs) in road images is proposed. The method creates a saliency map to detect candidate ROIs in the input image, then segments the input image to obtain ROI boundaries. Finally, selective visual attention regions are detected according to whether a segmented region contains salient pixels. In experiments under a variety of outdoor conditions, the proposed method achieved fast object detection and a high detection rate. A saliency-based sketch follows below.
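A minimal sketch of the saliency-then-segmentation idea above. OpenCV's spectral-residual saliency (from opencv-contrib-python) is used as a stand-in for the paper's saliency model; the connected-components grouping stands in for its segmentation step, and all thresholds and file names are assumptions.

```python
import cv2
import numpy as np

img = cv2.imread("road.jpg")  # hypothetical road image

# 1. Saliency map for candidate attention regions.
saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
ok, sal_map = saliency.computeSaliency(img)
sal_mask = ((sal_map * 255).astype(np.uint8) > 96).astype(np.uint8)

# 2. Group salient pixels into regions and keep sufficiently large ones.
n, labels, stats, _ = cv2.connectedComponentsWithStats(sal_mask, connectivity=8)
for region in range(1, n):  # label 0 is the background
    x, y, w, h, area = stats[region]
    if area > 200:  # ignore tiny specks (threshold assumed)
        print(f"attended region at ({x},{y}), size {w}x{h}")
```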

On-Road Succeeding Vehicle Detection using Characteristic Visual Features

  • Adhikari, Shyam Prasad; Cho, Hi-Tek; Yoo, Hyeon-Joong; Yang, Chang-Ju; Kim, Hyong-Suk
    • The Transactions of The Korean Institute of Electrical Engineers / v.59 no.3 / pp.636-644 / 2010
  • A method for detecting on-road succeeding vehicles using characteristic visual features such as horizontal edges, shadow, symmetry, and intensity is proposed. The method uses prominent horizontal edges together with the shadow under the vehicle to generate an initial estimate of the vehicle-road surface contact. Fast symmetry detection, utilizing the edge pixels, is then performed to detect the presence of a vertically symmetric object, possibly a vehicle, in the region above the initially estimated contact line. A window defined by the horizontal and vertical lines obtained above, together with local perspective information, provides a narrow region for the final vehicle search. A bounding box around the vehicle is extracted from the horizontal edges, the symmetry histogram, and a proposed squared-difference-of-intensity measure. Experiments on natural traffic scenes, captured by a camera mounted on the side-view mirror of a host vehicle, demonstrate good and reliable performance of the proposed method. A sketch of the symmetry search follows below.
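A minimal sketch of the fast vertical-symmetry search described above: edge pixels on each row vote for the x-position of a candidate symmetry axis via the midpoints of same-row pairs. The edge extraction parameters and input file are illustrative.

```python
import cv2
import numpy as np

def symmetry_histogram(edge_img):
    """Vote midpoints of same-row edge-pixel pairs into an x-axis histogram."""
    h, w = edge_img.shape
    hist = np.zeros(w, dtype=np.int64)
    for row in edge_img:
        xs = np.flatnonzero(row)
        if len(xs) < 2:
            continue
        # Midpoint of every distinct pair of edge pixels in this row.
        mids = ((xs[:, None] + xs[None, :]) // 2)[np.triu_indices(len(xs), 1)]
        np.add.at(hist, mids, 1)
    return hist

edges = cv2.Canny(cv2.imread("rear_view.jpg", cv2.IMREAD_GRAYSCALE), 100, 200)
hist = symmetry_histogram(edges)
print("candidate symmetry axis at x =", int(hist.argmax()))
```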

Motion Estimation-based Human Fall Detection for Visual Surveillance

  • Kim, Heegwang; Park, Jinho; Park, Hasil; Paik, Joonki
    • IEIE Transactions on Smart Processing and Computing / v.5 no.5 / pp.327-330 / 2016
  • The world's elderly population continues to grow at a dramatic rate, and as the number of senior citizens increases, detecting falls has attracted increasing attention in visual surveillance systems. This paper presents a novel fall-detection algorithm using motion estimation and an integrated spatiotemporal energy map of the object region. The proposed method first extracts a human region using background subtraction. Next, an optical flow algorithm estimates motion vectors, and an energy map is generated by accumulating the detected human region over a certain period of time. A fall is then detected using k-nearest neighbor (kNN) classification with the estimated motion information and energy map. Experimental results show that the proposed algorithm can effectively detect falls in any direction, including along the camera's optical axis. A pipeline sketch follows below.
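A minimal sketch of the pipeline above: background subtraction, dense optical flow, a decaying accumulated energy map, and a kNN classifier. The per-frame feature construction, decay factor, and file names are illustrative assumptions, not the paper's design.

```python
import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

backsub = cv2.createBackgroundSubtractorMOG2()
cap = cv2.VideoCapture("surveillance.mp4")  # hypothetical input clip
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
energy = np.zeros(prev_gray.shape, dtype=np.float32)

features = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    fg = backsub.apply(frame)  # foreground (human region) mask
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    energy = 0.95 * energy + (fg > 0)  # decaying spatiotemporal energy map
    # Per-frame feature: mean vertical motion + energy spread (assumed).
    vy = flow[..., 1][fg > 0].mean() if (fg > 0).any() else 0.0
    features.append([vy, float(energy.std())])
    prev_gray = gray

# A kNN trained on labeled clips would then separate falls from normal
# motion, e.g.:
# knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
# knn.predict(features)
```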

IR and SAR Sensor Fusion based Target Detection using BMVT-M

  • Lim, Yunji; Kim, Taehun; Kim, Sungho; Song, WooJin; Kim, Kyung-Tae; Kim, Sohyeon
    • Journal of Institute of Control, Robotics and Systems / v.21 no.11 / pp.1017-1026 / 2015
  • Infrared (IR) target detection is one of the key technologies in Automatic Target Detection/Recognition (ATD/R) for military applications. However, IR sensors are limited by weather sensitivity and atmospheric effects, and in recent years sensor information fusion has been an active research topic for overcoming these limitations; a SAR sensor is adopted for fusion because SAR is robust to various weather conditions. In this paper, a Boolean Map Visual Theory-Morphology (BMVT-M) method is proposed to detect targets in SAR and IR images. Moreover, we present an IR-SAR image registration and decision-level fusion algorithm. Experimental results using OKTAL-SE synthetic images validate the feasibility of sensor fusion-based target detection. A Boolean-map sketch follows below.
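The BMVT-M internals are not reproduced here; the following is a minimal sketch of the Boolean-map idea such methods build on: threshold a feature channel at several levels, clean each binary map with morphology (a stand-in for the "-M" step), and average the maps into an attention map. The threshold count, kernel size, and file names are assumptions.

```python
import cv2
import numpy as np

def boolean_map_attention(channel, n_thresholds=8, kernel_size=5):
    """channel: single-channel uint8 image (e.g., IR intensity)."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
                                       (kernel_size, kernel_size))
    maps = []
    levels = np.linspace(channel.min(), channel.max(), n_thresholds + 2)[1:-1]
    for t in levels:
        bmap = (channel > t).astype(np.uint8)
        # Morphological opening removes small clutter from each Boolean map.
        bmap = cv2.morphologyEx(bmap, cv2.MORPH_OPEN, kernel)
        maps.append(bmap.astype(np.float32))
    return np.mean(maps, axis=0)  # attention map in [0, 1]

ir = cv2.imread("ir_frame.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input
attention = boolean_map_attention(ir)
targets = attention > 0.7  # candidate target mask (threshold assumed)
```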