Integrated Search | Korea Science

Improving visual relationship detection using linguistic and spatial cues

Jung, Jaewon;Park, Jongyoul
- ETRI Journal / Vol. 42, No. 3 / pp.399-410 / 2020
Detecting visual relationships in an image is an important part of image understanding. It enables higher-level image understanding tasks, namely predicting the next scene and understanding what occurs in an image. A visual relationship comprises a subject, a predicate, and an object, and is related to visual, language, and spatial cues. The predicate explains the relationship between the subject and object and can be grouped into categories such as prepositions and verbs. A large visual gap can exist even between visual relationships that share the same predicate. This study improves upon a previous study, which uses a language cue (with two losses) and a spatial cue (that only includes individual information), by adding relative information on the subject and object. The architectural limitation of the previous study is demonstrated and overcome so that all zero-shot visual relationships can be detected. A new problem is discovered, and an explanation of how it decreases performance is provided. Experiments are conducted on the VRD and VG datasets, and a significant improvement over previous results is obtained.
https://doi.org/10.4218/etrij.2019-0093

Real-Time Comprehensive Assistance for Visually Impaired Navigation

Amal Al-Shahrani;Amjad Alghamdi;Areej Alqurashi;Raghad Alzahrani;Nuha imam
- International Journal of Computer Science & Network Security / Vol. 24, No. 5 / pp.1-10 / 2024
Individuals with visual impairments face numerous challenges in their daily lives, with navigating streets and public spaces being particularly daunting. The inability to identify safe crossing locations and assess the feasibility of crossing significantly restricts their mobility and independence. Globally, an estimated 285 million people suffer from visual impairment, with 39 million categorized as blind and 246 million as visually impaired, according to the World Health Organization. In Saudi Arabia alone, there are approximately 159 thousand blind individuals, as per unofficial statistics. The profound impact of visual impairments on daily activities underscores the urgent need for solutions to improve mobility and enhance safety. This study aims to address this pressing issue by leveraging computer vision and deep learning techniques to enhance object detection capabilities. Two models were trained to detect objects: one focused on street crossing obstacles, and the other aimed to search for objects. The first model was trained on a dataset comprising 5283 images of road obstacles and traffic signals, annotated to create a labeled dataset. Subsequently, it was trained using the YOLOv8 and YOLOv5 models, with YOLOv5 achieving a satisfactory accuracy of 84%. The second model was trained on the COCO dataset using YOLOv5, yielding an impressive accuracy of 94%. By improving object detection capabilities through advanced technology, this research seeks to empower individuals with visual impairments, enhancing their mobility, independence, and overall quality of life.
https://doi.org/10.22937/IJCSNS.2024.24.5.1

Accurate PCB Outline Extraction and Corner Detection for High Precision Machine Vision

고동민;최강선
- 반도체디스플레이기술학회지 / Vol. 16, No. 3 / pp.53-58 / 2017
Recently, advances in technology have increased the importance of visual inspection in semiconductor inspection. In PCB visual inspection, accurate line estimation is critical to the accuracy of the entire process, since it is used in preprocessing steps such as calibration and alignment. We propose a line estimation method that weights line candidates differently using a histogram of gradient information, given the approximate initial positions of the corner points. Using the obtained line equations of the outline, corner points can be calculated accurately. The proposed method is compared with the existing method in terms of the accuracy of the detected corner points. The proposed method accurately detects corner points even when the existing method fails. For high-resolution frames of 3.5 megapixels, the proposed method runs in 89.01 ms.
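The abstract above derives corner points from the line equations of the PCB outline. The following is a minimal sketch of that last step only, not the authors' implementation: it assumes each outline edge is fitted by a weighted total-least-squares line and that a corner is the intersection of two such lines. The function names and the example weights are hypothetical; in the paper the weights come from a histogram of gradient information, which is not reproduced here.

```python
import math

def fit_line_weighted(points, weights):
    """Weighted total-least-squares fit. Returns (a, b, c) with a*x + b*y + c = 0."""
    w_sum = sum(weights)
    mx = sum(w * x for (x, _), w in zip(points, weights)) / w_sum
    my = sum(w * y for (_, y), w in zip(points, weights)) / w_sum
    sxx = sum(w * (x - mx) ** 2 for (x, _), w in zip(points, weights))
    syy = sum(w * (y - my) ** 2 for (_, y), w in zip(points, weights))
    sxy = sum(w * (x - mx) * (y - my) for (x, y), w in zip(points, weights))
    # Angle of the direction of largest weighted scatter (the line direction).
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    a, b = -math.sin(theta), math.cos(theta)  # unit normal of the line
    c = -(a * mx + b * my)                    # line passes through the weighted centroid
    return a, b, c

def intersect(line1, line2):
    """Intersection of two lines given as (a, b, c), solved with Cramer's rule."""
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel; no unique corner")
    return (b1 * c2 - b2 * c1) / det, (a2 * c1 - a1 * c2) / det

# Hypothetical edge samples: a horizontal top edge and a vertical left edge.
top_edge = fit_line_weighted([(0, 10), (5, 10), (10, 10)], [1.0, 2.0, 1.0])
left_edge = fit_line_weighted([(2, 0), (2, 5), (2, 20)], [1.0, 1.0, 1.0])
corner = intersect(top_edge, left_edge)
```

The implicit form `a*x + b*y + c = 0` is used instead of `y = m*x + k` so that vertical outline edges can be represented without special cases.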

Visual Sensing of Fires Using Color and Dynamic Features

도용태
- 센서학회지 / Vol. 21, No. 3 / pp.211-216 / 2012
Fires are the most common disaster, and early fire detection is of great importance in minimizing the consequent damage. Simple sensors such as smoke detectors are widely used for this purpose, but they can sense fires only at close proximity. Recently, owing to rapid advances in the relevant technologies, vision-based fire sensing has attracted growing attention. In this paper, a novel visual sensing technique to automatically detect fire is presented. The proposed technique consists of multiple image processing steps at the pixel level, block level, and frame level. In the first step, flame pixel candidates are selected based on their color values in YIQ space from the image of a camera installed as a vision sensor at a fire scene. In the second step, the dynamic parts of flames are extracted by comparing two consecutive images. These parts are then represented in regularly divided image blocks to reduce pixel-level detection error and simplify subsequent processing. Finally, the temporal change of the detected blocks is analyzed to confirm the spread of fire. The proposed technique was tested using real fire images and worked quite reliably.
https://doi.org/10.5369/JSST.2012.21.3.211
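The pixel-level and block-level steps described in this abstract can be sketched as follows. This is an illustrative sketch only: the abstract does not give the YIQ thresholds, the frame-difference threshold, or the block size, so `Y_MIN`, `I_MIN`, `diff_min`, `block`, and `min_pixels` are assumed values, and the final frame-level temporal analysis is not covered. Frames are assumed to be lists of rows of 8-bit `(r, g, b)` tuples.

```python
import colorsys

# Hypothetical thresholds: bright pixels with a strong orange (I) component.
Y_MIN, I_MIN = 0.4, 0.2

def is_flame_candidate(rgb):
    """Pixel-level test on the Y and I channels of YIQ space."""
    r, g, b = (channel / 255.0 for channel in rgb)
    y, i, _q = colorsys.rgb_to_yiq(r, g, b)
    return y > Y_MIN and i > I_MIN

def dynamic_flame_mask(prev_frame, curr_frame, diff_min=30):
    """Keep flame-colored pixels that changed between two consecutive frames."""
    mask = []
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        row = []
        for prev_px, curr_px in zip(prev_row, curr_row):
            changed = sum(abs(a - b) for a, b in zip(prev_px, curr_px)) > diff_min
            row.append(changed and is_flame_candidate(curr_px))
        mask.append(row)
    return mask

def active_blocks(mask, block=2, min_pixels=2):
    """Block-level step: report blocks containing at least min_pixels flagged pixels."""
    counts = {}
    for r, row in enumerate(mask):
        for c, flagged in enumerate(row):
            if flagged:
                key = (r // block, c // block)
                counts[key] = counts.get(key, 0) + 1
    return {key for key, n in counts.items() if n >= min_pixels}
```

Counting flagged pixels per block, rather than acting on single pixels, is what suppresses isolated pixel-level false detections in this sketch.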

Robust Position Tracking for Position-Based Visual Servoing and Its Application to Dual-Arm Task

김찬오;최성;정주노;양광웅;김홍석
- 로봇학회논문지 / Vol. 2, No. 2 / pp.129-136 / 2007
This paper introduces a robust position-based visual servoing method developed for the operation of a human-like robot with two arms. The proposed visual servoing method uses the SIFT algorithm for object detection and the CAMSHIFT algorithm for object tracking. While the conventional CAMSHIFT has been used mainly for object tracking in a 2D image plane, we extend its use to object tracking in 3D space by combining the CAMSHIFT results for the two image planes of a stereo camera. This approach yields robust and dependable results. Once the robot's task is defined based on the extracted 3D information, the robot is commanded to carry out the task. We conduct several position-based visual servoing tasks and compare performance under different conditions. The results show that the proposed visual tracking algorithm is simple but very effective for position-based visual servoing.
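The abstract states that 2D CAMSHIFT results from the two image planes of a stereo camera are combined into a 3D position, but it does not say how. One common way to do this, shown here purely as an assumption, is triangulation with a rectified pinhole stereo pair. The function name, the camera parameters, and the example coordinates are all hypothetical.

```python
def triangulate(left_uv, right_uv, focal_px, baseline_m, cx, cy):
    """Recover (X, Y, Z) in the left camera frame from two tracked window centers.

    Assumes a rectified stereo pair: both cameras share the focal length
    focal_px (pixels) and principal point (cx, cy), and are separated
    horizontally by baseline_m (meters).
    """
    (u_left, v_left), (u_right, v_right) = left_uv, right_uv
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("non-positive disparity; object cannot be triangulated")
    z = focal_px * baseline_m / disparity          # depth from disparity
    x = (u_left - cx) * z / focal_px               # back-project the column
    y = ((v_left + v_right) / 2.0 - cy) * z / focal_px  # average the two rows
    return x, y, z

# Hypothetical tracker outputs: window centers from the left and right images.
position = triangulate((340.0, 240.0), (320.0, 240.0),
                       focal_px=500.0, baseline_m=0.1, cx=320.0, cy=240.0)
```

In a real system the two centers would come from per-image trackers, and unrectified cameras would need full projection matrices instead of this simplified model.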

On-Road Succeeding Vehicle Detection using Characteristic Visual Features

샴 아디카리;조휘택;유현중;양창주;김형석
- 전기학회논문지 / Vol. 59, No. 3 / pp.636-644 / 2010
A method for detecting on-road succeeding vehicles using characteristic visual features such as horizontal edges, shadow, symmetry, and intensity is proposed. The proposed method uses the prominent horizontal edges along with the shadow under the vehicle to generate an initial estimate of the vehicle-road surface contact. Fast symmetry detection, using the edge pixels, is then performed to detect the presence of a vertically symmetric object, possibly a vehicle, in the region above the initially estimated vehicle-road surface contact. A window defined by the horizontal and vertical lines obtained above, along with local perspective information, provides a narrow region for the final search for the vehicle. A bounding box around the vehicle is extracted from the horizontal edges, the symmetry histogram, and a proposed squared difference of intensity measure. Experiments performed on natural traffic scenes obtained from a camera mounted on the side-view mirror of a host vehicle demonstrate good and reliable performance of the proposed method.
https://doi.org/10.5370/KIEE.2010.59.3.636
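To illustrate the idea of finding a vertical symmetry axis from edge pixels, here is a brute-force sketch. It is not the fast symmetry detection used in the paper, whose details the abstract does not give. It assumes a binary edge map (list of rows of 0/1) and scores each candidate axis by the number of edge-pixel pairs mirrored about it; the function name is hypothetical.

```python
def best_symmetry_axis(edges):
    """Return (axis_x, score) for the vertical axis with the most mirrored edge pairs.

    axis_x may fall on a pixel center or halfway between two pixels.
    """
    width = len(edges[0])
    best_axis, best_score = None, -1
    # axis2 is twice the axis position, so half-pixel axes stay integral.
    for axis2 in range(1, 2 * width - 2):
        score = 0
        for row in edges:
            for x in range(width):
                mirror = axis2 - x
                # Count each pair once (x left of its mirror) and stay in bounds.
                if x < mirror < width and row[x] and row[mirror]:
                    score += 1
        if score > best_score:
            best_axis, best_score = axis2 / 2.0, score
    return best_axis, best_score

# Hypothetical edge map: two vertical edges at columns 1 and 5 (e.g. vehicle sides).
edge_map = [
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
]
axis, score = best_symmetry_axis(edge_map)
```

The per-axis scores form a simple symmetry histogram; a practical detector would restrict the search to the region above the estimated vehicle-road contact line.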

Motion Estimation-based Human Fall Detection for Visual Surveillance

Kim, Heegwang;Park, Jinho;Park, Hasil;Paik, Joonki
- IEIE Transactions on Smart Processing and Computing / Vol. 5, No. 5 / pp.327-330 / 2016
The world's elderly population continues to grow at a dramatic rate. As the number of senior citizens increases, fall detection has attracted increasing attention in visual surveillance systems. This paper presents a novel fall-detection algorithm using motion estimation and an integrated spatiotemporal energy map of the object region. The proposed method first extracts a human region using a background subtraction method. Next, an optical flow algorithm is applied to estimate motion vectors, and an energy map is generated by accumulating the detected human region over a certain period of time. A fall is then detected using k-nearest neighbor (kNN) classification with the previously estimated motion information and energy map. The experimental results show that the proposed algorithm can effectively detect someone falling in any direction, including at an angle parallel to the camera's optical axis.
https://doi.org/10.5573/IEIESPC.2016.5.5.327
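The final classification step in this abstract uses k-nearest neighbors. A minimal sketch of kNN majority voting is given below. The two-element feature vectors (for example, a motion magnitude and an energy-map statistic), the labels, and the training values are invented for illustration; the abstract does not specify the actual features or the value of k.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify query by majority vote among its k nearest training samples.

    train is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Made-up training samples: (motion feature, energy-map feature) -> label.
training_set = [
    ((0.10, 0.20), "no_fall"),
    ((0.20, 0.10), "no_fall"),
    ((0.15, 0.15), "no_fall"),
    ((0.90, 0.80), "fall"),
    ((0.80, 0.90), "fall"),
    ((0.85, 0.95), "fall"),
]
label = knn_predict(training_set, (0.80, 0.80), k=3)
```

An odd k avoids tied votes in a two-class problem such as fall versus no fall.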

IR and SAR Sensor Fusion based Target Detection using BMVT-M

임윤지;김태훈;김성호;송우진;김경태;김소현
- 제어로봇시스템학회논문지 / Vol. 21, No. 11 / pp.1017-1026 / 2015
Infrared (IR) target detection is one of the key technologies in Automatic Target Detection/Recognition (ATD/R) for military applications. However, IR sensors are limited by their sensitivity to weather and atmospheric effects. In recent years, sensor information fusion has become an active research topic for overcoming these limitations. A SAR sensor is adopted for sensor fusion because SAR is robust to various weather conditions. In this paper, a Boolean Map Visual Theory-Morphology (BMVT-M) method is proposed to detect targets in SAR and IR images. Moreover, we propose an IR and SAR image registration method and a decision-level fusion algorithm. The experimental results using OKTAL-SE synthetic images validate the feasibility of sensor fusion-based target detection.
https://doi.org/10.5302/J.ICROS.2015.15.0147

Lip Reading Method Using CNN for Utterance Period Detection

김용기;임종관;김미혜
- 디지털융복합연구 / Vol. 14, No. 8 / pp.233-243 / 2016
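Because of the difficulty of speech recognition in noisy environments, Audio Visual Speech Recognition (AVSR) systems, which combine audio and visual information, have been proposed since the mid-1990s, and lip reading has been used as the visual feature in AVSR systems. This study aims to maximize the recognition rate of spoken words using only the shape of the lips, in order to build an efficient AVSR system. For lip-shape recognition, the input video of a speaker uttering the test words is preprocessed and the lip region is detected. A Convolutional Neural Network (CNN), a type of Deep Neural Network (DNN), is then used to detect the utterance period, and the same network is used to extract lip-shape feature vectors, which are recognized with a Hidden Markov Model (HMM). Utterance period detection achieved a recognition rate of 91%, outperforming the threshold-based method. In the lip-shape recognition experiments, the speaker-dependent test reached 88.5% and the speaker-independent test reached 80.2%, both higher than in previous studies.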
https://doi.org/10.14400/JDC.2016.14.8.233

Detection of Visual Attended Regions in Road Images for Assisting Safety Driving

김종배
- 전자공학회논문지SC / Vol. 49, No. 1 / pp.94-102 / 2012
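As society ages, the number of elderly drivers is increasing. Most traffic accidents involving elderly drivers are caused by driver inattention. Such inattention stems from poor vehicle handling due to slower body movement with age, reduced visual search ability due to a field of view narrowed by presbyopia, and difficulty identifying objects due to low contrast sensitivity. To support safe driving by elderly drivers, this study proposes a method that automatically detects, in real time, the regions of objects of interest in road images that require visual attention. To detect candidate regions of objects of interest with selective visual attention in real time, the proposed method generates a saliency map that represents in three dimensions the degree of contrast change in color, gradient, and intensity features, and at the same time segments the image with the mean-shift algorithm to obtain object boundaries. Selective visual attention regions are then detected according to whether each segmented region contains salient pixels. Experiments under various outdoor conditions show that the proposed method detects diverse objects on the road quickly and is robust even in relatively complex road environments.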

Search results: 876 (processing time: 0.033 s)
