• Title/Summary/Keyword: Open-world object detection


Unveiling the Unseen: A Review on Current Trends in Open-World Object Detection (오픈 월드 객체 감지의 현재 트렌드에 대한 리뷰)

  • MUHAMMAD ALI IQBAL;Soo Kyun Kim
    • Proceedings of the Korean Society of Computer Information Conference / 2024.01a / pp.335-337 / 2024
  • This paper presents a new open-world object detection method emphasizing uncertainty representation in machine learning models. The focus is on adapting to real-world uncertainties by incrementally updating the model's knowledge repository for dynamic scenarios. Applications such as autonomous vehicles benefit from improved multi-class classification accuracy. The paper reviews challenges in existing methodologies, stressing the need for universal detectors capable of handling unknown classes. As future directions, it proposes collaboration and the integration of language models to improve the adaptability and applicability of open-world object detection.

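A common way to let a detector "handle unknown classes," as the review above calls for, is to threshold a per-detection uncertainty score. The sketch below is a minimal illustration of that idea, not the paper's method; the entropy threshold and class names are assumptions chosen for the example.

```python
import math

def softmax(logits):
    """Convert raw detector logits into class probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_with_unknown(logits, classes, entropy_threshold=1.0):
    """Return a known class label, or 'unknown' when the predictive
    distribution is too uncertain (entropy above the threshold)."""
    probs = softmax(logits)
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    if entropy > entropy_threshold:
        return "unknown"
    return classes[probs.index(max(probs))]

classes = ["car", "pedestrian", "cyclist"]
print(classify_with_unknown([8.0, 0.1, 0.2], classes))  # confident -> "car"
print(classify_with_unknown([1.0, 1.1, 0.9], classes))  # uncertain -> "unknown"
```

In practice the uncertainty score might come from the detector's objectness head or an ensemble rather than raw softmax entropy, but the thresholding logic is the same.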

Audio Marker Detection for Effective Augmented Reality Implementation (효과적인 증강현실 구현을 위한 오디오 마커 검출)

  • Jeon, Soo-Jin;Kim, Young-Seop
    • Journal of the Semiconductor & Display Technology / v.10 no.2 / pp.121-124 / 2011
  • Augmented Reality integrates virtual objects into the real world, extending the human's perception of it. Augmented Reality technology combines real and virtual objects in a real environment, runs interactively in real time, and is regarded as an emerging technology in a large part of the future of information technology, so its benefits for various businesses are estimated to be very high. In this paper, we combine ARToolkit with OpenAL to provide audio to users. The proposed methodology will contribute to a better immersive realization of the conventional Augmented Reality system.

A Study on the Image/Video Data Processing Methods for Edge Computing-Based Object Detection Service (에지 컴퓨팅 기반 객체탐지 서비스를 위한 이미지/동영상 데이터 처리 기법에 관한 연구)

  • Jang Shin Won;Yong-Geun Hong
    • KIPS Transactions on Computer and Communication Systems / v.12 no.11 / pp.319-328 / 2023
  • Unlike cloud computing, edge computing technology analyzes and judges data close to devices and users, providing advantages such as real-time service, sensitive data protection, and reduced network traffic. EdgeX Foundry, a representative open source edge computing platform, is an open source-based edge middleware platform that provides services between various devices and IT systems in the real world. EdgeX Foundry provides a service for handling camera devices alongside its service for handling existing sensed data, but the camera service only supports simple streaming and camera device management and does not store or process image data obtained from the device inside EdgeX. This paper presents a technique that can store and process image data inside EdgeX by applying some of the services provided by EdgeX Foundry. Based on the proposed technique, a service pipeline for an object detection service, used as a core component in the field of autonomous driving, was created for experiments and performance evaluation, and then compared and analyzed against existing methods.
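The "service pipeline" idea above can be pictured as a chain of processing stages applied to each image message. The sketch below is a generic, hypothetical illustration of such a pipeline; the `Frame` type and stage names are invented for the example and are not EdgeX Foundry APIs (EdgeX services actually communicate over a message bus and REST).

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Frame:
    """Hypothetical stand-in for an image message flowing through the pipeline."""
    data: bytes
    tags: dict = field(default_factory=dict)

def store(frame: Frame) -> Frame:
    """Persist the frame (placeholder for writing to local storage)."""
    frame.tags["stored"] = True
    return frame

def detect_objects(frame: Frame) -> Frame:
    """Run detection (placeholder for a real model invocation)."""
    frame.tags["objects"] = ["car"]
    return frame

def run_pipeline(frame: Frame, stages: List[Callable[[Frame], Frame]]) -> Frame:
    """Apply each processing stage to the frame in order."""
    for stage in stages:
        frame = stage(frame)
    return frame

result = run_pipeline(Frame(data=b"\x00"), [store, detect_objects])
```

Each stage receives the output of the previous one, which mirrors how an edge pipeline can both store image data and feed it to an object detector without leaving the device.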

ONNX-based Runtime Performance Analysis: YOLO and ResNet (ONNX 기반 런타임 성능 분석: YOLO와 ResNet)

  • Jeong-Hyeon Kim;Da-Eun Lee;Su-Been Choi;Kyung-Koo Jun
    • The Journal of Bigdata / v.9 no.1 / pp.89-100 / 2024
  • In the field of computer vision, models such as You Only Look Once (YOLO) and ResNet are widely used due to their real-time performance and high accuracy. However, to apply these models in real-world environments, factors such as runtime compatibility, memory usage, computing resources, and real-time constraints must be considered. This study compares the characteristics of three deep learning model runtimes, ONNX Runtime, TensorRT, and OpenCV DNN, and analyzes their performance on the two models. The aim of this paper is to provide criteria for runtime selection in practical applications. The experiments compare the runtimes on the evaluation metrics of time, memory usage, and accuracy for vehicle license plate recognition and classification tasks. The experimental results show that ONNX Runtime excels in complex object detection performance, OpenCV DNN is suitable for environments with limited memory, and TensorRT offers superior execution speed for complex models.
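A runtime comparison of the kind described above boils down to timing the same inference callable under each backend and recording memory. The harness below is a minimal, runtime-agnostic sketch (not from the paper): `infer` stands in for a call into ONNX Runtime, TensorRT, or OpenCV DNN, and the toy model is an assumption for the example. Note that `tracemalloc` only sees Python-level allocations, so a real comparison would also need an OS-level memory probe.

```python
import time
import tracemalloc

def benchmark(infer, inputs, warmup=2, runs=10):
    """Measure average per-input latency (seconds) and peak Python-level
    memory (bytes) for an arbitrary inference callable."""
    for x in inputs[:warmup]:
        infer(x)  # warm up caches and lazy initialization
    tracemalloc.start()
    start = time.perf_counter()
    for _ in range(runs):
        for x in inputs:
            infer(x)
    latency = (time.perf_counter() - start) / (runs * len(inputs))
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return latency, peak

# Toy stand-in for a model: squares each "pixel" value.
latency, peak_mem = benchmark(lambda x: [v * v for v in x], [[1, 2, 3]] * 4)
```

Running the same harness with each backend's session object plugged in as `infer` yields directly comparable time and memory figures.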

Domain Adaptive Fruit Detection Method based on a Vision-Language Model for Harvest Automation (작물 수확 자동화를 위한 시각 언어 모델 기반의 환경적응형 과수 검출 기술)

  • Changwoo Nam;Jimin Song;Yongsik Jin;Sang Jun Lee
    • IEMEK Journal of Embedded Systems and Applications / v.19 no.2 / pp.73-81 / 2024
  • Recently, mobile manipulators have been utilized in the agriculture industry for weed removal and harvest automation. This paper proposes a domain adaptive fruit detection method for harvest automation that utilizes OWL-ViT, an open-vocabulary object detection model. The vision-language model can detect objects based on text prompts and can therefore be extended to detect objects of undefined categories. In the development of deep learning models for real-world problems, constructing a large-scale labeled dataset is a time-consuming task that relies heavily on human effort. To reduce this labor-intensive workload, we utilized a large-scale public dataset as source domain data and employed a domain adaptation method. Adversarial learning was conducted between a domain discriminator and the feature extractor to reduce the gap between the distributions of feature vectors from the source domain and our target domain data. We collected a target domain dataset in a realistic environment and conducted experiments to demonstrate the effectiveness of the proposed method. In the experiments, the domain adaptation method improved the AP50 metric from 38.88% to 78.59% for detecting objects within a range of 2 m, and we achieved a manipulation success rate of 81.7%.
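The AP50 figures quoted above count a predicted box as a true positive when its Intersection-over-Union (IoU) with a ground-truth box is at least 0.5. A minimal sketch of that matching criterion (the box coordinates are illustrative):

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def is_true_positive_at_ap50(pred, gt):
    """AP50 counts a prediction as correct when IoU >= 0.5."""
    return iou(pred, gt) >= 0.5

print(is_true_positive_at_ap50((0, 0, 2, 2), (0, 0, 2, 2)))  # True (IoU = 1.0)
print(is_true_positive_at_ap50((0, 0, 2, 2), (1, 1, 3, 3)))  # False (IoU = 1/7)
```

Full AP50 then averages precision over recall levels using this per-box decision, which is why better feature alignment across domains raises the metric.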

Research on Ocular Data Analysis and Eye Tracking in Divers

  • Ye Jun Lee;Yong Kuk Kim;Da Young Kim;Jeongtack Min;Min-Kyu Kim
    • Journal of the Korea Society of Computer and Information / v.29 no.8 / pp.43-51 / 2024
  • This paper proposes a method for acquiring and analyzing ocular data using a special-purpose diver mask, targeted at divers who primarily engage in underwater activities. The user's gaze is tracked with the help of a custom-built ocular dataset and a YOLOv8-nano model developed for this purpose. The model achieved an average processing time of 45.52 ms per frame and recognized open and closed eye states with 99% accuracy. Based on the analysis of the ocular data, a gaze tracking algorithm was developed that can map gaze to real-world coordinates. Validation of this algorithm showed an average error rate of about 1% on the x-axis and about 6% on the y-axis.
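Mapping a tracked pupil position to real-world coordinates, and reporting per-axis error as a fraction of each axis's range, can be sketched with a simple calibrated linear mapping. This is an illustrative assumption, not the paper's algorithm; the calibration ranges below are invented for the example.

```python
def map_gaze(pupil, calib):
    """Linearly map a normalized pupil position (0..1 per axis) to
    real-world coordinates using a calibrated (min, max) range per axis.
    The calibration values used below are hypothetical."""
    (x_min, x_max), (y_min, y_max) = calib
    px, py = pupil
    return (x_min + px * (x_max - x_min),
            y_min + py * (y_max - y_min))

def axis_error_rate(predicted, actual, axis_range):
    """Relative error of one coordinate, as a fraction of the axis range."""
    return abs(predicted - actual) / axis_range

calib = ((0.0, 100.0), (0.0, 60.0))  # hypothetical field of view, in cm
gx, gy = map_gaze((0.5, 0.5), calib)  # center of the eye's range
err_x = axis_error_rate(gx, 50.0, 100.0)  # 0.0 if the mapping is exact
```

An error rate of "about 1% on the x-axis" in this formulation means the predicted x coordinate is off by about 1% of the calibrated x range on average.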