• Title/Summary/Keyword: Open-world object detection


Unveiling the Unseen: A Review on Current Trends in Open-World Object Detection (오픈 월드 객체 감지의 현재 트렌드에 대한 리뷰)

  • MUHAMMAD ALI IQBAL;Soo Kyun Kim
    • Proceedings of the Korean Society of Computer Information Conference / 2024.01a / pp.335-337 / 2024
  • This paper presents a new open-world object detection method emphasizing uncertainty representation in machine learning models. The focus is on adapting to real-world uncertainties by incrementally updating the model's knowledge repository for dynamic scenarios. Applications such as autonomous vehicles benefit from improved multi-class classification accuracy. The paper reviews challenges in existing methodologies, stressing the need for universal detectors capable of handling unknown classes. As future directions, it proposes collaboration and the integration of language models to improve the adaptability and applicability of open-world object detection.

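A common way to let a detector "handle unknown classes," as the review above calls for, is to threshold a per-detection uncertainty score. The sketch below is a minimal illustration of that idea, not the paper's method; the entropy threshold and class names are assumptions chosen for the example.

```python
import math

def softmax(logits):
    """Convert raw detector logits into class probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_with_unknown(logits, classes, entropy_threshold=1.0):
    """Return a known class label, or 'unknown' when the predictive
    distribution is too uncertain (entropy above the threshold)."""
    probs = softmax(logits)
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    if entropy > entropy_threshold:
        return "unknown"
    return classes[probs.index(max(probs))]

classes = ["car", "pedestrian", "cyclist"]
print(classify_with_unknown([8.0, 0.1, 0.2], classes))  # confident -> "car"
print(classify_with_unknown([1.0, 1.1, 0.9], classes))  # uncertain -> "unknown"
```

In practice the uncertainty score might come from the detector's objectness head or an ensemble rather than raw softmax entropy, but the thresholding logic is the same.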

Audio Marker Detection for Effective Augmented Reality Implementation (효과적인 증강현실 구현을 위한 오디오 마커 검출)

  • Jeon, Soo-Jin;Kim, Young-Seop
    • Journal of the Semiconductor & Display Technology / v.10 no.2 / pp.121-124 / 2011
  • Augmented Reality integrates virtual objects into the real world, extending the human's perception of it. Augmented Reality technology combines real and virtual objects in a real environment, runs interactively in real time, and is regarded as an emerging technology in a large part of the future of information technology, so its benefits for various businesses are estimated to be very high. In this paper, we combine ARToolkit with OpenAL to provide audio to users. The proposed methodology will contribute to a better immersive realization of the conventional Augmented Reality system.

A Study on the Image/Video Data Processing Methods for Edge Computing-Based Object Detection Service (에지 컴퓨팅 기반 객체탐지 서비스를 위한 이미지/동영상 데이터 처리 기법에 관한 연구)

  • Jang Shin Won;Yong-Geun Hong
    • KIPS Transactions on Computer and Communication Systems / v.12 no.11 / pp.319-328 / 2023
  • Unlike cloud computing, edge computing technology analyzes and judges data close to devices and users, providing advantages such as real-time service, sensitive data protection, and reduced network traffic. EdgeX Foundry, a representative open source edge computing platform, is an open source-based edge middleware platform that provides services between various devices and IT systems in the real world. EdgeX Foundry provides a service for handling camera devices alongside its service for handling existing sensed data, but the camera service only supports simple streaming and camera device management and does not store or process image data obtained from the device inside EdgeX. This paper presents a technique that can store and process image data inside EdgeX by applying some of the services provided by EdgeX Foundry. Based on the proposed technique, a service pipeline for an object detection service, used as a core component in the field of autonomous driving, was created for experiments and performance evaluation, and then compared and analyzed against existing methods.
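The "service pipeline" idea above can be pictured as a chain of processing stages applied to each image message. The sketch below is a generic, hypothetical illustration of such a pipeline; the `Frame` type and stage names are invented for the example and are not EdgeX Foundry APIs (EdgeX services actually communicate over a message bus and REST).

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Frame:
    """Hypothetical stand-in for an image message flowing through the pipeline."""
    data: bytes
    tags: dict = field(default_factory=dict)

def store(frame: Frame) -> Frame:
    """Persist the frame (placeholder for writing to local storage)."""
    frame.tags["stored"] = True
    return frame

def detect_objects(frame: Frame) -> Frame:
    """Run detection (placeholder for a real model invocation)."""
    frame.tags["objects"] = ["car"]
    return frame

def run_pipeline(frame: Frame, stages: List[Callable[[Frame], Frame]]) -> Frame:
    """Apply each processing stage to the frame in order."""
    for stage in stages:
        frame = stage(frame)
    return frame

result = run_pipeline(Frame(data=b"\x00"), [store, detect_objects])
```

Each stage receives the output of the previous one, which mirrors how an edge pipeline can both store image data and feed it to an object detector without leaving the device.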

ONNX-based Runtime Performance Analysis: YOLO and ResNet (ONNX 기반 런타임 성능 분석: YOLO와 ResNet)

  • Jeong-Hyeon Kim;Da-Eun Lee;Su-Been Choi;Kyung-Koo Jun
    • The Journal of Bigdata / v.9 no.1 / pp.89-100 / 2024
  • In the field of computer vision, models such as You Only Look Once (YOLO) and ResNet are widely used due to their real-time performance and high accuracy. However, to apply these models in real-world environments, factors such as runtime compatibility, memory usage, computing resources, and real-time constraints must be considered. This study compares the characteristics of three deep learning model runtimes, ONNX Runtime, TensorRT, and OpenCV DNN, and analyzes their performance on the two models. The aim of this paper is to provide criteria for runtime selection in practical applications. The experiments compare the runtimes on the evaluation metrics of time, memory usage, and accuracy for vehicle license plate recognition and classification tasks. The experimental results show that ONNX Runtime excels in complex object detection performance, OpenCV DNN is suitable for environments with limited memory, and TensorRT offers superior execution speed for complex models.
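A runtime comparison of the kind described above boils down to timing the same inference callable under each backend and recording memory. The harness below is a minimal, runtime-agnostic sketch (not from the paper): `infer` stands in for a call into ONNX Runtime, TensorRT, or OpenCV DNN, and the toy model is an assumption for the example. Note that `tracemalloc` only sees Python-level allocations, so a real comparison would also need an OS-level memory probe.

```python
import time
import tracemalloc

def benchmark(infer, inputs, warmup=2, runs=10):
    """Measure average per-input latency (seconds) and peak Python-level
    memory (bytes) for an arbitrary inference callable."""
    for x in inputs[:warmup]:
        infer(x)  # warm up caches and lazy initialization
    tracemalloc.start()
    start = time.perf_counter()
    for _ in range(runs):
        for x in inputs:
            infer(x)
    latency = (time.perf_counter() - start) / (runs * len(inputs))
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return latency, peak

# Toy stand-in for a model: squares each "pixel" value.
latency, peak_mem = benchmark(lambda x: [v * v for v in x], [[1, 2, 3]] * 4)
```

Running the same harness with each backend's session object plugged in as `infer` yields directly comparable time and memory figures.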

Domain Adaptive Fruit Detection Method based on a Vision-Language Model for Harvest Automation (작물 수확 자동화를 위한 시각 언어 모델 기반의 환경적응형 과수 검출 기술)

  • Changwoo Nam;Jimin Song;Yongsik Jin;Sang Jun Lee
    • IEMEK Journal of Embedded Systems and Applications / v.19 no.2 / pp.73-81 / 2024
  • Recently, mobile manipulators have been utilized in the agriculture industry for weed removal and harvest automation. This paper proposes a domain adaptive fruit detection method for harvest automation that utilizes OWL-ViT, an open-vocabulary object detection model. The vision-language model can detect objects based on text prompts and can therefore be extended to detect objects of undefined categories. In the development of deep learning models for real-world problems, constructing a large-scale labeled dataset is a time-consuming task that relies heavily on human effort. To reduce this labor-intensive workload, we utilized a large-scale public dataset as source domain data and employed a domain adaptation method. Adversarial learning was conducted between a domain discriminator and the feature extractor to reduce the gap between the distributions of feature vectors from the source domain and our target domain data. We collected a target domain dataset in a realistic environment and conducted experiments to demonstrate the effectiveness of the proposed method. In the experiments, the domain adaptation method improved the AP50 metric from 38.88% to 78.59% for detecting objects within a range of 2 m, and we achieved a manipulation success rate of 81.7%.
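The AP50 figures quoted above count a predicted box as a true positive when its Intersection-over-Union (IoU) with a ground-truth box is at least 0.5. A minimal sketch of that matching criterion (the box coordinates are illustrative):

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def is_true_positive_at_ap50(pred, gt):
    """AP50 counts a prediction as correct when IoU >= 0.5."""
    return iou(pred, gt) >= 0.5

print(is_true_positive_at_ap50((0, 0, 2, 2), (0, 0, 2, 2)))  # True (IoU = 1.0)
print(is_true_positive_at_ap50((0, 0, 2, 2), (1, 1, 3, 3)))  # False (IoU = 1/7)
```

Full AP50 then averages precision over recall levels using this per-box decision, which is why better feature alignment across domains raises the metric.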

Research on Ocular Data Analysis and Eye Tracking in Divers

  • Ye Jun Lee;Yong Kuk Kim;Da Young Kim;Jeongtack Min;Min-Kyu Kim
    • Journal of the Korea Society of Computer and Information / v.29 no.8 / pp.43-51 / 2024
  • This paper proposes a method for acquiring and analyzing ocular data using a special-purpose diver mask, targeted at divers who primarily engage in underwater activities. The user's gaze is tracked with the help of a custom-built ocular dataset and a YOLOv8-nano model developed for this purpose. The model achieved an average processing time of 45.52 ms per frame and recognized open and closed eye states with 99% accuracy. Based on the analysis of the ocular data, a gaze tracking algorithm was developed that can map gaze to real-world coordinates. Validation of this algorithm showed an average error rate of about 1% on the x-axis and about 6% on the y-axis.
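Mapping a tracked pupil position to real-world coordinates, and reporting per-axis error as a fraction of each axis's range, can be sketched with a simple calibrated linear mapping. This is an illustrative assumption, not the paper's algorithm; the calibration ranges below are invented for the example.

```python
def map_gaze(pupil, calib):
    """Linearly map a normalized pupil position (0..1 per axis) to
    real-world coordinates using a calibrated (min, max) range per axis.
    The calibration values used below are hypothetical."""
    (x_min, x_max), (y_min, y_max) = calib
    px, py = pupil
    return (x_min + px * (x_max - x_min),
            y_min + py * (y_max - y_min))

def axis_error_rate(predicted, actual, axis_range):
    """Relative error of one coordinate, as a fraction of the axis range."""
    return abs(predicted - actual) / axis_range

calib = ((0.0, 100.0), (0.0, 60.0))  # hypothetical field of view, in cm
gx, gy = map_gaze((0.5, 0.5), calib)  # center of the eye's range
err_x = axis_error_rate(gx, 50.0, 100.0)  # 0.0 if the mapping is exact
```

An error rate of "about 1% on the x-axis" in this formulation means the predicted x coordinate is off by about 1% of the calibrated x range on average.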