• Title/Summary/Keyword: Vision-based Manipulation


Domain Adaptive Fruit Detection Method based on a Vision-Language Model for Harvest Automation (작물 수확 자동화를 위한 시각 언어 모델 기반의 환경적응형 과수 검출 기술)

  • Changwoo Nam;Jimin Song;Yongsik Jin;Sang Jun Lee
    • IEMEK Journal of Embedded Systems and Applications / v.19 no.2 / pp.73-81 / 2024
  • Recently, mobile manipulators have been utilized in the agriculture industry for weed removal and harvest automation. This paper proposes a domain adaptive fruit detection method for harvest automation that utilizes OWL-ViT, an open-vocabulary object detection model. Because the vision-language model detects objects based on text prompts, it can be extended to detect objects of undefined categories. In the development of deep learning models for real-world problems, constructing a large-scale labeled dataset is time-consuming and relies heavily on human effort. To reduce this labor-intensive workload, we utilized a large-scale public dataset as source domain data and employed a domain adaptation method. Adversarial learning was conducted between a domain discriminator and the feature extractor to reduce the gap between the feature distributions of the source domain and our target domain data. We collected a target domain dataset in a real-like environment and conducted experiments to demonstrate the effectiveness of the proposed method. In the experiments, the domain adaptation method improved the AP50 metric from 38.88% to 78.59% for detecting objects within a range of 2 m, and we achieved a manipulation success rate of 81.7%.
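The adversarial learning between discriminator and feature extractor described in this abstract is commonly implemented with a gradient reversal layer (GRL); the minimal sketch below illustrates that mechanism only (the function names and λ scaling are illustrative assumptions, not the authors' code).

```python
def grl_forward(features):
    # Identity in the forward pass: the domain discriminator
    # sees the extracted features unchanged.
    return features

def grl_backward(grad_from_discriminator, lam=1.0):
    # Backward pass: negate (and scale by lam) the gradient, so minimizing
    # the discriminator's domain-classification loss simultaneously pushes
    # the feature extractor to make source and target features
    # indistinguishable.
    return -lam * grad_from_discriminator

# A gradient of +2.0 arriving from the discriminator reaches the
# feature extractor as -1.0 when lam = 0.5.
assert grl_forward([0.3, -1.2]) == [0.3, -1.2]
assert grl_backward(2.0, lam=0.5) == -1.0
```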

Distortion Removal and False Positive Filtering for Camera-based Object Position Estimation (카메라 기반 객체의 위치인식을 위한 왜곡제거 및 오검출 필터링 기법)

  • Sil Jin;Jimin Song;Jiho Choi;Yongsik Jin;Jae Jin Jeong;Sang Jun Lee
    • IEMEK Journal of Embedded Systems and Applications / v.19 no.1 / pp.1-8 / 2024
  • Robotic arms have been widely utilized in labor-intensive industries such as manufacturing, agriculture, and food services, contributing to increased productivity. In the development of industrial robotic arms, camera sensors have many advantages owing to their cost-effectiveness and small size. However, estimating object positions is a challenging problem that critically affects the robustness of object manipulation. This paper proposes a method for estimating the 3D positions of objects and applies it to a pick-and-place task. A deep learning model detects 2D bounding boxes in the image plane, and the pinhole camera model is employed to compute the object positions. To improve the robustness of the 3D position measurements, we analyze the effect of lens distortion and introduce a false positive filtering process. Experiments were conducted on a real-world scenario of moving medicine bottles using a camera-based manipulator. The results demonstrate that distortion removal and false positive filtering effectively improve the position estimation precision and the manipulation success rate.
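The pinhole back-projection step this abstract relies on can be sketched as follows; the intrinsic values in the checks are made-up placeholders, and a real pipeline would first undistort (u, v) with the lens distortion coefficients, as the paper's distortion-removal step does.

```python
def pixel_to_3d(u, v, depth, fx, fy, cx, cy):
    """Back-project an (undistorted) pixel to camera coordinates.

    (u, v): pixel location, e.g. the bounding-box center from the detector.
    depth : distance Z along the optical axis, in meters.
    fx, fy: focal lengths in pixels; cx, cy: principal point.
    """
    X = (u - cx) * depth / fx
    Y = (v - cy) * depth / fy
    return X, Y, depth

# A pixel at the principal point always maps onto the optical axis.
assert pixel_to_3d(320, 240, 1.5, 600.0, 600.0, 320.0, 240.0) == (0.0, 0.0, 1.5)
# 600 px to the right at 2 m depth -> 2 m lateral offset when fx = 600.
assert pixel_to_3d(920, 240, 2.0, 600.0, 600.0, 320.0, 240.0) == (2.0, 0.0, 2.0)
```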

Design of Vision-based Interaction Tool for 3D Interaction in Desktop Environment (데스크탑 환경에서의 3차원 상호작용을 위한 비전기반 인터랙션 도구의 설계)

  • Choi, Yoo-Joo;Rhee, Seon-Min;You, Hyo-Sun;Roh, Young-Sub
    • The KIPS Transactions: Part B / v.15B no.5 / pp.421-434 / 2008
  • As computer graphics, virtual reality, and augmented reality technologies have developed, many applications based on these techniques require 3D interaction, such as the selection and manipulation of a 3D object. In this paper, we propose a framework for vision-based 3D interaction that simulates the functions of an expensive 3D mouse in a desktop environment. The proposed framework includes a specially manufactured interaction device using three-color LEDs. By recognizing the position and color of the LEDs in video sequences, various mouse events and 6-DOF interactions are supported. The proposed device is more intuitive and easier to use than an existing 3D mouse, which is expensive and requires skilled manipulation, so it can be used without additional learning or training. We explain how the three-color LED pointing device, one of the components of the framework, is built, how the 3D position and orientation of the pointer are calculated, and how the LED color is analyzed from video sequences. We verify the accuracy and usefulness of the proposed device by measuring the error of the estimated 3D position and orientation.
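Distinguishing which of the three LED colors is lit can be done with a simple hue threshold on the detected blob's color; the sketch below is a generic illustration (the thresholds and function name are assumptions, not taken from the paper) using Python's stdlib colorsys.

```python
import colorsys

def classify_led(r, g, b):
    # Convert the blob's average color to HSV; hue alone separates the
    # three LED colors largely independent of brightness.
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if h < 1 / 6 or h > 5 / 6:   # hue wraps around at red
        return "red"
    if h < 1 / 2:
        return "green"
    return "blue"

assert classify_led(255, 30, 30) == "red"
assert classify_led(20, 255, 40) == "green"
assert classify_led(10, 20, 255) == "blue"
```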

Manipulator with Camera for Mobile Robots (모바일 로봇을 위한 카메라 탑재 매니퓰레이터)

  • Lee Jun-Woo;Choe, Kyoung-Geun;Cho, Hun-Hee;Jeong, Seong-Kyun;Bong, Jae-Hwan
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.17 no.3 / pp.507-514 / 2022
  • Mobile manipulators are gaining attention in the field of home automation due to their combined mobility and manipulation capabilities. In this paper, we developed a small manipulator system that can be mounted on a mobile robot, as a preliminary study toward a mobile manipulator. The developed manipulator has four degrees of freedom. Its end-effector carries a camera and a gripper to recognize and manipulate objects. One of the four degrees of freedom is a vertical linear motion, which enables better interaction with human hands located higher than the mobile manipulator. The four actuators were placed close to the base to reduce the rotational inertia of the manipulator, which improves manipulation stability and reduces the risk of rollover. The developed manipulator repeatedly performed a pick-and-place task and successfully manipulated objects within its workspace.

Automatic Registration of Two Parts using Robot with Multiple 3D Sensor Systems

  • Ha, Jong-Eun
    • Journal of Electrical Engineering and Technology / v.10 no.4 / pp.1830-1835 / 2015
  • In this paper, we propose an algorithm for the automatic registration of two rigid parts using multiple 3D sensor systems on a robot. Four structured laser stripe systems, each consisting of a camera and a visible laser stripe, are used for the acquisition of 3D information. Detailed procedures are presented, including extrinsic calibration among the four 3D sensor systems and hand/eye calibration of the 3D sensing system on the robot arm. We find the best pose using a search-based pose estimation algorithm whose cost function reflects geometric constraints between the sensor systems and the target objects. A pose with minimum gap and height difference is found by greedy search. Experimental results using a demo system show the robustness and feasibility of the proposed algorithm.
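The greedy pose search described here can be sketched generically: perturb each pose parameter by a small step, keep the move that lowers the cost, and stop at a local minimum. The quadratic cost below is a toy stand-in for the paper's gap-plus-height-difference cost, and all names are illustrative.

```python
def greedy_pose_search(cost, pose, step=0.1, max_iter=1000):
    # Greedy coordinate search: evaluate +/- step on each pose parameter
    # and accept the best improving neighbor until no move lowers the cost.
    pose = list(pose)
    for _ in range(max_iter):
        best = pose
        for i in range(len(pose)):
            for delta in (step, -step):
                trial = list(pose)
                trial[i] += delta
                if cost(trial) < cost(best):
                    best = trial
        if best == pose:          # local minimum reached
            break
        pose = best
    return pose

# Toy cost standing in for gap + height difference; minimum at (1, 2).
toy_cost = lambda p: (p[0] - 1.0) ** 2 + (p[1] - 2.0) ** 2
result = greedy_pose_search(toy_cost, [0.0, 0.0])
assert abs(result[0] - 1.0) < 0.1 and abs(result[1] - 2.0) < 0.1
```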

Single Camera Omnidirectional Stereo Imaging System (단일 카메라 전방향 스테레오 영상 시스템)

  • Yi, Soo-Yeong;Choi, Byung-Wook
    • Journal of Institute of Control, Robotics and Systems / v.15 no.4 / pp.400-405 / 2009
  • A new method for catadioptric omnidirectional stereo vision with a single camera is presented in this paper. The proposed method uses a concave lens together with a convex mirror. Since the optical parts are simple and commercially available, the resulting omnidirectional stereo system is versatile and cost-effective. A closed-form solution for 3D distance computation is derived from the simple optics, accounting for the reflection at the convex mirror and the refraction through the concave lens. The compactness of the system and the simplicity of the image processing make the omnidirectional stereo system suitable for real-time applications such as autonomous navigation of a mobile robot or object manipulation. An experimental prototype is implemented to verify the feasibility of the proposed method.
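The paper derives a closed-form distance solution specific to its mirror-and-lens optics; as a generic illustration of how two effective viewpoints yield range, the standard baseline-disparity triangulation relation is sketched below (the numbers are placeholders, not from the paper).

```python
def stereo_depth(baseline_m, focal_px, disparity_px):
    # Classical two-viewpoint triangulation: depth is inversely
    # proportional to the disparity between the two projections of a point.
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return baseline_m * focal_px / disparity_px

# 0.1 m effective baseline, 500 px focal length, 10 px disparity -> 5 m.
assert stereo_depth(0.1, 500.0, 10.0) == 5.0
```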

Human Vision System based Adaptive Watermarking Algorithm (시각적 특성에 기반한 적응적 워터마킹 알고리즘)

  • 전영민;고일주;김계영
    • Journal of the Institute of Electronics Engineers of Korea CI / v.41 no.6 / pp.101-109 / 2004
  • This paper proposes an adaptive watermarking algorithm that accounts for the visual characteristics of humans. To embed a watermark in an image, features such as contrast, brightness, and texture are used. The proposed method adaptively selects blocks and determines the position and intensity of the watermark to be applied, according to these visual characteristics. The experiments involve cropping, image enhancement, low-pass filtering, and JPEG compression, and compare the detectability of the watermark under such image manipulations and attacks.
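One common way to make the embedding intensity follow local texture, as this abstract describes, is to scale the watermark strength by block variance; the sketch below is an assumption-laden illustration of that idea, not the paper's actual weighting.

```python
from statistics import pvariance

def embedding_strength(block, base_alpha=1.0, k=0.05):
    # Scale the watermark intensity by local texture (pixel variance):
    # highly textured blocks can hide a stronger mark without visible
    # artifacts, while smooth blocks get the minimal base strength.
    return base_alpha + k * pvariance(block)

flat = [128.0] * 64                           # smooth block: variance 0
textured = [128.0, 40.0, 200.0, 90.0] * 16    # high-variance block
assert embedding_strength(flat) == 1.0
assert embedding_strength(textured) > embedding_strength(flat)
```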

Dynamic Manipulation of a Virtual Object in Marker-less AR system Based on Both Human Hands

  • Chun, Jun-Chul;Lee, Byung-Sung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.4 no.4 / pp.618-632 / 2010
  • This paper presents a novel approach to robustly controlling augmented reality (AR) objects in a marker-less AR system through fingertip tracking and hand pattern recognition. One of the promising ways to develop a marker-less AR system is to use parts of the human body, such as the hand or face, in place of traditional fiducial markers. This paper introduces a real-time method to dynamically manipulate the overlaid virtual objects in a marker-less AR system using both hands and a single camera. The left bare hand serves as a virtual marker, while the right hand is used as a hand mouse. To build the marker-less system, we utilize a skin-color model for hand shape detection and curvature-based fingertip detection on the input video frames. Using the detected fingertips, the camera pose is estimated to overlay virtual objects on the hand coordinate system. To manipulate the rendered virtual objects dynamically, a vision-based hand control interface is developed that exploits fingertip tracking for object movement and pattern matching for hand command initiation. The experiments show that the proposed system can control the objects dynamically and conveniently.
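Curvature-based fingertip detection, mentioned in this abstract, typically flags contour points where the angle formed with neighboring contour points is small; the sketch below shows that angle test on synthetic points (the 60/150 degree thresholds are illustrative assumptions).

```python
import math

def point_angle(prev_pt, pt, next_pt):
    # Angle at pt between its two contour neighbors; fingertips are
    # high-curvature points, i.e. points with a small angle.
    ax, ay = prev_pt[0] - pt[0], prev_pt[1] - pt[1]
    bx, by = next_pt[0] - pt[0], next_pt[1] - pt[1]
    cos_theta = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
    return math.degrees(math.acos(cos_theta))

# A sharp V (fingertip candidate) vs. a nearly straight contour segment.
assert point_angle((-1, 2), (0, 0), (1, 2)) < 60
assert point_angle((-2, 0), (0, 0.1), (2, 0)) > 150
```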

Deep Learning Based Tank Aiming line Alignment System (딥러닝 기반 전차 조준선 정렬 시스템)

  • Jeong, Gyu-Been;Park, Jae-Hyo;Seok, Jong-Won
    • Journal of IKEEE / v.25 no.2 / pp.285-290 / 2021
  • Existing aiming inspections rely on foreign-made inspection equipment, which is in short supply and difficult to maintain, so inspecting a target takes considerable time. The proposed system reduces the time required for aiming inspection and, as a domestic product, can be maintained and distributed smoothly. In this paper, we develop a system that detects targets and monitors shooting results through a target detection deep learning model. The system detects targets in real time, and several preprocessing steps for distant targets significantly increase the identification rate. In addition, a graphical user interface facilitates camera manipulation and the storage and management of training result data. The system can therefore replace the currently used aiming inspection equipment and support non-fire training.

Investigating Smart TV Gesture Interaction Based on Gesture Types and Styles

  • Ahn, Junyoung;Kim, Kyungdoh
    • Journal of the Ergonomics Society of Korea / v.36 no.2 / pp.109-121 / 2017
  • Objective: This study aims to find suitable gesture types and styles for remote-control interaction on smart TVs. Background: Smart TVs are developing rapidly worldwide, and gesture interaction is a broad research area, especially for vision-based techniques. However, most studies focus on gesture recognition technology, and few previous studies have examined gesture types and styles on smart TVs. It is therefore necessary to check which gesture types and styles users prefer for each operation command. Method: We conducted an experiment to extract the manipulation commands required for smart TVs and to select the corresponding gestures. We observed the gesture styles people use for each operation command and checked whether there are styles they prefer over others. Based on these results, the study selected smart TV operation commands and gestures. Results: Eighteen TV commands were used in this study. Using the agreement level as a basis, we compared six gesture types and five gesture styles for each command. As for gesture type, participants generally preferred Path-Moving gestures. The Pan and Scroll commands showed the highest agreement level (1.00) among the 18 commands. As for gesture style, participants preferred a manipulative style for 11 commands (Next, Previous, Volume up, Volume down, Play, Stop, Zoom in, Zoom out, Pan, Rotate, Scroll). Conclusion: Based on the analysis of user-preferred gestures, nine gesture commands are proposed for gesture control on smart TVs. Most participants preferred Path-Moving type and Manipulative style gestures based on the actual operations. Application: The results can be applied to more advanced forms of gestures in 3D environments, such as VR studies, and the method used here can be utilized in various domains.
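Agreement levels such as the 1.00 reported for Pan and Scroll are commonly computed with the agreement score from gesture elicitation studies, A = Σ(|Pi|/|P|)² over groups of identical proposals; whether this exact formula was used here is an assumption, and the sketch below is only a generic illustration.

```python
from collections import Counter

def agreement(proposals):
    # Agreement score over one command: for each group of participants who
    # proposed the same gesture, add (group size / total participants)^2.
    # A = 1.0 when everyone proposes the same gesture.
    n = len(proposals)
    return sum((count / n) ** 2 for count in Counter(proposals).values())

assert agreement(["swipe"] * 10) == 1.0            # unanimous, like Pan/Scroll
assert agreement(["swipe"] * 5 + ["point"] * 5) == 0.5
```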