• Title/Summary/Keyword: Image Detection


Grasping a Target Object in Clutter with an Anthropomorphic Robot Hand via RGB-D Vision Intelligence, Target Path Planning and Deep Reinforcement Learning (RGB-D 환경인식 시각 지능, 목표 사물 경로 탐색 및 심층 강화학습에 기반한 사람형 로봇손의 목표 사물 파지)

  • Ryu, Ga Hyeon;Oh, Ji-Heon;Jeong, Jin Gyun;Jung, Hwanseok;Lee, Jin Hyuk;Lopez, Patricio Rivera;Kim, Tae-Seong
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.11 no.9
    • /
    • pp.363-370
    • /
    • 2022
  • Grasping a target object among cluttered objects without collision requires machine intelligence: environment recognition, target and obstacle recognition, collision-free path planning, and the grasping intelligence of the robot hand. In this work, we implement such a system in both simulation and hardware to grasp a target object without collision. An RGB-D image sensor is used to recognize the environment and objects. Various path-finding algorithms have been implemented and tested to find collision-free paths. Finally, for an anthropomorphic robot hand, object-grasping intelligence is learned through deep reinforcement learning. In our simulation environment, grasping a target out of five cluttered objects showed an average success rate of 78.8% and a collision rate of 34% without path planning, whereas the same system combined with path planning showed an average success rate of 94% and an average collision rate of 20%. In our hardware environment, grasping a target out of three cluttered objects showed an average success rate of 30% and a collision rate of 97% without path planning, whereas the system combined with path planning showed an average success rate of 90% and an average collision rate of 23%. Our results show that grasping a target object in clutter is feasible with vision intelligence, path planning, and deep RL.
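The collision-free path-planning step can be illustrated with a minimal grid search. This is a generic breadth-first-search sketch, not the authors' specific algorithms, and the workspace grid and obstacle layout are hypothetical:

```python
from collections import deque

def bfs_path(grid, start, goal):
    """Breadth-first search for a shortest collision-free path on a 2D grid.
    grid[y][x] == 1 marks an obstacle (a clutter object)."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        (x, y), path = queue.popleft()
        if (x, y) == goal:
            return path
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < cols and 0 <= ny < rows \
                    and grid[ny][nx] == 0 and (nx, ny) not in seen:
                seen.add((nx, ny))
                queue.append(((nx, ny), path + [(nx, ny)]))
    return None  # no collision-free path exists

# Hypothetical 5x5 workspace: 1 = clutter object, 0 = free space
grid = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
path = bfs_path(grid, (0, 0), (4, 4))
```

A real system would search in the robot's configuration space rather than a 2D grid, but the obstacle-avoiding search structure is the same.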

D4AR - A 4-DIMENSIONAL AUGMENTED REALITY - MODEL FOR AUTOMATION AND VISUALIZATION OF CONSTRUCTION PROGRESS MONITORING

  • Mani Golparvar-Fard;Feniosky Pena-Mora
    • International conference on construction engineering and project management
    • /
    • 2009.05a
    • /
    • pp.30-31
    • /
    • 2009
  • Early detection of schedule delays in field construction activities is vital to project management: it provides the opportunity to initiate remedial actions and increases the chance of controlling such overruns or minimizing their impact. This requires project managers to design, implement, and maintain a systematic approach to progress monitoring that promptly identifies, processes, and communicates discrepancies between actual and as-planned performance as early as possible. Despite its importance, systematic progress monitoring is challenging: (1) current practice is time-consuming, as it requires extensive as-planned and as-built data collection; (2) the excessive amount of work required may cause human error, reduce the quality of manually collected data, and, since usually only an approximate visual inspection is performed, make the collected data subjective; (3) existing methods are non-systematic and may create a time lag between when progress is reported and when it is actually accomplished; (4) progress reports are visually complex and do not reflect the spatial aspects of construction; and (5) current reporting methods increase the time required to describe and explain progress in coordination meetings, which in turn can delay decision making. In summary, with current methods it may not be easy to understand the progress situation clearly and quickly. To overcome these inefficiencies, this research explores the application of unsorted daily progress photograph logs - available on any construction site - together with IFC-based 4D models for progress monitoring. Our approach computes, from the images themselves, the photographers' locations and orientations, along with a sparse 3D geometric representation of the as-built scene, and superimposes the reconstructed scene over the as-planned 4D model.
Within such an environment, progress photographs are registered in the virtual as-planned environment, allowing a large unstructured collection of daily construction images to be interactively explored. In addition, sparse reconstructed scenes superimposed over 4D models allow site images to be geo-registered with the as-planned components; consequently, a location-based image processing technique can be implemented and progress data extracted automatically. The results of progress comparisons between as-planned and as-built performance can then be visualized in the D4AR (4D Augmented Reality) environment using a traffic-light metaphor. In such an environment, project participants can: 1) use the 4D as-planned model as a baseline for progress monitoring, compare it to daily construction photographs, and study workspace logistics; 2) interactively and remotely explore registered construction photographs in a 3D environment; 3) analyze registered images and quantify as-built progress; 4) measure discrepancies between as-planned and as-built performance; and 5) visually represent progress discrepancies by superimposing 4D as-planned models over progress photographs, make control decisions, and effectively communicate them to project participants. We present preliminary results on two ongoing construction projects and discuss implementation, perceived benefits, and potential future enhancements of this new technology in construction, across automatic data collection, processing, and communication.


A Study on Transport Robot for Autonomous Driving to a Destination Based on QR Code in an Indoor Environment (실내 환경에서 QR 코드 기반 목적지 자율주행을 위한 운반 로봇에 관한 연구)

  • Se-Jun Park
    • Journal of Platform Technology
    • /
    • v.11 no.2
    • /
    • pp.26-38
    • /
    • 2023
  • This paper studies a transport robot capable of autonomously driving to a destination using QR codes in an indoor environment. The robot was designed and built with a camera for recognizing QR codes and a lidar sensor that measures the distance to the left and right walls, so that it can maintain a constant distance from them while moving. To obtain the robot's location information, the QR code image is enlarged with Lanczos resampling interpolation, binarized with the Otsu algorithm, and then detected and decoded using the ZBar library. QR code recognition was tested while varying the QR code size and the robot's traveling speed, with the camera position of the transport robot and the height of the QR code fixed at 192 cm. With a 9 cm × 9 cm QR code, the recognition rate was 99.7%, and nearly 100% when the traveling speed was below about 0.5 m/s. Based on this recognition rate, autonomous driving to a destination was tested in the absence of obstacles for two cases: routes requiring only straight driving, and routes combining straight driving and turns. When the route was straight only, the robot reached the destination quickly because little position correction was needed; when the route included turns, arrival was relatively delayed by the need for position correction. The experiments showed that the robot arrived at its destination fairly accurately despite small positional errors while driving, confirming the applicability of a QR-code-based self-driving delivery robot.
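The binarization step in the pipeline above uses Otsu's method, which picks the threshold maximizing between-class variance. A minimal NumPy sketch of that computation (the paper itself uses OpenCV and ZBar; the synthetic bimodal image below is illustrative):

```python
import numpy as np

def otsu_threshold(image):
    """Return the Otsu threshold for an 8-bit grayscale image:
    the cut maximizing between-class variance of the histogram."""
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()   # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0        # class means
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Synthetic bimodal image: dark QR modules (~40) on a light background (~200)
rng = np.random.default_rng(0)
img = np.where(rng.random((64, 64)) < 0.4, 40, 200).astype(np.uint8)
t = otsu_threshold(img)
binary = (img >= t).astype(np.uint8)  # 1 = background, 0 = dark module
```

In practice `cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)` performs the same computation before the binarized image is handed to the ZBar decoder.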


Design of Port Security System Using Deep Learning and Object Features (딥러닝과 객체 특징점을 활용한 항만 보안시스템 설계)

  • Wang, Tae-su;Kim, Minyoung;Jang, Jongwook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2022.05a
    • /
    • pp.50-53
    • /
    • 2022
  • Recently, there have been cases in which foreign ships with forged identities entered and left domestic ports several times. Every vessel has a ship-specific serial number issued by the International Maritime Organization (IMO) for identification, and IMO marking has been mandatory on all ships built since 2004. Airports and ports, the representative logistics platforms, both require security systems, but establishing one at a port is difficult and blind spots are common, so insufficient coverage can create security problems. In this paper, a port security system is designed using deep-learning object recognition and OpenCV. The system recognizes an arriving ship and extracts its IMO number; for ships with prior entry records, it determines through feature-point matching whether this is the same ship, and for first-arrival vessels it stores the ship image and IMO number in an entry/exit database. The proposed system can strengthen port security and improve the efficiency of port logistics by making better use of port management personnel and reducing the incidental costs caused by unauthorized entry.
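The feature-point matching step can be sketched without any library as nearest-neighbor matching of binary descriptors (Hamming distance) with Lowe's ratio test. The paper uses OpenCV; the synthetic descriptors, noise level, and majority-vote "same ship" rule below are illustrative assumptions:

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Nearest-neighbor matching of binary descriptors by Hamming distance,
    keeping only matches that pass Lowe's ratio test."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.count_nonzero(desc_b != d, axis=1)  # Hamming distances
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches

rng = np.random.default_rng(1)
stored = rng.integers(0, 2, size=(20, 32), dtype=np.uint8)  # descriptors on record
flip = rng.random(stored.shape) < 0.05                      # 5% bit noise on re-entry
query = np.where(flip, 1 - stored, stored)
matches = match_descriptors(query, stored)
same_ship = len(matches) / len(query) > 0.5  # majority of keypoints matched
```

With real images, ORB descriptors from `cv2.ORB_create()` and a `cv2.BFMatcher` with `cv2.NORM_HAMMING` play the roles of `stored`/`query` and `match_descriptors` here.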


Comprehensive analysis of deep learning-based target classifiers in small and imbalanced active sonar datasets (소량 및 불균형 능동소나 데이터세트에 대한 딥러닝 기반 표적식별기의 종합적인 분석)

  • Geunhwan Kim;Youngsang Hwang;Sungjin Shin;Juho Kim;Soobok Hwang;Youngmin Choo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.42 no.4
    • /
    • pp.329-344
    • /
    • 2023
  • In this study, we comprehensively analyze the generalization performance of various deep learning-based active sonar target classifiers on small and imbalanced active sonar datasets. The datasets are built from two oceanic experiments conducted at different times and in different seas. Each sample is a time-frequency domain image extracted from the audio signal of a contact after the detection process. For the comprehensive analysis, we evaluate 22 Convolutional Neural Network (CNN) models. The two datasets are used alternately as train/validation and test sets. To estimate the variance in the classifiers' outputs, the train/validation/test split is repeated 10 times. Training hyperparameters are optimized using Bayesian optimization. The results demonstrate that shallow CNN models show superior robustness and generalization compared with most of the deep CNN models. These results can serve as a valuable reference for future research in deep learning-based active sonar target classification.
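The time-frequency images fed to the classifiers can be sketched as a plain magnitude spectrogram. The window and hop sizes and the simulated echo below are illustrative; the paper's exact extraction pipeline may differ:

```python
import numpy as np

def spectrogram(signal, win=64, hop=32):
    """Magnitude spectrogram: the time-frequency image a CNN classifier
    would consume. Hann window with 50% overlap (illustrative sizes)."""
    window = np.hanning(win)
    frames = [signal[i:i + win] * window
              for i in range(0, len(signal) - win + 1, hop)]
    spec = np.abs(np.fft.rfft(np.asarray(frames), axis=1))
    return spec.T  # shape: (frequency bins, time frames)

fs = 1000.0
t = np.arange(0, 1, 1 / fs)
# Simulated contact echo: a 100 Hz tone burst between 0.4 s and 0.6 s
echo = np.sin(2 * np.pi * 100 * t) * (t > 0.4) * (t < 0.6)
img = spectrogram(echo)
```

The resulting 2D array is what would be resized and normalized into a fixed-size input image for each of the CNN models under comparison.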

Changes in the Riverbed Landforms Due to the Artificial Regulation of Water Level in the Yeongsan River (인위적인 보 수위조절로 인한 영산강 하도 지형 변화)

  • Lim, Young Shin;Kim, Jin Kwan
    • Journal of The Geomorphological Association of Korea
    • /
    • v.27 no.1
    • /
    • pp.1-19
    • /
    • 2020
  • A riverbed, which is submerged at high flow and becomes part of the river margin at low flow, serves as a bridge between the river and the land. Channel bars create a unique ecosystem with vegetation adapted to this particular environment, and water pools form wetlands that play a very important environmental role. To evaluate anthropogenic impacts on the riverbed of the middle Yeongsangang River, the fluvial landforms in the stream channel were analyzed using multi-temporal remotely sensed images. In an aerial photograph from 2005, taken before the construction of the large weirs, oxbow lakes, mid-channel bars, point bars, and natural wetlands between the artificial levees were identified; multiple bars divided the streamflow, producing a braided pattern in one section. After construction of the Seungchon weir, aerial photographs from 2013 and 2015 revealed that most of these fluvial landforms had disappeared due to dredging of the riverbed and water level control (maintained at EL 7.5 m). Sentinel-2 images were then analyzed to identify differences before and after the opening of the weir gates. Change detection was performed with the near-infrared and shortwave-infrared spectral bands, which effectively distinguish water surfaces from land. As a result, the water surface area of the main stream of the Yeongsangang River decreased by 40%, from 1.144 km2 to 0.692 km2. A large mid-channel bar deposited upstream of the weir was exposed at low water levels, showing the clear influence of the weir on the riverbed. Newly formed, unvegetated point bars deposited on the insides of meander bends were also identified from the remotely sensed images. As the period of weir-gate opening was extended, pools and riffles formed around the channel bars, creating diverse habitats. Considering the ecological and hydrological functions of the riverbed, the increase in bar area through weir-gate opening is expected to reduce the weir's artificial interference effects.
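The water/land separation using infrared bands can be sketched with a spectral water index. This sketch uses the MNDWI (green vs. SWIR) variant as one common choice; the paper's exact band combination and threshold may differ, and the pixel reflectances below are hypothetical:

```python
import numpy as np

def water_mask(green, swir, threshold=0.0):
    """Modified NDWI water mask: water reflects green but strongly absorbs
    SWIR, so MNDWI = (green - swir) / (green + swir) is positive over water."""
    mndwi = (green - swir) / (green + swir + 1e-9)
    return mndwi > threshold

# Hypothetical Sentinel-2 pixels (10 m): reflectance before/after gate opening
pixel_area_km2 = (10 * 10) / 1e6
green_before = np.array([[0.10, 0.10], [0.10, 0.25]])
swir_before  = np.array([[0.02, 0.02], [0.02, 0.30]])  # low SWIR = water
green_after  = np.array([[0.10, 0.25], [0.25, 0.25]])
swir_after   = np.array([[0.02, 0.30], [0.30, 0.30]])  # bar now exposed

area_before = water_mask(green_before, swir_before).sum() * pixel_area_km2
area_after  = water_mask(green_after, swir_after).sum() * pixel_area_km2
```

Differencing the two masks over the full scene yields the change-detection map from which the reported 40% decrease in water surface area was derived.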

Optimization-based Deep Learning Model to Localize L3 Slice in Whole Body Computerized Tomography Images (컴퓨터 단층촬영 영상에서 3번 요추부 슬라이스 검출을 위한 최적화 기반 딥러닝 모델)

  • Seongwon Chae;Jae-Hyun Jo;Ye-Eun Park;Jin-Hyoung Jeong;Sung Jin Kim;Ahnryul Choi
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.16 no.5
    • /
    • pp.331-337
    • /
    • 2023
  • In this paper, we propose a deep learning model that detects the lumbar 3 (L3) slice in CT images, which is used to determine the occurrence and degree of sarcopenia. We also propose an optimization technique that treats the oversampling ratio and class weight as design parameters, to address performance degradation caused by the data imbalance between the L3-level and non-L3-level portions of the CT data. To train and test the model, a total of 150 whole-body CT scans from 104 prostate cancer patients and 46 bladder cancer patients who visited Gangneung Asan Medical Center were used. The deep learning model was ResNet50, and the design parameters of the optimization were selected as six types of model hyperparameters, the data augmentation ratio, and the class weight. The proposed optimization-based L3 level extraction model reduced the median L3 error by about 1.0 slice compared with the control model (a model that optimized only 5 types of hyperparameters). These results show that accurate L3 slice detection is possible, and additionally that the data imbalance problem can be effectively mitigated by oversampling through data augmentation and class weight adjustment.
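The two imbalance-handling design parameters, oversampling ratio and class weight, can be sketched as follows. The slice counts and the inverse-frequency weighting scheme are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def balance(labels, oversample_ratio=2.0):
    """Return inverse-frequency class weights and an index list in which the
    minority (L3) class is oversampled by `oversample_ratio` - both tunable
    design parameters in the kind of optimization described above."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = {int(c): len(labels) / (len(classes) * n)
               for c, n in zip(classes, counts)}
    minority = int(classes[np.argmin(counts)])
    minority_idx = np.flatnonzero(labels == minority)
    extra = np.random.default_rng(0).choice(
        minority_idx, size=int(len(minority_idx) * (oversample_ratio - 1)))
    indices = np.concatenate([np.arange(len(labels)), extra])
    return weights, indices

# Illustrative imbalance: 90 non-L3 slices (class 0) vs 10 L3 slices (class 1)
labels = np.array([0] * 90 + [1] * 10)
weights, indices = balance(labels, oversample_ratio=3.0)
```

The `weights` dict would be passed into the loss function (e.g. weighted cross-entropy) and `indices` would drive sampling of the training batches; a Bayesian optimizer could then search over `oversample_ratio` and the weights alongside the network hyperparameters.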

Intelligent Motion Pattern Recognition Algorithm for Abnormal Behavior Detections in Unmanned Stores (무인 점포 사용자 이상행동을 탐지하기 위한 지능형 모션 패턴 인식 알고리즘)

  • Young-june Choi;Ji-young Na;Jun-ho Ahn
    • Journal of Internet Computing and Services
    • /
    • v.24 no.6
    • /
    • pp.73-80
    • /
    • 2023
  • The recent steep rise in the minimum hourly wage has increased the burden of labor costs, and in the aftermath of COVID-19 the share of unmanned stores is growing. As a result, theft targeting unmanned stores is also increasing. To prevent such theft, "Just Walk Out"-style systems have been introduced that rely on LiDAR sensors, weight sensors, and the like, or on manual checking through continuous CCTV monitoring. However, the more expensive the sensors, the higher the store's initial and operating costs, while round-the-clock CCTV monitoring is difficult for managers to sustain and limited in practice. In this paper, we propose an AI image processing fusion algorithm that reduces this dependence on sensors and human monitoring: at a cost low enough for unmanned stores, it detects customers performing abnormal behaviors such as theft and provides cloud-based notifications. We verify the accuracy of each component on behavior-pattern data collected from unmanned stores, using motion capture with MediaPipe, object detection with YOLO, and the fusion algorithm, and we demonstrate the performance of the fused algorithm through various scenario designs.
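The fusion of the pose stream (MediaPipe keypoints) and the detection stream (YOLO boxes) can be illustrated with a simplified rule. The zones, per-frame fields, and the rule itself are hypothetical stand-ins for the paper's algorithm:

```python
def point_in_box(point, box):
    """Axis-aligned containment test; box = (x1, y1, x2, y2)."""
    x, y = point
    x1, y1, x2, y2 = box
    return x1 <= x <= x2 and y1 <= y <= y2

def flag_abnormal(frames, shelf_box, exit_box):
    """Fusion-rule sketch: flag a customer when a tracked wrist keypoint
    (from the pose stream) enters the shelf zone while the detector reports
    the item gone from the shelf, and the wrist later reaches the exit zone
    without a scan event. All fields here are hypothetical."""
    took_item = False
    for f in frames:
        if point_in_box(f["wrist"], shelf_box) and not f["item_on_shelf"]:
            took_item = True
        if took_item and point_in_box(f["wrist"], exit_box) and not f["scanned"]:
            return True
    return False

frames = [
    {"wrist": (2, 2), "item_on_shelf": True,  "scanned": False},
    {"wrist": (5, 5), "item_on_shelf": False, "scanned": False},  # in shelf zone
    {"wrist": (9, 9), "item_on_shelf": False, "scanned": False},  # exits unpaid
]
alarm = flag_abnormal(frames, shelf_box=(4, 4, 6, 6), exit_box=(8, 8, 10, 10))
```

In a real deployment the `wrist` coordinates would come from MediaPipe pose landmarks and `item_on_shelf` from YOLO detections, with the alarm pushed as a cloud notification.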

Detection Fastener Defect using Semi Supervised Learning and Transfer Learning (준지도 학습과 전이 학습을 이용한 선로 체결 장치 결함 검출)

  • Sangmin Lee;Seokmin Han
    • Journal of Internet Computing and Services
    • /
    • v.24 no.6
    • /
    • pp.91-98
    • /
    • 2023
  • Recently, with the development of artificial intelligence, a wide range of industries are being automated and optimized, and in the domestic rail industry there has been research on using supervised learning to detect rail defects. However, the track contains structures other than the rails; the fastener is the device that binds the rail to those structures, and it requires periodic inspection to prevent safety accidents. In this paper, we present a method for reducing labeling cost using semi-supervised learning and a transfer-learned model on rail fastener data. We use ResNet50, pretrained on ImageNet, as the backbone network. We first randomly sample training data from the unlabeled pool, label it, and train the model. After predicting on the remaining unlabeled data with the trained model, we add the samples with the highest predicted probability for each class to the training data, a predetermined number at a time. We also conducted experiments to investigate the influence of the number of initially labeled samples. The model reaches 92% accuracy, a difference of around 5% compared with fully supervised learning. The proposed method is thus expected to improve classifier performance using relatively few labels, without an additional labeling process.
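The pseudo-labeling step described above (adding the highest-probability samples of each class to the training set, a fixed number at a time) can be sketched as follows; the softmax outputs and per-class quota are illustrative:

```python
import numpy as np

def select_pseudo_labels(probs, per_class=2):
    """From softmax outputs on unlabeled data, pick the `per_class` most
    confident samples of each predicted class to add to the training set."""
    preds = probs.argmax(axis=1)   # predicted class per sample
    conf = probs.max(axis=1)       # confidence = top softmax probability
    chosen = []
    for c in range(probs.shape[1]):
        idx = np.flatnonzero(preds == c)
        top = idx[np.argsort(conf[idx])[::-1][:per_class]]
        chosen.extend((int(i), c) for i in top)
    return chosen  # list of (sample index, pseudo-label)

# Hypothetical softmax outputs for 6 unlabeled fastener images, 2 classes
probs = np.array([
    [0.95, 0.05],
    [0.60, 0.40],
    [0.80, 0.20],
    [0.10, 0.90],
    [0.30, 0.70],
    [0.45, 0.55],
])
pseudo = select_pseudo_labels(probs, per_class=2)
```

Each round, the selected samples move from the unlabeled pool into the training set with their pseudo-labels, and the model is retrained.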

Accuracy of posteroanterior cephalogram landmarks and measurements identification using a cascaded convolutional neural network algorithm: A multicenter study

  • Sung-Hoon Han;Jisup Lim;Jun-Sik Kim;Jin-Hyoung Cho;Mihee Hong;Minji Kim;Su-Jung Kim;Yoon-Ji Kim;Young Ho Kim;Sung-Hoon Lim;Sang Jin Sung;Kyung-Hwa Kang;Seung-Hak Baek;Sung-Kwon Choi;Namkug Kim
    • The korean journal of orthodontics
    • /
    • v.54 no.1
    • /
    • pp.48-58
    • /
    • 2024
  • Objective: To quantify the effects of midline-related landmark identification on midline deviation measurements in posteroanterior (PA) cephalograms using a cascaded convolutional neural network (CNN). Methods: A total of 2,903 PA cephalogram images obtained from 9 university hospitals were divided into training, internal validation, and test sets (n = 2,150, 376, and 377). As the gold standard, 2 orthodontic professors marked the bilateral landmarks, including the frontozygomatic suture point and latero-orbitale (LO), and the midline landmarks, including the crista galli, anterior nasal spine (ANS), upper dental midpoint (UDM), lower dental midpoint (LDM), and menton (Me). For the test, Examiner-1 and Examiner-2 (3-year and 1-year orthodontic residents) and the cascaded-CNN models marked the landmarks. After computing point-to-point errors of landmark identification, the successful detection rate (SDR) and the distance and direction of midline landmark deviation from the midsagittal line (ANS-mid, UDM-mid, LDM-mid, and Me-mid) were measured, and statistical analysis was performed. Results: The cascaded-CNN algorithm showed a clinically acceptable level of point-to-point error (1.26 mm vs. 1.57 mm for Examiner-1 and 1.75 mm for Examiner-2). The average SDR within the 2 mm range was 83.2%, with high accuracy at the LO (right, 96.9%; left, 97.1%) and UDM (96.9%). The absolute measurement errors were less than 1 mm for ANS-mid, UDM-mid, and LDM-mid compared with the gold standard. Conclusions: The cascaded-CNN model may be considered an effective tool for the auto-identification of midline landmarks and quantification of midline deviation in PA cephalograms of adult patients, regardless of variations in the image acquisition method.
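The point-to-point error and SDR metrics used in the evaluation above are straightforward to compute; a sketch with hypothetical landmark coordinates:

```python
import numpy as np

def point_to_point_errors(pred, gold):
    """Euclidean distance (mm) between predicted and gold landmark positions."""
    return np.linalg.norm(pred - gold, axis=1)

def sdr(errors, radius_mm=2.0):
    """Successful detection rate: fraction of landmarks within `radius_mm`."""
    return float((errors <= radius_mm).mean())

# Hypothetical gold-standard positions for 5 landmarks (mm coordinates)
gold = np.array([[0.0, 0.0], [10.0, 5.0], [3.0, 7.0], [8.0, 2.0], [1.0, 9.0]])
# Hypothetical model predictions, offset from gold by small errors
pred = gold + np.array([[0.5, 0.0], [1.0, 1.0], [0.0, 2.5], [1.5, 0.5], [0.3, 0.4]])

errors = point_to_point_errors(pred, gold)
rate = sdr(errors)  # fraction detected within the 2 mm range
```

Aggregating `errors` over the test set gives the mean point-to-point error reported per examiner and model, and `sdr` at the 2 mm radius gives the reported SDR.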