• Title/Summary/Keyword: Instance Segmentation


Cases of Artificial Intelligence Development in the Construction field According to the Artificial Intelligence Development Method (인공지능 개발방식에 따른 건설 분야 인공지능 개발사례)

  • Heo, Seokjae; Chung, Lan
    • Proceedings of the Korean Institute of Building Construction Conference / 2021.11a / pp.217-218 / 2021
  • The development of artificial intelligence in the construction field is being revitalized. The performance and development techniques of artificial intelligence are changing rapidly, yet domestic construction sites are still using technologies from five to seven years ago. Following a stable, proven method is reasonable when commercialization is considered, but the earlier AI development approach requires more manpower and time than current techniques. In addition, to make active use of AI technology, customized AI must be applied to the ever-changing conditions of construction sites. As a result, even when good AI technology is secured at a construction site, there is reluctance to introduce it, because applying it to only some processes offers no advantage in time or cost over existing methods. AI techniques with a faster development process and more accurate recognition have now been developed to cope with such fluid situations, so it will be important to understand and adopt these rapidly changing AI development methods.


Stitching speed improvement method using YOLACT (YOLACT를 이용한 스티칭 속도 개선 방안)

  • Go, Sung-Young; Rhee, Seong-Bae; Park, Seong-Hwan; Kim, Kyu-Heon
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.07a / pp.10-13 / 2020
  • Recently, as demand for premium content such as ultra-high-definition video and virtual reality has grown, markets such as 360° VR and 8K TV have been expanding. Stitching technology is used to produce 360° VR video, and because equipment capable of shooting 8K video is very limited, efforts continue to secure such content through stitching. Stitching composites several videos to overcome the narrow field of view of conventional cameras and produce video with a wider field of view. As research in this area has progressed, work has moved beyond still images and now focuses mainly on video stitching; the conventional approach repeats image stitching for every frame, so it has the drawback of being slow. In computer vision, deep learning has driven extensive research on object detection, which places rectangular bounding boxes around regions where objects are predicted to exist, and on this basis research on instance segmentation, which detects object boundaries and delineates only the corresponding regions, is also under way. To address the stitching speed problem, this paper proposes a method that improves stitching speed using YOLACT, an instance segmentation model that runs at high speed. (An illustrative sketch follows this entry.)

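The abstract above does not detail how YOLACT is wired into the stitching pipeline, so the sketch below only illustrates the general idea of using a fast instance mask to cut per-frame stitching work: keypoints are detected only in background regions, so fewer features are extracted and matched for each frame. The background masks, image names, and parameter values are assumptions for illustration, not the paper's method.

```python
# Illustrative sketch: masked feature matching for faster per-frame stitching.
# `mask_a` / `mask_b` are assumed uint8 background masks (255 = static background)
# produced by a fast instance-segmentation model such as YOLACT.
import cv2
import numpy as np

def estimate_homography(frame_a, frame_b, mask_a=None, mask_b=None):
    """Estimate the homography mapping frame_b onto frame_a,
    optionally restricting keypoints to masked (background) regions."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp_a, des_a = orb.detectAndCompute(frame_a, mask_a)
    kp_b, des_b = orb.detectAndCompute(frame_b, mask_b)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_b, des_a), key=lambda m: m.distance)[:200]
    src = np.float32([kp_b[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_a[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H

def stitch_pair(frame_a, frame_b, H):
    """Warp frame_b into frame_a's coordinate frame and paste frame_a on top."""
    h, w = frame_a.shape[:2]
    canvas = cv2.warpPerspective(frame_b, H, (w * 2, h))
    canvas[0:h, 0:w] = frame_a
    return canvas
```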

Analysis of the Effect of Compressed Sensing on Mask R-CNN Based Object Detection (압축센싱이 Mask R-CNN 기반의 객체검출에 미치는 영향 분석)

  • Moon, Hansol; Kwon, Hyemin; Lee, Chang-kyo; Seo, Jeongwook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.97-99 / 2022
  • Recently, the amount of data has been increasing with the development of industries and technologies, and research on processing and transmitting large amounts of data is attracting attention. Therefore, in this paper, compressed sensing was used to reduce the amount of data, and its effect on the Mask R-CNN algorithm was analyzed. We confirmed that as the compressed sensing rate increases, the amount of data in the image and its resolution decrease; however, there was no significant degradation in object detection performance. (A sketch of such an evaluation loop follows this entry.)

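The abstract does not specify the compressed sensing scheme or the evaluation protocol, so the sketch below only outlines the experimental loop one might run: degrade an image at several sampling rates, feed it to an off-the-shelf Mask R-CNN from torchvision, and count confident detections. Random pixel sampling with inpainting is a crude stand-in for compressed sensing reconstruction, and the file name is hypothetical.

```python
# Illustrative evaluation loop: degrade an image at several sampling rates
# (a crude proxy for compressed sensing, NOT the paper's method), then run a
# pretrained Mask R-CNN and count confident detections.
import cv2
import numpy as np
import torch
import torchvision

def sample_and_fill(img_bgr, rate, seed=0):
    """Keep a random fraction `rate` of pixels and inpaint the rest."""
    rng = np.random.default_rng(seed)
    missing = (rng.random(img_bgr.shape[:2]) > rate).astype(np.uint8) * 255
    return cv2.inpaint(img_bgr, missing, 3, cv2.INPAINT_TELEA)

# torchvision >= 0.13; older versions use pretrained=True instead of weights=
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

img = cv2.imread("sample.jpg")                       # hypothetical test image
for rate in (1.0, 0.5, 0.25, 0.1):
    degraded = sample_and_fill(img, rate)
    rgb = cv2.cvtColor(degraded, cv2.COLOR_BGR2RGB)
    tensor = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        out = model([tensor])[0]
    n_det = int((out["scores"] > 0.5).sum())
    print(f"sampling rate {rate:.2f}: {n_det} detections above 0.5")
```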

Preliminary Study for Image-Based Measurement Model in a Construction Site (이미지 기반 건설현장 수치 측정 모델 기초연구)

  • Yoon, Sebeen; Kang, Mingyun; Kim, Chang-Won; Lim, Hyunsu; Yoo, Wi Sung; Kim, Taehoon
    • Proceedings of the Korean Institute of Building Construction Conference / 2023.05a / pp.287-288 / 2023
  • Inspection work at construction sites is an important supervisory task that involves verifying that the building is being constructed according to the numerical values specified in the design drawings. The conventional measuring method has site personnel use tools such as rulers directly, with results usually confirmed by eye. This study therefore proposes a model that measures numerical values from images of the construction site. Through a case study measuring the installation interval of jack supports, the effectiveness and validity of the proposed algorithm were verified. The results suggest that the model can support inspection work even from the office, covering items that on-site inspectors may have overlooked, and can contribute to the digitization of inspection work at construction sites. (A minimal measurement sketch follows this entry.)

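The measurement model itself is not described in the abstract, so the sketch below only shows the basic pixel-to-metric conversion that any image-based spacing measurement ultimately relies on: a reference object of known real-world length fixes the scale, and the pixel distances between detected jack-support centroids are converted into metres. All names and numbers are illustrative assumptions.

```python
# Illustrative pixel-to-metric conversion for measuring jack-support spacing.
# `ref_px` / `ref_m` describe a reference object of known size visible in the
# image (an assumption for illustration); the centroids would come from an
# instance-segmentation model that detects each jack support.
import numpy as np

def pixel_scale(ref_px: float, ref_m: float) -> float:
    """Metres per pixel, derived from a reference length seen in the image."""
    return ref_m / ref_px

def support_spacings(centroids_px, metres_per_px):
    """Distances (m) between horizontally adjacent support centroids."""
    pts = np.asarray(sorted(centroids_px, key=lambda p: p[0]), dtype=float)
    gaps_px = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    return gaps_px * metres_per_px

scale = pixel_scale(ref_px=120.0, ref_m=0.60)     # e.g. a 0.6 m marker spans 120 px
print(support_spacings([(100, 400), (340, 405), (585, 398)], scale))
```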

A Study on the Attributes Classification of Agricultural Land Based on Deep Learning: Comparison of Accuracy between TIF Image and ECW Image (딥러닝 기반 농경지 속성분류를 위한 TIF 이미지와 ECW 이미지 간 정확도 비교 연구)

  • Kim, Ji Young; Wee, Seong Seung
    • Journal of The Korean Society of Agricultural Engineers / v.65 no.6 / pp.15-22 / 2023
  • In this study, we conduct a comparative study of deep learning-based classification of agricultural field attributes using Tagged Image File (TIF) and Enhanced Compression Wavelet (ECW) images. The goal is to interpret and classify the attributes of agricultural fields by analyzing the differences between these two image formats. "FarmMap," initiated by the Ministry of Agriculture, Food and Rural Affairs in 2014, is the first digital map of agricultural land in South Korea; it comprises attributes such as paddy, field, orchard, agricultural facility, and ginseng cultivation areas. For the comparison, we consider the location and class information of objects as well as the attribute information of FarmMap, and use a ResNet-50-based instance segmentation model, which is suitable for this task, in simulation experiments. Classification of agricultural attributes for the two image types is measured in terms of accuracy. The experimental results indicate an accuracy of 90.44% for TIF images and 91.72% for ECW images, so the ECW model is about 1.28 percentage points higher. However, statistical validation with the Wilcoxon rank-sum test did not reveal a significant difference in accuracy between the two image formats. (A sketch of that test follows this entry.)
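
The statistical check mentioned at the end of the abstract can be outlined with SciPy's Wilcoxon rank-sum test on per-image accuracies; the two accuracy arrays below are placeholders, not the study's data.

```python
# Illustrative Wilcoxon rank-sum test comparing per-image accuracies from
# models trained on TIF vs. ECW imagery. The arrays are placeholders only.
import numpy as np
from scipy.stats import ranksums

acc_tif = np.array([0.89, 0.91, 0.90, 0.92, 0.88])   # hypothetical per-image accuracies
acc_ecw = np.array([0.90, 0.93, 0.91, 0.92, 0.92])

stat, p_value = ranksums(acc_tif, acc_ecw)
print(f"rank-sum statistic = {stat:.3f}, p = {p_value:.3f}")
if p_value >= 0.05:
    print("No statistically significant difference at the 5% level.")
```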

Real-time 3D multi-pedestrian detection and tracking using 3D LiDAR point cloud for mobile robot

  • Ki-In Na; Byungjae Park
    • ETRI Journal / v.45 no.5 / pp.836-846 / 2023
  • Mobile robots are used in modern life; however, object recognition is still insufficient to realize robot navigation in crowded environments. Mobile robots must rapidly and accurately recognize the movements and shapes of pedestrians to navigate safely in pedestrian-rich spaces. This study proposes real-time, accurate, three-dimensional (3D) multi-pedestrian detection and tracking using a 3D light detection and ranging (LiDAR) point cloud in crowded environments. The pedestrian detection module quickly segments a sparse 3D point cloud into individual pedestrians using a lightweight convolutional autoencoder and a connected-component algorithm. The multi-pedestrian tracking module identifies the same pedestrians across consecutive frames by considering motion and appearance cues. In addition, it estimates pedestrians' dynamic movements with various patterns by adaptively mixing heterogeneous motion models. We evaluate the computational speed and accuracy of each module using the KITTI dataset. We demonstrate that our integrated system, which rapidly and accurately recognizes pedestrian movement and appearance using a sparse 3D LiDAR, is applicable to robot navigation in crowded spaces. (A simplified clustering sketch follows this entry.)
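
The connected-component step in the detection module can be pictured as labelling an occupancy grid built from the point cloud. The sketch below voxelises points into a 2D bird's-eye-view grid and runs scipy.ndimage.label; it is a simplified stand-in for the paper's autoencoder-assisted segmentation, with grid extents and cell size chosen arbitrarily.

```python
# Simplified sketch: cluster a LiDAR point cloud into candidate objects by
# voxelising it into a bird's-eye-view occupancy grid and running
# connected-component labelling. This stands in for the paper's
# autoencoder-assisted pipeline; it is not the published method.
import numpy as np
from scipy import ndimage

def bev_clusters(points_xyz, cell=0.2, x_range=(-20, 20), y_range=(-20, 20)):
    """Return a cluster id per point (0 = unclustered or outside the grid)."""
    xs = ((points_xyz[:, 0] - x_range[0]) / cell).astype(int)
    ys = ((points_xyz[:, 1] - y_range[0]) / cell).astype(int)
    w = int((x_range[1] - x_range[0]) / cell)
    h = int((y_range[1] - y_range[0]) / cell)
    inside = (xs >= 0) & (xs < w) & (ys >= 0) & (ys < h)
    grid = np.zeros((w, h), dtype=bool)
    grid[xs[inside], ys[inside]] = True
    labels, n_clusters = ndimage.label(grid)       # 4-connectivity by default
    ids = np.zeros(len(points_xyz), dtype=int)
    ids[inside] = labels[xs[inside], ys[inside]]
    return ids, n_clusters
```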

Training Performance Analysis of Semantic Segmentation Deep Learning Model by Progressive Combining Multi-modal Spatial Information Datasets (다중 공간정보 데이터의 점진적 조합에 의한 의미적 분류 딥러닝 모델 학습 성능 분석)

  • Lee, Dae-Geon; Shin, Young-Ha; Lee, Dong-Cheon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.40 no.2 / pp.91-108 / 2022
  • In most cases, optical images have been used as training data for DL (Deep Learning) models for object detection, recognition, identification, classification, semantic segmentation, and instance segmentation. However, the properties of 3D objects in the real world cannot be fully explored with 2D images. One of the major sources of 3D geospatial information is the DSM (Digital Surface Model), and characteristic information derived from a DSM is effective for analyzing 3D terrain features. In particular, man-made objects such as buildings, which have geometrically distinctive shapes, can be described by geometric elements obtained from 3D geospatial data. The background and motivation of this paper were drawn from the concept of the intrinsic image, which is involved in high-level visual information processing. This paper aims to extract buildings after classifying terrain features by training a DL model with DSM-derived information, including slope, aspect, and SRI (Shaded Relief Image). The experiments were carried out on the CNN-based SegNet model using the DSM and label dataset provided by ISPRS (International Society for Photogrammetry and Remote Sensing). In particular, the experiments focus on combining multi-source information to improve training performance and on the synergistic effect in the DL model. The results demonstrate that buildings were effectively classified and extracted by the proposed approach. (A sketch of the DSM-derived channels follows this entry.)
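
The DSM-derived channels named in the abstract (slope, aspect, and shaded relief) can be computed with standard terrain formulas. The NumPy sketch below is a generic version under an assumed grid spacing and sun position, not the paper's exact preprocessing.

```python
# Generic terrain derivatives from a DSM grid: slope, aspect and a shaded
# relief image (SRI), stacked as extra training channels. Grid spacing and
# sun angles are illustrative assumptions.
import numpy as np

def terrain_channels(dsm, cell=1.0, sun_az_deg=315.0, sun_alt_deg=45.0):
    dz_dy, dz_dx = np.gradient(dsm, cell)              # height change per metre
    slope = np.arctan(np.hypot(dz_dx, dz_dy))          # radians
    aspect = np.arctan2(-dz_dx, dz_dy)                 # radians, one common convention
    az = np.deg2rad(sun_az_deg)
    zen = np.deg2rad(90.0 - sun_alt_deg)
    sri = (np.cos(zen) * np.cos(slope)
           + np.sin(zen) * np.sin(slope) * np.cos(az - aspect))
    return np.stack([slope, aspect, np.clip(sri, 0, 1)], axis=0)

dsm = np.random.rand(256, 256) * 30.0                  # placeholder surface model
channels = terrain_channels(dsm)                       # shape (3, 256, 256)
```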

Detection of Plastic Greenhouses by Using Deep Learning Model for Aerial Orthoimages (딥러닝 모델을 이용한 항공정사영상의 비닐하우스 탐지)

  • Byunghyun Yoon; Seonkyeong Seong; Jaewan Choi
    • Korean Journal of Remote Sensing / v.39 no.2 / pp.183-192 / 2023
  • Remotely sensed data, such as satellite imagery and aerial photos, can be used to extract and detect objects through image interpretation and processing techniques. In particular, the potential for digital map updating and land monitoring through automatic object detection has increased as the spatial resolution of remotely sensed data has improved and deep learning technologies have developed. In this paper, we extracted plastic greenhouses from aerial orthophotos using the fully convolutional densely connected convolutional network (FC-DenseNet), one of the representative deep learning models for semantic segmentation, and then performed a quantitative analysis of the extraction results. Using the farm map of the Ministry of Agriculture, Food and Rural Affairs in Korea, training data was generated by labeling plastic greenhouses in the Damyang and Miryang areas, and FC-DenseNet was trained on this dataset. To apply the deep learning model to remotely sensed imagery, instance normalization, which can maintain the spectral characteristics of the bands, was used. In addition, optimal weights for each band were determined by adding attention modules to the deep learning model. The experiments showed that the deep learning model can extract plastic greenhouses, and these results can be applied to digital map updating of the Farm-map and land-cover maps. (Generic versions of the two building blocks follow this entry.)
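
The abstract mentions two additions to FC-DenseNet: instance normalization and band-wise attention. The PyTorch snippet below shows generic versions of both building blocks (an nn.InstanceNorm2d layer and a squeeze-and-excitation style channel attention module); the layer sizes are illustrative, and this is not the authors' exact architecture.

```python
# Generic building blocks mentioned in the abstract: instance normalization
# and a squeeze-and-excitation style channel (band) attention module.
import torch
import torch.nn as nn

class BandAttention(nn.Module):
    """Learn a per-channel (per-band) weight and rescale the feature map."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(x)

block = nn.Sequential(
    nn.Conv2d(4, 32, 3, padding=1),      # e.g. a 4-band (R, G, B, NIR) orthoimage
    nn.InstanceNorm2d(32, affine=True),  # normalizes each sample and channel separately
    nn.ReLU(inplace=True),
    BandAttention(32),
)
out = block(torch.randn(1, 4, 128, 128))
```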

A Comparison of Deep Reinforcement Learning and Deep learning for Complex Image Analysis

  • Khajuria, Rishi; Quyoom, Abdul; Sarwar, Abid
    • Journal of Multimedia Information System / v.7 no.1 / pp.1-10 / 2020
  • Image analysis is an important and predominant task for classifying the different parts of an image. The analysis of complex images such as histopathological images is a crucial factor in oncology because of its ability to help pathologists interpret images, and various feature extraction techniques have therefore evolved over time for such analysis. Although deep reinforcement learning is a new and emerging technique, little effort has been made to compare deep learning and deep reinforcement learning for image analysis. This paper highlights how the two techniques differ in feature extraction from complex images and discusses their potential pros and cons. The use of Convolutional Neural Networks (CNNs) for image segmentation, tumour detection and diagnosis, and feature extraction is important, but several challenges must be overcome before deep learning can be applied to digital pathology: the availability of sufficient training examples for medical image datasets, feature extraction from the whole area of the image, ground-truth localized annotations, adversarial effects of input representations, and the extremely large size of digital pathology slides (in gigabytes). Formulating Histopathological Image Analysis (HIA) as a Multiple Instance Learning (MIL) problem is a remarkable step, in which a histopathological image is divided into high-resolution patches, predictions are made per patch, and the patch predictions are combined into an overall slide prediction; however, it suffers from loss of contextual and spatial information. In such cases, deep reinforcement learning techniques can be used to learn features from limited data without losing contextual and spatial information. (An outline of the MIL formulation follows this entry.)
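
The MIL formulation described above (slide, patches, patch predictions, slide prediction) can be outlined in a few lines. The patch classifier below is a placeholder, and max aggregation is just one common MIL pooling rule.

```python
# Outline of the MIL formulation: tile a whole-slide image into patches, score
# each patch, and aggregate patch scores into a slide-level prediction.
# `patch_model` is a placeholder classifier standing in for a trained CNN.
import numpy as np

def tile(slide: np.ndarray, size: int = 512, stride: int = 512):
    """Yield non-overlapping patches from a large slide image."""
    h, w = slide.shape[:2]
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            yield slide[y:y + size, x:x + size]

def slide_prediction(slide, patch_model, threshold=0.5):
    scores = np.array([patch_model(p) for p in tile(slide)])
    slide_score = scores.max()          # MIL: a slide is positive if any patch is
    return slide_score, slide_score > threshold

# Usage with a dummy patch scorer:
dummy_model = lambda patch: float(patch.mean() > 0.7)
score, label = slide_prediction(np.random.rand(2048, 2048), dummy_model)
```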

Crack Inspection and Mapping of Concrete Bridges using Integrated Image Processing Techniques (통합 이미지 처리 기술을 이용한 콘크리트 교량 균열 탐지 및 매핑)

  • Kim, Byunghyun; Cho, Soojin
    • Journal of the Korean Society of Safety / v.36 no.1 / pp.18-25 / 2021
  • In many developed countries, such as South Korea, efficiently maintaining aging infrastructure is an important issue. Currently, inspectors visually examine infrastructure for maintenance needs, but this method is inefficient due to its high cost, long logistics time, and hazards to the inspectors. Thus, in this paper, a novel crack inspection approach for concrete bridges is proposed using integrated image processing techniques. The proposed approach consists of four steps: (1) training a deep learning model to automatically detect cracks on concrete bridges, (2) acquiring in-situ images using a drone, (3) generating orthomosaic images based on 3D modeling, and (4) detecting cracks on the orthomosaic images using the trained deep learning model. Cascade Mask R-CNN, a state-of-the-art instance segmentation deep learning model, was trained with 3235 crack images that included 2415 hard negative images. We selected the Tancheon overpass, located in Seoul, South Korea, as a testbed for the proposed approach, and we captured images of piers 34-37 and slabs 34-36 using a commercial drone. Agisoft Metashape was used as the 3D model generation program to build an orthomosaic from the captured images. We applied the proposed approach to four orthomosaic images showing the front, back, left, and right sides of pier 37. Using pixel-level precision, referenced against visual inspection of the captured images, we evaluated the trained Cascade Mask R-CNN's crack detection performance. At the coping of the front side of pier 37, the model obtained its best precision of 94.34%, and it achieved an average precision of 72.93% over the orthomosaics of the four sides of the pier. The test results show that the proposed crack detection approach can be a suitable alternative to the conventional visual inspection method. (A minimal precision computation follows this entry.)
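
The pixel-level precision reported in the abstract is straightforward to compute from a predicted crack mask and a reference mask drawn from visual inspection; the sketch below applies the basic TP / (TP + FP) formula to illustrative binary masks.

```python
# Pixel-level precision between a predicted crack mask and a reference mask
# obtained from visual inspection: TP / (TP + FP) over pixels.
import numpy as np

def pixel_precision(pred_mask: np.ndarray, ref_mask: np.ndarray) -> float:
    pred = pred_mask.astype(bool)
    ref = ref_mask.astype(bool)
    tp = np.logical_and(pred, ref).sum()
    fp = np.logical_and(pred, ~ref).sum()
    return tp / (tp + fp) if (tp + fp) else 0.0

# Illustrative masks only (1 = crack pixel):
pred = np.zeros((100, 100), dtype=np.uint8); pred[40:60, 10:90] = 1
ref  = np.zeros((100, 100), dtype=np.uint8); ref[42:58, 10:90] = 1
print(f"precision = {pixel_precision(pred, ref):.2%}")
```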